IBM Mainframe Forum Index
 
Count number of duplicate records


IBM Mainframe Forums -> DFSORT/ICETOOL
nayanishpatil

New User


Joined: 16 Aug 2007
Posts: 14
Location: INDIA

Posted: Mon Sep 24, 2007 11:56 pm

Hi,

Is there a way to count the number of duplicate records in an input file?

For example, suppose the input file has the following records:
AAAAAAAAA
VVVVVVVVV
GGGGGGGG
AAAAAAAAA
HHHHHHHHH
AAAAAAAAA
GGGGGGGG
VVVVVVVVV

As we can see, the AAAAAAAAA record occurs 3 times,
the VVVVVVVVV record occurs 2 times and the GGGGGGGG record occurs 2 times.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

Posted: Tue Sep 25, 2007 12:00 am

Hello,

Please post what you want your output to look like.
nayanishpatil

New User


Joined: 16 Aug 2007
Posts: 14
Location: INDIA

Posted: Tue Sep 25, 2007 12:23 am

If the input record AAAAAAAAA occurs 3 times, then the AAAAAAAAA record should be written to one output file.

The VVVVVVVVV record occurs 2 times, as does the GGGGGGGG record, so the VVVVVVVVV and GGGGGGGG records must be written to another file.

Similarly, the HHHHHHHHH record must be written to a separate file.
CICS Guy

Senior Member


Joined: 18 Jul 2007
Posts: 2146
Location: At my coffee table

Posted: Tue Sep 25, 2007 1:13 am

Wow, how you got "Count number of duplicate records" out of that, I just do not understand....

What happens if XXXXXXXXX occurs 102 times?
Craq Giegerich

Senior Member


Joined: 19 May 2007
Posts: 1512
Location: Virginia, USA

Posted: Tue Sep 25, 2007 1:19 am

I suppose you want to do this without sorting the input file.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

Posted: Tue Sep 25, 2007 2:14 am

Hello,

Quote:
If the input record AAAAAAAAA occurs 3 times, then the AAAAAAAAA record should be written to one output file.

The VVVVVVVVV record occurs 2 times, as does the GGGGGGGG record, so the VVVVVVVVV and GGGGGGGG records must be written to another file.

Similarly, the HHHHHHHHH record must be written to a separate file.


Are you saying you want to create a separate file for each unique "count" of the key value and all of the records having that "count" go into the same file?

If there are 900 different "counts" should 900 separate output files be created?

If you explain what you are trying to accomplish, we may be able to offer suggestions.
krisprems

Active Member


Joined: 27 Nov 2006
Posts: 649
Location: India

Posted: Tue Sep 25, 2007 1:22 pm

nayanishpatil,

This DFSORT/ICETOOL job counts the occurrences of each key that has duplicate records.
Code:
//*******************************************************               
//STEP001  EXEC PGM=ICETOOL                                             
//TOOLMSG  DD SYSOUT=*                                                 
//DFSMSG   DD SYSOUT=*                                                 
//IN1      DD *                                                         
AAAAAAAAA                                                               
VVVVVVVVV                                                               
GGGGGGGG                                                               
AAAAAAAAA                                                               
HHHHHHHHH                                                               
AAAAAAAAA                                                               
GGGGGGGG                                                               
VVVVVVVVV                                                               
/*                                                                     
//TMP1     DD DSN=&&TEMP1,DISP=(MOD,PASS),SPACE=(TRK,(5,5)),UNIT=SYSDA 
//OUT      DD SYSOUT=*                                                 
//TOOLIN   DD *                                                         
 OCCUR FROM(IN1) LIST(TMP1)-                                           
 ON(1,9,CH) ON(VALCNT)                                                 
 COPY FROM(TMP1) TO(OUT) USING(CP01)                                   
/*                                                                     
//CP01CNTL DD *                                                         
  INCLUDE COND=(25,15,ZD,GT,1)                                         
/*                                                                     

OUT contains:
Code:
---+----1----+----2----+----3----+----4
(1,9,CH)               VALUE COUNT     
AAAAAAAAA              000000000000003
GGGGGGGG               000000000000002
VVVVVVVVV              000000000000002
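
For readers who want to check the expected counts off the mainframe, here is a minimal Python sketch of the same logic: count every key, then keep only the keys whose count is greater than 1. It is a model of what the job above reports, not a description of how ICETOOL works internally. The function name `duplicate_counts` and the `key_len` parameter are made up for this illustration, and the sketch works on unpadded strings rather than 80-byte records.

```python
from collections import Counter

def duplicate_counts(records, key_len=9):
    """Return (key, count) pairs in ascending key order for keys seen more than once.

    The key is the first `key_len` characters of each record, mirroring ON(1,9,CH).
    """
    counts = Counter(rec[:key_len] for rec in records)
    return [(key, n) for key, n in sorted(counts.items()) if n > 1]

# Sample input from the original question.
sample = ["AAAAAAAAA", "VVVVVVVVV", "GGGGGGGG", "AAAAAAAAA",
          "HHHHHHHHH", "AAAAAAAAA", "GGGGGGGG", "VVVVVVVVV"]

for key, n in duplicate_counts(sample):
    print(f"{key:<9} {n}")
```

With the sample input this lists AAAAAAAAA with 3, then GGGGGGGG and VVVVVVVVV with 2 each. HHHHHHHHH is dropped because it occurs only once.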
murmohk1

Senior Member


Joined: 29 Jun 2006
Posts: 1436
Location: Bangalore,India

Posted: Tue Sep 25, 2007 2:34 pm

krisprems,

If I understood the original post properly, Nayanish wants records with the same number of occurrences written to the same file.

Let me put it this way: the AAAAAAAAA record occurred thrice, so any other record (e.g. ZZZZZZZZZ) that is repeated 3 times should go with AAAAAAAAA in one file (call this file OCCUR3).

The VVVVVVVVV and GGGGGGGG records occurred twice, so he wants these records written to one file (call it OCCUR2), and so on.

The record HHHHHHHHH should go to another file (call it OCCUR1).
krisprems

Active Member


Joined: 27 Nov 2006
Posts: 649
Location: India

Posted: Tue Sep 25, 2007 3:25 pm

This DFSORT/ICETOOL job writes the keys that occur once to one file, the keys that occur twice to a second file,
and the keys that occur three times to a third file.
Code:
//*******************************************************               
//STEP001  EXEC PGM=ICETOOL                                             
//TOOLMSG  DD SYSOUT=*                                                 
//DFSMSG   DD SYSOUT=*                                                 
//IN1      DD *                                                         
AAAAAAAAA                                                               
VVVVVVVVV                                                               
GGGGGGGG                                                               
AAAAAAAAA                                                               
HHHHHHHHH                                                               
AAAAAAAAA                                                               
GGGGGGGG                                                               
VVVVVVVVV                                                               
/*                                                                     
//TMP1     DD DSN=&&TEMP1,DISP=(MOD,PASS),SPACE=(TRK,(5,5)),UNIT=SYSDA 
//OCCUR1   DD SYSOUT=*                                                 
//OCCUR2   DD SYSOUT=*                                                 
//OCCUR3   DD SYSOUT=*                                                 
//TOOLIN   DD *                                                         
 OCCUR FROM(IN1) LIST(OCCUR1) EQUAL(1)-                                 
 ON(1,9,CH) ON(VALCNT,N04)                                             
 OCCUR FROM(IN1) LIST(OCCUR2) EQUAL(2)-   
 ON(1,9,CH) ON(VALCNT,N04)                                             
 OCCUR FROM(IN1) LIST(OCCUR3) EQUAL(3)-                                 
 ON(1,9,CH) ON(VALCNT,N04)                                             
/*                                                 

OCCUR1 contains:
Code:
---+----1----+----2----+
(1,9,CH)    VALUE COUNT
HHHHHHHHH             1

OCCUR2 contains:
Code:
---+----1----+----2----+
(1,9,CH)    VALUE COUNT
GGGGGGGG              2
VVVVVVVVV             2

OCCUR3 contains:
Code:
---+----1----+----2----+
(1,9,CH)    VALUE COUNT
AAAAAAAAA             3
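
The three EQUAL(n) lists above amount to grouping the keys by how often they occur. As a rough illustration only, the Python sketch below does that grouping for whatever counts turn up, so it is not limited to 1, 2 and 3. The function name `keys_by_count` and the idea of one list per count are assumptions made for this example; the job itself writes to the fixed DD names OCCUR1, OCCUR2 and OCCUR3.

```python
from collections import Counter, defaultdict

def keys_by_count(records, key_len=9):
    """Map each occurrence count to the sorted list of keys having that count."""
    counts = Counter(rec[:key_len] for rec in records)
    groups = defaultdict(list)
    for key in sorted(counts):
        groups[counts[key]].append(key)
    return dict(groups)

# Sample input from the original question.
sample = ["AAAAAAAAA", "VVVVVVVVV", "GGGGGGGG", "AAAAAAAAA",
          "HHHHHHHHH", "AAAAAAAAA", "GGGGGGGG", "VVVVVVVVV"]

for count, keys in sorted(keys_by_count(sample).items()):
    print(f"OCCUR{count}: {keys}")
```

For the sample data the groups are HHHHHHHHH under 1, GGGGGGGG and VVVVVVVVV under 2, and AAAAAAAAA under 3, which matches the OCCUR1, OCCUR2 and OCCUR3 listings.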


nayanishpatil: To help us help you better, show a sample of how the output should look.
murmohk1: Hope my understanding is correct, after your comment
murmohk1

Senior Member


Joined: 29 Jun 2006
Posts: 1436
Location: Bangalore,India

Posted: Tue Sep 25, 2007 6:02 pm

Kris,

Quote:
Hope my understanding is correct, after your comment

Let's hope I got the requirement right.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

Posted: Tue Sep 25, 2007 6:27 pm

Hello,

Let's say you have understood the requirement. . .

What happens when there is a count other than 1, 2, or 3? As I asked earlier, what happens if there are 900 (ok, that's too many, so let's say 300) different counts? That is too many DD statements for one step.

I'd be interested in how this output would be used; then maybe we can offer more alternatives.
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Tue Sep 25, 2007 9:18 pm

You don't need three passes over the input to do this. You can do it in one pass with a DFSORT job like the following:

Code:

//S1 EXEC PGM=ICEMAN
//SYSOUT  DD SYSOUT=*
//SORTIN DD *
AAAAAAAAA
VVVVVVVVV
GGGGGGGG
AAAAAAAAA
HHHHHHHHH
AAAAAAAAA
GGGGGGGG
VVVVVVVVV
/*
//CT1 DD SYSOUT=*
//CT2 DD SYSOUT=*
//CT3 DD SYSOUT=*
//SYSIN  DD *
  OPTION ZDPRINT
  INREC OVERLAY=(81:C'00000001')
  SORT FIELDS=(1,9,CH,A)
  SUM FIELDS=(81,8,ZD)
  OUTFIL FNAMES=CT1,INCLUDE=(81,8,ZD,EQ,+1),BUILD=(1,80)
  OUTFIL FNAMES=CT2,INCLUDE=(81,8,ZD,EQ,+2),BUILD=(1,80)
  OUTFIL FNAMES=CT3,INCLUDE=(81,8,ZD,EQ,+3),BUILD=(1,80)
/*


CT1 will have:

HHHHHHHHH

CT2 will have:

GGGGGGGG
VVVVVVVVV

CT3 will have:

AAAAAAAAA
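
The one-pass job can be read as four steps: tag each record with a counter of 1, sort on the key, add up the counters of records with equal keys so one record per key survives, then route that record by its total. The Python sketch below mirrors those steps as a rough model for checking expected results. The names `split_by_count` and `wanted` are invented for the illustration. Python's sort is stable, so the surviving record here is the first one read for each key; which duplicate DFSORT keeps is not modelled.

```python
from itertools import groupby

KEY_LEN = 9  # key occupies columns 1-9, as in SORT FIELDS=(1,9,CH,A)

def record_key(pair):
    """Extract the sort key from a (record, counter) pair."""
    return pair[0][:KEY_LEN]

def split_by_count(records, wanted=(1, 2, 3)):
    """Return {count: [records]} for the counts listed in `wanted`."""
    # INREC OVERLAY: tag every record with a counter of 1.
    tagged = [(rec, 1) for rec in records]
    # SORT: order by key so that records with equal keys are adjacent.
    tagged.sort(key=record_key)
    outputs = {n: [] for n in wanted}
    # SUM: collapse each run of equal keys into one record, adding the counters.
    for _, run in groupby(tagged, key=record_key):
        run = list(run)
        total = sum(counter for _, counter in run)
        # OUTFIL INCLUDE: route the surviving record by its total;
        # records whose total is not in `wanted` go nowhere.
        if total in outputs:
            outputs[total].append(run[0][0])
    return outputs

# Sample input from the original question.
sample = ["AAAAAAAAA", "VVVVVVVVV", "GGGGGGGG", "AAAAAAAAA",
          "HHHHHHHHH", "AAAAAAAAA", "GGGGGGGG", "VVVVVVVVV"]

for count, recs in split_by_count(sample).items():
    print(f"CT{count}: {recs}")
```

As in the job, a key whose count is not one of the listed values is written to none of the outputs, so a count of 4 or more would need another output of its own.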
spath12

New User


Joined: 22 Jul 2009
Posts: 2
Location: Gurgaon

Posted: Thu Nov 19, 2009 2:19 pm

I tried the first job to get the duplicate record counts, but the records start in the second column.

=COLS> ----+----1----+----2----+----3----+----4
****** ***************************** Top of Dat
000001 1(1,10,CH) VALUE COUNT
000002 ABCDFFF 000000000000007
000003 KUMAR 000000000000006
000004 AAAAAAA 000000000000007

Please suggest how to get the records to start in the first column, and how to keep the character 1 from appearing before (1,10,CH) in the first line.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

Posted: Thu Nov 19, 2009 9:22 pm

Hello,

You have replied to a topic that has been inactive for over 2 years.

You have also not very clearly described what you want to do. Describe the "rules" for getting from your input to the desired output.

Please post the output you want from the sample input. If a more representative sample is needed, add some more data to show the possible situations. Once the input sample has been built, show the output you want from the sample input.

Also mention the recfm and lrecl of all files.
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Thu Nov 19, 2009 11:08 pm

spath12,

If I understand you correctly, you do not want the ANSI carriage control character that DFSORT's OCCUR usually puts in the output of the report (e.g. '1' in column 1 for page eject).

You can eliminate that character by using the NOCC operand, e.g.


OCCUR NOCC FROM(...

For complete details on the OCCUR operator of DFSORT's ICETOOL, see:

publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/ICE1CA40/6.10?DT=20090527161936
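
To show why the data appeared to start in column 2, here is a small Python sketch that splits a report line into its first-column ANSI carriage control character and the remaining text. It only illustrates the record layout when the report is browsed as a data set; NOCC avoids the character altogether, so no post-processing like this is needed. The function name `split_carriage_control` is invented for the example, and the sample lines are modelled on the output posted above, whose exact spacing is an assumption.

```python
# Standard ANSI carriage control characters and their meaning.
ANSI_CC = {
    "1": "page eject",
    " ": "single space",
    "0": "double space",
    "-": "triple space",
    "+": "no advance (overprint)",
}

def split_carriage_control(line):
    """Return (meaning of the column-1 control character, text from column 2 on)."""
    if not line:
        return (None, line)
    return (ANSI_CC.get(line[0], "unknown"), line[1:])

# Lines as they might look when the report is viewed as data.
report = ["1(1,10,CH)   VALUE COUNT", " ABCDFFF      000000000000007"]

for line in report:
    meaning, text = split_carriage_control(line)
    print(f"{meaning:<12} | {text}")
```

The heading line carries the '1' (page eject) and the detail lines carry a blank (single space), which is why every line looks shifted one column to the right.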
spath12

New User


Joined: 22 Jul 2009
Posts: 2
Location: Gurgaon

Posted: Fri Nov 20, 2009 6:35 pm

Hi Frank,

Yes, you are correct: I wanted to remove the ANSI carriage control character. I was not aware of this DFSORT feature, which is why I was getting the 1 in column 1 for page eject.

After using NOCC I got the desired output.

Thanks a lot.
All times are GMT + 6 Hours