Count number of duplicate records

 
nayanishpatil

New User


Joined: 16 Aug 2007
Posts: 14
Location: INDIA

Posted: Mon Sep 24, 2007 11:56 pm    Post subject: Count number of duplicate records

Hi,

Is there a way to count the number of duplicate records in an input file?

For example, the input file has the following records:
AAAAAAAAA
VVVVVVVVV
GGGGGGGG
AAAAAAAAA
HHHHHHHHH
AAAAAAAAA
GGGGGGGG
VVVVVVVVV

As we can see, the AAAAAAAAA record occurs 3 times,
the VVVVVVVVV record occurs 2 times, and the GGGGGGGG record occurs 2 times.

dick scherrer

Site Director


Joined: 23 Nov 2006
Posts: 19270
Location: Inside the Matrix

Posted: Tue Sep 25, 2007 12:00 am

Hello,

Please post what you want your output to look like.
nayanishpatil

New User


Joined: 16 Aug 2007
Posts: 14
Location: INDIA

Posted: Tue Sep 25, 2007 12:23 am

If the input record AAAAAAAAA occurs 3 times, then the AAAAAAAAA record should be written to one output file.

The VVVVVVVVV record occurs 2 times, and the GGGGGGGG record also occurs 2 times, so the VVVVVVVVV and GGGGGGGG records must be written to another file.

Similarly, the HHHHHHHHH record must be written to a separate file.
CICS Guy

Senior Member


Joined: 18 Jul 2007
Posts: 2150
Location: At my coffee table

Posted: Tue Sep 25, 2007 1:13 am

Wow, where you got "Count number of duplicate records", I just do not understand....

What happens if XXXXXXXXX occurs 102 times?
Craq Giegerich

Senior Member


Joined: 19 May 2007
Posts: 1512
Location: Virginia, USA

Posted: Tue Sep 25, 2007 1:19 am

I suppose you want to do this without sorting the input file.
dick scherrer

Site Director


Joined: 23 Nov 2006
Posts: 19270
Location: Inside the Matrix

Posted: Tue Sep 25, 2007 2:14 am

Hello,

Quote:
If the input record AAAAAAAAA occurs 3 times, then the AAAAAAAAA record should be written to one output file.

The VVVVVVVVV record occurs 2 times, and the GGGGGGGG record also occurs 2 times, so the VVVVVVVVV and GGGGGGGG records must be written to another file.

Similarly, the HHHHHHHHH record must be written to a separate file.


Are you saying you want to create a separate file for each unique "count" of the key value and all of the records having that "count" go into the same file?

If there are 900 different "counts" should 900 separate output files be created?

If you explain what you are trying to accomplish, we may be able to offer suggestions.
krisprems

Active Member


Joined: 27 Nov 2006
Posts: 649
Location: India

Posted: Tue Sep 25, 2007 1:22 pm

nayanishpatil,

This DFSORT/ICETOOL job counts the occurrences of each key that has duplicate records.
Code:
//*******************************************************               
//STEP001  EXEC PGM=ICETOOL                                             
//TOOLMSG  DD SYSOUT=*                                                 
//DFSMSG   DD SYSOUT=*                                                 
//IN1      DD *                                                         
AAAAAAAAA                                                               
VVVVVVVVV                                                               
GGGGGGGG                                                               
AAAAAAAAA                                                               
HHHHHHHHH                                                               
AAAAAAAAA                                                               
GGGGGGGG                                                               
VVVVVVVVV                                                               
/*                                                                     
//TMP1     DD DSN=&&TEMP1,DISP=(MOD,PASS),SPACE=(TRK,(5,5)),UNIT=SYSDA 
//OUT      DD SYSOUT=*                                                 
//TOOLIN   DD *                                                         
 OCCUR FROM(IN1) LIST(TMP1)-                                           
 ON(1,9,CH) ON(VALCNT)                                                 
 COPY FROM(TMP1) TO(OUT) USING(CP01)                                   
/*                                                                     
//CP01CNTL DD *                                                         
  INCLUDE COND=(25,15,ZD,GT,1)                                         
/*                                                                     

OUT contains:
Code:
---+----1----+----2----+----3----+----4
(1,9,CH)               VALUE COUNT     
AAAAAAAAA              000000000000003
GGGGGGGG               000000000000002
VVVVVVVVV              000000000000002
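
The counting rule in the job above can be sketched outside DFSORT. The Python below is only an illustration, not ICETOOL syntax: `duplicate_counts` and `records` are hypothetical names, the key is assumed to be the first 9 characters (as in ON(1,9,CH)), and Python's string ordering stands in for the EBCDIC collating sequence, which gives the same order for this all-uppercase sample.

```python
from collections import Counter


def duplicate_counts(records, key_len=9):
    """Return (key, count) pairs for keys that occur more than once, in key order."""
    # Tally each key; trailing blanks are dropped so short keys compare cleanly.
    counts = Counter(rec[:key_len].rstrip() for rec in records)
    # Keep only counts greater than 1, mirroring INCLUDE COND=(...,GT,1).
    return [(key, n) for key, n in sorted(counts.items()) if n > 1]


records = [
    "AAAAAAAAA", "VVVVVVVVV", "GGGGGGGG", "AAAAAAAAA",
    "HHHHHHHHH", "AAAAAAAAA", "GGGGGGGG", "VVVVVVVVV",
]

for key, n in duplicate_counts(records):
    print(f"{key:<9} {n}")
```

For the sample input this yields AAAAAAAAA with 3, GGGGGGGG with 2, and VVVVVVVVV with 2; HHHHHHHHH is left out because it occurs only once.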
murmohk1

Senior Member


Joined: 29 Jun 2006
Posts: 1439
Location: Bangalore,India

Posted: Tue Sep 25, 2007 2:34 pm

krisprems,

If I understood the original post properly, Nayanish wants to write the records to separate files depending on their number of occurrences.

Let me put it this way: the AAAAAAAAA record occurred three times, so any other record (e.g. ZZZZZZZZZ) that is repeated 3 times should go with AAAAAAAAA in one file (call this file OCCUR3).

The VVVVVVVVV and GGGGGGGG records each occurred twice, so he wants to write these records to one file (call it OCCUR2), and so on.

The record HHHHHHHHH should go to another file (say OCCUR1).
krisprems

Active Member


Joined: 27 Nov 2006
Posts: 649
Location: India

Posted: Tue Sep 25, 2007 3:25 pm

This DFSORT/ICETOOL JCL writes the records whose key occurs once to one file,
those whose key occurs twice to a second file, and those whose key occurs three times to a third file.
Code:
//*******************************************************               
//STEP001  EXEC PGM=ICETOOL                                             
//TOOLMSG  DD SYSOUT=*                                                 
//DFSMSG   DD SYSOUT=*                                                 
//IN1      DD *                                                         
AAAAAAAAA                                                               
VVVVVVVVV                                                               
GGGGGGGG                                                               
AAAAAAAAA                                                               
HHHHHHHHH                                                               
AAAAAAAAA                                                               
GGGGGGGG                                                               
VVVVVVVVV                                                               
/*                                                                     
//TMP1     DD DSN=&&TEMP1,DISP=(MOD,PASS),SPACE=(TRK,(5,5)),UNIT=SYSDA 
//OCCUR1   DD SYSOUT=*                                                 
//OCCUR2   DD SYSOUT=*                                                 
//OCCUR3   DD SYSOUT=*                                                 
//TOOLIN   DD *                                                         
 OCCUR FROM(IN1) LIST(OCCUR1) EQUAL(1)-                                 
 ON(1,9,CH) ON(VALCNT,N04)                                             
 OCCUR FROM(IN1) LIST(OCCUR2) EQUAL(2)-   
 ON(1,9,CH) ON(VALCNT,N04)                                             
 OCCUR FROM(IN1) LIST(OCCUR3) EQUAL(3)-                                 
 ON(1,9,CH) ON(VALCNT,N04)                                             
/*                                                 

OCCUR1 contains:
Code:
---+----1----+----2----+
(1,9,CH)    VALUE COUNT
HHHHHHHHH             1

OCCUR2 contains:
Code:
---+----1----+----2----+
(1,9,CH)    VALUE COUNT
GGGGGGGG              2
VVVVVVVVV             2

OCCUR3 contains:
Code:
---+----1----+----2----+
(1,9,CH)    VALUE COUNT
AAAAAAAAA             3


nayanishpatil: To help you better, please show a sample of how the output should look.
murmohk1: Hope my understanding is correct, after your comment
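
The three OCCUR statements above each select the keys with one specific count (EQUAL(1), EQUAL(2), EQUAL(3)). As a rough model of that selection rule, and not as DFSORT syntax, the Python sketch below groups keys by their occurrence count. `keys_by_count` and `sample` are hypothetical names, and the 9-character key is an assumption taken from ON(1,9,CH).

```python
from collections import Counter, defaultdict


def keys_by_count(records, key_len=9):
    """Map each occurrence count to the sorted list of keys having that count."""
    counts = Counter(rec[:key_len].rstrip() for rec in records)
    buckets = defaultdict(list)
    # Walk the keys in sorted order so each bucket comes out sorted.
    for key in sorted(counts):
        buckets[counts[key]].append(key)
    return dict(buckets)


sample = [
    "AAAAAAAAA", "VVVVVVVVV", "GGGGGGGG", "AAAAAAAAA",
    "HHHHHHHHH", "AAAAAAAAA", "GGGGGGGG", "VVVVVVVVV",
]

buckets = keys_by_count(sample)
for n in sorted(buckets):
    print(f"OCCUR{n}: {buckets[n]}")
```

Unlike the job, which names one DD per count, the dictionary grows a bucket for whatever counts actually appear in the input.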
murmohk1

Senior Member


Joined: 29 Jun 2006
Posts: 1439
Location: Bangalore,India

Posted: Tue Sep 25, 2007 6:02 pm

Kris,

Quote:
Hope my understanding is correct, after your comment

Let's hope I got the requirement right.
dick scherrer

Site Director


Joined: 23 Nov 2006
Posts: 19270
Location: Inside the Matrix

Posted: Tue Sep 25, 2007 6:27 pm

Hello,

Let's say you have understood the requirement. . .

What happens when there is a count other than 1, 2, or 3? As I asked earlier, what happens if there are 900 (ok, that's too many, so let's say 300) different counts? That is too many DD statements for one step.

I'd be interested in how this output would be used, and maybe we can offer more alternatives.
Frank Yaeger

DFSORT Moderator


Joined: 15 Feb 2005
Posts: 7130
Location: San Jose, CA

Posted: Tue Sep 25, 2007 9:18 pm

You don't need three passes over the input to do this. You can do it in one pass with a DFSORT job like the following:

Code:

//S1 EXEC PGM=ICEMAN
//SYSOUT  DD SYSOUT=*
//SORTIN DD *
AAAAAAAAA
VVVVVVVVV
GGGGGGGG
AAAAAAAAA
HHHHHHHHH
AAAAAAAAA
GGGGGGGG
VVVVVVVVV
/*
//CT1 DD SYSOUT=*
//CT2 DD SYSOUT=*
//CT3 DD SYSOUT=*
//SYSIN  DD *
  OPTION ZDPRINT
  INREC OVERLAY=(81:C'00000001')
  SORT FIELDS=(1,9,CH,A)
  SUM FIELDS=(81,8,ZD)
  OUTFIL FNAMES=CT1,INCLUDE=(81,8,ZD,EQ,+1),BUILD=(1,80)
  OUTFIL FNAMES=CT2,INCLUDE=(81,8,ZD,EQ,+2),BUILD=(1,80)
  OUTFIL FNAMES=CT3,INCLUDE=(81,8,ZD,EQ,+3),BUILD=(1,80)
/*


CT1 will have:

HHHHHHHHH

CT2 will have:

GGGGGGGG
VVVVVVVVV

CT3 will have:

AAAAAAAAA
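
For readers less familiar with the INREC/SORT/SUM/OUTFIL combination, the data flow of the job above can be modelled in a few lines of Python. This is a sketch under stated assumptions, not DFSORT code: `one_pass_split`, `wanted`, and `sample` are hypothetical names, the key is the first 9 characters, and the kept record is simply the first of each sorted group.

```python
from itertools import groupby


def one_pass_split(records, wanted=(1, 2, 3), key_len=9):
    """Sort once, total each key's occurrences, and route one record per key by total."""
    outputs = {n: [] for n in wanted}

    def key_of(rec):
        return rec[:key_len]

    # SORT FIELDS=(1,9,CH,A): bring records with equal keys together.
    ordered = sorted(records, key=key_of)
    # INREC appends a 1 to every record and SUM adds those 1s per key,
    # which is the same as counting the members of each key group.
    for _, group in groupby(ordered, key=key_of):
        members = list(group)
        total = len(members)
        # OUTFIL INCLUDE: a total with no matching output is written nowhere.
        if total in outputs:
            outputs[total].append(members[0])
    return outputs


sample = [
    "AAAAAAAAA", "VVVVVVVVV", "GGGGGGGG", "AAAAAAAAA",
    "HHHHHHHHH", "AAAAAAAAA", "GGGGGGGG", "VVVVVVVVV",
]

for total, recs in one_pass_split(sample).items():
    print(f"CT{total}: {recs}")
```

The model also makes one limitation visible: a key whose total is not one of the values tested (here 1, 2, or 3) ends up in none of the outputs.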
spath12

New User


Joined: 22 Jul 2009
Posts: 2
Location: Gurgaon

Posted: Thu Nov 19, 2009 2:19 pm    Post subject: Reply to: Count number of duplicate records

I tried the first job to get the duplicate record counts, but the records start in the second column.

=COLS> ----+----1----+----2----+----3----+----4
****** ***************************** Top of Dat
000001 1(1,10,CH) VALUE COUNT
000002 ABCDFFF 000000000000007
000003 KUMAR 000000000000006
000004 AAAAAAA 000000000000007

Please suggest how to get the records to start in the first column, and how to keep the character 1 from appearing before (1,10,CH) in the first line.
dick scherrer

Site Director


Joined: 23 Nov 2006
Posts: 19270
Location: Inside the Matrix

Posted: Thu Nov 19, 2009 9:22 pm

Hello,

You have replied to a topic that has been inactive for over 2 years.

You have also not very clearly described what you want to do. Describe the "rules" for getting from your input to the desired output.

Please post the output you want from the sample input. If a more representative sample is needed, add some more data to show the possible situations. Once the input sample has been built, show the output you want from the sample input.

Also mention the recfm and lrecl of all files.
Frank Yaeger

DFSORT Moderator


Joined: 15 Feb 2005
Posts: 7130
Location: San Jose, CA

Posted: Thu Nov 19, 2009 11:08 pm

spath12,

If I understand you correctly, you do not want the ANSI carriage control character that DFSORT's OCCUR usually puts in the output of the report (e.g. '1' in column 1 for page eject).

You can eliminate that character by using the NOCC operand, e.g.


OCCUR NOCC FROM(...

For complete details on the OCCUR operator of DFSORT's ICETOOL, see:

http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/ICE1CA40/6.10?DT=20090527161936
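
To see what the carriage control character is doing to the layout, the small Python sketch below removes it after the fact. It is an illustration only, and NOCC remains the direct way to avoid the character. `strip_carriage_control` and `report` are hypothetical, the column spacing in the sample lines is made up, and the function assumes every line starts with a one-character control code (for example '1' for page eject or a blank for single spacing).

```python
def strip_carriage_control(lines):
    """Drop the first character of every report line, assuming it is a control code."""
    return [line[1:] for line in lines]


report = [
    "1(1,10,CH)   VALUE COUNT",       # '1' = page eject before the heading
    " ABCDFFF     000000000000007",   # blank = single spacing
]

for line in strip_carriage_control(report):
    print(line)
```

After stripping, the heading and the data both start in column 1.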
spath12

New User


Joined: 22 Jul 2009
Posts: 2
Location: Gurgaon

Posted: Fri Nov 20, 2009 6:35 pm

Hi Frank,

Yes, you are correct: I wanted to remove the ANSI carriage control character. I was not aware of this feature of DFSORT, so I was getting the 1 in column 1 for page eject.

After using NOCC I got the desired output.

Thanks a lot
All times are GMT + 6 Hours