Not losing all duplicates with SUM FIELDS=NONE


IBM Mainframe Forums -> DFSORT/ICETOOL
Div Grad

New User


Joined: 08 Apr 2005
Posts: 45

Posted: Wed Aug 01, 2007 1:38 am

I have two files, each of which contains at most one occurrence of any given key. I want to merge them and end up with only the records whose key occurs exactly once across both files (i.e. remove every record that has a duplicate key). I did my sort and a lot of the duplicates were removed, but I've ended up with cases where I know there was a duplicate key but only one of the records was dropped.

Here's a record in the first file:
Code:

----+----1----+----2----+----3----+----4----+----5----+----6----+
               0005583460001455123         BUSINESS             
             


and here's the second file:
Code:

 Command ===>                                         
----+----1----+----2----+----3----+----4----+----5----
4113008124495000005583460001455123 NCH     BUSINESS   
   


Now I was expecting the records shown to disappear since the keys match. But this is what I ended up with:

Code:

 Command ===>                                       
----+----1----+----2----+----3----+----4----+----5---
4113008124495000005583460001455123 NCH     BUSINESS 
 


Only one of the two records with that key was dropped, instead of both. Below is what I submitted. Any ideas on what went wrong here? There are about 8,000,000 records per file; could this be related to not giving the sort enough work space?

Code:

********************************* TOP OF DATA **********************************
ICE143I 0 BLOCKSET     SORT  TECHNIQUE SELECTED                                 
ICE250I 0 VISIT http://www.ibm.com/storage/dfsort FOR DFSORT PAPERS, EXAMPLES AN
ICE000I 1 - CONTROL STATEMENTS FOR 5694-A01, Z/OS DFSORT V1R5 - 15:39 ON TUE JUL
            SORT    FIELDS=(16,19,CH,A,44,10,CH,A)                             
            SUM     FIELDS=NONE                                                 
ICE201I E RECORD TYPE IS F - DATA STARTS IN POSITION 1                         
ICE751I 0 C5-K21008 C6-K90007 C7-K90000 C8-K90007 E9-K90007 C9-BASE   E5-K21514
ICE193I 0 ICEAM1 ENVIRONMENT IN EFFECT - ICEAM1 INSTALLATION MODULE SELECTED   
ICE088I 1 RJMSORTI.SORT01  .        , INPUT LRECL = 63, BLKSIZE = 27972, TYPE =
ICE093I 0 MAIN STORAGE = (MAX,12420848,12420848)                               
ICE156I 0 MAIN STORAGE ABOVE 16MB = (12290872,12290872)                         
ICE127I 0 OPTIONS: OVFLO=RC0 ,PAD=RC0 ,TRUNC=RC0 ,SPANINC=RC16,VLSCMP=N,SZERO=Y,
ICE128I 0 OPTIONS: SIZE=12420848,MAXLIM=1048576,MINLIM=450560,EQUALS=Y,LIST=Y,ER
ICE129I 0 OPTIONS: VIO=Y,RESDNT=ALL ,SMF=FULL ,WRKSEC=Y,OUTSEC=Y,VERIFY=N,CHALT=
ICE130I 0 OPTIONS: RESALL=4096,RESINV=0,SVC=109 ,CHECK=Y,WRKREL=Y,OUTREL=Y,CKPT=
ICE131I 0 OPTIONS: TMAXLIM=6291456,ARESALL=0,ARESINV=0,OVERRGN=65536,CINV=Y,CFW=
ICE132I 0 OPTIONS: VLSHRT=N,ZDPRINT=Y,IEXIT=Y,TEXIT=N,LISTX=N,EFS=NONE    ,EXITC
ICE133I 0 OPTIONS: HIPRMAX=3901   ,DSPSIZE=MAX ,ODMAXBF=0,SOLRF=N,VLLONG=N,VSAMI
ICE235I 0 OPTIONS: NULLOUT=RC0                                                 
ICE084I 0 EXCP ACCESS METHOD USED FOR SORTOUT                                   
ICE084I 0 EXCP ACCESS METHOD USED FOR SORTIN                                   
ICE750I 0 DC 985341672 TC 0 CS DSVUU KSZ 33 VSZ 33                             
ICE752I 0 FSZ=15640344 RC  IGN=0 E  AVG=68 0  WSP=1381360 C  DYN=0 0           
ICE751I 1 DE-K10929 D5-K05352 D3-K10929 D7-Q91626 E8-K21008                     
ICE090I 0 OUTPUT LRECL = 63, BLKSIZE = 27972, TYPE = FB                         
ICE055I 0 INSERT 0, DELETE 7796071                                             
ICE054I 0 RECORDS - IN: 15639609, OUT: 7843538                         
ICE134I 0 NUMBER OF BYTES SORTED: 985295367                             
ICE165I 0 TOTAL WORK DATA SET TRACKS ALLOCATED: 9000 , TRACKS USED: 0   
ICE199I 0 MEMORY OBJECT STORAGE USED = 0M BYTES                         
ICE180I 0 HIPERSPACE STORAGE USED = 1047572K BYTES                     
ICE188I 0 DATA SPACE STORAGE USED = 0K BYTES                           
ICE052I 0 END OF DFSORT                                                 
******************************** BOTTOM OF DATA ************************
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Wed Aug 01, 2007 3:39 am

Quote:
I've ended up with cases where I know there was a duplicate key but only one of the records was dropped.


That's exactly what SUM FIELDS=NONE does. See:

publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/ICE1CA20/3.17?DT=20060615185603

If you want to eliminate all duplicate records and only keep the unique records, you need to use a DFSORT/ICETOOL job like this:

Code:

//S1      EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG  DD SYSOUT=*
//IN      DD DSN=...  input file1
//        DD DSN=...  input file2
//OUT     DD DSN=...  output file
//TOOLIN  DD *
SELECT FROM(IN) TO(OUT) ON(16,19,CH) ON(44,10,CH) NODUPS
/*
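
When checking a result like this against millions of records, it can help to keep the dropped records instead of throwing them away. The variation below is only a sketch: it assumes a DFSORT level that supports the DISCARD operand of SELECT, the DUPS ddname is a hypothetical name chosen for illustration, and the DSN values are left elided as in the job above. Records whose key occurs exactly once go to OUT, and every record whose key occurs more than once goes to DUPS.

```jcl
//S1      EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG  DD SYSOUT=*
//IN      DD DSN=...  input file1
//        DD DSN=...  input file2
//OUT     DD DSN=...  output file, keys that occur once
//* DUPS is a hypothetical ddname for the records SELECT drops
//DUPS    DD DSN=...  discard file, keys that occur more than once
//TOOLIN  DD *
* Assumes the DISCARD operand is available at your DFSORT level
SELECT FROM(IN) TO(OUT) ON(16,19,CH) ON(44,10,CH) NODUPS DISCARD(DUPS)
/*
```

Because every input record goes to either OUT or DUPS, the two output record counts should add up to the input count, which gives a quick sanity check on the result.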


If you're not familiar with DFSORT and DFSORT's ICETOOL, I'd suggest reading through "z/OS DFSORT: Getting Started". It's an excellent tutorial, with lots of examples, that will show you how to use DFSORT, DFSORT's ICETOOL and DFSORT Symbols. It is available online, along with all of the other DFSORT books.
Div Grad


Posted: Wed Aug 01, 2007 5:53 am

Frank - I win the dummy award. Nine times out of ten I remember that SUM FIELDS=NONE means collapse down to one record per key. This time I got it into my head that it meant eliminate all records whose key occurs more than once. Oops.

What I'd ideally like is:

- keep only the records whose key occurs more than once (or a specified number of times)
- in a pinch, I could also work with keeping exactly one record for each key that occurs more than once (I already know how to do this another way).

To do the first could I mod your example to use 'DUPS' instead of 'NODUPS'? I'll start also looking at the manuals you suggest.
Frank Yaeger


Posted: Wed Aug 01, 2007 8:23 pm

Quote:
To do the first could I mod your example to use 'DUPS' instead of 'NODUPS'? I'll start also looking at the manuals you suggest.


You can use ALLDUPS to keep all duplicates.

You can also use EQUAL(x), HIGHER(x) or LOWER(x) as appropriate.

For complete details on DFSORT/ICETOOL's SELECT operator and all of its parameters, see:

publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/ICE1CA20/6.11?SHELF=EZ2ZO10I&DT=20060615185603
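
As a sketch of how these operands might be applied to the two requests in the question, the job below runs several SELECT statements against the same input. The output ddnames ALLD, TWICE, MANY and ONEDUP are hypothetical names chosen for illustration, the DSN values are left elided as in the earlier job, and FIRSTDUP (not mentioned in the reply above) is assumed to be available at the installed DFSORT level.

```jcl
//S2      EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG  DD SYSOUT=*
//IN      DD DSN=...  input file1
//        DD DSN=...  input file2
//* The four output ddnames below are hypothetical
//ALLD    DD DSN=...  all records with a duplicated key
//TWICE   DD DSN=...  keys that occur exactly twice
//MANY    DD DSN=...  keys that occur more than twice
//ONEDUP  DD DSN=...  one record per duplicated key
//TOOLIN  DD *
* Keep every record whose key occurs more than once
SELECT FROM(IN) TO(ALLD) ON(16,19,CH) ON(44,10,CH) ALLDUPS
* Keep records whose key occurs exactly 2 times
SELECT FROM(IN) TO(TWICE) ON(16,19,CH) ON(44,10,CH) EQUAL(2)
* Keep records whose key occurs more than 2 times
SELECT FROM(IN) TO(MANY) ON(16,19,CH) ON(44,10,CH) HIGHER(2)
* Keep only the first record of each duplicated key
SELECT FROM(IN) TO(ONEDUP) ON(16,19,CH) ON(44,10,CH) FIRSTDUP
/*
```

With at most one occurrence of a key per input file, a key can occur at most twice overall, so ALLDUPS and EQUAL(2) would select the same records here and HIGHER(2) would select none. The extra statements are included only to show the form of each operand.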
Div Grad


Posted: Wed Aug 01, 2007 9:26 pm

Wow, that did it. Thanks for the quick help.

I've bookmarked the manual also; I'll be poring through that in my spare time.
All times are GMT + 6 Hours