Sort Tricks for matching duplicates in two files


IBM Mainframe Forums -> DFSORT/ICETOOL
Pkelly

New User


Joined: 14 May 2009
Posts: 9
Location: ohio

Posted: Sun May 17, 2009 8:05 pm

I saw the sort trick for matching when there are duplicates in one file, but is there a way to match when there are duplicates in both files?
Pkelly

New User


Joined: 14 May 2009
Posts: 9
Location: ohio

Posted: Sun May 17, 2009 8:17 pm

The match is on account number. The possible cases are:
1. One or more records for account number 123 in file 1 and one record for account number 123 in file 2.
2. One record for account number 123 in file 1 and one or more records for account number 123 in file 2.
3. File 1 could have two records for account number 123 and file 2 could also have two records for account number 123.
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Mon May 18, 2009 8:49 pm

Please show an example of the records in each input file (relevant fields only) and what you expect for output. Give the RECFM and LRECL of each input file. Give the starting position, length and format of each relevant field.
Pkelly

New User


Joined: 14 May 2009
Posts: 9
Location: ohio

Posted: Mon May 18, 2009 11:08 pm

File A: account number starts in column 9 for a length of 20, and branch starts in column 30 for a length of 10.
......ã*00000000061847266000P0000000103
.....Èå.00000000061927327000S0000000103
.....iÑæ00000000070541546000P0000000103
.....ê..00000000070541546000P0000000103

.......<00000000080524644000P0000000103
.....ïê@00000000900778724000P0000000103

Account number 00000000070541546000 is in file A twice and in file B once.

File B: account number starts in column 1 for a length of 20, and branch starts in column 21 for a length of 10.
000000000207625100000000000103
000000000705415460000000000103
000000009000519260000000000103
000000009007158370000000000103
000000009007787240000000000103
000000009007787240000000000103


Both files are fixed block with a record length of 120.
Account number 00000000070541546000 is in file A twice and in file B once.
Account number 00000000900778724000 is in file A once and in file B twice.
An account number can also be in both file A and file B twice.

If there is a match, I need to add the first 8 bytes from the file A record to the front of the file B record and write out the file B record.

So when file A has 2 records and file B has 1, the output would have 2 records, each with the first 8 bytes from a file A record and the rest from the file B record.

When file B has 2 records and file A has 1, both file B records would get the first 8 bytes of the file A record.
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Mon May 18, 2009 11:18 pm

Quote:
An account number can also be in both file A and file B twice.


Show this variation in your input example, and then show the expected output records.
Pkelly

New User


Joined: 14 May 2009
Posts: 9
Location: ohio

Posted: Mon May 18, 2009 11:38 pm

File A: account number starts in column 9 for a length of 20, and branch starts in column 30 for a length of 10.
......ã*00000000061847266000P0000000103
.....Èå.00000000061927327000S0000000103
.....iÑæ00000000070541546000P0000000103
.....ê..00000000070541546000P0000000103

.......<00000000080524644000P0000000103
.....ïê@00000000900778724000P0000000103
.....lÑ*00000000914120724000P0000000103
......g%00000000914120724000P0000000103


Account number 00000000070541546000 is in file A twice and in file B once.

File B: account number starts in column 1 for a length of 20, and branch starts in column 21 for a length of 10.
000000000207625100000000000103
000000000705415460000000000103
000000009000519260000000000103
000000009007158370000000000103
000000009007787240000000000103
000000009007787240000000000103

000000009141205790000000000103
000000009141206290000000000103



Both files are fixed block with a record length of 120.
Account number 00000000070541546000 is in file A twice and in file B once.
Account number 00000000900778724000 is in file A once and in file B twice.
An account number can also be in both file A and file B twice.

The output would be:

When file A has 2 records and file B has 1, the output would have 2 records, each with the first 8 bytes from a file A record and the rest from the file B record:
.....iÑæ000000000705415460000000000103 7916695328
.....ê..000000000705415460000000000103 7916696329

When file B has 2 records and file A has 1, both file B records would get the first 8 bytes of the file A record:

.....ïê@000000009007787240000000000103 7922050963
.....ïê@000000009007787240000000000103 5550021039

When an account is in both files twice:
.....lÑ*000000009141207240000000000103 7925401098
.....lÑ*000000009141207240000000000103 7925401099
......g%000000009141207240000000000103 7925401098
......g%000000009141207240000000000103 7925401099


The other case is one-to-one:
......ç.0000000914145893000P0000000103
000000009141458930000000000103
Output

......ç.00000009141458930000000000103290957761
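
For readers who want to check expected output outside of DFSORT, the matching rule described in this post can be sketched in Python. This is only an illustration under stated assumptions: the function name `match_files` is hypothetical, records are treated as text strings rather than real 120-byte fixed-block records with binary data, and the key is taken to be account plus branch, using the column positions given above.

```python
from collections import defaultdict


def match_files(file_a, file_b):
    """Pair every file A record with every file B record that has the same key.

    Each output record is the first 8 bytes of the file A record followed by
    the whole file B record. Unmatched records from either file are dropped.
    """
    # File B key: account (columns 1-20) + branch (columns 21-30).
    b_by_key = defaultdict(list)
    for rec in file_b:
        b_by_key[rec[:30]].append(rec)

    out = []
    for rec in file_a:
        # File A key: account (columns 9-28) + branch (columns 30-39).
        key = rec[8:28] + rec[29:39]
        for b_rec in b_by_key.get(key, []):
            out.append(rec[:8] + b_rec)
    return out
```

With two file A records and two file B records for the same key, this produces four output records, grouped by file A record as in the "both files twice" example above.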
Skolusu

Senior Member


Joined: 07 Dec 2007
Posts: 2205
Location: San Jose

Posted: Tue May 19, 2009 5:27 am

Pkelly,

The following DFSORT/ICETOOL JCL will give you the desired results. I assumed a maximum of 2 duplicates in file A (as mentioned in your requirements). The output consists only of records that have a match in both files. The final output file LRECL is 128 (8 bytes from file A + 120 bytes from file B).

Code:

//STEP0100 EXEC PGM=ICETOOL   
//TOOLMSG  DD SYSOUT=*         
//DFSMSG   DD SYSOUT=*         
//I1       DD DSN=your 120 byte file A,DISP=SHR
//I2       DD DSN=your 120 byte file B,DISP=SHR
//T1       DD DSN=&&T1,DISP=(MOD,PASS),SPACE=(CYL,(X,Y),RLSE)       
//OUT      DD SYSOUT=*                                               
//TOOLIN   DD *                                                     
  SORT FROM(I1) USING(CTL1)                                         
  COPY FROM(I2) USING(CTL2)                                         
  SORT FROM(T1) USING(CTL3)                                         
//CTL1CNTL DD *                                                     
* Rebuild each file A record as key (account + branch) in 1-30
* and its first 8 bytes in 31-38. Then write one 138-byte summary
* record per key to T1: '1' in 121, the 8-byte values in 122-137
* and the record count (at most 2) in 138.
  INREC BUILD=(9,20,30,10,1,8)                                       
  SORT FIELDS=(1,30,CH,A)                                           
  OUTREC IFTHEN=(WHEN=INIT,OVERLAY=(39:SEQNUM,1,ZD,RESTART=(1,30))),
  IFTHEN=(WHEN=GROUP,BEGIN=(39,1,ZD,EQ,1),PUSH=(40:31,8),RECORDS=2) 
  OUTFIL FNAMES=T1,REMOVECC,NODETAIL,                               
  INCLUDE=(39,1,ZD,LE,2),BUILD=(138X),                               
  SECTIONS=(1,30,                                                   
  TRAILER3=(1,30,121:'1',31,8,40,8,COUNT=(M11,LENGTH=1)))           
//CTL2CNTL DD *                                                     
* Append the file B records to T1 with '2' in 121 and blanks
* in 122-138.
  OUTFIL FNAMES=T1,OVERLAY=(121:C'2',17X)                           
//CTL3CNTL DD *                                                     
* Sort so each file A summary precedes the file B records with
* the same key, push the summary fields onto those records, and
* build one or two 128-byte output records per file B record.
  SORT FIELDS=(1,30,CH,A),EQUALS                                     
  OUTREC IFTHEN=(WHEN=GROUP,BEGIN=(121,1,CH,EQ,C'1'),               
  PUSH=(122:122,17,1,30))                                           
  OUTFIL FNAMES=OUT,IFOUTLEN=128,                                   
  OMIT=(121,1,ZD,EQ,1,OR,122,17,CH,EQ,C' ',OR,1,30,CH,NE,139,30,CH),
  IFTHEN=(WHEN=(138,1,ZD,EQ,1),BUILD=(122,8,1,120)),                 
  IFTHEN=(WHEN=(138,1,ZD,EQ,2),BUILD=(122,8,1,120,/,130,8,1,120))   
/*
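
The idea behind the three passes above can be modeled in a few lines of Python for anyone who wants to reason about it off the mainframe. This is a rough sketch, not a translation of the job: the names `key_a`, `tag_and_merge` and `max_a_dups` are hypothetical, records are plain strings, and the sketch does not try to reproduce the byte layout of T1 or the exact order of the output records.

```python
def key_a(rec):
    # File A key: account (columns 9-28) + branch (columns 30-39).
    return rec[8:28] + rec[29:39]


def tag_and_merge(file_a, file_b, max_a_dups=2):
    # Pass 1 (like CTL1): one summary per key, keeping at most
    # max_a_dups 8-byte prefixes from file A.
    summaries = {}
    for rec in file_a:
        prefixes = summaries.setdefault(key_a(rec), [])
        if len(prefixes) < max_a_dups:
            prefixes.append(rec[:8])

    # Pass 2 (like CTL2): put summaries (tag '1') and file B records
    # (tag '2') in one list, summaries first.
    tagged = [(key, "1", prefixes) for key, prefixes in summaries.items()]
    tagged += [(rec[:30], "2", rec) for rec in file_b]

    # Pass 3 (like CTL3): stable sort on the key only, so a summary stays
    # ahead of the file B records that share its key.
    tagged.sort(key=lambda item: item[0])

    out = []
    group_key, group_prefixes = None, []
    for key, tag, payload in tagged:
        if tag == "1":
            # A summary starts a new group; remember its key and prefixes.
            group_key, group_prefixes = key, payload
        elif key == group_key:
            # A file B record in the current group: one output per prefix.
            out.extend(prefix + payload for prefix in group_prefixes)
        # File B records whose key differs from the current group are dropped.
    return out
```

In this model, a third duplicate in file A is silently ignored, which corresponds to the two-duplicate assumption stated for the job.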
Pkelly

New User


Joined: 14 May 2009
Posts: 9
Location: ohio

Posted: Tue May 19, 2009 4:06 pm

After running another set of data through the conversion, I found an account like this:

File A:
......ï%00000000911145960000P0000000103

File B:
000000009111459600000000000103 19706
000000009111459600000000000103 1010188084199
000000009111459600000000000103 19706
000000009111459600000000000103 9342549164
000000009111459600000000000103 3735727505
000000009111459600000000000103 6876011609
000000009111459600000000000103 7916155520
000000009111459600000000000103 791615552
000000009111459600000000000103 799689013

Results should be:
......ï%000000009111459600000000000103 19706
......ï%000000009111459600000000000103 1010188084199
......ï%000000009111459600000000000103 19706
......ï%000000009111459600000000000103 9342549164
......ï%000000009111459600000000000103 3735727505
......ï%000000009111459600000000000103 6876011609
......ï%000000009111459600000000000103 7916155520
......ï%000000009111459600000000000103 791615552
......ï%000000009111459600000000000103 799689013

Thank you
Skolusu

Senior Member


Joined: 07 Dec 2007
Posts: 2205
Location: San Jose

Posted: Tue May 19, 2009 9:19 pm

Pkelly,

Did you run the job I provided? What is so special about the data you presented? The job as coded handles a maximum of 2 duplicates in file A; for file B there is no such restriction, so it can have any number of duplicates.

Just for the record, the above job does give you the desired results.
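
Since the job depends on file A having no more than 2 records per key, it may be worth confirming that on a sample of the data first. A minimal Python sketch, assuming records are available as strings and using the hypothetical name `max_duplicates_in_file_a`:

```python
from collections import Counter


def max_duplicates_in_file_a(file_a):
    """Return the largest number of file A records sharing one key."""
    # File A key: account (columns 9-28) + branch (columns 30-39).
    counts = Counter(rec[8:28] + rec[29:39] for rec in file_a)
    return max(counts.values(), default=0)
```

A result greater than 2 would mean some file A records fall outside what the job was written to handle.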
Pkelly

New User


Joined: 14 May 2009
Posts: 9
Location: ohio

Posted: Tue May 19, 2009 9:32 pm

Thanks. I have not run it yet, but I am setting it up to run this afternoon. Thanks for your help.
All times are GMT + 6 Hours