Sort Tricks for matching duplicates in two files

Pkelly · New User Joined: 14 May 2009 Posts: 9 Location: ohio

I saw the sort trick for matching when there are dups in one file, but is there a way to match when there are dups in both files?

Pkelly · New User Joined: 14 May 2009 Posts: 9 Location: ohio

It matches on account number. The cases are:
1. 1 or more of account number 123 in file 1 and 1 of account number 123 in file 2
2. 1 of account number 123 in file 1 and 1 or more of account number 123 in file 2
3. File 1 could have 2 of account number 123 and file 2 could have 2 of account number 123

Frank Yaeger · Posted: Mon May 18, 2009 8:49 pm

Please show an example of the records in each input file (relevant fields only) and what you expect for output. Give the RECFM and LRECL of each input file. Give the starting position, length and format of each relevant field.

Pkelly · New User Joined: 14 May 2009 Posts: 9 Location: ohio

File A - account starts in column 9 for 20 and branch in column 30 for 10
......ã*00000000061847266000P0000000103
.....Èå.00000000061927327000S0000000103
.....iÑæ00000000070541546000P0000000103
.....ê..00000000070541546000P0000000103
.......<00000000080524644000P0000000103
.....ïê@00000000900778724000P0000000103

account number 00000000070541546000 is in file a twice and in file b once

File B account number column 1 for 20 branch col 21 for 10
000000000207625100000000000103
000000000705415460000000000103
000000009000519260000000000103
000000009007158370000000000103
000000009007787240000000000103
000000009007787240000000000103

both files are fixed block, 120
account number 00000000070541546000 is in file a twice and in file b once
Account number 00000000900778724000 is in file a once and in file b twice
Also, an account number can be in file A twice and in file B twice.

If there is a match, I need to add the first 8 bytes from the file A record to the front of the file B record and write out the file B record.

So in the case of file A having 2 records and file B having 1, the output would have 2 records, each with the first 8 bytes from its file A record and the rest from file B.

With 2 records in file B and 1 record in file A, both file B records would get the first 8 bytes of the file A record.

Frank Yaeger · Posted: Mon May 18, 2009 11:18 pm

Pkelly · New User Joined: 14 May 2009 Posts: 9 Location: ohio

File A - account starts in column 9 for 20 and branch in column 30 for 10
......ã*00000000061847266000P0000000103
.....Èå.00000000061927327000S0000000103
.....iÑæ00000000070541546000P0000000103
.....ê..00000000070541546000P0000000103
.......<00000000080524644000P0000000103
.....ïê@00000000900778724000P0000000103
.....lÑ*00000000914120724000P0000000103
......g%00000000914120724000P0000000103

account number 00000000070541546000 is in file a twice and in file b once

File B account number column 1 for 20 branch col 21 for 10
000000000207625100000000000103
000000000705415460000000000103
000000009000519260000000000103
000000009007158370000000000103
000000009007787240000000000103
000000009007787240000000000103
000000009141205790000000000103
000000009141206290000000000103

both files are fixed block, 120
account number 00000000070541546000 is in file a twice and in file b once
Account number 00000000900778724000 is in file a once and in file b twice
Also, an account number can be in file A twice and in file B twice.

the output would be:

so in the case of file a having 2 records and file b having one. the output would have 2 records with the first 8 bytes from file a and the rest from file b.
.....iÑæ000000000705415460000000000103 7916695328
.....ê..000000000705415460000000000103 7916696329

2 records in file b and 1 record in file a - both records in file b would have the 1st 8 bytes of file a record

.....ïê@000000009007787240000000000103 7922050963
.....ïê@000000009007787240000000000103 5550021039

Accounts in both files twice
.....lÑ*000000009141207240000000000103 7925401098
.....lÑ*000000009141207240000000000103 7925401099
......g%000000009141207240000000000103 7925401098
......g%000000009141207240000000000103 7925401099

the other case is one for one
......ç.0000000914145893000P0000000103
000000009141458930000000000103
Output

......ç.00000009141458930000000000103290957761

Skolusu · Posted: Tue May 19, 2009 5:27 am

Pkelly,

The following DFSORT/ICETOOL JCL will give you the desired results. I assumed a maximum of 2 duplicates in file A (as mentioned in your requirements). The output consists only of records that have a match in both files. The final output file LRECL is 128 (8 bytes from file A + 120 bytes from file B).
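To make the matching rule itself concrete, here is a minimal Python sketch of the logic described in this thread. It is not the DFSORT/ICETOOL job and is not meant as a replacement for it. It assumes each record is already available as a 120-byte `bytes` value, with the account number in columns 9-28 of file A and columns 1-20 of file B. The names `match_records`, `A_KEY`, `B_KEY` and `A_PREFIX` are made up for this illustration.

```python
from collections import defaultdict

LRECL = 120               # both inputs are fixed-length 120-byte records
A_PREFIX = slice(0, 8)    # file A: first 8 bytes to carry over
A_KEY = slice(8, 28)      # file A: account number, columns 9-28
B_KEY = slice(0, 20)      # file B: account number, columns 1-20


def match_records(file_a, file_b):
    """Pair every file A record with every file B record on the same account.

    Each output record is the 8-byte file A prefix followed by the whole
    file B record (8 + 120 = 128 bytes). Unmatched records are dropped.
    """
    # Group file B records by account number, keeping their input order.
    b_by_account = defaultdict(list)
    for b_rec in file_b:
        b_by_account[b_rec[B_KEY]].append(b_rec)

    output = []
    # Walk file A in order; for each A record emit one line per matching B record.
    for a_rec in file_a:
        for b_rec in b_by_account.get(a_rec[A_KEY], []):
            output.append(a_rec[A_PREFIX] + b_rec)
    return output
```

With 2 records for an account in file A and 2 in file B, this yields 4 output records, grouped by file A record, which is the order shown in the expected output earlier in the thread. Unlike the job described above, this sketch places no limit on the number of duplicates in file A.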

Pkelly · New User Joined: 14 May 2009 Posts: 9 Location: ohio

After running another set of data through the conversion, I found an account like this:

File a:
......ï%00000000911145960000P0000000103

File b:
000000009111459600000000000103 19706
000000009111459600000000000103 1010188084199
000000009111459600000000000103 19706
000000009111459600000000000103 9342549164
000000009111459600000000000103 3735727505
000000009111459600000000000103 6876011609
000000009111459600000000000103 7916155520
000000009111459600000000000103 791615552
000000009111459600000000000103 799689013

Results should be:
......ï%000000009111459600000000000103 19706
......ï%000000009111459600000000000103 1010188084199
......ï%000000009111459600000000000103 19706
......ï%000000009111459600000000000103 9342549164
......ï%000000009111459600000000000103 3735727505
......ï%000000009111459600000000000103 6876011609
......ï%000000009111459600000000000103 7916155520
......ï%000000009111459600000000000103 791615552
......ï%000000009111459600000000000103 799689013

Thank you

Skolusu · Posted: Tue May 19, 2009 9:19 pm

pkelly,

Did you run the job I provided? What is so special about the data you presented? The coded job handles a max of 2 duplicates for FILE A and for file B there is no such restriction. It can have any number of duplicates.

Just for the record, the above job does give you the desired results.
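Since the job is described as handling at most 2 duplicates per account in file A, it may be worth checking the input against that limit before relying on the results. The following Python sketch is one hypothetical way to do that outside of DFSORT. It assumes file A records are available as `bytes` values with the account number in columns 9-28; `accounts_over_limit` is a name invented for this example.

```python
from collections import Counter

A_KEY = slice(8, 28)  # file A: account number, columns 9-28


def accounts_over_limit(file_a, limit=2):
    """Return {account: count} for accounts appearing more than `limit` times in file A."""
    counts = Counter(rec[A_KEY] for rec in file_a)
    return {acct: n for acct, n in counts.items() if n > limit}
```

An empty result means file A stays within the 2-duplicate assumption. File B needs no such check, because the job is said to accept any number of duplicates there.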

Pkelly · New User Joined: 14 May 2009 Posts: 9 Location: ohio

Thanks. I have not run it yet, but I am setting it up to run this afternoon. Thanks for your help.