Joined: 01 Feb 2007 Posts: 123 Location: Hyderabad
Hi,
I want to compare two files on two keys. Prior to this i posted the same topic on one key comparison and skolusu gave solution for that which is working well. But there is a change in my requirement. I want to compare two files on two keys. I tried to modify skolusu solution of one key comparison but not able to get output. Please help.
LRECL=31, RECFM=FB for all input and output files.
KEY1=1,5,ZD
KEY2=23,3,CH
Code:
----+----1----+----2----+----3----+----4----+----5
//FILE1 DD *
11111 BHASKAR 2000.00 HYD INDIA <-- TO OUTPUT ONLYF1
11111 BHASKAR 3000.00 HYD INDIA <-- TO OUTPUT ONLYF1
22222 SUNIL 1000.00 HYD INDIA <-- TO OUTPUT ONLYF1
33333 KUMAR 2500.00 VIJ INDIA <-- TO OUTPUT MATCH
33333 KUMAR 2500.00 VIJ INDIA <-- TO OUTPUT MATCH
/*
//FILE2 DD *
33333 KUMAR 300.00 VIJ INDIA <-- TO OUTPUT MATCH
33333 KUMAR 5000.00 VIJ INDIA <-- TO OUTPUT MATCH
44444 KTBS 1300.00 VIR US <-- TO OUTPUT ONLYF2
55555 BACHI 1110.00 BGL INDIA <-- TO OUTPUT ONLYF2
55555 BACHI 1000.00 BGL INDIA <-- TO OUTPUT ONLYF2
/*
Both input files will have duplicates.
Two records of 11111 and one record of 22222 from file1 which doesnt had match on two keys in file2 should be written to output ONLYF1 file.
One record of 44444 and two records of 55555 from file2 which doesnt had match on two keys in file1 should be written to output ONLYF2 file.
Two records of 33333 from file1 which had match on two keys with two records of 33333 from file2 should be written to output MATCH file. ie. totally all 4 records should be written to MATCH output file.
Joined: 07 Dec 2007 Posts: 2205 Location: San Jose
bhaskar_kanteti,
The logic is quite simple. We create an 1 additional records for every key combo from 2 files using sections and trailer3 parms and we will use that to sum to find if we have a match or not.
Sections is a control break of the key change and whenever the key changes we create a record with just the keys and space + init zero's and file ind.
This is how file1 will be formatted after first sort
Code:
--------------------------------------------------
KEY1 | DATA1 |KEY2|DATA2|IND-SEQ |FILEIND
1-5 | 6-22 |2325|26-31|32-40 |41-42
--------------------------------------------------
11111 BHASKAR 2000.00 HYD INDIA A00000001 11
11111 BHASKAR 3000.00 HYD INDIA A00000002 11
11111 HYD 00000000 11 < SECTIONS RECORD
22222 SUNIL 1000.00 HYD INDIA A00000001 11
22222 HYD 00000000 11 < SECTIONS RECORD
33333 KUMAR 2500.00 VIJ INDIA A00000001 11
33333 KUMAR 2500.00 VIJ INDIA A00000002 11
33333 VIJ 00000000 11 < SECTIONS RECORD
This is how file2 will be formatted after second sort
Code:
--------------------------------------------------
KEY1 | DATA1 |KEY2|DATA2|IND-SEQ |FILEIND
1-5 | 6-22 |2325|26-31|32-40 |41-42
--------------------------------------------------
33333 KUMAR 300.00 VIJ INDIA B00000001 22
33333 KUMAR 5000.00 VIJ INDIA B00000002 22
33333 VIJ 00000000 22 < SECTIONS RECORD
44444 KTBS 1300.00 VIR US B00000001 22
44444 VIR 00000000 22 < SECTIONS RECORD
55555 BACHI 1110.00 BGL INDIA B00000001 22
55555 BACHI 1000.00 BGL INDIA B00000002 22
55555 BGL 00000000 22 < SECTIONS RECORD
Now before splice is done we sort on
key1 + key2 + ind-seq
and then sum on 1 byte file indicator at pos 42
This is how the input to splice would look like
Code:
--------------------------------------------------
KEY1 | DATA1 |KEY2|DATA2|IND-SEQ |FILEIND
1-5 | 6-22 |2325|26-31|32-40 |41-42
--------------------------------------------------
11111 HYD 00000000 11 < SUMMED RECORD
11111 BHASKAR 2000.00 HYD INDIA A00000001 11
11111 BHASKAR 3000.00 HYD INDIA A00000002 11
22222 HYD 00000000 11 < SUMMED RECORD
22222 SUNIL 1000.00 HYD INDIA A00000001 11
33333 VIJ 00000000 13 < SUMMED RECORD
33333 KUMAR 2500.00 VIJ INDIA A00000001 11
33333 KUMAR 2500.00 VIJ INDIA A00000002 11
33333 KUMAR 300.00 VIJ INDIA B00000001 22
33333 KUMAR 5000.00 VIJ INDIA B00000002 22
44444 VIR 00000000 22 < SUMMED RECORD
44444 KTBS 1300.00 VIR US B00000001 22
55555 BGL 00000000 22 < SUMMED RECORD
55555 BACHI 1110.00 BGL INDIA B00000001 22
55555 BACHI 1000.00 BGL INDIA B00000002 22
Key 11111, 22222 does not have a match in file2 and hence the sum is only 1 key 33333 has a match on both files and hence the sum is 3
Key 44444, 55555 does not have a match in file1 and hence the sum is only 2
We now splice that summed value on to all the records for each key combo and this is how the output will be before OUTFIL processing
Code:
--------------------------------------------------
KEY1 | DATA1 |KEY2|DATA2|IND-SEQ |FILEIND
1-5 | 6-22 |2325|26-31|32-40 |41-42
--------------------------------------------------
11111 BHASKAR 2000.00 HYD INDIA A00000001 11
11111 BHASKAR 3000.00 HYD INDIA A00000002 11
22222 SUNIL 1000.00 HYD INDIA A00000001 11
33333 KUMAR 2500.00 VIJ INDIA A00000001 13
33333 KUMAR 2500.00 VIJ INDIA A00000002 13
33333 KUMAR 300.00 VIJ INDIA B00000001 23
33333 KUMAR 5000.00 VIJ INDIA B00000002 23
44444 KTBS 1300.00 VIR US B00000001 22
55555 BACHI 1110.00 BGL INDIA B00000001 22
55555 BACHI 1000.00 BGL INDIA B00000002 22
Now using OUTFIL we pick the records we want and chop off the additional bytes we added. All the matching records will have either a 13 or 23 and unmatched records will have the respective file ind
phew coding the solution was much much easier than trying to explain it
Joined: 01 Feb 2007 Posts: 123 Location: Hyderabad
Hi Skolusu,
Thank you so much. You explained step by step very clearly and that too with screen shots. Thanks for your excellent logic and thanks for your detailed explanation.