Compare 2 files and extract records between header & tra

rz061m · New User Joined: 03 Mar 2006 Posts: 48 Location: Chennai

I have 2 files like the ones below. I need an output file containing all the detail records, including the header & footer, of each group in the second file, but only if some of that group's detail records are missing from the first file. Both files are FB, 80 bytes, & the fields to compare are in positions 5 to 20.

I/P File 1:
-------------
010|379339 | |
020|379339 |A2B |
020|379339 |A2C |
030|379339 | |
010|444333 | |
020|444333 |A2B |
020|444333 |A2C |
030|444333 | |

I/P File 2:
------------
010|379339 | |
020|379339 |A2B |
020|379339 |A2C |
020|379339 |A2D |
020|379339 |A2E |
030|379339 | |
010|444333 | |
020|444333 |A2B |
020|444333 |A2C |
030|444333 | |

O/P File:
-----------
010|379339 | |
020|379339 |A2B |
020|379339 |A2C |
020|379339 |A2D |
020|379339 |A2E |
030|379339 | |

I tried a 2-file comparison to extract the records from the second file that are missing in the first, but I need logic to extract all the detail records along with the corresponding header & footer.

mistah kurtz · Posted: Wed Aug 21, 2013 11:19 am

Here is my understanding. Please correct me if I'm wrong.
1. Records starting with 010 are headers.
2. Records starting with 020 are data (detail) records.
3. Records starting with 030 are trailers.

You have groups of records in both input files, and you want only those groups from the 2nd input file that do not exactly match the corresponding group in the first input file.

For ex:

Bill Woodger · Posted: Wed Aug 21, 2013 1:13 pm

It's a very simple two-step process. First, a JOINKEYS with de-duplicated output, to get a list of the "groups" that you want to extract. Then a second JOINKEYS to extract them.

rz061m · New User Joined: 03 Mar 2006 Posts: 48 Location: Chennai

Bill Woodger · Posted: Wed Aug 21, 2013 6:57 pm

So, instead of struggling with SPLICE, did you look at what I suggested?

How many records can there be in a group?

rz061m · New User Joined: 03 Mar 2006 Posts: 48 Location: Chennai

Bill Woodger · Posted: Wed Aug 21, 2013 9:39 pm

Use JOINKEYS for the initial matching.

Only include the detail records.

De-duplicate, in the Main Task of the JOINKEYS (various ways to do this).

A second JOINKEYS step to take your de-duped "tickler file" and extract for hits.

That is not the end of the story, because you need to know how you want to match the keys. If you have a "hit" only because SORTing the data changes the order, is that OK?
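To make that last question concrete, here is a minimal Python sketch (not DFSORT) of one reading of it: a sorted, keyed match treats two groups as equal even when their detail records arrive in a different sequence. The function names and the reordered sample records are illustrative assumptions, not taken from the thread.

```python
def details_match_any_order(group1, group2):
    """True when both groups hold the same detail records, ignoring order."""
    return sorted(group1) == sorted(group2)


def details_match_in_order(group1, group2):
    """True only when the detail records also appear in the same sequence."""
    return list(group1) == list(group2)


# Hypothetical data: same detail records, different sequence.
file1_group = ["020|379339 |A2B |", "020|379339 |A2C |"]
file2_group = ["020|379339 |A2C |", "020|379339 |A2B |"]

print(details_match_any_order(file1_group, file2_group))
print(details_match_in_order(file1_group, file2_group))
```

If a reordered group should count as a difference, a match on sorted keys alone will not detect it, and the original record sequence would have to be carried into the comparison.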

Skolusu · Posted: Wed Aug 21, 2013 11:47 pm

rz061m.

Matching on just the 379339 and 444333 keys will give you a Cartesian join, which you don't need. Try this:

1. Use JOINKEYS to match on the full key, i.e. the first 16 bytes as you have shown, with JOIN UNPAIRED,F2,ONLY, which will bring out just the unmatched records from file 2. Use REFORMAT FIELDS to write out just 379339, 444333, ... from F2, i.e. REFORMAT FIELDS=(F2:5,6). Make sure you have a COPY operation for the main task, and remove the duplicates using OUTFIL with SECTIONS and TRAILER3.

2. Now use another JOINKEYS to match file 2 against the output from above, matching on just the key at pos 5 for 6 bytes. The output from above is already sorted, so make sure you specify SORTED and NOSEQCK on the file you are referencing in the JOINKEYS.
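The logic of these two steps can be sketched in Python to check it against the sample data. This is only an emulation of the matching, not DFSORT control statements; the names `unmatched_group_keys` and `extract_groups` are hypothetical, and the records are padded to 80 bytes on the assumption that the files are FB 80 as stated in the first post.

```python
FULL_KEY = slice(0, 16)   # positions 1-16: full key matched in step 1
GROUP_KEY = slice(4, 10)  # position 5, length 6: group key matched in step 2


def unmatched_group_keys(file1, file2):
    """Step 1: de-duplicated group keys of file2 records with no full-key match in file1."""
    seen = {rec[FULL_KEY] for rec in file1}
    return {rec[GROUP_KEY] for rec in file2 if rec[FULL_KEY] not in seen}


def extract_groups(file2, keys):
    """Step 2: every file2 record (header, detail, trailer) whose group key was flagged."""
    return [rec for rec in file2 if rec[GROUP_KEY] in keys]


def fb80(lines):
    """Pad each record to 80 bytes (assumed record length)."""
    return [line.ljust(80) for line in lines]


file1 = fb80([
    "010|379339 | |", "020|379339 |A2B |", "020|379339 |A2C |", "030|379339 | |",
    "010|444333 | |", "020|444333 |A2B |", "020|444333 |A2C |", "030|444333 | |",
])
file2 = fb80([
    "010|379339 | |", "020|379339 |A2B |", "020|379339 |A2C |",
    "020|379339 |A2D |", "020|379339 |A2E |", "030|379339 | |",
    "010|444333 | |", "020|444333 |A2B |", "020|444333 |A2C |", "030|444333 | |",
])

keys = unmatched_group_keys(file1, file2)
output = extract_groups(file2, keys)
for rec in output:
    print(rec.rstrip())
```

On the sample files, step 1 flags only group 379339 (because of the A2D and A2E detail records), and step 2 returns the six records of that group, which is the O/P file requested in the first post.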