View previous topic :: View next topic
|
Author |
Message |
rz061m
New User
Joined: 03 Mar 2006 Posts: 48 Location: Chennai
|
|
|
|
I have 2 files like below. I need an output file that would give me all detail records including header & footer from the second file only if some of the detail records were missing in the first file. The file sizes are FB 80 bytes & fields to compare are between 5 to 20.
I/P File 1:
-------------
010|379339 | |
020|379339 |A2B |
020|379339 |A2C |
030|379339 | |
010|444333 | |
020|444333 |A2B |
020|444333 |A2C |
030|444333 | |
I/P File 2:
------------
010|379339 | |
020|379339 |A2B |
020|379339 |A2C |
020|379339 |A2D |
020|379339 |A2E |
030|379339 | |
010|444333 | |
020|444333 |A2B |
020|444333 |A2C |
030|444333 | |
O/P File:
-----------
010|379339 | |
020|379339 |A2B |
020|379339 |A2C |
020|379339 |A2D |
020|379339 |A2E |
030|379339 | |
I tried 2 file comparison to extract the missing records from second file but need logic to extract all the details including the corresponding header & footer. |
|
Back to top |
|
|
mistah kurtz
Active User
Joined: 28 Jan 2012 Posts: 316 Location: Room: TREE(3). Hilbert's Hotel
|
|
|
|
Here is my understanding. Please correct me if I'm wrong?
1. Records Starting with 010 is the Header.
2. Records Starting with 020 is the Data.
3. Records Starting with 030 is the Trailer.
Now you have group of records in both the Input files and you want only those group of records from 2nd Input file, which does not eaxctly matches with First Input File.
For ex:
Code: |
010|379339 | |
020|379339 |A2B |
020|379339 |A2C |
030|379339 | | |
and
Code: |
010|379339 | |
020|379339 |A2B |
020|379339 |A2C |
020|379339 |A2D |
020|379339 |A2E |
030|379339 | | |
do not match so you want the 2nd set of records in Output.
Code: |
010|444333 | |
020|444333 |A2B |
020|444333 |A2C |
030|444333 | | |
and
Code: |
010|444333 | |
020|444333 |A2B |
020|444333 |A2C |
030|444333 | | |
matches so you don't want it in output.
Can you post your SORT job that you have treid and SYSOUT output, using Code tags. |
|
Back to top |
|
|
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
It's a very simple two-step process. JOINKEYS, with de-duplicated output to get a list of "groups" that you want to extract. Then JOINKEYS to extract. |
|
Back to top |
|
|
rz061m
New User
Joined: 03 Mar 2006 Posts: 48 Location: Chennai
|
|
|
|
mistah kurtz wrote: |
Here is my understanding. Please correct me if I'm wrong?
1. Records Starting with 010 is the Header.
2. Records Starting with 020 is the Data.
3. Records Starting with 030 is the Trailer.
Now you have group of records in both the Input files and you want only those group of records from 2nd Input file, which does not eaxctly matches with First Input File.
For ex:
Code: |
010|379339 | |
020|379339 |A2B |
020|379339 |A2C |
030|379339 | | |
and
Code: |
010|379339 | |
020|379339 |A2B |
020|379339 |A2C |
020|379339 |A2D |
020|379339 |A2E |
030|379339 | | |
do not match so you want the 2nd set of records in Output.
Code: |
010|444333 | |
020|444333 |A2B |
020|444333 |A2C |
030|444333 | | |
and
Code: |
010|444333 | |
020|444333 |A2B |
020|444333 |A2C |
030|444333 | | |
matches so you don't want it in output.
Can you post your SORT job that you have treid and SYSOUT output, using Code tags. |
Yes your understanding is correct..... I used SPLICE to extract only the mismatch records, I do not know how to pull all the set starting from header, detail & Footer |
|
Back to top |
|
|
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
So, instead of struggling with SPLICE, did you look at what I suggested?
How many records can there be in a group? |
|
Back to top |
|
|
rz061m
New User
Joined: 03 Mar 2006 Posts: 48 Location: Chennai
|
|
|
|
Bill Woodger wrote: |
So, instead of struggling with SPLICE, did you look at what I suggested?
How many records can there be in a group? |
Each group can have maximum of 15 detail records. I am yet to start the JOINKEYS options. I will try it out today. |
|
Back to top |
|
|
rz061m
New User
Joined: 03 Mar 2006 Posts: 48 Location: Chennai
|
|
|
|
Bill Woodger wrote: |
So, instead of struggling with SPLICE, did you look at what I suggested?
How many records can there be in a group? |
I could achieve what was needed with the below steps, not sure if there is a better way of doing it.
1. SPLICE the 2 files and extract records that are missing in the second file.
2. Remove duplicates based on key fields from the file from Step1
3. Join keys between first file and file from Step2 and extract all the records from first file matching the key. This would give me header/detail/footer. |
|
Back to top |
|
|
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
Use JOINKEYS for the initial matching.
Only include the detail records.
De-duplicate, in the Main Task of the JOINKEYS (various ways to do this).
A second JOINKEYS step to take your de-duped "tickler file" and extract for hits.
That is not the end of the story, because you need to know how you want to match the keys. If you have a "hit" only because SORTing the data changes the order, is that OK? |
|
Back to top |
|
|
Skolusu
Senior Member
Joined: 07 Dec 2007 Posts: 2205 Location: San Jose
|
|
|
|
rz061m.
Your match on just the 379339 and 444333 will give you a Cartesian join which you don't need. try this
1. Use JOINKEYS to match on the full key i.e first 16 bytes as you have shown with JOIN UNPAIRED,F2,ONLY that will bring out just unmatched records from file2. Use REFORMAT FIELDS to just write out 379339, 444333.. from F2. i.e REFORMAT FIELDS=(F2:5,6). Make sure you have a COPY operation for the main task and remove the duplicates using OUTFIL with SECTIONS and TRAILER3.
2. Now use another JOINKEYS to match file 2 and the output from above with matching on just the key at pos 5 for 6 bytes. Also the output from above is already SORTED, so make sure you have SORTED and NOSEQCK on the file you referencing in the JOINKEYS. |
|
Back to top |
|
|
|