Match two files, create 3rd file w/fields from both I/Ps

Jenifer Lewis · New User Joined: 14 Sep 2009 Posts: 28 Location: Maine

I am looking for an ICETOOL solution to the following problem, as our DFSORT release (ICE201I F) is not recent enough to contain JOINKEYS.

Two comma-delimited files (LRECL=80) with no duplicates in either file:

File A:

1111111111111,AAAAAAAAAA,
2222222222222,BBBBBBBBBB,
3333333333333,CCCCCCCCCC,

File B:

1111111111111,XXXXXXXXXXXXXX,
3333333333333,YYYYYYYYYYYYYY,
3333333333333,ZZZZZZZZZZZZZZ,

I need to compare File A with File B, matching on the first 13 bytes, then create a third comma-delimited output file consisting of File B's second field, File A's first field, and File A's second field.

Desired results:

XXXXXXXXXXXXXX,1111111111111,AAAAAAAAAA
YYYYYYYYYYYYYY,3333333333333,CCCCCCCCCC
ZZZZZZZZZZZZZZ,3333333333333,CCCCCCCCCC
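For reference, the matching rule described above can be sketched outside DFSORT. The Python below only illustrates the intended result; it is not an ICETOOL solution. The function name `match_files` and the in-memory lists are assumptions made for the example.

```python
def match_files(file_a_lines, file_b_lines):
    """Build output records (B field 2, A field 1, A field 2) for every
    File B record whose first field also appears in File A."""
    # Index File A by its first field (assumed unique within File A).
    a_by_key = {}
    for line in file_a_lines:
        fields = line.rstrip().split(",")
        a_by_key[fields[0]] = fields[1]

    output = []
    for line in file_b_lines:
        fields = line.rstrip().split(",")
        key, b_value = fields[0], fields[1]
        if key in a_by_key:  # keep only keys present in both files
            output.append(f"{b_value},{key},{a_by_key[key]}")
    return output


# Sample data copied from the post above.
file_a = [
    "1111111111111,AAAAAAAAAA,",
    "2222222222222,BBBBBBBBBB,",
    "3333333333333,CCCCCCCCCC,",
]
file_b = [
    "1111111111111,XXXXXXXXXXXXXX,",
    "3333333333333,YYYYYYYYYYYYYY,",
    "3333333333333,ZZZZZZZZZZZZZZ,",
]

for record in match_files(file_a, file_b):
    print(record)
```

With the sample data this prints the three records listed under "Desired results". Each File B record with key 3333333333333 is paired with the single File A record for that key.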

Thank you for any help you can provide!

Bill Woodger · Posted: Wed Sep 03, 2014 4:32 am

Well, there's SPLICE.

You say "no duplicates" but show something which looks pretty duplicate in your sample and expected results. Can you be a bit clearer, please?

mistah kurtz · Posted: Wed Sep 03, 2014 12:39 pm

Bill has already pointed out that there are duplicates in your sample. You can try something similar to the job below, which uses SPLICE and WITHALL to handle duplicates. If you don't have duplicates, you can remove the WITHALL parameter.

Jenifer Lewis

I apologize for not being a bit clearer: there are no duplicates within either File A or File B.

The results file should contain only those records which are on both files.

mistah kurtz · Posted: Wed Sep 03, 2014 5:52 pm

@Jenifer: Did you try running the solution that I posted? Are you getting the desired output?

Bill Woodger · Posted: Wed Sep 03, 2014 6:05 pm

And what is this if not a duplicate?

Jenifer Lewis

What a busy morning this has been, mistah kurtz!

Okay, I just tried it, and I'm getting everything from File B, reformatted of course. So when there is a match between the two input files, data in the second field of File A shows up in the third field of the results file, and when there is not, the third field contains just spaces.

The ultimate goal is to have a results file with only those records from File B which match File A. I can sort the results to exclude those records with spaces in the third field, but I'd rather do it all in one step.
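In join terms, the output described here keeps every File B record and leaves the third field blank where File A has no matching key, whereas the goal is the matched records only. The Python sketch below illustrates that difference and the follow-up filter mentioned above. It does not reproduce the ICETOOL job; the helper name `join_keep_all_b`, the pad width, and the unmatched sample record are made up for the example.

```python
def join_keep_all_b(file_a_lines, file_b_lines, pad=10):
    """Keep every File B record; blank-fill field 3 when File A has no match."""
    a_by_key = {}
    for line in file_a_lines:
        key, a_value = line.rstrip().split(",")[:2]
        a_by_key[key] = a_value

    rows = []
    for line in file_b_lines:
        key, b_value = line.rstrip().split(",")[:2]
        a_value = a_by_key.get(key, " " * pad)  # spaces when unmatched
        rows.append(f"{b_value},{key},{a_value}")
    return rows


file_a = ["1111111111111,AAAAAAAAAA,", "3333333333333,CCCCCCCCCC,"]
# The 4444444444444 record is hypothetical: a File B key absent from File A.
file_b = ["1111111111111,XXXXXXXXXXXXXX,", "4444444444444,WWWWWWWWWWWWWW,"]

all_b = join_keep_all_b(file_a, file_b)

# Second pass: drop records whose third field is only spaces.
matched_only = [row for row in all_b if row.split(",")[2].strip()]

print(all_b)
print(matched_only)
```

Doing it "all in one step" amounts to applying that last filter during the match itself, not afterwards.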

mistah kurtz · Posted: Wed Sep 03, 2014 6:45 pm

Are you getting the output below?

Jenifer Lewis

Here's the problem: I supplied a File B that would always find a match in File A, which is not reality-based in this application. My apologies.

(I apologize for my absence. I've spent nearly all day on a production problem for a legacy system caused by one of the business users messing with last night's batch schedule, not telling any of us what was going on, then complaining this morning that things didn't go well. To think I get paid to have all this fun. What made me think I could work on new development today? Silly me.)

Okay, so when I run the JCL you so helpfully provided with the test data I created for this example, it works fine. When I use the test files I created from the actual database, I get the results described in my prior message.

Here is File A in its entirety:

Rohit Umarjikar · Posted: Thu Sep 04, 2014 10:32 am

Please have a look at the SPLICE documentation here as well, and reformat as needed. Also, your example still shows duplicates, which results in a Cartesian join.
pic.dhe.ibm.com/infocenter/zos/v1r13/index.jsp?topic=%2Fcom.ibm.zos.r13.iceg200%2Fice1cg6054.htm

Tricks
ftp.software.ibm.com/storage/dfsort/mvs/sorttrck.pdf

mistah kurtz · Posted: Thu Sep 04, 2014 11:50 am

Okay Jenifer. Assuming that the test data that you have shown us is truly representative of your actual data, try this. Hopefully this should work.

Jenifer Lewis

Rohit Umarjikar, you are of course correct about the duplicates. I was thinking of the account numbers (second field) in File B as making each record unique, which they are, but there were definitely duplicate supplier numbers (first field) and they were what needed to be matched on File A. I apologize for any confusion my blinkered view caused.

I appreciate the link to the SPLICE command, and I will examine it to see where I was getting things wrong (quite possibly the very thing mentioned above). I have the "DFSORT tricks" manual but so many of the tricks I've tried are unsupported by my employer's current release that I have pretty much given up using it as a resource.

mistah kurtz, the tweaks you applied to the BUILD and SPLICE statements worked beautifully. Your help has been an invaluable life-saver during these days of ever-increasing work loads with no time to learn and hone new skills. I'm sorry I had such a hard time focusing on what you were trying to show me, and that I made it more difficult with my unclear problem statements.

I thank both of you from the bottom of my heart.

mistah kurtz · Posted: Thu Sep 04, 2014 9:09 pm

You're welcome!