Matching files with different length

danylele74 · New User Joined: 03 Jul 2014 Posts: 28 Location: Italy

Hi all,
I have two input files with different record lengths and different numbers of records.

First file (700.000 records and LRECL=10000)
Second file (25.000 records and LRECL=322)

Example:

File 1:

Keys
111111
222222
333333
444444
555555

File 2:
111111
333333
444444

The output files should be:

File MATCH:
111111
333333
444444

File NOMATCH:
222222
555555

This is my JCL

danylele74 · New User Joined: 03 Jul 2014 Posts: 28 Location: Italy

If the first file contains 100.000 records, the two output files together should also contain 100.000 records.

Bill Woodger · Posted: Wed Mar 04, 2015 3:18 pm

I think you'll find you have duplicate keys on one or both input files.

Also, you may want to consider dynamic allocation of SORTWK files.

danylele74 · New User Joined: 03 Jul 2014 Posts: 28 Location: Italy

Could you please give me an example, or a solution to my problem?
I don't understand what you mean.
Thank you

rinsio · Posted: Wed Mar 04, 2015 3:39 pm

You have duplicate keys in the second file. When the keys match, the output file (MATCH) contains every instance of the key.

How to resolve the problem depends on what you want in the output file.
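
To illustrate why duplicates inflate the matched output, here is a minimal Python sketch (not DFSORT syntax). It assumes a join pairs every File 1 record with every File 2 record that has the same key; the function name `joined_record_count` and the duplicated sample key are hypothetical.

```python
from collections import Counter

def joined_record_count(file1_keys, file2_keys):
    """Count the paired records when each File 1 record is joined
    with every File 2 record that carries the same key."""
    file2_counts = Counter(file2_keys)
    # A Counter returns 0 for keys that are absent from File 2.
    return sum(file2_counts[key] for key in file1_keys)

file1 = ["111111", "222222", "333333", "444444", "555555"]
file2_unique = ["111111", "333333", "444444"]
# Hypothetical variant of File 2 with key 111111 present twice.
file2_dupes = ["111111", "111111", "333333", "444444"]

print(joined_record_count(file1, file2_unique))  # 3
print(joined_record_count(file1, file2_dupes))   # 4
```

With one duplicated key in File 2, the matched output grows by one record even though File 1 is unchanged.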

Regards

danylele74 · New User Joined: 03 Jul 2014 Posts: 28 Location: Italy

I would like:

File 1:

Keys
111111
222222
333333
444444
555555

File 2:
111111
333333
444444

The output files should be:

File MATCH:
111111
333333
444444

File NOMATCH:
222222
555555

Bill Woodger · Posted: Wed Mar 04, 2015 4:07 pm

No. That is what your existing code deals with. You need to show what you want to happen when there are duplicate key values within either or both of your input files.

You also show your sample data in key order. Is that correct? If so, specify SORTED on the JOINKEYS statements and get rid of the SORTWKn files altogether.

danylele74 · New User Joined: 03 Jul 2014 Posts: 28 Location: Italy

I don't want duplicate keys.
If I have 1000 records in FILE 1 (the master file), the output files should be:

File Match: 300
File No Match: 700

Total records in both output files: 1000 (the same as input FILE 1)

P.S.

I don't want to use SORT with SUM FIELDS=NONE

Bill Woodger · Posted: Wed Mar 04, 2015 4:41 pm

You've avoided the question about whether your input is in sequence.

If you have to SORT your input data (it happens by default for each JOINKEYS) then what do you have against SUM FIELDS=NONE?

If you don't have to SORT (so you specify SORTED on the JOINKEYS) then you can use SEQNUM with RESTART= for the key, and have INCLUDE= on your first OUTFILs to just get the first record of each key.
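
The idea of numbering records within each key and keeping only number 1 can be sketched in Python as follows. This is only a model of the logic, not DFSORT control statements; it assumes the input is already in key order, and the name `first_of_each_key` is hypothetical.

```python
def first_of_each_key(sorted_keys):
    """Number records within each run of equal keys (the counter
    restarts at 1 when the key changes) and keep only number 1."""
    kept = []
    previous = None
    seq = 0
    for key in sorted_keys:
        # Restart the sequence number whenever the key changes.
        seq = seq + 1 if key == previous else 1
        previous = key
        if seq == 1:
            kept.append(key)
    return kept

keys_in_sequence = ["111111", "111111", "333333", "444444", "444444"]
print(first_of_each_key(keys_in_sequence))
# ['111111', '333333', '444444']
```

Because it compares each record only with the one before it, this approach drops duplicates correctly only when equal keys are adjacent, which is why it depends on the input being in sequence.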

danylele74 · New User Joined: 03 Jul 2014 Posts: 28 Location: Italy

I think that neither input file is sorted by the key field.

Bill Woodger · Posted: Wed Mar 04, 2015 5:10 pm

So your sample data should represent that: unsorted, both files, duplicates possible (both files?), and then the output you require.

If you need to SORT for JOINKEYS, easiest thing to do to get rid of duplicate keys is SUM FIELDS=NONE. That would be in a JNFnCNTL dataset.

Then the rest of the code does not need to change. Although you could make a change, to use OUTFIL SAVE on the second OUTFIL, so all data not on the first OUTFIL would appear on the second.
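
As a rough model of that approach (Python, not DFSORT syntax), the sketch below removes duplicate keys from both inputs and then sends each master key to exactly one of two outputs. The names `dedupe` and `split_master` and the sample data are hypothetical; unlike a real sort, this sketch keeps first-occurrence order instead of key order.

```python
def dedupe(keys):
    """Keep the first record for each key and drop later duplicates."""
    seen = set()
    unique = []
    for key in keys:
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique

def split_master(master_keys, lookup_keys):
    """Split de-duplicated master keys into matched and unmatched lists."""
    master = dedupe(master_keys)
    lookup = set(lookup_keys)  # a set ignores duplicate lookup keys
    match = [key for key in master if key in lookup]
    # Everything not written to the first output goes to the second.
    nomatch = [key for key in master if key not in lookup]
    return match, nomatch

# Hypothetical unsorted inputs with duplicates in both files.
master = ["333333", "111111", "555555", "222222", "444444", "333333"]
lookup = ["444444", "111111", "111111", "333333"]

match, nomatch = split_master(master, lookup)
print(match)    # ['333333', '111111', '444444']
print(nomatch)  # ['555555', '222222']
```

Under these assumptions the two outputs together hold one record per distinct master key, so the total equals the master record count only when the master file has no duplicate keys.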

danylele74 · New User Joined: 03 Jul 2014 Posts: 28 Location: Italy

Is there another way to do the same thing (another kind of sort) without duplicate records in output files?

danylele74 · New User Joined: 03 Jul 2014 Posts: 28 Location: Italy

Maybe I didn't explain the problem well, or I don't understand what you are saying (I'm sorry).

With my JCL I have about 725.000 input records (700.000 in file 1, the master file, and 25.000 in the second file).

Currently the output files contain about 750.000 records in total. I expect 725.000.

I can sort both files by the key field, but I think I would still get 750.000 and not 725.000 records.

I think this kind of sort-match shouldn't need another SORT step with SUM FIELDS=NONE.

Bill Woodger · Posted: Wed Mar 04, 2015 7:04 pm

danylele74 · New User Joined: 03 Jul 2014 Posts: 28 Location: Italy

Sorry, I'm expecting the output files to total 700.000 records, not 725.000.

Bill Woodger · Posted: Wed Mar 04, 2015 7:32 pm

Run your SORT with the sample input I've shown. How many records do you get?

Add this to your step:

danylele74 · New User Joined: 03 Jul 2014 Posts: 28 Location: Italy

It doesn't work.

File1 (master) 2.396.705
File2 24.844

Total output records (matched and unmatched): 106802

This is the JCL:

Bill Woodger · Posted: Wed Mar 04, 2015 9:33 pm

Please paste the sysout from the step. It worked.