How to match two files having duplicates

gvel19 · Posted: Wed Oct 01, 2008 1:48 pm

I have two input files that I need to match on a key.

File-1: (Sorted on key and no dups)
------
1
2
3
4
5
6

File-2: (sorted on key and have duplicates)
-------
1
2
4
4
4
5
5
6
I need to write the matched records to an output file. I have tried, but I'm not able to take care of the duplicates. It would be great if someone could give me a hint on how to tackle the dups.

Thanks,
Vel

expat · Posted: Wed Oct 01, 2008 1:53 pm

Have you thought of using one of the sort products to do this for you?

There are so many examples of available solutions in the SORT / JCL forums.

karthikr44 · Posted: Wed Oct 01, 2008 2:17 pm

Hi,

Please post the sample output for your example. I want to know whether you want the matched records from file1 or file2.

Regards
R KARTHIK

Escapa · Posted: Wed Oct 01, 2008 2:21 pm

gvel19 · Posted: Wed Oct 01, 2008 4:23 pm

Hi Karthik,

My output should contain
1
2
4
4
4
5
5
6
My output should contain the matched records of file-1.
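
For the sample data above, here is a minimal sketch of the usual two-file match on sorted input. It is written in Python for illustration, not in COBOL or a sort product as discussed in this thread, and the function name `match_with_duplicates` is hypothetical. It assumes both inputs are sorted ascending on the key and that File-1 has no duplicate keys. The point that handles the dups: on a match, only File-2 is advanced, so the next duplicate is compared against the same File-1 key.

```python
from typing import Iterable, Iterator


def match_with_duplicates(file1: Iterable[int], file2: Iterable[int]) -> Iterator[int]:
    """Yield every File-2 key that also appears in File-1.

    Assumptions (not checked here): both inputs are sorted ascending
    on the key, and file1 contains no duplicate keys.
    """
    it1 = iter(file1)
    it2 = iter(file2)
    key1 = next(it1, None)  # None marks end of file
    key2 = next(it2, None)

    while key1 is not None and key2 is not None:
        if key1 < key2:
            # File-1 key has no (more) partners in File-2: read File-1.
            key1 = next(it1, None)
        elif key1 > key2:
            # File-2 key has no partner in File-1: read File-2.
            key2 = next(it2, None)
        else:
            # Match: write the record, then read File-2 ONLY, so that
            # a following duplicate still meets the same File-1 key.
            yield key2
            key2 = next(it2, None)


file1_keys = [1, 2, 3, 4, 5, 6]
file2_keys = [1, 2, 4, 4, 4, 5, 5, 6]
matched = list(match_with_duplicates(file1_keys, file2_keys))
```

With the keys from the first post, `matched` holds 1, 2, 4, 4, 4, 5, 5, 6, which is the output requested above. The same three-way comparison carries over to a COBOL read loop, with an end-of-file switch taking the place of `None`.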

roopannamdhari · New User Joined: 14 Sep 2006 Posts: 71 Location: Bangalore

Hi Karthik,

Escapa · Posted: Tue Oct 07, 2008 4:30 pm

ip1

dick scherrer · Posted: Wed Oct 08, 2008 1:11 am

Hello,

The posted code does not work for all cases. . .

Unfortunately, it will work some of the time. Due to insufficient testing/test data it would fail in production. It would be better if it abended, but it will most likely only give incorrect output sometimes. Very difficult to find sometimes.
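
The thread does not show which cases break the posted code. As a general illustration of failing loudly rather than quietly writing wrong output, the sketch below checks the key sequence of an input while it is being read and raises an error on the first break. It is in Python for illustration only; the name `checked_sequence` and its parameters are hypothetical, and an unsorted input is just one assumed failure case.

```python
from typing import Iterable, Iterator


def checked_sequence(keys: Iterable[int], name: str, allow_duplicates: bool) -> Iterator[int]:
    """Pass keys through unchanged, raising ValueError on a sequence break.

    A break is a key lower than the previous one, or equal to it when
    duplicates are not allowed for this input.
    """
    previous = None
    for key in keys:
        if previous is not None:
            out_of_order = key < previous
            bad_duplicate = key == previous and not allow_duplicates
            if out_of_order or bad_duplicate:
                # Stop the run instead of producing a wrong match result.
                raise ValueError(
                    f"{name}: key {key!r} out of sequence after {previous!r}"
                )
        previous = key
        yield key


# File-2 from the first post: sorted, duplicates allowed.
good = list(checked_sequence([1, 2, 4, 4, 4, 5, 5, 6], "File-2", allow_duplicates=True))

# An unsorted input is rejected at the first out-of-sequence key.
try:
    list(checked_sequence([1, 3, 2], "File-1", allow_duplicates=False))
    failed = False
except ValueError:
    failed = True
```

Wrapping each input this way costs one comparison per record and no extra pass of the data.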

Escapa · Posted: Thu Oct 09, 2008 12:27 pm

star_dhruv2000 · Posted: Tue Oct 14, 2008 3:12 pm

It would be good if you could use the SORT JOIN statement. The following is an example, which I hope will clear up your issues:

Escapa · Posted: Tue Oct 14, 2008 3:17 pm

dick scherrer · Posted: Tue Oct 14, 2008 7:15 pm

Hello,

You can only use JOINKEYS if the sort for the system is Syncsort. . .

expat · Posted: Tue Oct 14, 2008 7:47 pm

What happens in your program if neither input file is sorted?

To me, if both files need to be in sorted order before processing, why not let the sort product do all of the work in one go, rather than perform two sorts to get the input ready and then run a COBOL program to do what SORT can do anyway?

file1 =

dick scherrer · Posted: Tue Oct 14, 2008 11:33 pm

Hi Expat,

If the only thing the process needed to accomplish is the match, i might agree. What i am seeing more and more of is jobstreams that have many unneeded steps so that things can be done one-at-a-time (using the sort or other utilities) - each requiring at least one pass of all the data.

Pretty much every process i've been asked to look at lately because of poor performance got that way because no one properly defined the process and people kept plugging in "one more" step. Usually a bit of design saves many of these singleton steps, but it does require that some "real" programmer be available.

While on some systems 100k or a million records is considered a large file, most of what i've supported for years has run to hundreds of millions of records and cannot afford multiple passes of the data.

The topic process almost surely needs some additional processing of the data other than just the match. . .