How to match two files having duplicates

 
gvel19

New User


Joined: 20 Jul 2008
Posts: 19
Location: Schenactady, US

Posted: Wed Oct 01, 2008 1:48 pm    Post subject: How to match two files having duplicates

I have two input files that I need to match on their keys.

File-1: (Sorted on key and no dups)
------
1
2
3
4
5
6

File-2: (sorted on key, with duplicates)
-------
1
2
4
4
4
5
5
6
I need to write the matched records to an output file. I have tried, but I'm not able to handle the duplicates. It would be great if someone could give me a hint for tackling the dups.

Thanks,
Vel

expat

Global Moderator


Joined: 14 Mar 2007
Posts: 8593
Location: Back in jolly old England

Posted: Wed Oct 01, 2008 1:53 pm

Have you thought of using one of the sort products to do this for you?

There are so many examples of available solutions in the SORT / JCL forums.
karthikr44

Active User


Joined: 25 Aug 2007
Posts: 235
Location: Chennai

Posted: Wed Oct 01, 2008 2:17 pm    Post subject: Reply to: How to match two files having duplicates

Hi,

Please post the sample output for your example. I want to know whether you want the matched records from file1 or from file2.

Regards
R KARTHIK
Escapa

Senior Member


Joined: 16 Feb 2007
Posts: 1399
Location: IL, USA

Posted: Wed Oct 01, 2008 2:21 pm

Quote:
I have tried, but I'm not able to handle the duplicates. It would be great if someone could give me a hint for tackling the dups.

What is the logic you are using?
gvel19

New User


Joined: 20 Jul 2008
Posts: 19
Location: Schenactady, US

Posted: Wed Oct 01, 2008 4:23 pm    Post subject: Reply to: How to match two files having duplicates

Hi Karthik,

My output should contain
1
2
4
4
4
5
5
6
My output should contain the matched records of file-1.
roopannamdhari

New User


Joined: 14 Sep 2006
Posts: 71
Location: Bangalore

Posted: Tue Oct 07, 2008 11:00 am

Hi Karthik,

Code:
My output should contain
1
2
4
4
4
5
5
6
My output should contain the matched records of file-1.


Should the output contain file-1 or file-2 records? The output you show here contains the file-2 records.
Escapa

Senior Member


Joined: 16 Feb 2007
Posts: 1399
Location: IL, USA

Posted: Tue Oct 07, 2008 4:30 pm

ip1
Code:

1
2
3
5
6

ip2
Code:

1
2
4
4
4
5
5
6

Here I assume that you want every file2 record whose key is present in file1.
Code:

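*> Header and SELECT clauses added so the fragment compiles;
*> the program name and the DD names IN1/IN2 are assumptions.
IDENTIFICATION DIVISION.
PROGRAM-ID. MATCHDUP.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT FILE1 ASSIGN TO IN1.
    SELECT FILE2 ASSIGN TO IN2.
DATA DIVISION.
FILE SECTION.
FD FILE1.
01 REC1.
    02 REC1-CMP-KEY PIC 9(1).
    02 FILLER PIC X(79).
FD FILE2.
01 REC2.
    02 REC2-CMP-KEY PIC 9(1).
    02 FILLER PIC X(79).
WORKING-STORAGE SECTION.
77 EOF1 PIC X VALUE 'N'.
77 EOF2 PIC X VALUE 'N'.
PROCEDURE DIVISION.
*> Prime both files, then step through them in key order.
    OPEN INPUT FILE1 FILE2.
    READ FILE1 AT END MOVE 'Y' TO EOF1 END-READ.
    READ FILE2 AT END MOVE 'Y' TO EOF2 END-READ.
    PERFORM READ-BOTH-FILES
        UNTIL EOF1 = 'Y' OR EOF2 = 'Y'.
    CLOSE FILE1 FILE2.
    STOP RUN.
READ-BOTH-FILES.
    EVALUATE TRUE
        WHEN REC1-CMP-KEY = REC2-CMP-KEY
            PERFORM MATCH-PARA
        WHEN REC1-CMP-KEY > REC2-CMP-KEY
            PERFORM READF2
        WHEN REC1-CMP-KEY < REC2-CMP-KEY
            PERFORM READF1
    END-EVALUATE.
MATCH-PARA.
*> Report the match, then advance FILE2 so its duplicates are seen.
    DISPLAY REC1.
    READ FILE2 AT END MOVE 'Y' TO EOF2 END-READ.
*> Advance FILE1 only once FILE2 has moved past the current key.
*> After end of file on FILE2 the content of REC2 is undefined.
    IF REC1-CMP-KEY NOT = REC2-CMP-KEY THEN
        READ FILE1 AT END MOVE 'Y' TO EOF1 END-READ
    END-IF.
READF1.
    READ FILE1 AT END MOVE 'Y' TO EOF1 END-READ.
READF2.
    READ FILE2 AT END MOVE 'Y' TO EOF2 END-READ.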

Output will be
Code:

1
2
5
5
6
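
For anyone who wants to try the matching logic outside COBOL, here is a minimal sketch in Python. It is not the program above, only the same idea under the same assumptions: both inputs are sorted on the key and file 1 has no duplicates. The name `match_sorted` is made up for illustration, and keys stand in for whole records.

```python
def match_sorted(file1_keys, file2_keys):
    """Return every file-2 key that also occurs in file 1.

    Assumes both inputs are sorted ascending and file 1 has no duplicates.
    """
    matched = []
    i = j = 0
    # Stop as soon as either input is exhausted, so no key is ever
    # compared after "end of file".
    while i < len(file1_keys) and j < len(file2_keys):
        k1, k2 = file1_keys[i], file2_keys[j]
        if k1 == k2:
            matched.append(k2)
            j += 1      # advance file 2 only, so all its duplicates are seen
        elif k1 < k2:
            i += 1      # file-1 key has no (more) partners in file 2
        else:
            j += 1      # file-2 key is not in file 1
    return matched


print(match_sorted([1, 2, 3, 5, 6], [1, 2, 4, 4, 4, 5, 5, 6]))  # [1, 2, 5, 5, 6]
```

The point to notice is the matched branch: only the file with duplicates is advanced, and file 1 moves on only when its key compares low.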
dick scherrer

Site Director


Joined: 23 Nov 2006
Posts: 19270
Location: Inside the Matrix

Posted: Wed Oct 08, 2008 1:11 am

Hello,

The posted code does not work for all cases. . .

Unfortunately, it will work some of the time. Because of insufficient testing and test data, it would fail in production. It would be better if it abended, but it will most likely just give incorrect output some of the time, which is very difficult to track down.
Escapa

Senior Member


Joined: 16 Feb 2007
Posts: 1399
Location: IL, USA

Posted: Thu Oct 09, 2008 12:27 pm

Quote:
Unfortunately, it will work some of the time. Due to insufficient testing/test data it would fail in production. It would be better if it abended, but it will most likely only give incorrect output sometimes. Very difficult to find sometimes.


Hi Dick,
I am confused by this...

Below are some ip1, ip2 and o/p combinations I have tested, and it is working as expected:

IP1 KEYS     IP2 KEYS           O/P
---------    ---------------    ---------
1,2,3,5,6    1,2,4,4,4,5,5,6    1,2,5,5,6
EMPTY        1,2,4,4,4,5,5,6    EMPTY
1,2,3,4      1,2,4,4,4,5,5,6    1,2,4,4,4
1,2,3,4      EMPTY              EMPTY
1,2,3,4      1,2,3,4            1,2,3,4
EMPTY        EMPTY              EMPTY
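
A handful of hand-picked cases can be widened cheaply. The sketch below is in Python rather than COBOL and only illustrates the testing idea: it checks the six cases from the table and then compares a sorted-file match against a brute-force reference on random inputs. The function names are made up for illustration, and it exercises the Python logic only, not the COBOL program itself.

```python
import random


def match_sorted(file1_keys, file2_keys):
    """Sorted-file match: every file-2 key that also occurs in file 1."""
    matched = []
    i = j = 0
    while i < len(file1_keys) and j < len(file2_keys):
        if file1_keys[i] == file2_keys[j]:
            matched.append(file2_keys[j])
            j += 1
        elif file1_keys[i] < file2_keys[j]:
            i += 1
        else:
            j += 1
    return matched


def match_reference(file1_keys, file2_keys):
    """Brute-force reference: keep each file-2 key found anywhere in file 1."""
    wanted = set(file1_keys)
    return [k for k in file2_keys if k in wanted]


# The six cases from the table: (ip1, ip2, expected o/p).
table_cases = [
    ([1, 2, 3, 5, 6], [1, 2, 4, 4, 4, 5, 5, 6], [1, 2, 5, 5, 6]),
    ([], [1, 2, 4, 4, 4, 5, 5, 6], []),
    ([1, 2, 3, 4], [1, 2, 4, 4, 4, 5, 5, 6], [1, 2, 4, 4, 4]),
    ([1, 2, 3, 4], [], []),
    ([1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]),
    ([], [], []),
]
for ip1, ip2, expected in table_cases:
    assert match_sorted(ip1, ip2) == expected, (ip1, ip2)
print("6 table cases pass")

rng = random.Random(2008)  # fixed seed so any failure can be reproduced
for _ in range(1000):
    # File 1: unique sorted keys. File 2: sorted keys, duplicates allowed.
    ip1 = sorted(rng.sample(range(10), rng.randint(0, 10)))
    ip2 = sorted(rng.choices(range(10), k=rng.randint(0, 15)))
    assert match_sorted(ip1, ip2) == match_reference(ip1, ip2), (ip1, ip2)
print("1000 random cases agree")
```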
star_dhruv2000

New User


Joined: 03 Nov 2006
Posts: 87
Location: Plymouth, MN USA

Posted: Tue Oct 14, 2008 3:12 pm

It would be good if you could use the sort JOINKEYS statement. The following example should clear up your issue:


Code:

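//* Step name is arbitrary; JOINKEYS reads SORTJNF1 and SORTJNF2.
//JOINSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTJNF1 DD *
1
2
3
4
//SORTJNF2 DD *
1
2
4
4
//SORTOUT  DD SYSOUT=*
//SYSIN    DD *
* With no JOIN statement, only paired (matched) records are kept.
 JOINKEYS FILE=F1,FIELDS=(1,1,A)
 JOINKEYS FILE=F2,FIELDS=(1,1,A)
 REFORMAT FIELDS=(F2:1,80)
 SORT FIELDS=COPY
/*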
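
For the sample data in this example, the difference between paired and unpaired records can be sketched in Python. This only illustrates join semantics on the key; it says nothing about how any sort product works internally, and `join_on_key` is a made-up name.

```python
from collections import Counter


def join_on_key(f1_keys, f2_keys):
    """Split two key lists into paired keys and keys unpaired on either side."""
    c1, c2 = Counter(f1_keys), Counter(f2_keys)
    paired = []
    for key in sorted(c1.keys() & c2.keys()):
        # A join emits one output record per F1/F2 combination with this key.
        paired.extend([key] * (c1[key] * c2[key]))
    only_f1 = sorted(k for k in f1_keys if k not in c2)
    only_f2 = sorted(k for k in f2_keys if k not in c1)
    return paired, only_f1, only_f2


paired, only_f1, only_f2 = join_on_key(["1", "2", "3", "4"], ["1", "2", "4", "4"])
print(paired)   # ['1', '2', '4', '4']
print(only_f1)  # ['3']
print(only_f2)  # []
```

If only the matched records are wanted, the paired list is the one to keep; key 3 exists in the first file only and so is unpaired.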


Hope this resolves your issue.

Happy coding!
Cheers
Escapa

Senior Member


Joined: 16 Feb 2007
Posts: 1399
Location: IL, USA

Posted: Tue Oct 14, 2008 3:17 pm

Quote:

It would be good if you could use the sort JOINKEYS statement.

Maybe. But as the poster put this in the COBOL forum, it seems he wants a COBOL solution.
dick scherrer

Site Director


Joined: 23 Nov 2006
Posts: 19270
Location: Inside the Matrix

Posted: Tue Oct 14, 2008 7:15 pm

Hello,

You can only use JOINKEYS if the sort for the system is Syncsort. . .
expat

Global Moderator


Joined: 14 Mar 2007
Posts: 8593
Location: Back in jolly old England

Posted: Tue Oct 14, 2008 7:47 pm

What happens in your program if neither input file is sorted?

To me, if both files need to be in sorted order before processing, why not let the sort product do all of the work in one go, rather than perform two sorts to get the input ready and then run a COBOL program to do what SORT can do anyway?

file1 =
Code:

3
1
6
5


file 2 =
Code:

6
4
1
4
2
6
4
5
4
5
6
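
To put numbers on the question for these two files, here is a sketch in Python (not the COBOL posted earlier; `match_sorted` is a made-up name) of the usual sorted-file match logic. On unsorted input it does not fail loudly, it just silently misses matches. Sorting both inputs first recovers them.

```python
def match_sorted(file1_keys, file2_keys):
    """Two-file match that is only correct when both inputs are sorted."""
    matched = []
    i = j = 0
    while i < len(file1_keys) and j < len(file2_keys):
        if file1_keys[i] == file2_keys[j]:
            matched.append(file2_keys[j])
            j += 1
        elif file1_keys[i] < file2_keys[j]:
            i += 1
        else:
            j += 1
    return matched


file1 = [3, 1, 6, 5]
file2 = [6, 4, 1, 4, 2, 6, 4, 5, 4, 5, 6]

# Unsorted input: keys 1 and 5 are in both files but never line up.
print(match_sorted(file1, file2))                  # [6, 6, 6]

# Same data, sorted first: every match is found.
print(match_sorted(sorted(file1), sorted(file2)))  # [1, 5, 5, 6, 6, 6]
```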
dick scherrer

Site Director


Joined: 23 Nov 2006
Posts: 19270
Location: Inside the Matrix

Posted: Tue Oct 14, 2008 11:33 pm

Hi Expat,

If the only thing the process needed to accomplish were the match, I might agree. What I am seeing more and more of is jobstreams that have many unneeded steps so that things can be done one at a time (using the sort or other utilities), each requiring at least one pass of all the data.

Pretty much every process I've been asked to look at lately because of poor performance has performed poorly because no one properly defined the process and people kept plugging in "one more" step. Usually a bit of design saves many of these singleton steps, but it does require that some "real" programmer be available.

While on some systems 100k or a million records is considered a large file, most of what I've supported for years has run to hundreds of millions of records and cannot afford multiple passes of the data.

The topic process almost surely needs some additional processing of the data other than just the match. . .
All times are GMT + 6 Hours