File comparison to find most efficient way

 
ashishsr123

New User


Joined: 06 May 2008
Posts: 33
Location: Chennai

Posted: Wed May 27, 2009 6:19 pm    Post subject: File comparison to find most efficient way

Hi,

Interview question!

Scenario:

We have 3 files:
File A: contains merchant IDs only, 10 in number.
File B: contains transaction records for various merchants, including the merchants in File A. This file is HUGE, with millions of records.

Output:
File C: contains the transaction details of the merchants in File A.
Note: each record in File B also carries a merchant ID. The merchant ID is numeric.

We have to find the most efficient way to achieve this. A simple algorithm is required, independent of any language.

Any ideas?

I suggest we sort File B and then do a binary search on it for each merchant ID in File A. (Now I see there will be a lot of issues with this, as a binary search would pick the first matching record it finds and move on.)

We could also do a sequential search on File B and write a record to File C whenever its merchant ID matches.

Robert Sample

Global Moderator


Joined: 06 Jun 2008
Posts: 7992
Location: Bellevue, IA

Posted: Wed May 27, 2009 6:30 pm

Write a short program that reads File A into a table in memory. Make one pass against File B, checking each merchant ID against the table in memory; if there is a match, write the File B record to File C.

Your suggestion requires random access to the file for the binary search; I'm not even sure it would be possible to implement this under VSAM. Binary search also wouldn't necessarily find the first matching record -- depending on how the keys occur, any matching record could be selected. You would then have to read backward and forward in File B to find the other matches.

As far as your suggestion of a sequential search on B, why make one pass against millions of records for each record of A when a table will reduce it to one pass, total?
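
For anyone who wants to see the table-in-memory idea spelled out, here is a minimal sketch in Python rather than COBOL. The record layout (merchant ID in the first 10 columns), the function name extract_transactions and the sample data are all assumptions made for illustration; nothing in the question fixes them.

```python
import io

def extract_transactions(merchant_ids, transactions, key_of):
    """Yield every File B record whose merchant ID is in the in-memory table."""
    # File A is tiny, so load all of its IDs into memory once.
    table = {line.strip() for line in merchant_ids if line.strip()}
    # Single sequential pass over File B, however large it is.
    for record in transactions:
        if key_of(record) in table:
            yield record

# Assumed layout: merchant ID occupies the first 10 columns of a File B record.
file_a = io.StringIO("0000000002\n0000000005\n")
file_b = io.StringIO(
    "0000000005 sale   10.00\n"
    "0000000001 sale   99.00\n"
    "0000000002 refund  5.00\n"
    "0000000005 sale    7.50\n"
)
file_c = list(extract_transactions(file_a, file_b, key_of=lambda rec: rec[:10]))
```

Neither file has to be sorted, and File B is read exactly once, so the cost grows with the size of File B only.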
ashishsr123

New User


Joined: 06 May 2008
Posts: 33
Location: Chennai

Posted: Wed May 27, 2009 8:39 pm

Quote:
table in memory


Not sure what you mean by table in memory... do you mean using OCCURS?
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

Posted: Wed May 27, 2009 9:03 pm

use sort to match a and b and output to c.
no program, only control cards.
Robert Sample

Global Moderator


Joined: 06 Jun 2008
Posts: 7992
Location: Bellevue, IA

Posted: Wed May 27, 2009 9:31 pm

Yes, OCCURS in COBOL.
sajjan jindal

New User


Joined: 09 Sep 2007
Posts: 60
Location: india

Posted: Mon Jul 06, 2009 11:42 am    Post subject: Reply to: File comparison to find most efficient way

I assume that File B will contain a single matching record for each record in File A (however, the algorithm below won't need much modification to handle multiple matching records in File B).

1. Read the first record from File A and from File B.
2. If File-A-Rec > File-B-Rec
      Read the next record from File B.
3. Else if File-A-Rec < File-B-Rec
      Read the next record from File A.
4. Else (File-A-Rec = File-B-Rec)
      Write the File B record to File C.
      Read the next record from File A.
      Read the next record from File B.
5. If neither file is at EOF, go to step 2.
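
The steps above could be sketched in Python roughly as follows. This is only a sketch under two assumptions that the steps do not state: both inputs are already in ascending merchant ID order, and File B may hold several records per merchant, so on a match only File B is advanced. The function name match_merge, the key_of parameter and the sample data are made up for illustration.

```python
def match_merge(a_keys, b_records, key_of):
    """Two-file match; both inputs must already be in ascending key order."""
    a_iter, b_iter = iter(a_keys), iter(b_records)
    a = next(a_iter, None)
    b = next(b_iter, None)
    matched = []
    # Stop as soon as either file reaches EOF (represented here by None).
    while a is not None and b is not None:
        b_key = key_of(b)
        if a > b_key:
            b = next(b_iter, None)   # File B is behind: read next B
        elif a < b_key:
            a = next(a_iter, None)   # File A is behind: read next A
        else:
            matched.append(b)        # keys match: write the B record
            b = next(b_iter, None)   # advance only B so repeated keys all match
    return matched

# Hypothetical sorted inputs: File A keys and File B (merchant ID, detail) records.
file_a = [2, 5]
file_b = [(1, "x"), (2, "a"), (2, "b"), (3, "c"), (5, "d")]
file_c = match_merge(file_a, file_b, key_of=lambda rec: rec[0])
```

If the inputs are not in sequence, the comparisons no longer mean anything and matches will be missed.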
dick scherrer

Site Director


Joined: 23 Nov 2006
Posts: 19270
Location: Inside the Matrix

Posted: Mon Jul 06, 2009 7:16 pm

Hello,

For this particular question, possibly the most important bit of information wasn't provided/posted. . .

Are file A and file B in merchant id sequence or not?

If they are not in sequence, Robert's suggestion would be most efficient.

If they are in sequence, DBZ's sort solution would be efficient.

If the rules were that this must be done with a program, code like the pseudocode from sajjan jindal could work well. This is known as a "2-file match/merge" or "line balance". There is a "Sticky" near the top of the COBOL forum that contains working code that will provide this functionality with very minor modification.
ashishsr123

New User


Joined: 06 May 2008
Posts: 33
Location: Chennai

Posted: Sat Jul 11, 2009 5:21 pm

Hello,

Dick:
Neither file is sorted.

sajjan jindal:
Your solution would not fit here, as File B can have many matching records for each record in File A.
Your solution would work only if both files are sorted. Since File B contains millions of records, sorting it is not feasible.

I think Robert's solution is the simplest and best.
Arun Raj

Moderator


Joined: 17 Oct 2006
Posts: 2237
Location: @my desk

Posted: Sat Jul 11, 2009 7:21 pm

Quote:
We have 3 files:
File A : contains Merchant ID only..10 in number.

ashishsr123,

Do you mean that you'll have 10 records in File A? If you have only a few records in File A, you can build an INCLUDE statement out of them and apply that sort card to your huge file, File B. It is similar logic to Robert's, but using a utility.

Remember, this solution can be implemented ONLY if you have a limited number of records in File-A.
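
As a rough illustration of building the INCLUDE statement from File A, here is a small Python sketch that generates the sort card text. The field position (column 1), the length (10), the CH character format and the helper name build_include_card are assumptions; adjust them to the real layout of File B, and check the generated card against your sort product's manual before using it.

```python
def build_include_card(merchant_ids, start=1, length=10):
    """Return sort control statements that keep records matching any given ID."""
    if not merchant_ids:
        raise ValueError("File A must contain at least one merchant ID")
    # One equality test per merchant ID (assumed character format, fixed width).
    tests = [f"{start},{length},CH,EQ,C'{mid}'" for mid in merchant_ids]
    lines = []
    for i, test in enumerate(tests):
        prefix = "  INCLUDE COND=(" if i == 0 else " " * 16
        # A line ending in a comma continues on the next line.
        suffix = ")" if i == len(tests) - 1 else ",OR,"
        lines.append(prefix + test + suffix)
    lines.append("  OPTION COPY")  # copy matching records without sorting
    return "\n".join(lines)

# Hypothetical File A contents.
card = build_include_card(["0000000002", "0000000005"])
print(card)
```

With ten IDs the statement stays short; with thousands it would become unwieldy, which is why this only suits a small File A.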
All times are GMT + 6 Hours