Need pattern matching algorithm

anurat · New User Joined: 19 Sep 2006 Posts: 6

Hi!
I have a situation.
There are 2 sequential files, each with an account number as a field in the record. We have to find the account numbers that match across both files and write them to one output file, and write the account numbers that do not match to a second output file.
I want the algorithm for this.

Mind you, there are more than a million records, so a normal algorithm would take a lot of time.
I want the algorithm that takes the least time.

Thank you

Anuvrat

Aji · New User Joined: 03 Feb 2006 Posts: 53 Location: Mumbai

Hi

Please find my suggestion.

1. Make both data files indexed on account number (ORGANIZATION IS INDEXED).
2. Read the first file sequentially.
(read first-file next record at end
perform close-para.)
3. Move acno1 to acno2.
4. Read the second file by key and write the records accordingly.
(i.e. read second-file not invalid key
write output1-rec
invalid key write output2-rec.)

Regards

Aji Cherian

anurat · New User Joined: 19 Sep 2006 Posts: 6

Hi! Aji

Thanks for the solution!!!!!!

But what about account numbers in the second file that are not in the first file? This logic does not seem to handle that case.

Thanks
Anuvrat.

Aji · New User Joined: 03 Feb 2006 Posts: 53 Location: Mumbai

Hi

Please see the modified logic.

2. Read the first file sequentially.
(read first-file next record at end
go to read-file2.)

4. read second file write records accordingly.
(ie. Read second-file not invalid key
write output1-rec
delete file2-rec
invalid key write output2-rec.)

read-file2.
read second-file next record at end
perform close-files.
write output2-rec.

Aji Cherian

anurat · New User Joined: 19 Sep 2006 Posts: 6

That's the problem: we cannot delete the record from the file, as it may be required for some other purpose. The only alternative is to make a local copy and work on that, but that will take a lot of time and space. We would also have to loop through the whole of the second file again.

Thanks again boss.

Anuvrat
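
Since step 1 of Aji's suggestion indexes both files, one way to cover the second file's unmatched keys without deleting anything is to run the keyed lookup in both directions. The sketch below is an assumption, not something posted in the thread: Python sets stand in for the indexed files, `match_by_lookup` is a hypothetical name, and keys are assumed unique within each file.

```python
def match_by_lookup(file1_keys, file2_keys):
    """Keyed-lookup match that never modifies either input.

    Assumptions (not from the thread): keys are unique within each file,
    and a Python set stands in for a file indexed on account number.
    """
    file1_index = set(file1_keys)
    file2_index = set(file2_keys)
    matched, unmatched = [], []

    # Pass 1: read file 1 sequentially and probe file 2 by key.
    for key in file1_keys:
        if key in file2_index:
            matched.append(key)
        else:
            unmatched.append(key)

    # Pass 2: read file 2 sequentially and probe file 1 by key.
    # Keys found in file 1 were already written as matches in pass 1,
    # so only the misses are written here. Nothing is deleted.
    for key in file2_keys:
        if key not in file1_index:
            unmatched.append(key)

    return matched, unmatched
```

This still reads the second file a second time, as noted above, but it needs no local copy and leaves both inputs intact.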

muthuvel · Posted: Thu Sep 21, 2006 5:47 pm

Hi,
A small bit of Easytrieve will provide you the solution. The only requirement is to sort both files on account number; the sorted files are then passed as input to Easytrieve, and you will get the two desired output files.

FILE INFILE1
IBD-OFFC 1 025 A
OFFC 1 008 A
*
FILE INFILE2
IBD-OFFC1 1 025 A
OFFC1 1 008 A
*
FILE OFILE1
OBD-OFFC 1 025 A
OFFCO1 1 008 A
FILE OFILE2
OBD-OFFC1 1 025 A
OFFCO2 1 008 A

*---------- JOB ---------------*

JOB INPUT (INFILE1 KEY INFILE1:OFFC +
INFILE2 KEY INFILE2:OFFC1)

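IF MATCHED
  MOVE IBD-OFFC TO OBD-OFFC
  PUT OFILE1
ELSE
* Unmatched: the record exists in only one file, so copy from that file.
  IF INFILE1
    MOVE IBD-OFFC TO OBD-OFFC1
  ELSE
    MOVE IBD-OFFC1 TO OBD-OFFC1
  END-IF
  PUT OFILE2
END-IF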

Here 25 is the record length and 08 is the key length. The process continues until the end of both files is reached.

I think this will help you.

Thanks,
Muthuvel.

DavidatK · Posted: Thu Sep 21, 2006 11:13 pm

Anuvrat,

Since this is in a COBOL forum, I'm assuming you want a COBOL solution.

Sort the account files by account number, then do a simple two-file compare. This is much faster than creating a VSAM file and processing it.

This is the most efficient way, regardless of the number of records.

Here is some pseudo code for a two file match
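
A minimal sketch of such a two-file match, offered as an illustration rather than as DavidatK's original pseudo code. Python is used only to keep it short; `two_file_match` is a hypothetical name, both inputs are assumed to be sorted ascending on account number with no duplicate keys, and `None` plays the role of the AT END condition on each file.

```python
def two_file_match(file1_keys, file2_keys):
    """Single-pass match of two key sequences.

    Assumptions (not from the thread): both inputs are sorted ascending
    and keys are unique within each file.
    """
    matched, unmatched = [], []
    it1, it2 = iter(file1_keys), iter(file2_keys)

    # Prime the loop with the first record of each file; None means end of file.
    key1, key2 = next(it1, None), next(it2, None)

    while key1 is not None and key2 is not None:
        if key1 == key2:
            # Same account in both files: write to the "matched" output
            # and advance both files.
            matched.append(key1)
            key1, key2 = next(it1, None), next(it2, None)
        elif key1 < key2:
            # File 1 is behind, so this account cannot be in file 2.
            unmatched.append(key1)
            key1 = next(it1, None)
        else:
            # File 2 is behind, so this account cannot be in file 1.
            unmatched.append(key2)
            key2 = next(it2, None)

    # One file has ended: everything left in the other is unmatched.
    while key1 is not None:
        unmatched.append(key1)
        key1 = next(it1, None)
    while key2 is not None:
        unmatched.append(key2)
        key2 = next(it2, None)

    return matched, unmatched
```

Each file is read exactly once and neither is modified, so after the sort the match itself is linear in the total number of records.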

mmwife · Super Moderator Joined: 30 May 2003 Posts: 1592

Deleted by author.