Compare two VB files of length 5000


Mazahar

New User


Joined: 11 Dec 2007
Posts: 82
Location: hyderabad

Posted: Thu May 07, 2009 1:08 pm

Hi All,

Can anyone tell me how to compare two VB files of LRECL 5000?

1) I need to compare the first record of the first file with all the records in the second file. If it matches, I need to write it to the MATCH file; if not, I need to write it to the DIFF file.

2) I need to do this for all the records in file 1.

3) Then I need to read the first record of the second file and compare it with all the records of the first file. If it matches, I need to write it to the MATCH file; if not, to the DIFF file.

I tried PL/I and COBOL. For small files it works fine, but my files have 17 lakh (1.7 million) records, so it takes millions of file OPEN and CLOSE statements.

I tried ICETOOL, but it does not work for lengths greater than 4080.
enrico-sorichetti

Superior Member


Joined: 14 Mar 2007
Posts: 10873
Location: italy

Posted: Thu May 07, 2009 1:14 pm

Quote:
I tried a PLI, COBOL for small files its working fine, but my files are having 17lk records because of that its taking millions of file open close statements

I know that the answer is not sort related, but since the TS posted a programming issue,
he deserves an answer.

Matching algorithms for two (or n) files work best if the files are ordered/sorted on the fields used for the match,
so that all that is necessary is parallel reads with some comparisons
(ONE OPEN AND CLOSE for each dataset for the whole process).
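
The parallel-read idea can be sketched roughly as follows. This is only an illustration in Python, not mainframe code: in-memory lists stand in for the two sequentially read datasets, the whole record is the match key, and the function name `match_sorted` and the record "no charge for transfers" are made up for the example.

```python
def match_sorted(file_a, file_b):
    """One parallel pass over two sorted record lists; returns (matches, diffs)."""
    matches, diffs = [], []
    i = j = 0
    while i < len(file_a) and j < len(file_b):
        if file_a[i] == file_b[j]:
            # Record is present in both inputs: advance both.
            matches.append(file_a[i])
            i += 1
            j += 1
        elif file_a[i] < file_b[j]:
            # Record is only in the first input: advance the first.
            diffs.append(file_a[i])
            i += 1
        else:
            # Record is only in the second input: advance the second.
            diffs.append(file_b[j])
            j += 1
    # Whatever is left over in either input has no partner.
    diffs.extend(file_a[i:])
    diffs.extend(file_b[j:])
    return matches, diffs


# Both inputs must be sorted on the match key before the pass.
a = sorted([
    "these are the closing charges",
    "5% on closing charges",
    "10% we will charge for opening",
])
b = sorted(["5% on closing charges", "no charge for transfers"])
matches, diffs = match_sorted(a, b)
print(matches)
print(diffs)
```

Each input is walked exactly once, which is what makes a single OPEN and CLOSE per dataset sufficient.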
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

Posted: Thu May 07, 2009 1:17 pm

Hello,

You might try to define the compare as 2 smaller lengths rather than one of 5000.

If that does not work for you:

Sort both files on the "match key".

Download the sample "match/merge" code from the "sticky" near the top of the COBOL forum and change it to suit your files.

Run the match/merge and create whatever output files you need.
Mazahar

New User


Joined: 11 Dec 2007
Posts: 82
Location: hyderabad

Posted: Thu May 07, 2009 2:07 pm

Hmmm... right. If the file were in sorted order or had a key, I could compare pretty easily, but the problem is that we have neither a key nor a sorted file. The file has some text in each record.

The file records are something like this:

these are the closing charges
5% on closing charges
10% we will charge for opening
enrico-sorichetti

Superior Member


Joined: 14 Mar 2007
Posts: 10873
Location: italy

Posted: Thu May 07, 2009 2:46 pm

On what fields, in Your program, are You carrying out the comparison?

If the records are completely unformatted, then the field is the whole record.

Please define Your requirements in a better way.

If You were able to write a program,
You should also be able to describe the algorithm in a better way.
Mazahar

New User


Joined: 11 Dec 2007
Posts: 82
Location: hyderabad

Posted: Thu May 07, 2009 3:08 pm

Yes Enrico,

The field is the complete record.

Let us say I have File A and File B, and I need to compare these two.

This is what I wrote:

read fileA into fileA_rec;
do while (not fileA_eof);
   call check_file2;
   read fileA into fileA_rec;
end;

check_file2: proc;
   close fileB;   /* so that the second file's cursor moves to the first record again */
   open fileB;
   rec_found = false;
   read fileB into fileB_rec;
   do while (not fileB_eof & not rec_found);
      if fileA_rec = fileB_rec then
         rec_found = true;
      else
         read fileB into fileB_rec;
   end;
   if rec_found then
      write matching;
   else
      write not matching;
end check_file2;
subinraj

New User


Joined: 04 Sep 2007
Posts: 16
Location: Bangalore

Posted: Thu May 07, 2009 4:15 pm

Try the following steps.

1. Sort FILEA and FILEB with the following sort card:

Code:
OPTION VLSHRT           
SORT FIELDS=(1,5000,CH,A)


2. Use the sorted input files and create a program with the following steps:
Code:
Read FILEA into FILEA-Rec; at end, set FILEA-EOF and move HIGH-VALUES to FILEA-Rec
Read FILEB into FILEB-Rec; at end, set FILEB-EOF and move HIGH-VALUES to FILEB-Rec

PERFORM UNTIL FILEA-EOF AND FILEB-EOF
   EVALUATE TRUE
   WHEN FILEA-Rec = FILEB-Rec
      <Record is present in both files. Do the required processing>
      Read next FILEA (as above)
      Read next FILEB (as above)

   WHEN FILEA-Rec < FILEB-Rec
      <Record is present in FILEA and not in FILEB.
       Write the FILEA-Rec to DIFF file>
      Read next FILEA (as above)

   WHEN OTHER
      <Record is present in FILEB and not in FILEA
       (or FILEA is at end, so FILEA-Rec is HIGH-VALUES).
       Write the FILEB-Rec to DIFF file>
      Read next FILEB (as above)

   END-EVALUATE
END-PERFORM

Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Thu May 07, 2009 9:49 pm

subinraj,

If you had actually bothered to test your control statements:

Code:

  OPTION VLSHRT           
  SORT FIELDS=(1,5000,CH,A)


you would have found out that they don't work! The maximum control field that DFSORT can use is 4092. 5000 is too large.

Please don't post untested "solutions" in this Forum.
subinraj

New User


Joined: 04 Sep 2007
Posts: 16
Location: Bangalore

Posted: Fri May 08, 2009 1:39 pm

Frank,

Thanks for pointing out the error. Actually i tested with a lower length file.
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Fri May 08, 2009 9:27 pm

Quote:
Actually i tested with a lower length file.


I don't know what you mean by that (LRECL? maximum record length?). Your statement gets a syntax error regardless of the length of the file.
Mazahar

New User


Joined: 11 Dec 2007
Posts: 82
Location: hyderabad

Posted: Sat May 09, 2009 12:42 am

I have achieved it... but with a lot of processing.

1) Wrote a REXX to invoke SuperC, and got FileA.

2) Took FileA and wrote a sort to strip out the I and D records (the ones we get when we run option 3.13), and got FILEB.

3) Wrote a PL/I program to strip the FILEB records out of the actual file. I wrote PL/I because I can't use the records in a sort (for a sort I would need an INCLUDE or OMIT condition specified for the records of the file). This gave FileC (FileC now has the difference records).

4) Wrote another PL/I program to separate FileC from the actual file, and got two files: one with the differences and one with the actuals.

I don't know whether what I did is right or wrong, but I have got the outputs.

If anyone can suggest a better way than this, that would be a great help.

Expecting something with sorts only from Frank and Xpact :-)
Skolusu

Senior Member


Joined: 07 Dec 2007
Posts: 2205
Location: San Jose

Posted: Sat May 09, 2009 1:04 am

mazahar,

You can try to implement the solution listed here

www.ibmmainframes.com/viewtopic.php?p=125375#125375
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

Posted: Sat May 09, 2009 1:09 am

Hmmm. . . .

That reply does not look like it came from "Frank and Xpact" (rolls eyes)
Mazahar

New User


Joined: 11 Dec 2007
Posts: 82
Location: hyderabad

Posted: Sun May 10, 2009 11:29 am

Dick Scherrer... sorry if I hurt you. I was looking for something sort-based, which is why I specified their names. Really sorry about that.

Skolusu... thanks for your help. I will try it tomorrow.

If I have thousands of account numbers in a file, can I select the matching records from another file without giving the account numbers in SYSIN? I mean, I have a file like this.

This is my VB file of LRECL 4504, with lakhs (hundreds of thousands) of records:
XXXXXXXXXX120XXXXXXXXX
XXXXXXXXXX121XXXXXXXXX
XXXXXXXXXX122XXXXXXXXX
XXXXXXXXXX123XXXXXXXXX
XXXXXXXXXX124XXXXXXXXX
XXXXXXXXXX125XXXXXXXXX
XXXXXXXXXX126XXXXXXXXX
XXXXXXXXXX127XXXXXXXXX
XXXXXXXXXX128XXXXXXXXX

All the account numbers are in the 10th position. I have another file like this, with all the account numbers that I need to select from that file:
121
125
128
127

and so on, for thousands of account numbers.

What I know is that I need to give all these account numbers in SYSIN with an INCLUDE condition, but copying and pasting thousands of account numbers takes a lot of time. Is there any way I can select the records directly, using this file as the source?
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

Posted: Mon May 11, 2009 8:17 am

Hello,

Quote:
Sorry if i hurt you........i was looking for some sort things, that is why i specified their names
No harm was done to me - my post was for humor and a bit of guidance for you :-)

Suggest you reflect that when one asks for help, one should not specify who they want to respond.

In this part of the forum (DFSORT) most topics are handled by Frank and Skolusu (who is also on the sort product development team at IBM).
Mazahar

New User


Joined: 11 Dec 2007
Posts: 82
Location: hyderabad

Posted: Mon May 11, 2009 10:14 pm

:-)

Cool.....

Can anyone help me with an answer to my query from my previous post?



Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Mon May 11, 2009 11:25 pm

So for output, you want the records from input file1 that have a matching account number in input file2 ... right?

You said input file1 has RECFM=VB and LRECL=4504.

What is the RECFM and LRECL of input file2?

What is the starting position, length and format of the account numbers in input file1?

What is the starting position, length and format of the account numbers in input file2?

Can there be duplicate account numbers in input file1 (e.g. two records with account number 123?).
Mazahar

New User


Joined: 11 Dec 2007
Posts: 82
Location: hyderabad

Posted: Tue May 12, 2009 6:34 pm

Frank,

"So for output, you want the records from input file1 that have a matching account number in input file2 ... right? "

YES

The second file is 80-byte FB.

In the first file, the starting position is 10 and the length is 10.

In the second file, the starting position is 8 and the length is 10, and the format is character.
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Tue May 12, 2009 9:22 pm

Quote:
Can there be duplicate account numbers in input file1 (e.g. two records with account number 123?).


You didn't answer this question, so I'll assume there are no duplicates within input file1.

Quote:
Starting position is 10 and lengtha is 10 is first file


I'll assume that you didn't count the RDW in positions 1-4, so the field really starts in position 14.

Given those assumptions, here's a DFSORT/ICETOOL job that will do what you asked for.

Code:

//S1   EXEC  PGM=ICETOOL
//TOOLMSG   DD  SYSOUT=*
//DFSMSG    DD  SYSOUT=*
//IN1 DD DSN=...  input file1 (VB/4504)
//IN2 DD DSN=...  input file2 (FB/80)
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(MOD,PASS)
//OUT DD DSN=...  output file (VB/4504)
//TOOLIN DD *
COPY FROM(IN1) TO(T1)
COPY FROM(IN2) USING(CTL2)
SELECT FROM(T1) TO(OUT) ON(14,10,CH) FIRSTDUP
/*
//CTL2CNTL DD *
  INREC BUILD=(10:8,10)
  OUTFIL FNAMES=T1,FTOV
/*
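
For readers who want to follow the logic of the job, here is a rough Python model of the three TOOLIN steps. It is only a sketch, not DFSORT itself. It works on the record data without the RDW, so the file1 key is data columns 10-19 (slice `[9:19]`) and the file2 key is columns 8-17 (slice `[7:17]`). It assumes there are no duplicate account numbers within either input file (the post above only states this for input file1) and that records with equal keys keep their input order when sorted. The function name `select_firstdup` and the sample records are made up for illustration.

```python
from itertools import groupby


def select_firstdup(file1_records, file2_records):
    """Model of the TOOLIN steps: concatenate, sort on the key, keep FIRSTDUP."""
    # COPY FROM(IN1) TO(T1): file1 records go in first; key is data columns 10-19.
    t1 = [(rec[9:19], rec) for rec in file1_records]
    # COPY FROM(IN2) USING(CTL2): move the key from columns 8-17 to columns 10-19
    # so that it lines up with the file1 key, then append to T1 (DISP=MOD).
    for rec in file2_records:
        key = rec[7:17]
        t1.append((key, " " * 9 + key))
    # SELECT ... FIRSTDUP: sort on the key (Python's sort is stable, so a file1
    # record stays ahead of a file2 record with the same key), then keep the
    # first record of every key that occurs more than once.
    t1.sort(key=lambda pair: pair[0])
    out = []
    for _, group in groupby(t1, key=lambda pair: pair[0]):
        records = [rec for _, rec in group]
        if len(records) > 1:
            out.append(records[0])
    return out


# Made-up sample data: 9 filler bytes, a 10-byte account number, then more data.
file1 = [
    "XXXXXXXXX0000000120 first",
    "XXXXXXXXX0000000121 second",
    "XXXXXXXXX0000000125 third",
]
# FB/80 lookup records with the account number in columns 8-17.
file2 = [
    "YYYYYYY0000000125".ljust(80),
    "YYYYYYY0000000121".ljust(80),
]
selected = select_firstdup(file1, file2)
for rec in selected:
    print(rec)
```

Because the file1 records are written to T1 before the reformatted file2 records, the first record of each duplicated key is the full file1 record, which is why the output contains only file1 records whose account number also appears in file2.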
All times are GMT + 6 Hours