SuperCE in sort ?

Vijay Subramaniyan · New User Joined: 06 Jul 2011 Posts: 14 Location: india

Hi,

I have two PS files in VB format. Each file is of length 27500. I want to compare the two files record by record. I need the matched records in one dataset and the unmatched records in the other. I tried SuperCE for this. It has been running for more than an hour and a half and the job still has not completed. Is there any way to accomplish this in sort? I think the maximum number of bytes that we can specify in the control fields is 4082 odd.

Bill Woodger · Posted: Wed Sep 05, 2012 12:46 pm

Can you paste from the screen/batch job all the options/control cards that you are using for the SuperCe, please?

dbzTHEdinosauer · Posted: Wed Sep 05, 2012 1:35 pm

other than record count,
blocking factor of files
and bufno parm of dd statements.

Vijay Subramaniyan · New User Joined: 06 Jul 2011 Posts: 14 Location: india

This is the code used.
The block size of the file is 27504

//SUPERC EXEC PGM=ISRSUPC,
// PARM=(CHNGL,LINECMP,
// '',
// '')
//NEWDD DD DSN=DATX00D.PB.DXP2.HRDCPY.SCL.NEW1,
// DISP=SHR
//OLDDD DD DSN=DATX00D.GB.DXP2.HRDCPY.SCL.OUT,
// DISP=SHR
//OUTDD DD DSN=DATX00D.COVER.PAGE.TR.WORK1,DISP=(NEW,CATLG,DELETE),
// DATACLAS=JUMBO,DCB=(DATX00D.GB.DXP2.HRDCPY.SCL.OUTFILE)

enrico-sorichetti · Posted: Wed Sep 05, 2012 2:04 pm

superce is not for application file compare ...
it is for source code compare,

the joinkey process assumes sorted data,
superce does not make any assumptions,

joinkey is record oriented
superce is block oriented
( the update option will generate the update cards to get from source1 to source2)

the conclusion ... You are using the wrong tool

in this case faster to use the two file compare COBOL program that You can find here

dick scherrer · Posted: Wed Sep 05, 2012 6:45 pm

Hello,

Vijay Subramaniyan · New User Joined: 06 Jul 2011 Posts: 14 Location: india

Thanks Dick.

But your code seems to work only if the files are in sequence.

The two input files that I have are of length 27500.

To put these files in sequence, I cannot use a sort card like the one below, can I?

SORT FIELDS=(5,27296,CH,A)

What would you advise to achieve this (putting both files in sequence)?

Bill Woodger · Posted: Sat Sep 08, 2012 8:07 pm

If your files are not in the same sequence as each other, then you'll have a problem using anything to compare them.

Is there anything which makes the records unique? You could sort on that, without having to sort on the whole thing. If you get "nearly unique" you might have some amount fields you can include in the sort.

Vijay Subramaniyan · New User Joined: 06 Jul 2011 Posts: 14 Location: india

Bill,

I don't see anything unique. Both files are AFP files. Is there no other option for achieving this? Can we achieve this with a COBOL sort?

dbzTHEdinosauer · Posted: Sat Sep 08, 2012 8:33 pm

why don't you explain how you are generating the two files,
and why they would have to be sorted for comparison?
you were not sorting them for the superc,
why now???

this has indeed been a thread meandering everywhere,
because no explanation or direction was given.

a software engineer started this silliness and the train has been derailed.

Bill Woodger · Posted: Sat Sep 08, 2012 8:38 pm

You are trying to compare reports?

Are the input data the same?

Are the reports supposed to be different, or the same?

dbzTHEdinosauer · Posted: Sat Sep 08, 2012 8:58 pm

Bill Woodger · Posted: Sat Sep 08, 2012 10:11 pm

Well, for the picky...

You are trying to compare "documents"?

Are the input data the same?

Are the "documents" supposed to be different, or the same?

Reason being, if they (whatever they are as represented by your records) are supposed to be the same, then you do a one-to-one match on a "record number".

If the inputs are "different but equivalent" and the document definitions are unchanged, compare the inputs.

If everything is changed and the outputs are supposed to be different, then go whistle.

dbzTHEdinosauer · Posted: Sat Sep 08, 2012 10:31 pm

dick scherrer · Posted: Sun Sep 09, 2012 1:03 am

Hello,

Yes, matching 2 files requires they be in sequence.

If all you want to know is whether there is a difference, change the sample code to just read a record from each file and compare them. If they are equal, read the next. If they are not equal, show the difference and stop, because there would be no way for this code to determine which file was "different". Show the record number to make it easier to look at the files and see why they differ.

The program would run to end of job if they all match and terminate when an unequal is found.
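
A minimal sketch of that read-and-compare loop, in Python rather than COBOL purely for illustration. The function name first_difference and the sample records are made up for the example; on the mainframe the two inputs would be the two datasets read sequentially.

```python
from itertools import zip_longest


def first_difference(old_records, new_records):
    """Return (record_number, old, new) for the first unequal pair, or None.

    A record missing from the shorter input is reported as None.
    """
    pairs = zip_longest(old_records, new_records)
    for number, (old, new) in enumerate(pairs, start=1):
        if old != new:
            # Stop at the first mismatch; we cannot tell which side is "wrong".
            return number, old, new
    return None


# Illustrative data only.
old = [b"HDR", b"LINE 1", b"LINE 2"]
new = [b"HDR", b"LINE X", b"LINE 2"]
print(first_difference(old, new))
```

If the files match all the way through, the function returns None, which corresponds to the program running to end of job.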

Bill Woodger · Posted: Sun Sep 09, 2012 1:18 am

The attempt to match with SuperCE might(!) imply that the files do not match very well.

You really need to tell us what you are trying to match. We can assume "documents" as you mentioned AFP. Should they be the same? Identical, or logical, for instance if there is a "time" anywhere in the document pages? Does AFP put in control information of some sort, specific to the job?

I'm suspecting you're going to have to rethink it. However, try to answer everything and we'll get a clearer picture of what you have.

If they are "documents" there seems little point in sorting the records.

If not documents you might test sorting seven times on chunks of 4000, with EQUALS. However, you seem to have variable-length records which would throw additional spanners.

Do you have the same, exactly, number of records on each? If not....

If so, you could extract the first 4000 bytes, including the RDW, sort on the whole thing and compare that using JOINKEYS.

I think you'll end up rethinking, no matter what the requirement says...
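
To show why "sorting seven times on chunks of 4000, with EQUALS" could give the same order as one sort on the whole record, here is a rough Python sketch. It assumes a stable sort (Python's list.sort is stable, which plays the role of EQUALS) and sorts on the last chunk first. The function name sort_in_chunks is invented for the example. Note that Python slicing quietly returns a shorter key for a short record; a real sort step on VB data would need its own handling of short records, which is the "additional spanners" part.

```python
def sort_in_chunks(records, chunk=4000):
    """Order records by their full content using repeated stable sorts,
    each pass keyed on one fixed-size chunk of the record."""
    longest = max((len(r) for r in records), default=0)
    result = list(records)
    # Least significant (rightmost) chunk first, most significant last.
    # Each pass keeps the relative order of equal keys from earlier passes.
    for start in reversed(range(0, longest, chunk)):
        result.sort(key=lambda r: r[start:start + chunk])
    return result


# Illustrative data with a tiny chunk size so the passes are visible.
records = [b"BBBA", b"AAAB", b"BBAA", b"AAAA", b"AAA"]
print(sort_in_chunks(records, chunk=2) == sorted(records))
```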

enrico-sorichetti · Posted: Sun Sep 09, 2012 12:02 pm

enrico-sorichetti · Posted: Sun Sep 09, 2012 1:15 pm

and in general, to simply check if two files match, the most effective approach is to use

BYTE compare
FMSTOP ==> stop at first mismatch

Vijay Subramaniyan · New User Joined: 06 Jul 2011 Posts: 14 Location: india

Bill,

I would respond to your question with the following case

An input file having 1000 records is processed/formatted through a COBOL program giving an output file that has the same number of records .

The same input file having 1000 records is now processed/formatted through another COBOL program giving an output file that has 1010 records ( i. e 10 records have been inserted somewhere, the formatting of the remaining 1000 records are the same as the first COBOL program)

Now I want to compare both output files, and the result I expect is the 10 records which were inserted.

Please tell me if I am not clear.

dbzTHEdinosauer · Posted: Mon Sep 10, 2012 2:22 pm

let me see if i have it correctly:

file-A goes into pgm-1 creating file-B

file-A goes into pgm-2 creating file-B with additional records.

unless the goal is
to prove that pgm-2 creates the same stuff as pgm1 with the addition of new records
so that pgm-1 can be removed from the system,
why have both pgm-1 and pgm-2?

why not test both old-file-B and new-file-B
by inputting them to pgm-3 and ensuring that the results are correct?

Bill Woodger · Posted: Mon Sep 10, 2012 3:11 pm

Where does AFP come into this?

You do a "sideways" match.

Your "driver" is the first output.

Read driver until end of file.

With each record from driver:

Read record on subsidiary (the second output).

If records are equal, all is well, continue from Read driver...

If unequal, write to output file.

Continue from Read record on subsidiary...

When end of file on driver, write any remaining records on subsidiary to the output file.

At the end of this you should have your 10 extra records. You could include "record numbers" on the output if you wanted to know where they came from :-)

Note: this is a description of the process, not an indication as to how to structure your program :-)
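
The process described above might look something like this in Python (again, only a sketch of the logic, not a suggested program structure). It assumes the subsidiary is the driver plus inserted records, in the same order; the name extra_records and the sample data are hypothetical.

```python
def extra_records(driver, subsidiary):
    """Return (record_number, record) for subsidiary records that have
    no matching record on the driver."""
    extras = []
    sub = enumerate(subsidiary, start=1)  # carries the "record numbers"
    for wanted in driver:
        # Read the subsidiary until it catches up with the driver record.
        for number, record in sub:
            if record == wanted:
                break
            extras.append((number, record))
        else:
            raise ValueError("subsidiary ended before a driver record matched")
    # End of file on driver: whatever is left on the subsidiary is extra.
    extras.extend(sub)
    return extras


# Illustrative data only.
driver = [b"A", b"B", b"C"]
subsidiary = [b"A", b"X", b"B", b"C", b"Y"]
print(extra_records(driver, subsidiary))
```

One caveat with this approach: if an inserted record happens to be identical to the next driver record, the sketch cannot tell which of the two is the "extra" one.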

Vijay Subramaniyan · New User Joined: 06 Jul 2011 Posts: 14 Location: india

Bill Woodger · Posted: Mon Sep 10, 2012 4:07 pm

OK, if the output is supposed to be identical apart from the "extras" then you can proceed as I suggested.

However, things like dates/times of production can mess you up. AFP control information...

Code it up. Test it with your test data (into bad program and good program) and see if you have any problems with information outside of your control varying between the runs.

If you have a problem with that, establish whether it can reasonably be "masked" in its native format.

If it can't be masked natively, how about taking the "spool" files and masking?
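
As an illustration of what "masking" could mean here, a small Python sketch that blanks out known volatile fields before comparing. The offset and length in VOLATILE are pure assumptions for the example; where a date or time actually sits in an AFP record would have to be established from the real data.

```python
def mask(record, fields, fill=b"*"):
    """Return a copy of record with each (offset, length) field overwritten.

    fill must be a single byte so the record length is unchanged.
    """
    masked = bytearray(record)
    for offset, length in fields:
        end = min(offset + length, len(masked))  # tolerate short records
        if offset < end:
            masked[offset:end] = fill * (end - offset)
    return bytes(masked)


# Hypothetical layout: an 8-byte run time starting at offset 10.
VOLATILE = [(10, 8)]
a = b"REPORT A  12:46:01 PAGE 1"
b = b"REPORT A  13:35:59 PAGE 1"
print(mask(a, VOLATILE) == mask(b, VOLATILE))
```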