IBM Mainframe Forum Index
Improve performance of file comparison


IBM Mainframe Forums -> JCL & VSAM
Priyanka Pyne

New User


Joined: 09 Feb 2008
Posts: 95
Location: India

PostPosted: Thu Sep 29, 2011 9:27 pm

Hi,

I have created a tool using REXX and JCL which compares two mainframe files in copybook layout. I am using File Manager to compare the files, but it is taking a long time to produce the output. Is there any way to expedite the comparison?
enrico-sorichetti

Superior Member


Joined: 14 Mar 2007
Posts: 10886
Location: italy

PostPosted: Thu Sep 29, 2011 9:34 pm

Quote:
But it is taking long time to give the output result.

BIBSD on... define the issue in measurable terms.
"Long time"/"short time" and "good performance"/"bad performance" are pretty useless terms of comparison,
and usually depend on which side the opponents stand.

But... I really do not see how we could advise about it, given the scarce info You provided.
The fact that You used REXX to build the JCL is irrelevant.

It is a good habit to <debug> by steps:
what happened when You submitted the same compare with hand-crafted JCL?
Priyanka Pyne

New User


Joined: 09 Feb 2008
Posts: 95
Location: India

PostPosted: Thu Sep 29, 2011 9:51 pm

I tried with about 50k records and it is taking 20 minutes to complete the comparison.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19243
Location: Inside the Matrix

PostPosted: Thu Sep 29, 2011 11:47 pm

Hello,

What happens when you compare the files with SUPERC?
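For anyone wanting to try this, a stand-alone batch SuperC compare can be sketched roughly as follows. The data set names are placeholders, and depending on how the site is set up, a STEPLIB for the ISPF load library may also be needed:

Code:
```
//SUPERC   EXEC PGM=ISRSUPC,
//         PARM=(DELTAL,LINECMP)
//NEWDD    DD DISP=SHR,DSN=your.new.file
//OLDDD    DD DISP=SHR,DSN=your.old.file
//OUTDD    DD SYSOUT=*
```

The CPU time and EXCP counts from this step give a baseline to compare against the File Manager run.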
Priyanka Pyne

New User


Joined: 09 Feb 2008
Posts: 95
Location: India

PostPosted: Fri Sep 30, 2011 1:13 am

SuperC also takes almost the same amount of time, and moreover it does not provide the output in copybook layout; my requirement is to have the result in copybook layout.
Akatsukami

Global Moderator


Joined: 03 Oct 2009
Posts: 1787
Location: Bloomington, IL

PostPosted: Fri Sep 30, 2011 1:36 am

Priyanka Pyne wrote:
I tried with about 50k records and it is taking 20 minutes to complete the comparison.

What are the LRECLs of your data sets? What does your JCL look like?
anatol

Active User


Joined: 20 May 2010
Posts: 121
Location: canada

PostPosted: Fri Sep 30, 2011 1:37 am

I found SORT is better, when you split into no-dups and dups: the no-dups are the records that do not match; the dups are the same in both files.
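What anatol describes can be sketched with ICETOOL's SELECT operator: concatenate both (sorted) files and split records whose key appears in both inputs from records whose key appears only once. Everything here is a placeholder except the key position/length, which matches the 1,9 CHAR key from the control cards posted later in this thread:

Code:
```
//SPLIT    EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//IN       DD DISP=SHR,DSN=file1.placeholder
//         DD DISP=SHR,DSN=file2.placeholder
//DUPS     DD DSN=dups.placeholder,DISP=(,CATLG)
//NODUPS   DD DSN=nodups.placeholder,DISP=(,CATLG)
//TOOLIN   DD *
  SELECT FROM(IN) TO(DUPS) ON(1,9,CH) ALLDUPS DISCARD(NODUPS)
/*
```

Note this only tells you which keys are matched or unmatched; it does not show field-level differences the way a formatted File Manager compare does.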
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19243
Location: Inside the Matrix

PostPosted: Fri Sep 30, 2011 1:59 am

Hello,

With only 50k records, the superc takes nearly 20 minutes also ?

One guess is that there is something horribly wrong with the definition of the file(s). Are both files in sequence before the compare is run?

I periodically compare several million records of 12k record length and this only takes a few minutes - less than 10 even on a bad day.
Priyanka Pyne

New User


Joined: 09 Feb 2008
Posts: 95
Location: India

PostPosted: Fri Sep 30, 2011 2:01 am

LRECL is 2094.

JCL looks like
Code:
//FILEMGR  EXEC PGM=FMNMAIN,TIME=MAXIMUM                           
//STEPLIB  DD DISP=SHR,DSN=SYS1.FILEMNGR.SFMNMOD1                 
//         DD DISP=SHR,DSN=SYSC090.COBOLZOS.PROD.SIGYCOMP         
//*FMNCOB  DD DUMMY     Uncomment to force use of FM COBOL Compiler
//*SYSPRINT DD SYSOUT=*                                           
//*.SYSPRINT.ORD.USERNME.TEST.SYSPRINT                           
//SYSPRINT DD DSN=ORD.USERNME.TEST.SYSPRINT,                     
//         SPACE=(CYL,(900,900),RLSE),                             
//         DISP=(,CATLG,DELETE),                                   
//         DCB=*.SORT1.SORTIN                                     
//FMNTSPRT DD SYSOUT=*                                             
//SYSTERM  DD SYSOUT=*                                             
//SYSIN    DD *                                                   
$$FILEM DSCMP TYPE=FORMATTED, 
$$FILEM DSNOLD=FILE1,
$$FILEM DSNNEW=FILE2
/*                                             
//*.FMINSOUT.ORD.&USERNME.TEST.INSERTED.T     
//FMINSOUT DD DSN=ORD.USERNME.TEST.INSERTED.T,
//         SPACE=(CYL,(900,900),RLSE),         
//         DISP=(,CATLG,DELETE),               
//         DCB=*.SORT1.SORTIN                   


Before comparing, I am sorting both files.

I have copied only the comparison step here.
Priyanka Pyne

New User


Joined: 09 Feb 2008
Posts: 95
Location: India

PostPosted: Fri Sep 30, 2011 2:03 am

Hi Dick,

Are you referring to SuperC?

I cannot use SuperC as I need the output result in copybook layout.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19243
Location: Inside the Matrix

PostPosted: Fri Sep 30, 2011 2:09 am

Hello,

Yes, i understand that you don't want the superc format of the data. I was trying to understand why the superc took about the same amount of time as the filemanager run (unless i have misunderstood this).

I would expect the filemanager run to use significantly more time than superc but if they use a similar amount it may help to learn why.

All of the field level formatting takes a lot of time, but i still have no idea why 50k records takes 20 minutes. How much cpu time does each run use? EXCPs?
Akatsukami

Global Moderator


Joined: 03 Oct 2009
Posts: 1787
Location: Bloomington, IL

PostPosted: Fri Sep 30, 2011 2:12 am

Priyanka Pyne wrote:
Hi Dick,

Are you referring to SuperC?

I cannot use SuperC as I need the output result in copybook layout.

But a test run using SuperC will provide information useful for diagnosis. If a SuperC compare of the data also takes 20 minutes, there is a problem with the data or data set. If it only takes 20 milliseconds, then there is a problem with the File Mangler control cards.
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

PostPosted: Fri Sep 30, 2011 2:31 am

What is the block size for FILE1 and FILE2? Which, by the way, you don't bother to provide.

Why are you forcing DCB parms on output files to be the same as some other existing file's (DCB=*.SORT1.SORTIN),
which, since it is not a SORT output, is probably not optimized?

and by the way, what do you mean by:
Quote:
compare two mainframe files in copybook layout


and if there is a copybook format involved, where is the reference?
Priyanka Pyne

New User


Joined: 09 Feb 2008
Posts: 95
Location: India

PostPosted: Fri Sep 30, 2011 3:19 am

The reason is that I am using this JCL as the backend of a REXX tool, where one can provide any type of input file; hence the length of the output file cannot be predetermined, so I am copying the DCB of the input file.

By "copybook layout" I meant the record layout of the file.

I am copying the control cards which I have used for this purpose.

Code:
$$FILEM DSCMP TYPE=FORMATTED,             
$$FILEM PACK=UNPACK,                     
$$FILEM SYNCH=KEYED,                     
$$FILEM KEYLOCOLD=1,                     
$$FILEM KEYLOCNEW=1,                     
$$FILEM KEYLEN=9,                         
$$FILEM KEYTYPE=CHAR,                     
$$FILEM LIST=LONG,                       
$$FILEM WIDE=YES,                         
$$FILEM HILIGHT=YES,                     
$$FILEM CHNGDFLD=YES,                     
$$FILEM IGNLEN=YES,                       
$$FILEM EXCLUDE=(,,MATCHED,),             
$$FILEM NUMDIFF=ALL,                     
$$FILEM DSNOLD=xxx.TEST.VSAM.FILE1,   
$$FILEM TCOLD=SYS2.xxxx.COPYLIB(zzzz),
$$FILEM LANG=COBOL,                       
$$FILEM SKIPOLD=0,                       
$$FILEM CMPOLD=ALL,                       
$$FILEM TCNEW=SYS2.xxxx.COPYLIB(zzzz),
$$FILEM SKIPNEW=0,                       
$$FILEM CMPNEW=ALL,                       
$$FILEM DSNNEW=xxx.TEST.VSAM.FILE2   


Previously I had provided wrong information about the record count; I am really sorry for that. The files have about 686,721 records. Because of that I was not able to open them in View mode and did not get the exact count.
prino

Senior Member


Joined: 07 Feb 2009
Posts: 1314
Location: Vilnius, Lithuania

PostPosted: Fri Sep 30, 2011 1:56 pm

Priyanka Pyne wrote:
... The files have about 686,721 records. Because of that I was not able to open them in View mode and did not get the exact count.

And obviously you've never heard of Browse, which can open files with zillions of records (if you have sithloads of time ;) )
Nic Clouston

Global Moderator


Joined: 10 May 2007
Posts: 2454
Location: Hampshire, UK

PostPosted: Fri Sep 30, 2011 2:00 pm

Or read the sort counts from the sort step?
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

PostPosted: Fri Sep 30, 2011 2:15 pm

Quote:
xxx.TEST.VSAM.FILE2


are these VSAM or QSAM files?
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19243
Location: Inside the Matrix

PostPosted: Fri Sep 30, 2011 9:08 pm

Hello,

Just to see what happens with superc, i compared vb files with more than 8 million records:

Code:
                      LINE COMPARE SUMMARY AND STATISTICS         
                                                                   
                                                                   
                                                                   
 8243650 NUMBER OF LINE MATCHES               0  TOTAL CHANGES (PAIRED+NONPAIRED CHNG)
       0 REFORMATTED LINES                    0  PAIRED CHANGES (REFM+PAIRED INS/DEL)
       0 NEW FILE LINE INSERTIONS             0  NON-PAIRED INSERTS
       0 OLD FILE LINE DELETIONS              0  NON-PAIRED DELETES
 8243650 NEW FILE LINES PROCESSED                                   
 8243650 OLD FILE LINES PROCESSED                                   
                                                                   
LISTING-TYPE = DELTA      COMPARE-COLUMNS =    1:1065      LONGEST-LINE = 1065


And this took:
Code:
     12.82 MINUTES EXECUTION TIME
Priyanka Pyne

New User


Joined: 09 Feb 2008
Posts: 95
Location: India

PostPosted: Mon Oct 03, 2011 9:42 pm

Hi Dick,

Thanks for the SuperC result, but as I mentioned earlier I cannot use SuperC as I need the comparison result in file/copybook layout.
enrico-sorichetti

Superior Member


Joined: 14 Mar 2007
Posts: 10886
Location: italy

PostPosted: Mon Oct 03, 2011 9:43 pm

then You will have to bear the larger resource consumption and the longer elapsed time.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19243
Location: Inside the Matrix

PostPosted: Mon Oct 03, 2011 10:29 pm

Hello,

Quote:
but as I mentioned earlier I cannot use SuperC
You simply must start paying attention. I have NOT suggested you use superc for what you need. Please stop repeating this . . .

You need to understand why either of your processes takes so very long. Obviously, the problem is on your system.

Why does your compare take so long? You seem to be unwilling or unable to determine this. How long does it take to simply copy the 2 files (no comparison at all)?
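As a rough sketch of such a copy test (the data set name is a placeholder), a plain SORT COPY of each input shows the baseline I/O cost with no compare logic at all:

Code:
```
//COPYTEST EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DISP=SHR,DSN=file1.placeholder
//SORTOUT  DD DSN=&&SCRATCH,DISP=(,DELETE),UNIT=SYSDA,
//            SPACE=(CYL,(500,500),RLSE)
//SYSIN    DD *
  OPTION COPY
/*
```

If even this step takes many minutes, the problem lies with the data sets or the I/O configuration, not with the compare itself.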
Akatsukami

Global Moderator


Joined: 03 Oct 2009
Posts: 1787
Location: Bloomington, IL

PostPosted: Mon Oct 03, 2011 10:54 pm

Now, I must come to Priyanka's defense here. Back on September 29, she said:
Quote:
SuperC also taking almost same amount of time

She's also posted her JCL and control cards.

I suspect that there's some problem with the data set DCBs, but I don't have the data sets, so I can't be sure. I'm not a File Manager wizard; can anyone suggest a situation (data sets are VBS? VSAM? VTOL?) which neither File Mangler nor SuperC handle well?

Priyanka, what is the record format and block size of the data sets? What are the EXCP counts on the input file DDs?
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19243
Location: Inside the Matrix

PostPosted: Mon Oct 03, 2011 11:07 pm

Hi Akatsukami,

Quote:
Now, I must come to Priyanka's defense here.
I don't believe defending is necessary. . .

I do believe that pretty much all of the answers are on that system, to which we have no access. For us to help, Priyanka must be our ears, eyes, and hands.

Once again i wonder how/why these 2 runs take almost the same amount of time on that system. I only posted my stats to show that superc runs through far more data in considerably less time. . .

I also believe a keyed 2-file match (with the copybook used as the basis for showing mis-matches) would perform better.
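A keyed 2-file match of that kind could be sketched with DFSORT JOINKEYS. The key position/length is taken from the control cards posted earlier in this thread and the 2094 record length from the stated LRECL; the data set names are placeholders. This writes out only the records that do not pair up on the key:

Code:
```
//MATCH    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTJNF1 DD DISP=SHR,DSN=file1.placeholder
//SORTJNF2 DD DISP=SHR,DSN=file2.placeholder
//SORTOUT  DD SYSOUT=*
//SYSIN    DD *
  JOINKEYS FILE=F1,FIELDS=(1,9,A)
  JOINKEYS FILE=F2,FIELDS=(1,9,A)
  JOIN UNPAIRED,F1,F2,ONLY
  REFORMAT FIELDS=(F1:1,2094,F2:1,2094)
  OPTION COPY
/*
```

The unpaired records could then be formatted against the copybook in a follow-on step; the match itself costs little more than the two sorts.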
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Mon Oct 03, 2011 11:36 pm

If both the comparisons are slow (File Manager and SuperC), as has been stated, then there must be something "odd" about the datasets or some big conflict between the control cards and the datasets.

dbz queried the backward reference to the sort step. Can you run with the actual known DCB info as a test, not the backward reference?

If that makes no difference, can you strip the File Manager cards down to the basics and see how that runs? If that is different, add the cards back a little at a time so you can locate what did it.

As has been requested, list DCB info for both files, and show EXCP info from the messages output.

Also, as dbz asked, how is the copybook getting into the comparison?

This thread is becoming typical of your queries: long on length, short on... most things except length.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19243
Location: Inside the Matrix

PostPosted: Tue Oct 04, 2011 12:11 am

Hello,

The more i think about this, the more i'm convinced that the reason for the similarity in time used is that there are rather few mis-matched records.

Hopefully, the extra overhead is only incurred when there is a mis-match and the formatted output is generated.

Which leads me to further believe that this job/class/media/etc is at the bottom of the barrel and only gets minimal resources. . .