IBM Mainframe Forum Index
Improve performance of file comparison


IBM Mainframe Forums -> JCL & VSAM
Priyanka Pyne

New User


Joined: 09 Feb 2008
Posts: 95
Location: India

PostPosted: Thu Sep 29, 2011 9:27 pm

Hi,

I have created a tool using REXX and JCL which compares two mainframe files in copybook layout. I am using File Manager to compare the files, but it is taking a long time to produce the output. Is there any way to expedite the comparison?
enrico-sorichetti

Superior Member


Joined: 14 Mar 2007
Posts: 10886
Location: italy

PostPosted: Thu Sep 29, 2011 9:34 pm

Quote:
But it is taking long time to give the output result.

BIBSD on... define the issue in measurable terms.
"Long time"/"short time" and "good performance"/"bad performance" are pretty useless terms of comparison,
and usually depend on which side the opponents stand.

But... I really do not see how we could advise about it, given the scarce info You provided.
The fact that You used REXX to build the JCL is irrelevant.

It is a good habit to <debug> by steps:
what happened when You submitted the same compare with hand-crafted JCL?
Priyanka Pyne

New User


Joined: 09 Feb 2008
Posts: 95
Location: India

PostPosted: Thu Sep 29, 2011 9:51 pm

I tried with about 50k records and it is taking 20 minutes to complete the comparison.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19243
Location: Inside the Matrix

PostPosted: Thu Sep 29, 2011 11:47 pm

Hello,

What happens when you compare the files with SUPERC?
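For anyone wanting to try this, a stand-alone batch SuperC compare can be sketched roughly as follows. The data set names are placeholders, and depending on how the site is set up, a STEPLIB for the ISPF load library may also be needed:

Code:
```
//SUPERC   EXEC PGM=ISRSUPC,
//         PARM=(DELTAL,LINECMP)
//NEWDD    DD DISP=SHR,DSN=your.new.file
//OLDDD    DD DISP=SHR,DSN=your.old.file
//OUTDD    DD SYSOUT=*
```

The CPU time and EXCP counts from this step give a baseline to compare against the File Manager run.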
Priyanka Pyne

New User


Joined: 09 Feb 2008
Posts: 95
Location: India

PostPosted: Fri Sep 30, 2011 1:13 am

SuperC also takes almost the same amount of time, and moreover it does not provide the output in copybook layout; my requirement is to have the result in copybook layout.
Akatsukami

Global Moderator


Joined: 03 Oct 2009
Posts: 1787
Location: Bloomington, IL

PostPosted: Fri Sep 30, 2011 1:36 am

Priyanka Pyne wrote:
I tried with about 50k records and it is taking 20 minutes to complete the comparison.

What are the LRECLs of your data sets? What does your JCL look like?
anatol

Active User


Joined: 20 May 2010
Posts: 121
Location: canada

PostPosted: Fri Sep 30, 2011 1:37 am

I found SORT is better, when you split into no-dups and dups: the no-dups are the records that do not match; the dups are the same in both files.
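What anatol describes can be sketched with ICETOOL's SELECT operator: concatenate both (sorted) files and split records whose key appears in both inputs from records whose key appears only once. Everything here is a placeholder except the key position/length, which matches the 1,9 CHAR key from the control cards posted later in this thread:

Code:
```
//SPLIT    EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//IN       DD DISP=SHR,DSN=file1.placeholder
//         DD DISP=SHR,DSN=file2.placeholder
//DUPS     DD DSN=dups.placeholder,DISP=(,CATLG)
//NODUPS   DD DSN=nodups.placeholder,DISP=(,CATLG)
//TOOLIN   DD *
  SELECT FROM(IN) TO(DUPS) ON(1,9,CH) ALLDUPS DISCARD(NODUPS)
/*
```

Note this only tells you which keys are matched or unmatched; it does not show field-level differences the way a formatted File Manager compare does.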
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19243
Location: Inside the Matrix

PostPosted: Fri Sep 30, 2011 1:59 am

Hello,

With only 50k records, the superc takes nearly 20 minutes also ?

One guess is that there is something horribly wrong with the definition of the file(s). Are both files in sequence before the compare is run?

I periodically compare several million records of 12k record length and this only takes a few minutes - less than 10 even on a bad day.
Priyanka Pyne

New User


Joined: 09 Feb 2008
Posts: 95
Location: India

PostPosted: Fri Sep 30, 2011 2:01 am

LRECL is 2094.

JCL looks like
Code:
//FILEMGR  EXEC PGM=FMNMAIN,TIME=MAXIMUM                           
//STEPLIB  DD DISP=SHR,DSN=SYS1.FILEMNGR.SFMNMOD1                 
//         DD DISP=SHR,DSN=SYSC090.COBOLZOS.PROD.SIGYCOMP         
//*FMNCOB  DD DUMMY     Uncomment to force use of FM COBOL Compiler
//*SYSPRINT DD SYSOUT=*                                           
//*.SYSPRINT.ORD.USERNME.TEST.SYSPRINT                           
//SYSPRINT DD DSN=ORD.USERNME.TEST.SYSPRINT,                     
//         SPACE=(CYL,(900,900),RLSE),                             
//         DISP=(,CATLG,DELETE),                                   
//         DCB=*.SORT1.SORTIN                                     
//FMNTSPRT DD SYSOUT=*                                             
//SYSTERM  DD SYSOUT=*                                             
//SYSIN    DD *                                                   
$$FILEM DSCMP TYPE=FORMATTED, 
$$FILEM DSNOLD=FILE1,
$$FILEM DSNNEW=FILE2
/*                                             
//*.FMINSOUT.ORD.&USERNME.TEST.INSERTED.T     
//FMINSOUT DD DSN=ORD.USERNME.TEST.INSERTED.T,
//         SPACE=(CYL,(900,900),RLSE),         
//         DISP=(,CATLG,DELETE),               
//         DCB=*.SORT1.SORTIN                   


Before comparing, I am sorting both files.

I have copied only the comparison step here.
Priyanka Pyne

New User


Joined: 09 Feb 2008
Posts: 95
Location: India

PostPosted: Fri Sep 30, 2011 2:03 am

Hi Dick,

Are you referring to SuperC?

I cannot use SuperC as I need the output result in copybook layout.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19243
Location: Inside the Matrix

PostPosted: Fri Sep 30, 2011 2:09 am

Hello,

Yes, i understand that you don't want the superc format of the data. I was trying to understand why the superc took about the same amount of time as the filemanager run (unless i have misunderstood this).

I would expect the filemanager run to use significantly more time than superc but if they use a similar amount it may help to learn why.

All of the field level formatting takes a lot of time, but i still have no idea why 50k records takes 20 minutes. How much cpu time does each run use? EXCPs?
Akatsukami

Global Moderator


Joined: 03 Oct 2009
Posts: 1787
Location: Bloomington, IL

PostPosted: Fri Sep 30, 2011 2:12 am

Priyanka Pyne wrote:
Hi Dick,

Are you referring to SuperC?

I cannot use SuperC as I need the output result in copybook layout.

But a test run using SuperC will provide information useful for diagnosis. If a SuperC compare of the data also takes 20 minutes, there is a problem with the data or data set. If it only takes 20 milliseconds, then there is a problem with the File Mangler control cards.
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

PostPosted: Fri Sep 30, 2011 2:31 am

What is the block size for FILE1 and FILE2? Which, by the way, you don't bother to provide.

Why are you forcing DCB parms on output files to be the same as some other existing file's (DCB=*.SORT1.SORTIN),
which, since it is not a SORT output, is probably not optimized?

and by the way, what do you mean by:
Quote:
compare two mainframe files in copybook layout


and if there is a copybook format involved, where is the reference?
Priyanka Pyne

New User


Joined: 09 Feb 2008
Posts: 95
Location: India

PostPosted: Fri Sep 30, 2011 3:19 am

The reason is that I am using this JCL as the backend of a REXX tool, where one can provide any type of input file; hence the length of the output file cannot be predetermined, so I am copying the DCB of the input file.

By "copybook layout" I meant the record layout of the file.

I am copying the control cards which I have used for this purpose.

Code:
$$FILEM DSCMP TYPE=FORMATTED,             
$$FILEM PACK=UNPACK,                     
$$FILEM SYNCH=KEYED,                     
$$FILEM KEYLOCOLD=1,                     
$$FILEM KEYLOCNEW=1,                     
$$FILEM KEYLEN=9,                         
$$FILEM KEYTYPE=CHAR,                     
$$FILEM LIST=LONG,                       
$$FILEM WIDE=YES,                         
$$FILEM HILIGHT=YES,                     
$$FILEM CHNGDFLD=YES,                     
$$FILEM IGNLEN=YES,                       
$$FILEM EXCLUDE=(,,MATCHED,),             
$$FILEM NUMDIFF=ALL,                     
$$FILEM DSNOLD=xxx.TEST.VSAM.FILE1,   
$$FILEM TCOLD=SYS2.xxxx.COPYLIB(zzzz),
$$FILEM LANG=COBOL,                       
$$FILEM SKIPOLD=0,                       
$$FILEM CMPOLD=ALL,                       
$$FILEM TCNEW=SYS2.xxxx.COPYLIB(zzzz),
$$FILEM SKIPNEW=0,                       
$$FILEM CMPNEW=ALL,                       
$$FILEM DSNNEW=xxx.TEST.VSAM.FILE2   


Previously I had provided wrong information about the record count; I am really sorry for that. The files have about 686,721 records. Because of that I was not able to open them in View mode and did not get the exact count.
prino

Senior Member


Joined: 07 Feb 2009
Posts: 1314
Location: Vilnius, Lithuania

PostPosted: Fri Sep 30, 2011 1:56 pm

Priyanka Pyne wrote:
... The files have about 686,721 records. Because of that I was not able to open them in View mode and did not get the exact count.

And obviously you've never heard of Browse, which can open files with zillions of records (if you have sithloads of time ;) )
Nic Clouston

Global Moderator


Joined: 10 May 2007
Posts: 2454
Location: Hampshire, UK

PostPosted: Fri Sep 30, 2011 2:00 pm

Or read the sort counts from the sort step?
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

PostPosted: Fri Sep 30, 2011 2:15 pm

Quote:
xxx.TEST.VSAM.FILE2


are these VSAM or QSAM files?
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19243
Location: Inside the Matrix

PostPosted: Fri Sep 30, 2011 9:08 pm

Hello,

Just to see what happens with superc, i compared vb files with more than 8 million records:

Code:
                      LINE COMPARE SUMMARY AND STATISTICS         
                                                                   
                                                                   
                                                                   
 8243650 NUMBER OF LINE MATCHES               0  TOTAL CHANGES (PAIRED+NONPAIRED CHNG)
       0 REFORMATTED LINES                    0  PAIRED CHANGES (REFM+PAIRED INS/DEL)
       0 NEW FILE LINE INSERTIONS             0  NON-PAIRED INSERTS
       0 OLD FILE LINE DELETIONS              0  NON-PAIRED DELETES
 8243650 NEW FILE LINES PROCESSED                                   
 8243650 OLD FILE LINES PROCESSED                                   
                                                                   
LISTING-TYPE = DELTA      COMPARE-COLUMNS =    1:1065      LONGEST-LINE = 1065


And this took:
Code:
     12.82 MINUTES EXECUTION TIME
Priyanka Pyne

New User


Joined: 09 Feb 2008
Posts: 95
Location: India

PostPosted: Mon Oct 03, 2011 9:42 pm

Hi Dick,

Thanks for the SuperC result, but as I mentioned earlier I cannot use SuperC as I need the comparison result in file/copybook layout.
enrico-sorichetti

Superior Member


Joined: 14 Mar 2007
Posts: 10886
Location: italy

PostPosted: Mon Oct 03, 2011 9:43 pm

then You will have to bear the larger resource consumption and the longer elapsed time.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19243
Location: Inside the Matrix

PostPosted: Mon Oct 03, 2011 10:29 pm

Hello,

Quote:
but as I mentioned earlier I cannot use SuperC
You simply must start paying attention. I have NOT suggested you use superc for what you need. Please stop repeating this . . .

You need to understand why either of your processes takes so very long. Obviously, the problem is on your system.

Why does your compare take so long? You seem to be unwilling or unable to determine this. How long does it take to simply copy the 2 files (no comparison at all)?
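As a rough sketch of such a copy test (the data set name is a placeholder), a plain SORT COPY of each input shows the baseline I/O cost with no compare logic at all:

Code:
```
//COPYTEST EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DISP=SHR,DSN=file1.placeholder
//SORTOUT  DD DSN=&&SCRATCH,DISP=(,DELETE),UNIT=SYSDA,
//            SPACE=(CYL,(500,500),RLSE)
//SYSIN    DD *
  OPTION COPY
/*
```

If even this step takes many minutes, the problem lies with the data sets or the I/O configuration, not with the compare itself.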
Akatsukami

Global Moderator


Joined: 03 Oct 2009
Posts: 1787
Location: Bloomington, IL

PostPosted: Mon Oct 03, 2011 10:54 pm

Now, I must come to Priyanka's defense here. Back on September 29, she said:
Quote:
SuperC also taking almost same amount of time

She's also posted her JCL and control cards.

I suspect that there's some problem with the data set DCBs, but I don't have the data sets, so I can't be sure. I'm not a File Manager wizard; can anyone suggest a situation (data sets are VBS? VSAM? VTOL?) which neither File Mangler nor SuperC handle well?

Priyanka, what is the record format and block size of the data sets? What are the EXCP counts on the input file DDs?
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19243
Location: Inside the Matrix

PostPosted: Mon Oct 03, 2011 11:07 pm

Hi Akatsukami,

Quote:
Now, I must come to Priyanka's defense here.
I don't believe defending is necessary. . .

I do believe that pretty much all of the answers are on that system, to which we have no access. For us to help, Priyanka must be our ears, eyes, and hands.

Once again i wonder how/why these 2 runs take almost the same amount of time on that system. I only posted my stats to show that superc runs through far more data in considerably less time. . .

I also believe a keyed 2-file match (with the copybook used as the basis for showing mis-matches) would perform better.
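A keyed 2-file match of that kind could be sketched with DFSORT JOINKEYS. The key position/length is taken from the control cards posted earlier in this thread and the 2094 record length from the stated LRECL; the data set names are placeholders. This writes out only the records that do not pair up on the key:

Code:
```
//MATCH    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTJNF1 DD DISP=SHR,DSN=file1.placeholder
//SORTJNF2 DD DISP=SHR,DSN=file2.placeholder
//SORTOUT  DD SYSOUT=*
//SYSIN    DD *
  JOINKEYS FILE=F1,FIELDS=(1,9,A)
  JOINKEYS FILE=F2,FIELDS=(1,9,A)
  JOIN UNPAIRED,F1,F2,ONLY
  REFORMAT FIELDS=(F1:1,2094,F2:1,2094)
  OPTION COPY
/*
```

The unpaired records could then be formatted against the copybook in a follow-on step; the match itself costs little more than the two sorts.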
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Mon Oct 03, 2011 11:36 pm

If both the comparisons are slow (File Manager and SuperC), as has been stated, then there must be something "odd" about the datasets or some big conflict between the control cards and the datasets.

dbz queried the backward reference to the sort step. Can you run with the actual known DCB info as a test, not the backward reference?

If that makes no difference, can you strip the File Manager cards down to the basics and see how that runs? If that is different, add the cards back a little at a time so you can locate what did it.

As has been requested, list DCB info for both files, and show EXCP info from the messages output.

Also, as dbz asked, how is the copybook getting into the comparison?

This thread is becoming typical of your queries: long on length, short on... most things except length.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19243
Location: Inside the Matrix

PostPosted: Tue Oct 04, 2011 12:11 am

Hello,

The more i think about this, the more i'm convinced that the reason for the similarity in time used is that there are rather few mis-matched records.

Hopefully, the extra overhead is only incurred when there is a mis-match and the formatted output is generated.

Which leads me to further believe that this job/class/media/etc is at the bottom of the barrel and only gets minimal resources. . .