Clarification on sort performance


Deepa.m

New User


Joined: 28 Apr 2005
Posts: 99

Posted: Wed Aug 18, 2010 4:31 pm

Hi,
We need to process a production file that has ~95 million records with LRECL=365. Processing includes omitting a few records (about 92 million records are expected to remain after the omit), splitting the remaining records into 2 files based on a split condition, and reformatting them to an output LRECL of 61 bytes.

We wrote a sort card to accomplish this task, but it failed after processing ~60 million records with a message saying that sort capacity was exceeded. We had already included 32 work datasets; we added 10 more and the job then ran successfully, but not consistently.

We then tried using the sort for the omit condition and the reformatting, followed by a COBOL program to split the records and write the 2 output files. This time the job consistently ran fine, and CPU time and I/O service units dropped noticeably.

After some research we found that our initial sort card gave better results with a small file, and the COBOL program gave better results with a large file.

Is our inference correct? Is a COBOL program always better than the sort utility for reformatting and split logic?

Thanks,
Deepa.
sqlcode1

Active Member


Joined: 08 Apr 2010
Posts: 577
Location: USA

Posted: Wed Aug 18, 2010 4:48 pm

Deepa,
Please post your entire sysout for the abended sort step.

Thanks,
Ronald Burr

Active User


Joined: 22 Oct 2009
Posts: 293
Location: U.S.A.

Posted: Wed Aug 18, 2010 5:58 pm

At a very minimum, post the SORTIN statements for both processes.
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

Posted: Wed Aug 18, 2010 6:08 pm

Quote:
Is our inference correct? Is a COBOL program always better than the sort utility for reformatting and split logic?


Don't think so.
sqlcode1

Active Member


Joined: 08 Apr 2010
Posts: 577
Location: USA

Posted: Wed Aug 18, 2010 7:42 pm

Ronald Burr,
Quote:
At a very minimum, post the SORTIN statements for both processes.


Did you mean SYSIN? Her SORTIN would be 90 million records. :)


Thanks,
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

Posted: Wed Aug 18, 2010 7:57 pm

probably meant the sortwk statements or DD statements.

the jcl used in the step is suspect.
without seeing all of it, to include the control cards
it will be difficult to provide any sort of help.

if the work can be done with sort control cards,
then the reason the cobol runs faster
is that the control cards for the sort step (or the sortwk datasets, or whatever)
are not properly set up.

I can't see a cobol sort running faster than a DFSORT step.

sysout messages would also help, for both the DFSORT and the COBOL Sort.

there is just not enough information being provided.
observations are of no help.
sqlcode1

Active Member


Joined: 08 Apr 2010
Posts: 577
Location: USA

Posted: Wed Aug 18, 2010 8:01 pm

dbzTHEdinosauer,
I knew he meant something else. I was just trying to add a little humour to the post. I really hope he doesn't take offense, because I can't even delete it now :(

Thanks,
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

Posted: Wed Aug 18, 2010 8:12 pm

sqlcode1,

ok, I jumped too fast and made an assumption.

having followed your posts for several months,
I should have realized your intent.

But I elaborated for the benefit of the TS,
who I believe is making incorrect assumptions based on not having all the facts.
But, he has asked in the proper forum,
and if Frank or Kolusu has the patience to wade thru all the irrelevant garbage that we have left, maybe he will provide an answer to the TS and us.

I think Ronald is a little more rational than I and will not take offense.
Ronald Burr

Active User


Joined: 22 Oct 2009
Posts: 293
Location: U.S.A.

Posted: Wed Aug 18, 2010 8:55 pm

Sorry, I posted before having my second cup of Java (coffee, that is, not code). I am chagrined, but I'm not offended.

I'm "guessing" from what I read that in the first scenario, a SORT was being performed as well as the OMITs, reformatting, and splitting - hence the SORT CAPACITY EXCEEDED - whereas in the second scenario no SORT was being performed, just a COPY with OMITs and reformatting.

Furthermore, I'm "guessing" that the reformatting in either scenario is being done during OUTREC processing, not INREC processing - thus forcing the SORT (in scenario one) to carry the entire input record thru the sort process, rather than the (smaller, 61-byte) reformatted record.

But, I would need to see the SYSIN ( not SORTIN ) records from both runs in order to know whether I am "guessing" correctly.
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Wed Aug 18, 2010 10:43 pm

Deepa,

It's impossible to help you based on the little bit of information you've given (and you certainly seem to be jumping the gun on your "conclusion").

Add the following to your sort job:

//SORTDIAG DD DUMMY

to show the diagnostic messages. Then rerun it and post the complete JES log (or e-mail it to me directly - yaeger@us.ibm.com) and I'll take a look.
Deepa.m

New User


Joined: 28 Apr 2005
Posts: 99

Posted: Thu Aug 19, 2010 3:12 pm

We have been racking our brains for the past 2 days to find an effective solution, and I was in a hurry while composing this query. Here are the details.

Sort card used

Code:
SORT FIELDS=(5,4,BI,A)                                           
 INCLUDE COND=((1,4,BI,GT,X'00010000'),AND,                       
               (47,1,CH,EQ,C'R',OR,47,1,CH,EQ,C'C'),AND,           
               (152,3,CH,NE,C'TRN',OR,152,3,CH,NE,C'STL',OR,       
               152,3,CH,NE,C'INS',OR,152,3,CH,NE,C'CRS',OR,       
               152,3,CH,NE,C'BMC'))                               
 OUTFIL FNAMES=RET1,INCLUDE=(152,3,CH,EQ,C'BCD',OR,               
    (47,1,CH,EQ,C'R',AND,53,1,CH,EQ,C'O')),                       
 BUILD=(1,4,5,4,9,4,13,18,31,9,47,1,                               
                 53,1,60,10,90,1,152,3,197,6)                     
 OUTFIL FNAMES=COM1,INCLUDE=(152,3,CH,EQ,C'BCD',OR,               
    (47,1,CH,EQ,C'C',AND,53,1,CH,EQ,C'O')),                       
 BUILD=(1,4,5,4,9,4,13,18,31,9,47,1,                               
                53,1,60,10,90,1,152,3,197,6)                     



result:

RECORDS - IN: 93183341, OUT: 93166300
RET1 : DELETED = 17628761, REPORT = 0, DATA = 75537539
RET1 : TOTAL IN = 93166300, TOTAL OUT = 75537539
COM1 : DELETED = 41695313, REPORT = 0, DATA = 51470987
COM1 : TOTAL IN = 93166300, TOTAL OUT = 51470987

JOB START            JOB END              JOB ELAPSED TIME
08/17/10 02:03:54    08/17/10 02:40:53    00:36:59


CPU SERVICE UNITS ::::64,427,861
I/O SERVICE UNITS :::::2,720,547
ALL SERVICE UNITS ::::70,026,527



then we split the sort into 2 steps


step1 :

 SORT FIELDS=(5,4,BI,A)
 INCLUDE COND=((1,4,BI,GT,X'00010000'),AND,
               (47,1,CH,EQ,C'R',OR,47,1,CH,EQ,C'C'),AND,
               (152,3,CH,NE,C'TRN',OR,152,3,CH,NE,C'STL',OR,
               152,3,CH,NE,C'INS',OR,152,3,CH,NE,C'CRS',OR,
               152,3,CH,NE,C'BMC'))
 OUTREC FIELDS=(1,4,5,4,9,4,13,18,31,9,47,1,
                53,1,60,10,90,1,152,3,197,6)

JOB START            JOB END              JOB ELAPSED TIME
08/17/10 01:59:17    08/17/10 02:11:21    00:12:04


CPU SERVICE UNITS ::::34,863,545
I/O SERVICE UNITS :::::::110,255
ALL SERVICE UNITS ::::36,649,199


Step 2: COBOL program for Split into 2 files

CPU SERVICE UNITS :::::3,797,837
I/O SERVICE UNITS :::::2,405,072
ALL SERVICE UNITS :::::6,857,256

jcl

//step1 EXEC PGM=SORT,
// PARM='RC16=ABE'
//SORTIN DD DSN=aaa.inp1,
// DISP=(SHR,KEEP,KEEP)
//RET1 DD DSN=aaa.ret,
// DISP=(NEW,CATLG,DELETE),VOL=(,,,255),
// UNIT=AUTO,
// DCB=(RECFM=FB,LRECL=61,BLKSIZE=27938)
//COM1 DD DSN=aaa.com,
// DISP=(NEW,CATLG,DELETE),VOL=(,,,255),
// UNIT=AUTO,
// DCB=(RECFM=FB,LRECL=61,BLKSIZE=27938)
//SORTWK01 DD DATACLAS=SORTL
//SORTWK02 DD DATACLAS=SORTL
//SORTWK03 DD DATACLAS=SORTL
//SORTWK04 DD DATACLAS=SORTL
//SORTWK05 DD DATACLAS=SORTL
//SORTWK06 DD DATACLAS=SORTL
//SORTWK07 DD DATACLAS=SORTL
//SORTWK08 DD DATACLAS=SORTL
//SORTWK09 DD DATACLAS=SORTL
//SORTWK10 DD DATACLAS=SORTL
//SORTWK11 DD DATACLAS=SORTL
//SORTWK12 DD DATACLAS=SORTL
//SORTWK13 DD DATACLAS=SORTL
//SORTWK14 DD DATACLAS=SORTL
//SORTWK15 DD DATACLAS=SORTL
//SORTWK16 DD DATACLAS=SORTL
//SORTWK17 DD DATACLAS=SORTL
//SORTWK18 DD DATACLAS=SORTL
//SORTWK19 DD DATACLAS=SORTL
//SORTWK20 DD DATACLAS=SORTL
//SORTWK21 DD DATACLAS=SORTL
//SORTWK22 DD DATACLAS=SORTL
//SORTWK23 DD DATACLAS=SORTL
//SORTWK24 DD DATACLAS=SORTL
//SORTWK25 DD DATACLAS=SORTL
//SORTWK26 DD DATACLAS=SORTL
//SORTWK27 DD DATACLAS=SORTL
//SORTWK28 DD DATACLAS=SORTL
//SORTWK29 DD DATACLAS=SORTL
//SORTWK30 DD DATACLAS=SORTL
//SORTWK31 DD DATACLAS=SORTL
//SORTWK32 DD DATACLAS=SORTL
//SORTWK33 DD DATACLAS=SORTL
//SYSIN DD DSN=aaa.bbb(abc),DISP=SHR
//SYSOUT DD SYSOUT=*
//SYSPRINT DD SYSOUT=*


Let me know if you have any questions.

Frank,

I am recalling the input file and will provide the JES log once that is done.
Deepa.m

New User


Joined: 28 Apr 2005
Posts: 99

Posted: Thu Aug 19, 2010 3:14 pm

Frank,

I am recalling the input file and will provide the JES log once that is done. I saved only a few details from the last run, which I have furnished above.
Ronald Burr

Active User


Joined: 22 Oct 2009
Posts: 293
Location: U.S.A.

Posted: Thu Aug 19, 2010 7:16 pm

It is as I suspected: reformatting is occurring during outrec rather than inrec processing.

But there is also a glaring error in your INCLUDE logic: no matter WHAT the value is in position 152 the AND condition will be satisfied because all of the tests are for NE and connected by ORs. Logically, if the value is 'TRN' then it is NE 'STL'; if it is 'STL' then it is NE 'TRN'. So no matter what value is there, the AND condition is true. In the code posted below, I have taken the liberty to change the connectors to ANDs rather than ORs.
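To illustrate the point, here is a cut-down sketch using only two of the five codes. The two statements are alternatives shown side by side for comparison, not something to code together in one SYSIN, and the comment lines are only annotations:

```
* Always true: 'TRN' is NE 'STL', 'STL' is NE 'TRN', and any other
* value is NE both, so this test never rejects a record.
  INCLUDE COND=(152,3,CH,NE,C'TRN',OR,152,3,CH,NE,C'STL')
* Intended: keep a record only if it is neither 'TRN' nor 'STL'.
  INCLUDE COND=(152,3,CH,NE,C'TRN',AND,152,3,CH,NE,C'STL')
```

The same reasoning extends to all five codes: a "none of these values" test needs every NE comparison to hold at once, which means AND.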

Besides that, since you are also applying INCLUDE logic during OUTFIL processing, it makes no sense to select MORE records during the pre-sort INCLUDE than will be passed by the OUTFIL INCLUDEs. The pre-sort INCLUDE should select just the records that will be passed during OUTFIL processing.

Then again, the way your OUTFIL INCLUDEs are coded, it would appear that 'BCD' records will be written to BOTH output files. Is that what you wanted? I did NOT change that logic.

Frank or Kolusu may come up with a better solution, but I would suggest:
Code:

 INCLUDE COND=((1,4,BI,GT,X'00010000'),AND,
               (152,3,CH,EQ,C'BCD',OR,
                (53,1,CH,EQ,C'O',AND,
                 (47,1,CH,EQ,C'R',OR,47,1,CH,EQ,C'C'),AND,
                 (152,3,CH,NE,C'TRN',AND,152,3,CH,NE,C'STL',AND,
                  152,3,CH,NE,C'INS',AND,152,3,CH,NE,C'CRS',AND,
                  152,3,CH,NE,C'BMC'))))
 INREC FIELDS=(1,39,47,1,53,1,60,10,90,1,152,3,197,6)
 SORT FIELDS=(5,4,BI,A)
 OUTFIL FNAMES=RET1,
   INCLUDE=(53,3,CH,EQ,C'BCD',OR,40,2,CH,EQ,C'RO'),
   BUILD=(1,61)
 OUTFIL FNAMES=COM1,
   INCLUDE=(53,3,CH,EQ,C'BCD',OR,40,2,CH,EQ,C'CO'),
   BUILD=(1,61)


By changing the INCLUDE logic slightly, you will INCLUDE ONLY those records that will also be INCLUDEd during OUTFIL processing, and will select 'BCD' records with fewer tests.

By doing the reformat during INREC processing, you will considerably reduce the amount of data being handled by the sort process.

By combining fields (i.e. coding 1,39 instead of 1,4,5,4,9,4,13,18,31,9) you will reduce the number of data moves required during the reformatting process (unless the DFSORT developers put in code to test for, and consolidate, same).

Obviously, since reformatting is taking place during INREC, the INCLUDE offsets during OUTFIL processing have to be changed to reflect their new locations.

Since, during reformatting, we moved positions 47 and 53 next to each other, a single 2-byte test can be used during OUTFIL INCLUDE processing rather than two 1-byte tests connected by AND logic.
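As a sketch of the position arithmetic behind those new offsets, here is the INREC statement annotated with where each input field lands in the 61-byte record. The comment lines are illustrative annotations only:

```
* Input position  Length  Position after INREC
*   1-39            39      1-39   (sort key 5-8 unchanged)
*   47               1      40
*   53               1      41
*   60-69           10      42-51
*   90               1      52
*   152-154          3      53-55
*   197-202          6      56-61
  INREC FIELDS=(1,39,47,1,53,1,60,10,90,1,152,3,197,6)
```

So the old 152,3 test becomes 53,3, and the bytes that used to be at 47 and 53 now sit side by side at 40-41, which is what allows a test such as 40,2,CH,EQ,C'RO'.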

By reformatting during INREC processing, the BUILD during OUTFIL processing is simplified to output the entire (reformatted) record.

Hope this helps.
Deepa.m

New User


Joined: 28 Apr 2005
Posts: 99

Posted: Fri Aug 20, 2010 4:58 pm

Ronald,

Thanks for the detailed analysis. We were so dumb to miss that error in the negative (NE) check.


"By doing the reformat during INREC processing, you will considerably reduce the amount of data being handled by the sort process." - This was a valuable suggestion; it reduced the CPU time by 5 min.

We also noticed that the fewer conditions there are in the INCLUDE, the lower the CPU time, so we removed all duplicate checks and optimised our sort card as shown below. This is not a conclusion, just our observation.


 INCLUDE COND=(1,4,BI,GT,X'00010000',AND,
               152,3,CH,NE,C'TRN',AND,152,3,CH,NE,C'STL',AND,
               152,3,CH,NE,C'INS',AND,152,3,CH,NE,C'CRS',AND,
               152,3,CH,NE,C'BMC')
 INREC FIELDS=(1,39,47,1,53,1,60,10,90,1,152,3,197,6)
 SORT FIELDS=(5,4,BI,A)
 OUTFIL FNAMES=RET1,
   INCLUDE=(53,3,CH,EQ,C'BCD',OR,40,2,CH,EQ,C'RO'),
   BUILD=(1,61)
 OUTFIL FNAMES=COM1,
   INCLUDE=(53,3,CH,EQ,C'BCD',OR,40,2,CH,EQ,C'CO'),
   BUILD=(1,61)




Overall, your suggestions were valuable and made a great difference.
Thank you.
Ronald Burr

Active User


Joined: 22 Oct 2009
Posts: 293
Location: U.S.A.

Posted: Fri Aug 20, 2010 5:58 pm

I'm glad that my response was helpful.
However, I am still curious as to whether you really intend to write ALL of the 'BCD' records to BOTH output files regardless of whether they have an 'R' or a 'C' in position 47 of the input file.
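If the answer is no, one possible variation is sketched below. It assumes (this is my guess at the requirement, not something stated in this thread) that a 'BCD' record should go only to the file matching its 'R' or 'C' flag. Positions refer to the 61-byte record after INREC, where the flag from input position 47 sits at position 40.

```
  OUTFIL FNAMES=RET1,
    INCLUDE=(40,2,CH,EQ,C'RO',OR,
             (53,3,CH,EQ,C'BCD',AND,40,1,CH,EQ,C'R')),
    BUILD=(1,61)
  OUTFIL FNAMES=COM1,
    INCLUDE=(40,2,CH,EQ,C'CO',OR,
             (53,3,CH,EQ,C'BCD',AND,40,1,CH,EQ,C'C')),
    BUILD=(1,61)
```

With this form, a 'BCD' record whose flag is neither 'R' nor 'C' would reach neither file, so it only fits if that is acceptable for your data.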
All times are GMT + 6 Hours