IBM Mainframe Forum Index
 
Syncsort - Sort performance issue


IBM Mainframe Forums -> JCL & VSAM
Rijit

Active User


Joined: 15 Apr 2010
Posts: 168
Location: Pune

Posted: Tue Oct 05, 2010 11:02 pm

I have a sort step that usually runs for about 3 hours of elapsed (clock) time, yet the TCB time I have observed for most past runs is close to 6 minutes. This is a production job. The step has one input KSDS (more than a million records) and the output is a tape file.

Can anyone advise why this step takes so long in every run? In my experience, sort is very fast and processes millions of records in minutes.

The sort card has SORT FIELDS=(1,5,CH,A,15,8,BI,A)
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Tue Oct 05, 2010 11:14 pm

You haven't given any real information to analyze.

If you want to pursue this, add the following to your job:

//SORTDIAG DD DUMMY

to get diagnostic messages. Then rerun the job and send the complete JES log to me offline (yaeger@us.ibm.com) and I'll have our DFSORT Performance Team Leader take a look at it.
Rijit

Active User


Joined: 15 Apr 2010
Posts: 168
Location: Pune

Posted: Tue Oct 05, 2010 11:24 pm

Frank, I am glad you came to my rescue!

I will send it to you shortly. Can you please let me know where to add //SORTDIAG DD DUMMY in my job? I have never used this statement before.
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Tue Oct 05, 2010 11:32 pm

You can add it anywhere in the DFSORT step. For example:

Code:

//S1 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTDIAG DD DUMMY
//SORTIN DD ...
...
Rijit

Active User


Joined: 15 Apr 2010
Posts: 168
Location: Pune

Posted: Tue Oct 05, 2010 11:41 pm

Thank you, Frank. Once I am back at my workstation I will send you the dump. Right now I don't have access to the mainframe.
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Wed Oct 06, 2010 12:46 am

Just to be clear: I don't want a "dump". I want the complete JES log.
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Wed Oct 06, 2010 10:39 pm

From the JES log output you sent me, it's obvious you are using Syncsort, not DFSORT.

I'm a DFSORT developer. DFSORT and Syncsort are competitive products. I'm happy to answer questions on DFSORT and DFSORT's ICETOOL, but I don't answer questions on Syncsort.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

Posted: Thu Oct 07, 2010 12:04 am

Hello,

Is this run at the same time that online or other batch processes are heavily using the VSAM file?

Experiments:
1. Run the sort but specify DD DUMMY for the output.
2. Run the same job, but merely copy the data rather than sort it.

Post the JCL used and the results of these 2 runs, especially the step termination info.
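
A minimal sketch of the two experiments, for illustration only. The step names, YOUR.INPUT.KSDS, YOUR.OUTPUT.TAPE and UNIT=TAPE are placeholders rather than values from the original job, and the SORT statement is the one quoted in the first post. Depending on the sort product and its defaults, the DUMMY output may also need DCB attributes or a RECORD statement, so treat this as a starting point.

```jcl
//* Experiment 1: same sort, but the output is discarded (no tape I/O).
//* YOUR.INPUT.KSDS is a placeholder data set name.
//EXP1     EXEC PGM=SORT
//SYSOUT   DD  SYSOUT=*
//SORTIN   DD  DSN=YOUR.INPUT.KSDS,DISP=SHR
//SORTOUT  DD  DUMMY
//SYSIN    DD  *
  SORT FIELDS=(1,5,CH,A,15,8,BI,A)
/*
//* Experiment 2: copy only (no sorting); output still goes to tape.
//* YOUR.OUTPUT.TAPE and UNIT=TAPE are placeholders.
//EXP2     EXEC PGM=SORT
//SYSOUT   DD  SYSOUT=*
//SORTIN   DD  DSN=YOUR.INPUT.KSDS,DISP=SHR
//SORTOUT  DD  DSN=YOUR.OUTPUT.TAPE,
//             DISP=(NEW,CATLG,DELETE),UNIT=TAPE
//SYSIN    DD  *
  SORT FIELDS=COPY
/*
```

If run 1 finishes quickly, the elapsed time is probably going into writing the output. If run 2 is about as slow as the original job, the sorting itself is not the bottleneck.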
Rijit

Active User


Joined: 15 Apr 2010
Posts: 168
Location: Pune

Posted: Wed Oct 13, 2010 6:12 pm

dick scherrer wrote:
Hello,

Is this run at the same time that online or other batch processes are heavily using the VSAM file?

Experiments:
1. Run the sort but specify DD DUMMY for the output.
2. Run the same job, but merely copy the data rather than sort it.

Post the JCL used and the results of these 2 runs, especially the step termination info.


Hi Dick,

Nice to hear from you. Right now I don't have access to the mainframe, but I will definitely try the two experiments you mentioned and share the results.

As for your question, I need to investigate a little. But one thing is certain: this job runs as part of the night batch and has nothing to do with the onlines. Anyway, I will check and confirm.

One thing I did try, Dick: I ran the job with the output file pointing to DASD instead of tape, as in the original job. It ran quite fast and finally abended with S237. I had specified DISP=(NEW,CATLG,CATLG) for the output file, so I was able to check it: the job had processed 50% of the records in very little time. But since the total number of records is so large, it blows up with a space abend whenever I try to run it. I am just guessing that there might be some issue with the tape.
Rijit

Active User


Joined: 15 Apr 2010
Posts: 168
Location: Pune

Posted: Wed Oct 13, 2010 6:19 pm

Here is the JCL for the step.

Code:
//STEP20   EXEC PGM=SORT
//*
//SYSIN    DD  *
  SORT FIELDS=(1,14,BI,A,21,10,CH,A)
  END
/*
//SORTIN   DD  DSN=ROTP.TYUT.EXTRACT.KSDS,
//             DISP=SHR
//*
//SORTOUT  DD  DSN=ROTP.INT1.Y5454.EXTRACT,
//             DISP=(NEW,CATLG,CATLG),
//             UNIT=TAPEV,
//             DCB=(LRECL=75,BLKSIZE=0,RECFM=FB),EXPDT=99000
//*
//SORTWK01 DD  UNIT=SYSDA,
//             SPACE=(CYL,(500,100))
//SORTWK02 DD  UNIT=SYSDA,
//             SPACE=(CYL,(500,100))
//SORTWK03 DD  UNIT=SYSDA,
//             SPACE=(CYL,(500,100))
//SORTWK04 DD  UNIT=SYSDA,
//             SPACE=(CYL,(500,100))
//*
//SYSLOG   DD  SYSOUT=*
//SYSOUT   DD  SYSOUT=*
//SORTDIAG DD  DUMMY
//SYSPRINT DD  SYSOUT=*
//SYSUDUMP DD  SYSOUT=*

To be frank, I have never seen a sort step run for 3 hours.


Edited: Please use BBCode when you post code or errors; it is much more readable. Thanks... Anuj
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

Posted: Wed Oct 13, 2010 7:14 pm

Quote:
finally it abended with S237
In addition to the S237, there would have been a return code (rc).

There would also have been an IEC023I message.

You need to look up what caused your particular abend.
Anuj Dhawan

Superior Member


Joined: 22 Apr 2006
Posts: 6250
Location: Mumbai, India

Posted: Thu Oct 14, 2010 5:02 pm

Rijit wrote:
To be frank, I have never seen a sort step run for 3 hours.
If you were "Frank", sort jobs wouldn't malfunction in front of you!
Anuj Dhawan

Superior Member


Joined: 22 Apr 2006
Posts: 6250
Location: Mumbai, India

Posted: Thu Oct 14, 2010 5:11 pm

Rijit wrote:
I ran the job with the output file pointing to DASD instead of tape, as in the original job. It ran quite fast and finally abended with S237. I had specified DISP=(NEW,CATLG,CATLG) for the output file, so I was able to check it: the job had processed 50% of the records in very little time. But since the total number of records is so large, it blows up with a space abend whenever I try to run it. I am just guessing that there might be some issue with the tape.
S237 is an end-of-volume abend. Did you run your "experiment job" in test?

Also, you need to tell us what "so huge" means to you. How many records, precisely? Look at the SYSOUT of the failed job; this information should be listed there.

And how are you so sure that "50%" of the records were written pretty fast?
Rijit

Active User


Joined: 15 Apr 2010
Posts: 168
Location: Pune

Posted: Tue Oct 19, 2010 10:49 pm

Hi Anuj,

Yes, I executed the experimental job in the test environment and it abended with S237.

The total number of records in the input KSDS is in the billions. I had coded the output file of this sort step with DISP=(NEW,CATLG,CATLG), so even though the step abended with S237 I still had the output file and could count the records. With the output on DASD instead of tape, it had processed almost 50% of the records in significantly less time. But this does not really help me, as I can't use DASD for this output file in production. I was just experimenting in the test region.

But I would like to know what factors can really hamper sort performance, as in my case, where the CPU time is about 6 minutes but the elapsed time is around 3 hours. This happens in every production run. If you have any suggestions, please let me know. Thanks!
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

Posted: Tue Oct 19, 2010 11:17 pm

Hello,

Suggest you run a test around 4 in the morning some Saturday or Sunday morning. At these times, it is likely that the machine will be least busy.

If it takes 3 hours to get 6 minutes of CPU time, your job is being impacted by other jobs or the tape system has "problems". What type of media is being used for the output tape? If it is 3480/3490, much of the time may be lost waiting for tape mounts. Does your system now have "fat" tapes (3590 or equivalent)? They are extremely fast and will hold a very high volume of data. Talk with your storage management people.
Anuj Dhawan

Superior Member


Joined: 22 Apr 2006
Posts: 6250
Location: Mumbai, India

Posted: Wed Oct 20, 2010 5:58 pm

Well, with the information you've posted, all I can do is speculate about what is happening at your end.

1. You talk about "huge", so you might like to use VSCORET=16M on the EXEC statement. It sets the maximum amount of virtual storage, below and above the 16-megabyte line, that SyncSort can use for its working set when SyncSort’s Dynamic Storage Management (DSM) facility is inactive. But this is site-specific:
    a. Is DSM active at your shop or not?
    b. The value of VSCORET is usually set up by the product representatives, so you need to know how good the 16M value is for the step in question.
Both questions can be answered by your sort-product representative.

2. When you look at the expanded JCL in the JESJCL DD, what does it tell you about the sort-work files?

3. What is the default for DYNALLOC?

4. Check what the default for RETRY is set to under DYNALLOC.
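
Point 1 might be coded as in the sketch below. It assumes VSCORET is passed as a PARM option on the EXEC statement, as the suggestion above implies, and it reuses the step name from the JCL posted earlier. Whether 16M is a sensible value, and whether the option has any effect at your site, is for your SyncSort representative to confirm.

```jcl
//* Assumption: VSCORET is passed through PARM on the EXEC statement.
//* 16M is the value mentioned above, not a tuned recommendation.
//STEP20   EXEC PGM=SORT,PARM='VSCORET=16M'
```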
All times are GMT + 6 Hours