|
Author |
Message |
Rijit
Active User
Joined: 15 Apr 2010 Posts: 168 Location: Pune
|
|
|
|
I have a sort step that usually runs for about 3 hours of clock time, yet the TCB time for that step has been close to 6 minutes in most past runs. This is a production job. The sort step has one input KSDS (more than a million records) and the output is a tape file.
Can anyone advise why this step takes so long to execute in every run? In my experience sort is very fast and processes millions of records in minutes.
The sort card has SORT FIELDS=(1,5,CH,A,15,8,BI,A) |
|
|
|
Frank Yaeger
DFSORT Developer
Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
|
|
|
|
You haven't given any real information to analyze.
If you want to pursue this, add the following to your job:
//SORTDIAG DD DUMMY
to get diagnostic messages. Then rerun the job and send the complete JES log to me offline (yaeger@us.ibm.com) and I'll have our DFSORT Performance Team Leader take a look at it. |
|
|
|
Rijit
Active User
Joined: 15 Apr 2010 Posts: 168 Location: Pune
|
|
|
|
Frank, I am glad you came to my rescue!
I will send it to you shortly. Can you please let me know where to add //SORTDIAG DD DUMMY
in my job? I have never used this statement before. |
|
|
|
Frank Yaeger
DFSORT Developer
Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
|
|
|
|
You can add it anywhere in the DFSORT step. For example:
Code: |
//S1 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTDIAG DD DUMMY
//SORTIN DD ...
...
|
|
|
|
|
Rijit
Active User
Joined: 15 Apr 2010 Posts: 168 Location: Pune
|
|
|
|
Thank you Frank. Once I am back at my workstation I will send you the dump. Right now I don't have access to the mainframe. |
|
|
|
Frank Yaeger
DFSORT Developer
Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
|
|
|
|
Just to be clear: I don't want a "dump". I want the complete JES log. |
|
|
|
Frank Yaeger
DFSORT Developer
Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
|
|
|
|
From the JES log output you sent me, it's obvious you are using Syncsort, not DFSORT.
I'm a DFSORT developer. DFSORT and Syncsort are competitive products. I'm happy to answer questions on DFSORT and DFSORT's ICETOOL, but I don't answer questions on Syncsort. |
|
|
|
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
|
|
|
|
Hello,
Is this run at the same time that online or other batch processes are heavily using the VSAM file?
Experiments:
1. Run the sort but specify DD DUMMY for the output.
2. Run the same job, but merely copy the data rather than sort it.
Post the JCL used and the results of these 2 runs, especially the step termination info. |
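The two experiments above might look roughly like the following. This is only a sketch: the dataset names (YOUR.INPUT.KSDS, YOUR.OUTPUT.TAPE), the UNIT value, the DCB values and the work-file sizes are placeholders, not taken from the real job, and the sort fields are the ones quoted in the first post.
Code: |
//* EXPERIMENT 1: SAME SORT, OUTPUT DISCARDED (NO TAPE I/O)
//EXP1     EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=YOUR.INPUT.KSDS,DISP=SHR
//SORTOUT  DD DUMMY,DCB=(RECFM=FB,LRECL=75)
//* USE THE SAME SORTWKNN DDS AS THE PRODUCTION STEP
//SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,(500,100))
//SYSIN    DD *
  SORT FIELDS=(1,5,CH,A,15,8,BI,A)
/*
//* EXPERIMENT 2: COPY ONLY (NO SORT), OUTPUT TO THE USUAL TAPE
//EXP2     EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=YOUR.INPUT.KSDS,DISP=SHR
//SORTOUT  DD DSN=YOUR.OUTPUT.TAPE,DISP=(NEW,CATLG,DELETE),
//            UNIT=TAPE,DCB=(RECFM=FB,LRECL=75,BLKSIZE=0)
//SYSIN    DD *
  SORT FIELDS=COPY
/*
|
Comparing the elapsed times of the two steps should help separate the cost of reading and sorting the KSDS from the cost of writing to tape.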
|
|
|
Rijit
Active User
Joined: 15 Apr 2010 Posts: 168 Location: Pune
|
|
|
|
dick scherrer wrote: |
Hello,
Is this run at the same time that online or other batch processes are heavily using the VSAM file?
Experiments:
1. Run the sort but specify DD DUMMY for the output.
2. Run the same job, but merely copy the data rather than sort it.
Post the JCL used and the results of these 2 runs, especially the step termination info. |
Hi Dick,
Nice to hear from you. Right now I don't have access to the mainframe, but I will definitely run the two experiments you mentioned and share the results.
Regarding your question, I need to investigate a little. One thing is for sure: this job runs as part of the night batch and has nothing to do with the onlines. Anyway, I will check and confirm.
One thing I did try: I ran the job with the O/P file pointing to DASD instead of tape, which is what the original job uses. It ran pretty fast and finally abended with S237. I had specified the DISP as (NEW,CATLG,CATLG), so I could check the O/P file: it had processed 50% of the records in very little time. But since the total number of records is so huge, whenever I try to run it, it blows up with a space abend. I am just guessing there might be some issue with the tape. |
|
|
|
Rijit
Active User
Joined: 15 Apr 2010 Posts: 168 Location: Pune
|
|
|
|
Here is the JCL for the step.
Code: |
//STEP20   EXEC PGM=SORT
//*
//SORTIN   DD DSN=ROTP.TYUT.EXTRACT.KSDS,
//            DISP=SHR
//*
//SORTOUT  DD DSN=ROTP.INT1.Y5454.EXTRACT,
//            DISP=(NEW,CATLG,CATLG),
//            UNIT=TAPEV,
//            DCB=(LRECL=75,BLKSIZE=0,RECFM=FB),EXPDT=99000
//*
//SORTWK01 DD UNIT=SYSDA,
//            SPACE=(CYL,(500,100))
//SORTWK02 DD UNIT=SYSDA,
//            SPACE=(CYL,(500,100))
//SORTWK03 DD UNIT=SYSDA,
//            SPACE=(CYL,(500,100))
//SORTWK04 DD UNIT=SYSDA,
//            SPACE=(CYL,(500,100))
//*
//SYSLOG   DD SYSOUT=*
//SYSOUT   DD SYSOUT=*
//SORTDIAG DD DUMMY
//SYSPRINT DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
//* SYSIN CONTROL STATEMENTS (LISTED SEPARATELY IN THE JES OUTPUT)
//SYSIN    DD *
  SORT FIELDS=(1,14,BI,A,21,10,CH,A)
  END
/*
|
To be frank I have never seen any sort step running for 3 hrs
Edited: Please use BBcode when You post some code/error, that's rather readable, Thanks... Anuj |
|
|
|
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
|
|
|
|
Quote: |
finally it abended with S237 |
In addition to the S237, there would have been a return code (RC).
There would also have been an IEC023I message.
You need to look up what caused your particular abend. |
|
|
|
Anuj Dhawan
Superior Member
Joined: 22 Apr 2006 Posts: 6250 Location: Mumbai, India
|
|
|
|
Rijit wrote: |
To be frank I have never seen any sort step running for 3 hrs |
If you were "Frank" - Sort-Jobs won't malfunction in front of you! |
|
|
|
Anuj Dhawan
Superior Member
Joined: 22 Apr 2006 Posts: 6250 Location: Mumbai, India
|
|
|
|
Rijit wrote: |
I ran the job with the O/P file pointing to DASD instead of tape, which is what the original job uses. It ran pretty fast and finally abended with S237. I had specified the DISP as (NEW,CATLG,CATLG), so I could check the O/P file: it had processed 50% of the records in very little time. But since the total number of records is so huge, whenever I try to run it, it blows up with a space abend. I am just guessing there might be some issue with the tape. |
S237 is an end-of-volume abend. Did you execute your "experiment job" in test?
Also, you need to tell us what "so huge" means to you. How many records, precisely? Look at the SYSOUT of the failed job; this information should be listed there.
And how are you so sure that "50%" of the records were written pretty fast? |
|
|
|
Rijit
Active User
Joined: 15 Apr 2010 Posts: 168 Location: Pune
|
|
|
|
Hi Anuj,
Yes, I executed the experimental job in the test environment and it abended with S237.
The total number of records in the I/P KSDS is in the billions. I had given the O/P file of this sort step DISP=(NEW,CATLG,CATLG), so even though it abended with S237 I still had the output file and could count the records. It had processed almost 50% of the records in significantly less time when the O/P file was on DASD instead of tape. But this is not really helping me, as I can't use DASD for creating this output file in production. I was just experimenting in the test region.
I would like to know what factors can really hamper the performance of the sort, as in my case, where the real CPU time is near 6 mins but the elapsed time goes to around 3 hrs. This happens in every prod run. If you have any suggestions, please let me know. Thanks! |
|
|
|
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
|
|
|
|
Hello,
I suggest you run a test around 4 in the morning some Saturday or Sunday. At these times the machine is likely to be least busy.
If it takes 3 hrs to get 6 mins of CPU time, your job is being impacted by other jobs or the tape system has "problems". What type of media is being used for the output tape? If it is 3480/3490, much of the time may be lost waiting for tape mounts. Does your system now have "fat" tapes (3590 or equivalent)? They are extremely fast and will hold a very high volume of data. Talk with your storage management people. |
|
|
|
Anuj Dhawan
Superior Member
Joined: 22 Apr 2006 Posts: 6250 Location: Mumbai, India
|
|
|
|
Well, with the information you've posted, all I can do is speculate about what is happening at your end.
1. You talk about "huge", so you might like to use VSCORET=16M on EXEC. This sets the maximum amount of virtual storage, below AND above the 16-megabyte line, that SyncSort can use for its working set when SyncSort’s Dynamic Storage Management (DSM) facility is inactive. BUT this is site-specific:
a. Is DSM active at your shop or not?
b. The value of VSCORET is usually set up by the product representatives, so you need to know how good the 16M value is for the step in question. Both questions can be answered by your sort-product representative.
2. When you look at the expanded JCL in the JESJCL DD, what does it tell you about the sort-work files?
3. What is the default for DYNALLOC?
4. Check what the default for RETRY is set to under DYNALLOC. |
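For point 1, a minimal sketch of how the option might be coded, assuming it is passed through PARM on the EXEC statement of the step posted earlier. The 16M value is only the figure mentioned above, and whether your installation allows this override is something to confirm with your sort-product representative before trying it.
Code: |
//* ASSUMPTION: VSCORET IS PASSED AS A PARM OPTION ON THE SORT STEP
//STEP20   EXEC PGM=SORT,PARM='VSCORET=16M'
//* THE REMAINING DD STATEMENTS STAY AS IN THE ORIGINAL STEP
|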
|
|
|
|