kbmk
New User
Joined: 27 Sep 2007 Posts: 24 Location: Chennai
|
|
|
|
Hi,
I have to create a new job that extracts details from a tape file. I do not need all the records and fields from the tape. I am using the sort below:
Code: |
SORT FIELDS=(1,3,CH,A)
INCLUDE COND=(1,9,CH,LT,C'079')
INREC FIELDS=(....) |
The tape file size is 10.28GB. I have used five SORTWK files with SPACE=(CYL,(1500,200)). But when I checked the syslog, no sort work files were used. Also, the job takes more than 2 hours.
I have also tried the MAXSORT parameter with the JCL below, but it ran for 4 hours. I might have coded something wrong here.
Code: |
//STEP01 EXEC PGM=SORT,PARM='MAXSORT,MAXWKSP=MAX'
//SORTIN DD DSN=TAPE.FILE,DISP=SHR
//SORTOUT DD DSN=DASD.FILE,DISP=SHR (cataloged before)
//SORTBKPT DD DSN=BKDSN.FILE,DISP=(OLD,KEEP) (Cataloged before)
//SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,(1500,100))
...
//SORTWK10 DD UNIT=SYSDA,SPACE=(CYL,(1500,100))
//SORTOU01 DD DSN=&&DSN1,DISP=(NEW,DELETE),
// SPACE=(CYL,(500,200),RLSE)
...
//SORTOU05 DD DSN=&&DSN5,DISP=(NEW,DELETE),
// SPACE=(CYL,(500,200),RLSE)
//SYSIN DD *
SORT FIELDS=(1,3,CH,A)
INCLUDE COND=(1,9,CH,LT,C'079')
INREC FIELDS=(....)
/*
//* |
Kindly advise me whether there is a chance to reduce the run time. I have only Syncsort. Any suggestions to reduce the job run time will be helpful. |
|
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
|
|
|
|
Hello,
Quote: |
I have only syncsort. |
Which is very fast. . .
What tape media is being used? How many tape volumes are read? You might look at the syslog and see how much time was spent waiting on tape mounts.
What is the lrecl for the data and how many records are there that will be "included"? |
|
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
|
|
|
|
Do the records need to be sorted, or would a COPY do? |
|
kbmk
New User
Joined: 27 Sep 2007 Posts: 24 Location: Chennai
|
|
|
|
LRECL is 1000. The number of records is 11043008 (calculated with the formula (block size * number of blocks) / LRECL).
The media type is 36tk (obtained from a TLMS tape inquiry). The extracted output file contains about 11000 records with an LRECL of 400.
If SORT FIELDS=COPY will reduce the run time, it will be fine to have another step to sort the disk file (that will not take much time). |
|
Anuj Dhawan
Superior Member
Joined: 22 Apr 2006 Posts: 6250 Location: Mumbai, India
|
|
|
|
Quote: |
But, when i have checked the syslog, no sort work files are used. Also, the job takes more than 2 hours. |
As Dick has asked about copy (probably with the same intention as mine), my thoughts are: sort products don't use sortwork files in a copy operation. Those 2 hours might be related to tape mounts or I/O operations, so if sorting is not required, just use OPTION COPY.
Suggest you get in touch with Alissa, the Syncsort point of contact in this forum. She might be able to help you better, as she supports this product. |
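A minimal sketch of the copy variant, assuming the INCLUDE condition from the first post is the one wanted. The INREC field list was elided in that post, so it is left out here rather than guessed:

```jcl
//* Sketch only: the INCLUDE condition is copied from the first post.
//* OPTION COPY asks for a straight copy of the selected records,
//* so no sort phase (and no sort work space) is involved.
//SYSIN    DD *
  OPTION COPY
  INCLUDE COND=(1,9,CH,LT,C'079')
/*
```

If the extract must end up in key order, a second small step can sort the disk file afterwards.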
|
gcicchet
Senior Member
Joined: 28 Jul 2006 Posts: 1702 Location: Australia
|
|
|
|
Hi,
Changing the option to copy should not make much difference, considering it is only sorting 11000 records with an LRECL of 400.
Gerry |
|
gcicchet
Senior Member
Joined: 28 Jul 2006 Posts: 1702 Location: Australia
|
|
|
|
Hi,
Is it possible to show the output from the run?
Gerry |
|
PeterHolland
Global Moderator
Joined: 27 Oct 2009 Posts: 2481 Location: Netherlands, Amstelveen
|
|
|
|
I guess your input blocksize=lrecl=1000.
What is your output blocksize? |
|
gcicchet
Senior Member
Joined: 28 Jul 2006 Posts: 1702 Location: Australia
|
|
|
|
Hi Peter,
I'd be more concerned with the input blocksize, that's why I asked for the output from the run.
Gerry |
|
PeterHolland
Global Moderator
Joined: 27 Oct 2009 Posts: 2481 Location: Netherlands, Amstelveen
|
|
|
|
gcicchet wrote: |
Hi Peter,
I'd be more concerned with the input blocksize, that's why I asked for the output from the run.
Gerry |
Hello Gerry,
I mentioned my doubt about the input blksize in my comment.
So I'm quite sure the output is also blksize=lrecl=400. |
|
kbmk
New User
Joined: 27 Sep 2007 Posts: 24 Location: Chennai
|
|
|
|
Quote: |
As Dick has asked about copy (probably with the same intention as mine), my thoughts are: sort products don't use sortwork files in a copy operation. Those 2 hours might be related to tape mounts or I/O operations, so if sorting is not required, just use OPTION COPY. |
I have used only sort fields=(1,9,ch,a). But still, it has not used the sortwork files.
Quote: |
I guess your input blocksize=lrecl=1000.
What is your output blocksize? |
Input file block size is 32000.
For some reason, the convention here is not to code a block size in the DCB parameter for output files; it is left for the system to decide automatically. I have checked the dataset information: it shows a block size of 27939 and LRECL=417.
Quote: |
is it possible to show the output from run ? |
Do you mean the output file? It is difficult to get that. |
|
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
|
|
|
|
Hello,
Not the data file - just the informational sysout data generated by the run. It is possibly still in the spool. . .
It may also help if you post the complete jcl and control statements.
For the sysout data and the jcl/control statements, please use the "Code" tag for readability.
How many tape volumes were mounted to read the data? |
|
Anuj Dhawan
Superior Member
Joined: 22 Apr 2006 Posts: 6250 Location: Mumbai, India
|
|
|
|
kbmk wrote: |
I have used only sort fields=(1,9,ch,a). But still, it has not used the sortwork files. |
(An untested assumption:) maybe your file is already sorted on those keys...
Well, please post the sysout as asked many times. Otherwise my empty mind will keep on adding clutter here... (for some days now it has been behaving like the devil's workshop... hmmmm) |
|
kbmk
New User
Joined: 27 Sep 2007 Posts: 24 Location: Chennai
|
|
|
|
Sorry, my spool output logs are erased. I have to run the job again. I will provide the SYSOUT today. |
|
kbmk
New User
Joined: 27 Sep 2007 Posts: 24 Location: Chennai
|
|
|
|
sysout data
Code: |
************************************************************************************************************************************
* STEP | STEP | PROGRAM | TIME IN SECS | VIRT STORAGE IN K | COMP | SERVICE UNITS *
* NO | NAME | NAME | ELAPSED CPU SRB CPU TCB |REQUSTD USED OVRHD | CODE | STEP CPU IO MSO *
************************************************************************************************************************************
* 002 | S01 | SORT | 7145.81 3.15 49.34 |0131072 0003080 00264 | 0000 | 1247776 8273907 1808667 0*
************************************************************************************************************************************
* DDNAME EXCPS | DDNAME EXCPS | DDNAME EXCPS | DDNAME EXCPS | DDNAME EXCPS | DDNAME EXCPS | DDNAME EXCPS *
*SORTIN 55622|SORTOUT 0|SYSOUT 0|SYSIN 0|SORTIN 74168|SORTOUT 0|SYSOUT 0*
*SYSIN 0|SORTIN 101300|SORTOUT 0|SYSOUT 0|SYSIN 0|SORTIN 102478|SORTOUT 0*
*SYSOUT 0|SYSIN 0|SORTIN 27773|SORTOUT 1|SYSOUT 0|SYSIN 0| *
*******************************************************************************************************************************
IEF373I STEP/S01 /START 2010021.1438
IEF374I STEP/S01 /STOP 2010021.1637 CPU 0MIN 49.34SEC SRB 0MIN 03.15SEC VIRT 3080K SYS 264K EXT 6384K SYS 11504K
*********************************************************************************************************
* | TIME IN SECS | SERVICE UNITS | *
* JOBNAME | ELAPSED CPU SRB CPU TCB | JOB CPU IO MSO SRB | SYS ID *
*********************************************************************************************************
* SORT1234 | 7146.13 3.15 49.49 | 1304995 8331014 1808727 0 1165254 | SYSD *
*********************************************************************************************************
IEF375I JOB/SORT1234/START 2010021.1438
IEF376I JOB/SORT1234/STOP 2010021.1637 CPU 0MIN 49.49SEC SRB 0MIN 03.15SEC
|
JCL for the above sysout
Code: |
//STEP EXEC PGM=SORT,PARM='IO'
//SORTIN DD DSN=INPUT.TAPE.FILE,DISP=SHR
//SORTOUT DD DSN=OUTPUT.FILE,
// DISP=(NEW,CATLG),
// UNIT=TESTDA,SPACE=(CYL,(50,5),RLSE)
//SYSIN DD *
SORT FIELDS=COPY
INCLUDE COND=(....)
INREC FIELDS=(....)
/*
//* |
The input tape file has 11403008 records with an LRECL of 1000 (17 volumes in total). The output file has 985 records with an LRECL of 100 in this case (because of the INREC statement). It took almost 2 hours to complete.
Actually, the output of the above step is used to sum up the fields in a particular position. I have tried to combine these two steps, as described below.
I have submitted another job with SORT FIELDS=(1,9,CH,A) and SUM FIELDS=(54,6,ZD). Previously, when I submitted jobs with the DYNALLOC=32 parameter, no sort work files were used. The output contains just 29 records. This job is now running; I will post the details once it has completed.
Kindly advise me how to reduce the job run time. |
|
PeterHolland
Global Moderator
Joined: 27 Oct 2009 Posts: 2481 Location: Netherlands, Amstelveen
|
|
|
|
2 hours for 17 volumes is not that long (7 minutes a volume), taking into account the time needed for mounting/demounting and sleeping tape operators (if they still exist). |
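The per-volume figure can be checked against the step statistics posted for the COPY run, which report 7145.81 elapsed seconds over 17 volumes:

```latex
\frac{7145.81\ \text{s}}{17\ \text{volumes}} \approx 420\ \text{s per volume} \approx 7.0\ \text{min per volume}
```

Spread over the stated 10.28GB file size, that works out to roughly 1.4 to 1.5 MB per second overall, depending on whether GB is read as decimal or binary. This is only a rough average and says nothing about how the time splits between mounts and data transfer.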
|
Robert Sample
Global Moderator
Joined: 06 Jun 2008 Posts: 8696 Location: Dubuque, Iowa, USA
|
|
|
|
Quote: |
sleeping tape operators (if they still exist). |
usually on third shift. |
|
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10873 Location: italy
|
|
|
|
Hard to do. I would say that it's all wait time due to tape handling.
Also, as far as the copy/sort debate goes, it is a non-issue:
a properly written sort step will pass only the selected/included records to the sort process,
and 1000 records will certainly be handled in main storage without the need for work files.
There is no reason to split this into multiple steps. |
|
PeterHolland
Global Moderator
Joined: 27 Oct 2009 Posts: 2481 Location: Netherlands, Amstelveen
|
|
|
|
Yes, Robert, and on day shifts they are drinking coffee or reading the paper, with the paper lying over the next reel while they turn the tape vault over to find it. |
|
kbmk
New User
Joined: 27 Sep 2007 Posts: 24 Location: Chennai
|
|
|
|
Robert Sample wrote: |
Quote: |
sleeping tape operators (if they still exist). |
usually on third shift. |
Ok. Thanks. But, I want to understand why sort work files are not being used when I am using DYNALLOC=32 with SORT FIELDS=(1,9,CH,A)
Any help in this regard? |
|
PeterHolland
Global Moderator
Joined: 27 Oct 2009 Posts: 2481 Location: Netherlands, Amstelveen
|
|
|
|
I didn't see you use DYNALLOC in your samples.
But sort could probably handle the 1000 records in memory, as Enrico stated. |
|
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
|
|
|
|
Hello,
Good start, but that is not all of the requested info. . .
At least 2 sets of sysout are missing - the jes msg log and the informational output from the sort execution.
If the input requires 17 physical volumes, you might ask whether higher-capacity media is available for these very large files. Are these tapes mounted by a silo or are they manually loaded? I suspect much of the elapsed time is due to waiting on tape mounts. |
|
Ronald Burr
Active User
Joined: 22 Oct 2009 Posts: 293 Location: U.S.A.
|
|
|
|
One thing you "could" do to reduce elapsed time, if permitted by your shop, is to "ping-pong" tape drives for the input file. By doing so, you would essentially eliminate any time lost in mounting/demounting physical tapes ( except for the first volume ). To "ping-pong" tape drives, just code UNIT=(TAPE,2) on the input file DD statement ( assuming that 'TAPE' is the unit designation for tape in your shop - if not, just substitute whatever is your shop's standard ). |
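A minimal sketch of that DD statement, assuming the input dataset name from the JCL posted earlier in the thread and using TAPE purely as a placeholder unit name:

```jcl
//* Sketch only: TAPE is a placeholder; substitute the unit name
//* your shop actually uses for these cartridges.
//* UNIT=(TAPE,2) asks for two drives, so the next volume can be
//* mounted on one drive while the other is still being read.
//SORTIN   DD DSN=INPUT.TAPE.FILE,DISP=OLD,
//            UNIT=(TAPE,2)
```

Holding two drives for the life of the step may not be allowed everywhere, so it is worth checking with operations before using it.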
|
kbmk
New User
Joined: 27 Sep 2007 Posts: 24 Location: Chennai
|
|
|
|
dick scherrer wrote: |
At least 2 sets of sysout are missing - the jes msg log and the informational output from the sort execution.. |
Are you asking for the complete JESMSG log? Kindly let me know if you need any specific details from the JESMSG log; I can find them and provide them.
Quote: |
Are these tapes mounted by a silo or are they manually loaded? |
In the step where the file is created, we have UNIT=SILO. Could you please explain what is meant by SILO? I thought it was a tape name.
To my astonishment, my second job, with SORT FIELDS=(1,9,CH,A) and a SUM FIELDS statement, completed within 1 hr 20 mins, whereas the SORT FIELDS=COPY job took almost 2 hours. I still do not understand why a copy takes more time than a sort.
I have used the DYNALLOC=32 and IO parameters in this JCL. After summing up, the number of records in the output file is 29.
Code: |
//STEP01 EXEC PGM=SORT,PARM='IO,DYNALLOC=32'
//SORTIN DD DSN=INPUT.TAPE.FILE,DISP=SHR
//SORTOUT DD DSN=OUTPUT.DASD.FILE,DISP=(NEW,CATLG,CATLG),
// UNIT=SYSDA,SPACE=(CYL,(1,5),RLSE)
//SYSOUT DD SYSOUT=*
//SYSIN DD *
INREC FIELDS=(...)
SORT FIELDS=(1,9,CH,A)
SUM FIELDS=(54,6,ZD)
INCLUDE COND=(....)
/*
//*
|
Code: |
************************************************************************************************************************************
* STEP | STEP | PROGRAM | TIME IN SECS | VIRT STORAGE IN K | COMP | SERVICE UNITS *
* NO | NAME | NAME | ELAPSED CPU SRB CPU TCB |REQUSTD USED OVRHD | CODE | STEP CPU IO MSO *
************************************************************************************************************************************
* 002 | S01 | SORT | 5090.57 .20 39.46 |0131072 0003080 00268 | 0000 | 4777562 4615061 87009 0*
************************************************************************************************************************************
* DDNAME EXCPS | DDNAME EXCPS | DDNAME EXCPS | DDNAME EXCPS | DDNAME EXCPS | DDNAME EXCPS |DDNAME EXCPS *
*SORTIN 1549|SORTOUT 0|SYSOUT 0|SYSIN 0|SORTIN 6236|SORTOUT 0|SYSOUT 0*
*SYSIN 0|SORTIN 5970|SORTOUT 0|SYSOUT 0|SYSIN 0|SORTIN 3259|SORTOUT 1*
*SYSOUT 0|SYSIN 0|SORTWK01 2| | | | *
************************************************************************************************************************************
IEF373I STEP/S01 /START 2010021.2121
IEF374I STEP/S01 /STOP 2010021.2246 CPU 0MIN 39.46SEC SRB 0MIN 00.20SEC VIRT 3080K SYS 268K EXT 26252K SYS 11504K
**********************************************************************************************************
* | TIME IN SECS | SERVICE UNITS | *
* JOBNAME | ELAPSED CPU SRB CPU TCB | JOB CPU IO MSO SRB | SYS ID *
**********************************************************************************************************
* SORT1234 | 5090.84 .20 39.61 | 4833141 4670538 87069 0 75534 | SYSD *
**********************************************************************************************************
IEF375I JOB/SORT1234/START 2010021.2121
IEF376I JOB/SORT1234/STOP 2010021.2246 CPU 0MIN 39.61SEC SRB 0MIN 00.20SEC
|
Let me know if you need any other details. The same input file is used for different extraction purposes. In this particular case, only 985 records are extracted, and they are summed up to give a total of 29 records.
In other jobs, more than 10,000 records are extracted, no SUM statement is needed, and the file needs to be in sorted order.
Also, if someone can tell me how to read the above SYSLOG, that would be very helpful.
Happy weekend to all!! |
|
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
|
|
|
|
Hello,
UNIT=SILO is specific to your system - it could just as well be UNIT=BARN. These names are determined by the system configuration.
On the other hand, a silo is a hardware device that contains many cart volumes. These are inventoried within the hardware and are automatically mounted on a tape drive when needed. The computer operators do not manually mount these volumes.
Even 10,000 is a very small number of records and would process in just a few seconds - unless it took longer to pass 17 volumes looking for the 10,000 records.
You need to work with your local support people to gain a better understanding of where the time goes in this job. From what you've posted, almost all of the time is spent reading the 17 volumes or waiting for them to be mounted for processing.
And tape datasets should not be specified with disp=shr. . . A tape volume cannot be shared. . . |
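A minimal sketch of the input DD with that change, assuming the cataloged dataset name from the JCL posted earlier in the thread:

```jcl
//* Sketch only: dataset name taken from the earlier JCL.
//* DISP=(OLD,KEEP) requests exclusive use of the existing tape
//* dataset and keeps it at the end of the step.
//SORTIN   DD DSN=INPUT.TAPE.FILE,DISP=(OLD,KEEP)
```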
|