IBM Mainframe Forum Index
 
Extracting fields from a tape file using SYNCSORT


IBM Mainframe Forums -> JCL & VSAM
kbmk

New User


Joined: 27 Sep 2007
Posts: 24
Location: Chennai

Posted: Thu Jan 21, 2010 10:16 am

Hi,

I have to create a new job which should extract details from a tape file. I do not need all the records & fields from the tape. I am using the sort statements below:

Code:
SORT FIELDS=(1,3,CH,A)
INCLUDE COND=(1,9,CH,LT,C'079')
INREC FIELDS=(....)


The tape file size is 10.28GB. I have used five SORTWK files with SPACE=(CYL,(1500,200)). But when I checked the syslog, no sort work files were used. Also, the job takes more than 2 hours.

I have also tried the MAXSORT parameter with the JCL below, but it ran for 4 hours. I might have coded something wrong here.

Code:


//STEP01 EXEC PGM=SORT,PARM='MAXSORT,MAXWKSP=MAX'
//SORTIN DD DSN=TAPE.FILE,DISP=SHR
//SORTOUT DD DSN=DASD.FILE,DISP=SHR (cataloged before)
//SORTBKPT DD DSN=BKDSN.FILE,DISP=(OLD,KEEP) (Cataloged before)
//SORTWK01 DD UNIT=SYSDA,SPACE=(CYL,(1500,100))
...
//SORTWK10 DD UNIT=SYSDA,SPACE=(CYL,(1500,100))
//SORTOU01 DD DSN=&&DSN1,DISP=(NEW,DELETE),
//              SPACE=(CYL,(500,200),RLSE)
...
//SORTOU05 DD DSN=&&DSN5,DISP=(NEW,DELETE),
//             SPACE=(CYL,(500,200),RLSE)
//SYSIN DD *
SORT FIELDS=(1,3,CH,A)
INCLUDE COND=(1,9,CH,LT,C'079')
INREC FIELDS=(....)
/*
//*


Kindly advise me whether there is a chance to reduce the run time. I have only syncsort. Any suggestions to reduce the job run time will be helpful.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

Posted: Thu Jan 21, 2010 10:39 am

Hello,

Quote:
I have only syncsort.
Which is very fast. . .

What tape media is being used? How many tape volumes are read? You might look at the syslog and see how much time was spent waiting on tape mounts.

What is the lrecl for the data and how many records are there that will be "included"?
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

Posted: Thu Jan 21, 2010 11:00 am

Do the records need to be sorted, or would a COPY do?
kbmk

New User


Joined: 27 Sep 2007
Posts: 24
Location: Chennai

Posted: Thu Jan 21, 2010 12:15 pm

LRECL is 1000. The number of records is 11043008 (calculated with the formula (block size * number of blocks) / LRECL).

The media type is 36tk (Obtained from TLMS tape inquiry). The extracted output file contains about 11000 records with LRECL as 400.

If Sort fields=copy will reduce the run time, it will be fine to have another step to sort the disk file (will not take more time).
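
A rough, untested sketch of that two-step idea is shown below. The step names, the temporary data set &&EXTR and the space values are made up for illustration; TAPE.FILE, DASD.FILE and the INCLUDE and SORT statements come from the first post. The INREC statement is left out because its fields were not posted, so the sort key in the second step assumes the record layout is unchanged. DISP=OLD is my choice for the existing data sets.

Code:
//* Step 1: copy only the wanted records from tape to a temporary file
//EXTRACT  EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=TAPE.FILE,DISP=OLD
//SORTOUT  DD DSN=&&EXTR,DISP=(NEW,PASS),
//            UNIT=SYSDA,SPACE=(CYL,(50,5),RLSE)
//SYSIN    DD *
  SORT FIELDS=COPY
  INCLUDE COND=(1,9,CH,LT,C'079')
/*
//* Step 2: sort the small extract on DASD
//SORTIT   EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=&&EXTR,DISP=(OLD,DELETE)
//SORTOUT  DD DSN=DASD.FILE,DISP=OLD
//SYSIN    DD *
  SORT FIELDS=(1,3,CH,A)
/*

Either way the whole tape still has to be read once in the first step, so any saving would come only from the sort phase, not from the tape I/O.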
Anuj Dhawan

Superior Member


Joined: 22 Apr 2006
Posts: 6250
Location: Mumbai, India

Posted: Thu Jan 21, 2010 3:48 pm

Quote:
But when I checked the syslog, no sort work files were used. Also, the job takes more than 2 hours.
As Dick has asked about copy (probably with the same intention as mine), my thoughts are: sort products don't use sortwork files in a copy operation. Those 2 hours might be related to tape mounts or I/O operations, so if sorting is not required, just use "OPTION COPY".

Suggest you get in touch with Alissa, syncsort point of contact in this forum - she might be able to help you better, as she supports this product.
gcicchet

Senior Member


Joined: 28 Jul 2006
Posts: 1702
Location: Australia

Posted: Thu Jan 21, 2010 4:02 pm

Hi,

changing the option to copy should not make much difference considering it's only sorting 11000 records with an LRECL of 400.


Gerry
gcicchet

Senior Member


Joined: 28 Jul 2006
Posts: 1702
Location: Australia

Posted: Thu Jan 21, 2010 4:10 pm

Hi,

is it possible to show the output from run ?


Gerry
PeterHolland

Global Moderator


Joined: 27 Oct 2009
Posts: 2481
Location: Netherlands, Amstelveen

Posted: Thu Jan 21, 2010 4:31 pm

I guess your input blocksize=lrecl=1000.
What is your output blocksize?
gcicchet

Senior Member


Joined: 28 Jul 2006
Posts: 1702
Location: Australia

Posted: Thu Jan 21, 2010 4:36 pm

Hi Peter,

I'd be more concerned with the input blocksize, that's why I asked for the output from the run.


Gerry
PeterHolland

Global Moderator


Joined: 27 Oct 2009
Posts: 2481
Location: Netherlands, Amstelveen

Posted: Thu Jan 21, 2010 4:43 pm

gcicchet wrote:
Hi Peter,

I'd be more concerned with the input blocksize, that's why I asked for the output from the run.


Gerry


Hello Gerry,

I mentioned my doubt about the input blksize in my comment.
So I'm quite sure the output is also blksize=lrecl=400.
kbmk

New User


Joined: 27 Sep 2007
Posts: 24
Location: Chennai

Posted: Thu Jan 21, 2010 5:48 pm

Quote:
As Dick has asked about copy (probably with the same intention as mine), my thoughts are: sort products don't use sortwork files in a copy operation. Those 2 hours might be related to tape mounts or I/O operations, so if sorting is not required, just use "OPTION COPY".

I have used only sort fields=(1,9,ch,a). But still, it has not used the sortwork files.
Quote:
I guess your input blocksize=lrecl=1000.
What is your output blocksize?

Input file block size is 32000.

For some reason, there is no convention here to code the block size in the DCB parameter for output files; it is left for the system to decide automatically. I have checked the dataset information: it shows a block size of 27939 and LRECL=417.

Quote:
is it possible to show the output from run ?

Do you mean the output file? It is difficult to get that.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

Posted: Thu Jan 21, 2010 9:05 pm

Hello,

Not the data file - just the informational sysout data generated by the run. It is possibly still in the spool. . .

It may also help if you post the complete jcl and control statements.

For the sysout data and the jcl/control statements, please use the "Code" tag for readability.

How many tape volumes were mounted to read the data?
Anuj Dhawan

Superior Member


Joined: 22 Apr 2006
Posts: 6250
Location: Mumbai, India

Posted: Thu Jan 21, 2010 9:29 pm

kbmk wrote:
I have used only sort fields=(1,9,ch,a). But still, it has not used the sortwork files.
(An un-tested assumption is)- may be your file is already sorted on those keys then...

Well, please post the sysout as asked many times - otherwise my empty mind will keep on adding clutters here... (since some days it's just behaving as evil's workshop...hmmmm)
kbmk

New User


Joined: 27 Sep 2007
Posts: 24
Location: Chennai

Posted: Fri Jan 22, 2010 8:39 am

Sorry, my spool output logs are erased. I have to run the job again. I will provide the SYSOUT today.
kbmk

New User


Joined: 27 Sep 2007
Posts: 24
Location: Chennai

Posted: Fri Jan 22, 2010 8:02 pm

sysout data
Code:
************************************************************************************************************************************
 * STEP |   STEP   | PROGRAM  |          TIME IN SECS          |   VIRT STORAGE IN K   |  COMP  |           SERVICE UNITS           *
 *  NO  |   NAME   |   NAME   |  ELAPSED    CPU SRB    CPU TCB |REQUSTD   USED   OVRHD |  CODE  |  STEP     CPU      IO       MSO   *
 ************************************************************************************************************************************
 * 002  | S01      | SORT     |  7145.81       3.15      49.34 |0131072 0003080  00264 |  0000  | 1247776  8273907  1808667        0*
 ************************************************************************************************************************************
 * DDNAME   EXCPS  | DDNAME   EXCPS  | DDNAME   EXCPS  | DDNAME   EXCPS  | DDNAME   EXCPS  | DDNAME   EXCPS  | DDNAME   EXCPS  *
 *SORTIN      55622|SORTOUT         0|SYSOUT          0|SYSIN           0|SORTIN      74168|SORTOUT         0|SYSOUT          0*
 *SYSIN           0|SORTIN     101300|SORTOUT         0|SYSOUT          0|SYSIN           0|SORTIN     102478|SORTOUT         0*
 *SYSOUT          0|SYSIN           0|SORTIN      27773|SORTOUT         1|SYSOUT          0|SYSIN           0|                 *
 *******************************************************************************************************************************

 IEF373I STEP/S01     /START 2010021.1438
 IEF374I STEP/S01     /STOP  2010021.1637 CPU    0MIN 49.34SEC SRB    0MIN 03.15SEC VIRT  3080K SYS   264K EXT    6384K SYS   11504K

 *********************************************************************************************************
 *          |          TIME IN SECS          |                  SERVICE UNITS                  |         *
 * JOBNAME  |  ELAPSED    CPU SRB    CPU TCB |   JOB        CPU        IO       MSO       SRB  | SYS ID  *
 *********************************************************************************************************
 * SORT1234 |  7146.13       3.15      49.49 | 1304995   8331014   1808727         0   1165254 |  SYSD   *
 *********************************************************************************************************

 IEF375I  JOB/SORT1234/START 2010021.1438
 IEF376I  JOB/SORT1234/STOP  2010021.1637 CPU    0MIN 49.49SEC SRB    0MIN 03.15SEC



JCL for the above sysout
Code:
//STEP EXEC PGM=SORT,PARM='IO'
//SORTIN DD DSN=INPUT.TAPE.FILE,DISP=SHR
//SORTOUT DD DSN=OUTPUT.FILE,
//      DISP=(NEW,CATLG),
//     UNIT=TESTDA,SPACE=(CYL,(50,5),RLSE)
//SYSIN DD *
 SORT FIELDS=COPY
 INCLUDE COND=(....)
 INREC FIELDS=(....)
/*
//*


The input tape file has 11403008 records with lrecl as 1000 (total of 17 volumes). The output file has 985 records with lrecl as 100 in this case (Because of INREC statement). It took almost 2 hours to complete.

Actually, the output of the above job is used to sum up the fields in a particular position. I have tried to combine these two steps below.

I have submitted another job with SORT FIELDS=(1,9,CH,A) and SUM FIELDS=(54,6,ZD). Previously, when I submitted jobs with the DYNALLOC=32 parameter, no sort work files were used. The output contains just 29 records. This job is now running; I will post the details once it has completed.

Kindly advise me how to reduce the job run time.
PeterHolland

Global Moderator


Joined: 27 Oct 2009
Posts: 2481
Location: Netherlands, Amstelveen

Posted: Fri Jan 22, 2010 8:13 pm

2 hours for 17 volumes is not that long (7 minutes a volume), taking into account the time needed for mounting/demounting and sleeping tape operators (if they still exist).
Robert Sample

Global Moderator


Joined: 06 Jun 2008
Posts: 8696
Location: Dubuque, Iowa, USA

Posted: Fri Jan 22, 2010 8:16 pm

Quote:
sleeping tape operators (if they still exist).
usually on third shift.
enrico-sorichetti

Superior Member


Joined: 14 Mar 2007
Posts: 10873
Location: italy

Posted: Fri Jan 22, 2010 8:20 pm

Hard to do. I would say that it's all wait time due to tape handling.
Also, as far as the copy/sort debate goes, it is a non-issue:
a properly written sort step will pass only the selected/included records to the sort process.
1000 records will certainly be handled in main storage without the need for work files.
There is no reason to split into multiple steps.
PeterHolland

Global Moderator


Joined: 27 Oct 2009
Posts: 2481
Location: Netherlands, Amstelveen

Posted: Fri Jan 22, 2010 8:21 pm

Yes Robert, and in day time shifts drinking coffee or reading papers. With the paper lying over the next reel and turning the tape vault over to find it.
kbmk

New User


Joined: 27 Sep 2007
Posts: 24
Location: Chennai

Posted: Fri Jan 22, 2010 8:22 pm

Robert Sample wrote:
Quote:
sleeping tape operators (if they still exist).
usually on third shift.


Ok. Thanks. But, I want to understand why sort work files are not being used when I am using DYNALLOC=32 with SORT FIELDS=(1,9,CH,A)

Any help in this regard?
PeterHolland

Global Moderator


Joined: 27 Oct 2009
Posts: 2481
Location: Netherlands, Amstelveen

Posted: Fri Jan 22, 2010 8:32 pm

I didn't see you use DYNALLOC in your samples.
But sort could probably handle the 1000 records in memory, as
Enrico stated.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

Posted: Fri Jan 22, 2010 8:43 pm

Hello,

Good start, but that is not all of the requested info. . .

At least 2 sets of sysout are missing - the jes msg log and the informational output from the sort execution.

If the input requires 17 physical volumes, you might ask if higher-volume media is available for these very large files. Are these tapes mounted by a silo or are they manually loaded? I suspect much of the elapsed time is due to waiting on tape mounts.
Ronald Burr

Active User


Joined: 22 Oct 2009
Posts: 293
Location: U.S.A.

Posted: Fri Jan 22, 2010 9:31 pm

One thing you "could" do to reduce elapsed time, if permitted by your shop, is to "ping-pong" tape drives for the input file. By doing so, you would essentially eliminate any time lost in mounting/demounting physical tapes ( except for the first volume ). To "ping-pong" tape drives, just code UNIT=(TAPE,2) on the input file DD statement ( assuming that 'TAPE' is the unit designation for tape in your shop - if not, just substitute whatever is your shop's standard ).
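
A minimal sketch of that DD statement follows, using the input data set name from the JCL posted earlier in the thread. TAPE is only a placeholder for whatever unit name your shop uses, and DISP=OLD (rather than the DISP=SHR in the posted JCL) is my own choice here. It is not tested.

Code:
//* Two drives for the multi-volume input, so the next volume can be
//* mounted while the current one is still being read
//SORTIN   DD DSN=INPUT.TAPE.FILE,DISP=OLD,
//            UNIT=(TAPE,2)

The trade-off is that the job holds two tape drives for its whole run, which is why it needs to be permitted by the shop.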
kbmk

New User


Joined: 27 Sep 2007
Posts: 24
Location: Chennai

Posted: Sat Jan 23, 2010 6:40 am

dick scherrer wrote:


At least 2 sets of sysout are missing - the jes msg log and the informational output from the sort execution..


Are you asking for the complete JESMSG log? Kindly let me know if you need any specific details from the JESMSG log; I can find them and provide them to you.

Quote:

Are these tapes mounted by a silo or are they manually loaded?


In the step where the file is created, we have UNIT=SILO. Could you please explain what is meant by SILO? I thought it was a tape name.

To my astonishment, my second job with SORT FIELDS=(1,9,CH,A) and a SUM FIELDS statement completed within 1 hr 20 mins, whereas the SORT FIELDS=COPY job took almost 2 hours. I still do not understand why a copy takes more time than a sort.

I have used DYNALLOC=32 and IO parameter in this JCL. After summing up, the number of records in the output file is 29.

Code:
//STEP01 EXEC PGM=SORT,PARM='IO,DYNALLOC=32'
//SORTIN DD DSN=INPUT.TAPE.FILE,DISP=SHR
//SORTOUT DD DSN=OUTPUT.DASD.FILE,DISP=(NEW,CATLG,CATLG),
//             UNIT=SYSDA,SPACE=(CYL,(1,5),RLSE)
//SYSOUT  DD SYSOUT=*
//SYSIN   DD *
INREC FIELDS=(...)
SORT FIELDS=(1,9,CH,A)
SUM FIELDS=(54,6,ZD)
INCLUDE COND=(....)
/*
//*


Code:

************************************************************************************************************************************
* STEP |   STEP   | PROGRAM  |          TIME IN SECS          |   VIRT STORAGE IN K   |  COMP  |           SERVICE UNITS           *
*  NO  |   NAME   |   NAME   |  ELAPSED    CPU SRB    CPU TCB |REQUSTD   USED  OVRHD  |  CODE  |  STEP     CPU      IO       MSO   *
************************************************************************************************************************************
* 002  | S01      | SORT     |  5090.57        .20      39.46 |0131072 0003080 00268  |  0000  | 4777562  4615061    87009        0*
************************************************************************************************************************************
* DDNAME   EXCPS  | DDNAME   EXCPS  | DDNAME   EXCPS  | DDNAME   EXCPS  | DDNAME   EXCPS  | DDNAME   EXCPS  |DDNAME    EXCPS  *
*SORTIN       1549|SORTOUT         0|SYSOUT          0|SYSIN           0|SORTIN       6236|SORTOUT         0|SYSOUT          0*
*SYSIN           0|SORTIN       5970|SORTOUT         0|SYSOUT          0|SYSIN           0|SORTIN       3259|SORTOUT         1*
*SYSOUT          0|SYSIN           0|SORTWK01        2|                 |                 |                 |                 *
************************************************************************************************************************************

IEF373I STEP/S01     /START 2010021.2121
IEF374I STEP/S01     /STOP  2010021.2246 CPU     0MIN 39.46SEC SRB   0MIN 00.20SEC VIRT  3080K SYS   268K EXT   26252K SYS   11504K

**********************************************************************************************************
*          |           TIME IN SECS          |                  SERVICE UNITS                  |         *
* JOBNAME  |  ELAPSED    CPU SRB     CPU TCB |   JOB        CPU        IO       MSO       SRB  | SYS ID  *
**********************************************************************************************************
* SORT1234 |  5090.84        .20       39.61 | 4833141   4670538     87069         0     75534 |  SYSD   *
********************************************************************************************************** 

IEF375I JOB/SORT1234/START 2010021.2121
IEF376I JOB/SORT1234/STOP  2010021.2246 CPU     0MIN 39.61SEC SRB   0MIN 00.20SEC


Let me know if you need any other details. The same input file is used for different extraction purposes. In this particular case, only 985 records are extracted, and they are summed up to give a total of 29 records.
In other jobs, more than 10,000 records are extracted, no SUM statement is needed, and the file needs to be in sorted order.

Also, if someone can tell me how to read the above SYSLOG output, it will be very helpful.

happy weekend to all!!
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

Posted: Sat Jan 23, 2010 9:31 am

Hello,

UNIT=SILO is specific to your system - it could just as well be UNIT=BARN. These names are determined by the system configuration.

On the other hand, a silo is a hardware device that contains many cart volumes. These are inventoried within the hardware and are automatically mounted on a tape drive when needed. The computer operators do not manually mount these volumes.

Even 10,000 is a very small number of records and would process in just a few seconds - unless it took longer to pass 17 volumes looking for the 10,000 records.

You need to work with your local support people to gain a better understanding of where the time goes in this job. From what you've posted, almost all of the time is spent reading the 17 volumes or waiting for them to be mounted for processing.
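
To put rough numbers on that, the arithmetic below uses only the elapsed and CPU seconds from the two step summaries posted above and the 17 input volumes. Treating CPU TCB plus CPU SRB as the total CPU time is a simplification on my part.

\[
\text{COPY run: } \frac{49.34 + 3.15}{7145.81} = \frac{52.49}{7145.81} \approx 0.007 \;(\text{about } 0.7\%),
\qquad \frac{7145.81\ \text{s}}{17\ \text{volumes}} \approx 420\ \text{s} \approx 7.0\ \text{min per volume}
\]

\[
\text{SORT/SUM run: } \frac{39.46 + 0.20}{5090.57} = \frac{39.66}{5090.57} \approx 0.008 \;(\text{about } 0.8\%),
\qquad \frac{5090.57\ \text{s}}{17\ \text{volumes}} \approx 299\ \text{s} \approx 5.0\ \text{min per volume}
\]

In both runs the CPU accounts for less than 1% of the elapsed time. The summaries do not say how the remainder splits between reading the tapes and waiting for mounts, so that part still has to come from the JES message log.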

And tape datasets should not be specified with disp=shr. . . A tape volume cannot be shared. . .
All times are GMT + 6 Hours