Joined: 10 May 2007 Posts: 2454 Location: Hampshire, UK
As you are using SYNCSORT and this is the DFSORT part of the forum, you will not get far. SYNCSORT questions get posted in the JCL part of the forum. No doubt some kind moderator will move it.
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
If you contact Syncsort support, I'm sure they will be glad to help. If you search around in the JCL forum, not in the DFSORT forum, you will even find some appropriate contact e-mail addresses.
One thing I'd do, rather than writing out two files that are pretty much the same, with just a bit more data at the end of the record in one of them, is to write out one file and, if possible, change the programs reading the "new" file so they can read a bigger record but still do nothing with the extra bit of data.
EDIT: Sheesh, you are only copying. What do you do with the output files? Why is a COPY taking so much CPU? I suppose that is your question.
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
Yes, it seems odd to save 63 bytes of DASD per record, only to waste 24,000+ per block (if on DASD). The bigger rip-off then would be reading it back later.
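(Rough arithmetic, assuming 3390 geometry: a track holds 56,664 bytes, so a block just over half-track fits only once per track and leaves 24,000+ bytes of that track unusable.)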
You are doing 222 million "moves" of around 1,700 bytes each, just to shorten the records. There has to be a better way, even if only by putting in a dummy "IFTHEN" with IFOUTLEN= set to the desired record length.
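Something like this, as an untested sketch (assuming your SyncSort release accepts the DFSORT-style IFTHEN/OVERLAY/IFOUTLEN syntax; the WHEN=NONE overlay of byte 1 onto itself is only a do-nothing placeholder so that IFOUTLEN can set the shorter length without a full record rebuild):
Code:
  OUTFIL FNAMES=SORTOUT2,INCLUDE=(1709,10,PD,EQ,1000001),
         IFOUTLEN=1708,IFTHEN=(WHEN=NONE,OVERLAY=(1:1,1))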
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
If DASD, the poor blocking would cause the EXCP count to be very high (1 block is a physical I/O). So I would hazard a guess that (if DASD is the output; we have no idea, since pertinent info is missing: DD statements from the JCL, JESMSG info on I/O, what else is going on in the machine while this job runs) the job is waiting on I/O channel activity.
For someone with 133 posts, the TS has provided us with nothing except "My job is running a long time", and posted in the wrong forum.
I confirmed by checking the JCL that all three output datasets are on DASD, not tape. They have DCB=(RECFM=FB,LRECL=1745,BLKSIZE=0) coded, so the BLKSIZE is calculated by the system.
Sorry for the delayed response. Here is the expanded JCL for this step:
Code:
XX PARM='PGMN=SORT,ABEND=0004,DYNALLOC=(SYSDA,16),VSCORET=64M'
XXSYSPRINT DD SYSOUT=*
XXSYSOUT DD SYSOUT=*
XXSYSUDUMP DD SYSOUT=I
&&SORTIN DD DSN=&SORTIN.,DISP=SHR
XXSORTIN DD DSN=INVP.PRDM.INVOICE.MISC.UNLD.FULL,DISP=SHR
&&SORTOUT1 DD DSN=&SORTOUT1.,
&& DISP=(NEW,CATLG,DELETE),
&& SPACE=&SPACE,DATACLAS=DCCOMP,
&& DCB=(RECFM=FB,LRECL=1745,BLKSIZE=0)
XXSORTOUT1 DD DSN=INVP.PRDM.INVOICE.MISC.UNLD.TX,
XX DISP=(NEW,CATLG,DELETE),
XX SPACE=(CYL,(200,100),RLSE),DATACLAS=DCCOMP,
XX DCB=(RECFM=FB,LRECL=1745,BLKSIZE=0)
&&SORTOUT2 DD DSN=&SORTOUT2.,
&& DISP=(NEW,CATLG,DELETE),
&& SPACE=&SPACE,DATACLAS=DCCOMP,
&& DCB=(RECFM=FB,LRECL=1708,BLKSIZE=0)
XXSORTOUT2 DD DSN=INVP.PRDM.INVOICE.MISC.UNLD.NEW,
XX DISP=(NEW,CATLG,DELETE),
XX SPACE=(CYL,(200,100),RLSE),DATACLAS=DCCOMP,
XX DCB=(RECFM=FB,LRECL=1708,BLKSIZE=0)
&&SORTOUT3 DD DSN=&SORTOUT3.,
&& DISP=(NEW,CATLG,DELETE),
&& SPACE=&SPACE,DATACLAS=&DATACLAS,
&& DCB=(RECFM=FB,LRECL=1682,BLKSIZE=0)
XXSORTOUT3 DD DSN=INVP.PRDM.INVOICE.MISC.UNLD.IL,
XX DISP=(NEW,CATLG,DELETE),
XX SPACE=(CYL,(200,100),RLSE),DATACLAS=DCCOMP,
XX DCB=(RECFM=FB,LRECL=1682,BLKSIZE=0)
&&SYSIN DD DSN=&PARMLIB.(&MEM.),
&& DISP=SHR
XXSYSIN DD DSN=NBSP.WCC.PARMLIB(WC3SE63B),
XX DISP=SHR
SORT FIELDS=COPY
OUTFIL FNAMES=SORTOUT1,INCLUDE=(1709,10,PD,NE,1000001)
OUTFIL FNAMES=SORTOUT2,INCLUDE=(1709,10,PD,EQ,1000001),
  OUTREC=(1:1,1708)
OUTFIL FNAMES=SORTOUT3,INCLUDE=(1709,10,PD,EQ,1000001),
  OUTREC=(1:1,1682)
JESYSMSG:
Code:
IEF373I STEP/DB2ABEND/START 2012078.1109
IEF374I STEP/DB2ABEND/STOP 2012078.1109 CPU 0MIN 00.00SEC SRB 0MIN 00.0
IEF236I ALLOC. FOR WRTGMR56 PS0100 JS0200
IGD103I SMS ALLOCATED TO DDNAME JOBLIB
IGD103I SMS ALLOCATED TO DDNAME
IGD103I SMS ALLOCATED TO DDNAME
IEF237I JES2 ALLOCATED TO SYSPRINT
IEF237I JES2 ALLOCATED TO SYSOUT
IEF237I JES2 ALLOCATED TO SYSUDUMP
IGD103I SMS ALLOCATED TO DDNAME SORTIN
IGD17070I DATA SET INVP.PRDM.INVOICE.MISC.UNLD.TX
ALLOCATED SUCCESSFULLY WITH 1 STRIPE(S).
IGD17160I DATA SET INVP.PRDM.INVOICE.MISC.UNLD.TX
IS ELIGIBLE FOR COMPRESSION
IGD101I SMS ALLOCATED TO DDNAME (SORTOUT1)
        DSN (INVP.PRDM.INVOICE.MISC.UNLD.TX )
        STORCLAS (PRD1000S) MGMTCLAS (PRDLG) DATACLAS (DCCOM)
        VOL SER NOS= 3SS#JB
IGD17070I DATA SET INVP.PRDM.INVOICE.MISC.UNLD.NEW
ALLOCATED SUCCESSFULLY WITH 1 STRIPE(S).
IGD17160I DATA SET INVP.PRDM.INVOICE.MISC.UNLD.NEW
IS ELIGIBLE FOR COMPRESSION
IGD101I SMS ALLOCATED TO DDNAME (SORTOUT2)
        DSN (INVP.PRDM.INVOICE.MISC.UNLD.NEW )
        STORCLAS (PRD1000S) MGMTCLAS (PRDLG) DATACLAS (DCCOM)
        VOL SER NOS= 3SS#1R
IGD17070I DATA SET INVP.PRDM.INVOICE.MISC.UNLD.IL
ALLOCATED SUCCESSFULLY WITH 1 STRIPE(S).
IGD17160I DATA SET INVP.PRDM.INVOICE.MISC.UNLD.IL
IS ELIGIBLE FOR COMPRESSION
IGD101I SMS ALLOCATED TO DDNAME (SORTOUT3)
        DSN (INVP.PRDM.INVOICE.MISC.UNLD.IL )
        STORCLAS (PRD1000S) MGMTCLAS (PRDLG) DATACLAS (DCCOM)
        VOL SER NOS= 3SSLQ2
IGD103I SMS ALLOCATED TO DDNAME SYSIN
ACC20210-A ADDVOL FOR DD=SORTOUT2 DSN=INVP.PRDM.INVOICE.MISC.UNLD.NEW VOL=3SS
ACC20600-A VOLUME * WAS ADDED TO DATA SET
ACC20210-A ADDVOL FOR DD=SORTOUT3 DSN=INVP.PRDM.INVOICE.MISC.UNLD.IL VOL=3SSLQ2
ACC20600-A VOLUME * WAS ADDED TO DATA SET
IEF142I WRTGMR56 PS0100 JS0200 - STEP WAS EXECUTED - COND CODE 0000
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
Looking further led to my previous question about what you are actually doing with the files.
The little one obviously makes sense to have separate at some point. The bigger ones, less so (now that we know they aren't on tape, going to different locations, or similar).
How is the input being created? Does it come out of a SORT/TOOL step at any point? That could be a good time to create the small file.
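For example, if the unload ends in a sort step, a second OUTFIL there would peel the small file off in the same pass at almost no extra cost. A sketch only, with made-up DD names, keeping whatever SORT statement the step already has:
Code:
* existing SORT statement stays as-is
  OUTFIL FNAMES=MAINOUT
  OUTFIL FNAMES=SMALLOUT,INCLUDE=(1709,10,PD,NE,1000001)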
The big output files are being processed at some point. How about reading the big input file for your first application, adding logic there to select the small sub-set off to a new file and ignore those records, and just going with the big file for the rest of the processing, as mentioned above.
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
I meant "big file" singular in the above. Just use the main input file for the rest of the processing. You are "saving" little bits of DASD (subject to your blocking problem) yet having three copies of 111 million records of approx. 1700 bytes!
If you just use the main input file in place of the two you are creating, you save all that time/processing/cost and a whole heap of DASD, backup time/media, JCL simplicity, design simplicity, etc.
The best way to "save" CPU/IO is not usually by "tuning" as such, but by working out how things which are being done don't need to be done.
Pity I don't have some way to charge you for all those savings... :-)
I checked: the 3rd file, with LRECL 1682, can be avoided. There is an Easytrieve program which uses the 3rd file to get some data from it, and it can get that data from the bigger file as well. Thanks a ton!
No problem. Now, if you can just use the big file instead of the 1708 one as well, and get the little file created somewhere else...
We need at least the first two files for our programs to work.
Anyway, the little file has to be created in a downstream job, as it is also required by another program, so it will consume the same CPU later in that step.
But I guess the advantage here is that if we create the little file up front, then SORTOUT2 will have fewer records for the EZT pgm to process in the same job; it will have the records of the little file (SORTOUT1) eliminated in the sort step. Am I correct?
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
The following is extracted from what you posted earlier.
Code:
SORT FIELDS=COPY
OUTFIL FNAMES=SORTOUT1,INCLUDE=(1709,10,PD,NE,1000001)
OUTFIL FNAMES=SORTOUT2,INCLUDE=(1709,10,PD,EQ,1000001),
  OUTREC=(1:1,1708)
OUTFIL FNAMES=SORTOUT3,INCLUDE=(1709,10,PD,EQ,1000001),
  OUTREC=(1:1,1682)
WER108I SORTIN : RECFM=FB ; LRECL= 1745; BLKSIZE= 27920
WER110I SORTOUT1 : RECFM=FB ; LRECL= 1745; BLKSIZE= 31410
WER110I SORTOUT2 : RECFM=FB ; LRECL= 1708; BLKSIZE= 32452
WER110I SORTOUT3 : RECFM=FB ; LRECL= 1682; BLKSIZE= 31958
WER405I SORTOUT1 : DATA RECORDS OUT 78437; TOTAL RECORDS OUT 78437
WER405I SORTOUT2 : DATA RECORDS OUT 111363849; TOTAL RECORDS OUT 111363849
WER405I SORTOUT3 : DATA RECORDS OUT 111363849; TOTAL RECORDS OUT 111363849
You have one input file with 111 million records on it.
You have three output files.
The first output file contains 78,437 records. That is about 0.07% of the input. The Easytrieve will hardly notice if it has to read those extra records and ignore them.
The two large output files are almost the same as the large input file, just missing a little bit at the back: 37 bytes for the first and 63 for the second. If you change whatever uses those files to work with the little extra length, then you save yourself writing and storing those two files.
Now, you don't have to read the main input specifically to create the small file. In the first place where a program has to read the big file anyway, you can amend it to also create the small file. This will add very little processing time to that program and hardly overburden it with complexity.
So, if you run everything off your existing main input file, creating the small file the first time the main input file is read by a program (or at least some time before you need to read the small file) then:
You change all programs to operate with the size of the existing main file
You need new code to create the small file from somewhere where the main input file is already being read
You need code to ignore the records of the small file, as they now remain on the big file
In exchange for the above work, you completely remove the job which is running slowly. The additional resources necessary on a daily basis are just the minimal stuff to identify and write the records to the small file from an application program of your choice which is already reading the main input file.
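Even if no application program can conveniently be amended, a cut-down copy step that writes only the small file would do (a sketch, reusing the INCLUDE from your existing control cards; SMALL is a made-up ddname). It still reads the 111 million records, but writes only 78,437 of them, dropping the two big writes and the 222 million shortening moves:
Code:
  SORT FIELDS=COPY
  OUTFIL FNAMES=SMALL,INCLUDE=(1709,10,PD,NE,1000001)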
To get these benefits, you have minor coding changes to application programs.
These benefits are, in a year, about a month of elapsed time and five-and-a-half days of CPU time saved. Plus you save on holding two additional copies of the vast majority of your main input file on DASD. I just tried to calculate how much that would be in cylinders and broke my calculator.
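(A rough go at it, assuming plain 3390 geometry of 849,960 bytes per cylinder and ignoring your compression data class: 111,363,849 records at around 1,700 bytes is roughly 190 GB per copy, something over 220,000 cylinders each, before the one-block-per-track waste is even counted.)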
If there is any sort of genuine business, or technical, reason why that can't be done for a file containing 111 million 1700-byte records, I'd like to know it. The "designer" who came up with this in the first place should be encouraged to wear an absurd paper-hat at work for at least a month.
Just change all the programs to read the main file, ignoring any data they don't want. Ditch the SORT/COPY. Create the small file the first time a big file is read. OK, if that can't be done while keeping your programs "working", what more can I say? :-)