Indrajit_57 Warnings : 1 New User
Joined: 27 Jun 2006 Posts: 60
|
|
|
|
Hi,
I am trying to merge records from one file into another. The first input file is very small (no more than 1000 records); the second is very large (up to 30 million records). Both files have the same record length and format (13667/FB). I am using the following step to merge, but the job abends with reason SORTIN02 OUT OF SEQ. I know this is because the output file is the same as SORTIN02, so the records may no longer be in sorted order once the merge starts. The reason for using the same file name as SORTIN02 is to save DASD space; creating a new file altogether may give a SPACE abend.
Code: |
//STEP050 EXEC PGM=SORT,COND=(0,NE)
//SORTIN01 DD DSN=Infile1,DISP=SHR
//SORTIN02 DD DSN=Infile2,DISP=SHR
//SORTOUT DD DSN=Infile2,DISP=SHR
//SYSPRINT DD SYSOUT=*
//SYSOUT DD SYSOUT=*
//SYSUDUMP DD SYSOUT=D
//SYSIN DD *
  MERGE FIELDS=(1,27,CH,A)
  OPTION NOEQUALS
/* |
When I use a new file in SORTOUT, with 1.1 million records in Infile 2 and around 1000 records in Infile 1, the job takes around 3 minutes to complete, with high CPU. My problem is that with 30 million records (in production), the elapsed time and CPU will be almost 30 times higher than in this run with sample data. Below is the complete sysout (with 1.1 million records in Infile 2):
Code: |
MERGE FIELDS=(1,27,CH,A)
OPTION NOEQUALS
WER276B SYSDIAG= 493223, 1761317, 1761317, 4220850
WER164B 6,852K BYTES OF VIRTUAL STORAGE AVAILABLE, MAX REQUESTED,
WER164B 64K BYTES RESERVE REQUESTED, 3,296K BYTES USED
WER146B 64K BYTES OF EMERGENCY SPACE ALLOCATED
WER109I MERGE INPUT : TYPE=F; LRECL= 13667
WER110I SORTOUT : RECFM=FB ; LRECL= 13667; BLKSIZE= 27334
WER410B 5,824K BYTES OF VIRTUAL STORAGE AVAILABLE ABOVE THE 16MEG LINE,
WER410B 0 BYTES RESERVE REQUESTED, 3,128K BYTES USED
WER209B 1,500 PRIMARY AND 3,000 SECONDARY SORTOUT TRACKS ALLOCATED, 3,264 USED
WER211B SYNCSMF CALLED BY SYNCSORT; RC=0000
WER449I SYNCSORT GLOBAL DSM SUBSYSTEM ACTIVE
WER416B SORTIN : EXCP'S=32369
WER416B SORTOUT : EXCP'S=32428,UNIT=3390,DEV=A08C,CHP=(C0C4C8CCD0D4D8DC,1),VO
WER416B TOTAL OF 64,797 EXCP'S ISSUED FOR MERGING
WER054I RCD IN 1165056, OUT 1165056
WER072I NOEQUALS, BALANCE IN EFFECT
WER169I RELEASE 1.3 BATCH 0506 TPF LEVEL 2.1 |
Code: |
CPU=22.11 ELAPSED=3:04.05 I/O=66,247 |
Is there any efficient way of doing this? |
|
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
It is an extremely bad idea to use the same DSN for SORTOUT as one of the input datasets. In this particular case you have trashed your file.
If you are concerned about DASD use, the answer is going to depend on just how concerned.
If completely concerned, copy your input to "tape" and delete the input. Merge from "tape" and DASD for the small file.
If concerned but it is OK for a couple of hours, backup to "tape" after the Merge, and outside the Critical Path. Then delete the input. Produce your JCL so that it can be run either from DASD directly, or by restoring to DASD first. |
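A rough sketch of the first option above (copy to "tape", delete the DASD copy, then merge from "tape"). It is untested and rests on assumptions: Infile2.TAPE is a made-up dataset name, UNIT=TAPE is a site-specific unit name that may differ on your system, and the SPACE values are placeholders.

```jcl
//* Step 1: copy the large file to tape; the DASD copy is deleted
//* only if this step ends normally (KEEP on abnormal end).
//TOTAPE   EXEC PGM=SORT
//SORTIN   DD DSN=Infile2,DISP=(OLD,DELETE,KEEP)
//SORTOUT  DD DSN=Infile2.TAPE,DISP=(NEW,CATLG,DELETE),UNIT=TAPE
//SYSOUT   DD SYSOUT=*
//SYSIN    DD *
  SORT FIELDS=COPY
/*
//* Step 2: merge the small DASD file with the tape copy into a
//* new DASD dataset that reuses the original name.
//STEP050  EXEC PGM=SORT,COND=(0,NE)
//SORTIN01 DD DSN=Infile1,DISP=SHR
//SORTIN02 DD DSN=Infile2.TAPE,DISP=OLD
//SORTOUT  DD DSN=Infile2,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(200,200),RLSE)
//SYSOUT   DD SYSOUT=*
//SYSIN    DD *
  MERGE FIELDS=(1,27,CH,A)
  OPTION NOEQUALS
/*
```

With this layout the DASD never holds two full copies of the large file at once, and the tape copy remains as a backup if the merge step fails.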
|
Indrajit_57 Warnings : 1 New User
Joined: 27 Jun 2006 Posts: 60
|
|
|
|
Hi Bill,
I realized that using the same DSN in SORTOUT is a bad idea when I got the OUT OF SEQ error message. So I used a different file name in SORTOUT. But my concern now is the CPU time and the elapsed time that the Merge is taking.
Is there any way by which I can optimize the Merge? Can using some parameter be helpful?
Thanks,
Indrajit |
|
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
Well, you have OPTION NOEQUALS, which is good, as long as you don't mind which order the data is taken from the individual input files when keys are equal.
A MERGE should be pretty zippy. Have a look in your manual for information on performance tuning, but I don't think you'll find a magic bullet. You have a lot of data. It is going to take the time it is going to take.
I don't know about SyncSort, but DFSORT ignores any BUFNO you specify. You could experiment. |
|
dick scherrer
Moderator Emeritus

Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
|
|
|
|
Hello,
Is all of the data online and available or does media need to be mounted (tape) or recalled due to migration?
Merging 2 files that are in sequence should run very quickly.
How long does it take to simply read both files - no merge? |
|
Indrajit_57 Warnings : 1 New User
Joined: 27 Jun 2006 Posts: 60
|
|
|
|
Hi Dick,
All the data is available online (on DASD). When I read the files (no merging), Infile1 (around 1500 records) takes 2 seconds and Infile 2 (1.1 million records) takes around 58 seconds. However, the merge takes 3 minutes.
Thanks,
Indrajit |
|
Indrajit_57 Warnings : 1 New User
Joined: 27 Jun 2006 Posts: 60
|
|
|
|
Hi,
I updated my JCL as below and the job now runs within seconds. I am getting the expected output. But is this the right way? Will I get unexpected results with a different set of inputs?
Code: |
//STEP050 EXEC PGM=SORT,COND=(0,NE)
//SORTIN DD DSN=Infile1,DISP=SHR
//SORTOUT DD DSN=Infile2,DISP=MOD
//SYSPRINT DD SYSOUT=*
//SYSOUT DD SYSOUT=*
//SYSUDUMP DD SYSOUT=D
//SYSIN DD *
  SORT FIELDS=(1,27,CH,A)
  OPTION NOEQUALS
/* |
|
|
Anuj Dhawan
Superior Member

Joined: 22 Apr 2006 Posts: 6248 Location: Mumbai, India
|
|
|
|
Quote: |
Will I get some unexpected results with different set of inputs. |
As you ask for it - No, but why do you ask? |
|
gcicchet
Senior Member
Joined: 28 Jul 2006 Posts: 1702 Location: Australia
|
|
|
|
Hi,
so you are no longer merging records ?
Gerry |
|
Indrajit_57 Warnings : 1 New User
Joined: 27 Jun 2006 Posts: 60
|
|
|
|
I just realized that with the above code, the records are getting appended at the end of the file and not appearing in the Sorted order. So basically my purpose is not solved.  |
|
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
Yes, DISP=MOD and those Control Cards are going to give you nothing except the data from the small file, in sorted order, appended to the original file.
To add to the lack of benefit, they were already in sorted order, so you've even expended pointless resources in getting the wrong result.
MERGE is fast. You already have OPTION NOEQUALS. Perhaps contact SyncSort support and see if there is anything they can suggest? |
|
dick scherrer
Moderator Emeritus

Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
|
|
|
|
Hello,
If you have not already done so, change the output DSN to a DSN that is NOT one of the input files.
Post the run time for this. |
|
Indrajit_57 Warnings : 1 New User
Joined: 27 Jun 2006 Posts: 60
|
|
|
|
Hi Dick,
In my very first post, I provided the run time along with CPU time and SYSOUT details, which was produced by using a different DSN in SORTOUT.
Thanks,
Indrajit |
|
dick scherrer
Moderator Emeritus

Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
|
|
|
|
Hello,
Quote: |
How long does it take to simply read both files - no merge? |
Quote: |
All the data is available online (on DASD). When I read the files (no merging), Infile1 (around 1500 records) takes 2 seconds and Infile 2 (1.1 million records) takes around 58 seconds. However, the merge takes 3 minutes. |
Next, please run these 2 tests copying the data, not just reading it. |
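The copy test asked for above could be run with a step like the sketch below. It is untested; Infile2.COPY is a made-up output name and the SPACE values are placeholders. SORT FIELDS=COPY copies SORTIN to SORTOUT without sorting or merging.

```jcl
//* Copy-only timing test: no sort, no merge, just read and write.
//COPYTST  EXEC PGM=SORT
//SORTIN   DD DSN=Infile2,DISP=SHR
//SORTOUT  DD DSN=Infile2.COPY,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(200,200),RLSE)
//SYSOUT   DD SYSOUT=*
//SYSIN    DD *
  SORT FIELDS=COPY
/*
```

Run it once with Infile1 and once with Infile2 as SORTIN, then compare the CPU and elapsed times with the merge run. If the copy of Infile2 alone takes close to 3 minutes, most of the time is going on I/O rather than on the merge itself.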
|
Dale Robertson
New User

Joined: 21 Jun 2013 Posts: 44 Location: U.S.A.
|
|
|
|
Indrajit_57,
It's like a hippo on a submarine - there's no getting around it. You must boink the previous version then allocate a new one or your results will be poobah!
"Taxation without representation is Poobah!"
--MAD Magazine - 1956
Code: |
//STEP050D EXEC PGM=IEFBR14
//DD1 DD DSN=Infile3,DISP=(MOD,DELETE),SPACE=(TRK,0)
//*
//STEP050 EXEC PGM=SORT,COND=(0,NE)
//SORTIN01 DD DSN=Infile1,DISP=SHR
//SORTIN02 DD DSN=Infile2,DISP=SHR
//SORTOUT DD DSN=Infile3,DISP=(,CATLG,DELETE),
// SPACE=(CYL,(200,200),RLSE) or whatever
//SYSOUT DD SYSOUT=*
//SYSIN DD *
  MERGE FIELDS=(1,27,CH,A)
  OPTION NOEQUALS
/* |
Code'd |
|