Help needed on SYNCSORT MERGE


Indrajit_57
Warnings : 1

New User


Joined: 27 Jun 2006
Posts: 60

Posted: Thu Apr 11, 2013 3:37 pm

Hi,

I am trying to merge records from one file into another file. The first input file is very small (no more than 1,000 records); the second input file is very large (it can have 30 million records). Both files have the same record length and format (13667/FB). I am using the following step to merge, but the job abends with the reason SORTIN02 OUT OF SEQ. I think this is because the output file is the same as SORTIN02, so the records may no longer be in sorted order once the merge starts writing. I used the same file name as SORTIN02 to save DASD space; creating a new file altogether may give a SPACE abend.

Code:
//STEP050  EXEC PGM=SORT,COND=(0,NE)                                   
//SORTIN01 DD DSN=Infile1,DISP=SHR
//SORTIN02 DD DSN=Infile2,DISP=SHR
//SORTOUT  DD DSN=Infile2,DISP=SHR
//SYSPRINT DD SYSOUT=*                                                 
//SYSOUT   DD SYSOUT=*                                                 
//SYSUDUMP DD SYSOUT=D                                                 
//SYSIN    DD *

   MERGE FIELDS=(1,27,CH,A)     
   OPTION NOEQUALS             
/*


When I use a new file in SORTOUT, with 1.1 million records in Infile 2 and around 1,000 records in Infile 1, the job takes around 3 minutes to complete, with high CPU. My concern is that with 30 million records (in production), the elapsed time and CPU will be much higher, almost 30 times those of the run with sample data. Below is the complete sysout (with 1.1 million records in Infile 2).

Code:
MERGE FIELDS=(1,27,CH,A)                                                       
OPTION NOEQUALS                                                               
WER276B  SYSDIAG= 493223, 1761317, 1761317, 4220850                           
WER164B  6,852K BYTES OF VIRTUAL STORAGE AVAILABLE, MAX REQUESTED,             
WER164B     64K BYTES RESERVE REQUESTED, 3,296K BYTES USED                     
WER146B  64K BYTES OF EMERGENCY SPACE ALLOCATED                               
WER109I  MERGE INPUT  :   TYPE=F; LRECL= 13667                                 
WER110I  SORTOUT  : RECFM=FB   ; LRECL= 13667; BLKSIZE= 27334                 
WER410B  5,824K BYTES OF VIRTUAL STORAGE AVAILABLE ABOVE THE 16MEG LINE,       
WER410B     0 BYTES RESERVE REQUESTED, 3,128K BYTES USED                       
WER209B  1,500 PRIMARY AND 3,000 SECONDARY SORTOUT TRACKS ALLOCATED, 3,264 USED
WER211B  SYNCSMF  CALLED BY SYNCSORT; RC=0000                                 
WER449I  SYNCSORT GLOBAL DSM SUBSYSTEM ACTIVE                                 
WER416B  SORTIN   : EXCP'S=32369                                               
WER416B  SORTOUT  : EXCP'S=32428,UNIT=3390,DEV=A08C,CHP=(C0C4C8CCD0D4D8DC,1),VO
WER416B  TOTAL OF 64,797 EXCP'S ISSUED FOR MERGING                             
WER054I  RCD IN    1165056, OUT    1165056                                     
WER072I  NOEQUALS, BALANCE IN EFFECT                                           
WER169I  RELEASE 1.3 BATCH 0506 TPF LEVEL 2.1


Code:
CPU=22.11 ELAPSED=3:04.05 I/O=66,247     


Is there any efficient way of doing this?
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

Posted: Thu Apr 11, 2013 4:21 pm

It is an extremely bad idea to use the same DSN for SORTOUT as one of the input datasets. In this particular case you have trashed your file.

If you are concerned about DASD use, the answer is going to depend on just how concerned.

If completely concerned, copy your input to "tape" and delete the input. Then merge from "tape" for the large file and from DASD for the small file.

If concerned but it is OK for a couple of hours, backup to "tape" after the Merge, and outside the Critical Path. Then delete the input. Produce your JCL so that it can be run either from DASD directly, or by restoring to DASD first.
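
A rough sketch of the first option (copy to "tape", delete, then merge) might look like the following. The dataset name Infile2.TAPE, the UNIT=TAPE and UNIT=SYSDA names, and the SPACE figures are assumptions for illustration only and are site-specific; a file of this size may also need multi-volume allocation. The DASD copy is deleted only if the copy step ends with return code 0.

```jcl
//* STEP 1: COPY LARGE FILE TO "TAPE" (UNIT NAME IS ASSUMED)
//TAPECOPY EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=Infile2,DISP=SHR
//SORTOUT  DD DSN=Infile2.TAPE,DISP=(NEW,CATLG,DELETE),
//            UNIT=TAPE
//SYSIN    DD *
   SORT FIELDS=COPY
/*
//* STEP 2: DELETE THE DASD COPY ONLY IF THE TAPE COPY ENDED RC=0
//DELDASD  EXEC PGM=IEFBR14,COND=(0,NE)
//DD1      DD DSN=Infile2,DISP=(OLD,DELETE,KEEP)
//* STEP 3: MERGE SMALL FILE (DASD) WITH LARGE FILE ("TAPE")
//MERGE    EXEC PGM=SORT,COND=(0,NE)
//SYSOUT   DD SYSOUT=*
//SORTIN01 DD DSN=Infile1,DISP=SHR
//SORTIN02 DD DSN=Infile2.TAPE,DISP=OLD
//SORTOUT  DD DSN=Infile2,DISP=(NEW,CATLG,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(200,200),RLSE)
//SYSIN    DD *
   MERGE FIELDS=(1,27,CH,A)
   OPTION NOEQUALS
/*
```

If the merge step fails, the large file still exists on "tape" and can be restored or reused as input for a rerun.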
Indrajit_57
Warnings : 1

New User


Joined: 27 Jun 2006
Posts: 60

Posted: Thu Apr 11, 2013 4:34 pm

Hi Bill,

I realized that using the same DSN in SORTOUT is a bad idea when I got the OUT OF SEQ error message. So I used a different file name in SORTOUT. But my concern now is the CPU time and the elapsed time that the Merge is taking.

Is there any way by which I can optimize the Merge? Can using some parameter be helpful?

Thanks,
Indrajit
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

Posted: Thu Apr 11, 2013 5:19 pm

Well, you have OPTION NOEQUALS, which is good, as long as you don't mind which order the data is taken from the individual input files when keys are equal.

A MERGE should be pretty zippy. Have a look in your manual for information on performance tuning, but I don't think you'll find a magic bullet. You have a lot of data. It is going to take the time it is going to take.

I don't know about SyncSort, but DFSORT ignores any BUFNO you specify. You could experiment.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

Posted: Thu Apr 11, 2013 8:05 pm

Hello,

Is all of the data online and available or does media need to be mounted (tape) or recalled due to migration?

Merging 2 files that are in sequence should run very quickly.

How long does it take to simply read both files - no merge?
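
One possible way to time a pure read is a copy with a dummy output, so the records are read but nothing is written to DASD. This is only a sketch: it assumes the sort product accepts SORTOUT DD DUMMY for a copy (check the manual), and the step name is hypothetical. Run it once per input file.

```jcl
//* READ-ONLY TIMING TEST: RECORDS ARE READ, OUTPUT IS DISCARDED
//READTST  EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=Infile2,DISP=SHR
//SORTOUT  DD DUMMY
//SYSIN    DD *
   SORT FIELDS=COPY
/*
```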
Indrajit_57
Warnings : 1

New User


Joined: 27 Jun 2006
Posts: 60

Posted: Fri Apr 12, 2013 9:01 am

Hi Dick,

All the data is available online (on DASD). When I only read the files (no merging), Infile1 (around 1,500 records) takes 2 seconds and Infile 2 (1.1 million records) takes around 58 seconds. However, the merge takes 3 minutes.

Thanks,
Indrajit
Indrajit_57
Warnings : 1

New User


Joined: 27 Jun 2006
Posts: 60

Posted: Fri Apr 12, 2013 9:45 am

Hi,

I updated my JCL as below and the job now runs within seconds. I am getting the expected output. But is this the right way? Will I get unexpected results with a different set of inputs?

Code:
//STEP050  EXEC PGM=SORT,COND=(0,NE)                                   
//SORTIN DD DSN=Infile1,DISP=SHR
//SORTOUT  DD DSN=Infile2,DISP=MOD
//SYSPRINT DD SYSOUT=*                                                 
//SYSOUT   DD SYSOUT=*                                                 
//SYSUDUMP DD SYSOUT=D                                                 
//SYSIN    DD *

   SORT FIELDS=(1,27,CH,A)     
   OPTION NOEQUALS             
/*
Anuj Dhawan

Superior Member


Joined: 22 Apr 2006
Posts: 6250
Location: Mumbai, India

Posted: Fri Apr 12, 2013 10:05 am

Quote:
Will I get some unexpected results with different set of inputs.
As you ask for it - No, but why do you ask?
gcicchet

Senior Member


Joined: 28 Jul 2006
Posts: 1702
Location: Australia

Posted: Fri Apr 12, 2013 10:24 am

Hi,

so you are no longer merging records ?


Gerry
Indrajit_57
Warnings : 1

New User


Joined: 27 Jun 2006
Posts: 60

Posted: Fri Apr 12, 2013 10:27 am

I just realized that with the above code, the records are appended at the end of the file and do not appear in sorted order. So basically my purpose is not solved.
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

Posted: Fri Apr 12, 2013 12:08 pm

Yes, DISP=MOD and those Control Cards are going to give you nothing except the data from the small file, in sorted order, appended to the original file.

To add to the lack of benefit, they were already in sorted order, so you've even expended pointless resources in getting the wrong result.

MERGE is fast. You already have OPTION NOEQUALS. Perhaps contact SyncSort support and see if there is anything they can suggest?
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

Posted: Fri Apr 12, 2013 11:46 pm

Hello,

If you have not already done so, change the output DSN to a DSN that is NOT one of the input files.

Post the run time for this.
Indrajit_57
Warnings : 1

New User


Joined: 27 Jun 2006
Posts: 60

Posted: Sat Apr 13, 2013 8:49 am

Hi Dick,

In my very first post, I provided the run time along with CPU time and SYSOUT details, which was produced by using a different DSN in SORTOUT.

Thanks,
Indrajit
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

Posted: Sun Apr 14, 2013 8:19 am

Hello,

Quote:
How long does it take to simply read both files - no merge?

Quote:
All the data is available online (on DASD). When I only read the files (no merging), Infile1 (around 1,500 records) takes 2 seconds and Infile 2 (1.1 million records) takes around 58 seconds. However, the merge takes 3 minutes.
Next, please run these 2 tests copying the data, not just reading it.
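
A copy test for the large file could be sketched as below; repeat it with Infile1 as SORTIN for the second test. The step name, the scratch dataset name Infile2.COPYTEST, UNIT=SYSDA, and the SPACE values are assumptions for illustration, and the scratch dataset is deleted at step end either way.

```jcl
//* COPY TIMING TEST: READ Infile2 AND WRITE IT TO A SCRATCH DATASET
//COPYTST  EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=Infile2,DISP=SHR
//SORTOUT  DD DSN=Infile2.COPYTEST,DISP=(NEW,DELETE,DELETE),
//            UNIT=SYSDA,SPACE=(CYL,(200,200),RLSE)
//SYSIN    DD *
   SORT FIELDS=COPY
/*
```

If the copy takes about as long as the merge, the elapsed time is most likely dominated by reading and writing the data rather than by the merge logic itself.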
Dale Robertson

New User


Joined: 21 Jun 2013
Posts: 44
Location: U.S.A.

Posted: Tue Jun 25, 2013 7:37 pm

Indrajit_57,

It's like a hippo on a submarine - there's no getting around it. You must boink the previous version then allocate a new one or your results will be poobah!

"Taxation without representation is Poobah!"
--MAD Magazine - 1956

Code:
//STEP050D  EXEC PGM=IEFBR14
//DD1  DD DSN=Infile3,DISP=(MOD,DELETE),SPACE=(TRK,0)
//*
//STEP050  EXEC PGM=SORT,COND=(0,NE)                                   
//SORTIN01 DD DSN=Infile1,DISP=SHR
//SORTIN02 DD DSN=Infile2,DISP=SHR
//SORTOUT  DD DSN=Infile3,DISP=(,CATLG,DELETE),
//    SPACE=(CYL,(200,200),RLSE) or whatever
//SYSOUT   DD SYSOUT=*
//SYSIN    DD *

   MERGE FIELDS=(1,27,CH,A)     
   OPTION NOEQUALS             
/*
Code'd
All times are GMT + 6 Hours