IBM Mainframe Forum Index
 

Splitting 1 dataset into multiple datasets


Jay Villaverde

New User


Joined: 08 Mar 2014
Posts: 27
Location: USA

Posted: Wed Apr 16, 2014 9:10 pm

Hi. Is there a way to split a large file into smaller files using SyncSort/ICETOOL? I'd like to be able to set a certain number of records and have it create as many datasets as it needs based on that number.

Thanks

Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

Posted: Wed Apr 16, 2014 9:18 pm

SyncSort topics live in the JCL forum.

Have you looked at the various SPLIT* options of OUTFIL?

Jay Villaverde

New User


Joined: 08 Mar 2014
Posts: 27
Location: USA

Posted: Wed Apr 16, 2014 9:24 pm

Sorry for the wrong forum. I looked a bit at OUTFIL, but that seems to be done with INCLUDE conditions. I just want to say: after x records, create a file, then after the next set of x records create file 2, etc. There may be more that OUTFIL can do that I'm not familiar with, so I'll keep looking into that option.

Thanks

Arun Raj

Moderator


Joined: 17 Oct 2006
Posts: 2481
Location: @my desk

Posted: Wed Apr 16, 2014 9:38 pm

If the number of output datasets is going to be unknown, you may need to build the job dynamically based on the input record count. But this involves multiple passes of the data.

REXX could be a better option: read 'x' records from the input, allocate a new output dataset and write into it, and repeat until end-of-input.

Jay Villaverde

New User


Joined: 08 Mar 2014
Posts: 27
Location: USA

Posted: Wed Apr 16, 2014 9:39 pm

Thanks, I'll check into REXX. Sounds like it's what I want to have happen.

nevilh

Active User


Joined: 01 Sep 2006
Posts: 262

Posted: Wed Apr 16, 2014 10:24 pm

Rather than write your own REXX, why not try using IDCAMS REPRO with the SKIP and COUNT parameters?
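
For reference, a minimal sketch of what one REPRO step with SKIP and COUNT could look like. The step name, data set names, space values, and the RECFM/LRECL attributes are placeholders, not taken from this thread. Each output chunk needs its own REPRO command and DD statement, so the number of chunks still has to be known in advance.

Code:
//REPRO2   EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//* HYPOTHETICAL NAMES AND ATTRIBUTES - ADJUST TO YOUR OWN DATA SETS
//INDD     DD DISP=SHR,DSN=MY.INPUT.FILE
//OUTDD2   DD DSN=MY.OUTPUT.PART2,DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(50,50),RLSE),
//            RECFM=FB,LRECL=80
//* SECOND CHUNK: SKIP 1,000,000 RECORDS, THEN COPY THE NEXT 1,000,000
//SYSIN    DD *
  REPRO INFILE(INDD) -
        OUTFILE(OUTDD2) -
        SKIP(1000000) -
        COUNT(1000000)
/*

The first chunk would use COUNT alone, and each later chunk would raise SKIP by the chunk size.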

Jay Villaverde

New User


Joined: 08 Mar 2014
Posts: 27
Location: USA

Posted: Wed Apr 16, 2014 10:49 pm

I'm not familiar with REPRO, but I tried the following from an example I found online. Unfortunately it did not work.

Code:
//COPYDATA JOB (31000,G5),'COPY',CLASS=O,MSGCLASS=2,LINES=500000,       
//         NOTIFY=&SYSUID TYPRUN=HOLD                                   
//**********************************************************************
//S1SORT   EXEC PGM=IDCAMS,REGION=3072K                                 
//**********************************************************************
//SORTIN   DD DISP=SHR,DSN=EDT.TST.RXD.UR078074.D036                   
//**********************************************************************
//SORTOUT DD DISP=(NEW,CATLG,DELETE),                                   
//           DSN=TST.G5.UR078074.D036,                                 
//           SPACE=(CYL,(500,100),RLSE,,ROUND),UNIT=TEST,               
//*          LABEL=RETPD=999,                                           
//           DCB=*.SORTIN                                               
//SYSOUT  DD SYSOUT=6                                                   
//SYSIN   DD *                                                         
*                                                                       
REPRO -                                                                 
INFILE(SORTIN) -                                                       
OUTFILE(SORTOUT)                                                       
//                                                                     

Arun Raj

Moderator


Joined: 17 Oct 2006
Posts: 2481
Location: @my desk

Posted: Wed Apr 16, 2014 11:07 pm

nevilh,

I am afraid IDCAMS REPRO will NOT fit the OP's requirement.

The OP has a large input file to be split into a number of smaller output files, with a fixed 'x' number of records in each output file.

Jay Villaverde

New User


Joined: 08 Mar 2014
Posts: 27
Location: USA

Posted: Wed Apr 16, 2014 11:13 pm

I was able to get what I needed with the following code, for example, but it means knowing in advance how many files to create, which in turn depends on knowing the total record count of the original file.

Code:
//SRTTESTA JOB (31000,G5),'EAS',CLASS=F,MSGCLASS=2,           
//         NOTIFY=&SYSUID TYPRUN=HOLD                         
//S1SORT   EXEC PGM=SORT,REGION=3072K                         
//SORTIN   DD DISP=SHR,DSN=TST.WAY.G500414.X47CSV             
//OUT1    DD DISP=(NEW,CATLG,DELETE),                         
//           DSN=TST.WAY.G500414.X47CSV1,                     
//           SPACE=(CYL,(500,500),RLSE,,ROUND),UNIT=TEST,     
//           LABEL=RETPD=180,                                 
//           DCB=*.SORTIN                                     
//OUT2    DD DISP=(NEW,CATLG,DELETE),                         
//           DSN=TST.WAY.G500414.X47CSV2,                     
//           SPACE=(CYL,(500,500),RLSE,,ROUND),UNIT=TEST,     
//           LABEL=RETPD=180,                                 
//           DCB=*.SORTIN                                     
//OUT3    DD DISP=(NEW,CATLG,DELETE),                         
//           DSN=TST.WAY.G500414.X47CSV3,                     
//           SPACE=(CYL,(500,500),RLSE,,ROUND),UNIT=TEST,     
//           LABEL=RETPD=180,                                 
//           DCB=*.SORTIN                                 
//OUT4    DD DISP=(NEW,CATLG,DELETE),                     
//           DSN=TST.WAY.G500414.X47CSV4,                 
//           SPACE=(CYL,(500,500),RLSE,,ROUND),UNIT=TEST,
//           LABEL=RETPD=180,                             
//           DCB=*.SORTIN                                 
//OUT5    DD DISP=(NEW,CATLG,DELETE),                     
//           DSN=TST.WAY.G500414.X47CSV5,                 
//           SPACE=(CYL,(500,500),RLSE,,ROUND),UNIT=TEST,
//           LABEL=RETPD=180,                             
//           DCB=*.SORTIN                                 
//SYSOUT  DD SYSOUT=6                                     
//SYSIN   DD *                                           
  OPTION COPY                                             
  OUTFIL FNAMES=OUT1,ENDREC=00000020                     
  OUTFIL FNAMES=OUT2,STARTREC=00000021,ENDREC=00000040   
  OUTFIL FNAMES=OUT3,STARTREC=00000041,ENDREC=00000060   
  OUTFIL FNAMES=OUT4,STARTREC=00000061,ENDREC=00000080   
  OUTFIL FNAMES=OUT5,SAVE 

Arun Raj

Moderator


Joined: 17 Oct 2006
Posts: 2481
Location: @my desk

Posted: Wed Apr 16, 2014 11:25 pm

Jay Villaverde,

I think the SPLIT1R parameter is a better alternative for your example above.

Code:
  OPTION COPY
  OUTFIL FNAMES=(OUT1,OUT2,OUT3,OUT4,OUT5),SPLIT1R=20


But with a changing number of input records, your JCL still has to be dynamic, with a varying number of output files.
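
To make the behaviour concrete, here is the same statement annotated with where each record would land, assuming the DFSORT description of SPLIT1R (n contiguous records per output, one pass through the list, remainder to the last data set). Worth confirming against your own sort product's manual.

Code:
  OPTION COPY
* SPLIT1R=20 WRITES 20 CONTIGUOUS RECORDS TO EACH DDNAME IN TURN,
* MAKING A SINGLE PASS THROUGH THE LIST:
*   RECORDS  1-20     -> OUT1
*   RECORDS 21-40     -> OUT2
*   RECORDS 41-60     -> OUT3
*   RECORDS 61-80     -> OUT4
*   RECORDS 81 ONWARD -> OUT5 (THE LAST DATA SET TAKES THE REMAINDER)
  OUTFIL FNAMES=(OUT1,OUT2,OUT3,OUT4,OUT5),SPLIT1R=20

If the input has fewer than 81 records, the later data sets are simply left empty. If it has many more, OUT5 grows well beyond 20 records.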

Jay Villaverde

New User


Joined: 08 Mar 2014
Posts: 27
Location: USA

Posted: Wed Apr 16, 2014 11:28 pm

Thanks, I'll try SPLIT1R, which is more efficient.

Arun Raj

Moderator


Joined: 17 Oct 2006
Posts: 2481
Location: @my desk

Posted: Wed Apr 16, 2014 11:40 pm

But even SPLIT1R may not help if the input record count keeps changing every time.

This older topic HERE might be of some interest to you. But I'm sure it can be improved with the newer functions available in sort products these days.

Jay Villaverde

New User


Joined: 08 Mar 2014
Posts: 27
Location: USA

Posted: Wed Apr 16, 2014 11:44 pm

True, I will still need to have a general idea of the total records going in, but it's a start. Will check out that link.

Thanks

Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

Posted: Thu Apr 17, 2014 12:13 am

Some record counts would be good. Why do you want to split? Does it matter where the split is, or is there no relationship between records?

Has the file you want to split already been through a SORT (or something else) which can be amended to simply add a file containing the number of records?

Are you able to use the INTRDR for this task (so that JCL and control cards can be generated and submitted to run by a JOB)?

If nothing else, you can have more DD statements than needed, and clean up the unused datasets in a step afterwards.
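
As an illustration of the "file containing the number of records" idea, a copy step can write a one-line count file through an extra OUTFIL. This is a sketch in DFSORT syntax: the step name, the COUNT ddname, and the X47CNT data set name are made up for the example, and SyncSort users should check their manual for the equivalent TRAILER1 options.

Code:
//CNTSTEP  EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DISP=SHR,DSN=TST.WAY.G500414.X47CSV
//* HYPOTHETICAL OUTPUT DATA SET FOR THE RECORD COUNT
//COUNT    DD DSN=TST.WAY.G500414.X47CNT,
//            DISP=(NEW,CATLG,DELETE),
//            SPACE=(TRK,(1,1))
//SYSIN    DD *
  OPTION COPY
* NO DETAIL RECORDS ARE WRITTEN, ONLY ONE TRAILER LINE HOLDING THE
* TOTAL RECORD COUNT AS 8 DIGITS WITH LEADING ZEROS (M11 EDIT MASK)
  OUTFIL FNAMES=COUNT,REMOVECC,NODETAIL,
         TRAILER1=(COUNT=(M11,LENGTH=8))
/*

A later step could read that count to decide how many output DD statements to generate before submitting the split job through the INTRDR.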

Jay Villaverde

New User


Joined: 08 Mar 2014
Posts: 27
Location: USA

Posted: Thu Apr 17, 2014 12:19 am

Actually, working with a co-worker on this, we did end up creating more DD statements than needed and just cleaning up the unused datasets afterwards.

This all came about because the requestor had a 4 million record mainframe file they wanted split up so that it would load more easily into their SQL Server. So we split it up into 1 million record chunks, creating 5 datasets with the last one having a handful of records, and then got rid of the unused datasets in another step as you mentioned.

This should serve our purposes for the number of times we do this, which isn't often, but it's good to have a way of doing it.

Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

Posted: Thu Apr 17, 2014 12:42 am

Good work. Thanks for letting us know.

Can you post the code you came up with? It may be useful for other people in the future.

Jay Villaverde

New User


Joined: 08 Mar 2014
Posts: 27
Location: USA

Posted: Thu Apr 17, 2014 1:03 am

Sure. We're going to expand it out to 10 datasets and make a generic template out of it for our group, but this is what we came up with. Feel free to make it better.

Thanks everyone for their input.

Code:

//SPLITDSN JOB (31000,G5),'SPLIT',CLASS=F,MSGCLASS=2,         
//         NOTIFY=&SYSUID TYPRUN=HOLD                         
//S1SORT   EXEC PGM=SORT,REGION=3072K                         
//SORTIN   DD DISP=SHR,DSN=TST.WAY.G500414.X47CSV             
//OUT1    DD DISP=(NEW,CATLG,DELETE),                         
//           DSN=TST.WAY.G502015.X47CSV1,                     
//           SPACE=(CYL,(500,500),RLSE,,ROUND),UNIT=TEST,     
//           LABEL=RETPD=180,                                 
//           DCB=*.SORTIN                                     
//OUT2    DD DISP=(NEW,CATLG,DELETE),                         
//           DSN=TST.WAY.G502015.X47CSV2,                     
//           SPACE=(CYL,(500,500),RLSE,,ROUND),UNIT=TEST,     
//           LABEL=RETPD=180,                                 
//           DCB=*.SORTIN                                     
//OUT3    DD DISP=(NEW,CATLG,DELETE),                         
//           DSN=TST.WAY.G502015.X47CSV3,                     
//           SPACE=(CYL,(500,500),RLSE,,ROUND),UNIT=TEST,     
//           LABEL=RETPD=180,                                 
//           DCB=*.SORTIN                                     
//OUT4    DD DISP=(NEW,CATLG,DELETE),                         
//           DSN=TST.WAY.G502015.X47CSV4,                     
//           SPACE=(CYL,(500,500),RLSE,,ROUND),UNIT=TEST,     
//           LABEL=RETPD=180,                                 
//           DCB=*.SORTIN                                     
//OUT5    DD DISP=(NEW,CATLG,DELETE),                         
//           DSN=TST.WAY.G502015.X47CSV5,                     
//           SPACE=(CYL,(500,500),RLSE,,ROUND),UNIT=TEST,     
//           LABEL=RETPD=180,                                 
//           DCB=*.SORTIN                                     
//OUT6    DD DISP=(NEW,CATLG,DELETE),                         
//           DSN=TST.WAY.G502015.X47CSV6,                     
//           SPACE=(CYL,(500,500),RLSE,,ROUND),UNIT=TEST,     
//           LABEL=RETPD=180,                                 
//           DCB=*.SORTIN                                     
//SYSOUT  DD SYSOUT=6                                         
//SYSIN   DD *                                                 
  OPTION COPY                                                 
  OUTFIL FNAMES=(OUT1,OUT2,OUT3,OUT4,OUT5,OUT6),SPLIT1R=20000 
//*                                                           
//S2IDCAMS EXEC PGM=IDCAMS                                     
//SYSPRINT DD SYSOUT=*                                         
//OUT1     DD DSN=TST.WAY.G502015.X47CSV1,DISP=SHR             
//OUT2     DD DSN=TST.WAY.G502015.X47CSV2,DISP=SHR             
//OUT3     DD DSN=TST.WAY.G502015.X47CSV3,DISP=SHR             
//OUT4     DD DSN=TST.WAY.G502015.X47CSV4,DISP=SHR             
//OUT5     DD DSN=TST.WAY.G502015.X47CSV5,DISP=SHR             
//OUT6     DD DSN=TST.WAY.G502015.X47CSV6,DISP=SHR             
//SYSIN    DD *                                                 
  PRINT INFILE(OUT1) CHARACTER COUNT(1)                         
  IF LASTCC = 4 THEN DELETE 'TST.WAY.G502015.X47CSV1' PURGE     
  PRINT INFILE(OUT2) CHARACTER COUNT(1)                         
  IF LASTCC = 4 THEN DELETE 'TST.WAY.G502015.X47CSV2' PURGE     
  PRINT INFILE(OUT3) CHARACTER COUNT(1)                         
  IF LASTCC = 4 THEN DELETE 'TST.WAY.G502015.X47CSV3' PURGE     
  PRINT INFILE(OUT4) CHARACTER COUNT(1)                         
  IF LASTCC = 4 THEN DELETE 'TST.WAY.G502015.X47CSV4' PURGE     
  PRINT INFILE(OUT5) CHARACTER COUNT(1)                         
  IF LASTCC = 4 THEN DELETE 'TST.WAY.G502015.X47CSV5' PURGE     
  PRINT INFILE(OUT6) CHARACTER COUNT(1)                         
  IF LASTCC = 4 THEN DELETE 'TST.WAY.G502015.X47CSV6' PURGE   
/*                                                             

Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

Posted: Thu Apr 17, 2014 3:19 am

Thanks. The only thing I'd suggest is removing the DCB from the output DD statements. SORT will generate the correct DCB info.

If you later needed some data manipulation that changed the record length from the input, you would not need to change the JCL to get the correct output. Otherwise you have two places to maintain it.

Doesn't matter as it stands, since the records are not changed, but just so it doesn't get copied like it is.

I assumed the number on the SPLIT1R is either testing or got chopped in the paste...
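
As a sketch of that suggestion, the first output DD from the job above would look like this without the DCB reference, leaving the sort product to set the record format and length itself. The same change would apply to OUT2 through OUT6.

Code:
//OUT1    DD DISP=(NEW,CATLG,DELETE),
//           DSN=TST.WAY.G502015.X47CSV1,
//           SPACE=(CYL,(500,500),RLSE,,ROUND),UNIT=TEST,
//           LABEL=RETPD=180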

Jay Villaverde

New User


Joined: 08 Mar 2014
Posts: 27
Location: USA

Posted: Thu Apr 17, 2014 3:30 am

Thanks for the tip. Yeah, we were just testing SPLIT1R with a smaller file.

All times are GMT + 6 Hours