Split by SORT?


IBM Mainframe Forums -> DFSORT/ICETOOL
naive

New User


Joined: 26 Apr 2005
Posts: 46
Location: LA

Posted: Thu Mar 30, 2006 12:22 am

Hullo

We have a file that we have to split into 5 files. The parameters we are using are:

OPTION COPY
OUTFIL FNAMES=(OUT1,OUT2,OUT3,OUT4,OUT5),SPLITBY=500000

But in the output we see

RECORDS - IN: 45042279, OUT: 45042279
OUT1 : DELETED = 36000000, REPORT = 0, DATA = 9042279
OUT1 : TOTAL IN = 45042279, TOTAL OUT = 9042279
OUT2 : DELETED = 36042279, REPORT = 0, DATA = 9000000
OUT2 : TOTAL IN = 45042279, TOTAL OUT = 9000000
OUT3 : DELETED = 36042279, REPORT = 0, DATA = 9000000
OUT3 : TOTAL IN = 45042279, TOTAL OUT = 9000000
OUT4 : DELETED = 36042279, REPORT = 0, DATA = 9000000
OUT4 : TOTAL IN = 45042279, TOTAL OUT = 9000000
OUT5 : DELETED = 36042279, REPORT = 0, DATA = 9000000
OUT5 : TOTAL IN = 45042279, TOTAL OUT = 9000000

My question is: how does DFSORT behave when we specify SPLITBY?
And is it automatically adjusting the record count in each file from the specified 500000?

I can't check the output files because they are on tape, so I need to know whether the SPLITBY option is working or not.
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Thu Mar 30, 2006 12:44 am

SPLITBY is working as it's supposed to. You used SPLITBY=500000, so the first 500000 records go to OUT1, the second 500000 to OUT2, the third 500000 to OUT3, the fourth 500000 to OUT4 and the fifth 500000 to OUT5. That totals 2500000 records. But your input file has more records - 45042279 records according to the messages. So SPLITBY=500000 rotates back to the first ddname and writes the sixth 500000 to OUT1, the seventh 500000 to OUT2, and so on. Note that the records in each file are NOT contiguous since we start over again at OUT1 after OUT1-OUT5 each get 500000 records. If you add up all of the TOTAL OUT records, they add up to the total number of input records.
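To see where the counts in the messages come from, the rotation described above can be modelled in a few lines of Python. This is only a sketch of the behaviour, not DFSORT code, and the function name `splitby_counts` is made up for illustration.

```python
def splitby_counts(total_records, num_files, splitby):
    """Model SPLITBY: give each file a block of `splitby` records in turn,
    rotating back to the first file until the input is exhausted."""
    full_blocks, leftover = divmod(total_records, splitby)
    # Every complete rotation hands one full block to each file.
    rotations, extra_blocks = divmod(full_blocks, num_files)
    counts = [rotations * splitby] * num_files
    # Full blocks left over after the last complete rotation go to the first files.
    for i in range(extra_blocks):
        counts[i] += splitby
    # The final partial block goes to the next file in the rotation.
    if leftover:
        counts[extra_blocks] += leftover
    return counts


print(splitby_counts(45042279, 5, 500000))
# [9042279, 9000000, 9000000, 9000000, 9000000]
```

The 45042279 input records make 90 full blocks of 500000 (18 per file) plus a final block of 42279 that lands in OUT1, which matches the TOTAL OUT values in the messages.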
naive


Posted: Thu Mar 30, 2006 1:14 am

Wow! Thanks Frank!

So is there any way to make sure the records are contiguous if we do not know the count of records in the input file??

Also in the manual for DFSORT that I have, it says the examples etc are available in the Application Programming Guide. Would you be having a softcopy or a link to this document??

Our online help on the mainframe is woefully inadequate!
Thanks a lot again for the clarifications!!
Frank Yaeger


Posted: Thu Mar 30, 2006 2:05 am

Quote:
So is there any way to make sure the records are contiguous if we do not know the count of records in the input file??


Divide the number of records in the input file by the number of files, drop any remainder, and add 1. Use that number for SPLITBY.

For the file discussed here, it would be:

(45042279 / 5) + 1 = 9008455 + 1 = 9008456

so you'd use SPLITBY=9008456

Of course, you could determine that dynamically by getting a count of the number of input records, doing the math, and generating the appropriate OUTFIL control statement with the calculated n value.
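As a rough illustration of "doing the math", here is a Python sketch that works out the SPLITBY value and prints the matching OUTFIL statement. The helper name `splitby_for_contiguous` is hypothetical, and the record count is assumed to be known already (for example from an earlier counting step).

```python
def splitby_for_contiguous(total_records, num_files):
    # Integer division drops the remainder; adding 1 guarantees that
    # num_files * n >= total_records, so SPLITBY never rotates back to OUT1.
    return total_records // num_files + 1


n = splitby_for_contiguous(45042279, 5)
print(f"  OUTFIL FNAMES=(OUT1,OUT2,OUT3,OUT4,OUT5),SPLITBY={n}")
```

With SPLITBY=9008456 the first four files receive 9008456 records each and OUT5 receives the remaining 9008455.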

Quote:
Also in the manual for DFSORT that I have, it says the examples etc are available in the Application Programming Guide. Would you be having a softcopy or a link to this document??


You can access all of the DFSORT books in pdf and bookmanager format from:

Use [URL] BBCode for External Links

I'd suggest using the pdf books as they're formatted better.
naive


Posted: Thu Mar 30, 2006 2:34 am

Thanks Frank!
Really appreciate your help!
Frank Yaeger


Posted: Wed Apr 26, 2006 3:44 am

With z/OS DFSORT V1R5 PTF UK90007 or DFSORT R14 PTF UK90006 (April, 2006), you can use the new SPLIT1R function to make splitting records a bit easier. Whereas SPLITBY can rotate back to the first data set, resulting in non-contiguous records, SPLIT1R only does one rotation so the records are always contiguous.

For the example discussed here, you could use SPLIT1R=9008455 (45042279/5) and get 9008455 records for OUT1, OUT2, OUT3 and OUT4 and 9008459 records for OUT5:

Code:

   OUTFIL FNAMES=(OUT1,OUT2,OUT3,OUT4,OUT5),
      SPLIT1R=9008455
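The difference from SPLITBY can be checked with a small Python model of the single rotation. It is a sketch of the behaviour described above rather than anything DFSORT provides, and `split1r_counts` is an illustrative name.

```python
def split1r_counts(total_records, num_files, split1r):
    """Model SPLIT1R: one pass over the files, `split1r` records each,
    with everything that is left going to the last file."""
    counts = []
    remaining = total_records
    for _ in range(num_files - 1):
        take = min(split1r, remaining)
        counts.append(take)
        remaining -= take
    # No rotation back to the first file: the last file absorbs the rest.
    counts.append(remaining)
    return counts


print(split1r_counts(45042279, 5, 9008455))
# [9008455, 9008455, 9008455, 9008455, 9008459]
```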


Here's a DFSORT job that uses SPLIT1R dynamically to divide any number of input records among any number of output files:


Code:

//S1    EXEC  PGM=ICETOOL
//TOOLMSG   DD  SYSOUT=*
//DFSMSG    DD  SYSOUT=*
//IN DD DSN=... input file
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(TRK,(1,1)),DISP=(,PASS)
//C1 DD DSN=&&C1,UNIT=SYSDA,SPACE=(TRK,(1,1)),DISP=(,PASS)
//CTL3CNTL DD *
  OUTFIL FNAMES=(OUT01,OUT02,...,OUTnn),  <--- change for nn
//    DD DSN=*.C1,VOL=REF=*.C1,DISP=(OLD,PASS)
//OUT01 DD DSN=... output file01
//OUT02 DD DSN=... output file02
...
//OUTnn DD DSN=... output filenn  <--- change for nn
//TOOLIN DD *
* Get the record count.
COPY FROM(IN) USING(CTL1)
* Generate:
* SPLIT1R=x where x = count/nn.
* nn is the number of output files.
COPY FROM(T1) TO(C1) USING(CTL2)
* Use SPLIT1R=x to split records contiguously among
* the nn output files.
COPY FROM(IN) USING(CTL3)
/*
//CTL1CNTL DD *
  OUTFIL FNAMES=T1,REMOVECC,NODETAIL,
    TRAILER1=(COUNT=(M11,LENGTH=8))
/*
//CTL2CNTL DD *
  OUTREC BUILD=(2X,C'SPLIT1R=',
    1,8,ZD,DIV,+nn,               <--- set to nn
      TO=ZD,LENGTH=8,80:X)
/*
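For readers who want to see what the CTL2 step writes to C1, the following Python sketch mimics the OUTREC BUILD: two blanks, the literal SPLIT1R=, the truncated quotient as 8 zero-padded digits, and blank padding to column 80. It assumes the count fits in 8 digits; `build_split1r_record` is a hypothetical name used only for this illustration.

```python
def build_split1r_record(record_count, num_files):
    # DIV is an integer divide, so the remainder is dropped.
    quotient = record_count // num_files
    # 2X, C'SPLIT1R=', 8-digit value, then pad with blanks to 80 bytes (80:X).
    return f"  SPLIT1R={quotient:08d}".ljust(80)


record = build_split1r_record(45042279, 5)
print(repr(record.rstrip()))
print(len(record))
```

That 80-byte record is what gets concatenated after the OUTFIL FNAMES=(...), line in CTL3CNTL, completing the OUTFIL statement.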


For complete information on SPLIT1R and the other new DFSORT/ICETOOL functions available with the April, 2006 PTFs, see:

Use [URL] BBCode for External Links
naive


Posted: Wed Apr 26, 2006 6:09 am

Great stuff!!
But we went ahead with the earlier solution. In fact, we had a problem in the first run too. Just to share with you: we were allocating the output files on tape (because the files are big).
In the tape definition, to make it faster, we had used the VOL parameter to refer to the previous storage device (VOL=REF=*.OUT1). This helps because the TMS does not have to load multiple volumes (and hence takes less time).

There was one small thing I overlooked (or rather didn't realise). The sort-split function allocates all the output files at the same time, right at the start. This caused my job to fail, as you can't re-use a tape volume before it is released.
Not sure if this made sense, but to summarize: to use the SORT split commands, we need to allocate the output files on distinct volumes/devices.
All times are GMT + 6 Hours