Split by SORT?

naive · New User Joined: 26 Apr 2005 Posts: 46 Location: LA

Hullo

We have a file that we need to split into 5 files. The control statements we are using are:

OPTION COPY
OUTFIL FNAMES=(OUT1,OUT2,OUT3,OUT4,OUT5),SPLITBY=500000

But in the output we see

RECORDS - IN: 45042279, OUT: 45042279
OUT1 : DELETED = 36000000, REPORT = 0, DATA = 9042279
OUT1 : TOTAL IN = 45042279, TOTAL OUT = 9042279
OUT2 : DELETED = 36042279, REPORT = 0, DATA = 9000000
OUT2 : TOTAL IN = 45042279, TOTAL OUT = 9000000
OUT3 : DELETED = 36042279, REPORT = 0, DATA = 9000000
OUT3 : TOTAL IN = 45042279, TOTAL OUT = 9000000
OUT4 : DELETED = 36042279, REPORT = 0, DATA = 9000000
OUT4 : TOTAL IN = 45042279, TOTAL OUT = 9000000
OUT5 : DELETED = 36042279, REPORT = 0, DATA = 9000000
OUT5 : TOTAL IN = 45042279, TOTAL OUT = 9000000

My question is: how does DFSORT behave when we specify SPLITBY?
Is it automatically adjusting the record count in each file from the specified 500000?

I am not able to check the output files because they are on tape, so I need to know whether the SPLITBY option is working or not.

Frank Yaeger · Posted: Thu Mar 30, 2006 12:44 am

SPLITBY is working as it's supposed to. You used SPLITBY=500000, so the first 500000 records go to OUT1, the second 500000 to OUT2, the third 500000 to OUT3, the fourth 500000 to OUT4 and the fifth 500000 to OUT5. That totals 2500000 records. But your input file has more records - 45042279 records according to the messages. So SPLITBY=500000 rotates back to the first ddname and writes the sixth 500000 to OUT1, the seventh 500000 to OUT2, and so on. Note that the records in each file are NOT contiguous since we start over again at OUT1 after OUT1-OUT5 each get 500000 records. If you add up all of the TOTAL OUT records, they add up to the total number of input records.
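The rotation described above can be checked against the counts in the messages with a small simulation. This is only a sketch in Python, not DFSORT itself; the function name `splitby_counts` is hypothetical and simply models "blocks of n records handed to each ddname in turn".

```python
def splitby_counts(total_records, block_size, n_outputs):
    """Model SPLITBY: blocks of block_size records rotate across the outputs."""
    counts = [0] * n_outputs
    full_blocks, remainder = divmod(total_records, block_size)
    for i in range(n_outputs):
        # Output i receives blocks i, i + n_outputs, i + 2 * n_outputs, ...
        blocks_for_i = len(range(i, full_blocks, n_outputs))
        counts[i] = blocks_for_i * block_size
    # The final partial block goes to whichever output is next in the rotation.
    counts[full_blocks % n_outputs] += remainder
    return counts


counts = splitby_counts(45042279, 500000, 5)
for name, count in zip(["OUT1", "OUT2", "OUT3", "OUT4", "OUT5"], counts):
    print(name, count)
```

With 45042279 records there are 90 full blocks of 500000 (18 per ddname, so 9000000 each) plus a final partial block of 42279 records, which lands on OUT1. That gives OUT1 = 9042279 and OUT2 through OUT5 = 9000000, matching the DATA counts in the messages.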

naive · New User Joined: 26 Apr 2005 Posts: 46 Location: LA

Wow! Thanks Frank!

So is there any way to make sure the records are contiguous if we do not know the count of records in the input file?

Also, the DFSORT manual that I have says the examples etc. are available in the Application Programming Guide. Do you have a softcopy of, or a link to, this document?

Our online help on the mainframe is woefully inadequate!
Thanks a lot again for the clarifications!!

Frank Yaeger · Posted: Thu Mar 30, 2006 2:05 am

naive · New User Joined: 26 Apr 2005 Posts: 46 Location: LA

Thanks Frank!
Really appreciate your help!

Frank Yaeger · Posted: Wed Apr 26, 2006 3:44 am

With z/OS DFSORT V1R5 PTF UK90007 or DFSORT R14 PTF UK90006 (April, 2006), you can use the new SPLIT1R function to make splitting records a bit easier. Whereas SPLITBY can rotate back to the first data set, resulting in non-contiguous records, SPLIT1R only does one rotation so the records are always contiguous.

For the example discussed here, you could use SPLIT1R=9008455 (45042279/5) and get 9008455 records for OUT1, OUT2, OUT3 and OUT4 and 9008459 records for OUT5:
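As a quick check of the arithmetic, here is a minimal Python sketch of the single-rotation behaviour described above. It is a model, not DFSORT; the function name `split1r_counts` is hypothetical, and it assumes that any records left after the one rotation go to the last data set, as the OUT5 count above implies.

```python
def split1r_counts(total_records, block_size, n_outputs):
    """Model SPLIT1R: one pass over the outputs, leftovers go to the last one."""
    counts = []
    remaining = total_records
    for i in range(n_outputs):
        if i == n_outputs - 1:
            take = remaining  # the last output takes everything that is left
        else:
            take = min(block_size, remaining)
        counts.append(take)
        remaining -= take
    return counts


total = 45042279
block = total // 5  # integer division gives 9008455
for name, count in zip(["OUT1", "OUT2", "OUT3", "OUT4", "OUT5"],
                       split1r_counts(total, block, 5)):
    print(name, count)
```

Because 45042279 is not a multiple of 5, the integer quotient leaves 4 records over, which is why OUT5 ends up with 9008459 rather than 9008455.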

naive · New User Joined: 26 Apr 2005 Posts: 46 Location: LA

Great stuff!
But we went ahead with the earlier solution. In fact, we had a problem in the first run too. Just to share with you: we were allocating the output files on tape (because the files are big).
In the tape definition, to make it faster, we had used the VOL parameter to refer back to the volume of an earlier DD statement (VOL=REF=*.OUT1). This helps because the TMS does not have to load multiple volumes (and hence takes less time).

There was one small thing I overlooked (or rather, didn't realise). The sort-split function allocates all the output files at the same time, right at the start. This caused my job to fail, because you can't reuse a tape volume before it is released.
Not sure if this made sense, but to summarize: to use the SORT split functions, we need to allocate the output files on distinct volumes/devices.