RECORDS - IN: 45042279, OUT: 45042279
OUT1 : DELETED = 36000000, REPORT = 0, DATA = 9042279
OUT1 : TOTAL IN = 45042279, TOTAL OUT = 9042279
OUT2 : DELETED = 36042279, REPORT = 0, DATA = 9000000
OUT2 : TOTAL IN = 45042279, TOTAL OUT = 9000000
OUT3 : DELETED = 36042279, REPORT = 0, DATA = 9000000
OUT3 : TOTAL IN = 45042279, TOTAL OUT = 9000000
OUT4 : DELETED = 36042279, REPORT = 0, DATA = 9000000
OUT4 : TOTAL IN = 45042279, TOTAL OUT = 9000000
OUT5 : DELETED = 36042279, REPORT = 0, DATA = 9000000
OUT5 : TOTAL IN = 45042279, TOTAL OUT = 9000000
My question is: how does the DFSORT utility behave when we specify SPLITBY?
And is it automatically adjusting the record count in each file from the specified 500000?
I am not able to check the output files because they are on tape, so I need to know whether the SPLITBY option is working or not.
SPLITBY is working as it's supposed to. You used SPLITBY=500000, so the first 500000 records go to OUT1, the second 500000 to OUT2, the third 500000 to OUT3, the fourth 500000 to OUT4 and the fifth 500000 to OUT5. That totals 2500000 records. But your input file has more records: 45042279 according to the messages. So SPLITBY=500000 rotates back to the first ddname and writes the sixth 500000 to OUT1, the seventh 500000 to OUT2, and so on. After 18 full rotations each file holds 9000000 records, and the remaining 42279 records go to OUT1, which is why OUT1 shows 9042279. Note that the records in each file are NOT contiguous, since we start over again at OUT1 each time OUT1-OUT5 have each received 500000 records. The TOTAL OUT counts for the five files add up to the total number of input records.
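The rotation arithmetic can be checked with a short Python sketch. This only models the round-robin behavior described above, it is not DFSORT itself, and the function name splitby_counts is made up for illustration.

```python
def splitby_counts(total_records, split_by, num_files):
    """Model SPLITBY=n: blocks of n records are dealt out in rotation."""
    # Number of complete n-record blocks, plus a final partial block.
    full_blocks, leftover = divmod(total_records, split_by)
    # Complete passes over all files, plus blocks in a final partial pass.
    full_rotations, extra_blocks = divmod(full_blocks, num_files)

    counts = [full_rotations * split_by] * num_files
    # The final partial pass gives one more full block to the first files.
    for i in range(extra_blocks):
        counts[i] += split_by
    # The partial block (if any) lands on the next file in the rotation.
    counts[extra_blocks] += leftover
    return counts


counts = splitby_counts(45042279, 500000, 5)
for ddname, count in zip(["OUT1", "OUT2", "OUT3", "OUT4", "OUT5"], counts):
    print(ddname, count)
print("TOTAL", sum(counts))
```

With the numbers from this thread, the sketch gives 9042279 for OUT1 and 9000000 for each of OUT2 through OUT5, matching the TOTAL OUT values in the messages.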
With z/OS DFSORT V1R5 PTF UK90007 or DFSORT R14 PTF UK90006 (April, 2006), you can use the new SPLIT1R function to make splitting records a bit easier. Whereas SPLITBY can rotate back to the first data set, resulting in non-contiguous records, SPLIT1R only does one rotation so the records are always contiguous.
For the example discussed here, you could use SPLIT1R=9008455 (45042279/5, rounded down) and get 9008455 records for OUT1, OUT2, OUT3 and OUT4 and 9008459 records for OUT5.
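As a quick check of those numbers, here is a small Python sketch of the single-rotation split. It assumes, as described above, that each file gets n contiguous records and that anything left over goes to the last file; split1r_counts is a hypothetical name, not a DFSORT facility.

```python
def split1r_counts(total_records, split_1r, num_files):
    """Model SPLIT1R=n: one rotation only, extra records go to the last file."""
    counts = []
    remaining = total_records
    for i in range(num_files):
        if i == num_files - 1:
            # Last file takes whatever is left, so no rotation is needed.
            take = remaining
        else:
            take = min(split_1r, remaining)
        counts.append(take)
        remaining -= take
    return counts


total = 45042279
n = total // 5  # 9008455, the integer part of 45042279/5
counts = split1r_counts(total, n, 5)
for ddname, count in zip(["OUT1", "OUT2", "OUT3", "OUT4", "OUT5"], counts):
    print(ddname, count)
```

The four extra records (45042279 is not an exact multiple of 5) are what push OUT5 from 9008455 to 9008459.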
But we went ahead with the earlier solution. In fact, we had a problem in the first run too. Just to share with you: we were allocating the output files on tape (because the files are big).
In the tape definition, to make it faster, we had used the VOL parameter to refer back to the previous volume (VOL=REF=*.OUT1). This helps because the TMS does not have to load multiple volumes (and hence takes less time).
There was one small thing I overlooked (or rather, didn't realise). The sort-split function allocates all the output files at the same time, right at the start. This caused my job to fail, as you can't re-use a tape volume before it is released.
Not sure if this made sense, but just to summarize: to use the SORT-SPLIT commands, we need to allocate the output files on distinct volumes/devices.