Utility for splitting a large file into packs???


IBM Mainframe Forums -> All Other Mainframe Topics
hermit_reloaded

New User


Joined: 23 Apr 2007
Posts: 26
Location: India

PostPosted: Tue Apr 24, 2007 10:22 am

Hello All,

We have a requirement to split a single-pack file of about 10-20 lakh records into multiple pack files, with a limit of 1,99,999 records in each pack.



A pack is a set of records enclosed by a specific header and trailer. So basically we need to insert a header and trailer around every 1,99,999 records, and around the remainder thereafter.



This has to run as a fully automated production job, with more functionality in later steps, so we cannot use anything that requires manual calculation and allocation of files, as in SYNCSORT, where we must specify the number of output files and code them in SYSIN using INREC and OUTREC. The number of packs produced by the split is not fixed, because the input single-pack file may contain anything from under 2 lakh records to, say, 20-30 lakh records. The initial packs in the output split files will each have 1,99,999 records, but the last pack will have only the remainder.



For example, if we have 16,91,425 records in the input file plus a header and a trailer, we would require a split file with 9 packs: the first 8 packs having 1,99,999 records each plus a header and trailer, and the last having the remainder (16,91,425 − 8 × 1,99,999) = 91,433 records.



This could be done with a COBOL program, but that might not be a very efficient way of doing it. Is it possible in SYNCSORT with some method other than INREC/OUTREC? Especially considering that we need to edit the inserted headers and trailers per some EMI standards, for example the record count in a pack, etc.
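For what it's worth, the splitting itself is a single sequential pass. A minimal sketch in Python (illustrative only — the production job would be COBOL or a sort utility on z/OS, and the header/trailer layouts `HDR pack=…` / `TRL count=…` here are made-up placeholders, not the EMI format):

```python
PACK_LIMIT = 199_999  # 1,99,999 records per pack, in Indian digit grouping

def split_into_packs(records, limit=PACK_LIMIT):
    """Single pass: slice the input into packs of at most `limit` records,
    wrapping each pack in a header and a trailer carrying the record count."""
    packs = []
    for start in range(0, len(records), limit):
        body = records[start:start + limit]
        packs.append(
            [f"HDR pack={len(packs) + 1}"]   # hypothetical header layout
            + body
            + [f"TRL count={len(body)}"]     # hypothetical trailer layout
        )
    return packs
```

With 16,91,425 input records this yields 9 packs: eight of 1,99,999 records each and a ninth of 91,433, matching the worked example above. The number of packs falls out of the loop; nothing needs to be known in advance.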
William Thompson

Global Moderator


Joined: 18 Nov 2006
Posts: 3156
Location: Tucson AZ

PostPosted: Tue Apr 24, 2007 12:29 pm

Your "requirements" pretty much demand that it be done programmatically, so stick with that; it will be efficient enough. What is a "lakh"?
enrico-sorichetti

Superior Member


Joined: 14 Mar 2007
Posts: 10873
Location: italy

PostPosted: Tue Apr 24, 2007 12:43 pm

Quote:

A lakh (Hindi/Nepali : लाख, Urdu: لکھ, Bengali: লাখ, Telugu : లక్ష, Tamil : இலட்சம்) is a unit in the Indian numbering system, widely used both in official and other contexts in Bangladesh, India, Nepal, Sri Lanka, and Pakistan. One lakh is equal to a hundred thousand (10^5). A hundred lakhs make a crore, or ten million. The word is particularly notable because it is used almost exclusively in English articles written for Indian audiences (as opposed to writing "hundred thousand").

This system of measurement also introduces separators into numbers in a place that is different from that which is common in certain other number systems. For example, 3 million (30 lakh) would be written as 30,00,000 instead of 3,000,000.


quoted from
http://en.wikipedia.org/wiki/Lakh
hermit_reloaded

New User


Joined: 23 Apr 2007
Posts: 26
Location: India

PostPosted: Wed Apr 25, 2007 9:55 am

I have thought about COBOL, but the logic will require quite a lot of I/Os, which will reduce the speed. I wanted a more efficient solution.
murmohk1

Senior Member


Joined: 29 Jun 2006
Posts: 1436
Location: Bangalore,India

PostPosted: Wed Apr 25, 2007 10:31 am

hermit,

I feel DFSORT can help you achieve your goal. Search for the SORTTRCK PDF in the DFSORT forum and go through the topic "Split a file to n output files dynamically".

Quote:
10-20 lakh records

Also, one suggestion: don't use lakh (or Indian measurement units in general) on the forum, as these are understood only by Indians. You might have noticed this from the replies you got.
hermit_reloaded

New User


Joined: 23 Apr 2007
Posts: 26
Location: India

PostPosted: Wed Apr 25, 2007 3:02 pm

Thanks for the reply, murmohk1.

I looked at the doc you mentioned, but the process described in it is only valid if we know the number of files beforehand, which is not the case here. The number of files depends on the number of records.
murmohk1

Senior Member


Joined: 29 Jun 2006
Posts: 1436
Location: Bangalore,India

PostPosted: Wed Apr 25, 2007 4:19 pm

hermit,

If you are expecting between 1M and 2M records (as stated in your original post), I believe the said topic is useful to you.

The following is taken from the said topic in pdf -
Quote:
I have an input file and I can't predict the number of records in it. It varies from 0 to 20000 records



If you still feel this topic is not helpful to you, try the next topic/section (which happens to be "Five ways to split a data set").
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Wed Apr 25, 2007 7:46 pm

Hello,

Also, 1-2 million records is actually not that many for a single-pass, sequential operation, especially if there is a good blocking factor. My requirements often run to the tens or hundreds of millions.

Regardless of the method you use, all of the input will have to be read and each record will have to be written, taking some number of I/Os.


From this
Quote:
but the process described in it is only valid if we know the number of files to boot
I suspect it will be an issue for a COBOL program as well. Unless you use the info (posted elsewhere in the forums) about dynamic file allocation, a COBOL program will need to know how many output files there are as well.
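Either way, the pack count is straightforward to derive at run time before allocating anything. A sketch of the arithmetic (illustrative Python, assuming the 1,99,999-record limit from the original post):

```python
import math

PACK_LIMIT = 199_999  # the 1,99,999-record limit from the original post

def packs_needed(record_count, limit=PACK_LIMIT):
    """Ceiling division: how many packs a given record count fills."""
    return math.ceil(record_count / limit)

print(packs_needed(1_691_425))  # -> 9: eight full packs plus one of 91,433
```

A program that reads the input's trailer (or counts records on a first pass) could feed this number to dynamic allocation and so avoid hard-coding the number of output files.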

How did this "splitting" rule come into being? What process depends on this breakdown? If we know where the output is to be used, we might be able to make other suggestions.
hermit_reloaded

New User


Joined: 23 Apr 2007
Posts: 26
Location: India

PostPosted: Thu Apr 26, 2007 9:12 am

dick scherrer wrote:


How did this "splitting" rule come into being? What process depends on this breakdown? If we know where the output is to be used, we might be able to make other suggestions.


Hi Dick,

The input file contains records which need to follow a particular format. The maximum-record-count constraint also comes under the rules of this format. After the file has been split, the output will go into validation of the records, checking their compliance with the aforementioned format.
murmohk1

Senior Member


Joined: 29 Jun 2006
Posts: 1436
Location: Bangalore,India

PostPosted: Thu Apr 26, 2007 9:28 am

hermit,

Quote:
need to follow a particular format


If you can post the format, we may be able to help you.
Back to top
View user's profile Send private message