Utility for splitting a large file into packs???


IBM Mainframe Forums -> All Other Mainframe Topics
hermit_reloaded

New User


Joined: 23 Apr 2007
Posts: 26
Location: India

PostPosted: Tue Apr 24, 2007 10:22 am

Hello All,

We have a requirement to split a single-pack file of about 10-20 lakh records into multiple pack files, with a limit of 1,99,999 records in each pack.



A pack is a set of records enclosed by a specific header and trailer. So basically we need to insert a header and trailer around every 1,99,999 records, and around the remainder thereafter.



This has to run as a fully automated production job, with more functionality in later steps, so we cannot use anything that requires manual calculation and allocation of files, as in SYNCSORT, where we must specify the number of output files and code them in SYSIN using INREC and OUTREC. The number of packs produced by the split is not fixed, because the input single-pack file may contain anything from under 2 lakh records to, say, 20-30 lakh records. The initial packs in the output split files will each have 1,99,999 records, but the last pack will have only the remainder.



For example, if we have 16,91,425 records in the input file plus a header and a trailer, we would require a split file with 9 packs: the first 8 packs having 1,99,999 records each plus a header and trailer, and the last having the remainder (16,91,425 − 8 × 1,99,999) = 91,433 records.



This could be done with a COBOL program, but that might not be a very efficient way of doing it. Is it possible in SYNCSORT with some method other than INREC/OUTREC? Especially considering that we need to edit the inserted headers and trailers per some EMI standards, for example the record count in a pack, etc.
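For what it's worth, the splitting itself is a single sequential pass. A minimal sketch in Python (illustrative only — the production job would be COBOL or a sort utility on z/OS, and the header/trailer layouts `HDR pack=…` / `TRL count=…` here are made-up placeholders, not the EMI format):

```python
PACK_LIMIT = 199_999  # 1,99,999 records per pack, in Indian digit grouping

def split_into_packs(records, limit=PACK_LIMIT):
    """Single pass: slice the input into packs of at most `limit` records,
    wrapping each pack in a header and a trailer carrying the record count."""
    packs = []
    for start in range(0, len(records), limit):
        body = records[start:start + limit]
        packs.append(
            [f"HDR pack={len(packs) + 1}"]   # hypothetical header layout
            + body
            + [f"TRL count={len(body)}"]     # hypothetical trailer layout
        )
    return packs
```

With 16,91,425 input records this yields 9 packs: eight of 1,99,999 records each and a ninth of 91,433, matching the worked example above. The number of packs falls out of the loop; nothing needs to be known in advance.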
William Thompson

Global Moderator


Joined: 18 Nov 2006
Posts: 3156
Location: Tucson AZ

PostPosted: Tue Apr 24, 2007 12:29 pm

Your "requirements" pretty much demand that it be done programmatically, so stick with that; it will be efficient enough. What is a "lakh"?
enrico-sorichetti

Superior Member


Joined: 14 Mar 2007
Posts: 10873
Location: italy

PostPosted: Tue Apr 24, 2007 12:43 pm

Quote:

A lakh (Hindi/Nepali : लाख, Urdu: لکھ, Bengali: লাখ, Telugu : లక్ష, Tamil : இலட்சம்) is a unit in the Indian numbering system, widely used both in official and other contexts in Bangladesh, India, Nepal, Sri Lanka, and Pakistan. One lakh is equal to a hundred thousand (10^5). A hundred lakhs make a crore, or ten million. The word is particularly notable because it is used almost exclusively in English articles written for Indian audiences (as opposed to writing "hundred thousand").

This system of measurement also introduces separators into numbers in a place that is different from that which is common in certain other number systems. For example, 3 million (30 lakh) would be written as 30,00,000 instead of 3,000,000.


quoted from
http://en.wikipedia.org/wiki/Lakh
hermit_reloaded

New User


Joined: 23 Apr 2007
Posts: 26
Location: India

PostPosted: Wed Apr 25, 2007 9:55 am

I have thought about COBOL, but the logic will require quite a lot of I/Os, which will reduce the speed. I wanted a more efficient solution.
murmohk1

Senior Member


Joined: 29 Jun 2006
Posts: 1436
Location: Bangalore,India

PostPosted: Wed Apr 25, 2007 10:31 am

hermit,

I feel DFSORT can help you achieve your goal. Search for the SORTTRCK PDF in the DFSORT forum and go through the topic "Split a file to n output files dynamically".

Quote:
10-20 lakh records

Also, one suggestion: don't use lakh (or Indian measurement units in general) on the forum, as these are understood only by Indians. You might have noticed this from the replies you got.
hermit_reloaded

New User


Joined: 23 Apr 2007
Posts: 26
Location: India

PostPosted: Wed Apr 25, 2007 3:02 pm

Thanks for the reply, murmohk1.

I looked at the doc you mentioned, but the process described in it is only valid if we know the number of files beforehand, which is not the case here. The number of files depends on the number of records.
murmohk1

Senior Member


Joined: 29 Jun 2006
Posts: 1436
Location: Bangalore,India

PostPosted: Wed Apr 25, 2007 4:19 pm

hermit,

If you are expecting between 1M and 2M records (as stated in your original post), I believe the said topic is useful to you.

The following is taken from the said topic in pdf -
Quote:
I have an input file and I can't predict the number of records in it. It varies from 0 to 20000 records



If you still feel this topic is not helpful to you, try the next topic/section (which happens to be "Five ways to split a data set").
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Wed Apr 25, 2007 7:46 pm

Hello,

Also, 1-2 million records is actually not that many for a single-pass, sequential operation, especially if there is a good blocking factor. My requirements often run to the tens or hundreds of millions.

Regardless of the method you use, all of the input will have to be read and each record will have to be written, taking some number of I/Os.


From this
Quote:
but the process described in it is only valid if we know the number of files to boot
I suspect it will be an issue for a COBOL program as well. Unless you use the info (posted elsewhere in the forums) about dynamic file allocation, a COBOL program will need to know how many output files there are as well.
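Either way, the pack count is straightforward to derive at run time before allocating anything. A sketch of the arithmetic (illustrative Python, assuming the 1,99,999-record limit from the original post):

```python
import math

PACK_LIMIT = 199_999  # the 1,99,999-record limit from the original post

def packs_needed(record_count, limit=PACK_LIMIT):
    """Ceiling division: how many packs a given record count fills."""
    return math.ceil(record_count / limit)

print(packs_needed(1_691_425))  # -> 9: eight full packs plus one of 91,433
```

A program that reads the input's trailer (or counts records on a first pass) could feed this number to dynamic allocation and so avoid hard-coding the number of output files.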

How did this "splitting" rule come into being? What process depends on this breakdown? If we know where the output is to be used, we might be able to make other suggestions.
hermit_reloaded

New User


Joined: 23 Apr 2007
Posts: 26
Location: India

PostPosted: Thu Apr 26, 2007 9:12 am

dick scherrer wrote:


How did this "splitting" rule come into being? What process depends on this breakdown? If we know where the output is to be used, we might be able to make other suggestions.


Hi Dick,

The input file contains records which need to follow a particular format. The maximum-record-count constraint also comes under the rules of this format. After the file has been split, the output will go into validation of the records, checking their compliance with the aforementioned format.
murmohk1

Senior Member


Joined: 29 Jun 2006
Posts: 1436
Location: Bangalore,India

PostPosted: Thu Apr 26, 2007 9:28 am

hermit,

Quote:
need to follow a particular format


If you can post the format, we may be able to help you.
Back to top
View user's profile Send private message