I have a file in the following format, with a header and a trailer record for every group.
group1starts
rec1
rec2
group1end
group2 start
rec1
rec2
rec3
group2 ends
group3 starts and so on
While doing FTP, the job abends if the file to transfer is larger than 75 MB, so I need to split the file into several files whenever it exceeds 75 MB.
But when splitting I have to make sure that no group is divided across two different files; all the records of one group must end up in the same file.
I have tried the 'GROUP BY' clause but did not get a solution.
Please help. Is this possible with ICETOOL, or do I have to write a COBOL program for it?
Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
Quote:
I need to split the file in numbers of files if it exceeds 75mbytes
What is the maximum number of output files you want to allow for the split? (For example, if the maximum size of the input file could be 700MB, you'd want to allow for a maximum number of output files of 10 or 11.)
What is the RECFM and LRECL of the input file?
How do you identify the first record of a group (be specific - like it contains 'starts' in positions 11-16)?
How do you identify the last record of a group (be specific - like it contains 'ends' in positions 11-14)?
1) You are right, we can split the input file into 'n' output files of up to 75 MB each.
2) For the input file, RECFM=FB and LRECL=145.
3) The first record of a group is identified by '<JOB ID=' in positions 1-8, and the last record of the group is identified by '</JOB>' in positions 1-6. The first record of the next group then starts with '<JOB ID=' again.
Thanks for your interest. Please let me know if any other input is required.
So far, the files we have received were of a size that we split into 3 files at most, but to be safe we can set the maximum number of output files to 4.
For example:
The maximum record count we have received so far is 1646622, with LRECL=145.
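As a rough cross-check of these figures, the arithmetic can be sketched in Python. This is only an illustration: it assumes the 75 MB limit means 75 * 1,048,576 bytes, and it ignores group boundaries and any header or trailer records.

```python
import math

LRECL = 145            # fixed record length, from the post above
RECORDS = 1_646_622    # largest record count reported so far
LIMIT_MB = 75          # FTP limit; assumed to mean 75 * 1,048,576 bytes

total_bytes = RECORDS * LRECL
total_mb = total_bytes / 1_048_576
files_needed = math.ceil(total_bytes / (LIMIT_MB * 1_048_576))

print(total_bytes, round(total_mb, 1), files_needed)
```

Under that assumption the largest file seen so far is a little under 228 MB and already needs a fourth output file.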
Thanks a lot for spending your valuable time on my requirement. The output is almost correct,
but it fails to give the correct result in the following cases.
Note: the number of records in the input file varies every day.
1) When the input file has fewer than 500000 records,
the job abends on the following comparison
INCLUDE=(146,8,ZD,GT,SPL1,AND,146,8,ZD,LE,SPL2), because SPL1 and SPL2 are empty; SPL3 and SPL4 are empty as well.
2) Or, say the input file has 1000000 records: then SPL3 will be empty, and the comparison INCLUDE=(146,8,ZD,GT,SPL2,AND,146,8,ZD,LE,SPL3) abends saying SPL3 is empty.
Q-2)
Can I have the 4 main header records and the 2 main trailer records in all the output files it creates?
For example, in the input file:
main header1
main header2
main header3
main header4
group1starts
rec1
rec2
group1end
group2 start
rec1
rec2
rec3
group2 ends
group3 starts and so on
main trailer1
main trailer2
Q-3) Is there any way to decide dynamically how many output files may be required, i.e. to allocate the output files dynamically? I think not.
The current logic is still fine.
Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
I took a shot based on what I thought you asked for. I don't really have time to design a job that will meet all of your requirements. I suggest you write a program.
Joined: 07 Dec 2007 Posts: 2205 Location: San Jose
Nehal Soni,
The following DFSORT/ICETOOL JCL will give you the desired results. It even takes care of pulling the 4 headers and 2 trailers into all the output files. The tricky part here is splitting the records into 75 MB groups.
Code:
1 megabyte = 1,048,576 bytes
LRECL = 145
Number of records in 1 MB = int(1048576/145) = 7231
Since we are dealing with groups of records, I rounded that number down to 7000. So the first COPY operator uses WHEN=INIT to put a sequence number, starting at 7000, at position 146 of every record.
Using another WHEN=INIT, I divide that sequence number by 7000, so that every set of 7000 records gets the same number, and put the result at position 154.
Now, in order to split the file into 75 MB files, I divide the number at position 154 by 74 (leaving a 1 MB buffer to hold group records).
I also used the GROUP function to sequence the groups.
Using report functions I generate file t1 with the 74 MB limits like this
1. If your input has less than the minimum of 74 MB of records, then OUT4 will have only 6 records, which are the header and trailer records.
2. If your input has more than 222 MB of data, then OUT4 will have the rest of the data, as the first 3 files will each have 74 MB. For example, if your input is 500 MB, the first 3 files will each be less than or equal to 75 MB, while OUT4 will have the remaining 278 MB of data.
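The numbering arithmetic described above can be illustrated with a small Python sketch. It is not the DFSORT job itself, only one possible reading of the description: the function names are made up for illustration, integer division is assumed, and the sketch ignores the GROUP logic that keeps a group's records together.

```python
RECS_PER_MB = 7000   # rounded down from int(1048576 / 145) = 7231, per the post
MB_PER_FILE = 74     # 75 MB limit minus a 1 MB buffer, per the post
MAX_FILES = 4        # OUT1..OUT4

def mb_number(record_index):
    """'Megabyte' number of a record; record_index is 0-based.

    The sequence number is assumed to start at 7000, so the first
    7000 records get 1, the next 7000 get 2, and so on.
    """
    seqnum = RECS_PER_MB + record_index
    return seqnum // RECS_PER_MB

def output_file(record_index):
    """1-based output file number; the last file absorbs any overflow."""
    file_no = mb_number(record_index) // MB_PER_FILE + 1
    return min(file_no, MAX_FILES)

if __name__ == "__main__":
    for idx in (0, 510_999, 511_000, 1_646_621):
        print(idx, mb_number(idx), output_file(idx))
```

Under this reading the first output takes records up to index 510,999 and everything past the third boundary lands in OUT4, which matches case 2 above.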
We can come up with a dynamic split, but it becomes a little more complicated.
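For comparison, if a program is written instead, a dynamic split is fairly short. The following Python sketch is only an illustration under stated assumptions, not a DFSORT solution: the function name and its parameters are hypothetical, records are held in memory as strings, the first 4 and last 2 records are taken to be the main headers and trailers, and a group ends at a record starting with '</JOB>'.

```python
LRECL = 145
LIMIT_RECORDS = (75 * 1_048_576) // LRECL   # records that fit in 75 MB (assumed 1 MB = 1,048,576 bytes)
GROUP_END = "</JOB>"                        # last record of a group, positions 1-6

def split_groups(records, limit_records=LIMIT_RECORDS, n_headers=4, n_trailers=2):
    """Split records into as many files as needed without dividing a group.

    Every output file gets a copy of the main headers and trailers.
    A single group larger than the limit still goes into one file.
    """
    headers = records[:n_headers]
    trailers = records[len(records) - n_trailers:]
    body = records[n_headers:len(records) - n_trailers]

    # Collect complete groups, each ending at a GROUP_END record.
    groups, current = [], []
    for rec in body:
        current.append(rec)
        if rec.startswith(GROUP_END):
            groups.append(current)
            current = []
    if current:                     # unterminated tail is kept together
        groups.append(current)

    # Fill each file with whole groups until the next one would not fit.
    budget = limit_records - n_headers - n_trailers
    files, chunk = [], []
    for group in groups:
        if chunk and len(chunk) + len(group) > budget:
            files.append(headers + chunk + trailers)
            chunk = []
        chunk.extend(group)
    if chunk or not files:
        files.append(headers + chunk + trailers)
    return files
```

The number of output files falls out of the data rather than being fixed at four, so the empty-file and overflow cases discussed above do not arise in this form.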