Gopalakrishnan V
Active User
Joined: 28 Jun 2010 Posts: 102 Location: chennai
Hi,
I need to split one file into many; the number of output files depends on the data. Please find the details below.
Input file layout (record length 200):
1--------------------------------------------------200
Field1=2 top
field2=1
field3=ABC
--
--
--
field100 bot
Field1=2 top
field2=1
field3=BBC
--
--
--
field100 bot
Field1=2 top
field2=1
field3=ABC
--
--
--
field100 bot
Output (record length 200):
File1:
Field1=2 top
field2=1
field3=ABC
--
--
--
field100 bot
File2:
Field1=2 top
field2=1
field3=BBC
--
--
--
field100 bot
Conditions:
1. The file needs to be split by validating the top and bot keywords.
2. The number of output files is based on the unique field3 values.
3. Field3 must not be duplicated; each value is written only once.
4. Each distinct field3 goes to its own file, so the number of files is not known in advance.
Please help me with how to proceed with ICETOOL or a SORT card.
Thanks in advance.
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
Is there a maximum number of output files?
Gopalakrishnan V
Active User
Joined: 28 Jun 2010 Posts: 102 Location: chennai
Quote:
Is there a maximum number of output files?

The maximum is 1000.
Gopalakrishnan V
Active User
Joined: 28 Jun 2010 Posts: 102 Location: chennai
Usually it will be only around 200 files.
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
Will it matter that you'll not know which file contains what data?
You'll need OUTFIL statements and DD statements for all your files. I think I'd look at doing them in groups of, say, 220, in separate jobs. The second and subsequent jobs need only be run if data is known to be present for them.
You can use JOINKEYS to get field3 onto the first record, then WHEN=GROUP with PUSH for an ID, and OUTFIL INCLUDE= on that ID, with a SAVE for any groups beyond 220. The SAVE output goes into the next job.
Is the data already in field3 order?
You could also look at generating delete control cards for the unused files.
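The WHEN=GROUP part of that can be sketched roughly as follows. This is a minimal illustration, not working code for your layout: the column positions of top/bot (9,3 here), the 4-byte ID at position 201, and the DDnames OUT0001/OUT0002/SPILL are all assumptions you would replace with your own.

```
* Group each report between 'top' and 'bot' and push a group ID
* into columns 201-204 (the output LRECL grows accordingly).
* Positions and DDnames are illustrative only.
  INREC IFTHEN=(WHEN=GROUP,
                BEGIN=(9,3,CH,EQ,C'top'),
                END=(9,3,CH,EQ,C'bot'),
                PUSH=(201:ID=4))
  SORT FIELDS=COPY
* One OUTFIL per output file, selecting on the pushed ID and
* restoring the original 200-byte record.
  OUTFIL FNAMES=OUT0001,INCLUDE=(201,4,ZD,EQ,1),BUILD=(1,200)
  OUTFIL FNAMES=OUT0002,INCLUDE=(201,4,ZD,EQ,2),BUILD=(1,200)
* Anything not selected above (groups beyond the OUTFILs coded
* in this job) is saved as input to the next job.
  OUTFIL FNAMES=SPILL,SAVE
```

Note this alone does not handle the "only the first report per field3" requirement; that is where the JOINKEYS step (or a separate pass) comes in.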
Gopalakrishnan V
Active User
Joined: 28 Jun 2010 Posts: 102 Location: chennai
Field3 will not be in order; it is a report file, so field3 values may repeat. The top and bot words indicate one set (one report), and based on that we write each set to a different file.
My doubt is: how can we identify whether a particular field3 has already been written? Also, the top word is on the first row, so how can we validate field3, which appears on the third row?
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
Do you mean you want to split the reports by the value of field3? So if you have ABC as the first and the 30th reports, do you want them on the same output file?
The line-1 vs line-3 question is just a technical thing, covered above. The first part is to be exactly clear about what you want.
Gopalakrishnan V
Active User
Joined: 28 Jun 2010 Posts: 102 Location: chennai
Quote:
So if you have ABC as first and 30th reports, then you want them on the same output file?

In that case there is no need to write the 30th report at all, because a report for that ABC type has already been written. That is enough.
daveporcelan
Active Member
Joined: 01 Dec 2006 Posts: 792 Location: Pennsylvania
It may just be me, but I find this problem description very confusing.
When I see the word 'field' used multiple times, I consider each field to be on the same record, yet they are shown on separate lines in the description.
Are we talking about fields, or is the actual data field3=BBC?
Why doesn't the poster show some actual input and output data (with code tags)?
Only Bill has jumped on board (overboard?), so maybe others are confused as well.
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
There's a file.
It contains many reports, each with a top-and-tail allowing identification.
Each report is for "the value of field3" (presumably a piece of data from the heading of a report).
The reports are to be split into separate files, but with only one report, the first, per "value of field3", no matter how many reports there are for that value.
A normal day could have 200 reports, but there can be up to 1000.
The tricky part is the "only one report".
Gopalakrishnan V
Active User
Joined: 28 Jun 2010 Posts: 102 Location: chennai
Thanks, Bill, for making my requirement clear.
Would a batch COBOL program make this simpler than SORT?
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
With a batch COBOL program, knowing which field3 values have already been written is easy, because you can store them in a table. The 200+ output files are more "wordy" than in DFSORT.
You could perhaps consider an E15 exit, in COBOL, for the main task. Check it in the manual and see what you think.
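For reference, hooking a COBOL E15 into the main task is done with the MODS statement; a bare sketch (the module name COBE15 and the storage figure are made-up placeholders; check the MODS statement in the DFSORT Application Programming Guide for your values):

```
//SYSIN    DD *
  SORT FIELDS=COPY
* Fourth MODS parameter C tells DFSORT the exit is COBOL.
* COBE15 is a hypothetical load module name.
  MODS E15=(COBE15,7000,,C)
/*
```

The E15 itself would see each input record, track field3 values already encountered in a working-storage table, and tell DFSORT to delete the records of any repeated report.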
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10872 Location: italy
If you had posted the data using the code tags, your explanation would most probably have been less confusing.
Posting properly is a way to reward the people who spend time trying to help you; not doing so just wastes their time.
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello,
I suggest the process be implemented to handle the 1000-file situation.
Sort, COBOL, etc. will NOT be able to write 1000 output DDs in a single step. Unless something is done with dynamic allocation, there will be multiple steps.
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
Gopalakrishnan,
How are you thinking of handling the output datasets? Unless you do something about it, you won't know what data is where. Does that cause you a problem?
I'm thinking of writing all the reports to the OUTFILs, plus an additional OUTFIL with a reference (file number and field3), which can then be processed to identify duplicate field3s and generate code to delete those datasets, along with the numerous empty datasets from that day. How would that fit with what you want to do with the output?
Alternatively, in one step put the field3 onto the "top" record, and use GROUP to propagate it across all records of the report along with an ID; SORT on that field3 and ID (with EQUALS or a sequence number for the lines); then a second step does the OUTFILs, which can identify the duplicate field3s and not write them. This would require an additional pass of the entire data and a SORT.
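The sort step of that alternative might look roughly like this. It is only a sketch, and it assumes an earlier step has already placed the report's field3 at columns 201-203 and an 8-byte group ID at 204-211 on every record of its report; those positions are inventions for illustration.

```
* Bring all reports for the same field3 together, keeping
* reports in original order within a field3 (by group ID) and
* lines in original order within a report (EQUALS).
  OPTION EQUALS
  SORT FIELDS=(201,3,CH,A,204,8,ZD,A)
```

The follow-on step then codes the OUTFILs, writing only the first group per field3 value and stripping columns 201-211 before output.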