How to split the files by validating some conditions?


IBM Mainframe Forums -> DFSORT/ICETOOL
Gopalakrishnan V

Active User


Joined: 28 Jun 2010
Posts: 102
Location: chennai

PostPosted: Tue Nov 06, 2012 6:34 pm

Hi,
Hi,
I need to split a file into many output files; the number of output files will depend on some conditions. Please find the details below.

Input file layout (length 200):
1--------------------------------------------------200
Field1=2 top
field2=1
field3=ABC
--
--
--
field100 bot
Field1=2 top
field2=1
field3=BBC
--
--
--
field100 bot
Field1=2 top
field2=1
field3=ABC
--
--
--
field100 bot



Output (length 200):

File1 :
Field1=2 top
field2=1
field3=ABC
--
--
--
field100 bot

File2:
Field1=2 top
field2=1
field3=BBC
--
--
--
field100 bot


Conditions:
1. The file must be split by validating the "top" and "bot" keywords.
2. The number of output files is based on the unique values of field3.
3. A field3 value must not be duplicated; each value is written only once.
4. Each different field3 is written to its own file, so we are not sure of the number of files.



Please help me with how to proceed with ICETOOL or a SORT card.

Thanks in advance.
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Tue Nov 06, 2012 6:50 pm

Is there a maximum number of output files?
Gopalakrishnan V


PostPosted: Tue Nov 06, 2012 7:17 pm

Quote:

Is there a maximum number of output files?


The maximum possible is 1000.
Gopalakrishnan V


PostPosted: Tue Nov 06, 2012 7:18 pm

Usually it will be only around 200 files.
Bill Woodger


PostPosted: Tue Nov 06, 2012 8:09 pm

Will it matter that you'll not know which file contains what data?

You'll need OUTFIL statements and DD statements for all your files. I think I'd look at perhaps doing them in groups of, say, 220, in separate jobs. The second and subsequent jobs need only be run if data is known to be present for them.

You can use JOINKEYS to get field3 onto the first record, then WHEN=GROUP with PUSH for an ID, and OUTFIL INCLUDE on the ID, with a SAVE for any groups beyond 220. The SAVE output goes into the next job.
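As a very rough sketch of the group-and-route part of that idea (the column of the "top" marker, the record length handling, and the DDNAMEs are all assumptions here, and this deliberately omits the JOINKEYS step and the field3 de-duplication):

```
* Sketch only: assumes FB records of LRECL 200 with the 'top' marker
* assumed in columns 9-11.  WHEN=GROUP starts a new group at each
* 'top' record; PUSH extends every record to 204 bytes with a
* 4-byte group ID.
  OPTION COPY
  INREC IFTHEN=(WHEN=GROUP,BEGIN=(9,3,CH,EQ,C'top'),
                PUSH=(201:ID=4))
* One OUTFIL (and matching DD statement) per output file, selecting
* on the pushed ID and restoring the original 200-byte record.
  OUTFIL FNAMES=OUT001,INCLUDE=(201,4,ZD,EQ,1),BUILD=(1,200)
  OUTFIL FNAMES=OUT002,INCLUDE=(201,4,ZD,EQ,2),BUILD=(1,200)
* SAVE catches any groups beyond the OUTFILs coded in this job,
* to feed the next job as described above.
  OUTFIL FNAMES=REST,SAVE,BUILD=(1,200)
```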

Is the data already in "field3 order"?

You could look at generating some delete control cards for the unused files.
Gopalakrishnan V


PostPosted: Wed Nov 07, 2012 2:30 pm

Field3 will not be in order; it is a report file, so duplicate field3 values are possible. The "top" and "bot" words indicate one set (one report), and it is on that basis that we will write the sets into different files.

My doubt is: how can we identify whether a particular field3 has already been written? Also, "top" is on the first row, so how can we validate field3, which is present on the third row?
Bill Woodger


PostPosted: Wed Nov 07, 2012 3:16 pm

Do you mean you want to split the reports by the value of field3? So if you have ABC as first and 30th reports, then you want them on the same output file?

The line-1 vs line-3 question is just a technical detail, covered above. The first step is to be exactly clear about what you want.
Gopalakrishnan V


PostPosted: Wed Nov 07, 2012 6:02 pm

Quote:

So if you have ABC as first and 30th reports, then you want them on the same output file?



In that case we do not need to write it to a file, because a report for that ABC value has already been written. That is enough.
daveporcelan

Active Member


Joined: 01 Dec 2006
Posts: 792
Location: Pennsylvania

PostPosted: Wed Nov 07, 2012 6:41 pm

It may just be me, but I find this problem description very confusing.

When I see the word 'field' used multiple times, I consider each to be on the same record. Yet they are shown on separate lines in the description.

Are we talking about fields, or is the actual data 'field3=BBC'?

Why doesn't the poster show some actual input and output data (with code tags)?

Only Bill has jumped on board (overboard?), so maybe others are confused as well.
Bill Woodger


PostPosted: Wed Nov 07, 2012 7:46 pm

There's a file.

It contains lots of reports which have a top'n'tail allowing identification.

Each report is for "the value of field3" (presumably a piece of data from the heading of a report).

The reports are to be split into separate files, but with only one report, the first, for "the value of field3" no matter how many reports there are for that value.

A normal day could have 200 reports, but there can be up to 1000.

The tricky part is the "only one report".
Gopalakrishnan V


PostPosted: Thu Nov 08, 2012 1:05 pm

Thanks, Bill, for making my requirement clear.

Would a batch COBOL program make this simpler than SORT?
Bill Woodger


PostPosted: Thu Nov 08, 2012 1:25 pm

With a batch COBOL program, knowing which field3 values have already been written is easy, because you can store them in a table. The 200+ output files, though, are more "wordy" to define than in DFSORT.

You could perhaps consider an E15 exit, in COBOL, for the main task. Check it in the manual and see what you think.
enrico-sorichetti

Superior Member


Joined: 14 Mar 2007
Posts: 10872
Location: italy

PostPosted: Thu Nov 08, 2012 2:37 pm

If you had posted the data using the Code tags, your explanation would most probably have been less confusing.

Posting properly is a way to reward the people who spend time trying to help you; not doing so just wastes their time.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Thu Nov 08, 2012 8:15 pm

Hello,

Suggest the process be implemented to handle the 1000 file situation.

Sort, COBOL, etc. will NOT be able to write 1000 output DDs in a single step. Unless something is done with dynamic allocation, there will be multiple steps.
Bill Woodger


PostPosted: Thu Nov 08, 2012 8:59 pm

Gopalakrishnan,

How are you thinking of handling the output datasets? Unless you do something about it, you won't know what data is where. Does that cause you a problem?

I'm thinking of writing all the reports to the OUTFILs, with an additional OUTFIL holding a reference (file number and field3) which can then be processed to identify duplicate field3s and generate code to delete those datasets, along with the numerous empty datasets from that day. How would that fit with what you want to do with the output?
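That cross-reference OUTFIL might be sketched roughly like this (all positions here are assumptions: suppose the grouping step used PUSH=(201:ID=4,205:SEQ=3), so every record carries a 4-byte group ID and a record-within-group number, and field3 sits in columns 8-10 of the third record of each report):

```
* Sketch only: select the third record of each report (where field3
* is assumed to live) and write a (group ID, field3) pair for a
* later step to scan for duplicate field3s and unused files.
  OUTFIL FNAMES=XREF,INCLUDE=(205,3,ZD,EQ,3),BUILD=(201,4,X,8,3)
```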

Alternatively, in one step put the field3 onto the "top" record and use GROUP to propagate it across all records of the report along with an ID; SORT on top-field3 and ID (with EQUALS, or a sequence number for the lines), and then a second step does the OUTFILs, which can identify the duplicate top-field3s and not write them. This would require an additional pass of the entire data and a SORT.
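The sort step of that alternative might look like this sketch (the positions are assumptions: suppose the first step left the propagated field3 in columns 201-203 and an 8-byte group ID in columns 204-211 on every record):

```
* Sketch only: EQUALS keeps the original line order within each
* report while the records are sorted by the propagated field3
* and then by group ID.
  OPTION EQUALS
  SORT FIELDS=(201,3,CH,A,204,8,ZD,A)
```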