Split files based on characters (not on number of records)

vinothsubramanian · Posted: Wed Mar 11, 2009 3:41 pm

Hi,

Is there a way to split a file into 'n' files based on the characters contained in the file using SyncSort?

For example:

I/P file:

1234567 890123 200812
1234567 890123 200812
1234568 901234 200811
1234567 890123 200810
1234567 890123 200810
1234568 901234 200809
1234567 890123 200808
1234567 890123 200808
1234568 901234 200807
1234567 890123 200807
1234567 890123 200806
1234568 901234 200805

I want the output files to be split as shown below based on the third column. All columns are of fixed length.
o/p FILE1:
1234567 890123 200812
1234567 890123 200812

o/p FILE2:
1234568 901234 200811

o/p FILE3:
1234567 890123 200810
1234567 890123 200810

o/p FILE4:
1234568 901234 200809

o/p FILE5:
1234567 890123 200808
1234567 890123 200808

o/p FILE6:
1234568 901234 200807
1234567 890123 200807

o/p FILE7:
1234567 890123 200806

o/p FILE8:
1234568 901234 200805

Regards,
Ram.

krisprems · Posted: Wed Mar 11, 2009 5:14 pm

vinothsubramanian, check the example below and see if it meets your requirement.
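
As a rough illustration of the idea of writing one output per known key value, here is a minimal Python sketch. It is not the sort utility's control statements and not necessarily what was posted; the names `KEY`, `split_by_known_values`, `outputs` and `unmatched` are hypothetical, and it assumes fixed-length records with the key in columns 16-21.

```python
# Illustration only: mimics "one output per known key value" in plain Python.
# Assumption: fixed-length records with the key in columns 16-21 (1-based).
KEY = slice(15, 21)


def split_by_known_values(records, known_values):
    """Return ({value: [records]}, unmatched) for a list of known key values."""
    outputs = {value: [] for value in known_values}
    unmatched = []
    for record in records:
        key = record[KEY]
        if key in outputs:
            outputs[key].append(record)
        else:
            # Records whose key is not in the known list end up here.
            unmatched.append(record)
    return outputs, unmatched


records = [
    "1234567 890123 200812",
    "1234567 890123 200812",
    "1234568 901234 200811",
    "1234567 890123 200810",
]
outputs, unmatched = split_by_known_values(records, ["200812", "200811"])
```

The weakness is visible in the last record: any key value that was not listed in advance has no output of its own.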

vinothsubramanian · Posted: Wed Mar 11, 2009 10:19 pm

Hi Krishna,

Thanks for your reply.

This is one way to achieve it if we know the values in the third column.

Is there any other way to achieve it if we don't know the values in the third column? The number of distinct values in the third column is known, say 10. So we need to split the file into 10 files based on the third column without knowing the values it contains.

Regards,
Ram.

Skolusu · Posted: Thu Mar 12, 2009 11:39 pm

vinothsubramanian,

I am assuming that your input file is already sorted on the third column. If not, change SORT FIELDS=COPY to SORT FIELDS=(16,6,CH,A). You can easily tag an ID number to each distinct value of the third column using the new WHEN=GROUP function of DFSORT, available with z/OS DFSORT V1R5 PTF UK90013 (July, 2008), and use that ID to split the records into different files like this:

If you have more than 10 unique values, the leftover file will contain all the other records.
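
The group-ID logic described above can be sketched in Python as a rough illustration. This is not DFSORT syntax; the names `KEY`, `split_by_group_id`, `outputs` and `leftover` are hypothetical, and the sketch assumes the records are already grouped on columns 16-21.

```python
from itertools import groupby

# Assumption: fixed-length records with the key in columns 16-21 (1-based),
# already sorted or grouped on that key.
KEY = slice(15, 21)


def split_by_group_id(records, max_files=10):
    """Number each run of equal keys 1, 2, 3, ... and route by that number.

    Groups 1..max_files each get their own output list; records from any
    further groups are collected in a single leftover list.
    """
    outputs = {}
    leftover = []
    grouped = groupby(records, key=lambda record: record[KEY])
    for group_id, (_, group) in enumerate(grouped, start=1):
        if group_id <= max_files:
            outputs[group_id] = list(group)
        else:
            leftover.extend(group)
    return outputs, leftover


records = [
    "1234567 890123 200812",
    "1234567 890123 200812",
    "1234568 901234 200811",
    "1234567 890123 200810",
    "1234567 890123 200810",
    "1234568 901234 200809",
]
outputs, leftover = split_by_group_id(records, max_files=3)
```

With `max_files=3`, the fourth distinct value (200809) lands in the leftover list, which mirrors the behaviour described for more than 10 unique values.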

Arun Raj · Posted: Fri Mar 13, 2009 10:02 am

vinothsubramanian,

Here's another way of achieving the same result without using 'WHEN=GROUP'. As pointed out by Kolusu, your input data seems to be sorted on the third field. If not, you might want to modify the SORT statement.

vinothsubramanian · Posted: Fri Mar 13, 2009 12:28 pm

Hi Kolusu,

Thanks for your reply. As you pointed out, WHEN=GROUP didn't work on our system.

Hi arcvns,

The reply you posted works exactly the way I wanted. Thanks for the solution. However, could you please explain what exactly this line does:
105:SEQNUM,8,ZD,97:81,8,ZD,SUB,105,8,ZD,M11,LENGTH=8

Thanks to both of you once again for helping to resolve this request.

Regards,
Ram.

Arun Raj · Posted: Fri Mar 13, 2009 12:49 pm

vinothsubramanian,

It's a small sequence number trick by which we assign a unique sequence number to each group.
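
The quoted line subtracts the SEQNUM placed at column 105 from the 8-byte ZD value at column 81 and stores the edited result at column 97. One common form of such a trick, sketched here in Python under the assumption that one of the two numbers is an overall sequence number and the other restarts at 1 for each new key, is that their difference stays constant within a group and differs between groups. The names `KEY` and `group_tags` are hypothetical, and the input is assumed to be grouped on columns 16-21.

```python
# Assumption: fixed-length records with the key in columns 16-21 (1-based),
# already sorted or grouped on that key.
KEY = slice(15, 21)


def group_tags(records):
    """Tag each record with (overall sequence) - (sequence within its group).

    The difference equals the number of records that precede the group, so
    it is the same for every record in a group and unique per group.
    """
    tags = []
    overall = 0
    within = 0
    previous_key = None
    for record in records:
        key = record[KEY]
        overall += 1
        # Restart the inner counter whenever the key changes.
        within = within + 1 if key == previous_key else 1
        previous_key = key
        tags.append(overall - within)
    return tags


records = [
    "1234567 890123 200812",
    "1234567 890123 200812",
    "1234568 901234 200811",
    "1234567 890123 200810",
    "1234567 890123 200810",
    "1234568 901234 200809",
]
tags = group_tags(records)
```

Records that share a tag belong to the same group, so the tag can drive the split without knowing the key values in advance.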

Arun Raj · Posted: Fri Mar 13, 2009 12:55 pm

I just noticed that your initial post says you have SyncSort. Guess Kolusu also missed that part.

vinothsubramanian · Posted: Fri Mar 13, 2009 2:40 pm

Hi arcvns,

So kind of you.

I understood the logic. Thanks for your help.

Regards,
Ram.

Arun Raj · Posted: Fri Mar 13, 2009 2:50 pm

You're welcome.