SPLIT FILE into multiple files using Syncsort

sonali12_9 · Posted: Thu Dec 10, 2009 8:11 pm

I want to split a file into 4 different files, as the input file is huge and cannot be FTPed. The number of records in the input file is not fixed, but the number of output files is fixed, i.e. 4.

Input
AAS
AVG
FSDF
DGG
SFF
SDGDG
GFG
AGH

I tried:
SORT FIELDS=COPY
OUTFIL FNAMES=(OUT1,OUT2,OUT3,OUT4),SPLIT
In the output I am getting:
first file
AAS
SFF

2ND FILE
AVG
SDGDG

3RD FILE
FSDF
GFG
4th file
DGG
AGH

But I want output like:

1st file
AAS
AVG
2nd file
FSDF
DGG

PeterHolland · Posted: Thu Dec 10, 2009 8:34 pm

I'm not much of a SORT fan; the control cards are too difficult.
For this kind of thing I have always used SAS or Easytrieve,
or written some assembler.

So if you have SAS or Easytrieve installed, use that.

I guess I'm getting a lifelong ban now.

expat · Posted: Thu Dec 10, 2009 8:42 pm

Peter,

I'll beg to differ with you on this one.

The sort products these days are far more efficient at handling large numbers of records than most other products. They have come a long, long way in a short time and can be really useful for lots of things.

enrico-sorichetti · Posted: Thu Dec 10, 2009 8:45 pm

The main issue with splitting is a performance one...

Alternate splitting is the fastest; it needs just one pass over the input file.

Sequential splitting, on the other hand, requires two passes: the first to get the count, the second to do the splitting.

Alissa Margulies · Posted: Thu Dec 10, 2009 10:00 pm

If all 4 files do not have to have the same number of records, then you can code something similar to the following:

PeterHolland · Posted: Thu Dec 10, 2009 10:58 pm

enrico-sorichetti · Posted: Thu Dec 10, 2009 11:13 pm

Hi Peter,
I just made up the terminology.

Given a dataset with 10 records:
0,1,2,3,4,5,6,7,8,9

by alternating I meant
1st file 0,2,4,6,8
2nd file 1,3,5,7,9
i.e. as you read each record, you write it to the next file of the split

by sequential
1st file 01234
2nd file 56789

That's all!
I was tempted to use the term oscillating (as in sort, using tapes as work files),
but I thought it would have been less clear.
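
To make the two terms concrete, here is a small Python sketch (plain Python, not SORT syntax; the function names `alternate_split` and `sequential_split` are made up for illustration). It only models the record-to-file assignment, and it shows why the sequential form needs the total count before it can place the first record.

```python
def alternate_split(records, n_files):
    """Round-robin: record i goes to file i % n_files.

    One pass is enough, because no total count is needed.
    """
    files = [[] for _ in range(n_files)]
    for i, rec in enumerate(records):
        files[i % n_files].append(rec)
    return files


def sequential_split(records, n_files):
    """Contiguous blocks: the block size depends on the total count,
    so the count has to be known before any record is written."""
    total = len(records)                 # the "first pass"
    size = -(-total // n_files)          # ceiling division: records per file
    return [records[k * size:(k + 1) * size] for k in range(n_files)]


if __name__ == "__main__":
    data = list(range(10))
    print(alternate_split(data, 2))
    print(sequential_split(data, 2))
```

With the 8 sample records from the first post and 4 files, `alternate_split` gives the layout that `SPLIT` produced, and `sequential_split` gives the layout that was asked for.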

dick scherrer · Posted: Fri Dec 11, 2009 1:08 am

Hello,

What creates the large file?

If you have control of that process, modify it to generate a record count. Use the generated record count to determine the "split" values for the sort control statements.

sonali12_9 · Posted: Fri Dec 11, 2009 11:30 am

Thanks to everyone for your interest in helping me.

The large file is created by retrieving records from a DB2 table.
Actually, my requirement is to convert code from FOCUS to COBOL, since FOCUS takes a long time to execute. So I need to match the output of FOCUS and COBOL.

I am looking for an option to split the file sequentially, as doing it in COBOL would be inefficient and would increase the number of select statements in the program.

Hi enrico,
As you said: "sequential splitting on the other side requires two passes, the first to get the count, the second one to do the splitting"

Is there any way to implement this logic if I have the total number of records in one file?

dick scherrer · Posted: Fri Dec 11, 2009 8:23 pm

Hello,

Yes.

Divide the total number of records by 4 and generate the appropriate sort control statements (like Alissa posted above). Instead of a SYSIN DD * you would use a SYSIN DD DSN=. . .
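
As a rough sketch of the "divide by 4 and generate the control statements" step, the Python below builds one `OUTFIL` statement per output file from a known record count. The function name `split_control_cards` and the DD names `OUT1`..`OUT4` are assumptions for illustration, and the use of `STARTREC=`/`ENDREC=` on `OUTFIL` is my assumption about how to express the ranges; it is not necessarily what Alissa posted, so check the syntax against your sort product's manual before using it.

```python
def split_control_cards(total_records, n_files=4):
    """Build SORT control statements that cut the input into
    n_files contiguous blocks, given the total record count."""
    # Ceiling division so that no records are left over; at least 1
    # so that STARTREC/ENDREC values stay valid for an empty input.
    size = max(1, -(-total_records // n_files))
    cards = ["  SORT FIELDS=COPY"]
    for k in range(n_files):
        start = k * size + 1
        if k < n_files - 1:
            end = start + size - 1
            cards.append(
                f"  OUTFIL FNAMES=OUT{k + 1},STARTREC={start},ENDREC={end}"
            )
        else:
            # The last file takes everything from its start to end of input.
            cards.append(f"  OUTFIL FNAMES=OUT{k + 1},STARTREC={start}")
    return cards


if __name__ == "__main__":
    # 8 records, as in the sample input from the first post.
    print("\n".join(split_control_cards(8)))
```

The generated lines would be written to the dataset that the split step reads through its SYSIN DD.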

sonali12_9 · Posted: Sat Dec 12, 2009 12:25 pm

Thanks. It works now.

dick scherrer · Posted: Sun Dec 13, 2009 2:52 am

Good to hear it is working. Thanks for letting us know.

d