sonali12_9
New User
Joined: 13 Feb 2009 Posts: 20 Location: United States of america
I want to split a file into 4 different files because the input file is very large and cannot be FTPed. The number of records in the input file is not fixed, but the number of output files is fixed at 4.
Input:
AAS
AVG
FSDF
DGG
SFF
SDGDG
GFG
AGH
I tried:
SORT FIELDS=COPY
OUTFIL FNAMES=(OUT1,OUT2,OUT3,OUT4),SPLIT
In the output I am getting:
1st file
AAS
SFF
2nd file
AVG
SDGDG
3rd file
FSDF
GFG
4th file
DGG
AGH
But I want the output to be:
1st file
AAS
AVG
2nd file
FSDF
DGG
PeterHolland
Global Moderator
Joined: 27 Oct 2009 Posts: 2481 Location: Netherlands, Amstelveen
I'm not much of a SORT fan; the control cards are too difficult.
For this kind of thing I always used SAS or Easytrieve,
or wrote some assembler.
So if you have SAS or Easytrieve installed, use that.
I guess I'm getting a lifelong ban now.
expat
Global Moderator
Joined: 14 Mar 2007 Posts: 8796 Location: Welsh Wales
Peter,
I'll beg to differ with you on this one.
The sort products these days are far more efficient at handling large numbers of records than most other products. They have come a long, long way in a short time and can be really useful for lots of things.
Quote:
I guess I'm getting a lifelong ban now.
Only if Frank or Alissa see your post.
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10889 Location: italy
The main issue with splitting is performance...
Alternate splitting is the fastest; it needs just one pass over the input file.
Sequential splitting, on the other hand, requires two passes: the first to get the count, the second to do the splitting.
Alissa Margulies
SYNCSORT Support
Joined: 25 Jul 2007 Posts: 496 Location: USA
If the 4 files do not all have to have the same number of records, then you can code something similar to the following:
Code:
//SYSIN DD *
* Copy the input unchanged; each OUTFIL selects a range of record numbers.
 SORT FIELDS=COPY
 OUTFIL FILES=01,ENDREC=10000
 OUTFIL FILES=02,STARTREC=10001,ENDREC=20000
 OUTFIL FILES=03,STARTREC=20001,ENDREC=30000
* SAVE picks up every record not selected by the OUTFILs above.
 OUTFIL FILES=04,SAVE
/*
PeterHolland
Global Moderator
Joined: 27 Oct 2009 Posts: 2481 Location: Netherlands, Amstelveen
enrico-sorichetti wrote:
The main issue with splitting is performance...
Alternate splitting is the fastest; it needs just one pass over the input file.
Sequential splitting, on the other hand, requires two passes: the first to get the count, the second to do the splitting.
I'm not sure I understand this "alternate splitting"; please enlighten me.
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10889 Location: italy
Hi Peter,
I just made up the terminology.
Given a dataset with 10 records
0,1,2,3,4,5,6,7,8,9
by alternating I meant
1st file 0,2,4,6,8
2nd file 1,3,5,7,9
that is, as you read each record, you write it to the next file of the split.
By sequential I meant
1st file 0,1,2,3,4
2nd file 5,6,7,8,9
That's all!
I was tempted to use the term oscillating (as in sort, using tapes as work files),
but I thought it would have been less clear.
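To make the two schemes concrete, here is a minimal Python sketch of the logic only (these are not SORT control statements, and the names `split_alternate` and `split_sequential` are made up for illustration). Alternate splitting can choose the output as each record arrives. Sequential splitting into equal parts has to know the total count before the first write, which is where the extra pass comes from.

```python
from typing import List, Sequence


def split_alternate(records: Sequence[str], n_files: int) -> List[List[str]]:
    """Round-robin split: record i goes to output i % n_files (one pass)."""
    outputs: List[List[str]] = [[] for _ in range(n_files)]
    for i, rec in enumerate(records):
        outputs[i % n_files].append(rec)
    return outputs


def split_sequential(records: Sequence[str], n_files: int) -> List[List[str]]:
    """Contiguous split: the total count must be known before writing."""
    total = len(records)          # stands in for the separate counting pass
    chunk = -(-total // n_files)  # ceiling division: records per output
    return [list(records[k * chunk:(k + 1) * chunk]) for k in range(n_files)]


if __name__ == "__main__":
    data = ["AAS", "AVG", "FSDF", "DGG", "SFF", "SDGDG", "GFG", "AGH"]
    print(split_alternate(data, 4))   # the layout OUTFIL ...,SPLIT produced
    print(split_sequential(data, 4))  # the layout the original poster wants
```

With the 8 sample records from the first post, the first call reproduces the unwanted layout and the second gives two consecutive records per file.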
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Hello,
What creates the large file?
If you have control of that process, modify it to generate a record count. Use the generated record count to determine the "split" values for the sort control statements.
sonali12_9
New User
Joined: 13 Feb 2009 Posts: 20 Location: United States of america
Thanks to everyone for your interest in helping me.
The large file is created by retrieving records from a DB2 table.
Actually, my requirement is to convert code from FOCUS to COBOL, since FOCUS takes a long time to execute. So I need to match the output of FOCUS and COBOL.
I am looking for an option to split the file sequentially, as doing it in COBOL would be inefficient and would increase the number of SELECT statements in the program.
Hi enrico,
As you said, "sequential splitting, on the other hand, requires two passes: the first to get the count, the second to do the splitting".
Is there any way to implement this logic if I have the total number of records in one file?
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Hello,
Yes.
Divide the total number of records by 4 and generate the appropriate sort control statements (like Alissa posted above). Instead of a SYSIN DD * you would use a SYSIN DD DSN=. . .
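As a rough illustration of the "divide and generate" step, the Python sketch below builds control statements in the same shape as Alissa's example from a known record count. The helper name `build_split_cards` is hypothetical, and spreading the remainder over the first files is my own choice rather than anything stated in this thread. How the text ends up in the dataset behind the SYSIN DD is left out.

```python
def build_split_cards(total_records: int, n_files: int = 4) -> str:
    """Build OUTFIL statements that cut the input into contiguous parts.

    Mirrors the pattern posted above: ENDREC on the first OUTFIL,
    STARTREC/ENDREC on the middle ones, SAVE on the last one.
    """
    if not 2 <= n_files <= 99:
        raise ValueError("this sketch uses a 2-digit FILES suffix (02..99)")
    if total_records < n_files:
        raise ValueError("need at least one record per output file")

    # base records per file; the first `extra` files take one more each
    base, extra = divmod(total_records, n_files)

    lines = [" SORT FIELDS=COPY"]  # leading blank keeps column 1 free
    start = 1
    for k in range(1, n_files):
        size = base + (1 if k <= extra else 0)
        end = start + size - 1
        if k == 1:
            lines.append(f" OUTFIL FILES={k:02d},ENDREC={end}")
        else:
            lines.append(f" OUTFIL FILES={k:02d},STARTREC={start},ENDREC={end}")
        start = end + 1
    # the last output takes whatever the earlier ranges did not select
    lines.append(f" OUTFIL FILES={n_files:02d},SAVE")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_split_cards(8))
```

For the 8-record sample in the first post this yields ranges 1-2, 3-4 and 5-6, with the last two records going to the SAVE output.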
sonali12_9
New User
Joined: 13 Feb 2009 Posts: 20 Location: United States of america
Thanks. It works now.
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Good to hear it is working - thanks for letting us know.
d