shobhit garg
New User
Joined: 09 Jan 2009 Posts: 8 Location: Pune
Hi,
I have a file with about 2,500,000 (2.5 million) lines.
Since it is a big dataset, it takes a hell of a lot of time to run, so the idea is to divide it into a number of datasets, each containing only 25,000 (twenty-five thousand) lines.
I have been trying to do this but have not succeeded. Can anybody help me out?
Let me know if you need more information.
expat
Global Moderator
Joined: 14 Mar 2007 Posts: 8797 Location: Welsh Wales
First, please search the forum, as this question has been asked, and answered, quite a few times already.
Quote:
Since it is a big dataset, it takes a hell of a lot of time to run, so the idea is to divide it into a number of datasets, each containing only 25,000 (twenty-five thousand) lines.
Do you mean that it takes a hell of a long time to process? If so, what do you expect to gain by splitting the file into multiple files? You will still need to process all of the records at some stage, and the only gain that I can see is if you run multiple processes in parallel. And that in itself can spawn a number of other problems to deal with.
Please give some more information about how the file is processed and, of course, answer the bog-standard questions: the LRECL and RECFM of the file.
shobhit garg
New User
Joined: 09 Jan 2009 Posts: 8 Location: Pune
Thanks for the answer.
Actually, I am not sure myself how these files will be processed. My job is just to create JCL that splits the file into a number of files, each containing only 25,000 lines.
ksk
Active User
Joined: 08 Jun 2006 Posts: 355 Location: New York
If you have DFSORT in your shop, you can split the input file into multiple files. Just search the forum and you will find examples using DFSORT.
Or you can use the SPLITBY parameter of DFSORT's OUTFIL statement. Below is the control card.
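Code:
//SYSIN    DD *
* Sort ascending on the 5-byte FS (floating sign) field at position 21,
* then write the output in groups of 10 records, rotating among OUT1-OUT3.
  SORT FIELDS=(21,5,FS,A)
  OUTFIL FNAMES=(OUT1,OUT2,OUT3),SPLITBY=10
/*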
With SPLITBY=10, DFSORT writes the output in groups of 10 records, rotating among the listed files, so if your input file contains 30 records, this card splits it into 3 files of 10 records each. You have to code 3 DD statements whose names match the DD names in OUTFIL FNAMES.
But in your case (2,500,000 lines at 25,000 per file), you would have to code 100 DD names. I am not sure how effective this is in your case.
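To check the arithmetic before coding 100 DD statements by hand, here is a rough sketch in Python (not something that runs on the mainframe). It only models the rotation that SPLITBY performs and builds the text of a matching OUTFIL card. The function names and the DD names OUT001 to OUT100 are made up for illustration, and the card layout assumes the usual rule that control statements stay within columns 2 to 71 and continue after a trailing comma.

```python
def splitby_counts(total_records, group_size, dd_names):
    """Model SPLITBY=group_size: groups of records rotate among dd_names."""
    if group_size <= 0 or not dd_names:
        raise ValueError("need a positive group size and at least one DD name")
    counts = {name: 0 for name in dd_names}
    full_groups, leftover = divmod(total_records, group_size)
    for group in range(full_groups):
        counts[dd_names[group % len(dd_names)]] += group_size
    # A final short group goes to whichever file is next in the rotation.
    counts[dd_names[full_groups % len(dd_names)]] += leftover
    return counts


def outfil_card(dd_names, group_size, max_col=71):
    """Build OUTFIL control-statement lines that fit in columns 2-71."""
    if not dd_names:
        raise ValueError("need at least one DD name")
    indent = " " * 16  # column 1 stays blank; the rest is cosmetic alignment
    lines = []
    current = " OUTFIL FNAMES=("
    for i, name in enumerate(dd_names):
        is_last = i == len(dd_names) - 1
        piece = name + ("),"
                        if is_last else ",")
        if len(current) + len(piece) > max_col:
            # Break after a comma so the next line is read as a continuation.
            lines.append(current)
            current = indent + piece
        else:
            current += piece
    lines.append(current)
    lines.append(indent + f"SPLITBY={group_size}")
    return lines


if __name__ == "__main__":
    # Hypothetical DD names for the 100 output files.
    names = [f"OUT{i:03d}" for i in range(1, 101)]
    counts = splitby_counts(2_500_000, 25_000, names)
    print(len(counts), "files,", set(counts.values()), "records per file")
    print("\n".join(outfil_card(names, 25_000)))
```

Note that if the input holds more than 100 x 25,000 records, the rotation wraps around and the extra groups land in the first files again, so some files would end up with more than 25,000 records. Each DD name in the generated card still needs its own DD statement in the JCL.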
Arun Raj
Moderator
Joined: 17 Oct 2006 Posts: 2481 Location: @my desk
shobhit garg
New User
Joined: 09 Jan 2009 Posts: 8 Location: Pune
Thanks to all of you,
SORT with SPLITBY is really useful, and it worked for me. Once again, many thanks to all of you.