ozgurseyrek New User Joined: 22 Feb 2008 Posts: 70 Location: Turkey
Hello,
I want to select records from a dataset that has 3,000,000,000 records.
I need to include the records that match any of 220,000 different conditions.
I couldn't achieve that with an INCLUDE statement, because there is a limit of about 2000 INCLUDE COND conditions.
On the forum I saw a solution using "INREC IFTHEN":
Code:
  OPTION COPY
  INREC IFTHEN=(WHEN=(13,4,CH,EQ,C'0001',OR,
  ...
                      13,4,CH,EQ,C'1999'),
                OVERLAY=(81:C'Y')),
        IFTHEN=(WHEN=(13,4,CH,EQ,C'2000',OR,
  ...
                      13,4,CH,EQ,C'3999'),
                OVERLAY=(81:C'Y')),
        IFTHEN=(WHEN=(13,4,CH,EQ,C'4000',OR,
  ...
                      13,4,CH,EQ,C'4999'),
                OVERLAY=(81:C'Y'))
  ......
  OUTFIL INCLUDE=(81,1,CH,EQ,C'Y'),
         OUTREC=(1,80)
But with that solution, performance would be very poor.
I also saw a suggestion to use SPLICE, but I don't know how to use it.
Can you give me a sample SPLICE statement that applies 220,000 include conditions to 3,000,000,000 records?
Thank you.
dick scherrer Moderator Emeritus Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello,
When you mention 220,000 conditions, do you mean 220,000 keys to be compared against the larger file?
It may help if you post some sample input data, some sample "conditions", and the output you want from those inputs.
Mention the RECFM and LRECL of the files and the relevant positions in the sample data.
ozgurseyrek New User Joined: 22 Feb 2008 Posts: 70 Location: Turkey
Hello,
The input dataset is RECFM=FB with LRECL=480, and the output will be the same.
The input dataset has 3,000,000,000 records.
The sample SORT statements look like this:
Code:
SORT FIELDS=COPY
INCLUDE COND=(23,16,CH,EQ,C'TESTDATA00000001',OR,
23,16,CH,EQ,C'TESTDATA00000002',OR,
23,16,CH,EQ,C'TESTDATA00000003',OR,
23,16,CH,EQ,C'TESTDATA00000004',OR,
...
...
...
23,16,CH,EQ,C'TESTDATA00219998',OR,
23,16,CH,EQ,C'TESTDATA00219999',OR,
23,16,CH,EQ,C'TESTDATA00220000')
Frank Yaeger DFSORT Developer Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
Can you have duplicates in the input file (for example, two records with C'TESTDATA00000001')?
Are you willing to put your constants in an input file with RECFM=FB and LRECL=480?
ozgurseyrek New User Joined: 22 Feb 2008 Posts: 70 Location: Turkey
Yes, there can be duplicate records.
The input file is RECFM=FB with LRECL=480.
Frank Yaeger DFSORT Developer Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
Yes, I know the input file is RECFM=FB and LRECL=480. That's not what I asked.
Let me ask it another way. You have 220000 conditions (=constants). Can you create another FB input file that has one record for each of the 220000 constants with the constant in positions 1-16 like this:
Code:
TESTDATA00000001
TESTDATA00000002
...
ozgurseyrek New User Joined: 22 Feb 2008 Posts: 70 Location: Turkey
Yes, I can create a file just like your sample.
Skolusu Senior Member Joined: 07 Dec 2007 Posts: 2205 Location: San Jose
Create the condition file as FB with LRECL=480, so that it can be concatenated with the input file as is, using the following layout:
put '$$' in positions 1-2 and the 16-byte key starting at position 23, and fill the rest with spaces.
For example:
Code:
----+----1----+----2----+----3----+----4----+
$$                    TESTDATA00000001
$$                    TESTDATA00000002
$$                    TESTDATA00000003
$$                    TESTDATA00000004
$$                    TESTDATA00000005
Once you have created that dataset, concatenate it with the input file, making sure it is first in the list, and use this one-pass solution, which will give you the desired results:
Code:
//STEP0100 EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=your created 480 byte condition file,DISP=SHR
//         DD DSN=your 480 byte input file,DISP=SHR
//SORTOUT  DD SYSOUT=*
//SYSIN    DD *
  SORT FIELDS=(23,16,CH,A),EQUALS
  OUTREC IFTHEN=(WHEN=GROUP,BEGIN=(1,2,CH,EQ,C'$$'),PUSH=(481:23,16))
  OUTFIL BUILD=(1,480),
  INCLUDE=(23,16,CH,EQ,481,16,CH,AND,1,2,CH,NE,C'$$')
/*
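For readers unfamiliar with WHEN=GROUP, the idea behind this job can be sketched in Python. This is only an illustration of the matching logic, not DFSORT itself: the function name, the `key_of` parameter, and the tuple-based records are all hypothetical, and the whole input is assumed to fit in memory, which the real 3,000,000,000-record file would not.

```python
def select_with_group_push(condition_keys, data_records, key_of):
    """Return the data records whose key appears in condition_keys."""
    # Condition rows come first, mirroring the SORTIN concatenation order.
    rows = [(key, False, None) for key in condition_keys]
    rows += [(key_of(rec), True, rec) for rec in data_records]

    # Python's sort is stable, so rows with equal keys keep their input
    # order; this plays the role of the EQUALS option.
    rows.sort(key=lambda row: row[0])

    pushed_key = None  # stands in for the key pushed to positions 481-496
    selected = []
    for key, is_data, rec in rows:
        if not is_data:
            pushed_key = key          # a '$$' row starts a new group
        elif key == pushed_key:
            selected.append(rec)      # data row inside a matching group
    return selected


if __name__ == "__main__":
    data = [("A", 1), ("C", 2), ("B", 3), ("A", 4)]
    print(select_with_group_push(["B", "A"], data, key_of=lambda r: r[0]))
```

As in the job above, the selected records come out in key order rather than in their original order, because the whole input is sorted on the key first.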
ozgurseyrek New User Joined: 22 Feb 2008 Posts: 70 Location: Turkey
Thanks for your help, but "WHEN=GROUP" produced an error.
I think it is related to our SORT version. We have "z/OS DFSORT V1R5".
chaky New User Joined: 28 May 2009 Posts: 20 Location: Bangalore
ozgurseyrek,
I am not aware of a way to specify 220,000 conditions at once, but if your conditions are exactly as shown in your example, I think it can be done with a range check.
See whether the code below fits your problem:
Code:
//STEP1 EXEC PGM=SORT
//SORTIN DD DSN=........ input file,
// DISP=SHR
//SORTOUT DD DSN=........ output file,
// DISP=(NEW,CATLG,DELETE),
// SPACE=(TRK,(500,150),RLSE),
// DCB=(RECFM=FB,LRECL=480,BLKSIZE=4800)
//SYSOUT DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SYSIN DD *
SORT FIELDS=COPY
INCLUDE COND=(23,10,CH,EQ,C'TESTDATA00',AND,
33,6,CH,GE,C'000001',AND,
33,6,CH,LE,C'220000')
/*
//*
I hope it is helpful for you.
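The range test above works because a character comparison of zero-padded digits orders them the same way as the numbers they represent. A minimal Python sketch of the same predicate follows, assuming the 16-byte key has already been extracted from positions 23-38 and that its last six characters are always digits; the function name is hypothetical.

```python
def in_key_range(key: str) -> bool:
    """Mimic the three INCLUDE conditions on a 16-character key."""
    prefix, suffix = key[:10], key[10:16]
    # Fixed prefix, then a string comparison on the zero-padded number.
    return prefix == "TESTDATA00" and "000001" <= suffix <= "220000"


if __name__ == "__main__":
    for sample in ("TESTDATA00000001", "TESTDATA00220000", "TESTDATA00220001"):
        print(sample, in_key_range(sample))
```

This only helps when the keys form one contiguous range behind a constant prefix; arbitrary keys still need a lookup against a key file.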
Skolusu Senior Member Joined: 07 Dec 2007 Posts: 2205 Location: San Jose
ozgurseyrek wrote:
Thanks for your help, but "WHEN=GROUP" produced an error.
I think it is related to our SORT version. We have "z/OS DFSORT V1R5".
Use this DFSORT/ICETOOL solution instead:
Code:
//STEP0100 EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//IN       DD DSN=your created 480 byte condition file,DISP=SHR
//         DD DSN=your 480 byte input file,DISP=SHR
//OUT      DD SYSOUT=*
//TOOLIN   DD *
  SPLICE FROM(IN) TO(OUT) ON(23,16,CH) WITH(1,480) WITHALL USING(CTL1)
//CTL1CNTL DD *
  INREC IFTHEN=(WHEN=(1,2,CH,EQ,C'$$'),OVERLAY=(481:1,2))
  OUTFIL FNAMES=OUT,BUILD=(1,480),OMIT=(481,1,CH,EQ,C' ')
/*
ozgurseyrek New User Joined: 22 Feb 2008 Posts: 70 Location: Turkey
Thank you very much, Skolusu.
It worked after a small change to the OMIT condition:
Code:
OMIT=(1,2,CH,EQ,C'$$',OR,
481,1,CH,EQ,C' ')
ozgurseyrek New User Joined: 22 Feb 2008 Posts: 70 Location: Turkey
In answer to chaky:
Unfortunately, all 16 characters can vary;
'TESTDATA00' is not a constant.
Thank you.
Skolusu Senior Member Joined: 07 Dec 2007 Posts: 2205 Location: San Jose
ozgurseyrek wrote:
Thank you very much, Skolusu.
It worked after a small change to the OMIT condition:
Code:
OMIT=(1,2,CH,EQ,C'$$',OR,
481,1,CH,EQ,C' ')
ozgurseyrek,
Why do you need to add the condition for $$? Do you have duplicate key values in your condition file? The $$ records are the base records. For a matching key, the base record is dropped by the splice; for an unmatched key, it is eliminated because I did not specify the KEEPBASE or KEEPNODUPS parameter. Unless you have duplicates in your condition file, I don't see any reason to change the OMIT condition.
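The behaviour described here can be made concrete with a small Python model of the SPLICE step. It is a simplified sketch, not ICETOOL: the function name, the `key_of` parameter, and the `("$$", key)` stand-in for a condition record are hypothetical. It models the first record of each key as the base, every later record with that key as an overlay that replaces positions 1-480, and the marker at position 481 as always coming from the base.

```python
from itertools import groupby


def splice_select(condition_keys, data_records, key_of):
    """Model SPLICE ... WITH(1,480) WITHALL plus OMIT on a blank 481."""
    # Condition rows first, as in the IN concatenation; True marks '$$'.
    rows = [(key, True, ("$$", key)) for key in condition_keys]
    rows += [(key_of(rec), False, rec) for rec in data_records]
    rows.sort(key=lambda row: row[0])  # stable, so input order is kept

    out = []
    for _, group in groupby(rows, key=lambda row: row[0]):
        group = list(group)
        base_is_condition = group[0][1]   # the marker at 481 comes from the base
        # Keys with a single record produce nothing (no KEEPNODUPS);
        # the base itself is never written (no KEEPBASE).
        for _, _, payload in group[1:]:
            if base_is_condition:         # OMIT drops a blank marker
                out.append(payload)
    return out


if __name__ == "__main__":
    data = [("A", 1), ("C", 2), ("C", 5), ("B", 3), ("A", 4)]
    print(splice_select(["A", "B", "D"], data, key_of=lambda r: r[0]))
    # A duplicated condition key leaks a '$$' record into the output:
    print(splice_select(["A", "A"], [("A", 1)], key_of=lambda r: r[0]))
```

In this model the second call returns the `("$$", "A")` stand-in ahead of the data record, because the duplicate condition row is treated as an overlay of the first one. That is the situation in which the extra `1,2,CH,EQ,C'$$'` test in the OMIT condition makes a difference.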
ozgurseyrek New User Joined: 22 Feb 2008 Posts: 70 Location: Turkey
Yes, you are right: there were duplicate key values in the condition file.
I removed the duplicates from the condition file, and your original code worked.
Thanks.