Selecting Records with SPLICE?


IBM Mainframe Forums -> DFSORT/ICETOOL

ozgurseyrek (New User, Turkey)
Posted: Fri May 29, 2009 2:03 pm

Hello,
I want to select records from a dataset which has 3,000,000,000 records.
I want to include records matching 220,000 different conditions.

I couldn't achieve that with INCLUDE statements, because there is a limit of about 2,000 INCLUDE COND conditions.

In the forum I saw a solution with "INREC IFTHEN":

Code:
         OPTION COPY
         INREC IFTHEN=(WHEN=(13,4,CH,EQ,C'0001',OR,
                             ...
                             13,4,CH,EQ,C'1999'),
                       OVERLAY=(81:C'Y')),
               IFTHEN=(WHEN=(13,4,CH,EQ,C'2000',OR,
                             ...
                             13,4,CH,EQ,C'3999'),
                       OVERLAY=(81:C'Y')),
               IFTHEN=(WHEN=(13,4,CH,EQ,C'4000',OR,
                             ...
                             13,4,CH,EQ,C'4999'),
                       OVERLAY=(81:C'Y'))
              ......
         OUTFIL INCLUDE=(81,1,CH,EQ,C'Y'),
                OUTREC=(1,80)

But with that solution, the performance will be very bad.

I saw another suggested solution using SPLICE, but I don't know how to use SPLICE.

Can you give me a sample SPLICE statement that handles 220,000 include conditions against 3,000,000,000 records?
Thank you.

dick scherrer (Moderator Emeritus, Inside the Matrix)
Posted: Fri May 29, 2009 8:08 pm

Hello,

When you mention 220,000 conditions, do you mean 220,000 keys to be compared against the larger file?

It may help if you post some sample input data and some sample "conditions" and the output you want from those inputs.

Mention the recfm and lrecl of the files and the relevant positions in the sample data.

ozgurseyrek (New User, Turkey)
Posted: Fri May 29, 2009 8:19 pm

Hello,
The sample SORT control statements are like this.
The input dataset's LRECL is 480 and RECFM is FB, and the output will be the same.

The input dataset has 3,000,000,000 records.

Code:
  SORT FIELDS=COPY                                 
  INCLUDE COND=(23,16,CH,EQ,C'TESTDATA00000001',OR,
                23,16,CH,EQ,C'TESTDATA00000002',OR,
                23,16,CH,EQ,C'TESTDATA00000003',OR,
                23,16,CH,EQ,C'TESTDATA00000004',OR,
  ...                                               
  ...                                               
  ...                                               
                23,16,CH,EQ,C'TESTDATA00219998',OR,
                23,16,CH,EQ,C'TESTDATA00219999',OR,
                23,16,CH,EQ,C'TESTDATA00220000')   

Frank Yaeger (DFSORT Developer, San Jose, CA)
Posted: Fri May 29, 2009 8:55 pm

Can you have duplicates in the input file (for example, two records with C'TESTDATA00000001')?

Are you willing to put your constants in an input file with RECFM=FB and LRECL=480?

ozgurseyrek (New User, Turkey)
Posted: Fri May 29, 2009 9:02 pm

Yes, there can be duplicate records.
The input file is RECFM=FB and LRECL=480.

Frank Yaeger (DFSORT Developer, San Jose, CA)
Posted: Fri May 29, 2009 9:20 pm

Yes, I know the input file is RECFM=FB and LRECL=480. That's not what I asked.

Let me ask it another way. You have 220000 conditions (=constants). Can you create another FB input file that has one record for each of the 220000 constants with the constant in positions 1-16 like this:

Code:

TESTDATA00000001
TESTDATA00000002
...

ozgurseyrek (New User, Turkey)
Posted: Fri May 29, 2009 9:47 pm

Yes, I can create a file just like your sample.

Skolusu (Senior Member, San Jose)
Posted: Fri May 29, 2009 10:11 pm

Create the condition file as FB 480 so that we can concatenate it to the input file as is, with the following layout:

Have '$$' in positions 1-2 and the key in position 23 for 16 bytes, with the rest blanks.

For example, like this:

Code:

----+----1----+----2----+----3----+----4----+
$$                    TESTDATA00000001       
$$                    TESTDATA00000002       
$$                    TESTDATA00000003       
$$                    TESTDATA00000004       
$$                    TESTDATA00000005       
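
(Editorial aside, not from the original post: if the 220,000 keys are already available as a plain list, a DFSORT copy step along these lines could build that 480-byte condition file. The step name and dataset names are placeholders, and it assumes the 16-byte key starts in position 1 of the list file.)

Code:

//BLDCOND  EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=your 16 byte key list file,DISP=SHR
//SORTOUT  DD DSN=your 480 byte condition file,
//            DISP=(NEW,CATLG,DELETE),SPACE=(CYL,(5,5),RLSE),
//            DCB=(RECFM=FB,LRECL=480)
//SYSIN    DD *
* '$$' IN 1-2, KEY FROM INPUT 1-16 PLACED AT 23-38, BLANKS OUT TO 480
  OPTION COPY
  INREC BUILD=(1:C'$$',23:1,16,480:X)
/*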


Once you create that dataset, concatenate it with the input file, make sure it is first in the concatenation, and use this one-pass solution, which will give you the desired results:

Code:

//STEP0100 EXEC PGM=SORT                                               
//SYSOUT   DD SYSOUT=*                                                 
//SORTIN   DD DSN=your created 480 byte condition file,DISP=SHR
//         DD DSN=your 480 byte input file,DISP=SHR
//SORTOUT  DD SYSOUT=*                                                 
//SYSIN    DD *                                                     
  SORT FIELDS=(23,16,CH,A),EQUALS                                   
  OUTREC IFTHEN=(WHEN=GROUP,BEGIN=(1,2,CH,EQ,C'$$'),PUSH=(481:23,16))
                                                                     
  OUTFIL BUILD=(1,480),                                             
  INCLUDE=(23,16,CH,EQ,481,16,CH,AND,1,2,CH,NE,C'$$')               
/*
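
(Editorial note, not part of the original post: a made-up two-key illustration of why this works. With EQUALS, each '$$' condition record sorts ahead of the data records that share its key in 23-38, so WHEN=GROUP starts a new group at every '$$' record and PUSH copies that key into positions 481-496 of every record in the group. The OUTFIL then keeps only the non-'$$' records whose own key equals the pushed key.)

Code:

 Pos 1-2   Pos 23-38          Pushed key (481-496)   Result
 $$        TESTDATA00000001   TESTDATA00000001       dropped ('$$' marker record)
 (data)    TESTDATA00000001   TESTDATA00000001       kept   (own key = pushed key)
 (data)    TESTDATA00000005   TESTDATA00000001       dropped (key not in condition file)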

ozgurseyrek (New User, Turkey)
Posted: Fri May 29, 2009 11:18 pm

Thanks for your help, but "WHEN=GROUP" gave an error.
I think it is related to the DFSORT version. We have z/OS DFSORT V1R5.

chaky (New User, Bangalore)
Posted: Fri May 29, 2009 11:48 pm

ozgurseyrek,

I am not aware of a way to give 220,000 conditions at once, but if your conditions are as shown in your example, then I think we can do it.
Look at the code below and see if it makes sense for your problem:
Code:
//STEP1    EXEC PGM=SORT
//SORTIN   DD   DSN=........ input file,
//          DISP=SHR                                 
//SORTOUT  DD   DSN=........ output file,
//         DISP=(NEW,CATLG,DELETE),                 
//         SPACE=(TRK,(500,150),RLSE),             
//         DCB=(RECFM=FB,LRECL=480,BLKSIZE=4800)       
//SYSOUT   DD   SYSOUT=*                             
//SYSPRINT DD   SYSOUT=*                             
//SYSIN    DD   *                                   
 SORT FIELDS=COPY                                   
 INCLUDE COND=(23,10,CH,EQ,C'TESTDATA00',AND,       
               33,6,CH,GE,C'000001',AND,             
               33,6,CH,LE,C'220000')                 
/*                                                   
//*                                                 

I hope this is helpful for you.

Skolusu (Senior Member, San Jose)
Posted: Fri May 29, 2009 11:52 pm

ozgurseyrek wrote:
Thanks for your help, but "WHEN=GROUP" gave an error. I think it is related to the DFSORT version. We have z/OS DFSORT V1R5.


Use this DFSORT/ICETOOL solution:

Code:

//STEP0100 EXEC PGM=ICETOOL   
//TOOLMSG  DD SYSOUT=*         
//DFSMSG   DD SYSOUT=*         
//IN       DD DSN=your created 480 byte condition file,DISP=SHR
//         DD DSN=your 480 byte input file,DISP=SHR
//OUT      DD SYSOUT=*                                               
//TOOLIN   DD *                                                       
  SPLICE FROM(IN) TO(OUT) ON(23,16,CH) WITH(1,480) WITHALL USING(CTL1)
//CTL1CNTL DD *                                                       
  INREC IFTHEN=(WHEN=(1,2,CH,EQ,C'$$'),OVERLAY=(481:1,2))             
  OUTFIL FNAMES=OUT,BUILD=(1,480),OMIT=(481,1,CH,EQ,C' ')             
/*
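
(Editorial note, not part of the original post: roughly how this SPLICE works. The INREC step copies '$$' into positions 481-482 of the condition records only. Because the condition file is first in the concatenation, the '$$' record becomes the base record for its 16-byte key, and WITHALL splices positions 1-480 of every matching data record over that base, so each matched data record comes out with '$$' still at 481-482. A data record whose key has no '$$' record is either eliminated as a lone base record (no KEEPBASE/KEEPNODUPS) or, if its key repeats in the data, is spliced with blanks at 481 and removed by the OMIT.)

Code:

 Pos 1-2   Pos 23-38          SPLICE role       Result
 $$        TESTDATA00000001   base              not written on its own
 (data)    TESTDATA00000001   overlay           written with '$$' at 481-482 -> kept
 (data)    TESTDATA00000005   base (no '$$')    eliminated, or spliced with blanks
                                                at 481 and removed by the OMIT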

ozgurseyrek (New User, Turkey)
Posted: Sat May 30, 2009 12:31 am

Thank you very much Skolusu.
It worked with a small change to the OMIT condition:
Code:
OMIT=(1,2,CH,EQ,C'$$',OR,
      481,1,CH,EQ,C' ') 

ozgurseyrek (New User, Turkey)
Posted: Sat May 30, 2009 12:33 am

Answer to chaky:

Unfortunately all 16 characters can change; 'TESTDATA00' is not constant.
Thank you.

Skolusu (Senior Member, San Jose)
Posted: Sat May 30, 2009 2:45 am

ozgurseyrek wrote:
Thank you very much Skolusu.
It worked with a small change to the OMIT condition:
Code:
OMIT=(1,2,CH,EQ,C'$$',OR,
      481,1,CH,EQ,C' ') 


ozgurseyrek,

Why do you need to add the condition for '$$'? Do you have duplicate key values in your condition file? The '$$' records are the base records: for a matching key the base is omitted in the splice, and for an unmatched key it is eliminated because I did not specify the KEEPBASE or KEEPNODUPS parameters. Unless you have duplicates in your condition file, I don't see any reason to change the OMIT card.
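
(Editorial aside, not from the original post: if duplicates in the condition file cannot be avoided, they could instead be removed up front with a simple DFSORT step run against the 480-byte condition file, so the original SPLICE job can stay unchanged. SUM FIELDS=NONE keeps one record per 23,16 key.)

Code:

  SORT FIELDS=(23,16,CH,A)
  SUM FIELDS=NONE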

ozgurseyrek (New User, Turkey)
Posted: Sun May 31, 2009 4:29 pm

Yes, you are right, there were duplicate key values in the condition file.
I removed the duplicates from the condition file and your original code worked.
Thanks.