Selecting Records with SPLICE?

 
ozgurseyrek

New User


Joined: 22 Feb 2008
Posts: 70
Location: Turkey

Posted: Fri May 29, 2009 2:03 pm    Post subject: Selecting Records with SPLICE?

Hello,
I want to select records from a dataset that has 3.000.000.000 records.
I want to include the records that match any of 220.000 different conditions.

I couldn't do that with an INCLUDE statement, because there is a limit of about 2000 INCLUDE COND conditions.

On the forum I saw a solution that uses "INREC IFTHEN":

Code:
         OPTION COPY
         INREC IFTHEN=(WHEN=(13,4,CH,EQ,C'0001',OR,
                             ...
                             13,4,CH,EQ,C'1999'),
                       OVERLAY=(81:C'Y')),
               IFTHEN=(WHEN=(13,4,CH,EQ,C'2000',OR,
                             ...
                             13,4,CH,EQ,C'3999'),
                       OVERLAY=(81:C'Y')),
               IFTHEN=(WHEN=(13,4,CH,EQ,C'4000',OR,
                             ...
                             13,4,CH,EQ,C'4999'),
                       OVERLAY=(81:C'Y'))
             ......
         OUTFIL INCLUDE=(81,1,CH,EQ,C'Y'),
                OUTREC=(1,80)

But with that solution, the performance will be very bad.

I also saw a solution suggested that uses SPLICE, but I don't know how to use SPLICE.

Can you give me a sample SPLICE statement that applies 220000 include conditions to 3000000000 records?
Thank you.

dick scherrer

Site Director


Joined: 23 Nov 2006
Posts: 19270
Location: Inside the Matrix

Posted: Fri May 29, 2009 8:08 pm

Hello,

When you mention 220.000 conditions, do you mean 220.000 keys to be compared against the larger file?

It may help if you post some sample input data and some sample "conditions" and the output you want from those inputs.

Mention the recfm and lrecl of the files and the relevant positions in the sample data.
ozgurseyrek

New User


Joined: 22 Feb 2008
Posts: 70
Location: Turkey

Posted: Fri May 29, 2009 8:19 pm

Hello,
A sample of the SORT statements is shown below.
The input dataset is FB with LRECL=480,
and the output will be the same.

The input dataset has 3000000000 records.

Code:
  SORT FIELDS=COPY                                 
  INCLUDE COND=(23,16,CH,EQ,C'TESTDATA00000001',OR,
                23,16,CH,EQ,C'TESTDATA00000002',OR,
                23,16,CH,EQ,C'TESTDATA00000003',OR,
                23,16,CH,EQ,C'TESTDATA00000004',OR,
  ...                                               
  ...                                               
  ...                                               
                23,16,CH,EQ,C'TESTDATA00219998',OR,
                23,16,CH,EQ,C'TESTDATA00219999',OR,
                23,16,CH,EQ,C'TESTDATA00220000')   
Frank Yaeger

DFSORT Moderator


Joined: 15 Feb 2005
Posts: 7130
Location: San Jose, CA

Posted: Fri May 29, 2009 8:55 pm

Can you have duplicates in the input file (for example, two records with C'TESTDATA00000001')?

Are you willing to put your constants in an input file with RECFM=FB and LRECL=480?
ozgurseyrek

New User


Joined: 22 Feb 2008
Posts: 70
Location: Turkey

Posted: Fri May 29, 2009 9:02 pm

Yes, there can be duplicate records.
The input file is RECFM=FB and LRECL=480.
Frank Yaeger

DFSORT Moderator


Joined: 15 Feb 2005
Posts: 7130
Location: San Jose, CA

Posted: Fri May 29, 2009 9:20 pm

Yes, I know the input file is RECFM=FB and LRECL=480. That's not what I asked.

Let me ask it another way. You have 220000 conditions (=constants). Can you create another FB input file that has one record for each of the 220000 constants with the constant in positions 1-16 like this:

Code:

TESTDATA00000001
TESTDATA00000002
...
ozgurseyrek

New User


Joined: 22 Feb 2008
Posts: 70
Location: Turkey

Posted: Fri May 29, 2009 9:47 pm

Yes, I can create a file just like your sample.
Skolusu

Senior Member


Joined: 07 Dec 2007
Posts: 2205
Location: San Jose

Posted: Fri May 29, 2009 10:11 pm

Create the condition file as FB with LRECL=480, so that it can be concatenated with the input file as is, using the following layout.

Put '$$' in positions 1-2 and the 16-byte key starting at position 23, with spaces everywhere else.

For example:

Code:

----+----1----+----2----+----3----+----4----+
$$                    TESTDATA00000001       
$$                    TESTDATA00000002       
$$                    TESTDATA00000003       
$$                    TESTDATA00000004       
$$                    TESTDATA00000005       


Once you have created that dataset, concatenate it with the input file, making sure it is first in the concatenation, and use this one-pass solution, which will give you the desired results:

Code:

//STEP0100 EXEC PGM=SORT                                               
//SYSOUT   DD SYSOUT=*                                                 
//SORTIN   DD DSN=your created 480 byte condition file,DISP=SHR
//         DD DSN=your 480 byte input file,DISP=SHR
//SORTOUT  DD SYSOUT=*                                                 
//SYSIN    DD *                                                     
  SORT FIELDS=(23,16,CH,A),EQUALS                                   
  OUTREC IFTHEN=(WHEN=GROUP,BEGIN=(1,2,CH,EQ,C'$$'),PUSH=(481:23,16))
                                                                     
  OUTFIL BUILD=(1,480),                                             
  INCLUDE=(23,16,CH,EQ,481,16,CH,AND,1,2,CH,NE,C'$$')               
/*
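
For readers who do not use DFSORT every day, the selection logic of the job above can be modelled in a few lines of Python. This is only a rough sketch of the idea, not the job itself: `make_record` and `select_with_group` are hypothetical names, the records are plain strings, and Python compares characters in its own order rather than EBCDIC (which changes the sort order but not which records match). It also assumes that no data record starts with '$$'.

```python
KEY = slice(22, 38)  # positions 23-38 (1-based) hold the 16-byte key


def make_record(prefix: str, key: str) -> str:
    """Build a 480-byte record: prefix in 1-2, key in 23-38, blanks elsewhere."""
    return (prefix.ljust(22) + key.ljust(16)).ljust(480)


def select_with_group(cond_records, data_records):
    # Condition file first, then a stable sort on the key: this mirrors
    # SORT FIELDS=(23,16,CH,A),EQUALS over the concatenated SORTIN.
    merged = sorted(cond_records + data_records, key=lambda r: r[KEY])
    pushed_key = None  # no '$$' record seen yet
    selected = []
    for rec in merged:
        if rec.startswith("$$"):
            pushed_key = rec[KEY]   # BEGIN= starts a group; PUSH carries its key
        elif rec[KEY] == pushed_key:
            selected.append(rec)    # data record whose key equals the pushed key
    return selected
```

Because the '$$' record sorts ahead of the data records that share its key, every data record with a listed key sees its own key pushed into it, while records with unlisted keys see the key of an earlier group and are dropped.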
ozgurseyrek

New User


Joined: 22 Feb 2008
Posts: 70
Location: Turkey

Posted: Fri May 29, 2009 11:18 pm

Thanks for your help, but "WHEN=GROUP" gave an error.
I think it is related to the SORT version. We have "Z/OS DFSORT V1R5".
chaky

New User


Joined: 28 May 2009
Posts: 20
Location: Bangalore

Posted: Fri May 29, 2009 11:48 pm

ozgurseyrek,

I don't know how to specify 220000 conditions at once, but if your conditions are as shown in your example, then I think we can do it.
Take a look at the code below and see whether it fits your problem:
Code:
//STEP1    EXEC PGM=SORT
//SORTIN   DD   DSN=........ input file,
//          DISP=SHR                                 
//SORTOUT  DD   DSN=........ output file,
//         DISP=(NEW,CATLG,DELETE),                 
//         SPACE=(TRK,(500,150),RLSE),             
//         DCB=(RECFM=FB,LRECL=480,BLKSIZE=4800)       
//SYSOUT   DD   SYSOUT=*                             
//SYSPRINT DD   SYSOUT=*                             
//SYSIN    DD   *                                   
 SORT FIELDS=COPY                                   
 INCLUDE COND=(23,10,CH,EQ,C'TESTDATA00',AND,       
               33,6,CH,GE,C'000001',AND,             
               33,6,CH,LE,C'220000')                 
/*                                                   
//*                                                 

I hope it is helpful for you.
Skolusu

Senior Member


Joined: 07 Dec 2007
Posts: 2205
Location: San Jose

Posted: Fri May 29, 2009 11:52 pm

ozgurseyrek wrote:
Thanks for your help, but "WHEN=GROUP" gave an error.
I think it is related to the SORT version. We have "Z/OS DFSORT V1R5".


Use this DFSORT/ICETOOL solution:

Code:

//STEP0100 EXEC PGM=ICETOOL   
//TOOLMSG  DD SYSOUT=*         
//DFSMSG   DD SYSOUT=*         
//IN       DD DSN=your created 480 byte condition file,DISP=SHR
//         DD DSN=your 480 byte input file,DISP=SHR
//OUT      DD SYSOUT=*                                               
//TOOLIN   DD *                                                       
  SPLICE FROM(IN) TO(OUT) ON(23,16,CH) WITH(1,480) WITHALL USING(CTL1)
//CTL1CNTL DD *                                                       
  INREC IFTHEN=(WHEN=(1,2,CH,EQ,C'$$'),OVERLAY=(481:1,2))             
  OUTFIL FNAMES=OUT,BUILD=(1,480),OMIT=(481,1,CH,EQ,C' ')             
/*
ozgurseyrek

New User


Joined: 22 Feb 2008
Posts: 70
Location: Turkey

Posted: Sat May 30, 2009 12:31 am

Thank you very much, Skolusu.
It worked with a small change to the OMIT condition:
Code:
OMIT=(1,2,CH,EQ,C'$$',OR,
      481,1,CH,EQ,C' ') 
ozgurseyrek

New User


Joined: 22 Feb 2008
Posts: 70
Location: Turkey

Posted: Sat May 30, 2009 12:33 am

Reply to chaky:

Unfortunately, all 16 characters can vary;
'TESTDATA00' is not constant.
Thank you.
Skolusu

Senior Member


Joined: 07 Dec 2007
Posts: 2205
Location: San Jose

Posted: Sat May 30, 2009 2:45 am

ozgurseyrek wrote:
Thank you very much, Skolusu.
It worked with a small change to the OMIT condition:
Code:
OMIT=(1,2,CH,EQ,C'$$',OR,
      481,1,CH,EQ,C' ') 


ozgurseyrek,

Why do you need to add the condition for $$? Do you have duplicate key values in your condition file? The $$ records are the base records: for a matching key the base record is not written by SPLICE, and for an unmatched key it is eliminated because I did not specify the KEEPBASE or KEEPNODUPS parameters. Unless you have duplicates in your condition file, I don't see any reason to change the OMIT statement.
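
The base-record behaviour described here can be illustrated with a small Python model. It is a simplified sketch under stated assumptions, not a reimplementation of ICETOOL: `make_record` and `select_with_splice` are hypothetical names, records with equal keys are assumed to keep their input order so that the first one acts as the base, and the '$$' marker that the job writes at position 481 is modelled as a boolean.

```python
KEY = slice(22, 38)  # positions 23-38 (1-based) hold the 16-byte key


def make_record(prefix: str, key: str) -> str:
    """Build a 480-byte record: prefix in 1-2, key in 23-38, blanks elsewhere."""
    return (prefix.ljust(22) + key.ljust(16)).ljust(480)


def select_with_splice(cond_records, data_records):
    # INREC step: remember for each record whether it is a '$$' record
    # (the job stores this marker at position 481).
    tagged = [(rec, rec.startswith("$$")) for rec in cond_records + data_records]
    tagged.sort(key=lambda t: t[0][KEY])  # stable: first record per key is the base

    output = []
    previous_key = None
    base_is_condition = False
    for rec, is_condition in tagged:
        if rec[KEY] != previous_key:
            # First record for this key: the base. It is not written itself.
            previous_key = rec[KEY]
            base_is_condition = is_condition
            continue
        # WITHALL: each later record with the same key is spliced onto the base.
        # WITH(1,480) takes positions 1-480 from that record; the marker at 481
        # still comes from the base. OMIT drops it when the base marker is blank.
        if base_is_condition:
            output.append(rec)
    return output
```

With one '$$' record per key, only data records come out. If the condition file repeats a key, the second '$$' record is spliced onto the first like any other duplicate and is written to the output, which is the case the extra `1,2,CH,EQ,C'$$'` test in the modified OMIT was covering.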
ozgurseyrek

New User


Joined: 22 Feb 2008
Posts: 70
Location: Turkey

Posted: Sun May 31, 2009 4:29 pm

Yes, you are right: there were duplicate key values in the condition file.
I removed the duplicates from the condition file, and your original code worked.
Thanks.
All times are GMT + 6 Hours