View previous topic :: View next topic
Author
Message
Claes Norreen Active User Joined: 20 Dec 2005Posts: 137 Location: Denmark
Hi,
I have a dataset (no 1) with customers (key: customerno) and their phone number(s), with one record per customer per phone number. Naturally, a customer can have 0-many phone numbers.
I have another dataset (no 2) with unique customerno's. If a match on customerno is found with dataset1, all records with this number must be removed from dataset1. How can I accomblish this? I tried several types of SPLICE, but the duplicates on customerno in dataset1 are teasing me...
Assume that input datasets are not sorted in the key order.
Sample input (dataset1):
Code:
CUST01PHONE01
CUST01PHONE02
CUST02PHONE03
CUST03PHONE04
CUST04PHONE05
CUST04PHONE06
CUST04PHONE07
CUST06PHONE08
CUST07PHONE09
CUST07PHONE10
CUST09PHONE11
CUST10PHONE12
CUST10PHONE13
Sample input (dataset2):
Code:
CUST00
CUST02
CUST03
CUST05
CUST07
CUST08
CUST10
CUST11
Sample output:
Code:
CUST01PHONE01
CUST01PHONE02
CUST04PHONE05
CUST04PHONE06
CUST04PHONE07
CUST06PHONE08
CUST09PHONE11
Thanks,
Claes
Back to top
Claes Norreen Active User Joined: 20 Dec 2005Posts: 137 Location: Denmark
Hmm, seems I didn't try this SPLICE:
Code:
//TOOLIN DD *
COPY FROM(IN2) TO(T1) USING(CTL1)
COPY FROM(IN1) TO(T1) USING(CTL2)
SPLICE FROM(T1) TO(OUT) ON(1,6,CH) -
WITH(7,7) -
WITH(80,1) -
WITHALL KEEPBASE KEEPNODUPS USING(CTL3)
//CTL1CNTL DD *
OUTFIL OVERLAY=(79:C'BB')
//CTL2CNTL DD *
OUTFIL OVERLAY=(79:C'VV')
//CTL3CNTL DD *
OUTFIL FNAMES=OUT,INCLUDE=(79,2,CH,EQ,C'VV')
...which works..
(Where IN1 is dataset1 and IN2 is dataset2).
/Claes
Back to top
sqlcode1 Active Member Joined: 08 Apr 2010Posts: 577 Location: USA
Claes Norreen,
Please check if below mentioned single pass solution works for you...I am assuming your both input files are 80 bytes FB.
Frank/Kolusu,
Since OP's 2nd file is unique for customerno, could below method be used as a single pass solution?
Code:
//SORT00 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTIN DD *
//OUT DD DSN=&&HDR,DISP=(,PASS),SPACE=(TRK,(1,0),RLSE)
//SYSIN DD *
OPTION COPY
OUTFIL FNAMES=OUT,REMOVECC,NODETAIL,TRAILER1=('HDR',80:X)
/*
//*
//SORT01 EXEC PGM=SORT
//SORTIN DD DSN=&HDR,DISP=SHR,VOL=REF=*.SORT00.OUT
// DD *
CUST01PHONE01
CUST01PHONE02
CUST02PHONE03
CUST03PHONE04
CUST04PHONE05
CUST04PHONE06
CUST04PHONE07
CUST06PHONE08
CUST07PHONE09
CUST07PHONE10
CUST09PHONE11
CUST10PHONE12
CUST10PHONE13
/*
// DD DSN=&HDR,DISP=SHR,VOL=REF=*.SORT00.OUT
// DD *
CUST00
CUST02
CUST03
CUST05
CUST07
CUST08
CUST10
CUST11
/*
//SORTOUT DD SYSOUT=*
//SYSIN DD *
INREC IFTHEN=(WHEN=GROUP,BEGIN=(1,3,CH,EQ,C'HDR'),
PUSH=(81:ID=1))
SORT FIELDS=(1,6,CH,A,81,1,ZD,D)
OUTREC IFTHEN=(WHEN=GROUP,BEGIN=(81,1,ZD,EQ,2),PUSH=(82:1,6))
OUTFIL INCLUDE=(1,6,CH,NE,82,6,CH),BUILD=(1,80)
//SYSOUT DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SYSUDUMP DD SYSOUT=D
OUTPUT
Code:
CUST01PHONE01
CUST01PHONE02
CUST04PHONE05
CUST04PHONE06
CUST04PHONE07
CUST06PHONE08
CUST09PHONE11
Thanks,
Back to top
Skolusu Senior Member Joined: 07 Dec 2007Posts: 2205 Location: San Jose
Claes Norreen,
Ideally I would have used JOINKEYS for such requirement.
Code:
//STEP0010 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//INA DD *
CUST01PHONE01
CUST01PHONE02
CUST02PHONE03
CUST03PHONE04
CUST04PHONE05
CUST04PHONE06
CUST04PHONE07
CUST06PHONE08
CUST07PHONE09
CUST07PHONE10
CUST09PHONE11
CUST10PHONE12
CUST10PHONE13
//INB DD *
CUST00
CUST02
CUST03
CUST05
CUST07
CUST08
CUST10
CUST11
//SORTOUT DD SYSOUT=*
//SYSIN DD *
OPTION COPY
JOINKEYS F1=INA,FIELDS=(1,6,A)
JOINKEYS F2=INB,FIELDS=(1,6,A)
JOIN UNPAIRED,F1,ONLY
//*
If Claes Norreen does not have the PTF which supports Joinkeys , then here is an alternative version of the job using WHEN=GROUP
Code:
//STEP0100 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTIN DD *
//SORTOUT DD DSN=&&HDR,DISP=(,PASS),SPACE=(TRK,(1,0),RLSE)
//SYSIN DD *
SORT FIELDS=COPY
OUTFIL REMOVECC,NODETAIL,BUILD=(80X),HEADER1=('$$$')
//*
//STEP0200 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTIN DD DSN=&&HDR,DISP=SHR,VOL=REF=*.STEP0100.SORTOUT
// DD *
CUST00
CUST02
CUST03
CUST05
CUST07
CUST08
CUST10
CUST11
// DD DSN=&&HDR,DISP=SHR,VOL=REF=*.STEP0100.SORTOUT
// DD *
CUST01PHONE01
CUST01PHONE02
CUST02PHONE03
CUST03PHONE04
CUST04PHONE05
CUST04PHONE06
CUST04PHONE07
CUST06PHONE08
CUST07PHONE09
CUST07PHONE10
CUST09PHONE11
CUST10PHONE12
CUST10PHONE13
//SORTOUT DD SYSOUT=*
//SYSIN DD *
INREC IFTHEN=(WHEN=GROUP,BEGIN=(1,3,CH,EQ,C'$$$'),PUSH=(81:ID=1))
SORT FIELDS=(1,6,CH,A),EQUALS
OUTREC IFTHEN=(WHEN=INIT,OVERLAY=(83:SEQNUM,8,ZD,RESTART=(1,6))),
IFTHEN=(WHEN=GROUP,BEGIN=(83,8,ZD,EQ,1),PUSH=(82:81,1))
OUTFIL INCLUDE=(81,2,ZD,EQ,22),BUILD=(1,80)
//*
sqlcode1,
You don't have to sort on the indicator , if you concatenated the unique file first in the list.
By doing so you don't even to push down the key for all the records and then compare.
Back to top
sqlcode1 Active Member Joined: 08 Apr 2010Posts: 577 Location: USA
Kolusu,
I tried using unique file as first one but I was trying to prevent an extra IFTHEN condition in the OUTREC.
Only, now I have understood that actually that was indeed more efficient way.
Thanks again for the feedback.
Thanks,
Back to top
Claes Norreen Active User Joined: 20 Dec 2005Posts: 137 Location: Denmark
Thanks for the feedback. In my initial post, I should have noted, that the input files do not have the same LRECL in "the real world". Actually, as this is part of a PROC, dataset1's LRECL will vary - so I doubt I can use a single pass? You may use LRECL=13 for dataset1 and LRECL=6 for dataset2, if you want to have a go at it.
Unfortunately we do not have the PTF for JOINKEY at my site yet.
Back to top
Skolusu Senior Member Joined: 07 Dec 2007Posts: 2205 Location: San Jose
Claes Norreen,
Use the following DFSORT JCL which will give you the desired results.
Code:
//STEP0100 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTIN DD DSN=Your input 6 byte FB file,DISP=SHR
//SORTOUT DD DSN=&&TM13,DISP=(,PASS),SPACE=(CYL,(X,Y),RLSE)
//SYSIN DD *
SORT FIELDS=COPY
OUTFIL REMOVECC,BUILD=(1,6,13:X),
HEADER1=('$$$'),TRAILER1=('$$$')
//*
//STEP0200 EXEC PGM=SORT
//SYSOUT DD SYSOUT=*
//SORTIN DD DSN=&&TM13,DISP=SHR
// DD DSN=Your input 13 byte FB file,DISP=SHR
//SORTOUT DD SYSOUT=*
//SYSIN DD *
INREC IFTHEN=(WHEN=GROUP,BEGIN=(1,3,CH,EQ,C'$$$'),PUSH=(14:ID=1))
SORT FIELDS=(1,6,CH,A),EQUALS
OUTREC IFTHEN=(WHEN=INIT,OVERLAY=(16:SEQNUM,8,ZD,RESTART=(1,6))),
IFTHEN=(WHEN=GROUP,BEGIN=(16,8,ZD,EQ,1),PUSH=(15:14,1))
OUTFIL INCLUDE=(14,2,ZD,EQ,22),BUILD=(1,13)
//*
Back to top
Claes Norreen Active User Joined: 20 Dec 2005Posts: 137 Location: Denmark
Great stuff! Thank you.
Back to top
Claes Norreen Active User Joined: 20 Dec 2005Posts: 137 Location: Denmark
Ran a performance test on yours vs. mine, and it turns out that mine are 50% faster - despite the extra pass of dataset1..!? How come, I wonder?
Dataset1: approx. 6,6 mill. records (LRECL=78)
Dataset2: approx. 16,000 records (LRECL=40)
Total CPU used:
Mine: 10,08 secs
Yours: 15,45 secs
Back to top
Skolusu Senior Member Joined: 07 Dec 2007Posts: 2205 Location: San Jose
Claes Norreen wrote:
Ran a performance test on yours vs. mine, and it turns out that mine are 50% faster - despite the extra pass of dataset1..!? How come, I wonder?
Dataset1: approx. 6,6 mill. records (LRECL=78)
Dataset2: approx. 16,000 records (LRECL=40)
Total CPU used:
Mine: 10,08 secs
Yours: 15,45 secs
I would like to see the complete sysout from the 2 jobs.
Please run both jobs adding this statement to both jobs
Code:
//SORTDIAG DD DUMMY
And send me the complete sysout to skolusu@us.ibm.com
Back to top
Claes Norreen Active User Joined: 20 Dec 2005Posts: 137 Location: Denmark
Will do tomorrow - work is off for today. ;-)
Back to top
Please enable JavaScript!