sunnyk
New User
Joined: 20 Oct 2004 Posts: 59
Hi all,
I have a query about sort.
I have records (in sets of three) in a file that ends up with duplicates after a rerun of a particular step. Let me elaborate:
There is a step, Stepxxx, which produces a dataset abc.xyz.
abc.xyz
Rec1: 12342005.12.12.57H732832938jkdkdsdk1111
Rec2: 12342005.12.12.57D732832938jkdkdsdk1111
Rec3: 12342005.12.12.57I732832938jkdkdsdk1111
Dups (after rerun of Stepxxx):
Rec1: 12342005.12.12.30H732832938jkdkdsdk1111
Rec2: 12342005.12.12.30D732832938jkdkdsdk1111
Rec3: 12342005.12.12.30I732832938jkdkdsdk1111
All six records above are in the same output dataset abc.xyz.
How can I sort this file so that only the records from the latest run end up in the output file aaa.xyz?
That is, I should get the H, D and I records with the timestamp 2005.12.12.57, because 57 is greater than the 30 in the duplicates (after the rerun of the file). The timestamp is the only field that changes after a rerun.
If this isn't clear, I will try to explain further.
Thanks
sunny
sivaplv
New User
Joined: 15 Mar 2005 Posts: 17 Location: Toronto, Canada
Hi Sunnyk,
If I understand your issue correctly, here is how you can get only the latest run's records into an output file from the so-called duplicate records file.
If the timestamp is the same for all of the latest run's records, you can test that field in the INCLUDE statement of SORT. All records with that timestamp are then written in the same order as in the input file.
The SYSIN DD statement would be:
//SYSIN DD *
  SORT FIELDS=COPY
  INCLUDE COND=(5,13,CH,EQ,C'2005.12.12.57')
/*
If you want every record whose timestamp is greater than or equal to '2005.12.12.57', then
the SYSIN DD statement would be:
//SYSIN DD *
  SORT FIELDS=COPY
  INCLUDE COND=(5,13,CH,GE,C'2005.12.12.57')
/*
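To make the positions in the INCLUDE statement easier to check, here is a small Python sketch. It is not DFSORT, and the helper name `include_cond` is made up for illustration; it only mimics a character comparison on the field at start position 5, length 13. DFSORT positions are 1-based, so that field maps to the Python slice `[4:17]`.

```python
import operator

# Hypothetical helper that mimics INCLUDE COND=(start,length,CH,op,C'value').
# DFSORT positions are 1-based; Python slices are 0-based.
def include_cond(records, start, length, op, value):
    compare = {"EQ": operator.eq, "GE": operator.ge}[op]
    begin = start - 1
    return [r for r in records if compare(r[begin:begin + length], value)]

records = [
    "12342005.12.12.57H732832938jkdkdsdk1111",
    "12342005.12.12.57D732832938jkdkdsdk1111",
    "12342005.12.12.57I732832938jkdkdsdk1111",
    "12342005.12.12.30H732832938jkdkdsdk1111",
    "12342005.12.12.30D732832938jkdkdsdk1111",
    "12342005.12.12.30I732832938jkdkdsdk1111",
]

# Keep records whose timestamp is >= the hardcoded value, in input order.
latest = include_cond(records, 5, 13, "GE", "2005.12.12.57")
for r in latest:
    print(r)
```

As with the SORT statements, the timestamp value is hardcoded here, so it would have to be changed for every run.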
Hope this helps.
Regards,
Frank Yaeger
DFSORT Developer

Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
sunnyk,
I assume that you don't want to hardcode the actual most current timestamp as Siva suggests since the timestamp will change each time you do the run.
You talk about the records being duplicates. "Duplicates" means that a pair of records has the same values in a particular field or fields. In your case, the pairs of records have different timestamps so they are obviously not duplicates on the timestamp. So I'll assume that the other fields besides the timestamp (for example, 1234 and H732832938jkdkdsdk1111 for the H pair) are what make the records duplicates. Given that assumption, you can use the following DFSORT/ICETOOL job to get the record with the latest timestamp for each pair of "duplicate" records:
Code: |
//S1 EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//IN DD *
12342005.12.12.57H732832938jkdkdsdk1111
12342005.12.12.57D732832938jkdkdsdk1111
12342005.12.12.57I732832938jkdkdsdk1111
12342005.12.12.30H732832938jkdkdsdk1111
12342005.12.12.30D732832938jkdkdsdk1111
12342005.12.12.30I732832938jkdkdsdk1111
/*
//OUT DD SYSOUT=*
//TOOLIN DD *
SELECT FROM(IN) TO(OUT) ON(1,4,CH) ON(18,22,CH) FIRST USING(CTL1)
/*
//CTL1CNTL DD *
  SORT FIELDS=(1,4,CH,A,18,22,CH,A,5,13,CH,D)
/*
|
OUT will have:
Code: |
12342005.12.12.57D732832938jkdkdsdk1111
12342005.12.12.57H732832938jkdkdsdk1111
12342005.12.12.57I732832938jkdkdsdk1111
|
Note that the output records come out in sorted order (D, H, I) rather than in the input order (H, D, I).
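For anyone who wants to check the logic outside of DFSORT, the following Python sketch emulates what the job is meant to do: order the records on the two key fields, put the newest timestamp first within each key, and keep the first record per key. It is only an illustration under those assumptions; the function name `latest_per_key` is made up, and the slices are the 0-based equivalents of positions 1-4, 5-17 and 18-39.

```python
def latest_per_key(records):
    # Two stable sorts: timestamp descending first, then key ascending,
    # so the newest record ends up first within each key.
    ordered = sorted(records, key=lambda r: r[4:17], reverse=True)
    ordered = sorted(ordered, key=lambda r: (r[0:4], r[17:39]))

    # Emulate SELECT ... FIRST: keep the first record seen for each key.
    result, seen = [], set()
    for r in ordered:
        key = (r[0:4], r[17:39])
        if key not in seen:
            seen.add(key)
            result.append(r)
    return result

records = [
    "12342005.12.12.57H732832938jkdkdsdk1111",
    "12342005.12.12.57D732832938jkdkdsdk1111",
    "12342005.12.12.57I732832938jkdkdsdk1111",
    "12342005.12.12.30H732832938jkdkdsdk1111",
    "12342005.12.12.30D732832938jkdkdsdk1111",
    "12342005.12.12.30I732832938jkdkdsdk1111",
]

out = latest_per_key(records)
for r in out:
    print(r)
```

The sketch prints the three .57 records in D, H, I order, which is the same side effect of sorting on the key that the job shows.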
sunnyk
New User
Joined: 20 Oct 2004 Posts: 59
Hi Frank,
Thanks for your quick response, but the problem is only half solved. I actually want the output in H, D, I sequence, i.e. the same order as the input dataset. As your output shows, it is also sorted on the field at position 18. Is there any way to keep the H/D/I sequence as it is?
Thanks once again.
Regards,
sunny
Frank Yaeger
DFSORT Developer

Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
Quote: |
is there any way to keep it as it is in H/D/I sequence. |
Yes, by adding a sequence number we can sort on to get the records back in their original order, but it will take a couple more passes over the data. Here's the DFSORT/ICETOOL job to do it:
Code: |
//S1 EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//IN DD *
12342005.12.12.57H732832938jkdkdsdk1111
12342005.12.12.57D732832938jkdkdsdk1111
12342005.12.12.57I732832938jkdkdsdk1111
12342005.12.12.30H732832938jkdkdsdk1111
12342005.12.12.30D732832938jkdkdsdk1111
12342005.12.12.30I732832938jkdkdsdk1111
/*
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)
//T2 DD DSN=&&T2,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)
//OUT DD SYSOUT=*
//TOOLIN DD *
COPY FROM(IN) TO(T1) USING(CTL1)
SELECT FROM(T1) TO(T2) ON(1,4,CH) ON(18,22,CH) FIRST USING(CTL2)
SORT FROM(T2) TO(OUT) USING(CTL3)
/*
//CTL1CNTL DD *
  INREC FIELDS=(1,80,81:SEQNUM,8,ZD)
/*
//CTL2CNTL DD *
  SORT FIELDS=(1,4,CH,A,18,22,CH,A,5,13,CH,D)
/*
//CTL3CNTL DD *
  SORT FIELDS=(81,8,ZD,A)
/*
|
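The three passes can be emulated in a few lines of Python to see why the sequence number restores the input order. This is only a sketch of the idea, not DFSORT; the function name `latest_in_input_order` is made up, and it assumes the newest record per key should be kept, with slices that are the 0-based equivalents of positions 1-4, 5-17 and 18-39.

```python
def latest_in_input_order(records):
    # Pass 1 (COPY with SEQNUM): tag each record with its input position.
    tagged = list(enumerate(records, start=1))

    # Pass 2 (SELECT ... FIRST): newest timestamp first within each key,
    # then keep the first record per key. Both sorts are stable.
    ordered = sorted(tagged, key=lambda t: t[1][4:17], reverse=True)
    ordered = sorted(ordered, key=lambda t: (t[1][0:4], t[1][17:39]))
    kept, seen = [], set()
    for seq, rec in ordered:
        key = (rec[0:4], rec[17:39])
        if key not in seen:
            seen.add(key)
            kept.append((seq, rec))

    # Pass 3 (SORT on the sequence number): back to the original order.
    kept.sort(key=lambda t: t[0])
    return [rec for _, rec in kept]

records = [
    "12342005.12.12.57H732832938jkdkdsdk1111",
    "12342005.12.12.57D732832938jkdkdsdk1111",
    "12342005.12.12.57I732832938jkdkdsdk1111",
    "12342005.12.12.30H732832938jkdkdsdk1111",
    "12342005.12.12.30D732832938jkdkdsdk1111",
    "12342005.12.12.30I732832938jkdkdsdk1111",
]

out = latest_in_input_order(records)
for r in out:
    print(r)
```

Unlike the job, the sketch drops the sequence number again before returning the records.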