Reg SORT on different records

sunnyk · New User Joined: 20 Oct 2004 Posts: 59

Hi all,
I have a question about SORT.
I have records (in sets of three) in a file that contains duplicates after a rerun of a particular step. Let me elaborate:
There is a step, Stepxxx, which produces a dataset abc.xyz.

abc.xyz

Rec1: 12342005.12.12.57H732832938jkdkdsdk1111
Rec2: 12342005.12.12.57D732832938jkdkdsdk1111
Rec3: 12342005.12.12.57I732832938jkdkdsdk1111
Dups (after rerun of Stepxxx):
Rec1: 12342005.12.12.30H732832938jkdkdsdk1111
Rec2: 12342005.12.12.30D732832938jkdkdsdk1111
Rec3: 12342005.12.12.30I732832938jkdkdsdk1111

All six records above are in the same output dataset abc.xyz.
How can I sort this file so that only the records from the latest run go to the output file aaa.xyz?
For example, I should get the 2005.12.12.57 records (with H, D, I after the date field) in the output, because 57 is greater than the 30 in the duplicates (after the rerun). The date field is the only one that changes after a rerun.

If this isn't clear, I will try to explain further.
Thanks
sunny

sivaplv · Posted: Mon Apr 18, 2005 7:25 pm

Hi Sunnyk,

If I understand your issue correctly, here is how you can get only the latest run records into an output file from the 'so called' duplicate records file.

If the date stamp is the same for all the records, you can specify this date field in the INCLUDE statement of SORT so that all records with this date stamp are written in the same order as in the input file.

The SYSIN DD statement would be:

//SYSIN DD *
  SORT FIELDS=COPY
  INCLUDE COND=(5,13,CH,EQ,C'2005.12.12.57')
/*

If you want the records whose date stamp is greater than or equal to '2005.12.12.57', the SYSIN DD statement would be:

//SYSIN DD *
  SORT FIELDS=COPY
  INCLUDE COND=(5,13,CH,GE,C'2005.12.12.57')
/*
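
To make the field addressing concrete, here is a minimal Python sketch of what the two INCLUDE conditions select from the sample records. It is only an illustration of the comparison, not a replacement for the SORT step. The helper names `timestamp` and `include` are made up for this example, and it assumes the "RecN:" labels are not part of the data and that a plain string comparison orders these digits-and-dots timestamps the same way the CH comparison does.

```python
import operator

# Sample records from the question; the "RecN:" labels are assumed not to be data.
RECORDS = [
    "12342005.12.12.57H732832938jkdkdsdk1111",
    "12342005.12.12.57D732832938jkdkdsdk1111",
    "12342005.12.12.57I732832938jkdkdsdk1111",
    "12342005.12.12.30H732832938jkdkdsdk1111",
    "12342005.12.12.30D732832938jkdkdsdk1111",
    "12342005.12.12.30I732832938jkdkdsdk1111",
]


def timestamp(record):
    # Start position 5, length 13 (1-based) corresponds to the slice [4:17].
    return record[4:17]


def include(records, compare, value):
    # Copy records in input order, keeping those whose timestamp passes the test.
    return [r for r in records if compare(timestamp(r), value)]


eq_result = include(RECORDS, operator.eq, "2005.12.12.57")  # like CH,EQ
ge_result = include(RECORDS, operator.ge, "2005.12.12.57")  # like CH,GE

for record in eq_result:
    print(record)
```

With this sample data both conditions keep the same three records, because no timestamp is later than 2005.12.12.57. The GE form would also pick up records from any later run.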

Hope this helps.

Regards,

Frank Yaeger · Posted: Mon Apr 18, 2005 9:13 pm

sunnyk,

I assume that you don't want to hardcode the actual most current timestamp as Siva suggests since the timestamp will change each time you do the run.

You talk about the records being duplicates. "Duplicates" means that a pair of records has the same values in a particular field or fields. In your case, the pairs of records have different timestamps so they are obviously not duplicates on the timestamp. So I'll assume that the other fields besides the timestamp (for example, 1234 and H732832938jkdkdsdk1111 for the H pair) are what make the records duplicates. Given that assumption, you can use the following DFSORT/ICETOOL job to get the record with the latest timestamp for each pair of "duplicate" records:

sunnyk · New User Joined: 20 Oct 2004 Posts: 59

Hi Frank,
Thanks for your quick response, but the problem is only half solved. I actually want the output in H, D, I sequence, i.e. the same as in the input dataset. As your output shows, it is also sorted on that field (the field at position 18). Is there any way to keep the records in the original H/D/I sequence?

Thanks once again.
Regards,
sunny
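
The requirement discussed so far (one record per group of "duplicates", the one with the latest timestamp, with the groups left in their original H/D/I input order) can be sketched in Python as follows. This is a rough illustration of the selection logic only, not a DFSORT/ICETOOL job and not anyone's posted solution. The helper names `timestamp`, `group_key` and `latest_per_group` are hypothetical, and the field positions are assumptions taken from the sample records: the timestamp at positions 5-17, and everything else treated as the duplicate key.

```python
# Sample records from the question, with the rerun (.30) records listed first
# to show that the result does not depend on which run comes first.
RECORDS = [
    "12342005.12.12.30H732832938jkdkdsdk1111",
    "12342005.12.12.30D732832938jkdkdsdk1111",
    "12342005.12.12.30I732832938jkdkdsdk1111",
    "12342005.12.12.57H732832938jkdkdsdk1111",
    "12342005.12.12.57D732832938jkdkdsdk1111",
    "12342005.12.12.57I732832938jkdkdsdk1111",
]


def timestamp(record):
    # Positions 5-17 (1-based): the only field that changes between runs.
    return record[4:17]


def group_key(record):
    # Assumed duplicate key: positions 1-4 plus position 18 onward.
    return record[:4] + record[17:]


def latest_per_group(records):
    # A dict keeps keys in first-seen order, so overwriting a value with a
    # later-timestamp record leaves the H/D/I sequence of the input intact.
    latest = {}
    for record in records:
        key = group_key(record)
        if key not in latest or timestamp(record) > timestamp(latest[key]):
            latest[key] = record
    return list(latest.values())


result = latest_per_group(RECORDS)
for record in result:
    print(record)
```

For the six sample records this keeps the three 2005.12.12.57 records in H, D, I order, whichever run appears first in the file.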

Frank Yaeger · Posted: Tue Apr 19, 2005 6:54 pm