How can I use SORT on the following FB, LRECL=500 file to remove duplicate records? There are two record types. In this example file, the first four records are unique and the remaining eight are duplicates of the first four. I need to keep the first four records in their original order and remove the last eight (the duplicates):
I am currently using the following, which writes the correct records (no duplicates) but in ascending order, which is not what I want:
SORT FIELDS=(1,500,A),FORMAT=CH
SUM FIELDS=NONE
END
Is there some command to skip every other record so that only the first four records are written in the existing order?
Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
You could use any supported numeric format for the sequence numbers - ZD, PD, BI or FS. I like to use ZD because it's readable. But any of the others would work as well. For example, instead of SEQNUM,8,ZD, you could use SEQNUM,5,PD and save some bytes (but sacrifice readability when you're debugging).
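The job this reply refers to is not reproduced in the thread as shown here, so what follows is only a sketch of the usual sequence-number technique, not necessarily the statements originally posted. It assumes two separate DFSORT steps, and the choice of position 501 for the sequence number is an assumption based on the 500-byte record. The first step appends an 8-byte ZD sequence number to each record, sorts on the 500 data bytes, and drops duplicates; EQUALS is there so that the first occurrence of each record (lowest sequence number) is the one kept.

```jcl
* Step 1: tag each record with its input position, then remove duplicates.
* Output records are 508 bytes (500 data bytes + 8-byte sequence number).
  INREC OVERLAY=(501:SEQNUM,8,ZD)
  SORT FIELDS=(1,500,CH,A),EQUALS
  SUM FIELDS=NONE
```

The second step reads the 508-byte output of the first step, sorts it back into the original order by sequence number, and strips the sequence number off again.

```jcl
* Step 2: restore the original order and rebuild the 500-byte record.
  SORT FIELDS=(501,8,ZD,A)
  OUTREC BUILD=(1,500)
```

With SEQNUM,5,PD instead, the field in step 1 would be 5 bytes long and the step 2 sort field would become (501,5,PD,A).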
Joined: 07 Dec 2007 Posts: 2205 Location: San Jose
andrearak,
Here is an alternative way of doing it in one pass using the new WHEN=GROUP function. You want to treat every 2 records as a single record and remove the duplicates. Using WHEN=GROUP, we push the first record onto the second record, sort on the full 1000 bytes, and remove the duplicates.
Using OUTFIL, we write the records out in their original order once again.
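The control statements for this reply are not shown in the thread as captured here, so the following is a reconstruction of the idea described above rather than the job as originally posted. The column positions (501 for the pushed copy of the first record, 1001 for a one-digit sequence number within the pair) are assumptions chosen for illustration.

```jcl
* Treat every 2 input records as one group. Push the group's first
* record into 501-1000 and number the records within the group at 1001.
  INREC IFTHEN=(WHEN=GROUP,RECORDS=2,
        PUSH=(501:1,500,1001:SEQ=1))
* Sort on the combined 1000 bytes and drop duplicate pairs.
  SORT FIELDS=(1,1000,CH,A),EQUALS
  SUM FIELDS=NONE
* Keep only the second record of each pair (it carries both records)
* and split it back into two 500-byte records, first record first.
  OUTFIL INCLUDE=(1001,1,CH,EQ,C'2'),
         BUILD=(501,500,/,1,500)
```

One thing to watch in this sketch: the two records of each pair come out in their original order, but the pairs themselves come out in the sequence of the 1000-byte sort key. If the order of the pairs matters for your data, the sequence-number approach is the safer choice.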