I have a file that contains duplicate records, and each record has a count field. I need to replace each set of duplicate records with a single record whose count field holds the sum of the count fields from the duplicates.
The input and output files are both FB.
The sort keys are 1,3,CH and 5,2,CH, and the count field is 8,1,ZD.
Input file:
Code:
aaa xx 1
aaa xx 1
bbb xx 2
bbb xx 1
bbb yy 1
ccc xx 1
ccc yy 2
ccc yy 1
ddd xx 1
The output file should be:
Code:
aaa xx 2
bbb xx 3
bbb yy 1
ccc xx 1
ccc yy 3
ddd xx 1
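To make the requirement concrete, here is a minimal sketch of the aggregation in Python. It only illustrates the logic and is not a sort utility job. The function name `sum_duplicates` is hypothetical, and the sketch assumes that bytes 4 and 7 are blanks, that the count is an unsigned digit, and that a total above 9 should be treated as an error because it no longer fits in a one-byte field.

```python
def sum_duplicates(lines):
    """Collapse records with equal keys into one record with the summed count.

    Assumed layout (1-based, taken from the key description):
      1,3,CH  first key
      5,2,CH  second key
      8,1,ZD  count (treated here as a plain unsigned digit)
    """
    totals = {}
    for line in lines:
        key = (line[0:3], line[4:6])   # columns 1-3 and 5-6
        count = int(line[7:8])         # column 8
        totals[key] = totals.get(key, 0) + count

    output = []
    # Emit one record per key, in ascending key order.
    for (key1, key2), total in sorted(totals.items()):
        if total > 9:
            # Assumption: a one-byte count cannot hold a two-digit sum.
            raise ValueError(f"count overflow for key {key1} {key2}: {total}")
        output.append(f"{key1} {key2} {total}")
    return output


records = [
    "aaa xx 1",
    "aaa xx 1",
    "bbb xx 2",
    "bbb xx 1",
    "bbb yy 1",
    "ccc xx 1",
    "ccc yy 2",
    "ccc yy 1",
    "ddd xx 1",
]

result = sum_duplicates(records)
for record in result:
    print(record)
```

Run against the sample input, this prints the six records listed in the desired output, with non-duplicate records passed through unchanged.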
I have got as far as identifying the duplicate records and extracting them to a temporary file, and then extracting the first record of each set of duplicates. But I am stuck on how to sum the count fields so that the total can be placed in the single record that replaces the duplicates.
My current step gives me the first of each set of duplicate records in OUT1 and the non-duplicate records in OUT2. Once I get each record in OUT1 updated with the sum of the count fields, I can sort it back into OUT2 to get the desired output.