I want to sort the file so that for each country I keep only the three records for the most populous cities. Sometimes I will have fewer than three cities for a country to start with, sometimes more, and sometimes exactly three.
So for this input:
Code:
CAN CIT1 25
CAN CIT2 16
CAN CIT3 15
CAN CIT4 14
CAN CIT5 13
GBR CITA 65
GBR CITB 45
GBR CITC 12
USA CITX 66
USA CITZ 45
I would want to end up with:
Code:
CAN CIT1 25
CAN CIT2 16
CAN CIT3 15
GBR CITA 65
GBR CITB 45
GBR CITC 12
USA CITX 66
USA CITZ 45
I've provided the records already sorted in descending order of population within each Country. In case it is relevant, I know that all Country names are unique and all City names are unique (i.e. a City name cannot occur under more than one Country). From one run to the next I will not know which Countries or Cities I will encounter.
Thanks for any help, and I apologize in advance if this has a simple solution. I have tried to find an answer on my own but got nowhere.
Quote:
I keep the three records of the most populous cities
Is there any logic for picking the popular cities? If not, you can try the approach below.
You can group by country, assign a sequence number that restarts for each new group, and then in OUTFIL filter to include up to 3 records per group.
Joined: 17 Oct 2006 Posts: 2481 Location: @my desk
Rohit Umarjikar wrote:
Quote:
I keep the three records of the most populous cities
Is there any logic for picking the popular cities? If not, you can try the approach below.
Rohit, he needs the most 'populous' cities, i.e. ordered from highest to lowest population.
Regardless, the idea is more or less the same. Sort descending on population within each country, assign a SEQuence number in OUTREC for each GROUP, where a new group begins (KEYBEGIN) when the country changes, and in OUTFIL INCLUDE only the records with a SEQuence number less than or equal to 3.
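A minimal sketch of that approach. The record layout was not given, so this assumes fixed-length 80-byte records with the country in columns 1-3 and a two-digit population in columns 10-11, as in the sample data. It also assumes a DFSORT level that supports WHEN=GROUP with KEYBEGIN. Adjust positions and lengths to the real file.

```
* Sort by country ascending, then population descending
  SORT FIELDS=(1,3,CH,A,10,2,ZD,D)
* Start a new group whenever the country changes and put a
* 3-digit sequence number (001, 002, ...) in columns 81-83
  OUTREC IFTHEN=(WHEN=GROUP,KEYBEGIN=(1,3),PUSH=(81:SEQ=3))
* Keep the first three records of each group, then drop the
* temporary sequence number again
  OUTFIL INCLUDE=(81,3,ZD,LE,3),BUILD=(1,80)
```

Column 81 is used only as scratch space beyond the assumed 80-byte record, which is why the final BUILD cuts the output back to columns 1-80.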
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
If your data is already in order, just use a SEQNUM with RESTART on the key, and INCLUDE= and BUILD on the OUTFIL. Your sequence number should be long enough for the maximum number of cities per country. A length of one is probably cutting it fine, eight is a bit much :-)
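As a rough sketch of what that could look like, assuming fixed-length 80-byte records with the country in columns 1-3 (taken from the sample, not confirmed) and a three-digit sequence number as a middle ground on length:

```
* Input is already in the required order, so just copy it
  OPTION COPY
* Append a sequence number in columns 81-83 that restarts
* at 1 each time the country in columns 1-3 changes
  INREC OVERLAY=(81:SEQNUM,3,ZD,RESTART=(1,3))
* Keep sequence numbers 1 to 3, then strip the number off
  OUTFIL INCLUDE=(81,3,ZD,LE,3),BUILD=(1,80)
```

With the sample input this keeps CIT1 to CIT3 for CAN and every record for GBR and USA, since those countries have three or fewer cities.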