But the file is fixed block. Within the file, the first field of a record can occur more than once [for example, Record 1 and Record 4 both begin with the same AAAAAAAAA value]. I want to remove such duplicates and write the result to a new output file, but I don't know the position of the ','.
Can you please give me an idea of how to do this? Is it possible through SYNCSORT?
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello,
In addition to answering devzee's question, please answer the more inclusive previous questions.
Also, in your original post, the "key" match on the A's is obvious, but which record should be selected/discarded? The one with more B's or the one with fewer B's?
Please create better sample input data and the output(s) you want when this input is processed.
Hi Dick & Devzee,
Thanks a lot for your responses. I am sorry for not stating my question clearly. Let me try to explain the input properly -
My input looks like this -
Obviously the first word (appearing before the first ',') is the main key field. If duplicates exist on this field, we have to remove the duplicates whose next word is shorter. E.g. - in the example I cited before, my process should pick the first record (not the fourth one), as it has more B's than the fourth record.
Now coming to your query, Dick: if there are records like AAAAA,BBDDBBDD and another with AAAAA,XXDDXXDD, my process can select either of them. I mean there is no restriction on which record to choose when the lengths of the second word (here BBDDBBDD and XXDDXXDD) are the same.
Another point to note is that there can be more than one occurrence of ',' in a record. So
Code:
CCC,DDDDDDDDDDDDD,XXXXXXXXXXXXXX,XXXXXXXXXX
is also possible.
Please let me know what would be a suitable process to solve my problem!
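Since the key ends at the first ',' and its position varies from record to record, one way to picture the split is a small helper like the one below. This is only a sketch in Python (not Syncsort control statements); the name `split_key` is made up for illustration, and it assumes trailing blanks are just fixed-block padding.

```python
def split_key(record: str) -> tuple[str, str]:
    """Split a record at its FIRST comma into (key, rest).

    Hypothetical helper: the name and the trailing-blank handling are
    assumptions, not something stated in the thread.
    """
    # Trailing blanks are assumed to be padding in a fixed-block record.
    text = record.rstrip()
    # partition() cuts at the first ',' only, so any later commas
    # stay inside `rest` untouched.
    key, _sep, rest = text.partition(",")
    return key, rest


if __name__ == "__main__":
    key, rest = split_key("CCC,DDDDDDDDDDDDD,XXXXXXXXXXXXXX,XXXXXXXXXX")
    print(key)
    print(rest)
```

A record with no comma at all comes back as the whole text in `key` and an empty `rest`.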
Hello,
I'm still not clear on this part of your requirement
Quote:
Another point to note is that there can be more than one occurrence of ',' in a record. So
Code:
CCC,DDDDDDDDDDDDD,XXXXXXXXXXXXXX,XXXXXXXXXX
is also possible.
If the input had
Code:
CCC,DDDDDDDDDDD,XXXXXXXXXXXXXX,XXXXXXXXXXXX
and
Code:
CCC,DDDDDDDDDDDDD,XXXXXXXXXXXXXX,XXXXXXX
which should be kept - the one with more D's or the one with the greater length overall?
The more the requirement is clarified, the more I'd lean towards using program code rather than trying to meet the need with sort control statements.
Hi Dick,
According to your query and options provided,
Quote:
Code:
CCC,DDDDDDDDDDD,XXXXXXXXXXXXXX,XXXXXXXXXXXX
and
Code:
CCC,DDDDDDDDDDDDD,XXXXXXXXXXXXXX,XXXXXXX
I would like to make one thing clear: the first word, i.e. CCC, is the key, and the other fields together are just a simple record associated with that key (CCC).
Code:
CCCCCCC,DDDDDDDDDDDDD,XXXXXXXXXXXXXX,XXXXXXX
<-Key-> <-Simple Record ->
Now as I said in my last post -
Quote:
In the example I cited before, my process should pick the first record (not the fourth one), as it has more B's than the fourth record.
By this I meant that my process will choose the record whose simple record is longer, which means that in your example the first record will be chosen.
And if the simple records' lengths are the same, choose either one - no issue with that.
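For the code-based route, the rule as clarified above (key = text before the first ',', keep the record whose remaining "simple record" is longest, ties may go either way) can be sketched roughly as follows. This is a minimal Python illustration, not a Syncsort solution; the function name `dedupe_keep_longest` is hypothetical, and keeping the first-seen record on a tie is just one arbitrary choice.

```python
def dedupe_keep_longest(records):
    """Keep one record per key: the one with the longest text after the first ','.

    Sketch only. Assumes trailing blanks are padding and that, on a tie,
    keeping the first record seen is acceptable.
    """
    best = {}  # key -> (simple_record, full_record_text)
    for record in records:
        text = record.rstrip()                   # drop fixed-block padding
        key, _sep, rest = text.partition(",")    # split at the first ',' only
        kept = best.get(key)
        # Replace only when strictly longer, so ties keep the earlier record.
        if kept is None or len(rest) > len(kept[0]):
            best[key] = (rest, text)
    # dicts preserve insertion order, so keys come out in first-seen order.
    return [text for _rest, text in best.values()]


if __name__ == "__main__":
    sample = [
        "CCC,DDDDDDDDDDD,XXXXXXXXXXXXXX,XXXXXXXXXXXX",
        "CCC,DDDDDDDDDDDDD,XXXXXXXXXXXXXX,XXXXXXX",
    ]
    for line in dedupe_keep_longest(sample):
        print(line)
```

Because everything is held in a dictionary keyed on the first word, the number of "sets" per key does not matter, and the input does not need to be sorted first.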
Hello,
Is there a way to know the maximum number of "sets" there can be for any particular "key"? For example, might there be 4 sets with CCC as the key - might there be 30?
I believe it is time to switch gears and implement this using code rather than sort control statements. If it were my requirement, I would be concerned that a newly discovered "rule" would cause the sort control statements to no longer work (if they existed in the first place) and then coding would still be needed.
Hi Dick,
I know it is quite tough to implement through SORT, and it may not be a stable system then! I have already taken the code-based approach. You may say I am trying to be over-smart, but it is nothing like that. I JUST WANT TO KNOW - is there any way to do this kind of processing through SORT and, if yes, how? You could say I am trying to explore SORT to handle these kinds of scenarios (if any comes to me in future). Ha ha ha !!!
Hey Dick FYI - LRECL - 400, RECFM - FB.
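With RECFM=FB and LRECL=400 there are no line delimiters, so a program working on a binary copy of the dataset has to cut it into 400-byte slices itself. A rough Python sketch of that step follows; the function name `iter_fixed_records` is hypothetical, and the `cp037` EBCDIC code page is an assumption (the thread does not say which code page the data uses).

```python
def iter_fixed_records(data: bytes, lrecl: int = 400, encoding: str = "cp037"):
    """Yield each fixed-length record as text with trailing blanks removed.

    Sketch only: `encoding` defaults to cp037 (an EBCDIC code page) as an
    assumption; pass a different codec if the data was already translated.
    """
    if len(data) % lrecl != 0:
        raise ValueError("data length is not a multiple of LRECL")
    for offset in range(0, len(data), lrecl):
        # Each slice is exactly one record; padding blanks are stripped.
        yield data[offset:offset + lrecl].decode(encoding).rstrip()


if __name__ == "__main__":
    # Build two padded 400-byte records in memory instead of reading a file.
    raw = b"".join(
        text.ljust(400).encode("cp037")
        for text in ("AAA,BB", "CCC,D,X")
    )
    for record in iter_fixed_records(raw):
        print(record)
```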
Waiting to hear from you all! Give me some approach to handle this kind of situation!
Dick - please don't mind! I just want to know ... Don't take it the wrong way.
U know I am a bit
Hello Amitava,
Not to worry - the more alternatives you have, the better choice you may be able to make.
As far as doing this with Syncsort, you may want to look into what they are going to release (early?) next year - doesn't help just now, but may later when "any comes to me in future". The next major release of the sort and a new version of Synctool with documentation are planned, but I've seen no actual release date so far.