ashish_explorer
New User
Joined: 13 Feb 2007 Posts: 2 Location: noida
Hi,
Can anybody help me solve this?
I have a question about finding duplicates across two files under a given condition.
The first file has duplicate records based on some field, say the last three bytes (any record can have one duplicate, more than one, or none at all).
A second file has the same attributes.
I need to find the records that have more than one duplicate in the first file but whose number of duplicates in the second file differs from that in the first file.
For example, the first file contains records like:
012 890
236 456
124 999
345 999
245 215
103 888
234 888
789 888
154 458
And the second file consists of records like:
785 259
124 999
245 215
234 888
154 458
Since only 234 888 satisfies the criteria, only this record should be written to the output file.
Please advise, and let me know if this information is not enough.
If it helps, assume the files are sorted on the last three bytes.
Best Regards,
Ashish
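One possible reading of the requirement above can be sketched in Python to make the counting explicit. This is only an illustration, not a DFSORT solution: the names `key_of`, `mismatched_records`, and `min_count` are hypothetical, the key is assumed to be the last three bytes of each record, and the output is assumed to be the matching records from the second file. The thread does not settle whether "more than one duplicate" means at least three records sharing a key or at least two, so the threshold is left as a parameter.

```python
from collections import Counter


def key_of(record):
    # Assumption: the comparison key is the last three bytes of the record.
    return record[-3:]


def mismatched_records(file1, file2, min_count=3):
    """Return file2 records whose key occurs at least min_count times in
    file1 and a different number of times in file2."""
    counts1 = Counter(key_of(r) for r in file1)
    counts2 = Counter(key_of(r) for r in file2)
    return [
        r for r in file2
        if counts1[key_of(r)] >= min_count
        and counts2[key_of(r)] != counts1[key_of(r)]
    ]


file1 = ["012 890", "236 456", "124 999", "345 999", "245 215",
         "103 888", "234 888", "789 888", "154 458"]
file2 = ["785 259", "124 999", "245 215", "234 888", "154 458"]

# "More than one duplicate" read as three or more records with the same key.
print(mismatched_records(file1, file2, min_count=3))  # ['234 888']

# Read as two or more records with the same key.
print(mismatched_records(file1, file2, min_count=2))  # ['124 999', '234 888']
```

With a threshold of three the result matches the expected output in the example; with a threshold of two, 124 999 is selected as well, so the intended threshold needs to be stated.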
Deepa.m
New User
Joined: 28 Apr 2005 Posts: 99
As per your description and the example provided, the record below also qualifies to be written to the output:
124 999
as it has more than one duplicate in the first file and also occurs in the second file.
Please clarify.
Frank Yaeger
DFSORT Developer
Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
Ashish,
Your "rules" are rather difficult to understand based on the example you show. As Deepa points out, shouldn't the 124 999 record be in the output - if not, why not?
Can file2 have duplicates? If no, then say so and ignore the rest of this post.
If yes, do you want all of the file2 duplicates? And do you want file2 records that have the same number of duplicates as the file1 records?
What output would you expect for this input:
Input file1
012 890
236 456
124 999
345 999
245 215
103 888
234 888
789 888
154 458
222 000
222 000
333 001
333 001
444 002
Input file2
785 259
124 999
245 215
234 888
154 458
222 000
222 000
222 000
333 001
333 001
444 002
444 002