View previous topic :: View next topic
|
Author |
Message |
rmd3003
New User
Joined: 03 Jul 2006 Posts: 55
|
|
|
|
Hello everybody. I've tried but looks like it's too advanced for me.
I need to find duplicate rows in DB2 table but simple "GROUP BY / HAVING COUNT > 1" doesn't work.
Let's say I have two columns: Rows will be considered duplicates if ITEM-NO matches and SETUP_DATE is within 5 days (+/- 5 days range) of each other. In example below rows 1,2 and 4,5 are considered duplicates.
ITEM-NO SETUP_DATE
1. 1111111 10-10-2007
2. 1111111 10-12-2007 >>> +2 days
3. 1111111 11-30-2007
4. 2222222 01-14-2007
5. 2222222 01-11-2007 >>> -3 days |
|
Back to top |
|
|
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
|
|
|
|
Hello,
What should be done with the "duplicates"? To determine duplication, there must be 2 rows - is data from both to be used?
Is it possible for an item_no to have multiple rows with the same setup_date? Are these considered "duplicates" or does it need to be a different date in the allowable date range to be "duplicate"? How would these be treated?
Using the following data:
Code: |
1. 111111 10-10-2007
1a 111111 10-11-2007
2. 111111 10-12-2007
3. 111111 11-30-2007
4. 222222 01-14-2007
5. 222222 01-11-2007 |
With the addition of the 1a row, what should happen?
Until there is more clear definition of the requirement, any suggestion wouild be a guess - might be a good guess, but still a guess. . . |
|
Back to top |
|
|
rmd3003
New User
Joined: 03 Jul 2006 Posts: 55
|
|
|
|
Thanks for reply. Yes, if you add row 1a, I want to see it too. Basically, in your case I want to see rows 1, 1a, 2 and 4 and 5. Row 3 we will skip because it's more than +/- 5 days to the closest date even though item-no is the same. |
|
Back to top |
|
|
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
|
|
|
|
Hello,
Is there a maximum number of rows in the table for any particular item_no? |
|
Back to top |
|
|
TG Murphy
Active User
Joined: 23 Mar 2007 Posts: 148 Location: Ottawa Canada
|
|
|
|
Looks to me like you need to join the table to itself where a.ITEM_NO = b.ITEM_NO and A.SETUP_DATE - B.SETUP_DATE BETWEEN -5 AND 5
This if off the top of my head and may or may not work. Or may require other conditions. But it is worth pursuing.. |
|
Back to top |
|
|
martin9
Active User
Joined: 01 Mar 2006 Posts: 290 Location: Basel, Switzerland
|
|
|
|
hy,
try i subselect in the where/having clause...
i.e.
select columns-you-need from table
where 1 < (select count(*) from ....
where item-no is the same
and setup-date between -5 to +5)
just code the correct join correlations...
you can try something similar like that...
maybe you need to group it first also...
regards,
martin9 |
|
Back to top |
|
|
|