Portal | Manuals | References | Downloads | Info | Programs | JCLs | Master the Mainframes
IBM Mainframe Computers Forums Index
 
Register
 
IBM Mainframe Computers Forums Index Mainframe: Search IBM Mainframe Forum: FAQ Memberlist Usergroups Profile Log in to check your private messages Log in
 

 

How to delete duplicate records from a table using sql query

 
Post new topic   Reply to topic    IBMMAINFRAMES.com Support Forums -> DB2
View previous topic :: :: View next topic  
Author Message
rahuindo

New User


Joined: 09 Apr 2008
Posts: 83
Location: Chennai

PostPosted: Sat May 03, 2008 5:06 pm    Post subject: How to delete duplicate records from a table using sql query
Reply with quote

Hi,
I have duplicate records in a table and i want to retain the latest record and delete all the older duplicate records. Can i achieve it through a single sql query??
Thanks in advance..
Back to top
View user's profile Send private message

Craq Giegerich

Senior Member


Joined: 19 May 2007
Posts: 1512
Location: Virginia, USA

PostPosted: Sat May 03, 2008 6:26 pm    Post subject:
Reply with quote

What defines duplicate and what defines latest?
Back to top
View user's profile Send private message
Help-Me-Out

New User


Joined: 09 Dec 2006
Posts: 56
Location: Pune

PostPosted: Mon May 05, 2008 4:56 pm    Post subject: Reply to: How to delete duplicate records from a table using
Reply with quote

Hi,
try this

DELETE FROM tab_EMP
where EMP_ID IN (SELECT EMP_ID FROM TAB_EMP
GROUP BY EMP_ID
HAVING COUNT(*) > 1)
Back to top
View user's profile Send private message
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

PostPosted: Mon May 05, 2008 5:02 pm    Post subject:
Reply with quote

help-me-out,

Have you tested your SQL?

looks like it will delete any emp_id that exists more that once in the table.

The OP wants to retain one of the dups, apparently one with the oldest 'date'.

and since the OP has not responed since saturday, have to assume he is as bored with this question as i am.
Back to top
View user's profile Send private message
rahuindo

New User


Joined: 09 Apr 2008
Posts: 83
Location: Chennai

PostPosted: Tue May 06, 2008 8:46 am    Post subject: Reply to: How to delete duplicate records from a table using
Reply with quote

Hi All,
Thanks for your replies..i was out of town and hence sorry for the late replies.
The oldest record means the records which were the first one created. Later on, due to some problems in the application, duplicate records got inserted and hence these are the latest records.
help-me-out..the query will delete all the records which i do not want!!
i think the only way out would be to write a small program to delete these duplicate records. I think dbzTHEdinosauer and Craq Giegerich will agree..whats your take??
Back to top
View user's profile Send private message
dick scherrer

Site Director


Joined: 23 Nov 2006
Posts: 19270
Location: Inside the Matrix

PostPosted: Tue May 06, 2008 8:58 am    Post subject:
Reply with quote

Hello rahuindo,

I may be mis-reading your posts, but i read from the first post as wanting to "retain the latest" and from the last post, it is the later rows that are duplicates and should be removed.

In neither case will the sql posted by Sandy come even close. As has been noted, all duplicates would be deleted and there is no menton of a date consideration in that code. I suppose if all of them were deleted, there would be no need to check the date. . .

I'm probably old-fashioned (i really like having a backup before i do surgery on some data), but i'd probably unload the table (which would be my backup), sort the records in the proper sequence (id/date), and then discard the rows i no longer needed creating a new file i'd use to reload the table.
Back to top
View user's profile Send private message
rahuindo

New User


Joined: 09 Apr 2008
Posts: 83
Location: Chennai

PostPosted: Tue May 06, 2008 9:59 am    Post subject:
Reply with quote

Hi dick..
oops.i wrote the opposite..actually i want to retain the latest records only..i have a small doubt to the solution you have provided. after sorting the records in the proper sequence, how do i delete the rows i no longer need..will it be manual because there is no sync between the dates of the insertion of the duplicate records and the original records. and then there are more than thousand records having a duplicate records.
Back to top
View user's profile Send private message
dick scherrer

Site Director


Joined: 23 Nov 2006
Posts: 19270
Location: Inside the Matrix

PostPosted: Tue May 06, 2008 10:58 am    Post subject:
Reply with quote

Hello rahuindo,

Quote:
will it be manual
No - that would take too long and be too error-prone icon_wink.gif

Quote:
i have a small doubt to the solution you have provided. after sorting the records in the proper sequence, how do i delete the rows i no longer need
Once the records are sorted by "the id" (ascending) and "the date" descending, you could probaly use your sort product or a bit of cobol to code to read the sorted data and discard all but the first record for each id. The resulting output from that step will be the ones you want to reload.

Quote:
because there is no sync between the dates of the insertion of the duplicate records and the original records
This should not be an issue (if i understand the requirement correctly) because what will retained is the "single, newest" records that meets your rule. All of the older duplicates will be discarded.
Back to top
View user's profile Send private message
rahuindo

New User


Joined: 09 Apr 2008
Posts: 83
Location: Chennai

PostPosted: Tue May 06, 2008 1:04 pm    Post subject:
Reply with quote

thanks dick..i think this will be the best option..
Back to top
View user's profile Send private message
dick scherrer

Site Director


Joined: 23 Nov 2006
Posts: 19270
Location: Inside the Matrix

PostPosted: Tue May 06, 2008 1:39 pm    Post subject: Reply to: How to delete duplicate records from a table using
Reply with quote

You're welcome - good luck icon_smile.gif
Back to top
View user's profile Send private message
sainathvinod

New User


Joined: 01 Apr 2008
Posts: 11
Location: Chennai

PostPosted: Tue May 06, 2008 2:40 pm    Post subject: Reply to: How to delete duplicate records from a table using
Reply with quote

If you are particular about doing it in a querry........ just try this out.................its working 4 me:-

DELETE FROM TAB_EMP
WHERE EMP_ID IN
(SELECT T1.EMP_ID FROM TAB_EMP T1
WHERE T1.DATE <
(
SELECT MAX(T2.DATE) FROM TAB_EMP T2
WHERE T1.EMP_ID = T2.EMP_ID
)
GROUP BY T1.EMP_ID
HAVING COUNT(*) >= 1)
Back to top
View user's profile Send private message
rahuindo

New User


Joined: 09 Apr 2008
Posts: 83
Location: Chennai

PostPosted: Tue May 06, 2008 2:51 pm    Post subject:
Reply with quote

Hey Sai..thanks i will give a try..this is another option which i can suggest.
Thanks a lot to all of you for all your support.
Back to top
View user's profile Send private message
dick scherrer

Site Director


Joined: 23 Nov 2006
Posts: 19270
Location: Inside the Matrix

PostPosted: Wed May 07, 2008 5:16 am    Post subject:
Reply with quote

Hello,

No matter how you delete the unneeded rows, you still need a backup.

An audit trail of what you deleted would be appropriate also.
Back to top
View user's profile Send private message
rahuindo

New User


Joined: 09 Apr 2008
Posts: 83
Location: Chennai

PostPosted: Fri May 09, 2008 11:32 am    Post subject:
Reply with quote

Hi all,
I deleted the records with the following query:
DELETE FROM empl A
WHERE A.empl_id IN
( SELECT B.empl_id FROM empl B
GROUP BY B.empl_id
HAVING COUNT(B.empl_id) > 1 )
AND A.CRT_TS <
( SELECT MAX(C.CRT_TS) FROM empl C
WHERE C.empl= A.empl) ;
Before deleting i unloaded the table and kept a backup. The query deleted all the un-needed records.
Back to top
View user's profile Send private message
dick scherrer

Site Director


Joined: 23 Nov 2006
Posts: 19270
Location: Inside the Matrix

PostPosted: Sat May 10, 2008 4:52 am    Post subject: Reply to: How to delete duplicate records from a table using
Reply with quote

Thank you for posting your solution icon_smile.gif

d
Back to top
View user's profile Send private message
View previous topic :: :: View next topic  
Post new topic   Reply to topic    IBMMAINFRAMES.com Support Forums -> DB2 All times are GMT + 6 Hours
Page 1 of 1

 

Search our Forum:

Similar Topics
Topic Author Forum Replies Posted
No new posts JCL to delete component in a package sundaram.naveen Compuware & Other Tools 14 Tue Nov 29, 2016 6:21 pm
No new posts Limit duplicate records in the SORT pshongal SYNCSORT 6 Mon Nov 21, 2016 12:54 pm
This topic is locked: you cannot edit posts or make replies. Db2 SQL query to convert rows into co... subratarec DB2 5 Thu Nov 17, 2016 4:51 pm
No new posts Two Selects in a single select query Rohit Umarjikar DB2 1 Fri Nov 04, 2016 8:46 pm
No new posts How to split the records using the am... vnktrrd DFSORT/ICETOOL 24 Fri Oct 28, 2016 7:33 pm


Facebook
Back to Top
 
Mainframe Wiki | Forum Rules | Bookmarks | Subscriptions | FAQ | Tutorials | Contact Us