IBM Mainframe Forum Index
 
Log In
 
IBM Mainframe Forum Index Mainframe: Search IBM Mainframe Forum: FAQ Register
 

Removing selective rows using DELETE SQL


IBM Mainframe Forums -> DB2
Post new topic   Reply to topic
View previous topic :: View next topic  
Author Message
himanshupant

New User


Joined: 21 Mar 2007
Posts: 46
Location: India

PostPosted: Tue Feb 08, 2011 6:58 pm
Reply with quote

Hello

I have a query ( bad pun)

Suppose there is a table with following rows

Code:
Branch No          Account No          Date            TYPE     AMOUNT

11111                 22222               2010-12-01      C          35
11111                 22222               2010-12-01      C          10
11111                 22222               2011-01-01      C          10
11111                 22222               2011-01-01      C          10
11111                 22222               2011-02-01      C          10
11111                 22222               2011-03-01      C          30
11111                 22222               2011-04-01      D          10


We want to delete all duplicate rows in a way that

1) For a given DATE there should only be ONE C type record.Furthermore that C record should be the one which has the least AMOUNT values amongst all the C records for that particular branch and account on that particular day.
2) In case two or more records have same value of amount and type C on a given day then only one occurrence should be there.

So the above table after execution of the SQL should look like as below:-

Code:
Branch No          Account No          Date            TYPE     AMOUNT

11111                 22222               2010-12-01      C          10
11111                 22222               2011-01-01      C          10
11111                 22222               2011-02-01      C          10
11111                 22222               2011-03-01      C          30
11111                 22222               2011-04-01      D          10


A point to be noted is that branch + account combination is unique. In other words a same account number could be existing in another branch number. Similarly a single branch would have multiple accounts.

I think we need to use NESTED SELECT inside the WHERE clause of DELETE SQL but am not able to get this out.
Back to top
View user's profile Send private message
Kjeld

Active User


Joined: 15 Dec 2009
Posts: 365
Location: Denmark

PostPosted: Wed Feb 09, 2011 2:55 am
Reply with quote

Is this question a student assigment?

Mass deletions are not DB2's favorite. Depending on data quantities, I would consider unloading the data, sort the unload in Branch #, Account #, Date, Type, Amount. The sort routine could possibly be configured only to output the first occurrence of the records of type 'C' for a given combination of the first 4 keys. Alternatively, I would code a small batch program to do it.

Then you load the result dataset to the table with replace option.
Back to top
View user's profile Send private message
himanshupant

New User


Joined: 21 Mar 2007
Posts: 46
Location: India

PostPosted: Wed Feb 09, 2011 7:10 am
Reply with quote

this is not a student assignment..One time data cleanup as described above needs to be done that is why SQL DELETE would be preferred
Back to top
View user's profile Send private message
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Wed Feb 09, 2011 9:10 am
Reply with quote

Hello,

The better solution would to be do as suggested and retain the "footprints" of what was done. A blind delete provides no audit trail. . .

Also, as this has happened, having a solid solution will help if something similar happens one day.

There is no reason to limit solutions to the smallest amount of sql code that can be written.
Back to top
View user's profile Send private message
himanshupant

New User


Joined: 21 Mar 2007
Posts: 46
Location: India

PostPosted: Wed Feb 09, 2011 9:27 am
Reply with quote

Sure Dick...

I concur with your point that DELETE does leave no audit trail.. I had the routine safety measures in mind like taking a pre DELETE IMAGE COPY , exhaustive checkouts once DELETE is done so ensure that no unwanted data is remaining and no genuine data is lost.

Development effort for batch program would have made sense had this been a regular cleanup ...

However I just thought of a altogether different approach to tweak the data retrieval SELECT queries so that the records which I want to be deleted are not FETCHED in the first place in any application program.

Something like the below

Code:

SELECT                               
      BRANCH NO,
      ACCOUNT NO,         
      DATE,               
      TYPE,               
  MIN(AMOUNT)                     
      FROM TABLE     
      GROUP BY                       
      BRANCH NO,             
      ACCOUNT NO,       
      DATE,               
      TYPE                 
      HAVING  TYPE = 'C'
Back to top
View user's profile Send private message
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Wed Feb 09, 2011 9:40 am
Reply with quote

Hello,

Quote:
different approach to tweak the data retrieval SELECT queries so that the records which I want to be deleted are not FETCHED in the first place in any application program.
I don't understand how this deals with the requirement that was being talked about originally icon_confused.gif

If the table has become corrupted with rows that should not be in the table, how will this SELECT really help? I suggest you correct the data in the table and not make a mess througout the entire application as well as any "other" processes that need to read the table. What about ad-hoc queries/reporting?
Back to top
View user's profile Send private message
himanshupant

New User


Joined: 21 Mar 2007
Posts: 46
Location: India

PostPosted: Thu Feb 10, 2011 12:45 pm
Reply with quote

Hi Dick

I am wanting to tweak the SELECT because very few rows are corrupted and the development effort required to cleanup them up would be difficult to justify in my establishment.

The above SELECT query which I have highlighted is helping me to get the required rows where TYPE is "C"

But the requirement is to get all the possible rows for each branch / account between two dates. The requirement for TYPE = C still stands . So for a branch / account if on a give day between the given date range , if there are multiple 'C' type records we only select single minimum amount from it.
Back to top
View user's profile Send private message
GuyC

Senior Member


Joined: 11 Aug 2009
Posts: 1281
Location: Belgium

PostPosted: Thu Feb 10, 2011 1:18 pm
Reply with quote

you shouldn't use HAVING for fields which are know before grouping.
there is no point in grouping everything and then discarding <> "C"

Code:
SELECT BRANCH NO, ACCOUNT NO, DATE, TYPE, MIN(AMOUNT)
      FROM TABLE1
      where TYPE = 'C'
GROUP BY BRANCH NO, ACCOUNT NO, DATE, TYPE
Back to top
View user's profile Send private message
Kjeld

Active User


Joined: 15 Dec 2009
Posts: 365
Location: Denmark

PostPosted: Thu Feb 10, 2011 1:26 pm
Reply with quote

The requirement was to DELETE the rows. I am not sure if it is at all possible to delete specific rows from a select with aggregate functions imbedded in it.
Back to top
View user's profile Send private message
GuyC

Senior Member


Joined: 11 Aug 2009
Posts: 1281
Location: Belgium

PostPosted: Thu Feb 10, 2011 1:30 pm
Reply with quote

you can use aggregate functions in a subselect of a delete, but :
I don't think you can Delete 1 of 2 "completely the same rows".
maybe with RID(), but haven't tried that yet.
Back to top
View user's profile Send private message
himanshupant

New User


Joined: 21 Mar 2007
Posts: 46
Location: India

PostPosted: Thu Feb 10, 2011 2:48 pm
Reply with quote

Kjeld wrote:
The requirement was to DELETE the rows. I am not sure if it is at all possible to delete specific rows from a select with aggregate functions imbedded in it.



Unfortunately it looks like this only so I have started thinking on changing the data retrieval routines so that unwanted rows are not in the result set.. Maybe a collection of UNION with SELECT along with nested queries...
Back to top
View user's profile Send private message
GuyC

Senior Member


Joined: 11 Aug 2009
Posts: 1281
Location: Belgium

PostPosted: Thu Feb 10, 2011 4:41 pm
Reply with quote

DB2 is mostly limited by the knowledge of its users.

Code:
select BRANCHNO, ACCNO, DT1, type, AMOUNT , rid(A)
,case when rid(A) is distinct from (      select min(rid(b)) from table1 B
         where a.BRANCHNO = b.BRANCHNO
           and a.ACCNO = b.accno
           and a.DT1 = b.dt1
           and a.type = b.type
           and a.AMOUNT = (select min(amount) from table1 C
            where a.BRANCHNO = C.BRANCHNO
              and a.ACCNO = C.accno
              and a.DT1 = C.dt1
              and a.type = C.type )
          )
      then '<=Del' else '' end
      from  table1 A
     where A.type = 'C'



Code:
delete
  from table1 A
 where A.type = 'C'
   and rid(A) is distinct from  (
      select min(rid(b)) from table1 B
         where a.BRANCHNO = b.BRANCHNO
           and a.ACCNO = b.accno
           and a.DT1 = b.dt1
           and a.type = b.type
           and a.AMOUNT = (select min(amount) from table1 C
            where a.BRANCHNO = C.BRANCHNO
              and a.ACCNO = C.accno
              and a.DT1 = C.dt1
              and a.type = C.type )
          )
Back to top
View user's profile Send private message
Kjeld

Active User


Joined: 15 Dec 2009
Posts: 365
Location: Denmark

PostPosted: Thu Feb 10, 2011 5:07 pm
Reply with quote

GuyC wrote:
DB2 is mostly limited by the knowledge of its users.

True, but often knowledge is aquired when it is actively needed. And then your true skills in aquiring knowledge efficiently will make the difference.

Anyway, it looks like a neat solution, Guy icon_cool.gif
Back to top
View user's profile Send private message
himanshupant

New User


Joined: 21 Mar 2007
Posts: 46
Location: India

PostPosted: Thu Feb 10, 2011 9:25 pm
Reply with quote

Kjeld wrote:
GuyC wrote:
DB2 is mostly limited by the knowledge of its users.

True, but often knowledge is aquired when it is actively needed. And then your true skills in aquiring knowledge efficiently will make the difference.

Anyway, it looks like a neat solution, Guy icon_cool.gif


I do agree it does look very neat.. and I do agree with you that knowledge is aquired when actively needed . One more example of this is various possible usage of IBM DFSORT
Back to top
View user's profile Send private message
himanshupant

New User


Joined: 21 Mar 2007
Posts: 46
Location: India

PostPosted: Thu Feb 10, 2011 9:29 pm
Reply with quote

GuyC wrote:
DB2 is mostly limited by the knowledge of its users.

Code:
select BRANCHNO, ACCNO, DT1, type, AMOUNT , rid(A)
,case when rid(A) is distinct from (      select min(rid(b)) from table1 B
         where a.BRANCHNO = b.BRANCHNO
           and a.ACCNO = b.accno
           and a.DT1 = b.dt1
           and a.type = b.type
           and a.AMOUNT = (select min(amount) from table1 C
            where a.BRANCHNO = C.BRANCHNO
              and a.ACCNO = C.accno
              and a.DT1 = C.dt1
              and a.type = C.type )
          )
      then '<=Del' else '' end
      from  table1 A
     where A.type = 'C'



Code:
delete
  from table1 A
 where A.type = 'C'
   and rid(A) is distinct from  (
      select min(rid(b)) from table1 B
         where a.BRANCHNO = b.BRANCHNO
           and a.ACCNO = b.accno
           and a.DT1 = b.dt1
           and a.type = b.type
           and a.AMOUNT = (select min(amount) from table1 C
            where a.BRANCHNO = C.BRANCHNO
              and a.ACCNO = C.accno
              and a.DT1 = C.dt1
              and a.type = C.type )
          )



Correct me if I am wrong but the DELETE query looks like it would delete only one occurence of the minimum C type record. In case there are 3 'C' entries with same value , only one value should remain
Back to top
View user's profile Send private message
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Fri Feb 11, 2011 12:55 am
Reply with quote

Hello,

I'm a bit late to the party, but this:
Quote:
I am wanting to tweak the SELECT because very few rows are corrupted and the development effort required to cleanup them up would be difficult to justify in my establishment.
makes no sense. . . The "development effort" should be minutes possibly even an hour or 2. . .

Versus the "effort" of propogating this thru every query that does (or ever will be written) query the table. The small number of rows should not even be considered. It is the code that will take the time and cause confusion.

Had the first reply/suggestion from Kjeld been implemented this would have been complete 2 days ago. . .

Just because data lives in a database is no reason to use only the database to work on the data. Wholesale processes are often better served operating outside the database/sql.
Back to top
View user's profile Send private message
GuyC

Senior Member


Joined: 11 Aug 2009
Posts: 1281
Location: Belgium

PostPosted: Fri Feb 11, 2011 1:52 pm
Reply with quote

DB2 version ?
I think RID() came in DB2 v9 NFM.

If two rows are completely identical(all columns), the only difference is the RID.

pre v9 you have a few possibilities ( some already explained) :
1) do an unload/sort/reload
2) write a small program with a cursor and do a DELETE WHERE CURRENT OF
3) create a new table and do an insert (select group by)
...
Back to top
View user's profile Send private message
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

PostPosted: Fri Feb 11, 2011 3:07 pm
Reply with quote

himanshupant wrote:
looks like it


had you tested, and based on the results of the test, reached the same conclusion as your assumption,
i would pay attention to your criticizm of the sql.

you didn't, so I won't.
Back to top
View user's profile Send private message
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Fri Feb 11, 2011 8:33 pm
Reply with quote

Testing - we don' need no steenkin' testing. . . icon_wink.gif

d
Back to top
View user's profile Send private message
View previous topic :: :: View next topic  
Post new topic   Reply to topic View Bookmarks
All times are GMT + 6 Hours
Forum Index -> DB2

 


Similar Topics
Topic Forum Replies
No new posts DELETE SPUFI DB2 1
No new posts To get the count of rows for every 1 ... DB2 3
No new posts DSNTIAUL driven delete IBM Tools 0
No new posts Exclude rows with > than x occurre... DFSORT/ICETOOL 6
No new posts How to delete a user's alias from the... JCL & VSAM 11
Search our Forums:

Back to Top