theone1987
New User
Joined: 14 Sep 2009 Posts: 60 Location: Milan
Hello everybody,
I need help.
I need to delete a very large number of rows from a DB2 table. The records to delete are identified in a PS (sequential) file.
What is the quickest way to do it: unload/load, or delete?
I hope that is clear.
Thanks
Kjeld
Active User
Joined: 15 Dec 2009 Posts: 365 Location: Denmark
You probably don't want to query and delete each individual row from your PS file, given the number of rows involved.
I can suggest two approaches based on a total unload followed by a load of the retained rows:
1. Find a common set of criteria for the rows in your PS file. How were these rows selected for cleaning out? You could invert that selection (i.e. specify the rows to keep) and do a selective load from a total unload.
2. Write a program that reads the key values from the PS file and updates one or more columns in the selected rows to distinctively mark them as obsolete. Then make the total unload and selectively reload the table with only the rows that do not have the obsolete marker set.
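The mark-as-obsolete idea above can be sketched in miniature. This is an illustration only, not DB2 utility code: the record layout (key, data, obsolete flag) and the names are invented for the example.

```python
# Sketch of the mark-as-obsolete approach: after a program has flagged
# the rows named in the PS file, the reload keeps only unflagged rows.
# The record layout (key, data, obsolete flag) is a made-up example.

def retained_rows(unload_records):
    """Return the records whose obsolete flag is not set."""
    return [rec for rec in unload_records if not rec[2]]

unload = [
    ("K1", "data1", False),
    ("K2", "data2", True),   # flagged obsolete by the update program
    ("K3", "data3", False),
]
print([rec[0] for rec in retained_rows(unload)])  # ['K1', 'K3']
```

In the real job, the same filtering would be expressed in the load utility's selection criteria rather than in application code.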
theone1987
I did not understand the first approach.
As for the second approach: is it practical to write a program that reads each row from the file and performs an update on the table?
I want this to take as little time as possible.
Kjeld
If you know what criteria the records (rows) in your input file were selected by, you could use that knowledge to select which rows to keep (i.e. re-load into the table) and avoid using the dataset at all.
The second approach: you decide... If you want to use the information in your PS file, I don't think you can avoid some coding.
A third approach could be processing the unload file to discard all rows found in the PS input file; that could even be done with a batch utility and the applicable control statements.
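The third approach is essentially a two-file match: drop from the unload every record whose key appears in the PS file, which is the kind of thing a sort utility's matching control statements (e.g. a JOINKEYS-style join) express declaratively. A minimal Python sketch of the logic, with hypothetical file layouts:

```python
# Two-file match: discard from the unload every row whose key appears
# in the PS file. A batch sort utility would do the same match with
# control statements instead of code. Layouts here are hypothetical.

def discard_listed(unload_rows, ps_keys):
    """Return unload rows whose key is NOT listed in the PS file."""
    delete_set = set(ps_keys)        # keys read from the PS file
    return [row for row in unload_rows if row[0] not in delete_set]

unload_rows = [("A", 1), ("B", 2), ("C", 3)]
ps_keys = ["B"]                      # rows to be cleaned out
print(discard_listed(unload_rows, ps_keys))  # [('A', 1), ('C', 3)]
```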
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Hello,
Quote:
I need to clear a very large number of rows in the DB
What is "very large"?
How many rows are identified to be deleted? How many total rows are in the table currently? Is this a one-time process, or will it be run regularly?
theone1987
dick scherrer wrote:
Quote:
I need to clear a very large number of rows in the DB
What is "very large"?
The table has over 80 million rows. I have to run the "delete process" once a month.
dick scherrer
Hello,
And how many rows will be deleted each month?
How long does it take to unload the table? How long to reload?
Knowing these may help decide which way to proceed. . .
theone1987
dick scherrer wrote:
How long does it take to unload the table? How long to reload?
I haven't yet tried to unload and reload the table; more than 2 hours for sure...
I had thought of doing a load by partition.
dick scherrer
Sigh. . . One bit at a time. . .
Quote:
And how many rows will be deleted each month?
theone1987
What was your idea?
dick scherrer
To find out how many rows will be deleted each month. . .
theone1987
dick scherrer wrote:
To find out how many rows will be deleted each month. . .
At least 500 thousand.
dick scherrer
Hello,
Quote:
At least 500 thousand.
More valuable would be the "max" rather than the "at least". . .
Quote:
The table has over 80 million rows. I have to run the "delete process" once a month.
Even if the number of rows to be deleted were twice the "at least", that would be only 1 million out of 80 million, which is just over 1% of the table.
I suspect the process will run fastest by directly deleting the necessary rows, if the rows are distributed randomly through the table.
Have you run a test to see how long it takes to delete a million rows? Doing so might provide useful info. . .
Kjeld
You should confirm with the DBA at your installation whether a direct delete of the rows is feasible. Delete statements put quite a heavy load on the DB2 system, as every deleted row has to be written to the database log so it can be restored to the table in a rollback situation.
You should design your delete code to commit frequently. Deleting 1,000,000 rows in one LUW (logical unit of work) can take a considerable amount of time, and in case of an abend the rollback can take longer than the delete itself did, reading from the DB2 log and restoring rows into the tablespace and indexes.
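The frequent-commit pattern described above can be sketched as follows. This is only an illustration of the commit-interval logic: sqlite3 stands in for DB2, and the table and column names are invented for the example.

```python
import sqlite3

def delete_with_commits(conn, keys, batch_size=1000):
    """Delete rows key by key, committing every batch_size deletes
    so an abend never forces a huge rollback of one giant LUW."""
    cur = conn.cursor()
    pending = 0
    for key in keys:
        cur.execute("DELETE FROM big_table WHERE id = ?", (key,))
        pending += 1
        if pending >= batch_size:
            conn.commit()            # keep each unit of work small
            pending = 0
    conn.commit()                    # commit the final partial batch

# Tiny demonstration with an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big_table (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO big_table VALUES (?)",
                 [(i,) for i in range(10)])
conn.commit()
delete_with_commits(conn, [2, 5, 7], batch_size=2)
print(conn.execute("SELECT COUNT(*) FROM big_table").fetchone()[0])  # 7
```

In real DB2 batch code the same idea would be an explicit COMMIT every N deletes, with restart logic so the job can resume after the last committed key if it abends.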