meetsupriyo
New User
Joined: 19 May 2009 Posts: 11 Location: Hyderabad
Hi,
I have a requirement to load around 90 million records from a PS file into a DB2 table.
I am considering the three options below. Please suggest which one will provide more performance benefits:
Option 1: Build a one-off COBOL program that reads the data from the file and loads it into the DB2 table via INSERT SQL. The program supports programmatic restart: say on the first day I load 5M records and store the last record key in a parameter table. The next day, the program skips those 5M records by reading the file sequentially and starts loading from the record after the 5M mark.
Option 2: Copy the PS file into a KSDS VSAM file so that the record skipping can be done with a single START command. All other processing would be the same as option 1.
Option 3: Break the large file into a number of smaller files, say files of 10M records each, and use option 1 without the restart logic.
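The restart bookkeeping in option 1 might be sketched roughly as below. This is only an illustration of the idea; the table, column, and host-variable names (RESTART_PARM, LAST_KEY, :WS-LAST-KEY, the job name 'LOAD90M') are all made up, not part of any real design here:

```sql
-- Hypothetical one-row checkpoint table for the restart key.
CREATE TABLE RESTART_PARM
  (JOB_NAME   CHAR(8)   NOT NULL,
   LAST_KEY   CHAR(20)  NOT NULL,
   ROWS_DONE  INTEGER   NOT NULL,
   PRIMARY KEY (JOB_NAME));

-- After each commit interval the program would update the checkpoint:
UPDATE RESTART_PARM
   SET LAST_KEY = :WS-LAST-KEY,
       ROWS_DONE = ROWS_DONE + :WS-COUNT
 WHERE JOB_NAME = 'LOAD90M';

-- On restart, read it back and skip forward past LAST_KEY:
SELECT LAST_KEY, ROWS_DONE
  INTO :WS-LAST-KEY, :WS-ROWS
  FROM RESTART_PARM
 WHERE JOB_NAME = 'LOAD90M';
```

The checkpoint UPDATE would be taken in the same unit of work as the inserts, so the stored key always matches what was actually committed.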
Any option other than the DB2 LOAD utility is very much appreciated. DB2 LOAD is not an option for us as it will take a table-level lock and the table would not be accessible during the load process. The table must be available 24/7.
Thanks,
Supriyo.
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10873 Location: italy
Quote:
Any option other than the DB2 LOAD utility is very much appreciated. DB2 LOAD is not an option for us as it will take a table-level lock and the table would not be accessible during the load process. The table must be available 24/7.
using proper terminology is essential for effective communication.
looks like Your understanding of <load> is pretty foggy.
during a load the table is anyway unavailable to other <transactions>,
so why do You worry?
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
you could always CREATE a new table and use the DB2 LOAD utility to load that table.
the advantages are:
1. the load would happen rather quickly.
2. you would know what kind of space you need.
3. you would not affect the production table in the event of an abend, which would require a rollback, and rollbacks are very resource intensive.
4. all the rows would be in a table, which you could then RUNSTATS and compare against the production table's RUNSTATS.
After doing any reconfiguration necessary to the production table in anticipation of adding 90 million rows, you could accomplish the load from the new table to production via SQL, using a WHERE clause to limit each SQL INSERT, or use MERGE if you happen to have a decent level of DB2.
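The staged copy described above might look something like the sketch below. The schema, table, column, and host-variable names are hypothetical, and MERGE support varies by DB2 version and platform (older DB2 for z/OS levels restrict which source forms MERGE accepts), so treat this as an outline of the idea rather than a tested statement:

```sql
-- Copy one key range per unit of work so locks stay small.
-- STAGE.ACCOUNTS is the freshly LOADed copy; PROD.ACCOUNTS is live.
INSERT INTO PROD.ACCOUNTS
  SELECT * FROM STAGE.ACCOUNTS
   WHERE ACCT_KEY BETWEEN :LOW-KEY AND :HIGH-KEY;
COMMIT;

-- Or, where your DB2 level supports a query as the MERGE source,
-- insert-or-update in one pass:
MERGE INTO PROD.ACCOUNTS P
  USING (SELECT ACCT_KEY, BALANCE FROM STAGE.ACCOUNTS
          WHERE ACCT_KEY BETWEEN :LOW-KEY AND :HIGH-KEY) S
  ON P.ACCT_KEY = S.ACCT_KEY
  WHEN MATCHED THEN
       UPDATE SET P.BALANCE = S.BALANCE
  WHEN NOT MATCHED THEN
       INSERT (ACCT_KEY, BALANCE) VALUES (S.ACCT_KEY, S.BALANCE);
COMMIT;
```

Driving the key ranges from a control table would also give the restart capability the original poster asked for.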
Blindly loading 90 million rows into a production table without knowing the effect you will cause is like stepping on the accelerator and just holding on.
hopefully you have online reorg available, as you will need to REORG your production table, RUNSTATS, the whole kit and caboodle.
even if your prod db is in the giga-row range, 90 million rows is a large increase.
you have to plan for it, and without the intermediate 'new' table, you will have more problems than providing for 24/7 availability.
i don't imagine you have partitioned tables?
if you have to ask on an internet forum, you obviously do not have the talent in-house to properly plan for this size of an increase.
meetsupriyo
New User
Joined: 19 May 2009 Posts: 11 Location: Hyderabad
Dear enrico,
The underlying DB2 table must be available during this load process, so I can't go with the DB2 utility LOAD.
Hope this makes sense.
don.leahy
Active Member
Joined: 06 Jul 2010 Posts: 765 Location: Whitby, ON, Canada
meetsupriyo wrote:
Dear enrico,
The underlying DB2 table must be available during this load process, so I can't go with the DB2 utility LOAD.
Hope this makes sense.
Have you looked at the SHRLEVEL(CHANGE) option of the LOAD utility?
Note also that your application program will be taking a lot of locks, even if you are loading "only" 5M rows at a time. Unless you take frequent commits, you may run into lock escalation.
prino
Senior Member
Joined: 07 Feb 2009 Posts: 1306 Location: Vilnius, Lithuania
Putting 90M rows into a table via an application program will log everything; another reason to use the LOAD utility...
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello,
Quote:
The underlying DB2 table must be available during this load process, so I can't go with the DB2 utility LOAD.
I believe this is overstated. Why does someone believe the entire table must be "available" DURING THIS LOAD?
What % of the data in the table does the 90 million represent?
While the inserts are being done, for the duration of the process, the table content is not reliable, as there can be inserted rows that some processes "see" and some do not.
In a test environment, how long does it take to insert 90 million individual rows? How long to LOAD the same data with this as the only active process?
gylbharat
Active Member
Joined: 31 Jul 2009 Posts: 565 Location: Bangalore
Quote:
The program supports programmatic restart: say on the first day I load 5M records and store the last record key in a parameter table. The next day, the program skips those 5M records by reading the file sequentially and starts loading from the record after the 5M mark.
In how many days are you planning to load the data? 90M / 5M = 18 days?
Is this data not critical? Is it some kind of history data, not used by any application reports, such that the load can span 18 days?
Have you tried this process in test to see how much time it takes?
jerryte
Active User
Joined: 29 Oct 2010 Posts: 202 Location: Toronto, ON, Canada
If you use the SHRLEVEL CHANGE LOAD utility parms
Code:
LOAD DATA RESUME YES SHRLEVEL CHANGE LOG YES
then it functions just like a program, doing a series of SQL inserts instead of a load. The excerpt below is from the DB2 Utilities manual:
SHRLEVEL CHANGE "Specifies that applications can concurrently read from and write to the table space or partition into which LOAD is loading data."
I am not sure if it commits each row as it is inserted or if DB2 waits until the end to insert.
Some issues to consider:
- is the table clustered? If so, are you adding data in the middle or at the end? You may need to do a REORG.
- is the table partitioned? If so, you can do the load into one partition at a time, leaving the other partitions free.
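A partition-level load of the kind described in the last point could be sketched as below. The table name, partition number, and DD name are hypothetical; check the LOAD syntax for your DB2 level before relying on it:

```
LOAD DATA INDDN(SYSREC) RESUME YES SHRLEVEL CHANGE LOG YES
  INTO TABLE MYSCHEMA.MYTABLE PART 3
```

Running one such job per partition, each fed its own input file, keeps concurrent activity confined to the partition being loaded.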
Consider using ICETOOL to break up the file into smaller files.
If you use a COBOL program, it will need to take frequent commits so that the table does not get locked.
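The file split suggested above could be done with a DFSORT COPY step; a minimal sketch, assuming fixed 10M-record chunks and made-up output DD names, is below. OUTFIL SPLIT1R is intended to write each successive group of n records to the next file in the FNAMES list (verify against your DFSORT level):

```
//SYSIN DD *
  SORT FIELDS=COPY
  OUTFIL FNAMES=(OUT1,OUT2,OUT3,OUT4,OUT5,OUT6,OUT7,OUT8,OUT9),
    SPLIT1R=10000000
/*
```

Each OUTn DD would then feed one run of the load program, giving the effect of option 3 without hand-splitting the file.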