hailashwin
New User · Joined: 16 Oct 2008 · Posts: 74 · Location: Boston
Hi,
There is an existing job that reads a file to get an account number and an indicator, then goes to the table to retrieve all the rows that satisfy the condition. More than one row can be returned, so we use a cursor. The table is quite large, and for every record read from the file the cursor is opened and closed again. Since the account file is also quite large, the performance of the job is a cause for concern.
Below are the alternative approaches I know of that would produce the same output, but I am not sure which would give the best performance:
1. DSNTIAUL - Unload the whole table, then work out a SORT JOIN with the account file and produce the output. But when I ran this step, it seemed to run forever.
2. Multi-row retrieval - Retrieve a rowset and process it, so that there is only one fetch for every read of the file (a rough sketch follows this list).
3. Unload utility - Unload the entire table and use a SORT JOIN with the account file to produce the output. This is my least preferred method, as I would have to get the DBA to create the utility jobs for me.
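For option 2, this is roughly what I have in mind. The table and column names below are placeholders, not the real ones; the :HV- host variables would be declared in the program and the statements issued through EXEC SQL:
Code:
    -- Rowset cursor: one FETCH returns up to 100 rows into
    -- host-variable arrays. All names here are hypothetical.
    DECLARE CSR1 CURSOR WITH ROWSET POSITIONING FOR
      SELECT ACCT_NO, IND_COL, DATA_COL
      FROM   ACCT_TBL
      WHERE  ACCT_NO = :HV-ACCT-NO;

    OPEN CSR1;

    -- After the fetch, SQLERRD(3) in the SQLCA holds the number of
    -- rows actually returned.
    FETCH NEXT ROWSET FROM CSR1
      FOR 100 ROWS
      INTO :HV-ACCT-ARR, :HV-IND-ARR, :HV-DATA-ARR;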
With my few years of experience, I have seen jobs that performed well on the Testplex go haywire in the Prodplex, so I thought I would get some expert opinions before working on this task.
Any suggestions/comments are welcome.
Thanks,
Ashwin.
prino
Senior Member · Joined: 07 Feb 2009 · Posts: 1306 · Location: Vilnius, Lithuania
Quote:
    ...quite huge... ...it seemed to run forever...
Maybe you can be a bit more specific?
GuyC
Senior Member · Joined: 11 Aug 2009 · Posts: 1281 · Location: Belgium
At least give us some numbers:
input file: n1 records
table: n2 rows
output: n3 records
Information such as the clustering, indexes, and partitioning of the table, and the sort order of the input file, is also relevant.
PeterHolland
Global Moderator · Joined: 27 Oct 2009 · Posts: 2481 · Location: Netherlands, Amstelveen
Maybe you can load the account file into a DB2 table and perform a JOIN on the two tables.
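Something along these lines, assuming the account file has first been loaded into a work table (all table and column names here are made up for illustration):
Code:
    -- ACCT_WORK holds the account file (loaded via the LOAD utility
    -- or INSERTs); BIG_TBL is the large table. Names are hypothetical.
    SELECT B.*
    FROM   ACCT_WORK W
    INNER JOIN BIG_TBL B
            ON B.ACCT_NO = W.ACCT_NO
           AND B.IND_COL = W.IND
    ORDER BY B.ACCT_NO;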
hailashwin
New User · Joined: 16 Oct 2008 · Posts: 74 · Location: Boston
Robert/GuyC,
The elapsed time for the DSNTIAUL step was 16 minutes and the CPU time was 4 minutes (these values are not guaranteed to be representative, as it was run in the development environment) for one hundred thousand accounts on the file, and the table had 7 million rows. The SELECT statement I used for this step had no predicates in the WHERE clause, as I had to match an indicator to one of the table columns after reading the flat file.
The table is not partitioned, and the input file is sorted by account number.
Peter,
Creating a separate table is out of my control here.
Thanks,
Ashwin.
GuyC
Senior Member · Joined: 11 Aug 2009 · Posts: 1281 · Location: Belgium
100,000 / 7,000,000 ≈ 1.4%.
You didn't mention how many rows qualify, so I'll assume 3 rows for each input record, which comes to about 300,000 rows. So roughly 5% of the table is what you actually need.
If you have a decent access path, selecting 5% of the table is probably faster than unloading and sorting.
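The kind of access I mean is a simple indexed probe per input record, something like this (names are placeholders; this assumes an index on the account-number column):
Code:
    -- Per-account cursor; an index on ACCT_NO is what makes selecting
    -- only ~5% of the table cheap. All names are hypothetical.
    DECLARE ACCT_CSR CURSOR FOR
      SELECT ACCT_NO, IND_COL, DATA_COL
      FROM   ACCT_TBL
      WHERE  ACCT_NO = :HV-ACCT-NO;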
The following I didn't understand:
Quote:
    The SELECT statement I used for this step had no predicates in the WHERE clause, as I had to match an indicator to one of the table columns after reading the flat file.
If this is remotely close to what I suspect, I have no idea how you would solve that in your sort of the unloaded files.
don.leahy
Active Member · Joined: 06 Jul 2010 · Posts: 765 · Location: Whitby, ON, Canada
You should also consider opening the cursor once at the start of the program and then using traditional match/merge logic between it and the account file. (It goes without saying that the ORDER BY in the cursor must match the sort sequence of the account file.)
Depending on how many records there are in the account file, this may be much more efficient than opening and closing the cursor thousands of times. The reason is that FETCHes are relatively cheap compared to OPENs. A sketch of the idea follows.
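In outline, with invented names (the compare-and-advance loop is ordinary program logic, shown here only as comments):
Code:
    -- One cursor over the table, opened once at program start; the
    -- ORDER BY matches the sort order of the account file.
    DECLARE MERGE_CSR CURSOR FOR
      SELECT ACCT_NO, IND_COL, DATA_COL
      FROM   ACCT_TBL
      ORDER BY ACCT_NO;

    -- Match/merge outline:
    --   read the first file record; FETCH the first row
    --   while neither input is exhausted:
    --     file key < cursor key -> read the next file record
    --     file key > cursor key -> FETCH the next row
    --     keys equal            -> apply the indicator test, write the
    --                              output, FETCH until the key changes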
GuyC
Senior Member · Joined: 11 Aug 2009 · Posts: 1281 · Location: Belgium
1) I wouldn't fetch through the whole table when only 5% of the rows are needed.
2) The statement "FETCHes are relatively cheap compared to OPENs" is one I wouldn't repeat without looking at the access path.
don.leahy
Active Member · Joined: 06 Jul 2010 · Posts: 765 · Location: Whitby, ON, Canada
GuyC wrote:
    1) I wouldn't fetch through the whole table when only 5% of the rows are needed.
    2) The statement "FETCHes are relatively cheap compared to OPENs" is one I wouldn't repeat without looking at the access path.
1) Neither would I... unless it is possible that the 5% will become 25% or higher.
2) It's a rule of thumb, not a law of physics. Your mileage may vary.