Joined: 05 Jun 2009 Posts: 185 Location: Planet Earth
Hello all.
I was recently analyzing one long running job which I felt can finish in a faster time. I profiled the job using performance monitoring tool and observed that 99% of the service time is spent on a single FETCH CURSOR statement. Average Service time per fetch statement is about 45 seconds.
I am kind of confused about how this can happen. I have seen a lot of cases where a high CPU time/service time being spent on OPEN CURSOR statement because of inefficient coding. But fetch cursor consuming an average response time of 45 seconds is something that I could not understand the reason behind it. Can anybody show some light as in what can be the causes behind this and what are the options that might help in reducing the same ?
Joined: 05 Jun 2009 Posts: 185 Location: Planet Earth
Hello,
Quote:
Needs clarification. . . If there were only 2,000 requests, this would take more than a day. . .
Possibly there is misunderstanding of the performance statistics
The facts are actually correct. Luckily there are only 294 fetch calls making it complete in less than 4 hours.
Quote:
What does an EXPLAIN show?
I actually did not tried running EXPLAIN because, as of now I am of an undertsanding that a Fetch statement cannot perform different based on the access path and EXPLAIN results can only impact OPEN cursor statements.. My understanding is Open cursor will open and generate the rowset that qualifies for the declared cursor and fetch statement just fetches them into the program. Please correct me if i am wrong. I can run EXPLAIN if I can understand why I am running it or If you need any information from it. But as of now, I am not sure what information you are looking for from EXPLAIN results
Quote:
How are the declare and the fetch coded?
Does this cursor build a temporary result set?
Yes, the cursor will build a temporary result set. I might not be able to share the exact query for security reasons But I will try to WRITE the sample format as the declared cursor in my program.
Code:
DECLARE MAIN_CURSOR CURSOR FOR
SELECT
COL1
, VALUE (COL2,0) AS COL2
, VALUE (COL3,0) AS COL3
FROM
( SELECT
COL4
VALUE (COL5,0) AS COL5
FROM
TABLE1
WHERE
COL6 = ? AND COL7 = ?
) TAB1
INNER JOIN
(
SELECT
COL8
FROM
TABLE2
WHERE
COL9 = ? AND COL10 = ?
) TAB2
ON COLA = COLB AND COLC = COLD
INNER JOIN TABLE3 TAB3
ON COLE = COLF AND COLG = COLH
Fetch statement is a normal FETCH INTO ... statement. As there are only 294 fetches, I did not think about multirow fetch options as well.
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello,
Quote:
The facts are actually correct.
Well, facts have to be correct - else they are not facts
Quote:
Luckily there are only 294 fetch calls making it complete in less than 4 hours.
Can you find out which fetch statements take the longest? With only 294, and an average of 45 seconds, there may be one or 2 or some small number that accunt for most of the time. It might be helpful to add a bit of test code to calculate the time each fetch does take.
Your simplified SQL doesn't make sense : you are joining on and selecting columns that don't exist in your subselects.
How would you know that open will create a temporary result set if you don't explain ?
Depending on the accesspath and temporary result set, It is
or the OPEN that takes very long
or each FETCH.
It is very possible that each fetch does a tablescan of table3 voor each row of table1.
There is no way of telling without explain-info
and no decent advice can be given without a correct SQL and available index definitions on the relevant columns.
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello,
My bad - started to add this yesterday and became sidetracked . . .
Quote:
My understanding is Open cursor will open and generate the rowset that qualifies for the declared cursor and fetch statement just fetches them into the program.
Joined: 05 Jun 2009 Posts: 185 Location: Planet Earth
Sorry folks.. Lost access to internet yesterday and hence could not reply immediately.
Quote:
It might be helpful to add a bit of test code to calculate the time each fetch does take
True. But my problem is this code is currently running in production. The test system doesnot have sufficient test data to run so long. And the process involved in moving a code change into production is complex to promote 'test codes'.
Quote:
How long does the cursor open take?
Open cursor service time is 0.00740 seconds.
Quote:
Your simplified SQL doesn't make sense : you are joining on and selecting columns that don't exist in your subselects.
May be wrong representation from my side I will provide the actual code itself with minor changes(column names)
Code:
DECLARE MAIN_CURSOR CURSOR FOR
SELECT
TAB1_CUS_ID
,TAB1_TAB3_NB
,VALUE (TAB1_TAB3_NB_IND,0) AS TAB1_TAB3_NB_IND
,TAB1_QDC_CD
,TAB1_DIL_CD
,TAB1_MOC_CD
,TAB1_NB
,TAB1_HSU_TS
,VALUE (TAB3_PNT_BEN_ID,0) AS TAB3_PNT_BEN_ID
,VALUE (TAB3_BNX_SEQ_NB,0) AS TAB3_BNX_SEQ_NB
,VALUE (TAB3_DSP_QY,0) AS TAB3_DSP_QY
,VALUE (TAB3_ACC_NDC_ID,0) AS TAB3_ACC_NDC_ID
,VALUE (TAB3_NHU_CD,0) AS TAB3_NHU_CD
,VALUE (TAB3_NTW_PBR_ID,0) AS TAB3_NTW_PBR_ID
,VALUE (TAB2_CRL_ID,0) AS TAB2_CRL_ID
FROM
( SELECT
TAB1_CUS_ID
,VALUE(TAB1_TAB3_NB,1) AS TAB1_TAB3_NB
,TAB1_TAB3_NB AS TAB1_TAB3_NB_IND
,TAB1_QDC_CD
,TAB1_DIL_CD
,TAB1_MOC_CD
,TAB1_NB
,TAB1_HSU_TS
FROM
TTAB1_PROBLEM
WHERE
TAB1_QDC_CD = ?
AND TAB1_DIL_CD = ?
AND (TAB1_IAR_CD IS NULL OR TAB1_IAR_CD = 0)
) TABLE1
INNER JOIN
(
SELECT
TAB2_CUS_ID
,TAB2_TAB3_NB
,TAB2_QDC_CD
,TAB2_DIL_CD
,TAB2_TAB1_NB
,TAB2_CRL_ID
FROM
TTAB2_RULES
WHERE
TAB2_QDC_CD = ?
AND TAB2_DIL_CD = ?
AND TAB2_CRL_ID = ?
) TABLE2
ON TAB1_CUS_ID = TAB2_CUS_ID
AND TAB1_TAB3_NB = TAB2_TAB3_NB
AND TAB1_QDC_CD = TAB2_QDC_CD
AND TAB1_DIL_CD = TAB2_DIL_CD
AND TAB1_NB = TAB2_TAB1_NB
INNER JOIN TTAB3_REQUESTS TABLE3
ON TAB1_CUS_ID = TAB3_CUS_ID
AND TAB1_TAB3_NB = TAB3_NB
The following are the index details of the three tables involved,
Code:
INDEX PROD XTAB1001 0 U 0
COL TAB1_CUS_ID 1 A INTEGER 4 N N
COL TAB1_QDC_CD 2 A SMALLINT 2 N N
COL TAB1_DIL_CD 3 A SMALLINT 2 N N
COL TAB1_NB 4 A SMALLINT 2 N N
INDEX PROD XTAB1002 0 D 0
COL TAB1_HSU_TS 1 A TIMESTMP 10 N Y
INDEX PROD XTAB2001 0 D 0
COL TAB2_BEN_ID 1 A INTEGER 4 Y Y
COL TAB2_PBR_ID 2 A INTEGER 4 Y Y
COL TAB2_CRL_ID 3 A INTEGER 4 N N
INDEX PROD XTAB2002 0 U 0
COL TAB2_CUS_ID 1 A INTEGER 4 N N
COL TAB2_TAB3_NB 2 A SMALLINT 2 N N
COL TAB2_QDC_CD 3 A SMALLINT 2 N N
COL TAB2_DIL_CD 4 A SMALLINT 2 N N
COL TAB2_TAB1_NB 5 A SMALLINT 2 N N
INDEX PROD XTAB3001 0 U 0
COL TAB3_CUS_ID 1 A INTEGER 4 N N
COL TAB3_NB 2 A SMALLINT 2 N N
INDEX PROD XTAB3002 0 D 0
COL TAB3_PNT_BEN_ID 1 A INTEGER 4 Y Y
INDEX PROD XTAB3003 0 D 0
COL TAB3_BLS_TS 1 A TIMESTMP 10 N Y
COL TAB3_PHM_NABP_ID 2 A CHAR 7 Y Y S
Quote:
There is no way of telling without explain-info
and no decent advice can be given without a correct SQL and available index definitions on the relevant columns.
Index details just being given, I am providing the EXPLAIN results as well below,
Code:
QUERYNO PLANNO METHOD TNAME TABNO ACCESSTYPE MATCHCOLS ACCESSCREATOR ACCESSNAME INDEXONLY SORTN_UNIQ
------+---------+---------+---------+---------+---+---------+---------+---------+---------+---------+-------------+
000001 1 0 TTAB2_RULES 3 I 0 PROD XTAB2001 N N
000001 2 1 TTAB1_PROBLEM 1 I 4 PROD XTAB1001 N N
000001 3 1 TTAB3_REQUESTS 5 I 2 PROD XTAB3001 N N
SORTN_JOIN SORTN_ORDERBY SORTN_GROUPBY SORTC_UNIQ SORTC_JOIN SORTC_ORDERBY SORTC_GROUPBY TSLOCKMODE PREFETCH
---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+------
N N N N N N N IS S
N N N N N N N IS
N N N N N N N IS
PS : I have chopped off a few columns from plan table because of the complexity in copying. Let me know if more columns are required.
And Guyc,
am sorry about my unawareness, but I seriously dont know how to find out from EXPLAIN results if the query might consume time in OPEN or FETCH statements. Please share some information regarding same if possible.
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello,
Quote:
The test system doesnot have sufficient test data to run so long. And the process involved in moving a code change into production is complex to promote 'test codes'.
There either needs to be full-size data for testing or the diagnosis needs to be done in the production environment.
One way might be to promote some "one-time" diagnostic code rather than a double-promote of the actual code (once for testing and then back to the original code).
Sounds like another case where this was tested with very little data and when the "system" performed acceptably, it was promoted with no "real world" testing. . . Unfortunately very common these days. . .
- none of the sortN, sortC columns are 'Y' , which means no sorts are being done.
-Method=1 => Nested Loop : no intermediate tables needed
right now the explain says :
1) All problems are read via index XTAB2001
2) for every qualifying problem, all rules for this cus_id,... (join_cols) are read via index XTAB1001
3) for every qualifying rule the requests are read via index XTAB3001
When no sorts are done, there are no intermediate result sets , and thus the open doesn't do antyhing.
Each fetch will search for the next row and return that one.
I have no idea of the number of quailifying rows, or which criteria is most restrictive.
Very important in this kind of accesspath is the clusterratio of the involved indexes. If very low then a lot of synchronous I/O will cause a long response time.