I have a cursor as mentioned below:
DECLARE AUT-ADR-C2 CURSOR FOR
SELECT
U5TAXYR,
U5ADACCT,
U5STRNO,
U5STRNM,
U5UNIT,
U5CITY,
U5STATE,
U5ZIP,
U5ATYPCD
FROM AUT_MVADDR_T
WHERE ( U5ADACCT = :U5ADACCT )
AND
( U5TAXYR = :WW-TAXYR )
AND
( U5PRYRIN = :MAP-D-PRIOR-YR-IND )
AND
( U5ATYPCD = :WK01-AUT-MAIL
OR
U5ATYPCD = :WK01-TAX-ADDR )
ORDER BY U5ATYPCD
The indexes on AUT_MVADDR_T table are as follows
1) AUT_MVADDR_I
U5ATYPCD ASC
U5CRCTYR ASC
U5CRCTNO ASC
U5TAXYR DESC
U5ADACCT ASC
U5PRYRIN ASC
2) AUT_MVADDR_ALT2_I
U5CRCTYR ASC
U5TAXYR ASC
U5CRCTNO ASC
U5ADACCT ASC
3) AUT_MVADDR_ALT1_I
U5CRCTYR ASC
U5CRCTNO ASC
Basically the index AUT_MVADDR_I has 5 matching columns and the index AUT_MVADDR_ALT2_I has 2 matching columns. When I checked the DB2 access path in Plan_table , the SELECT statement is picking up AUT_MVADDR_ALT2_I. Can someone explain me why DB2 doesn't pick up the primary index AUT_MVADDR_I as it has more number of matching columns?
It could be many things but I'll try to help. The DB2 optimizer chooses one index over another by comparing, basically, the filter factor and cluster ratio (even when it's not defined as a clustering index).
The filter factor is calculated based on the FULL KEY CARDINALITY, the number of distinct values for the combined entire key. The DB2 optimizer divides the full key cardinality by the number of total rows in the table. If that value is close to 0, it means the index does not filter really well, thus DB2 will not use it. On the other hand, a good filter factor has a value close to 1.
On the cluster ratio. I don't really know how it functions internally, but I know DB2 physically organizes data based on the clustering index. If you haven't defined a clustering index, DB2 will use the first index created on the table. Was AUT_MVADDR_ALT2_I created before AUT_MVADDR_I? If so, AUT_MVADDR_ALT2_I's cluster ratio should be higher then AUT_MVADDR_I, which affects the DB2 optimizer choice.
You may check cluster ratio and fullkeycard by running this query:
Code:
SELECT NAME, CLUSTERING,
CLUSTERED, CLUSTERRATIOF*100 AS CLUSTERRATIO, FULLKEYCARDF
FROM SYSIBM.SYSINDEXES
WHERE NAME = 'AUT_MVADDR_ALT2_I' OR NAME = 'AUT_MVADDR_I'
So... the solution you ask? Run statistics on AUT_MVADDR_I and AUT_MVADDR_ALT2_I and on the columns that composes the indexes. It might help. You can also 'cheat' the DB2 optimizer by setting CLUSTERRATIOF for AUT_MVADDR_I to 1. Or you can define AUT_MVADDR_I as a clustering index and run a REORG on the tablespace.
Thank you so much for providing wonderful explanation.
The "AUT_MVADDR_I" is the first index created and defined as clustered.
Below is the result of your query.
NAME CLUSTERING CLUSTERED CLUSTERRATIO FULLKEYCARDF
------------------ ---------- --------- ------------ ------------
AUT_MVADDR_I Y Y 9.99980E+01 5.29384E+06
AUT_MVADDR_ALT2_I N Y 9.99912E+01 4.84196E+06
The "AUT_MVADDR_I" is the first index created and defined as clustered.
Below is the result of your query.
NAME CLUSTERING CLUSTERED CLUSTERRATIO FULLKEYCARDF
------------------ ---------- --------- ------------ ------------
AUT_MVADDR_I Y Y 9.99980E+01 5.29384E+06
AUT_MVADDR_ALT2_I N Y 9.99912E+01 4.84196E+06
Now that's odd. Both cluster ratio and filter are better for AUT_MVADDR_I than for AUT_MVADDR_ALT2_I. Maybe the DB2 optimizer can do a better job using U5TAXYR ASC over U5TAXYR DESC?
Try this:
1. Save the cluster ratio value of AUT_MVADDR_I
2. Set the cluster ratio to 1.
Code:
UPDATE SYSIBM.SYSINDEXES SET CLUSTERRATIOF = 1
WHERE NAME = 'AUT_MVADDR_I'
3. Run the query and check it's access path. DB2 Should choose AUT_MVADDR_I. But I can't guarantee you that the query will see a increase in performance.
By the way, you indexes have low cluster ratios. It is a inherent characteristic of using multiple indexes. You may improve the cluster ratio if you drop a index or two and do a REORG on the tablespace and indexes.