View previous topic :: View next topic
|
Author |
Message |
vini_srcna
Active User
Joined: 26 May 2005 Posts: 178 Location: Copenhagen, Denmark
|
|
|
|
Hi,
I have come across some presentations saying there should be some difference between the following queries.(not much may be litle bit).
SELECT A,B FROM TABLE WHERE A = :VAR
and
SELECT B FROM TABLE WHERE A = :VAR
I have read articles saying, Select the columns only you need. The unused columns may be additional over head.
Could there be any minor difference in CPU or IO for the above two queries..?.
Please justify. Thanks..!! |
|
Back to top |
|
|
expat
Global Moderator
Joined: 14 Mar 2007 Posts: 8797 Location: Welsh Wales
|
|
|
|
Why not try it for yourself and let us know your findings ? |
|
Back to top |
|
|
Santoshdorge
New User
Joined: 27 Jun 2006 Posts: 48 Location: Pune
|
|
|
|
Hi,
Basically this depends on the index of the table.In the above case if the index is created on column A definately performance of the first query would be better than that of second.
thanks,
Santosh |
|
Back to top |
|
|
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
|
|
|
|
????????????? the only difference between the two queries is that the first returns two columns, the second only one column.
so, what do you think would take more time, dealing with two columns or only dealing with one?
Quote: |
Basically this depends on the index of the table.In the above case if the index is created on column A definately performance of the first query would be better than that of second.
|
that's quite a mouth-full but what does that have to do with the question? Whatever affect an index would have, it would have on both queries.
think about what you post...... |
|
Back to top |
|
|
Santoshdorge
New User
Joined: 27 Jun 2006 Posts: 48 Location: Pune
|
|
|
|
Hi Dick,
I do agree with you index if there any would affect both the queries.
As per my understanding that should not be preformance difference if you are pointing to the number of columns being selected in query.Because records from the db2 internal data sets are bring into buffer as a whole not a perticular column and then filtered out for particular column.
Plz do correct me if i am going in wrong direction.
thanks,
Santosh. |
|
Back to top |
|
|
vini_srcna
Active User
Joined: 26 May 2005 Posts: 178 Location: Copenhagen, Denmark
|
|
|
|
I still think there should be some difference. Nothing comes for free.
It may not be measurable. Will let you know when i get the right answer. |
|
Back to top |
|
|
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
|
|
|
|
Hello,
Quote: |
It may not be measurable. Will let you know when i get the right answer. |
You may need to run a very large test to see the difference - if you can spot it at all.
If the 2 columns are contiguous in the database and the variables in the application are too, a single move from the database to the application might suffice and spotting the tiny bit of difference needed to define the length of the move would be nearly impossible. |
|
Back to top |
|
|
vini_srcna
Active User
Joined: 26 May 2005 Posts: 178 Location: Copenhagen, Denmark
|
|
|
|
Hi,
I tried running an Explain on this queries. Let me tell how and what i did this.
I ran an EXPLAIN in Production on one of the tables where it had the latest statistics. I have my own Plan table and DSN tables in PROD.
Table name: ICB_data
column names:
Tms --> TIMESTAMP NOT NULL WITH DEFAULT
REFNR --> CHAR(10) NOT NULL
and other columns which we dont need.
There is an Unique index built on TMS and REFNR in the order,
COLNAME ORDER
TMS A
REFNR A
I ran DB2 Explain on the below four queries for this table.
EXPLAIN PLAN SET QUERYNO = 96
SELECT REFNR, TMS FROM ICB_DATA WHERE REFNR = ?
EXPLAIN PLAN SET QUERYNO = 97
SELECT TMS FROM ICB_DATA WHERE REFNR = ?
EXPLAIN PLAN SET QUERYNO = 98
SELECT REFNR, TMS FROM ICB_DATA WHERE TMS = ?
EXPLAIN PLAN SET QUERYNO = 99
SELECT REFNR FROM ICB_DATA WHERE TMS = ?
Once I ran explain, I queried the DSN_STATEMENT table to check the COST_CATEGARY, Estimated milliseconds and Estimated Serivce units. I just had hope that the columns milliseconds and service units will be different in each case however it was not the case.
I dont understand why the queires 96 & 97 are going with MATCHCOLS = 0 though there is an index built on REFNR. I see it is the last column on the index however the ACCESSTYPE is still I. All the queries says it has INDEX ONLY ACCESS.
Here is the result from Dsn statement table.
EXPLAIN COST
QUERYNO TIME CATEGORY PROCMS PROCSU
-------- -------------------------- -------- ----------- -----------
96 2007-12-27-13.40.07.930000 A 173 4206
97 2007-12-27-13.41.25.100000 A 173 4206
98 2007-12-27-13.43.02.940000 A 1 1
99 2007-12-27-13.43.18.690000 A 1 1
So Is there any difference to measure..?. Is there any other place we can see more information..? I just did what all i could do as am not expertise.
Please let me know your comments on this. Thanks a lot for your time. |
|
Back to top |
|
|
stodolas
Active Member
Joined: 13 Jun 2007 Posts: 632 Location: Wisconsin
|
|
|
|
Looks like you showed yourself that the time is not different at all. |
|
Back to top |
|
|
vini_srcna
Active User
Joined: 26 May 2005 Posts: 178 Location: Copenhagen, Denmark
|
|
|
|
Sorry stodolas,
I didnt understand what you mean. |
|
Back to top |
|
|
stodolas
Active Member
Joined: 13 Jun 2007 Posts: 632 Location: Wisconsin
|
|
|
|
Your results show no difference in the PROCSU (processor service units column?). Which means that you answered your own question. There is no measurable difference in DB2 processor usage between your statements. |
|
Back to top |
|
|
vini_srcna
Active User
Joined: 26 May 2005 Posts: 178 Location: Copenhagen, Denmark
|
|
|
|
Hi Stodolas,
If am right, the Explain gives the estimated CPU cost however it wont give the I/O Cost.
Most of the things we are trying to measure might come under I/O. |
|
Back to top |
|
|
stodolas
Active Member
Joined: 13 Jun 2007 Posts: 632 Location: Wisconsin
|
|
|
|
The difference in I/O is negligible, probably immeasurable. DB2 stores rows in a special type of VSAM. I don't know for sure, but it probably reads a whole row to get a single column or to get multiple columns.
If you are looking at this to improve performance, I suggest looking elsewhere. |
|
Back to top |
|
|
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10873 Location: italy
|
|
|
|
Quote: |
Most of the things we are trying to measure might come under I/O. |
the explain works at the profiling level for the query involved,
by elucubrating on the different paths that can be used to reach the data
the I/O is far far away from here..
to measure the I/O too may things are involved..
buffering, concurrency,persistency...
from a theorethical point of view, assuming a single threaded process
the first query woiuld bear all the I/O burden
the second time the query is run..
it will execute exactly the same instruction path as far as the searching is concerned,
but will take advantage of the persistency of the data in the buffer pools..
Since, apart large objects, the columns which constitute a row are all grouped together in what could be defined a "logical record"
the overhead involved in fetching a non conditioning column is .... IRRELEVANT
maybe a couple of mvc... |
|
Back to top |
|
|
|