View previous topic :: View next topic
|
Author |
Message |
krunalbafna Warnings : 1 Active User
Joined: 18 Jan 2010 Posts: 143 Location: Pune
|
|
|
|
I have tow tables:
Tabel A: having employee id in decimal format
Tabel B: having employee id in Char format (PIC X(12))
I want to extract those records from table A which are not present in table B.
I used following query:
SELECT EMPID,EMPCDE FROM TABLEA
WHERE EMPID NOT IN (
SELECT
CAST (EMPID AS DECIMAL(12)) FROM TABLEB))
But above query is taking longer time to execute.
Any efficient way available to solve the above issue and imporve the performance.
Kindly provide your feedback.
Regards,
Krunal |
|
Back to top |
|
|
Gousiya Mulla
New User
Joined: 02 Jun 2008 Posts: 87 Location: Bangalore
|
|
|
|
I think we can make use of joins ... ! |
|
Back to top |
|
|
Phrzby Phil
Senior Member
Joined: 31 Oct 2006 Posts: 1050 Location: Richmond, Virginia
|
|
|
|
How's about a better title, ok? |
|
Back to top |
|
|
GuyC
Senior Member
Joined: 11 Aug 2009 Posts: 1281 Location: Belgium
|
|
|
|
not joins, but subselect "where not exists" usually performs better
People should try NOT to write "IN ()"-subselects
Only when your IN-list has a very limited number of elements it will perform good
Do you need columns from TabA which aren't in TabB?
is EMPID unique in TabA? TabB?
are there indexes on TabA, TabB on or beginning with Empid ?
You can also have a look at "EXCEPT", a new kind of union in DB2 9. maybe that is useful |
|
Back to top |
|
|
krunalbafna Warnings : 1 Active User
Joined: 18 Jan 2010 Posts: 143 Location: Pune
|
|
|
|
Hi...
My tables has huge amount of data around 75lacs.
I want records only from TABLEA |
|
Back to top |
|
|
krunalbafna Warnings : 1 Active User
Joined: 18 Jan 2010 Posts: 143 Location: Pune
|
|
|
|
The indexes being with EMPID in both tables.
As i said i want only records from TABLEA which are not present in TABLEB for combination of EMPID,EMPCDE and EMPNO. |
|
Back to top |
|
|
GuyC
Senior Member
Joined: 11 Aug 2009 Posts: 1281 Location: Belgium
|
|
|
|
like i said
Code: |
select ... from tab1
where not exists (select 1 from tab2 where tab1.col1 = tab2.col1 and tab1.col2 = tab2.col2) |
|
|
Back to top |
|
|
krunalbafna Warnings : 1 Active User
Joined: 18 Jan 2010 Posts: 143 Location: Pune
|
|
|
|
Hi GUYC,
TABLEA has empid in decimal (COBOL Comp-3) format
TABLEB has empid in char (COBOL X) format
Your query:
select ... from tab1
where not exists (select 1 from tab2 where tab1.col1 = tab2.col1 and tab1.col2 = tab2.col2)
What does Select 1 means:
may you please explain.
Thanks |
|
Back to top |
|
|
GuyC
Senior Member
Joined: 11 Aug 2009 Posts: 1281 Location: Belgium
|
|
|
|
select 1 : since "not exists()" only test if a row is there , there is no need to select columns in that subselect. the shortest column is simply literal 1.
char() is the function to change decimal to char but maybe that is not the exact correct format, so you might have to do substr()
Experiment with "select char(EmpId) from tabA " to see what you get
and see if it resembles tabB.Empid
if tab1.EmpID is DEC(12,0) it would be something like :
where tabB.EMPID =substr(char(tabA.EmpID),2,12) |
|
Back to top |
|
|
krunalbafna Warnings : 1 Active User
Joined: 18 Jan 2010 Posts: 143 Location: Pune
|
|
|
|
GuyC,
I have tried using above query but the process is even slower. It takes more than 45mins to execute.
Any possible method to increase the efficiency to get the results very fast.
Regards and Thanks |
|
Back to top |
|
|
GuyC
Senior Member
Joined: 11 Aug 2009 Posts: 1281 Location: Belgium
|
|
|
|
How can you reply "more than 45mins" 26mins after my post ?
besides : "elapsed time"? are you sure the workload on your mainframe was the same as previous test ? that the same rows were in the bufferpools ?
How about CPUsec ? GETPAGES ?
Do both tables have 7.500.000 rows ?
Are the indexes only on EmpID ? are the indexes clustered ? reorged ? unique ?
you did try where tabB.EMPID =substr(char(tabA.EmpID),2,12)
and not where cast(tabB.EMPID as dec(12)) =tabA.EmpID ?
This is a big difference. one is stage1 indexable the other one isn't |
|
Back to top |
|
|
krunalbafna Warnings : 1 Active User
Joined: 18 Jan 2010 Posts: 143 Location: Pune
|
|
|
|
hey i know post is 26mins old....
But i have to stop the query in between as it is not showing the output. I calculated the time and it may take 45mins.
Substr query is giving me sqlcost:753.938.432
select TAB1.EMPID,TAB1.EMPCD from tab1
where not exists (select 1 from tab2 where tab2.empid = substr(char(tab1.EmpID),2,12) and tab1.empcd = tab2.empcd)
Using cast query sqlcost: 29.051
select TAB1.EMPID,TAB1.EMPCD from tab1
where not exists (select 1 from tab2 where tab1.empid = cast(tab2.empid as decimal(12)) and tab1.empcd = tab2.empcd)
Kindly check and provide your commetns. |
|
Back to top |
|
|
krunalbafna Warnings : 1 Active User
Joined: 18 Jan 2010 Posts: 143 Location: Pune
|
|
|
|
GuyC,
Both tables have different record counts. |
|
Back to top |
|
|
GuyC
Senior Member
Joined: 11 Aug 2009 Posts: 1281 Location: Belgium
|
|
|
|
Is there an index on tab2 on or beginning with (Empid,empcd) or (Empcd,Empid)? you really need one to get this performant.
if there is one, it really doesn't make sense and you'll need to provide explain info.
another possibility is (if you have DB2 V9 )
Code: |
select EMPID , empcd from tabA
EXCEPT ALL
select cast(EMPID as dec(12)) ,empcd from tabB |
or
Code: |
select substr(char(tab1.EmpID),2,12) , empcd from tabA
EXCEPT ALL
select EMPID ,empcd from tabB |
|
|
Back to top |
|
|
Craq Giegerich
Senior Member
Joined: 19 May 2007 Posts: 1512 Location: Virginia, USA
|
|
|
|
Some consistency in defining the same data item in different tables would help. |
|
Back to top |
|
|
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
|
|
|
|
Hello,
Quote: |
But i have to stop the query in between as it is not showing the output. I calculated the time and it may take 45mins. |
If the data "is not showing", how was the calculation made. . .
You might consider unloading the necessary columns from both tables and comparing the files in batch . . . |
|
Back to top |
|
|
krunalbafna Warnings : 1 Active User
Joined: 18 Jan 2010 Posts: 143 Location: Pune
|
|
|
|
Our Db2 version is 8. I am getting following error when i execute:
-084, ERROR: UNACCEPTABLE SQL STATEMENT
= 42612 SQLSTATE RETURN CODE
Thanks..
Any modification will make us imporve performance? |
|
Back to top |
|
|
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
|
|
|
|
krunalbafna,
you are wasting a post to tell us that db2 vsn 9 syntax does not work on db2 Vsn 8.
at this point you need to run an explain on both queries- substr(char)
- cast as decimal
and based on the explain output - add/modify indexes
- insure that all predicates are indexable
|
|
Back to top |
|
|
|