Portal | Manuals | References | Downloads | Info | Programs | JCLs | Master the Mainframes
IBM Mainframe Computers Forums Index
 
Register
 
IBM Mainframe Computers Forums Index Mainframe: Search IBM Mainframe Forum: FAQ Memberlist Usergroups Profile Log in to check your private messages Log in
 

 

Data from table.

 
Post new topic   Reply to topic    IBMMAINFRAMES.com Support Forums -> DB2
View previous topic :: :: View next topic  
Author Message
krunalbafna
Warnings : 1

Active User


Joined: 18 Jan 2010
Posts: 143
Location: Pune

PostPosted: Wed Aug 11, 2010 2:06 pm    Post subject: Data from table.
Reply with quote

I have tow tables:
Tabel A: having employee id in decimal format
Tabel B: having employee id in Char format (PIC X(12))

I want to extract those records from table A which are not present in table B.

I used following query:
SELECT EMPID,EMPCDE FROM TABLEA
WHERE EMPID NOT IN (
SELECT
CAST (EMPID AS DECIMAL(12)) FROM TABLEB))

But above query is taking longer time to execute.
Any efficient way available to solve the above issue and imporve the performance.

Kindly provide your feedback.

Regards,
Krunal
Back to top
View user's profile Send private message

Gousiya Mulla

New User


Joined: 02 Jun 2008
Posts: 87
Location: Bangalore

PostPosted: Wed Aug 11, 2010 4:50 pm    Post subject:
Reply with quote

I think we can make use of joins ... !
Back to top
View user's profile Send private message
Phrzby Phil

Active Member


Joined: 31 Oct 2006
Posts: 961
Location: Richmond, Virginia

PostPosted: Wed Aug 11, 2010 5:34 pm    Post subject:
Reply with quote

How's about a better title, ok?
Back to top
View user's profile Send private message
GuyC

Senior Member


Joined: 11 Aug 2009
Posts: 1278
Location: Belgium

PostPosted: Thu Aug 12, 2010 12:18 pm    Post subject:
Reply with quote

not joins, but subselect "where not exists" usually performs better

People should try NOT to write "IN ()"-subselects
Only when your IN-list has a very limited number of elements it will perform good

Do you need columns from TabA which aren't in TabB?
is EMPID unique in TabA? TabB?
are there indexes on TabA, TabB on or beginning with Empid ?

You can also have a look at "EXCEPT", a new kind of union in DB2 9. maybe that is useful
Back to top
View user's profile Send private message
krunalbafna
Warnings : 1

Active User


Joined: 18 Jan 2010
Posts: 143
Location: Pune

PostPosted: Thu Aug 12, 2010 3:19 pm    Post subject: Reply to: Data from table.
Reply with quote

Hi...
My tables has huge amount of data around 75lacs.

I want records only from TABLEA
Back to top
View user's profile Send private message
krunalbafna
Warnings : 1

Active User


Joined: 18 Jan 2010
Posts: 143
Location: Pune

PostPosted: Thu Aug 12, 2010 3:22 pm    Post subject:
Reply with quote

The indexes being with EMPID in both tables.

As i said i want only records from TABLEA which are not present in TABLEB for combination of EMPID,EMPCDE and EMPNO.
Back to top
View user's profile Send private message
GuyC

Senior Member


Joined: 11 Aug 2009
Posts: 1278
Location: Belgium

PostPosted: Thu Aug 12, 2010 3:45 pm    Post subject:
Reply with quote

like i said
Code:
select ... from tab1
where not exists (select 1 from tab2 where tab1.col1 = tab2.col1 and tab1.col2 = tab2.col2)
Back to top
View user's profile Send private message
krunalbafna
Warnings : 1

Active User


Joined: 18 Jan 2010
Posts: 143
Location: Pune

PostPosted: Thu Aug 12, 2010 3:53 pm    Post subject:
Reply with quote

Hi GUYC,
TABLEA has empid in decimal (COBOL Comp-3) format
TABLEB has empid in char (COBOL X) format

Your query:
select ... from tab1
where not exists (select 1 from tab2 where tab1.col1 = tab2.col1 and tab1.col2 = tab2.col2)

What does Select 1 means:

may you please explain.

Thanks
Back to top
View user's profile Send private message
GuyC

Senior Member


Joined: 11 Aug 2009
Posts: 1278
Location: Belgium

PostPosted: Thu Aug 12, 2010 4:07 pm    Post subject:
Reply with quote

select 1 : since "not exists()" only test if a row is there , there is no need to select columns in that subselect. the shortest column is simply literal 1.

char() is the function to change decimal to char but maybe that is not the exact correct format, so you might have to do substr()

Experiment with "select char(EmpId) from tabA " to see what you get
and see if it resembles tabB.Empid

if tab1.EmpID is DEC(12,0) it would be something like :
where tabB.EMPID =substr(char(tabA.EmpID),2,12)
Back to top
View user's profile Send private message
krunalbafna
Warnings : 1

Active User


Joined: 18 Jan 2010
Posts: 143
Location: Pune

PostPosted: Thu Aug 12, 2010 4:33 pm    Post subject:
Reply with quote

GuyC,
I have tried using above query but the process is even slower. It takes more than 45mins to execute.

Any possible method to increase the efficiency to get the results very fast.

Regards and Thanks
Back to top
View user's profile Send private message
GuyC

Senior Member


Joined: 11 Aug 2009
Posts: 1278
Location: Belgium

PostPosted: Thu Aug 12, 2010 4:47 pm    Post subject:
Reply with quote

How can you reply "more than 45mins" 26mins after my post ?
besides : "elapsed time"? are you sure the workload on your mainframe was the same as previous test ? that the same rows were in the bufferpools ?
How about CPUsec ? GETPAGES ?

Do both tables have 7.500.000 rows ?
Are the indexes only on EmpID ? are the indexes clustered ? reorged ? unique ?

you did try where tabB.EMPID =substr(char(tabA.EmpID),2,12)
and not where cast(tabB.EMPID as dec(12)) =tabA.EmpID ?

This is a big difference. one is stage1 indexable the other one isn't
Back to top
View user's profile Send private message
krunalbafna
Warnings : 1

Active User


Joined: 18 Jan 2010
Posts: 143
Location: Pune

PostPosted: Thu Aug 12, 2010 4:59 pm    Post subject:
Reply with quote

hey i know post is 26mins old....
But i have to stop the query in between as it is not showing the output. I calculated the time and it may take 45mins.

Substr query is giving me sqlcost:753.938.432

select TAB1.EMPID,TAB1.EMPCD from tab1
where not exists (select 1 from tab2 where tab2.empid = substr(char(tab1.EmpID),2,12) and tab1.empcd = tab2.empcd)

Using cast query sqlcost: 29.051

select TAB1.EMPID,TAB1.EMPCD from tab1
where not exists (select 1 from tab2 where tab1.empid = cast(tab2.empid as decimal(12)) and tab1.empcd = tab2.empcd)


Kindly check and provide your commetns.
Back to top
View user's profile Send private message
krunalbafna
Warnings : 1

Active User


Joined: 18 Jan 2010
Posts: 143
Location: Pune

PostPosted: Thu Aug 12, 2010 5:00 pm    Post subject:
Reply with quote

GuyC,
Both tables have different record counts.
Back to top
View user's profile Send private message
GuyC

Senior Member


Joined: 11 Aug 2009
Posts: 1278
Location: Belgium

PostPosted: Thu Aug 12, 2010 5:40 pm    Post subject:
Reply with quote

Is there an index on tab2 on or beginning with (Empid,empcd) or (Empcd,Empid)? you really need one to get this performant.
if there is one, it really doesn't make sense and you'll need to provide explain info.

another possibility is (if you have DB2 V9 )
Code:
select EMPID , empcd from tabA
EXCEPT ALL
select cast(EMPID as dec(12)) ,empcd from tabB

or
Code:
select substr(char(tab1.EmpID),2,12) , empcd from tabA
EXCEPT ALL
select EMPID ,empcd from tabB
Back to top
View user's profile Send private message
Craq Giegerich

Senior Member


Joined: 19 May 2007
Posts: 1512
Location: Virginia, USA

PostPosted: Thu Aug 12, 2010 5:48 pm    Post subject:
Reply with quote

Some consistency in defining the same data item in different tables would help.
Back to top
View user's profile Send private message
dick scherrer

Site Director


Joined: 23 Nov 2006
Posts: 19270
Location: Inside the Matrix

PostPosted: Thu Aug 12, 2010 9:48 pm    Post subject:
Reply with quote

Hello,

Quote:
But i have to stop the query in between as it is not showing the output. I calculated the time and it may take 45mins.
If the data "is not showing", how was the calculation made. . . icon_confused.gif

You might consider unloading the necessary columns from both tables and comparing the files in batch . . .
Back to top
View user's profile Send private message
krunalbafna
Warnings : 1

Active User


Joined: 18 Jan 2010
Posts: 143
Location: Pune

PostPosted: Fri Aug 13, 2010 2:26 pm    Post subject:
Reply with quote

Our Db2 version is 8. I am getting following error when i execute:

-084, ERROR: UNACCEPTABLE SQL STATEMENT
= 42612 SQLSTATE RETURN CODE


Thanks..
Any modification will make us imporve performance?
Back to top
View user's profile Send private message
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

PostPosted: Fri Aug 13, 2010 3:05 pm    Post subject:
Reply with quote

krunalbafna,

you are wasting a post to tell us that db2 vsn 9 syntax does not work on db2 Vsn 8.

at this point you need to run an explain on both queries
  • substr(char)
  • cast as decimal
and based on the explain output
  • add/modify indexes
  • insure that all predicates are indexable
Back to top
View user's profile Send private message
View previous topic :: :: View next topic  
Post new topic   Reply to topic    IBMMAINFRAMES.com Support Forums -> DB2 All times are GMT + 6 Hours
Page 1 of 1

 

Search our Forum:

Similar Topics
Topic Author Forum Replies Posted
No new posts ODPP(Optim Data privacy Provider) Iss... Rama kishore IBM Tools 1 Mon Nov 07, 2016 5:46 pm
No new posts Can sending 5 MB data between cobol p... Kevin Vaz CICS 12 Tue Oct 18, 2016 4:50 pm
No new posts output data in CSV Format janmejay COBOL Programming 8 Sat Oct 15, 2016 2:20 pm
This topic is locked: you cannot edit posts or make replies. output data in CSV Format janmejay DFSORT/ICETOOL 3 Sat Oct 15, 2016 2:16 pm
No new posts FTP Server and FTP.DATA vasanthz All Other Mainframe Topics 2 Wed Oct 12, 2016 11:49 pm


Facebook
Back to Top
 
Mainframe Wiki | Forum Rules | Bookmarks | Subscriptions | FAQ | Tutorials | Contact Us