IBM Mainframe Forum Index
 
Log In
 
IBM Mainframe Forum Index Mainframe: Search IBM Mainframe Forum: FAQ Register
 

Db2 SQL - how to switch among different where conditions


IBM Mainframe Forums -> DB2
Post new topic   Reply to topic
View previous topic :: View next topic  
Author Message
subratarec

Active User


Joined: 22 Dec 2007
Posts: 126
Location: Bangalore

PostPosted: Mon Apr 24, 2023 9:37 pm
Reply with quote

Hi,

Need your advice on the below scenario.

An old COBOL program has SELECT SQL where it has multiple WHERE clauses. But these WHERE clauses don't act together. Depending on variable values a single WHERE condition is switched on and others are ignored.

SQL Query:
-----------------------------------------
Code:
SELECT <Column1, Column2......ColumnN>
INTO <:DCLGEN variable1, DCLGEN variable2...variableN>
FROM <Table name>
WHERE ((Column1 = <:HOST Var1 OR 'N' = :HOST switch1)
             AND
             (Column1 = <:HOST Var1 OR 'N' = :HOST switch1)
             AND
             (Column1 = <:HOST Var1 OR 'N' = :HOST switch1)
             AND
             (Column1 = <:HOST Var1 OR 'N' = :HOST switch1))


Now in the above query as you can see there are 4 WHERE conditions connected with AND.

In the program, when we want to run the first condition under WHERE clause we are passing SPACE to HOST switch1 and 'N' to rest all three. Similarly when we want to activate the 3rd condition then SPACE to HOST switch3 and 'N' to rest all three and so on.

Now we are facing a performance issue with this query. The reason is, though we have 4 indexes (all 4 columns in WHERE condition) but because of these HOST Switch variables DB2 is doing TS space instead of Index scan. So we need to rewrite the query.

1. One solution is to remove all HOST switch variables and break the query into 4 diff queries. But that program will be lengthy and lots of code changes needed. So we are trying to avoid this.

Could someone please share suggestions on how we can achieve this with minimal code change (especially if we can achieve this within a single SQL query)?

NOTE- Due to business logic we need to use specific WHERE conditions depending on certain variable values.

Thanks
Back to top
View user's profile Send private message
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2011
Location: USA

PostPosted: Tue Apr 25, 2023 12:59 am
Reply with quote

I personally do not find any difference between those 4 WHERE conditions connected with AND. icon_rolleyes.gif
Back to top
View user's profile Send private message
subratarec

Active User


Joined: 22 Dec 2007
Posts: 126
Location: Bangalore

PostPosted: Tue Apr 25, 2023 1:20 am
Reply with quote

Hi,

Sorry my Bad..

Corrected SQL:
----------------------------------
Code:
SELECT <Column1, Column2......ColumnN>
INTO <:DCLGEN variable1, DCLGEN variable2...variableN>
FROM <Table name>
WHERE ((Column1 = <:HOST Var1 OR 'N' = :HOST switch1)
             AND
             (Column1 = <:HOST Var2 OR 'N' = :HOST switch2)
             AND
             (Column1 = <:HOST Var2 OR 'N' = :HOST switch3)
             AND
             (Column1 = <:HOST Var3 OR 'N' = :HOST switch4))


Now it should make sense.. Each WHERE condition has a certain logic check and that is controlled by the Switch variable..
Back to top
View user's profile Send private message
subratarec

Active User


Joined: 22 Dec 2007
Posts: 126
Location: Bangalore

PostPosted: Tue Apr 25, 2023 1:26 am
Reply with quote

Writing the code directly would have been easy but as I can't do that.. so trying to keep the question the same by needing to simplify the SQL and had to mask it.. so a mistake happened..

Let me know if the question is clear.. basically, I am looking for a way to implement the same logic that without using the working storage variable (Switch variables)..

is it possible to switch between these 4 where conditions depending on some business logic?
Back to top
View user's profile Send private message
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2011
Location: USA

PostPosted: Tue Apr 25, 2023 2:23 am
Reply with quote

As I can see in your WHERE clause, there is only 1 (one) column used from your table, plus 8 different host variables.

I would calculate the whole switch on host variables in your code BEFORE SQL itself, and it should produce a single host variable, like var_final.

Then, your SQL can be simplified to the simple check clause:
Code:
WHERE (column1 <= :var_final)

P.S.
I guess, in your case the tablespace scan has been used NOT because of missing index, but BECAUSE the SQL processor was not able to understand from your sophisticated WHERE: what exactly needs to be scanned!
Back to top
View user's profile Send private message
subratarec

Active User


Joined: 22 Dec 2007
Posts: 126
Location: Bangalore

PostPosted: Tue Apr 25, 2023 8:05 am
Reply with quote

Hi,

Yes you are right TS scan is happening because of those HOST variables columns are having Indexes.
Back to top
View user's profile Send private message
Rohit Umarjikar

Global Moderator


Joined: 21 Sep 2010
Posts: 3048
Location: NYC,USA

PostPosted: Tue Apr 25, 2023 10:01 am
Reply with quote

Another way , Try adding column1= column1 in each condition and do explain again. ( db2 gets tricked when bind is done to force the index ) I hope you have an index on column1

If that doesn’t work then Use Union and declare a cursor.
Back to top
View user's profile Send private message
subratarec

Active User


Joined: 22 Dec 2007
Posts: 126
Location: Bangalore

PostPosted: Tue Apr 25, 2023 10:45 am
Reply with quote

Yes I am working on as you both suggested. lets see how it comes up..

[I know I should open a new Topic but because they are related so posting it here only.. Let me know or I will post as a new question]

A list of programs I need to work on rewriting SQLs. Another program I see is having below SQL. Though it looks same but bit tricky..

Code:
SELECT EMP_ID, EMP_SSN, EMP_PH, EMP_SALARY, EMP_EMAIL
INTO <:DCLGEN variable1, DCLGEN variable2...variable5>
FROM <Table name>
WHERE ((EMP_ID = <:HOST Var1> OR 'N' = :HOST switch1)
             AND
             (EMP_SSN = <:HOST Var2> OR 'N' = :HOST switch2)
             AND
             (EMP_PH = <:HOST Var3> OR 'N' = :HOST switch3)
             AND
             (EMP_EMAIL = <:HOST Var4> OR 'N' = :HOST switch4))


Here if you see they are using different columns in the WHERE clause, Let's say if I got input for EMP_ID, the program will activate 1st query.. likewise if program gets input for EMP_EMAIL then 4th query.

Again performance issue due to host variable inside the query. But because different columns I am a bit confused about how to approach and how to write a single query where I can achieve the same with minimal program change.. No cursor. It's a Singleton SELECT.
Back to top
View user's profile Send private message
sumannath

New User


Joined: 20 Mar 2017
Posts: 8
Location: India

PostPosted: Tue Apr 25, 2023 1:29 pm
Reply with quote

What about a simple select like below:
Code:

SELECT EMP_ID, EMP_SSN, EMP_PH, EMP_SALARY, EMP_EMAIL
INTO <:DCLGEN variable1, DCLGEN variable2...variable5>
FROM <Table name>
WHERE (EMP_ID = <:HOST Var1> OR  EMP_SSN = <:HOST Var2> OR  EMP_PH = <:HOST Var3> OR  EMP_EMAIL = <:HOST Var4>)


Have value in the host variable which is required, for other use an initialized/never appearing value
Back to top
View user's profile Send private message
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2011
Location: USA

PostPosted: Tue Apr 25, 2023 3:38 pm
Reply with quote

subratarec wrote:
Hi,

Yes you are right TS scan is happening because of those HOST variables columns are having Indexes.
My post was just opposite: host variables are not table columns, that’s why they do not, and cannot at all have any table indexes.
In your example only 1 (one) single table column is used in WHERE clause, named “column1” (surprise!)
That’s why your CASE expression can be calculated OUTSIDE of your SQL, only once, before scanning your table in your SELECT.

Oops!… I see you have already fixed again your previous typos. Now there are four different columns involved, rather than a single one.

PLEASE!!! Verify your postings three times, before making people start thinking about senseless typos!
Back to top
View user's profile Send private message
Rohit Umarjikar

Global Moderator


Joined: 21 Sep 2010
Posts: 3048
Location: NYC,USA

PostPosted: Tue Apr 25, 2023 5:01 pm
Reply with quote

sumannath wrote:
What about a simple select like below:
Code:

SELECT EMP_ID, EMP_SSN, EMP_PH, EMP_SALARY, EMP_EMAIL
INTO <:DCLGEN variable1, DCLGEN variable2...variable5>
FROM <Table name>
WHERE (EMP_ID = <:HOST Var1> OR  EMP_SSN = <:HOST Var2> OR  EMP_PH = <:HOST Var3> OR  EMP_EMAIL = <:HOST Var4>)


Have value in the host variable which is required, for other use an initialized/never appearing value

This won’t work because
1. Index issues as OR conditions kills that
2. What if they have wrong ssn and right empid? It will still bring back the data but in original query it won’t due to AND . ( I am not sure if this is one of the use case yet)
3. What if spaces or zeroes are valid values ? It will return more than expected results and potentially ends with -811.
But what you suggested can be tried based on the data analysis and table definitions.
Back to top
View user's profile Send private message
Rohit Umarjikar

Global Moderator


Joined: 21 Sep 2010
Posts: 3048
Location: NYC,USA

PostPosted: Tue Apr 25, 2023 5:09 pm
Reply with quote

subratarec wrote:
Yes I am working on as you both suggested. lets see how it comes up..

[I know I should open a new Topic but because they are related so posting it here only.. Let me know or I will post as a new question]

A list of programs I need to work on rewriting SQLs. Another program I see is having below SQL. Though it looks same but bit tricky..

Code:
SELECT EMP_ID, EMP_SSN, EMP_PH, EMP_SALARY, EMP_EMAIL
INTO <:DCLGEN variable1, DCLGEN variable2...variable5>
FROM <Table name>
WHERE ((EMP_ID = <:HOST Var1> OR 'N' = :HOST switch1)
             AND
             (EMP_SSN = <:HOST Var2> OR 'N' = :HOST switch2)
             AND
             (EMP_PH = <:HOST Var3> OR 'N' = :HOST switch3)
             AND
             (EMP_EMAIL = <:HOST Var4> OR 'N' = :HOST switch4))


Here if you see they are using different columns in the WHERE clause, Let's say if I got input for EMP_ID, the program will activate 1st query.. likewise if program gets input for EMP_EMAIL then 4th query.

Again performance issue due to host variable inside the query. But because different columns I am a bit confused about how to approach and how to write a single query where I can achieve the same with minimal program change.. No cursor. It's a Singleton SELECT.

Again UNION for each AND shall work if each column has its own index.
Ideally this type of requirements should have been written with dynamic sqls at first place and you may not able to do with minimal changes and time the query just like that .
Have you contacted DBA,s at your site yet for suggestions?
Back to top
View user's profile Send private message
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2011
Location: USA

PostPosted: Tue Apr 25, 2023 6:17 pm
Reply with quote

One single SELECT cannot be optimized at once to use four different indexes, on four different columns. Period.

Logically your “switch” is related to your program logic, and not - by any chance - to SQL logic itself. The application MUST be designed properly:
1) based on your “switch” the search column must be chosen in your application, and
2) depending on the chosen column, one of four separate SELECTs, each optimized on its own index, must be executed.

You cannot fix the application design problem by any secret plotting of a primitive SQL!

P.S.
No DBA would be able to help in this case of failed application design.
Back to top
View user's profile Send private message
subratarec

Active User


Joined: 22 Dec 2007
Posts: 126
Location: Bangalore

PostPosted: Wed Apr 26, 2023 10:02 pm
Reply with quote

Thanks all for your time and suggestions.. I am currently in talks with In-house DBA but meanwhile, I am converting the program from STATIC to DYNAMIC SQL then I guess this is WHERE the condition issue will be resolved..

The upper portion of the SQL I am keeping in one working storage group variable and the 4 where conditions I am keeping in separate working storage variables.. then depending on the business logic I am doing STRING to concatenate the main SQL query with either 1st or 2nd or 3rd or 4th WHERE condition and then PREPARING the SQL to execute.

Will get back to you with my update..
Back to top
View user's profile Send private message
subratarec

Active User


Joined: 22 Dec 2007
Posts: 126
Location: Bangalore

PostPosted: Wed Apr 26, 2023 10:05 pm
Reply with quote

sumannath wrote:
What about a simple select like below:
Code:

SELECT EMP_ID, EMP_SSN, EMP_PH, EMP_SALARY, EMP_EMAIL
INTO <:DCLGEN variable1, DCLGEN variable2...variable5>
FROM <Table name>
WHERE (EMP_ID = <:HOST Var1> OR  EMP_SSN = <:HOST Var2> OR  EMP_PH = <:HOST Var3> OR  EMP_EMAIL = <:HOST Var4>)


Have value in the host variable which is required, for other use an initialized/never appearing value


This one I already tried this and have kept it in my Code repository. If I need to activate 1st query then I am using the proper value for 1st WHERE clause and rest all 3 WHERE clauses I am using LOW-VALUES in those Host Variables. Same for other WHERE clauses. It is working but as Rohit suggested I am yet to use full production load against my program to see how it is behaving..

Thanks! anyway for providing your thought..
Back to top
View user's profile Send private message
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2011
Location: USA

PostPosted: Thu Apr 27, 2023 1:31 am
Reply with quote

subratarec wrote:
Thanks all for your time and suggestions.. I am currently in talks with In-house DBA but meanwhile, I am converting the program from STATIC to DYNAMIC SQL then I guess this is WHERE the condition issue will be resolved..

The upper portion of the SQL I am keeping in one working storage group variable and the 4 where conditions I am keeping in separate working storage variables.. then depending on the business logic I am doing STRING to concatenate the main SQL query with either 1st or 2nd or 3rd or 4th WHERE condition and then PREPARING the SQL to execute.

Will get back to you with my update..

Dynamically created exactly four types of a SQL should give exactly the same result as initially prepared four static SQL, as I tried to explain in my previous response. The difference should be: extra computing resources are required to reach the same final goal. I guess four static optimized SQL should give a better performance compared to dynamically created four types os SQL.

I personally always prefer using simple methods for simple tasks, not vice versa. The best tool for hammering nails is a regular hammer, not any GPS-controlled flying drone with AI.
Back to top
View user's profile Send private message
Rohit Umarjikar

Global Moderator


Joined: 21 Sep 2010
Posts: 3048
Location: NYC,USA

PostPosted: Thu Apr 27, 2023 9:59 am
Reply with quote

After you load the data to test make sure to run report and run stat before executing explain or a query . Second try self join as I suggested earlier to see if explain shows it forces indexes ….

Quick and dirty way is also to create another index with composite key of 4 columns and test it out .

However , I do agree that splitting query into 4 different ones would save time as well as will enforce index during the bind time unlike runtime for dynamic query and less efforts.

You have one too many choices .. we will wait to hear what works out the best and will help others in a same situation in future.

Good Luck!
Back to top
View user's profile Send private message
subratarec

Active User


Joined: 22 Dec 2007
Posts: 126
Location: Bangalore

PostPosted: Tue May 16, 2023 12:44 am
Reply with quote

Hi All,

Apologize for the delay in my response. There are 3 versions of the code I created.

1. Broke the single SQL into 4 different SQLs and change the program logic to accommodate this new setup. But lots of code changes had to make to incorporate this.

2. Kept the SQL intact and used correct values against the WHERE clause program wants to activate depending on the logic and LOW-VALUES against those rest 3 WHERE clauses which the program wants to ignore. Very less code change.

3. Changed from static to dynamic SQL to generate the exact needed WHERE clause depending on the program logic.

Test all these 3 with PROD load data(duration of about 1-week PROD load). All three gave the same output. But finally going with option 1 because of its simplicity and better query performance.

Thanks! all for your valuable input.
Back to top
View user's profile Send private message
sergeyken

Senior Member


Joined: 29 Apr 2008
Posts: 2011
Location: USA

PostPosted: Tue May 16, 2023 12:57 am
Reply with quote

subratarec wrote:
But finally going with option 1 because of its simplicity and better query performance.

Frankly speaking it was almost obvious if only started thinking about: what is really happening inside of this electronic monster? icon_biggrin.gif
Back to top
View user's profile Send private message
View previous topic :: :: View next topic  
Post new topic   Reply to topic View Bookmarks
All times are GMT + 6 Hours
Forum Index -> DB2

 


Similar Topics
Topic Forum Replies
No new posts How to load to DB2 with column level ... DB2 6
No new posts To search DB2 table based on Conditio... DB2 1
This topic is locked: you cannot edit posts or make replies. SUM based on two conditions using SYN... SYNCSORT 7
No new posts SORT - Multiple Conditions DFSORT/ICETOOL 4
No new posts Process Last Key of a record based on... DFSORT/ICETOOL 5
Search our Forums:

Back to Top