IBM Mainframe Forum Index
 

Query efficiency against historical data?


IBM Mainframe Forums -> DB2
dejunzhu

Active User


Joined: 08 May 2008
Posts: 390
Location: China

PostPosted: Mon Jun 13, 2011 12:46 pm

Now I'm designing a massive data backup strategy, in order to achieve the best efficiency when querying against this historical data.

Would you please give me some suggestions on this, based on your experience?
Any possible solution is welcome.

Thank you!
bhairon singh rathore

New User


Joined: 19 Jun 2008
Posts: 91
Location: banglore

PostPosted: Mon Jun 13, 2011 3:03 pm

We can use a partitioned table for this and store the data in different partitions by date: for example, store today's data in one partition, the next day's data in a second partition, and so on. When the maximum partition is reached, start again from the first.

The partition information can be stored in another table that contains just the partition number and the date for each day.

When querying, you can join both tables on the partition number and the date, and use the date in the WHERE clause.

Note: the date is only an example; you can use any other unique value that suits your functionality.
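Sketching this suggestion in DDL (all table, column, and partition names below are invented for illustration, and the exact partitioning clauses depend on your DB2 release):

```sql
-- Hypothetical journal table partitioned on a small partition number
-- that is reused cyclically (three partitions shown; in practice one
-- per day of the retention window)
CREATE TABLE JOURNAL
      (PART_NO     SMALLINT     NOT NULL,
       TRAN_DATE   DATE         NOT NULL,
       TRAN_DETAIL VARCHAR(200) NOT NULL)
  PARTITION BY (PART_NO)
      (PARTITION 1 ENDING AT (1),
       PARTITION 2 ENDING AT (2),
       PARTITION 3 ENDING AT (3));

-- Companion table recording which date currently lives in which partition
CREATE TABLE JOURNAL_PART
      (PART_NO   SMALLINT NOT NULL,
       TRAN_DATE DATE     NOT NULL);

-- Query: join on the partition number so DB2 can limit the scan to one
-- partition, and keep the date in the WHERE clause
SELECT J.TRAN_DETAIL
  FROM JOURNAL J
       INNER JOIN JOURNAL_PART P
               ON J.PART_NO = P.PART_NO
 WHERE P.TRAN_DATE = '2011-06-12'
   AND J.TRAN_DATE = P.TRAN_DATE;
```

The join on PART_NO is what would let the optimizer prune the scan to a single partition; whether it actually does so on a given system should be verified with EXPLAIN.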
enrico-sorichetti

Superior Member


Joined: 14 Mar 2007
Posts: 10872
Location: italy

PostPosted: Mon Jun 13, 2011 3:32 pm

Quote:
Now I'm designing a massive data backup strategy, in order to achieve the best efficiency when querying against this historical data.


Please use the right terminology...
Backup has nothing to do with historical data;
they are two completely different beasts.

Historical data is just that: history (and it might not even be online).
There should be no worry about access performance, just about space utilization.

If you are worried about performance, it means the data is less than historical.

How and when data becomes history(cal) depends on the application and the relevant legal requirements for data accessibility.

So, to get better help, it would be wise to reword the question.

Query??? Do you mean making old data available for processing,
or running a <query/report> on it?

Again, two different things.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Mon Jun 13, 2011 8:26 pm

Hello,

Quote:
Now I'm designing a massive data backup strategy, in order to achieve the best efficiency when querying against this historical data.
This is completely contradictory. . .

Which do you think you want to do: create a massive backup, or find some way to run "efficient" queries against historical data?

Maybe there is something else you really want to do. . .
dejunzhu

Active User


Joined: 08 May 2008
Posts: 390
Location: China

PostPosted: Mon Jun 13, 2011 8:48 pm

hi, bhairon,

Thanks for your suggestion.
But I'd rather use a universal table space (UTS) than a classic partitioned table space, as in DB2 V10 I believe the universal table space is far superior to the partitioned table space in every aspect.

Correct me if I'm wrong.
dejunzhu

Active User


Joined: 08 May 2008
Posts: 390
Location: China

PostPosted: Mon Jun 13, 2011 8:54 pm

hi, enrico,

thanks for your reply.
When I mentioned "query", I meant performing SQL query statements against this "old" data: for example, data generated one year ago that is seldom queried.

I've been considering using UTS for this, but I want to know more about possible strategies.
dejunzhu

Active User


Joined: 08 May 2008
Posts: 390
Location: China

PostPosted: Mon Jun 13, 2011 8:59 pm

dick scherrer wrote:
Which do you think you want to do: create a massive backup, or find some way to run "efficient" queries against historical data?

Maybe there is something else you really want to do. . .

hi, Dick,
I just want the queries on this historical data to be more efficient.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Mon Jun 13, 2011 9:18 pm

Hello,

Quote:
I just want the queries on this historical data to be more efficient.
Then you need to design historical database structures that support high-speed queries, possibly including ad hoc ones. Often these are not completely normalized, in order to improve query performance.

Many places have implemented their version of a "data warehouse" to do this. Others have merely determined which data is to be available and designed new tables to support the queries, rather than optimizing for transaction processing.

Keep in mind that if you create these new tables, they will require considerable DASD space.
enrico-sorichetti

Superior Member


Joined: 14 Mar 2007
Posts: 10872
Location: italy

PostPosted: Mon Jun 13, 2011 9:31 pm

As I already said, one of the criteria for defining data as historical, in the IT sense,
is the frequency of use,
which in turn determines the storage hierarchy to be used and the approach for making the data available.

If you are concerned with query performance, it means the data is accessed frequently and is not historical any longer.

In most organizations the definition of historical is pretty dynamic, driven by the access pattern.

Online data has an overall management pattern different from backup/offline/historical data.

From the wording, it looks like you are just concerned with accessing tables where one of the keys is a date,

so there is really nothing to make a big fuss about.

Reword your requirement!
dejunzhu

Active User


Joined: 08 May 2008
Posts: 390
Location: China

PostPosted: Tue Jun 14, 2011 7:36 am

Consider the following scenario:
Let's say we have a journal table which contains all the transaction-processing details.
As the data in the journal table grows continuously, I'm considering using a partition-by-growth (PBG) universal table space for it.
Because I have no prior experience with partition-by-growth universal tables, I'm not sure whether a query for a transaction detail record that was inserted one year ago will run at almost the same speed as a query for a record from today.

In addition, as only non-partitioned indexes (NPIs) can be created on such a table, the index data also grows large as the table grows. Will this impact query efficiency?

With UTS, the data limit can reach 128 TB. Considering efficiency, I think migration of old data should be implemented well before the capacity limit is reached; that is, old data should be moved from the journal master table to another table. My problem is: after the migration, how can the original queries still be performed successfully?
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Tue Jun 14, 2011 8:23 am

Hello,

Quote:
I'm not sure whether a query for a transaction detail record that was inserted one year ago will run at almost the same speed as a query for a record from today.
Assuming proper keys have been designed, there will be little or no difference between the time it takes the system to retrieve a row from last year and a row from today. . .

Once again it appears that you have chosen a direction whether it is needed or not. Why was UTS picked before trying other approaches?
dejunzhu

Active User


Joined: 08 May 2008
Posts: 390
Location: China

PostPosted: Tue Jun 14, 2011 9:21 am

dick scherrer wrote:
Why was UTS picked before trying other approaches?

hi, Dick,

Just because I have no idea other than the UTS solution...
And that is exactly why I'm seeking help from you experts.
GuyC

Senior Member


Joined: 11 Aug 2009
Posts: 1281
Location: Belgium

PostPosted: Tue Jun 14, 2011 12:48 pm

The terminology here is confusing, and I'll add some more: both PBG and PBR are UTS.

For this scenario, a PBR with adding/rotating partitions seems a good starting point.
The main reason IBM raised the maximum number of partitions to 4096 was so that it can hold 11 years of data with a partition per day (keeps the doctor away).
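For orientation, a minimal PBR sketch (table space, table, and column names are invented; exact clauses vary by DB2 release). A range-partitioned UTS results when a table partitioned by range is placed in a table space defined with both NUMPARTS and SEGSIZE:

```sql
-- Hypothetical PBR (range-partitioned) universal table space:
-- NUMPARTS together with SEGSIZE yields a PBR UTS in DB2 V10
CREATE TABLESPACE TSJRNL IN DBJRNL
       NUMPARTS 3
       SEGSIZE 32;

-- Table partitioned by range on the transaction date
CREATE TABLE JOURNAL
      (TRAN_DATE   DATE         NOT NULL,
       TRAN_DETAIL VARCHAR(200) NOT NULL)
  IN DBJRNL.TSJRNL
  PARTITION BY RANGE (TRAN_DATE)
      (PARTITION 1 ENDING AT ('2011-06-30'),
       PARTITION 2 ENDING AT ('2011-07-31'),
       PARTITION 3 ENDING AT ('2011-08-31'));
```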
dejunzhu

Active User


Joined: 08 May 2008
Posts: 390
Location: China

PostPosted: Tue Jun 14, 2011 2:46 pm

hi, GuyC,

If PBR is used, a partitioning key is required; in this case, the date should be chosen as the partitioning key. Here comes a problem: the partitioning ranges have to be specified when the table is defined, and since one partition is used by only one day, wouldn't the partitioning key range have to be explicitly specified for all 4096 partitions? A horrible task...

I've been considering using a PBR UTS, but I'm not sure whether rotating is applicable to a PBG UTS.
sushanth bobby

Senior Member


Joined: 29 Jul 2008
Posts: 1020
Location: India

PostPosted: Tue Jun 14, 2011 11:33 pm

Dejunzhu,

In DB2 V10 there is a feature called temporal tables, which might help you keep the current and the historical data separately.

Read up on it, try it out, and test it!
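As a rough sketch of the V10 system-period temporal feature (all names here are illustrative; the DB2 V10 SQL Reference has the exact syntax):

```sql
-- Hypothetical system-period temporal table: DB2 moves old row
-- versions into the history table automatically on UPDATE/DELETE
CREATE TABLE POLICY
      (POLICY_ID INT           NOT NULL,
       COVERAGE  INT           NOT NULL,
       SYS_START TIMESTAMP(12) NOT NULL GENERATED ALWAYS AS ROW BEGIN,
       SYS_END   TIMESTAMP(12) NOT NULL GENERATED ALWAYS AS ROW END,
       TRANS_ID  TIMESTAMP(12) GENERATED ALWAYS AS TRANSACTION START ID,
       PERIOD SYSTEM_TIME (SYS_START, SYS_END));

-- History table with the same structure
CREATE TABLE POLICY_HIST LIKE POLICY;

-- Link the two: from now on DB2 maintains the history automatically
ALTER TABLE POLICY
      ADD VERSIONING USE HISTORY TABLE POLICY_HIST;

-- Queries can then ask for the state of a row at a point in time
SELECT COVERAGE
  FROM POLICY FOR SYSTEM_TIME AS OF '2011-01-01-00.00.00'
 WHERE POLICY_ID = 1;
```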

Thanks,
Sushanth
dejunzhu

Active User


Joined: 08 May 2008
Posts: 390
Location: China

PostPosted: Wed Jun 15, 2011 7:15 am

sushanth bobby wrote:
Dejunzhu,

In DB2 V10 there is a feature called temporal tables, which might help you keep the current and the historical data separately.

Read up on it, try it out, and test it!

Thanks,
Sushanth

hi, Sushanth,

Thanks for the information.
But temporal tables are used to manage different VERSIONs of the same data,
while by 'HISTORICAL' data I mean OLD data, like journal data as of one year ago.

So I'm afraid temporal tables do not fit my requirement.
GuyC

Senior Member


Joined: 11 Aug 2009
Posts: 1281
Location: Belgium

PostPosted: Wed Jun 15, 2011 12:56 pm

dejunzhu wrote:
hi, GuyC,

If PBR is used, a partitioning key is required; in this case, the date should be chosen as the partitioning key. Here comes a problem: the partitioning ranges have to be specified when the table is defined, and since one partition is used by only one day, wouldn't the partitioning key range have to be explicitly specified for all 4096 partitions? A horrible task...

I've been considering using a PBR UTS, but I'm not sure whether rotating is applicable to a PBG UTS.

* The range doesn't have to be one day; it can be anything: one month, three months, seven days, ...
* Partitions can be added as time goes by; there is no need to define them all immediately.
* Of course rotating doesn't apply to PBG. Any reasonable understanding of the PBG concept would have made that clear.
dejunzhu

Active User


Joined: 08 May 2008
Posts: 390
Location: China

PostPosted: Wed Jun 15, 2011 1:04 pm

GuyC wrote:

* The range doesn't have to be one day; it can be anything: one month, three months, seven days, ...
* Partitions can be added as time goes by; there is no need to define them all immediately.
* Of course rotating doesn't apply to PBG. Any reasonable understanding of the PBG concept would have made that clear.


So here comes the problem:
if I do not define all partitions immediately, that is, if I choose a PBG UTS, how can I take advantage of the rotating function, given that rotating does not apply to PBG?

This seems contradictory...
GuyC

Senior Member


Joined: 11 Aug 2009
Posts: 1281
Location: Belgium

PostPosted: Wed Jun 15, 2011 1:26 pm

Go back to the books: not defining all partitions immediately is not the same as choosing PBG.
Every month, generate and execute this script:
ALTER TABLE xxx ADD PARTITION ENDING AT (end-of-month)

Rotating is only interesting when you can start to delete data, i.e. data that is no longer required.
Suppose that after 36 months (you now have 37 partitions) you start executing this script instead of the previous one:
ALTER TABLE xxx ROTATE PARTITION FIRST TO LAST ENDING AT (end-of-month) RESET
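Filled in with concrete (made-up) table names and dates, the two monthly scripts might look like:

```sql
-- Months 2 through 36: add a partition for the coming month
ALTER TABLE JOURNAL
      ADD PARTITION ENDING AT ('2011-07-31');

-- Month 37 onward: reuse the oldest partition instead of adding one;
-- RESET empties it before it takes over the new high key range
ALTER TABLE JOURNAL
      ROTATE PARTITION FIRST TO LAST
      ENDING AT ('2014-06-30') RESET;
```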
sushanth bobby

Senior Member


Joined: 29 Jul 2008
Posts: 1020
Location: India

PostPosted: Wed Jun 15, 2011 2:57 pm

dejunzhu,

When you go for PBG, you accept the fact that the data grows hugely and that you cannot predict how much it will grow in the next few months. So, in order to reduce the maintenance work, you go for PBG, so that whenever a partition reaches a specified size, a new partition is added automatically.

As GuyC mentioned, adding and rotating partitions can be automated; for that you have to use PBR.

Quote:
how can I take advantage of the rotating function, given that rotating does not apply to PBG?
You cannot. You have to do it by other means; for example, during a REORG you can discard the older data.
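As an illustration (the object names and the cutoff date are invented), the REORG utility's DISCARD option can purge old rows while reorganizing:

```sql
-- Utility control statement: reorganize the table space and throw away
-- journal rows older than the cutoff date in the same pass
REORG TABLESPACE DBJRNL.TSJRNL
      DISCARD FROM TABLE JOURNAL
      WHEN (TRAN_DATE < '2010-06-01')
```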

And you still have not told us:
How many months of data do you want to keep?
How many transactions per day do you have, i.e. how many journal rows are inserted per day? (Since you were planning on a partition per day.)

Thanks,
Sushanth
dejunzhu

Active User


Joined: 08 May 2008
Posts: 390
Location: China

PostPosted: Thu Jun 16, 2011 8:38 am

sushanth bobby wrote:

And you still have not told us:
How many months of data do you want to keep?
How many transactions per day do you have, i.e. how many journal rows are inserted per day? (Since you were planning on a partition per day.)

Thanks,
Sushanth


These are all still under discussion and have not been determined yet.
Anyway, I have decided to use a rotate-partitioned (PBR) UTS.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Thu Jun 16, 2011 8:46 am

Quote:
These are all still under discussion and not determined.
How is there any useful discussion when no one involved has any understanding or experience. . .

d
dejunzhu

Active User


Joined: 08 May 2008
Posts: 390
Location: China

PostPosted: Thu Jun 16, 2011 8:50 am

Hi, Dick,
As I mentioned before, our shop lacks money and cannot hire very experienced experts. I'm doing part of a DBA's work...
And that is exactly why I'm seeking help from you experts.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Thu Jun 16, 2011 9:35 am

Quote:
As I mentioned before, our shop lacks money and cannot hire very experienced experts.
Nearly every shop would have us believe they lack money. What many actually have is an inability to manage the money they do have. . .

While your project might benefit from hiring a first-class team that has already done several projects like this successfully, that is not (usually) a necessary expense. As I have mentioned several times, your organization should find a designer and an implementer who are local and have done this successfully, and hire them periodically to guide and review what is going on.

There are many here at the forum who would like to help, but with the lack of information (it is too much to post on a forum) and the lack of continuity (we have no idea what has happened between topics), we are just shooting in the dark. . .

d
enrico-sorichetti

Superior Member


Joined: 14 Mar 2007
Posts: 10872
Location: italy

PostPosted: Thu Jun 16, 2011 10:28 am

Quote:
Our shop lacks money and cannot hire very experienced experts.


If your shop lacks money, it should not be in the IT business.

Quote:
And that is exactly why I'm seeking help from you experts.

So you/your organization expect that we share our hard-earned experience with you;
in other words, you/your organization are just beggars asking us to donate money!

So just cut the BPM and stop begging for free consultancy services.

(*) If your shop lacks money, how was it able to acquire a mainframe?
All times are GMT + 6 Hours