dejunzhu
Active User
Joined: 08 May 2008 Posts: 390 Location: China
Now I'm designing a massive data backup strategy, in order to achieve the best efficiency when querying this historical data.
Would you please give me some suggestions on this, based on your rich experience?
Any possible solution is welcome.
Thank you!
bhairon singh rathore
New User
Joined: 19 Jun 2008 Posts: 91 Location: banglore
We can use a partitioned table for this and store the data in different partitions by date, e.g. today's data in one partition, the next day's data in a second partition, and so on. When the maximum partition is reached, start again from the first.
The partition information can be stored in another table that contains only the partition number and the date for each day.
When querying, you can join both tables on partition and date, and use the date in the WHERE clause.
Note: the date is only an example; you can use any other unique value that suits the functionality.
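For illustration, a minimal sketch of this lookup-table idea (all table and column names below are hypothetical, not from any actual system):
Code:
    -- Hypothetical partition map: one row per day, recording which
    -- partition holds that day's data.
    CREATE TABLE PART_MAP
          (PART_NO  SMALLINT NOT NULL,
           BUS_DATE DATE     NOT NULL);

    -- Join both tables on partition and date, with the date in the
    -- WHERE clause.
    SELECT J.*
      FROM JOURNAL J, PART_MAP M
     WHERE J.PART_NO  = M.PART_NO
       AND J.BUS_DATE = M.BUS_DATE
       AND M.BUS_DATE = '2011-08-01';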
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10888 Location: italy
Quote:
Now I'm designing a massive data backup strategy, in order to achieve the best efficiency when querying this historical data.
please use the right terminology ...
backup has nothing to do with historical data
they are two completely different beasts
historical data is just that... history (and it might not even be online)
there should be no worry about access performance, just the space utilization
if You are worried about performance, it means that the data is less than historical
how/when the data becomes history(cal) depends on the application and the relative legal implications for data accessibility
so to get better help it would be wise to reword the question
query ??? do You mean making old data available for processing,
or running a <query/report> on it?
again, two different things
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Hello,
Quote:
Now I'm designing a massive data backup strategy, in order to achieve the best efficiency when querying this historical data.
This is completely contradictory. . .
Which do you think you want to do: create a massive backup, or find some way to run "efficient" queries against historical data?
Maybe there is something else you really want to do. . .
dejunzhu
Active User
Joined: 08 May 2008 Posts: 390 Location: China
hi, bhairon,
Thanks for your suggestion.
But I'd rather use a Universal table space than a classic Partitioned table space, as for DB2 V10 I believe a Universal table space is far superior to a Partitioned table space in almost every respect.
Correct me if I'm wrong.
dejunzhu
Active User
Joined: 08 May 2008 Posts: 390 Location: China
hi, enrico,
thanks for your reply.
When I mentioned "query", I meant running SQL query statements against this "OLD" data, for example data generated one year ago that is seldom queried.
I've been considering using UTS for this, but I want to learn more about possible strategies.
dejunzhu
Active User
Joined: 08 May 2008 Posts: 390 Location: China
dick scherrer wrote:
Which do you think you want to do: create a massive backup, or find some way to run "efficient" queries against historical data?
Maybe there is something else you really want to do. . .
hi, Dick,
I just want the queries on this historical data to be more efficient.
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Hello,
Quote:
I just want the queries on this historical data to be more efficient.
Then you need to design historical database structures to support high-speed queries - possibly including ad-hoc ones. Often these are not completely normalized, in order to improve query performance.
Many places have implemented their version of a "data warehouse" to do this. Others have merely determined which data is to be available and designed new tables to support the queries rather than to optimize transaction processing.
Keep in mind that if you create these new tables, they will require considerable DASD space.
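As a minimal illustration of such a query-oriented table (all names and columns here are hypothetical), the detail rows might be pre-aggregated so that reporting queries avoid joins:
Code:
    -- Hypothetical denormalized summary table for reporting.
    CREATE TABLE JOURNAL_SUMM
          (TRAN_DATE DATE          NOT NULL,
           ACCT_ID   BIGINT        NOT NULL,
           TRAN_CNT  INTEGER       NOT NULL,
           TRAN_AMT  DECIMAL(15,2) NOT NULL);

    -- Populate it from the (hypothetical) journal detail table.
    INSERT INTO JOURNAL_SUMM
    SELECT TRAN_DATE, ACCT_ID, COUNT(*), SUM(TRAN_AMT)
      FROM JOURNAL
     GROUP BY TRAN_DATE, ACCT_ID;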
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10888 Location: italy
as I already said, one of the criteria for defining data as historical in the IT sense
is the frequency of use
which in turn defines the storage hierarchy to be used and the approach for making the data available
if You are concerned with query performance, it means that the data is accessed frequently and is not historical any longer
most organizations' definition of historical is pretty dynamic, according to the access pattern
online data has an overall management pattern different from backup/offline/historical data
it looks from the wording that You are just concerned with accessing tables where one of the keys is a date
so there is really nothing to make a big fuss about
reword Your requirement !
dejunzhu
Active User
Joined: 08 May 2008 Posts: 390 Location: China
Consider the following scenario:
Let's say we have a journal table which contains all the transaction processing details.
As the data in the journal table grows continuously, I'm considering using a partition-by-growth (PBG) universal table space for it.
Because I have no previous experience with partition-by-growth universal table spaces, I'm not sure whether a query against the table for a transaction detail record inserted one year ago will be almost as fast as a query for a record from today.
In addition, as only non-partitioned indexes (NPIs) can be created on a PBG table space, when the table's data becomes large the index data also grows large. Will this impact query efficiency?
Using UTS, the data limit can reach 128 TB. Considering efficiency, I think migration of old data should be implemented well before the data reaches its capacity limit, that is, moving old data from the journal master table to another table. My problem is: after the migration, how can the original queries still be performed successfully?
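For reference, a minimal sketch of a PBG definition (the names are illustrative assumptions, not a recommendation); MAXPARTITIONS is what makes the table space partition-by-growth:
Code:
    -- Hypothetical PBG UTS: DB2 adds partitions automatically as the
    -- data grows, up to the MAXPARTITIONS limit.
    CREATE TABLESPACE JNLTS IN JNLDB
           MAXPARTITIONS 4096
           SEGSIZE 32;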
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Hello,
Quote:
I'm not sure whether a query against the table for a transaction detail record inserted one year ago will be almost as fast as a query for a record from today.
Assuming proper keys have been designed, there will be very little or no difference between the time it takes the system to retrieve a row from last year and the time it takes to retrieve a row from today. . .
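For instance (a hypothetical sketch, not taken from the poster's system), a clustering index leading with the date key would serve a one-year-old row and a current row equally well:
Code:
    -- Hypothetical index: the date-leading key lets DB2 locate a row
    -- from last year as directly as a row from today.
    CREATE INDEX JNLIX1
           ON JOURNAL (TRAN_DATE, TRAN_ID)
           CLUSTER;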
Once again it appears that you have chosen a direction whether it is needed or not. Why was UTS picked before trying other approaches?
dejunzhu
Active User
Joined: 08 May 2008 Posts: 390 Location: China
dick scherrer wrote:
Why was UTS picked before trying other approaches?
hi, Dick,
Just because I have no idea other than the UTS solution...
And this is exactly why I'm seeking help from you experts.
GuyC
Senior Member
Joined: 11 Aug 2009 Posts: 1281 Location: Belgium
there is some confusing use of terminology here, and I'll add some more:
Both PBG and PBR are UTS.
For this scenario, a PBR with adding/rotating partitions seems a good starting point.
The main reason why IBM raised the maximum number of partitions to 4096 was that it can contain 11 years of data at a partition a day (keeps the doctor away).
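A minimal sketch of such a PBR definition, using monthly ranges (all names and ranges here are illustrative assumptions):
Code:
    -- Hypothetical range-partitioned (PBR) UTS: NUMPARTS plus SEGSIZE
    -- yields a range-partitioned universal table space.
    CREATE TABLESPACE JNLTS IN JNLDB
           NUMPARTS 3
           SEGSIZE 32;

    CREATE TABLE JOURNAL
          (TRAN_ID   BIGINT NOT NULL,
           TRAN_DATE DATE   NOT NULL,
           TRAN_DATA VARCHAR(200))
          IN JNLDB.JNLTS
          PARTITION BY RANGE (TRAN_DATE)
          (PARTITION 1 ENDING AT ('2011-01-31'),
           PARTITION 2 ENDING AT ('2011-02-28'),
           PARTITION 3 ENDING AT ('2011-03-31'));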
dejunzhu
Active User
Joined: 08 May 2008 Posts: 390 Location: China
hi, GuyC,
If PBR is used, a partitioning key is required; in this case, the date should be chosen as the partitioning key. Here comes a problem: the partitioning ranges have to be specified when the table is defined, and since one partition is used by only one day, wouldn't the partitioning key ranges have to be explicitly specified for all 4096 partitions? A horrible task...
I've been considering using a PBR UTS, but I'm not sure whether rotating is applicable to a PBG UTS.
sushanth bobby
Senior Member
Joined: 29 Jul 2008 Posts: 1020 Location: India
Dejunzhu,
In DB2 V10 there is a feature called Temporal Tables, which might help you keep the current and the historic data separately.
Read up on it, try it out and test it!
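A minimal sketch of a DB2 V10 system-period temporal table (all names are illustrative); old row versions are moved to the history table automatically on UPDATE or DELETE:
Code:
    -- Hypothetical system-period temporal (versioned) table.
    CREATE TABLE JOURNAL
          (TRAN_ID   BIGINT        NOT NULL,
           TRAN_DATA VARCHAR(200)  NOT NULL,
           SYS_START TIMESTAMP(12) NOT NULL
                     GENERATED ALWAYS AS ROW BEGIN,
           SYS_END   TIMESTAMP(12) NOT NULL
                     GENERATED ALWAYS AS ROW END,
           TRANS_ID  TIMESTAMP(12)
                     GENERATED ALWAYS AS TRANSACTION START ID,
           PERIOD SYSTEM_TIME (SYS_START, SYS_END));

    CREATE TABLE JOURNAL_HIST LIKE JOURNAL;

    ALTER TABLE JOURNAL
          ADD VERSIONING USE HISTORY TABLE JOURNAL_HIST;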
Thanks,
Sushanth
dejunzhu
Active User
Joined: 08 May 2008 Posts: 390 Location: China
sushanth bobby wrote:
Dejunzhu,
In DB2 V10 there is a feature called Temporal Tables, which might help you keep the current and the historic data separately.
Read up on it, try it out and test it!
Thanks,
Sushanth
hi, Sushanth,
Thanks for your information.
But a Temporal Table is used to manage different VERSIONs of data,
while the 'HISTORICAL' data I mean here is OLD data, like journal data as of one year ago.
So I'm afraid Temporal Tables do not apply to my requirement.
GuyC
Senior Member
Joined: 11 Aug 2009 Posts: 1281 Location: Belgium
dejunzhu wrote:
hi, GuyC,
If PBR is used, a partitioning key is required; in this case, the date should be chosen as the partitioning key. Here comes a problem: the partitioning ranges have to be specified when the table is defined, and since one partition is used by only one day, wouldn't the partitioning key ranges have to be explicitly specified for all 4096 partitions? A horrible task...
I've been considering using a PBR UTS, but I'm not sure whether rotating is applicable to a PBG UTS.
* The range doesn't have to be 1 day; it can be anything, like 1 month, 3 months, 7 days, ...
* Partitions can be added as time goes by; there is no need to define them all immediately.
* Of course rotating doesn't apply to PBG. Any reasonable understanding of the PBG concept would have clarified that.
dejunzhu
Active User
Joined: 08 May 2008 Posts: 390 Location: China
GuyC wrote:
* The range doesn't have to be 1 day; it can be anything, like 1 month, 3 months, 7 days, ...
* Partitions can be added as time goes by; there is no need to define them all immediately.
* Of course rotating doesn't apply to PBG. Any reasonable understanding of the PBG concept would have clarified that.
So, here comes the problem:
If I do not define all partitions immediately, that is, if I choose a PBG UTS, how can I take advantage of the rotating function, given that rotating does not apply to PBG?
This seems contradictory...
GuyC
Senior Member
Joined: 11 Aug 2009 Posts: 1281 Location: Belgium
go back to the books.
not defining partitions immediately <> choosing PBG
Every month, generate and execute this script:
Code:
    ALTER TABLE xxx ADD PARTITION ENDING AT (end-of-month);
Rotating is only interesting when you can start to delete data, i.e. data that is no longer required.
Suppose that after 36 months (you now have 37 partitions) you start executing this script instead of the previous one:
Code:
    ALTER TABLE xxx ROTATE PARTITION FIRST TO LAST ENDING AT (end-of-month) RESET;
sushanth bobby
Senior Member
Joined: 29 Jul 2008 Posts: 1020 Location: India
dejunzhu,
When you go for PBG, you accept the fact that the data grows huge and that you cannot predict how much it will grow in the next few months. So, in order to reduce the maintenance work, you go for PBG, so that whenever a partition reaches a specified size a new partition is added.
As GuyC mentioned, adding partitions and rotating partitions can be automated. For that you have to use PBR.
Quote:
how can I take advantage of the rotating function, given that rotating does not apply to PBG?
You cannot. You have to do it by other means: during a REORG you can discard the older data.
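For example (a hypothetical sketch; the database, table space, table and column names are all illustrative):
Code:
    -- Hypothetical REORG that discards rows older than a cutoff date.
    REORG TABLESPACE JNLDB.JNLTS
          DISCARD FROM TABLE JOURNAL
          WHEN (TRAN_DATE < '2010-01-01')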
And you still did not tell us:
How many months of data do you want to keep?
How many transactions per day do you have, i.e. how many rows get populated per day (since you were planning on a partition per day)?
Thanks,
Sushanth
dejunzhu
Active User
Joined: 08 May 2008 Posts: 390 Location: China
sushanth bobby wrote:
And you still did not tell us:
How many months of data do you want to keep?
How many transactions per day do you have, i.e. how many rows get populated per day (since you were planning on a partition per day)?
Thanks,
Sushanth
These are all still under discussion and not yet determined.
Anyway, I have decided to use a PBR UTS with rotating partitions.
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Quote:
These are all still under discussion and not yet determined.
How is there any useful discussion when no one involved has any understanding or experience. . .
d
dejunzhu
Active User
Joined: 08 May 2008 Posts: 390 Location: China
Hi, Dick,
As I mentioned before, our shop lacks money and cannot hire very experienced experts. I'm doing part of the DBA's work...
And this is exactly why I'm seeking help from you experts.
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
Quote:
As I mentioned before, our shop lacks money and cannot hire very experienced experts.
Nearly every shop would have us believe they lack money. What many have is the inability to manage what money they do have. . .
While your project might benefit from hiring a first-class team that has already successfully done several projects like this, that is not a necessary expense (usually). As i have mentioned several times, your organization should find a designer and an implementer who are local and have successfully done this, and hire them periodically to guide and review what is going on.
There are many here at the forum who would like to help, but with the lack of information (it is too much to post on a forum) and the lack of continuity (we have no idea what has happened between topics), we are just shooting in the dark. . .
d
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10888 Location: italy
Quote:
our shop lacks money and cannot hire very experienced experts
if your shop lacks money, it should not be in the IT business
Quote:
And this is exactly why I'm seeking help from you experts.
so you/your organization expect that we share with you our hard-earned experience
in other words, you/your organization are just beggars asking Us to donate money!
so just cut the BPM and stop begging for free consultancy services
(*) if Your shop lacks money, how was it able to acquire a mainframe ?