I have a requirement of creating new DB2 Tables(Offcourse this will be created by the DBA). Now in these DB2 tables around 10 million records will come on a daily basis.
We have a requirement originating because of this mentioned below.
1) The number of tables to be archived is less(around 10) but the volume is VERY HIGH.
2) We want to keep this archived DATA for 2 YEARS(OFFLINE) and keep the last 6 months data (ONLINE on DASD's ).
3) Now business wants to retrieve this data ON DEMAND basis for security reasons.
1) Can i implement some Archival/Retrieval strategy MVS environment, if YES, Can you please let me know which TOOL/UTILITY can i use to ARCHIVE the Data?
2) What sort of Architecture can i propose which will help in retrieval of historical data rapidly(In terms of where the Data Archived should reside)?
Any sort of Inputs will be helpful. Please do share your experiences.
Joined: 23 Nov 2006 Posts: 19270 Location: Inside the Matrix
Before you begin design, i'd suggest you get a definition of just what is "rapidly". That may mean quite different things to different people in your organization.
I'm unsure how the "on demand" requirement relates to security. Clarification would help.
How much of the archived data might be needed for an "on demand" process? Something to consider is that the smaller 6 months of data will be about 1,800,000,000 rows. The larger (2 years or 1.5 years) will be 3-4 times that many rows. I suspect that there is no really rapid (again, depending on the definition of rapid) way to reload over a billion rows - and that is the smaller case.
Your organization may have a different theory, but most of the places i support operate on the theory that their databases are just as (if not more) secure than flat files.
If your organization is willing to invest in a large volume of historical data, using one or more read-only databases may accomplish what you want. Keep in mind that the read-only databases could be defined with many alternate indexes to support a variety of "on demand" requests for data.
Thank you for your response. I have tried to provide some explanations to your queries, please do read.
First of all 'rapidly' is not the correct word(thanks for pointing it out)... they are more intersted in architecture part and then looking for more cost effective solution to this problem.
We can push the request for Archived Data by business into Batch process also.
There are some transactions being carried out by the Back End staff, and due to recent SOX compliance directives, my organization want to track those(infact all) trasactions now. I hope this clarifies the word security reasons. Just to prevent frauds etc..
In the worst case, the request for online+offline data could even be for 2.5 years(maximum). But i guess even in this case the resultset will not be billion rows, infact i suppose it is going to be significantly less
(around 1000) as it is based on some conditional inputs from business.
Yes you are correct we don't need to update the archived DB, could you please clarify what is meant by READ ONLY Database?
Could you share with me what sort of Architecture can i propose for this situation? And you mentioned something about the Index to be built on the Archived Data, though i have some understanding about indexes, Could you please throw more light on how multiple indexes could be used in this case?