View previous topic :: View next topic
|
Author |
Message |
tomehta
New User
Joined: 18 Aug 2008 Posts: 98 Location: India
|
|
|
|
Hi
How can we compare two Identical tables in DB2, say in two test regions I want to compare the same tables.
Db1.Employee and Db2.Employee.
I dont want to extract the tables on files and then compare to generate the reconcile report.
Do we have some way we can do it in DB2.
thanks
Rohit |
|
Back to top |
|
|
Terry Heinze
JCL Moderator
Joined: 14 Jul 2008 Posts: 1248 Location: Richfield, MN, USA
|
|
|
|
Why don't you want to "extract the tables on files and then compare" the unloaded data sets? That would seem to be the quickest method. Utililies can be used for both steps. |
|
Back to top |
|
|
agkshirsagar
Active Member
Joined: 27 Feb 2007 Posts: 691 Location: Earth
|
|
|
|
Rohit,
Terry has raised a very valid point. Do think about that approach.
You may want to try something like this-
SELECT * FROM DB1.EMPLOYEE WHERE EMPID NOT IN(
SELECT EMPID FROM DB2.EMPLOYEE ) ;
UNION
SELECT * FROM DB2.EMPLOYEE WHERE EMPID NOT IN(
SELECT EMPID FROM DB1.EMPLOYEE ) ; |
|
Back to top |
|
|
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
|
|
|
|
Hello,
That query could identify key "mis-match"es, but would not compare the contents of the tables. . . |
|
Back to top |
|
|
agkshirsagar
Active Member
Joined: 27 Feb 2007 Posts: 691 Location: Earth
|
|
|
|
You are right Dick.
But it is my guess that OP wanted to know only the key mismatches.
Lets wait and watch.. |
|
Back to top |
|
|
tomehta
New User
Joined: 18 Aug 2008 Posts: 98 Location: India
|
|
|
|
I am sorry guys for replying late,
Let me elaborate further, I have to two db , having same table (approx 150), have to compare the contents.
Abt compare here, compare the contents of all columns of table except the primary keys, bit cryptic but thats the way it is. Actually primary key is generated from time stamp, so no point in comparing that.
let me know your thoughts. |
|
Back to top |
|
|
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
|
|
|
|
Quote: |
let me know your thoughts. |
If you don't have common keys, how are going to decide which row in table a to compare to a row in table b?
the similarity of the tables ends with the fact that they contain the same types of columns.
are you saying that there is no other common column to use as a 'key'.
you might as well then dump both tables, replacing the timestamp with A for table A and B for table B rows.
then sort on everything except the 'A' and 'B', and do your compare. |
|
Back to top |
|
|
tomehta
New User
Joined: 18 Aug 2008 Posts: 98 Location: India
|
|
|
|
We have created a new application. A replica for the current data base is created. Now to make sure that the new application is working fine in parallel run , the new database which is a replica of existing database, needs to be compared to the existing DB. This will ensure that new application is working as fine as the existing one.
Both the current and new application will be running in parallel. Problem at hand is how to compare the contents of two databases. The rows has to be sorted as the primary keys wont be same. Depending on some timestamp selection criteria, sorted rows need to be compared. We are talking about 150 tables here having huge data.
Hope I am able to formulate the problem. |
|
Back to top |
|
|
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
|
|
|
|
Hello,
Is the data in all of the tables "driven" by this "timestamp selection criteria"?
As the 2 systems run in parallel, will all rows in all tables be inserted in the same sequences of processes. Is this timestamp part of the primary key?
If some arbitrary value is the uniqueness key and these are inserted ascending, i believe you could unload 2 of the "same" tables and (excluding the uniqueness key) compare the remainder of each row against the "same" row from the other table.
As i re-read this i may have added confusion. . . Hopefully, this will make what i want to say more clear.
Let's pick on tableA on both sides - the original and the replica. Thru the real process and the parallel process will the 100th row added be the same row in both tableAs. The timestamp identifier would be different, but should all of the rows be added serially?
If this is true, suggest you try an experiment unloading tableA from both environments, copying the unloaded file strippinig off the unequal timestamp info, and then comparing the entire remainder of the record using superc. This would not be the final solution, but rather a proof of concept that you could identify any differences (again other than the timestamp) in the 2 tables.
I understand that you'd prefer to avoid the unloads, but if this works, it would only be set up once and all of the processes would be clones, not taking much "real" work.
Please post the outcome of a test of this or clarify why this would not accomplish what you need. Good luck |
|
Back to top |
|
|
tomehta
New User
Joined: 18 Aug 2008 Posts: 98 Location: India
|
|
|
|
thanks Dick for the solution, I am also want a similar kind of solution, where coding effort has to be minimal.
Yes this time stamp is more or less primary key in the tables..,
I want to reconfirm with you my understanding. We have to unload the tables excluding the primary key, on asc(desc) on some column and then do the superc. is it correct ?
Serial part, i guess shud be fine, As both the batch job application will be working on same set of input files. I am really not confident, that I have understood 100% what you have suggested.
But i will definitely going to go to DB2 support and ask them why we cant implement this.
thanks dick.. |
|
Back to top |
|
|
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
|
|
|
|
Hello,
You want to unload both in timestamp sequence - you just don't want to include the timestamp in the compare for differences. . .
If i understand correctly, the timestamps will be different, but all of the other columns should be the same. |
|
Back to top |
|
|
tomehta
New User
Joined: 18 Aug 2008 Posts: 98 Location: India
|
|
|
|
Hi Dick
two jobs will be processing on the same set of files, doing same kind of processing.
from the job timings, i can give a between time stamp criteria for unload of the rows ( i hope so ) from two tables.
But now the question is, will db2 give us rows back serially if I dont give any ASC/DESC criteria in unload . Its possible that the value of the column on which we are doing ASC/DESC is populated wrongly by new set of program.
Whole aim to reconcile, and prove that new application is working same as the old one. |
|
Back to top |
|
|
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19243 Location: Inside the Matrix
|
|
|
|
Hello,
Quote: |
Its possible that the value of the column on which we are doing ASC/DESC is populated wrongly by new set of program. |
If the system timestamp is used, how might the application code corrupt it?
If i understood the original requirement, the timestamp would be different in the 2 tables, but it would always be ascending.
The compare would not include the timestamp data so; s long as the rows are created in the same order and the timestamp is ascending the compare should work.
If it is discovered that "something" is creating rows out of order, correct whatever causes this and unload again to compare.
It may take multiple iterations to get to where the code always creates the rows in the correct order - then a closer look at the content is possible. |
|
Back to top |
|
|
sushanth bobby
Senior Member
Joined: 29 Jul 2008 Posts: 1020 Location: India
|
|
|
|
Hello,
Quote: |
We have created a new application. A replica for the current data base is created. Now to make sure that the new application is working fine in parallel run , the new database which is a replica of existing database, needs to be compared to the existing DB. This will ensure that new application is working as fine as the existing one.
Both the current and new application will be running in parallel. Problem at hand is how to compare the contents of two databases. The rows has to be sorted as the primary keys wont be same. Depending on some timestamp selection criteria, sorted rows need to be compared. We are talking about 150 tables here having huge data. |
I have been working on a similar scenario with our cobol conversion team.
They have to run their production run and after conversion run in the development region and compare the outputs. I will load the development tables from the data which i took from the production. Ask them to run the production run and also ask them to send the job timings from spool(start & end).
Using the timings, i will generate a report(Log Analyser - Platinum Log analyzer) report. LA report will contain all the DML actions that happend to that database in that timing.
Once again, i will load the same data, which i loaded before for production run.
And ask them to run the after conversion run and they will send me the job timings. I will give them the LA report.
They will do the compare.
I have given a list of things that can neglected in the LA report like Timestamp, RID, LRSN, RBA & columns(which is identity always).
Now, this solution becomes completely product specific, so this is useless, unless you have platinum.
I don't know about any other tools with this kind of feature, this is handy to me when working along with this team.
Sushanth |
|
Back to top |
|
|
|