IBM Mainframe Forum Index
 
Log In
 
IBM Mainframe Forum Index Mainframe: Search IBM Mainframe Forum: FAQ Register
 

Reconcile two Identical tables


IBM Mainframe Forums -> DB2
Post new topic   Reply to topic
View previous topic :: View next topic  
Author Message
tomehta

New User


Joined: 18 Aug 2008
Posts: 98
Location: India

PostPosted: Mon May 18, 2009 11:33 am
Reply with quote

Hi
How can we compare two Identical tables in DB2, say in two test regions I want to compare the same tables.

Db1.Employee and Db2.Employee.

I dont want to extract the tables on files and then compare to generate the reconcile report.

Do we have some way we can do it in DB2.

thanks
Rohit
Back to top
View user's profile Send private message
Terry Heinze

JCL Moderator


Joined: 14 Jul 2008
Posts: 1249
Location: Richfield, MN, USA

PostPosted: Mon May 18, 2009 9:19 pm
Reply with quote

Why don't you want to "extract the tables on files and then compare" the unloaded data sets? That would seem to be the quickest method. Utililies can be used for both steps.
Back to top
View user's profile Send private message
agkshirsagar

Active Member


Joined: 27 Feb 2007
Posts: 691
Location: Earth

PostPosted: Tue May 19, 2009 5:01 pm
Reply with quote

Rohit,
Terry has raised a very valid point. Do think about that approach.

You may want to try something like this-

SELECT * FROM DB1.EMPLOYEE WHERE EMPID NOT IN(
SELECT EMPID FROM DB2.EMPLOYEE ) ;
UNION
SELECT * FROM DB2.EMPLOYEE WHERE EMPID NOT IN(
SELECT EMPID FROM DB1.EMPLOYEE ) ;
Back to top
View user's profile Send private message
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Tue May 19, 2009 9:13 pm
Reply with quote

Hello,

That query could identify key "mis-match"es, but would not compare the contents of the tables. . .
Back to top
View user's profile Send private message
agkshirsagar

Active Member


Joined: 27 Feb 2007
Posts: 691
Location: Earth

PostPosted: Tue May 19, 2009 11:49 pm
Reply with quote

You are right Dick.
But it is my guess that OP wanted to know only the key mismatches.
Lets wait and watch..
Back to top
View user's profile Send private message
tomehta

New User


Joined: 18 Aug 2008
Posts: 98
Location: India

PostPosted: Thu Jun 11, 2009 7:57 pm
Reply with quote

I am sorry guys for replying late,

Let me elaborate further, I have to two db , having same table (approx 150), have to compare the contents.
Abt compare here, compare the contents of all columns of table except the primary keys, bit cryptic but thats the way it is. Actually primary key is generated from time stamp, so no point in comparing that.

let me know your thoughts.
Back to top
View user's profile Send private message
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

PostPosted: Thu Jun 11, 2009 8:17 pm
Reply with quote

Quote:
let me know your thoughts.


If you don't have common keys, how are going to decide which row in table a to compare to a row in table b?

the similarity of the tables ends with the fact that they contain the same types of columns.

are you saying that there is no other common column to use as a 'key'.

you might as well then dump both tables, replacing the timestamp with A for table A and B for table B rows.
then sort on everything except the 'A' and 'B', and do your compare.
Back to top
View user's profile Send private message
tomehta

New User


Joined: 18 Aug 2008
Posts: 98
Location: India

PostPosted: Fri Jun 12, 2009 7:49 pm
Reply with quote

We have created a new application. A replica for the current data base is created. Now to make sure that the new application is working fine in parallel run , the new database which is a replica of existing database, needs to be compared to the existing DB. This will ensure that new application is working as fine as the existing one.

Both the current and new application will be running in parallel. Problem at hand is how to compare the contents of two databases. The rows has to be sorted as the primary keys wont be same. Depending on some timestamp selection criteria, sorted rows need to be compared. We are talking about 150 tables here having huge data.

Hope I am able to formulate the problem.
Back to top
View user's profile Send private message
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Sat Jun 13, 2009 9:57 am
Reply with quote

Hello,

Is the data in all of the tables "driven" by this "timestamp selection criteria"?

As the 2 systems run in parallel, will all rows in all tables be inserted in the same sequences of processes. Is this timestamp part of the primary key?

If some arbitrary value is the uniqueness key and these are inserted ascending, i believe you could unload 2 of the "same" tables and (excluding the uniqueness key) compare the remainder of each row against the "same" row from the other table.

As i re-read this i may have added confusion. . . icon_confused.gif Hopefully, this will make what i want to say more clear.

Let's pick on tableA on both sides - the original and the replica. Thru the real process and the parallel process will the 100th row added be the same row in both tableAs. The timestamp identifier would be different, but should all of the rows be added serially?

If this is true, suggest you try an experiment unloading tableA from both environments, copying the unloaded file strippinig off the unequal timestamp info, and then comparing the entire remainder of the record using superc. This would not be the final solution, but rather a proof of concept that you could identify any differences (again other than the timestamp) in the 2 tables.

I understand that you'd prefer to avoid the unloads, but if this works, it would only be set up once and all of the processes would be clones, not taking much "real" work.

Please post the outcome of a test of this or clarify why this would not accomplish what you need. Good luck icon_smile.gif
Back to top
View user's profile Send private message
tomehta

New User


Joined: 18 Aug 2008
Posts: 98
Location: India

PostPosted: Fri Jun 26, 2009 11:37 pm
Reply with quote

thanks Dick for the solution, I am also want a similar kind of solution, where coding effort has to be minimal.

Yes this time stamp is more or less primary key in the tables..,

I want to reconfirm with you my understanding. We have to unload the tables excluding the primary key, on asc(desc) on some column and then do the superc. is it correct ?

Serial part, i guess shud be fine, As both the batch job application will be working on same set of input files. I am really not confident, that I have understood 100% what you have suggested.
But i will definitely going to go to DB2 support and ask them why we cant implement this.

thanks dick..
Back to top
View user's profile Send private message
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Sat Jun 27, 2009 12:55 am
Reply with quote

Hello,

You want to unload both in timestamp sequence - you just don't want to include the timestamp in the compare for differences. . .

If i understand correctly, the timestamps will be different, but all of the other columns should be the same.
Back to top
View user's profile Send private message
tomehta

New User


Joined: 18 Aug 2008
Posts: 98
Location: India

PostPosted: Sat Jul 11, 2009 1:59 am
Reply with quote

Hi Dick
two jobs will be processing on the same set of files, doing same kind of processing.
from the job timings, i can give a between time stamp criteria for unload of the rows ( i hope so ) from two tables.

But now the question is, will db2 give us rows back serially if I dont give any ASC/DESC criteria in unload . Its possible that the value of the column on which we are doing ASC/DESC is populated wrongly by new set of program.
Whole aim to reconcile, and prove that new application is working same as the old one.
Back to top
View user's profile Send private message
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Sat Jul 11, 2009 2:13 am
Reply with quote

Hello,

Quote:
Its possible that the value of the column on which we are doing ASC/DESC is populated wrongly by new set of program.
If the system timestamp is used, how might the application code corrupt it?

If i understood the original requirement, the timestamp would be different in the 2 tables, but it would always be ascending.

The compare would not include the timestamp data so; s long as the rows are created in the same order and the timestamp is ascending the compare should work.

If it is discovered that "something" is creating rows out of order, correct whatever causes this and unload again to compare.

It may take multiple iterations to get to where the code always creates the rows in the correct order - then a closer look at the content is possible.
Back to top
View user's profile Send private message
sushanth bobby

Senior Member


Joined: 29 Jul 2008
Posts: 1020
Location: India

PostPosted: Mon Jul 13, 2009 11:28 am
Reply with quote

Hello,

Quote:
We have created a new application. A replica for the current data base is created. Now to make sure that the new application is working fine in parallel run , the new database which is a replica of existing database, needs to be compared to the existing DB. This will ensure that new application is working as fine as the existing one.

Both the current and new application will be running in parallel. Problem at hand is how to compare the contents of two databases. The rows has to be sorted as the primary keys wont be same. Depending on some timestamp selection criteria, sorted rows need to be compared. We are talking about 150 tables here having huge data.


I have been working on a similar scenario with our cobol conversion team.
They have to run their production run and after conversion run in the development region and compare the outputs. I will load the development tables from the data which i took from the production. Ask them to run the production run and also ask them to send the job timings from spool(start & end).

Using the timings, i will generate a report(Log Analyser - Platinum Log analyzer) report. LA report will contain all the DML actions that happend to that database in that timing.

Once again, i will load the same data, which i loaded before for production run.

And ask them to run the after conversion run and they will send me the job timings. I will give them the LA report.

They will do the compare.

I have given a list of things that can neglected in the LA report like Timestamp, RID, LRSN, RBA & columns(which is identity always).

Now, this solution becomes completely product specific, so this is useless, unless you have platinum.
I don't know about any other tools with this kind of feature, this is handy to me when working along with this team.

Sushanth
Back to top
View user's profile Send private message
View previous topic :: :: View next topic  
Post new topic   Reply to topic View Bookmarks
All times are GMT + 6 Hours
Forum Index -> DB2

 


Similar Topics
Topic Forum Replies
No new posts Need to fetch data from so many DB2 t... DB2 9
No new posts How to: PK does not exist in several ... DB2 6
No new posts Discrepancy b/w SYSIBM tables and BMC... DB2 0
No new posts SYSIBM Tables Query DB2 8
No new posts Column names in SYSIBM tables DB2 5
Search our Forums:

Back to Top