amargallani
New User
Joined: 04 May 2010 Posts: 5 Location: ballarpur
Hi, can anyone help me with the requirement below?
I have to compare two files. The files should be identical; if they are not, we have to provide the reason. The reasons can be:
1. Number of records are not equal
2. File attributes not matching
3. An extra record in a file
4. Records are equal but in jumbled order (e.g. the header is placed at the bottom of the new file)
This can be done manually when there are only a few files, but we are expecting 400+ files and want to automate the process, or at least reduce the effort by 50-60%. It is difficult to achieve. I have proposed a solution below; can you suggest whether there is anything with which it can be achieved?
This can be done using REXX:
1. Invoke SuperC via Rexx / Batch to compare New_File and Old_File
IF the file matches (Apple to Apple match) Exit the Process
IF there is a mismatch go ahead with step2
2. Index the files (add a sequence number to both files).
3. Sort the files in ascending order; the sort is on the data only, leaving out the index.
4. Now compare the sorted files. If they match, the issue is that the records are not in the same order.
5. If there is still a mismatch, the file can be processed manually.
We cannot fully automate the process, but the work can be reduced by 50%.
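A minimal Python sketch of steps 1 to 5 above, only to illustrate the logic. It is not REXX and it does not call SuperC: a plain list comparison stands in for the SuperC step. The function name `compare_files` and the verdict strings are made up for this illustration, and the records are assumed to be already read into lists.

```python
def compare_files(new_records, old_records):
    """Return (verdict, moved) for two lists of records.

    moved holds (record, new_position, old_position) tuples and is only
    filled in when the files differ by record order alone.
    """
    # Step 1: straight comparison (stand-in for the SuperC compare).
    if new_records == old_records:
        return "identical", []

    # Step 2: attach a 1-based sequence number so the original
    # position of every record survives the sort.
    new_indexed = [(rec, seq) for seq, rec in enumerate(new_records, start=1)]
    old_indexed = [(rec, seq) for seq, rec in enumerate(old_records, start=1)]

    # Step 3: sort on the record data only, not on the sequence number.
    new_sorted = sorted(new_indexed, key=lambda pair: pair[0])
    old_sorted = sorted(old_indexed, key=lambda pair: pair[0])

    # Step 4: compare the sorted data; a match means only the order differs.
    if [rec for rec, _ in new_sorted] == [rec for rec, _ in old_sorted]:
        moved = [
            (rec, new_seq, old_seq)
            for (rec, new_seq), (_, old_seq) in zip(new_sorted, old_sorted)
            if new_seq != old_seq
        ]
        return "same records, different order", moved

    # Step 5: anything else is left for manual processing.
    return "needs manual review", []
```

The sequence numbers are what let the last branch report where each record sits in both files, which is the reason for indexing before sorting.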
Can anybody help me on this....
Thanks in advance
expat
Global Moderator
Joined: 14 Mar 2007 Posts: 8797 Location: Welsh Wales
It looks as though you have the best part of the logic worked out, so what do you need help with?
Escapa
Senior Member
Joined: 16 Feb 2007 Posts: 1399 Location: IL, USA
Quote:
"As we cannot fully automate the process"
We can, as long as you know all the requirements clearly, and it looks like you do. Where are you stuck?
And why this?
Quote:
"2. Indexing the files (Add the sequence number to both the files)
3. Sort the files in ascending order, here the files would be sorted leaving the index.
4. Now compare the sorted files, If file matches the issue is files are not in sorted order.
5. If still there is a mismatch then this file can be processed manually."
Why not just process the output you get from SUPERC? That should give almost everything required if you use the proper options available in SUPERC.
amargallani
New User
Joined: 04 May 2010 Posts: 5 Location: ballarpur
Actually we should not sort the given files themselves, as the data in one of the files can be in jumbled order. From this we get the reason that the new file is sorted/not sorted, and that is the discrepancy.
To make these files easier to compare, we add a sequence number.
This indexing also helps if the 1st record of the new file is placed at the bottom of the old file: we add a SEQNUM and sort the files, with the sort done on the data without considering the sequence number. After the compare we can use the sequence number to tell the user where the discrepancy is.
Pedro
Global Moderator
Joined: 01 Sep 2006 Posts: 2547 Location: Silicon Valley
Your comments about sorting and jumbled order are not clear to me.
I think you should first create test files and have one that is in 'jumbled order'. And then see if SUPERC will find it.
I agree with Escapa about processing the SUPERC output file.
amargallani
New User
Joined: 04 May 2010 Posts: 5 Location: ballarpur
I agree, we can use SuperC to compare the two files.
If the files match, the SuperC result is X rows matching, 0 paired and 0 unpaired.
If a few records do not match, the result is, for example, 1 paired and 2 unpaired,
and it is difficult to analyse this and report to the user why the files do not match.
For example:
NEW_FILE:
AAA011
BBB021
CCC031
FFFF041
OLD_FILE:
XXX011
BBB021
CCC031
FFFF041
AAA011
After the compare we have to report to the user:
The new file does not match the old file.
Reason: The last record in the old file is placed at the top of the new file, and the header (XXX011) is missing from the new file.
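To make the kind of report wanted here concrete, below is a rough Python sketch that turns the two example files into plain-language reasons. It is an illustration only, not SuperC or REXX. The names `mismatch_reasons` and `_keep_common` and the wording of the messages are made up, and the "file attributes" reason is not covered because it needs dataset information that is not in the records themselves.

```python
from collections import Counter


def _keep_common(records, common):
    """Keep, in original order, only the records present in both files."""
    remaining = Counter(common)
    kept = []
    for rec in records:
        if remaining[rec] > 0:
            remaining[rec] -= 1
            kept.append(rec)
    return kept


def mismatch_reasons(new_records, old_records):
    """Return a list of human-readable reasons; empty if the files match."""
    reasons = []
    if len(new_records) != len(old_records):
        reasons.append(
            f"record counts differ: new={len(new_records)}, old={len(old_records)}"
        )

    new_counts, old_counts = Counter(new_records), Counter(old_records)
    # Multiset difference: records one file has and the other lacks.
    for rec in (old_counts - new_counts).elements():
        reasons.append(f"record {rec!r} is only in the old file")
    for rec in (new_counts - old_counts).elements():
        reasons.append(f"record {rec!r} is only in the new file")

    # Ignore the unmatched records, then check whether the rest line up.
    common = new_counts & old_counts
    if _keep_common(new_records, common) != _keep_common(old_records, common):
        reasons.append("common records appear in a different order")
    return reasons


new_file = ["AAA011", "BBB021", "CCC031", "FFFF041"]
old_file = ["XXX011", "BBB021", "CCC031", "FFFF041", "AAA011"]
for reason in mismatch_reasons(new_file, old_file):
    print(reason)
```

For the example data this prints three reasons: the record counts differ (4 against 5), `'XXX011'` is only in the old file, and the common records appear in a different order.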
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello,
Possibly (probably), the summary at the end of the SUPERC run is enough information to report to the "user".
If more detailed analysis is needed for one file or another, then do some more detailed work to provide additional info.
This assumes that most of the files will completely match most of the time, or that they are supposed to match most/all of the time. If the files are supposed to match and they do not, I suspect that the process causing the problem needs to be corrected, and this should not require much depth.
Pedro
Global Moderator
Joined: 01 Sep 2006 Posts: 2547 Location: Silicon Valley
Quote:
"it is difficult to analyse and report to the user why the file is not matching..."
I think the report is pretty self-explanatory. The report contains:
Code:
ID SOURCE LINES TYPE LEN N-LN# O-LN#
----+----1----+----2----+----3--7----+----8
I - AAA011 00010000 RPL= 1 00001 00001
D - XXX011 00010000
MAT= 3
D - AAA011 00050000 DEL= 1 00005 00005
It says that one record replaced the original first line and another record (line 5) was deleted.
How would you want to say the same thing? How would you say it if there were more complex changes? What if there were 20 different changes - how would you say it?
amargallani
New User
Joined: 04 May 2010 Posts: 5 Location: ballarpur
Hi Pedro,
The example here is small. If we have a file with a record length of 500, the discrepancy is at the 250th column, and the same issue occurs in 50 rows, it is difficult to analyse the SuperC results.
It would be difficult to inform the user what data is missing.
Is SuperC output restricted to a record length of 133? (I wanted to know.)
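To show what "telling the user where the discrepancy is" could look like for long records, here is a small Python sketch that reports the column ranges in which two already-paired records differ. It is an assumption-laden illustration, not a SuperC feature: `differing_columns` is a made-up name, and pairing the records (for example by sequence number) is assumed to have been done beforehand.

```python
from itertools import zip_longest


def differing_columns(new_rec, old_rec):
    """Return 1-based (start, end) column ranges where two records differ."""
    ranges = []
    start = None
    # zip_longest pads the shorter record with None, so a length
    # difference shows up as a trailing range of differing columns.
    for col, (a, b) in enumerate(zip_longest(new_rec, old_rec), start=1):
        if a != b:
            if start is None:
                start = col  # a new run of differing columns begins
        elif start is not None:
            ranges.append((start, col - 1))  # the run ended one column back
            start = None
    if start is not None:
        ranges.append((start, max(len(new_rec), len(old_rec))))
    return ranges


# Hypothetical 500-byte records that differ only in columns 250-251.
old_rec = "A" * 500
new_rec = old_rec[:249] + "XY" + old_rec[251:]
print(differing_columns(new_rec, old_rec))
```

For the two sample records this prints `[(250, 251)]`, which is short enough to pass on to a user even when the record itself is 500 bytes long.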
Escapa
Senior Member
Joined: 16 Feb 2007 Posts: 1399 Location: IL, USA
amargallani wrote:
"Hi Pedro,
Here the example is small, if we have a file of RL 500 and the discrepancy is at the 250th column and the same issue occurs with 50 rows, it is difficult to analyse the results of SuperC.
It would be difficult to inform the user what data is missing."
First, you never told us your LRECL is 500. People would have suggested the correct approach by now.
About your question:
Quote:
"Is SuperC output restricted to RL 133? (wanted to know)"
I am sure you will get the answer by reading a couple of posts here on SUPERC.