IBM Mainframe Forum Index
 
How to Compare two files using SORT?


IBM Mainframe Forums -> DFSORT/ICETOOL
somapradeep1

New User


Joined: 07 Sep 2010
Posts: 22
Location: hyderabad

Posted: Mon Feb 27, 2012 3:16 pm

Hi,
I have two files with 300 LRECL each.

Now I need to compare the entire record across the two input files and write the matched records to one file and the unmatched records to another file using SORT. There is no matching criteria.
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

Posted: Mon Feb 27, 2012 3:30 pm

somapradeep1 wrote:
Hi,
I have two files with 300 LRECL each. Now I need to compare the entire record across the two input files and write the matched records to one file and the unmatched records to another file. There is no matching criteria.


If all of the above is really true, you have a bit of a pickle.

So, let's assume you are relying on the sequence of the two files for your matching criteria.

Have you searched the forum for examples that might help?
sqlcode1

Active Member


Joined: 08 Apr 2010
Posts: 577
Location: USA

Posted: Mon Feb 27, 2012 8:23 pm

somapradeep1,

Can there be duplicates in either file?

Thanks,
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Tue Feb 28, 2012 1:34 am

Quote:
There is no matching criteria.


You can't do a compare if there are no matching criteria.

Are you trying to compare record 1 from file1 with record 1 from file2, record 2 from file1 with record 2 from file2, etc.? If so, your matching criteria would be record by record. If that's not your criteria, then you need to define your criteria for comparing records before anyone can help you.
somapradeep1

New User


Joined: 07 Sep 2010
Posts: 22
Location: hyderabad

Posted: Tue Feb 28, 2012 6:42 pm

Sorry.
The matching criterion is the entire record: a record matches when all 300 bytes are the same in both files.
One more thing: there are no duplicates.
sqlcode1

Active Member


Joined: 08 Apr 2010
Posts: 577
Location: USA

Posted: Tue Feb 28, 2012 8:22 pm

somapradeep1,
See if the job below helps.

Since both files have the same RECFM and LRECL, this can also be done without JOINKEYS, but the JOINKEYS solution will be easier to maintain and understand.

Code:
//STEP0001 EXEC PGM=SORT                                         
//SYSOUT   DD SYSOUT=*                                           
//INA      DD DISP=SHR,DSN=INPUT1 FB/300                         
//INB      DD DISP=SHR,DSN=INPUT2 FB/300                         
//MATCHED  DD MATCHED RECORD OUTPUT FB/300                       
//UNMATCH  DD UNMATCH RECORD OUTPUT FB/300                       
//SYSIN    DD *                                                   
  OPTION COPY                                                     
  JOINKEYS F1=INA,FIELDS=(1,300,A)                               
  JOINKEYS F2=INB,FIELDS=(1,300,A)                               
  JOIN UNPAIRED                                                   
  REFORMAT FIELDS=(F1:1,300,F2:1,300,?)                           
  INREC IFTHEN=(WHEN=(601,1,CH,EQ,C'2'),OVERLAY=(1:301,300))     
  OUTFIL FNAMES=MATCHED,INCLUDE=(601,1,CH,EQ,C'B'),BUILD=(1,300) 
  OUTFIL FNAMES=UNMATCH,INCLUDE=(601,1,SS,EQ,C'1,2'),BUILD=(1,300)
/*                                                               
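
To see how the ? indicator in REFORMAT drives the split, here is a small self-contained sketch with in-stream data. The step name DEMO, the 80-byte records and the sample values are made up for illustration; only the JOINKEYS/JOIN/REFORMAT pattern comes from the job above.

Code:
//DEMO     EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//INA      DD *
AAAA
BBBB
CCCC
/*
//INB      DD *
BBBB
CCCC
DDDD
/*
//SORTOUT  DD SYSOUT=*
//SYSIN    DD *
* KEY IS THE WHOLE 80-BYTE RECORD, AS IN THE 300-BYTE JOB
  JOINKEYS F1=INA,FIELDS=(1,80,A)
  JOINKEYS F2=INB,FIELDS=(1,80,A)
* KEEP PAIRED RECORDS AND UNPAIRED RECORDS FROM BOTH FILES
  JOIN UNPAIRED
* F1 COPY IN 1-80, F2 COPY IN 81-160, INDICATOR IN 161
  REFORMAT FIELDS=(F1:1,80,F2:1,80,?)
  OPTION COPY
* SHOW INDICATOR, THEN START OF F1 COPY, THEN START OF F2 COPY
  OUTFIL BUILD=(161,1,X,1,10,X,81,10)
/*

The indicator should come out as 1 for AAAA (only in INA), B for BBBB and CCCC (in both), and 2 for DDDD (only in INB). For a 2 record the F1 copy is blank, which is why the job above moves the F2 copy to the front before writing UNMATCH.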

Thanks,
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

Posted: Tue Feb 28, 2012 9:12 pm

somapradeep, you have still not made things very clear.

This one works without sorting the files. No idea if that matters to you. Depends if you have files which are both logically and physically the same, or just logically the same (order does not matter).

SURPLUS1 and SURPLUS2 are just in case you have more records on one file than the other. SORTOUT will contain the REFORMAT records, so if you have items on MISMATCH you can look at them in context, if necessary. MISMATCH also contains the REFORMAT records.

Code:
//MTCHFILE EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTOUT  DD SYSOUT=*
//MATCH DD SYSOUT=*
//MISMATCH DD SYSOUT=*
//SURPLUS1 DD SYSOUT=*
//SURPLUS2 DD SYSOUT=*
//SYSIN DD *
  JOINKEYS F1=IN1,FIELDS=(81,8,A),SORTED,NOSEQCK
  JOINKEYS F2=IN2,FIELDS=(81,8,A),SORTED,NOSEQCK
  JOIN UNPAIRED,F1,F2
  REFORMAT FIELDS=(F1:1,80,F2:1,80,?)
  OPTION COPY
  INREC IFTHEN=(WHEN=(1,80,CH,NE,81,80,CH),
                OVERLAY=(162:C'N'))
  OUTFIL FNAMES=SURPLUS1,INCLUDE=(161,1,CH,EQ,C'1'),
           BUILD=(1,80)
  OUTFIL FNAMES=SURPLUS2,INCLUDE=(161,1,CH,EQ,C'2'),
           BUILD=(81,80)
  OUTFIL FNAMES=MISMATCH,INCLUDE=(162,1,CH,EQ,C'N'),
           BUILD=(1,162)
  OUTFIL FNAMES=MATCH,
  INCLUDE=(162,1,CH,NE,C'N',AND,161,1,CH,EQ,C'B'),
           BUILD=(1,80)
//JNF1CNTL DD *
  INREC OVERLAY=(81:SEQNUM,8,ZD)
//JNF2CNTL DD *
  INREC OVERLAY=(81:SEQNUM,8,ZD)
/*
//IN1 DD *
RECORD1
RECORD2
RECORD3
RECORD4
RECORD5AFILE1
RECORD6
RECORD7
RECORD8
RECORD9
RECOR10
RECOR11
RECOR12
//IN2 DD *
RECORD1
RECORD2
RECORD3
RECORD4
RECORD5
RECORD6
RECORD7
RECORD8
RECORD9
RECOR10
RECOR11AFILE2
RECOR12
RECOR13



sqlcode1

Active Member


Joined: 08 Apr 2010
Posts: 577
Location: USA

Posted: Tue Feb 28, 2012 9:54 pm

Bill Woodger,

somapradeep1 wrote:
Hi,
I have two files with 300 LRECL each. Now I need to compare the entire record across the two input files and write the matched records to one file and the unmatched records to another file. There is no matching criteria.
Apart from the OP having 300-byte records rather than 80-byte ones, with this approach you are assuming that both files are in sorted order and that the OP wants to compare record by record (1st record of file1 with 1st record of file2), which is not the case. At least, that's not what the OP has mentioned.

If the files are in sorted order already (which I don't see the OP mention anywhere), you could always add SORTED,NOSEQCK to the solution I provided, and ? will take care of the rest. Also, each MISMATCH record in this situation seems to be twice the input record length plus 2 bytes. Not sure if that is required as well.
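
A minimal sketch of that change, assuming both inputs really are already in ascending order on the full 300 bytes. With NOSEQCK the order is not checked, so out-of-sequence input would give unreliable results. Only the two JOINKEYS statements differ from the earlier job:

Code:
* INPUTS ASSUMED ALREADY IN KEY ORDER: SKIP SORT AND SEQUENCE CHECK
  JOINKEYS F1=INA,FIELDS=(1,300,A),SORTED,NOSEQCK
  JOINKEYS F2=INB,FIELDS=(1,300,A),SORTED,NOSEQCK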

Just my 2 cents... You probably don't need SURPLUS1 and SURPLUS2 because these are true unmatched records.

Thanks,
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

Posted: Tue Feb 28, 2012 10:15 pm

I tested with 80. It is not beyond the wit of TS/OP to change that to 300, along with the other values which relate to 80. I nearly did it with SYMNAMES to make the process of changing easier...

As I explained, MISMATCH and SORTOUT are intended to hold the REFORMAT records, which is handy for looking at what mismatches. If there are mismatches, you can easily see the context. SURPLUS1 and SURPLUS2 are just there to provide a quick check that the files contain the same number of records. Any or all of these could be changed or excluded with ease.

The idea with the sequence numbers is that it deals with two files whose contents are equal (or to be tested as so), per record, but which are not in sorted order.

We don't know from TS/OP whether their files are in order, need to be in order for comparison, or need to be compared "as is", without being sorted. I believe yours covers the first two, and mine the last.

If your code is used for the last, the files might compare clean even though records are out of order between the two files, so those mismatches would be missed.

That's what this was about:

Quote:
This one works without sorting the files. No idea if that matters to you. Depends if you have files which are both logically and physically the same, or just logically the same (order does not matter).


We don't know. If the files must be sorted to compare, according to TS/OP's non-information so far, then they must. If they mustn't be sorted, then they mustn't.
gcicchet

Senior Member


Joined: 28 Jul 2006
Posts: 1702
Location: Australia

Posted: Wed Feb 29, 2012 3:48 am

Hi,

Another way, though not as efficient if the files are already sorted:

Code:
//STEP0100 EXEC PGM=ICETOOL                                       
//TOOLMSG  DD SYSOUT=*                                           
//DFSMSG   DD SYSOUT=*                                           
//IN       DD DSN=file1,DISP=SHR                                 
//         DD DSN=file2,DISP=SHR                                 
//MATCHED  DD SYSOUT=*                                           
//UNMATCH  DD SYSOUT=*                                           
//TOOLIN   DD *                                                   
SELECT FROM(IN) TO(MATCHED) ON(1,300,CH) ALLDUPS DISCARD(UNMATCH)
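
For a feel of what lands where, a small self-contained sketch of the same SELECT with in-stream data. The step name DEMO, the 80-byte records and the sample values are made up for illustration. It relies on there being no duplicates within either file, as the OP stated, so that any duplicate key means the record is on both files.

Code:
//DEMO     EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//IN       DD *
AAAA
BBBB
CCCC
/*
//         DD *
BBBB
CCCC
DDDD
/*
//MATCHED  DD SYSOUT=*
//UNMATCH  DD SYSOUT=*
//TOOLIN   DD *
* RECORDS WHOSE KEY OCCURS MORE THAN ONCE GO TO MATCHED
* ALL OTHER RECORDS GO TO THE DISCARD DATA SET UNMATCH
SELECT FROM(IN) TO(MATCHED) ON(1,80,CH) ALLDUPS DISCARD(UNMATCH)
/*

MATCHED should receive BBBB and CCCC twice each (one copy from each file), and UNMATCH should receive AAAA and DDDD. If only one copy of each matched record is wanted, the output would need a further step.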



Gerry
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

Posted: Wed Feb 29, 2012 5:39 am

This is basically the same as my previous, but with SYMNAMES and an extra step.

The new step takes a PARM indicating the record length. No further changes are needed to compare fixed-length records up to about half the maximum record size. The sequence number allows eight digits, so up to but not including 100 million records.
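
For illustration, this is what the DOSYMBOL step in the job below should write to &&SYMB1 when run with PARM='JP1"80"'. The SORTIN record holds the 23-character symbol name, and the OVERLAY adds the position, the JP1 value and the format starting at column 24. This is worked out from the control statements rather than copied from a run.

Code:
* GENERATED SYMNAMES RECORD: NAME,POSITION,LENGTH,FORMAT
DUMMY-RECORD-DEFINITION,1,80,CH

The =,=,= entries that follow it in SYMNAMES reuse that position, length and format, which is why the record length only has to be changed in the PARM.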

The REFORMAT record now contains the sequence number. The match indicator is no longer included if it is obvious from the file the record is on.

SURPLUS1, SURPLUS2, SORTOUT, MISMATCH and even MATCH can easily be removed or amended as needed by the user.

The reason for including them was wondering why TS/OP didn't want to use a file-comparison product. So, if I didn't want to use a file-comparison product, it would be because I could customise it more, not less :-)

Tested (lightly) with records of 5 and 80 bytes.

Code:
//DOSYMBOL EXEC PGM=SORT,PARM='JP1"80"'
//SYSOUT   DD SYSOUT=*
//SORTOUT DD DSN=&&SYMB1,UNIT=SYSDA,DISP=(,PASS)
//SORTIN DD *
DUMMY-RECORD-DEFINITION
//SYSIN DD *
 OPTION COPY
 INREC OVERLAY=(24:C',1,',JP1,C',CH',80:X)
//MTCHFILE EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTOUT DD SYSOUT=*
//MATCH DD SYSOUT=*
//MISMATCH DD SYSOUT=*
//SURPLUS1 DD SYSOUT=*
//SURPLUS2 DD SYSOUT=*
//SYMNOUT DD SYSOUT=*
//SYMNAMES DD DSN=&&SYMB1,DISP=(OLD,PASS)
// DD *
* INPUT RECORDS
  F1-WHOLE-RECORD,=,=,=
  F2-WHOLE-RECORD,=,=,=
* REFORMAT RECORD
  F1-REFORMAT-RECORD,=,=,=
  F2-REFORMAT-RECORD,*,=,=
  F1-SEQUENCE,*,8,ZD
  REFORMAT-JOIN-IND,*,1,CH
* EXTENSION OF REFORMAT RECORD
  OVERLAY-MATCHED-IND,*,1,CH
* JNFNCNTL FIELDS
POSITION,F2-REFORMAT-RECORD
  CNTL1-OVERLAY-COL,=,1,CH
  CNTL2-OVERLAY-COL,=,=,=
  CNTL1-SEQUENCE,=,8,ZD
  CNTL2-SEQUENCE,=,=,=
* LITERALS
  NO-MATCH-IND,C'N'
  ON-F1-ONLY,C'1'
  ON-F2-ONLY,C'2'
  ON-BOTH-F1-AND-F2,C'B'
//SYSIN DD *
                                                         
  JOINKEYS F1=IN1,
             FIELDS=(CNTL1-SEQUENCE,
                     A),
                     SORTED,
                     NOSEQCK
  JOINKEYS F2=IN2,
             FIELDS=(CNTL2-SEQUENCE,
                     A),
                     SORTED,
                     NOSEQCK
                                                         
  JOIN UNPAIRED,F1,F2
                                                         
  REFORMAT FIELDS=(F1:F1-WHOLE-RECORD,
                   F2:F2-WHOLE-RECORD,
                   F1:CNTL1-SEQUENCE,
                      ?)
                                                         
  OPTION COPY
                                                         
  INREC IFTHEN=(WHEN=(F1-REFORMAT-RECORD,
               NE,
                F2-REFORMAT-RECORD),
                  OVERLAY=(OVERLAY-MATCHED-IND:NO-MATCH-IND))

  OUTFIL FNAMES=SURPLUS1,INCLUDE=(REFORMAT-JOIN-IND,EQ,ON-F1-ONLY),
           BUILD=(F1-REFORMAT-RECORD)
                                                                       
  OUTFIL FNAMES=SURPLUS2,INCLUDE=(REFORMAT-JOIN-IND,EQ,ON-F2-ONLY),
           BUILD=(F2-REFORMAT-RECORD)
                                                                       
  OUTFIL FNAMES=MISMATCH,INCLUDE=(OVERLAY-MATCHED-IND,EQ,NO-MATCH-IND),
           BUILD=(F1-REFORMAT-RECORD,
                  F2-REFORMAT-RECORD,
                  F1-SEQUENCE,
                  REFORMAT-JOIN-IND)
                                                                       
  OUTFIL FNAMES=MATCH,
           INCLUDE=(OVERLAY-MATCHED-IND,NE,NO-MATCH-IND,
                    AND,REFORMAT-JOIN-IND,EQ,ON-BOTH-F1-AND-F2),
           BUILD=(F1-REFORMAT-RECORD)
                                                                       
//JNF1CNTL DD *
                                                                       
  INREC OVERLAY=(CNTL1-OVERLAY-COL:SEQNUM,8,ZD)
                                                                       
//JNF2CNTL DD *
                                                                       
  INREC OVERLAY=(CNTL2-OVERLAY-COL:SEQNUM,8,ZD)
                                                                       
/*
//IN1 DD *
RECORD1Z
RECORD2Y
RECORD3X
RECORD4W
BECORD5VAFILE1
RECORD6U
RECORD7T
RECORD8S
RECORD9R
RECOR10Q
RECOR11P
RECOR12O
//IN2 DD *
RECORD1Z
RECORD2Y
RECORD3X
RECORD4W
RECORD5V
RECORD6U
RECORD7T
RECORD8S
RECORD9R
BECOR10QBFILE2
RECOR11P
RECOR12O
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

Posted: Wed Feb 29, 2012 1:14 pm

Thanks to Gerry for spotting the glitch in my paste :-)

I hadn't checked my claim about the size of the records, and was worried, but it gets through syntax checking with 16350 (I haven't created any records that size).

Code:
//DOSYMBOL EXEC PGM=SORT,PARM='JP1"16350"'
//SYSOUT   DD SYSOUT=*
//SORTOUT DD DSN=&&SYMB1,UNIT=SYSDA,DISP=(,PASS)
//SORTIN DD *
DUMMY-RECORD-DEFINITION
//SYSIN DD *
 OPTION COPY
 INREC OVERLAY=(24:C',1,',JP1,C',CH',80:X)
//MTCHFILE EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTOUT DD SYSOUT=*
//MATCH DD SYSOUT=*
//MISMATCH DD SYSOUT=*
//SURPLUS1 DD SYSOUT=*
//SURPLUS2 DD SYSOUT=*
//SYMNOUT DD SYSOUT=*
//SYMNAMES DD DSN=&&SYMB1,DISP=(OLD,PASS)
// DD *
* INPUT RECORDS
  F1-WHOLE-RECORD,=,=,=
  F2-WHOLE-RECORD,=,=,=
* REFORMAT RECORD
  F1-REFORMAT-RECORD,=,=,=
  F2-REFORMAT-RECORD,*,=,=
  F1-SEQUENCE,*,8,ZD
  REFORMAT-JOIN-IND,*,1,CH
* EXTENSION OF REFORMAT RECORD
  OVERLAY-MATCHED-IND,*,1,CH
* JNFNCNTL FIELDS
POSITION,F2-REFORMAT-RECORD
  CNTL1-OVERLAY-COL,=,1,CH
  CNTL2-OVERLAY-COL,=,=,=
  CNTL1-SEQUENCE,=,8,ZD
  CNTL2-SEQUENCE,=,=,=
* LITERALS
  NO-MATCH-IND,C'N'
  ON-F1-ONLY,C'1'
  ON-F2-ONLY,C'2'
  ON-BOTH-F1-AND-F2,C'B'
//SYSIN DD *
                                                                       
  JOINKEYS F1=IN1,
             FIELDS=(CNTL1-SEQUENCE,
                     A),
                     SORTED,
                     NOSEQCK
  JOINKEYS F2=IN2,
             FIELDS=(CNTL2-SEQUENCE,
                     A),
                     SORTED,
                     NOSEQCK
                                                                       
  JOIN UNPAIRED,F1,F2
                                                                       
  REFORMAT FIELDS=(F1:F1-WHOLE-RECORD,
                   F2:F2-WHOLE-RECORD,
                   F1:CNTL1-SEQUENCE,
                      ?)
                                                                       
  OPTION COPY
                                                                       
  INREC IFTHEN=(WHEN=(F1-REFORMAT-RECORD,
               NE,
                F2-REFORMAT-RECORD),
                  OVERLAY=(OVERLAY-MATCHED-IND:NO-MATCH-IND))
                                                                       
  OUTFIL FNAMES=SURPLUS1,INCLUDE=(REFORMAT-JOIN-IND,
                                 EQ,
                                  ON-F1-ONLY),
           BUILD=(F1-REFORMAT-RECORD)
                                                                       
  OUTFIL FNAMES=SURPLUS2,INCLUDE=(REFORMAT-JOIN-IND,
                                 EQ,
                                  ON-F2-ONLY),
           BUILD=(F2-REFORMAT-RECORD)
                                                                       
  OUTFIL FNAMES=MISMATCH,INCLUDE=(OVERLAY-MATCHED-IND,
                                 EQ,
                                  NO-MATCH-IND),
           BUILD=(F1-REFORMAT-RECORD,
                  F2-REFORMAT-RECORD,
                  F1-SEQUENCE,
                  REFORMAT-JOIN-IND)
                                                                       
  OUTFIL FNAMES=MATCH,
           INCLUDE=(OVERLAY-MATCHED-IND,NE,NO-MATCH-IND,
                    AND,REFORMAT-JOIN-IND,EQ,ON-BOTH-F1-AND-F2),
           BUILD=(F1-REFORMAT-RECORD)
                                                                       
//JNF1CNTL DD *
                                                                       
  INREC OVERLAY=(CNTL1-OVERLAY-COL:SEQNUM,8,ZD)
                                                                       
//JNF2CNTL DD *
                                                                       
  INREC OVERLAY=(CNTL2-OVERLAY-COL:SEQNUM,8,ZD)
                                                                       
/*
//IN1 DD *
RECORD1Z
RECORD2Y
RECORD3X
RECORD4W
BECORD5VAFILE1
RECORD6U
RECORD7T
RECORD8S
RECORD9R
RECOR10Q
RECOR11P
RECOR12O
//IN2 DD *
RECORD1Z
RECORD2Y
RECORD3X
RECORD4W
RECORD5V
RECORD6U
RECORD7T
RECORD8S
RECORD9R
BECOR10QBFILE2
RECOR11P
RECOR12O
All times are GMT + 6 Hours