Duplicates elimination in two datasets


mfraju

New User


Joined: 07 Jul 2007
Posts: 2
Location: bangalore

Posted: Thu Aug 16, 2007 7:01 pm

Hi,

I have two input files. I have to compare the first file with the second, and skip any record in the first file that also appears in the second.
See the example below. Can this be done with sort, or in some other way?


INPUT File1
1
2
3

INPUT File2
1
4
5

output file
2
3
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Thu Aug 16, 2007 8:30 pm

mfraju,

Here's a DFSORT/ICETOOL job that will do what you asked for. I assumed your input files have RECFM=FB and LRECL=80, but the job can be changed appropriately for other attributes.

Code:

//S1    EXEC  PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG  DD SYSOUT=*
//IN1 DD *
1
2
3
/*
//IN2 DD *
1
4
5
/*
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(MOD,PASS)
//OUT DD SYSOUT=*
//TOOLIN   DD    *
COPY FROM(IN1) TO(T1) USING(CTL1)
COPY FROM(IN2) TO(T1) USING(CTL2)
SELECT FROM(T1) TO(OUT) ON(1,1,CH) NODUPS USING(CTL3)
/*
//CTL1CNTL DD *
  INREC OVERLAY=(81:C'1')
/*
//CTL2CNTL DD *
  INREC OVERLAY=(81:C'2')
/*
//CTL3CNTL DD *
  OUTFIL FNAMES=OUT,INCLUDE=(81,1,CH,EQ,C'1'),
    BUILD=(1,80)
/*
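
For readers less familiar with ICETOOL, the effect of the three steps in the job above can be modelled in a few lines of Python. This is only a sketch of the logic, not of how DFSORT works internally: the function name `not_in_second` and the `key` parameter are made up for illustration, with `key` standing in for `ON(1,1,CH)` and a tuple element standing in for the tag byte in position 81.

```python
from collections import Counter

def not_in_second(file1, file2, key=lambda rec: rec[:1]):
    """Model of the job: tag, combine, keep unique keys, keep file-1 records."""
    # CTL1 / CTL2: tag each record with the file it came from
    # (the job puts '1' or '2' in position 81; here the tag is a tuple element).
    tagged = [(rec, "1") for rec in file1] + [(rec, "2") for rec in file2]
    # SELECT ... NODUPS: keep only records whose key occurs exactly once in T1.
    counts = Counter(key(rec) for rec, _ in tagged)
    unique = [(rec, tag) for rec, tag in tagged if counts[key(rec)] == 1]
    # SELECT sorts on the ON field, so the survivors come out in key order.
    unique.sort(key=lambda pair: key(pair[0]))
    # CTL3 OUTFIL INCLUDE: keep only file-1 records; BUILD=(1,80) drops the tag.
    return [rec for rec, tag in unique if tag == "1"]

print(not_in_second(["1", "2", "3"], ["1", "4", "5"]))  # prints ['2', '3']
```

One consequence worth noting: because NODUPS keeps only keys that occur once in the combined data, a key that appears twice within file 1 alone is dropped as well, even if it never appears in file 2.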
puzzled_elton

New User


Joined: 09 May 2005
Posts: 7

Posted: Thu Aug 16, 2007 8:33 pm

My record attributes look like DCB=(RECFM=VB,BLKSIZE=20096,LRECL=20001).

The data can be anywhere from position 1 to 20001.
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Thu Aug 16, 2007 8:36 pm

What positions do you want to use to check for duplicates?
puzzled_elton

New User


Joined: 09 May 2005
Posts: 7

Posted: Fri Aug 17, 2007 12:57 pm

From the first position to 20001 (the end). That means the entire record.
CICS Guy

Senior Member


Joined: 18 Jul 2007
Posts: 2146
Location: At my coffee table

Posted: Fri Aug 17, 2007 2:41 pm

puzzled_elton wrote:
From first position to 20001(end). That means entire record.

I'd guess that you will need to make several passes, since:

DFSORT Application Programming Guide wrote:
The collected control fields (comprising the control word) must not exceed 4092 bytes (or 4088 bytes when EQUALS is in effect).
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Fri Aug 17, 2007 8:22 pm

I'd suggest using a compare program for this kind of thing. Although DFSORT can do some types of matching, it has no built-in functions for comparing 20000 byte records.
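
As a rough idea of what such a compare program could look like when the whole record is the key, here is a minimal Python sketch. It is not a DFSORT feature and not a specific product: the function name `records_only_in_first` is hypothetical, records are assumed to be already read into memory as byte strings (without the RDW), and the second file is assumed to be small enough to hold in a set.

```python
def records_only_in_first(first_records, second_records):
    """Return the records of the first input that occur nowhere in the second.

    The comparison uses the complete record, so record length is not limited
    the way a sort control field is.
    """
    seen = set(second_records)  # assumption: the second file fits in memory
    return [rec for rec in first_records if rec not in seen]

# Long records that differ only in their last byte.
prefix = b"A" * 19999
first = [prefix + b"1", prefix + b"2", prefix + b"3"]
second = [prefix + b"1", prefix + b"4", prefix + b"5"]

result = records_only_in_first(first, second)
print([rec[-1:] for rec in result])  # prints [b'2', b'3']
```

Unlike a NODUPS selection, this sketch keeps a record that is repeated within the first file, as long as it does not occur in the second.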
hallecodec

New User


Joined: 05 Sep 2006
Posts: 30
Location: Philippines

Posted: Sun Aug 19, 2007 8:54 pm

Hi,

I would just like to clarify how the above code works:
(1) Does it compare the two datasets row by row, so that if the first record of the first file doesn't match the first record of the second file, it is written to the third file? Or
(2) Is every record in the first dataset compared with every record in the second dataset, and written to the third dataset only if no duplicate is found?
Please use the example below for both situations.

Example (using the example from the first post):

for (1):
input file 1 input file 2
1 -----------> 1
2 -----------> 4
3 -----------> 5
(first record of file 1 will be compared with first record of file 2, second record of file 1 will be compared with second record of file 2, and so on...)

for (2):
input file 1 input file 2
1 -----------> 1, 4, 5
2 -----------> 1, 4, 5
3 -----------> 1, 4, 5
(first record of file 1 will be compared with every record of file 2, second record of file 1 will be compared with every record of file 2, and so on...)

Please advise. Thanks in advance.
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Sun Aug 19, 2007 9:14 pm

Code:

SELECT FROM(T1) TO(OUT) ON(1,1,CH) NODUPS USING(CTL3)


SELECT sorts the records from both files using the ON field as the key so records with the same key will be adjacent. It can then compare the key of adjacent records. I used ON(1,1,CH) to correspond to an example with RECFM=FB, LRECL=80 and the key in position 1, but the ON field length can be up to 4088 bytes.

For complete details on the SELECT operator of DFSORT's ICETOOL, see:

publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/ICE1CA20/6.11?DT=20060615185603
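
To make the "sort, then compare adjacent keys" idea concrete, here is a small Python sketch. It only illustrates the principle described above and is not DFSORT's implementation: `select_nodups` is a made-up name, and the sample records are deliberately out of order to show that a record's row position within its file does not matter, which is closer to case (2) in the question than to case (1).

```python
from itertools import groupby

def select_nodups(records, key):
    """Sort by key so equal keys are adjacent, then keep keys seen only once."""
    ordered = sorted(records, key=key)
    kept = []
    for _, group in groupby(ordered, key=key):
        group = list(group)
        if len(group) == 1:  # no adjacent record shares this key
            kept.append(group[0])
    return kept

# "1" is row 2 of file 1 and row 3 of file 2; it is still detected.
file1 = ["3", "1", "2"]
file2 = ["4", "5", "1"]

# Pad to 80 bytes and append the tag byte, as the COPY steps do with OVERLAY.
t1 = [rec.ljust(80) + "1" for rec in file1] + [rec.ljust(80) + "2" for rec in file2]

survivors = select_nodups(t1, key=lambda rec: rec[0:1])       # ON(1,1,CH)
out = [rec[:80].rstrip() for rec in survivors if rec[80] == "1"]  # tag '1' only
print(out)  # prints ['2', '3']
```

Sorting first means each record needs to be compared only with its neighbours rather than with every record of the other file.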
All times are GMT + 6 Hours