finding duplicates in a file using ICETOOL..

 
Mukesh Pandey

Active User


Joined: 11 Nov 2008
Posts: 143
Location: India

Posted: Wed Mar 03, 2010 5:38 pm    Post subject: finding duplicates in a file using ICETOOL..

Hi all,

I have two files, each with millions of records.

I need to match a record field from file1 against the record fields in file2. If duplicates are found, the duplicate record is to be written to file3.

Please let me know the solution for this.

Please note: the files are flat PS files.

gcicchet

Senior Member


Joined: 28 Jul 2006
Posts: 1703
Location: Australia

Posted: Wed Mar 03, 2010 5:44 pm

Hi,

Are there duplicates in either file?

What constitutes a duplicate? Do both files have the same layout?

Please provide data from both files and what is expected in file 3.


Gerry
Frank Yaeger

DFSORT Moderator


Joined: 15 Feb 2005
Posts: 7130
Location: San Jose, CA

Posted: Wed Mar 03, 2010 11:43 pm

Quote:
Please let me know the solution for this.


You haven't given enough information for anyone to do that.

Please show an example of the records in each input file (relevant fields only) and what you expect for output. Explain the "rules" for getting from input to output. Give the starting position, length and format of each relevant field. Give the RECFM and LRECL of the input files. If file1 can have duplicates within it, show that in your example. If file2 can have duplicates within it, show that in your example.

Also, run this job and show the //SYSOUT messages you receive, so I can see what level you're at:

Code:

//S1      EXEC PGM=SORT
//SYSOUT  DD SYSOUT=*
//SORTIN  DD *
RECORD
//SORTOUT DD DUMMY
//SYSIN   DD *
  OPTION COPY
/*
Mukesh Pandey

Active User


Joined: 11 Nov 2008
Posts: 143
Location: India

Posted: Thu Mar 04, 2010 11:10 am

Suppose the LRECL is fixed at 80 characters for both files, and we have only one field, called R1 in file1 and R2 in file2.

R1 and R2 are both 10-character alphanumeric fields starting at position 1; both R1 and R2 have duplicate values.


I need to find the records that are present in both R1 and R2.

I heard this can be achieved using ICETOOL. Please let me know the solution.

Let me know if more info is required.
Frank Yaeger

DFSORT Moderator


Joined: 15 Feb 2005
Posts: 7130
Location: San Jose, CA

Posted: Thu Mar 04, 2010 11:14 pm

It would really have helped if you'd given all of the information I asked for including an example of input and output. Since you didn't, I can only guess what your files look like.

Quote:
both R1 and R2 have duplicate values.


Assuming you mean that there are dups within file1, and dups within file2, you can use a DFSORT/ICETOOL job like the following (of course, I don't know if your files really look like this but it should give you the idea):

Code:

//S1    EXEC  PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG  DD SYSOUT=*
//SYSUDUMP DD SYSOUT=*
//IN1 DD *
AAAAAAAAAA FILE1 R1
AAAAAAAAAA FILE1 R2
CCCCCCCCCC FILE1 R3
BBBBBBBBBB FILE1 R4
BBBBBBBBBB FILE1 R5
BBBBBBBBBB FILE1 R6
FFFFFFFFFF FILE1 R7
FFFFFFFFFF FILE1 R8
/*
//IN2 DD *
AAAAAAAAAA FILE2 R1
DDDDDDDDDD FILE2 R2
CCCCCCCCCC FILE2 R3
CCCCCCCCCC FILE2 R4
FFFFFFFFFF FILE2 R5
FFFFFFFFFF FILE2 R6
/*
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(MOD,PASS)
//OUT DD SYSOUT=*
//TOOLIN DD *
SELECT FROM(IN1) TO(T1) ON(1,10,CH) FIRST
SELECT FROM(IN2) TO(T1) ON(1,10,CH) FIRST
SELECT FROM(T1) TO(OUT) ON(1,10,CH) FIRSTDUP
/*


For this example, OUT would have:

Code:

AAAAAAAAAA FILE1 R1 
CCCCCCCCCC FILE1 R3 
FFFFFFFFFF FILE1 R7 
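
To see why the three SELECT operators give that result, here is a rough Python model of the logic. It is only a sketch and not DFSORT itself: the helper names `select_first` and `select_firstdup` are made up for illustration, Python's string ordering stands in for the EBCDIC collating sequence (the two agree for these all-uppercase keys), and it assumes records with equal keys keep their input order. The idea is that FIRST leaves at most one record per key from each file, so any key that appears twice in T1 must exist in both files.

```python
from collections import Counter

KEY_LEN = 10  # key is in positions 1-10, as in ON(1,10,CH)


def key(record):
    return record[:KEY_LEN]


def select_first(records):
    """Rough model of SELECT ... FIRST: keep the first record for each key."""
    seen, kept = set(), []
    # sorted() is stable, so records with equal keys keep their input order.
    for record in sorted(records, key=key):
        if key(record) not in seen:
            seen.add(key(record))
            kept.append(record)
    return kept


def select_firstdup(records):
    """Rough model of SELECT ... FIRSTDUP: first record of each key seen more than once."""
    counts = Counter(key(r) for r in records)
    return [r for r in select_first(records) if counts[key(r)] > 1]


in1 = [
    "AAAAAAAAAA FILE1 R1", "AAAAAAAAAA FILE1 R2", "CCCCCCCCCC FILE1 R3",
    "BBBBBBBBBB FILE1 R4", "BBBBBBBBBB FILE1 R5", "BBBBBBBBBB FILE1 R6",
    "FFFFFFFFFF FILE1 R7", "FFFFFFFFFF FILE1 R8",
]
in2 = [
    "AAAAAAAAAA FILE2 R1", "DDDDDDDDDD FILE2 R2", "CCCCCCCCCC FILE2 R3",
    "CCCCCCCCCC FILE2 R4", "FFFFFFFFFF FILE2 R5", "FFFFFFFFFF FILE2 R6",
]

# Both SELECT ... FIRST steps write to T1; DISP=MOD makes the second one append.
t1 = select_first(in1) + select_first(in2)

# A key occurring twice in T1 came once from each file.
out = select_firstdup(t1)

for record in out:
    print(record)
```

With the sample data this prints the same three FILE1 records shown for OUT. The FILE1 record is the one kept for each matching key because the file1 records were written to T1 first.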
All times are GMT + 6 Hours