Removal of duplicate records in a file

 
IBMMAINFRAMES.com Support Forums -> COBOL Programming
Priya_Shankar

New User


Joined: 07 Aug 2007
Posts: 22
Location: Chennai

PostPosted: Thu Aug 09, 2007 6:20 pm    Post subject: Removal of duplicate records in a file

How can we remove duplicate records from different types of files through COBOL? I want to know which commands or statements are available in COBOL to perform this operation.

dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6968
Location: porcelain throne

PostPosted: Thu Aug 09, 2007 6:39 pm    Post subject:

At the top of this page is the 'Manuals' button, to the right of 'Portal' (far left). Find a COBOL manual and RTFM.
stodolas

Active Member


Joined: 13 Jun 2007
Posts: 632
Location: Wisconsin

PostPosted: Thu Aug 09, 2007 7:38 pm    Post subject:

Sort is probably more efficient than COBOL anyway. But when all you have is a hammer, everything looks like a nail (even screws).
dick scherrer

Site Director


Joined: 23 Nov 2006
Posts: 19270
Location: Inside the Matrix

PostPosted: Thu Aug 09, 2007 11:18 pm    Post subject:

Hello,

There is no command to remove duplicates. In many, many languages, the process is
1 - sort the records by whatever is the "key" on which duplicates are to be removed.
2 - read the records and keep the first or last duplicate as suits your requirement.
3 - write the "uniques" to a file and the dups to another (if required).

That's all there is. . . .
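
Since the steps are the same in most languages, here is a minimal sketch of them in Python, used only to show the logic. The function name `split_duplicates`, the `keep` parameter, and the 4-character key in the sample data are all made up for illustration; they are assumptions, not part of the original question.

```python
from itertools import groupby

def split_duplicates(records, key, keep="first"):
    """Sort records by key, keep one record per key, collect the rest as dups."""
    if keep not in ("first", "last"):
        raise ValueError("keep must be 'first' or 'last'")
    uniques, dups = [], []
    # Step 1: sort on the key. sorted() is stable, so records with the
    # same key stay in their original relative order.
    for _, group in groupby(sorted(records, key=key), key=key):
        group = list(group)
        # Step 2: keep the first or the last record of each group.
        kept = group[0] if keep == "first" else group[-1]
        rest = group[1:] if keep == "first" else group[:-1]
        # Step 3: the kept record goes to the "uniques", the others to the dups.
        uniques.append(kept)
        dups.extend(rest)
    return uniques, dups

# Hypothetical sample data: the first 4 characters are assumed to be the key.
records = ["A001 first", "B002 only", "A001 second", "A001 third"]
uniques, dups = split_duplicates(records, key=lambda r: r[:4])
print(uniques)
print(dups)
```

If the dups are not needed, the second list can simply be ignored; in a COBOL program that would mean not writing the second output file.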
Priya_Shankar

New User


Joined: 07 Aug 2007
Posts: 22
Location: Chennai

PostPosted: Fri Aug 10, 2007 10:22 am    Post subject:

Hello dick,
Thanks for the response. Could you please explain the process you mentioned with some examples?
Devzee

Active Member


Joined: 20 Jan 2007
Posts: 684
Location: Hollywood

PostPosted: Fri Aug 10, 2007 10:25 am    Post subject:

Priya_Shankar,

dick has posted the high-level steps.

First you need to understand how to delete duplicates; then it will be easy to build the logic.

I would suggest doing it through SORT, which is quick and easy.
viswam

New User


Joined: 10 Jul 2006
Posts: 2

PostPosted: Sat Aug 11, 2007 1:38 am    Post subject: Re: Removal of duplicate records in a file

Hi,

If the files are in sorted order, you can do it very easily by reading the records one at a time and keeping the previous record in a temp variable. Before writing to the output file, compare the current record with the temp record. If they match, skip the write. Always move the record you just read to the temp record afterwards, whether or not it was written. Hope this helps.
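
The previous-record comparison can be sketched like this in Python (again only to show the logic; the name `drop_adjacent_duplicates` and the sample records are hypothetical). It compares whole records and assumes that no real record is ever `None`.

```python
def drop_adjacent_duplicates(records):
    """Yield each record unless it equals the record read just before it."""
    previous = None  # the "temp variable"; None means nothing has been read yet
    for record in records:
        if record != previous:
            yield record      # differs from the previous record: write it
        previous = record     # always remember the record just read

# Hypothetical input that is already in sorted order.
sorted_records = ["APPLE", "APPLE", "BANANA", "CHERRY", "CHERRY", "CHERRY"]
print(list(drop_adjacent_duplicates(sorted_records)))
```

Because only neighbouring records are compared, this works only on sorted input: duplicates that are not next to each other would both be written.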
dick scherrer

Site Director


Joined: 23 Nov 2006
Posts: 19270
Location: Inside the Matrix

PostPosted: Sat Aug 11, 2007 2:08 am    Post subject:

Hello Priya,

Quote:
Thanks for the response. Could you please explain the process you had mentioned through some examples?


You're welcome :)

This
Quote:
read the records and keep the first or last duplicate as suits your requirement
basically is the example. Until you define which record(s) are to be kept and which "discarded", we can only talk in generalities.

Are you trying to compare the entire record? Does some "key" determine what is a duplicate? Are the duplicates to be copied into another output file? Only your requirement can know that.
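
To show why the answer matters, here is a small Python sketch (illustration only) in which the same sorted input gives different results depending on whether the whole record or only a key is compared. The function name `dedupe_sorted`, the record layout, and the 4-character key are assumptions made up for this example.

```python
def dedupe_sorted(records, key=None):
    """Keep the first record of each run of equal keys.

    With key=None the whole record is compared.
    """
    kept = []
    have_previous = False
    previous_key = None
    for record in records:
        current_key = record if key is None else key(record)
        if not have_previous or current_key != previous_key:
            kept.append(record)
        previous_key = current_key
        have_previous = True
    return kept

# Hypothetical layout: a 4-character key followed by a 6-character name.
records = [
    "1001SMITH ",
    "1001SMITH ",
    "1001SMYTHE",
    "1002JONES ",
]
print(dedupe_sorted(records))                       # compare the entire record
print(dedupe_sorted(records, key=lambda r: r[:4]))  # compare the key only
```

Comparing the entire record keeps both "1001" records that differ in the name; comparing only the key keeps just the first "1001" record.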
balakrishna reddy

Active User


Joined: 13 Jul 2007
Posts: 130
Location: Guntur

PostPosted: Mon Aug 13, 2007 12:10 pm    Post subject:

hi Priya,

As dick asked:
Quote:

Are you trying to compare the entire record? Does some "key" determine what is a duplicate?


Unless you tell us which field you want to check for duplicates, we cannot provide the logic for removing those records.
Priya_Shankar

New User


Joined: 07 Aug 2007
Posts: 22
Location: Chennai

PostPosted: Mon Aug 13, 2007 3:09 pm    Post subject:

I don't want to copy the duplicates into another file. I just want to read a file and, if I find any duplicate records, write a single entry for that record (either the first or the last one) to a file.
What is the procedure (i) if I have to compare the entire record, or (ii) if I compare only on the keys? Is the procedure different for different kinds of files?
shankar.v

Active User


Joined: 25 Jun 2007
Posts: 196
Location: Bangalore

PostPosted: Mon Aug 13, 2007 3:33 pm    Post subject:

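The following skeleton code writes a single entry for each set of duplicate records to the output file. It compares each key with the previous one, so the input must already be sorted on the key. This can also be done easily with a sort step in JCL.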
Code:
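IDENTIFICATION DIVISION.
PROGRAM-ID. DEDUP.
ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
*> The ASSIGN names are assumptions; use your own DD names.
 SELECT INFILE ASSIGN TO INFILE.
 SELECT OUTFILE ASSIGN TO OUTFILE.
DATA DIVISION.
FILE SECTION.
*> PIC X() lengths are left open, as in the original skeleton.
FD INFILE.
01 WS-INFILE-REC.
 02 WS-KEY PIC X().
 02 WS-OTHER-FIELDS PIC X().
FD OUTFILE.
01 OUTFILE-REC PIC X().
WORKING-STORAGE SECTION.
01 WF-END-OF-FILE-FLAG PIC X VALUE 'N'.
 88 WF-END-OF-FILE VALUE 'Y'.
01 WF-DUPLICATE-RECORD PIC X VALUE 'N'.
 88 DUPLICATE-RECORD-FOUND VALUE 'Y'.
 88 DUPLICATE-RECORD-NOT-FOUND VALUE 'N'.
77 WS-TEMP-KEY PIC X().
77 WS-DUPLICATE-COUNT PIC 9.
PROCEDURE DIVISION.
*> The input must already be sorted on WS-KEY.
 OPEN INPUT INFILE OUTPUT OUTFILE
*> Assumes no real key is all LOW-VALUES, so the first record is written.
 MOVE LOW-VALUES TO WS-TEMP-KEY
 PERFORM UNTIL WF-END-OF-FILE
  READ INFILE
   AT END SET WF-END-OF-FILE TO TRUE
   NOT AT END
*> Process the record only when the READ actually returned one.
    IF WS-KEY = WS-TEMP-KEY
     SET DUPLICATE-RECORD-FOUND TO TRUE
    ELSE
     SET DUPLICATE-RECORD-NOT-FOUND TO TRUE
    END-IF
    IF DUPLICATE-RECORD-NOT-FOUND
     WRITE OUTFILE-REC FROM WS-INFILE-REC
    END-IF
    MOVE WS-KEY TO WS-TEMP-KEY
  END-READ
 END-PERFORM
 CLOSE INFILE OUTFILE
 STOP RUN.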
All times are GMT + 6 Hours