Priya_Shankar
New User
Joined: 07 Aug 2007 Posts: 22 Location: Chennai
How can we remove duplicate records from different types of files in COBOL? I want to know which COBOL commands or statements are available to perform this operation.
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
At the top of this page is the 'Manuals' button, to the right of 'Portals' (far left). Find a COBOL manual and RTFM.
stodolas
Active Member
Joined: 13 Jun 2007 Posts: 632 Location: Wisconsin
Sort is probably more efficient than COBOL anyway. But when all you have is a hammer, everything looks like a nail (even screws).
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello,
There is no command to remove duplicates. In many, many languages, the process is
1 - sort the records by whatever is the "key" on which duplicates are to be removed.
2 - read the records and keep the first or last duplicate as suits your requirement.
3 - write the "uniques" to a file and the dups to another (if required).
That's all there is. . . .
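The three steps above can be sketched in a single COBOL program using the SORT verb with an OUTPUT PROCEDURE. This is only a minimal sketch under stated assumptions: the program name, the file names (IN-FILE, OUT-FILE, SORT-WORK), the DD names, the 80-byte record and the 10-byte key in columns 1-10 are all hypothetical, and this version keeps the first record of each key and discards the rest.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DEDUPSRT.
      * Hypothetical sketch: file names, DD names, the 80-byte record
      * and the 10-byte key in columns 1-10 are assumptions.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT IN-FILE   ASSIGN TO INFILE.
           SELECT OUT-FILE  ASSIGN TO OUTFILE.
           SELECT SORT-WORK ASSIGN TO SORTWK1.
       DATA DIVISION.
       FILE SECTION.
       FD  IN-FILE.
       01  IN-REC               PIC X(80).
       FD  OUT-FILE.
       01  OUT-REC              PIC X(80).
       SD  SORT-WORK.
       01  SORT-REC.
           05  SORT-KEY         PIC X(10).
           05  FILLER           PIC X(70).
       WORKING-STORAGE SECTION.
       01  WS-PREV-KEY          PIC X(10).
       01  WS-EOF-FLAG          PIC X VALUE 'N'.
           88  END-OF-SORT            VALUE 'Y'.
       01  WS-FIRST-FLAG        PIC X VALUE 'Y'.
           88  FIRST-RECORD           VALUE 'Y'.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Step 1: sort on the key that defines a duplicate. DUPLICATES
      * IN ORDER keeps equal keys in their original input order.
           SORT SORT-WORK
               ON ASCENDING KEY SORT-KEY
               WITH DUPLICATES IN ORDER
               USING IN-FILE
               OUTPUT PROCEDURE IS WRITE-UNIQUES
           STOP RUN.
       WRITE-UNIQUES.
      * Steps 2 and 3: read the sorted records back and write only
      * the first record of each key.
           OPEN OUTPUT OUT-FILE
           PERFORM UNTIL END-OF-SORT
               RETURN SORT-WORK
                   AT END
                       SET END-OF-SORT TO TRUE
                   NOT AT END
                       IF FIRST-RECORD OR SORT-KEY NOT = WS-PREV-KEY
                           WRITE OUT-REC FROM SORT-REC
                           MOVE SORT-KEY TO WS-PREV-KEY
                           MOVE 'N' TO WS-FIRST-FLAG
                       END-IF
               END-RETURN
           END-PERFORM
           CLOSE OUT-FILE.
```

To send the dups to a second file instead of dropping them, the IF would need an ELSE branch that writes SORT-REC to that file.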
Priya_Shankar
New User
Joined: 07 Aug 2007 Posts: 22 Location: Chennai
Hello dick,
Thanks for the response. Could you please explain the process you had mentioned through some examples?
Devzee
Active Member
Joined: 20 Jan 2007 Posts: 684 Location: Hollywood
dick has posted the high-level steps.
First you need to understand how to delete a duplicate; then it will be easy to build the logic.
I would suggest doing it through SORT, which is quick and easy.
viswam
New User
Joined: 10 Jul 2006 Posts: 2
Hi,
If the files are in sorted order, you can do it very easily by reading the records one at a time and keeping the previous record in a temp variable. Before writing to the output file, compare the current record with the temp record. If they match, skip the write. Always move the last-read record to the temp record after writing. Hope this helps.
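The read-and-compare approach described above might look like the following in COBOL. It is a rough sketch, not a definitive implementation: the program name, file names, DD names and the 80-byte record length are hypothetical, the input is assumed to be already sorted, and the whole record is compared rather than a key.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DEDUPSEQ.
      * Hypothetical sketch: file names, DD names and the 80-byte
      * record length are assumptions. Input must already be sorted.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT IN-FILE  ASSIGN TO INFILE.
           SELECT OUT-FILE ASSIGN TO OUTFILE.
       DATA DIVISION.
       FILE SECTION.
       FD  IN-FILE.
       01  IN-REC               PIC X(80).
       FD  OUT-FILE.
       01  OUT-REC              PIC X(80).
       WORKING-STORAGE SECTION.
      * Temp area holding the last record that was written.
       01  WS-PREV-REC          PIC X(80).
       01  WS-EOF-FLAG          PIC X VALUE 'N'.
           88  END-OF-INPUT           VALUE 'Y'.
       01  WS-FIRST-FLAG        PIC X VALUE 'Y'.
           88  FIRST-RECORD           VALUE 'Y'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN INPUT IN-FILE
                OUTPUT OUT-FILE
           PERFORM UNTIL END-OF-INPUT
               READ IN-FILE
                   AT END
                       SET END-OF-INPUT TO TRUE
                   NOT AT END
      * Write the record only if it differs from the previous one;
      * the very first record is always written.
                       IF FIRST-RECORD OR IN-REC NOT = WS-PREV-REC
                           WRITE OUT-REC FROM IN-REC
                           MOVE IN-REC TO WS-PREV-REC
                           MOVE 'N' TO WS-FIRST-FLAG
                       END-IF
               END-READ
           END-PERFORM
           CLOSE IN-FILE OUT-FILE
           STOP RUN.
```

The first-record flag is there because WS-PREV-REC has no defined content before the first read, so comparing against it could wrongly drop the first record.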
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
Hello Priya,
Quote:
Thanks for the response. Could you please explain the process you had mentioned through some examples?
You're welcome.
This
Quote:
read the records and keep the first or last duplicate as suits your requirement
basically is the example. Until you define which record(s) are to be kept and which "discarded", we can only talk in generalities.
Are you trying to compare the entire record? Does some "key" determine what is a duplicate? Are the duplicates to be copied into another output file? Only your requirement can know that.
balakrishna reddy
Active User
Joined: 13 Jul 2007 Posts: 128 Location: Guntur
Hi Priya,
As dick said:
Quote:
Are you trying to compare the entire record? Does some "key" determine what is a duplicate?
Until you tell us which field you want to check for duplicates, we cannot give you the logic for removing those records.
Priya_Shankar
New User
Joined: 07 Aug 2007 Posts: 22 Location: Chennai
I don't want to copy the duplicates into another file. I just want to read a file and, if I find duplicate records, write a single entry for them (either the first record or the last one) to an output file.
What is the procedure (i) if I have to compare the entire record, or (ii) if I compare only on the keys? Is the procedure different for different kinds of files?
shankar.v
Active User
Joined: 25 Jun 2007 Posts: 196 Location: Bangalore
The following skeleton code writes a single entry for each set of duplicate records to the output file. This can also be done easily with a sort JCL.
Code:
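      * Skeleton only: SELECT clauses are omitted and the PIC X()
      * lengths are left for you to fill in. INFILE must already be
      * sorted on WS-KEY.
       DATA DIVISION.
       FILE SECTION.
       FD  INFILE.
       01  WS-INFILE-REC.
           02  WS-KEY                  PIC X().
           02  WS-OTHER-FIELDS         PIC X().
       FD  OUTFILE.
       01  OUTFILE-REC                 PIC X().
       WORKING-STORAGE SECTION.
       01  WF-DUPLICATE-RECORD         PIC X.
           88  DUPLICATE-RECORD-FOUND      VALUE 'Y'.
           88  DUPLICATE-RECORD-NOT-FOUND  VALUE 'N'.
       01  WF-INPUT-STATUS             PIC X VALUE 'N'.
           88  WF-END-OF-FILE              VALUE 'Y'.
       01  WF-FIRST-RECORD             PIC X VALUE 'Y'.
           88  FIRST-RECORD                VALUE 'Y'.
       77  WS-TEMP-KEY                 PIC X().
       77  WS-DUPLICATE-COUNT          PIC 9.
       PROCEDURE DIVISION.
           OPEN INPUT INFILE OUTPUT OUTFILE
           PERFORM UNTIL WF-END-OF-FILE
               READ INFILE
                   AT END
                       SET WF-END-OF-FILE TO TRUE
                   NOT AT END
      * First record has no predecessor, so it is never a duplicate.
                       IF NOT FIRST-RECORD AND WS-KEY = WS-TEMP-KEY
                           SET DUPLICATE-RECORD-FOUND TO TRUE
                       ELSE
                           SET DUPLICATE-RECORD-NOT-FOUND TO TRUE
                       END-IF
      * Write only the first record of each key.
                       IF DUPLICATE-RECORD-NOT-FOUND
                           WRITE OUTFILE-REC FROM WS-INFILE-REC
                       END-IF
                       MOVE WS-KEY TO WS-TEMP-KEY
                       MOVE 'N' TO WF-FIRST-RECORD
               END-READ
           END-PERFORM
           CLOSE INFILE OUTFILE
           STOP RUN.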