remove duplicates and merge two files


IBM Mainframe Forums -> DFSORT/ICETOOL
Gopalakrishnan V

Active User


Joined: 28 Jun 2010
Posts: 102
Location: chennai

Posted: Mon Aug 23, 2010 6:35 pm

Y000263.RMSFEED.FILE1 :

010110011025247 XXXXXX
010110012300940 051461
010110012310940 111111

Y000263.REJECT.FILE1:

010110011025247 YYYYYY
010110011026855 051466

EXPECTED RESULT:
010110011025247 XXXXXX
010110011026855 051466
010110012300940 051461
010110012310940 111111

MY CODING:
Code:

//SORT1    EXEC PGM=SORT                         
//SORTIN   DD DSN=Y000263.RMSFEED.FILE1,DISP=SHR
//         DD DSN=Y000263.REJECT.FILE1,DISP=SHR 
//SORTOUT  DD DSN=Y000263.QTRAN.RTK,             
//            UNIT=DATA,DISP=(,CATLG,DELETE),   
//            SPACE=(CYL,(500,50),RLSE)
//SYSOUT   DD SYSOUT=*                           
//SYSIN    DD *                                 
   SORT FIELDS=(1,15,CH,A)                       
   SUM FIELDS=NONE                               
/*                                               


CURRENT OUTPUT:
010110011025247 YYYYYY
010110011026855 051466
010110012300940 051461
010110012310940 111111

If there are duplicates, the record should be taken from Y000263.RMSFEED.FILE1 only. The first 15 digits are the key in both files. If there is another way to do this, please let me know.
Robert Sample

Global Moderator


Joined: 06 Jun 2008
Posts: 8696
Location: Dubuque, Iowa, USA

Posted: Mon Aug 23, 2010 6:49 pm

Why is this in the COBOL forum? If you use SYNCSORT, it should be in the JCL forum; if you use DFSORT, it should be in the SORT forum.
sqlcode1

Active Member


Joined: 08 Apr 2010
Posts: 577
Location: USA

Posted: Mon Aug 23, 2010 7:19 pm

Gopalakrishnan V,
What is the LRECL and RECFM of the input file(s)?

Also, why couldn't you feed just 1 input file (Y000263.RMSFEED.FILE1) to this step, remove duplicates and then later in the next step/program concatenate both the files?

Thanks,
Garry Carroll

Senior Member


Joined: 08 May 2006
Posts: 1193
Location: Dublin, Ireland

Posted: Mon Aug 23, 2010 7:48 pm

Quote:
Also, why couldn't you feed just 1 input file (Y000263.RMSFEED.FILE1) to this step, remove duplicates and then later in the next step/program concatenate both the files?


Wouldn't the fact that the duplicate to be dropped is in the second file not be significant?

The OP hasn't told us whether there can be duplicates in the first file or whether these keys are unique. If there can be duplicates in File1 is only the first to be kept?

Garry.
sqlcode1

Active Member


Joined: 08 Apr 2010
Posts: 577
Location: USA

Posted: Mon Aug 23, 2010 8:17 pm

Garry Carroll,
I agree with you, and yes OP can't use the method I have described above.

From his sample input and expected output, he is looking for unpaired records from File1 and File2 along with paired F1 records. If he has duplicates in File1 (010110012310940), then he wants all the dups in the output. In other words, he wants a record from File2 only if it's not present in File1.

Gopalakrishnan V
Could you also give us your DFSort Function level?

Thanks,
dneufarth

Active User


Joined: 27 Apr 2005
Posts: 420
Location: Inside the SPEW (Southwest Ohio, USA)

Posted: Mon Aug 23, 2010 9:05 pm

Help me understand the OP's examples. It's been a while since I've slept.

I see no dups in 1st 15 characters except 1st rec in each file. Shouldn't all records with no dups be in output?

EQUALS will get the 1st rec.
sqlcode1

Active Member


Joined: 08 Apr 2010
Posts: 577
Location: USA

Posted: Mon Aug 23, 2010 9:14 pm

dneufarth,

Quote:
I see no dups in 1st 15 characters except 1st rec in each file. Shouldn't all records with no dups be in output?

Yes, agreed. Maybe I need some sleep.

Thanks,
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Mon Aug 23, 2010 11:32 pm

Quote:
If there are duplicates, the record should be taken from Y000263.RMSFEED.FILE1 only.


You probably just need to add the following to your SYSIN:

Code:

    OPTION EQUALS


That will tell DFSORT to keep the first record with each duplicate. Since you have Y000263.RMSFEED.FILE1 first in the concatenation, that should do it.
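
For illustration, here is how the original job might look with that one statement added. This is only a sketch: the data set names and allocation parameters are copied from the job posted at the top of the thread, and it assumes both inputs are FB with the 15-byte key in columns 1-15.

```jcl
//SORT1    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//* RMSFEED must stay first in the concatenation so that its
//* record is the first one seen for any duplicate key.
//SORTIN   DD DSN=Y000263.RMSFEED.FILE1,DISP=SHR
//         DD DSN=Y000263.REJECT.FILE1,DISP=SHR
//SORTOUT  DD DSN=Y000263.QTRAN.RTK,
//            UNIT=DATA,DISP=(,CATLG,DELETE),
//            SPACE=(CYL,(500,50),RLSE)
//SYSIN    DD *
* EQUALS preserves the input order of records with equal keys
  OPTION EQUALS
* Sort on the 15-byte character key, ascending
  SORT FIELDS=(1,15,CH,A)
* Keep only the first record for each key value
  SUM FIELDS=NONE
/*
```

With the sample data above, the record kept for key 010110011025247 would be the XXXXXX one from Y000263.RMSFEED.FILE1, because that data set comes first in the concatenation.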

Alternatively, you could use the SELECT operator of DFSORT's ICETOOL with FIRST to do this - SELECT uses EQUALS automatically:

publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/ICE1CA40/6.12?DT=20090527161936
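
A minimal sketch of what such an ICETOOL step could look like follows. The ddnames IN and OUT are arbitrary names chosen for this example, and the data set names and allocation parameters are again taken from the original job; adjust them for your site.

```jcl
//S1       EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//* RMSFEED first, so its record wins for any duplicate key
//IN       DD DSN=Y000263.RMSFEED.FILE1,DISP=SHR
//         DD DSN=Y000263.REJECT.FILE1,DISP=SHR
//OUT      DD DSN=Y000263.QTRAN.RTK,
//            UNIT=DATA,DISP=(,CATLG,DELETE),
//            SPACE=(CYL,(500,50),RLSE)
//TOOLIN   DD *
* Sort on the 15-byte key and keep the first record per key
  SELECT FROM(IN) TO(OUT) ON(1,15,CH) FIRST
/*
```

SELECT sorts the input on the ON field, so the output is in key order just as with the SORT/SUM version.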
Gopalakrishnan V

Active User


Joined: 28 Jun 2010
Posts: 102
Location: chennai

Posted: Tue Aug 24, 2010 9:56 am

Hi,
A COBOL program reads y000263.rmsfeed.file1 as input; any invalid record it finds is written to y000263.reject.file1.

The next day, the reject file is corrected by a support team and merged with y000263.rmsfeed.file1.

The main requirement: if a record is missing from y000263.rmsfeed.file1, we have to take it from y000263.reject.file1, if available.
Otherwise, if there are duplicates, we should drop the y000263.reject.file1 record.


LRECL=60, RECFM=FB
Garry Carroll

Senior Member


Joined: 08 May 2006
Posts: 1193
Location: Dublin, Ireland

Posted: Tue Aug 24, 2010 12:14 pm

Quote:
A COBOL program reads y000263.rmsfeed.file1 as input; any invalid record it finds is written to y000263.reject.file1.

This suggests that all records are still in y000263.rmsfeed.file1 and rejects are in y000263.reject.file1. Have the rejects been removed from y000263.rmsfeed.file1?

Quote:
The next day, the reject file is corrected by a support team and merged with y000263.rmsfeed.file1.


Is it not the function of "some support team" to ensure that the corrected reject records are properly applied to y000263.rmsfeed.file1 ?

Quote:
The main requirement: if a record is missing from y000263.rmsfeed.file1, we have to take it from y000263.reject.file1, if available.


Won't any such reject records be rejected again?

Garry.
dneufarth

Active User


Joined: 27 Apr 2005
Posts: 420
Location: Inside the SPEW (Southwest Ohio, USA)

Posted: Tue Aug 24, 2010 12:32 pm

Seems like this should all be in a single program that processes 'rmsfeed' while resolving the corrected record processing order and creating a 'reject' file for correction later. And then the cycle repeats at the next scheduled run.

Corrected records seem to supersede a more current feed.
dneufarth

Active User


Joined: 27 Apr 2005
Posts: 420
Location: Inside the SPEW (Southwest Ohio, USA)

Posted: Tue Aug 24, 2010 12:49 pm

I'm no SORT guru, so in plain speak

as corrected rejects are concatenated last, perhaps

BUILD a rec with seq number at end of rec

SORT the key ascending and the seq number descending

With Option Equals, the reject should always eliminate the 'rmsfeed' 'reject' records dup issue in favor of the reject

BUILD rec without seq number

now take this file into pgm that processes whatever and yields the latest reject file



search the dfsort or JCL forums for build examples
Garry Carroll

Senior Member


Joined: 08 May 2006
Posts: 1193
Location: Dublin, Ireland

Posted: Tue Aug 24, 2010 12:56 pm

Quote:
the reject should always eliminate the 'rmsfeed' 'reject' records dup issue in favor of the reject


... doesn't this contradict the OP's requirement...

Quote:
If there are duplicates, the record should be taken from Y000263.RMSFEED.FILE1 only.


Garry.
dneufarth

Active User


Joined: 27 Apr 2005
Posts: 420
Location: Inside the SPEW (Southwest Ohio, USA)

Posted: Tue Aug 24, 2010 1:07 pm

Garry,

Correct - had it backwards in my mind. Darn fine solution to get those corrected records in there first though.

Have no clue what I was thinking to bother with all that diatribe that is much ado about nothing.

sleepless in Cincinnati - got hung up on corrected records for some reason
All times are GMT + 6 Hours