remove duplicates and merge two files

 
IBMMAINFRAMES.com Support Forums -> DFSORT/ICETOOL
Gopalakrishnan V

Active User


Joined: 28 Jun 2010
Posts: 102
Location: chennai

PostPosted: Mon Aug 23, 2010 6:35 pm    Post subject: remove duplicates and merge two files

Y000263.RMSFEED.FILE1 :

010110011025247 XXXXXX
010110012300940 051461
010110012310940 111111

Y000263.REJECT.FILE1:

010110011025247 YYYYYY
010110011026855 051466

EXPECTED RESULT:
010110011025247 XXXXXX
010110011026855 051466
010110012300940 051461
010110012310940 111111

MY CODING:
Code:

//SORT1    EXEC PGM=SORT                         
//SORTIN   DD DSN=Y000263.RMSFEED.FILE1,DISP=SHR
//         DD DSN=Y000263.REJECT.FILE1,DISP=SHR 
//SORTOUT  DD DSN=Y000263.QTRAN.RTK,             
//            UNIT=DATA,DISP=(,CATLG,DELETE),   
//            SPACE=(CYL,(500,50),RLSE)
//SYSOUT   DD SYSOUT=*                           
//SYSIN    DD *                                 
   SORT FIELDS=(1,15,CH,A)                       
   SUM FIELDS=NONE                               
/*                                               


CURRENT OUTPUT:
010110011025247 YYYYYY
010110011026855 051466
010110012300940 051461
010110012310940 111111

If there are duplicates, the record should be taken from Y000263.RMSFEED.FILE1 only. The first 15 digits are the key for both files. If there is another way to do this, please let me know.

Robert Sample

Global Moderator


Joined: 06 Jun 2008
Posts: 7913
Location: Bellevue, IA

PostPosted: Mon Aug 23, 2010 6:49 pm    Post subject:

Why is this in the COBOL forum? If you use SYNCSORT, it should be in the JCL forum; if you use DFSORT, it should be in the SORT forum.
sqlcode1

Active Member


Joined: 08 Apr 2010
Posts: 578
Location: USA

PostPosted: Mon Aug 23, 2010 7:19 pm    Post subject:

Gopalakrishnan V,
What is the LRECL and RECFM of the input file(s)?

Also, why couldn't you feed just 1 input file (Y000263.RMSFEED.FILE1) to this step, remove duplicates and then later in the next step/program concatenate both the files?

Thanks,
Garry Carroll

Active Member


Joined: 08 May 2006
Posts: 990
Location: Dublin, Ireland / Edinburgh, Scotland

PostPosted: Mon Aug 23, 2010 7:48 pm    Post subject:

Quote:
Also, why couldn't you feed just 1 input file (Y000263.RMSFEED.FILE1) to this step, remove duplicates and then later in the next step/program concatenate both the files?


Wouldn't the fact that the duplicate to be dropped is in the second file not be significant?

The OP hasn't told us whether there can be duplicates in the first file or whether these keys are unique. If there can be duplicates in File1, is only the first to be kept?

Garry.
sqlcode1

Active Member


Joined: 08 Apr 2010
Posts: 578
Location: USA

PostPosted: Mon Aug 23, 2010 8:17 pm    Post subject:

Garry Carroll,
I agree with you, and yes OP can't use the method I have described above.

From his sample input and expected output, he is looking for unpaired records from File1 and File2 along with paired File1 records. If he has duplicates in File1 (010110012310940), then he wants all the dups in the output. In other words, he wants records from File2 only if they are not present in File1.

Gopalakrishnan V,
Could you also give us your DFSORT function level?

Thanks,
dneufarth

Active User


Joined: 27 Apr 2005
Posts: 236
Location: Cincinnati OH USA

PostPosted: Mon Aug 23, 2010 9:05 pm    Post subject:

Help me understand the OP's examples. It's been a while since I've slept.

I see no dups in 1st 15 characters except 1st rec in each file. Shouldn't all records with no dups be in output?

EQUALS will get the 1st rec.
sqlcode1

Active Member


Joined: 08 Apr 2010
Posts: 578
Location: USA

PostPosted: Mon Aug 23, 2010 9:14 pm    Post subject:

dneufarth,

Quote:
I see no dups in 1st 15 characters except 1st rec in each file. Shouldn't all records with no dups be in output?

Yes, agreed. Maybe I need some sleep.

Thanks,
Frank Yaeger

DFSORT Moderator


Joined: 15 Feb 2005
Posts: 7130
Location: San Jose, CA

PostPosted: Mon Aug 23, 2010 11:32 pm    Post subject:

Quote:
If there are duplicates, the record should be taken from Y000263.RMSFEED.FILE1 only.


You probably just need to add the following to your SYSIN:

Code:

    OPTION EQUALS


That will tell DFSORT to keep the first record with each duplicate. Since you have Y000263.RMSFEED.FILE1 first in the concatenation, that should do it.
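
As a minimal sketch, the complete SYSIN from the original post with that one statement added might look like the following. The SORT and SUM statements are copied unchanged from the question; only OPTION EQUALS is new, and the rest of the job is assumed to stay as posted.

```jcl
//SYSIN    DD *
* EQUALS preserves the input order of records with equal keys,
* so SUM FIELDS=NONE keeps the first record read for each key.
   OPTION EQUALS
   SORT FIELDS=(1,15,CH,A)
   SUM FIELDS=NONE
/*
```

With the sample data, the 010110011025247 record from Y000263.RMSFEED.FILE1 (XXXXXX) is read before the one from Y000263.REJECT.FILE1 (YYYYYY), so it is the one that survives.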

Alternatively, you could use the SELECT operator of DFSORT's ICETOOL with FIRST to do this - SELECT uses EQUALS automatically:

http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/ICE1CA40/6.12?DT=20090527161936
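
A rough sketch of what the SELECT approach could look like for these data sets is shown below. The step name S1 and the DD names IN and OUT are illustrative assumptions, not names from the original job; the output DD attributes are copied from the SORTOUT DD in the question.

```jcl
//S1       EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//* RMSFEED is first in the concatenation so its records win ties
//IN       DD DSN=Y000263.RMSFEED.FILE1,DISP=SHR
//         DD DSN=Y000263.REJECT.FILE1,DISP=SHR
//OUT      DD DSN=Y000263.QTRAN.RTK,
//            UNIT=DATA,DISP=(,CATLG,DELETE),
//            SPACE=(CYL,(500,50),RLSE)
//TOOLIN   DD *
* Keep only the first record for each 15-byte key
SELECT FROM(IN) TO(OUT) ON(1,15,CH) FIRST
/*
```

As with the SORT step, this relies on Y000263.RMSFEED.FILE1 being first in the concatenation, so that its record is the first one seen for any duplicated key.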
Gopalakrishnan V

Active User


Joined: 28 Jun 2010
Posts: 102
Location: chennai

PostPosted: Tue Aug 24, 2010 9:56 am    Post subject:

Hi,
In a COBOL program I read y000263.rmsfeed.file1 as input; any invalid record found by that program is written to y000263.reject.file1.

The next day, the reject file is corrected by a support team and merged with y000263.rmsfeed.file1.

The main thing is that if a record is missing from y000263.rmsfeed.file1, we have to take it from y000263.reject.file1 if it is available there.
Otherwise, if there are duplicates, we should remove the y000263.reject.file1 record.


LRECL=60, RECFM=FB
Garry Carroll

Active Member


Joined: 08 May 2006
Posts: 990
Location: Dublin, Ireland / Edinburgh, Scotland

PostPosted: Tue Aug 24, 2010 12:14 pm    Post subject:

Quote:
In a COBOL program I read y000263.rmsfeed.file1 as input; any invalid record found by that program is written to y000263.reject.file1.

This suggests that all records are still in y000263.rmsfeed.file1 and rejects are in y000263.reject.file1. Have the rejects been removed from y000263.rmsfeed.file1?

Quote:
The next day, the reject file is corrected by a support team and merged with y000263.rmsfeed.file1.


Is it not the function of "some support team" to ensure that the corrected reject records are properly applied to y000263.rmsfeed.file1 ?

Quote:
The main thing is that if a record is missing from y000263.rmsfeed.file1, we have to take it from y000263.reject.file1 if it is available there.


Won't any such reject records be rejected again?

Garry.
dneufarth

Active User


Joined: 27 Apr 2005
Posts: 236
Location: Cincinnati OH USA

PostPosted: Tue Aug 24, 2010 12:32 pm    Post subject:

Seems like this should all be in a single program that processes 'rmsfeed' while resolving the processing order of corrected records and creating a 'reject' file for correction later. The cycle then repeats on the next scheduled run.

Corrected records seem to supersede a more current feed.
dneufarth

Active User


Joined: 27 Apr 2005
Posts: 236
Location: Cincinnati OH USA

PostPosted: Tue Aug 24, 2010 12:49 pm    Post subject:

I'm no SORT guru, so in plain speak

as corrected rejects are concatenated last, perhaps

BUILD a rec with seq number at end of rec

SORT the key ascending and the seq number descending

With OPTION EQUALS, the reject should always eliminate the 'rmsfeed' 'reject' records dup issue in favor of the reject

BUILD rec without seq number

now take this file into pgm that processes whatever and yields the latest reject file



search the dfsort or JCL forums for build examples
Garry Carroll

Active Member


Joined: 08 May 2006
Posts: 990
Location: Dublin, Ireland / Edinburgh, Scotland

PostPosted: Tue Aug 24, 2010 12:56 pm    Post subject:

Quote:
the reject should always eliminate the 'rmsfeed' 'reject' records dup issue in favor of the reject


... doesn't this contradict the OP's requirement...

Quote:
If there are duplicates, the record should be taken from Y000263.RMSFEED.FILE1 only.


Garry.
dneufarth

Active User


Joined: 27 Apr 2005
Posts: 236
Location: Cincinnati OH USA

PostPosted: Tue Aug 24, 2010 1:07 pm    Post subject:

Garry,

Correct - had it backwards in my mind. Darn fine solution to get those corrected records in there first though.

Have no clue what I was thinking to bother with all that diatribe that is much ado about nothing.

sleepless in Cincinnati - got hung up on corrected records for some reason
All times are GMT + 6 Hours