
Stripping off only first occurrence of duplicate record


Jay Villaverde

New User


Joined: 08 Mar 2014
Posts: 27
Location: USA

Posted: Tue Mar 18, 2014 12:38 am

Hi. I have a file that has duplicate records only, and the requestor wants to remove only the first occurrence and leave all the other duplicates. So if there are 3 duplicate records, remove 1 and leave 2.

Can this be done using SYNCSORT? If so, what commands would I need for that?

By the way, my file only has dupes, so I really just need to drop one dupe record and leave the others. It doesn't necessarily have to be the first one.

Thanks
Rohit Umarjikar

Global Moderator


Joined: 21 Sep 2010
Posts: 3049
Location: NYC,USA

Posted: Tue Mar 18, 2014 1:06 am

Maybe you can use the approach below:

1) Take the input file and, using SYNCSORT, append a sequence number (1, 2, 3, and so on) at the end of every record, restarting the count for each unique key (a sketch of the control statements follows the example below).
2) Then add a condition to remove every record whose appended sequence number equals 1.
3) This way you remove the first entry of every duplicate group.

E.g.

Input as per #1 above:

Code:
AAAA1234.340001
AAAA1234.340002
AAAA1234.340003
BBBB1234.340001
BBBB1234.340002
BBBB1234.340003
BBBB1234.340004


Output as per #2 above:


Code:
AAAA1234.34
AAAA1234.34
BBBB1234.34
BBBB1234.34
BBBB1234.34
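
A minimal sketch of those control statements, assuming the 11-byte key and adjacent duplicates of the sample above, and a sort level that supports SEQNUM with RESTART= (dataset names are placeholders):

Code:
//STEP01   EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=INPUT.DUPES,DISP=SHR
//SORTOUT  DD DSN=OUTPUT.TRIMMED,DISP=(NEW,CATLG,DELETE)
//SYSIN    DD *
  OPTION COPY
* APPEND A 4-DIGIT COUNTER THAT RESTARTS WHEN THE KEY (1,11) CHANGES
  INREC OVERLAY=(12:SEQNUM,4,ZD,RESTART=(1,11))
* OMIT THE FIRST RECORD OF EACH KEY, THEN STRIP THE COUNTER OFF AGAIN
  OUTFIL OMIT=(12,4,ZD,EQ,1),BUILD=(1,11)
/*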


However, for other experts to help you, you need to provide all the necessary details and sample input data; otherwise no one can be of much help.
Jay Villaverde

New User


Joined: 08 Mar 2014
Posts: 27
Location: USA

Posted: Tue Mar 18, 2014 1:12 am

That would work well. How do I add such a counter at the end as in your first example?

My data looks like this. Some have just 2 dupes, some have more than 2:

Code:
0000000024343090800074433902
0000000024343090800074433902
0000000024351661261958120101
0000000024351661261958120101
0000000024352050300074377903
0000000024352050300074377903
0000000024352050300074377903
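
With the 28-byte records shown here, the same sketch would presumably become a 1,28 key with the counter overlaid at position 29:

Code:
  INREC OVERLAY=(29:SEQNUM,4,ZD,RESTART=(1,28))
  OUTFIL OMIT=(29,4,ZD,EQ,1),BUILD=(1,28)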
Jay Villaverde

New User


Joined: 08 Mar 2014
Posts: 27
Location: USA

Posted: Tue Mar 18, 2014 1:37 am

Got it to work using SUM FIELDS=NONE,XSUM
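
For reference, a minimal sketch of that job (dataset names are placeholders): SORTOUT keeps one record per key, and the records deleted by SUM land in the SORTXSUM DD, so with three dupes two of them end up there.

Code:
//STEP01   EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN=INPUT.DUPES,DISP=SHR
//SORTOUT  DD DSN=OUTPUT.ONEPERKEY,DISP=(NEW,CATLG,DELETE)
//SORTXSUM DD DSN=OUTPUT.THEREST,DISP=(NEW,CATLG,DELETE)
//SYSIN    DD *
  SORT FIELDS=(1,28,CH,A)
  SUM FIELDS=NONE,XSUM
/*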
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

Posted: Tue Mar 18, 2014 8:06 pm

Good to hear it is working - thank you for letting us know and posting your solution :)

d
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

Posted: Tue Mar 18, 2014 8:22 pm

What is "working" is not what is described in the question.

This will retain one record per key, whatever the number of duplicates. The record retained depends on EQUALS (the first) or NOEQUALS (can't predict which), and the discarded records will be written to the XSUM DD.

hailashwin has the correct approach. A sequence number with a RESTART= for the key in question, then OUTFIL OMIT=(...) where the sequence number is one.

There is your manual, and there are examples here.
Jay Villaverde

New User


Joined: 08 Mar 2014
Posts: 27
Location: USA

Posted: Tue Mar 18, 2014 8:28 pm

Sorry, not following, because XSUM did give me what I needed. As I stated in my question, it didn't necessarily have to be the first record dropped with the others kept. I just needed to drop one dupe and keep the rest in a separate file. XSUM achieved that for me.

Yes, the other approach would have worked as well, but XSUM was quicker and easier for me.

Regards
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

Posted: Tue Mar 18, 2014 9:35 pm

I see now that your question says both, and doesn't say anything about needing to keep the records which have been dropped.

However, you show one instance where you have three duplicates, so two will be dropped. Doesn't fit the "only one" from any interpretation of your question.

Using SUM with XSUM, are you SORTing the records? Were you SORTing them anyway?

I suppose it may take up to a minute to code differently, but you'll save many, many minutes by not having to SORT the file.

To collect together the dropped records by the other method suggested, you'd just need a second OUTFIL with SAVE. There would be no duplicate keys on that file, unlike your XSUM file.
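
A sketch of that variant, reusing the SEQNUM statements above (the FNAMES and their matching DD statements are placeholders):

Code:
  OPTION COPY
  INREC OVERLAY=(29:SEQNUM,4,ZD,RESTART=(1,28))
* KEEP EVERYTHING EXCEPT THE FIRST RECORD OF EACH KEY
  OUTFIL FNAMES=KEPT,OMIT=(29,4,ZD,EQ,1),BUILD=(1,28)
* SAVE COLLECTS WHATEVER NO OTHER OUTFIL WROTE - ONE RECORD PER KEY
  OUTFIL FNAMES=DROPPED,SAVE,BUILD=(1,28)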

Looking at the sample data you have shown, it is irrelevant which record is dropped, because the duplicates are identical to each other.

If you are happy with XSUM, fine, just don't pretend it satisfied what you asked.