Removing duplicates using icetool without sorting the file

 
IBMMAINFRAMES.com Support Forums -> DFSORT/ICETOOL
jebbin

New User


Joined: 13 Nov 2009
Posts: 6
Location: Bangalore

Posted: Fri Nov 13, 2009 4:40 pm    Post subject: Removing duplicates using icetool without sorting the file

I need to remove records that have the same value in a particular position, based on the value in another field. For example, if two records have "AAAAAAAA" in the first 8 bytes, I need to remove the record that has the greater value in the 20th byte. I also need to update the trailer count to reflect how many records are left in the output file. Would this be possible without sorting the file, i.e. when the records with "AAAAAAAA" in the first 8 bytes are not grouped together?

enrico-sorichetti

Global Moderator


Joined: 14 Mar 2007
Posts: 10276
Location: italy

Posted: Fri Nov 13, 2009 4:45 pm    Post subject: Reply to: Removing duplicates using icetool without sorting

If the unsorted sequence is so important that it must not be disturbed by a sort, how will you define which one of the duplicates should be kept?
expat

Global Moderator


Joined: 14 Mar 2007
Posts: 8593
Location: Back in jolly old England

Posted: Fri Nov 13, 2009 4:55 pm

Because the solution for sort related questions may vary from product to product, please ensure that you state clearly which sort product you are using.

If you are not sure, you can find out for yourself by running the simple sort step shown below.

If the messages start with ICE then your product is DFSORT. Please also post the complete line that has message code ICE201I, as this will enable our DFSORT experts to determine which release of DFSORT you have installed. This may also affect the solution offered.

If the messages start with WER then the product is SYNCSORT and the question should be posted in the JCL forum. Please also state which version of SYNCSORT is installed, as this may also affect the solution offered.

Thank you for taking the time to ensure that the valuable time of others is not wasted on solutions that are not relevant to the sort product being used and/or the release that is installed at your site.

Code:
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD *
ABC
//SORTOUT  DD SYSOUT=*
//SYSIN    DD *
  SORT     FIELDS=COPY
jebbin

New User


Joined: 13 Nov 2009
Posts: 6
Location: Bangalore

Posted: Fri Nov 13, 2009 5:00 pm    Post subject: Reply to: Removing duplicates using icetool without sorting

The file is sorted on other criteria, and that order is important not to disturb. However, if we do find duplicates, I want to keep the first occurrence of the record and delete the rest.
jebbin

New User


Joined: 13 Nov 2009
Posts: 6
Location: Bangalore

Posted: Fri Nov 13, 2009 5:59 pm    Post subject: Reply to: Removing duplicates using icetool without sorting

The messages start with ICE. The line you requested is:

Code:
ICE201I F RECORD TYPE IS F - DATA STARTS IN POSITION 1
Frank Yaeger

DFSORT Moderator


Joined: 15 Feb 2005
Posts: 7130
Location: San Jose, CA

Posted: Fri Nov 13, 2009 10:39 pm

jebbin,

Please show an example of the records in your input file (relevant fields only) and what you expect for output. Explain the "rules" for getting from input to output. Give the starting position, length and format of each relevant field. Give the RECFM and LRECL of the input file.
vasanthz

Global Moderator


Joined: 28 Aug 2007
Posts: 1506
Location: Chennai

Posted: Fri Nov 13, 2009 10:41 pm

Hi,

If you could show what the file looks like, with a sample input and output test file, then I think this can be achieved.
Quote:

The file is sorted based on another criteria

Not a problem

First we can add sequence numbers to all records,
then sort on the fields that need duplicate elimination and drop the duplicates,
then sort the records back into the original order on the sequence number added earlier.
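
The three steps above can be sketched outside DFSORT to show the logic. This is only a Python illustration under assumptions, not a DFSORT or ICETOOL job: the function name `dedupe_keep_order`, its parameters, and the sample records are hypothetical, and positions are 1-based as in sort control statements.

```python
def dedupe_keep_order(records, key_start, key_len):
    """Drop duplicate keys, keep the first occurrence, preserve input order."""
    lo = key_start - 1          # 1-based position -> 0-based index
    hi = lo + key_len

    # Step 1: tag every record with its original sequence number.
    tagged = list(enumerate(records))

    # Step 2: sort on the key field. Python's sort is stable, so records
    # with equal keys stay in sequence-number order.
    tagged.sort(key=lambda pair: pair[1][lo:hi])

    # Keep only the first record in each run of equal keys.
    kept = []
    previous_key = None
    for seq, rec in tagged:
        key = rec[lo:hi]
        if key != previous_key:
            kept.append((seq, rec))
            previous_key = key

    # Step 3: sort back into the original order on the sequence number.
    kept.sort(key=lambda pair: pair[0])
    return [rec for _, rec in kept]


# Hypothetical records: the key is the first 8 bytes.
sample = ["AAAAAAAA 1", "BBBBBBBB 2", "AAAAAAAA 3", "CCCCCCCC 4", "BBBBBBBB 5"]
result = dedupe_keep_order(sample, key_start=1, key_len=8)
```

The sketch relies on the sort being stable so that "first" means first in the original file; a sort product would need the equivalent guarantee for records with equal keys.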
vasanthz

Global Moderator


Joined: 28 Aug 2007
Posts: 1506
Location: Chennai

Posted: Fri Nov 13, 2009 10:44 pm

He he he, you beat me to the reply with the same comments.

Although Posts: 169 is nowhere near Posts: 5667,
I can't resist saying "Wise men think alike".
Frank Yaeger

DFSORT Moderator


Joined: 15 Feb 2005
Posts: 7130
Location: San Jose, CA

Posted: Fri Nov 13, 2009 11:16 pm

I always preferred:

Great minds "sink" in the same channels.
jebbin

New User


Joined: 13 Nov 2009
Posts: 6
Location: Bangalore

Posted: Mon Nov 16, 2009 12:26 pm    Post subject: Reply to: Removing duplicates using icetool without sorting

Hi, the following is sample data from my file:

Code:
00000UHL1091109JBHCF
1000167578932000003896161000000000000200704041101103
1000167578932000001424032000000000000200612011101001
1000567578932000001628561000020061203200612011101103
1000567578932000001722282000020061202200612011101112
1000567578932000001424032000020060406200604061101103
1000567578932000001430882000020060406200604061101005
99999UTL100000006


The relevant fields that make a record a duplicate are in positions 6 to 25, so it would be SORT FIELDS=(6,20,CH,A). The first 19 characters of these are alphanumeric and the last one is numeric, but for identifying duplicates they can be considered together as CH. The first occurrence of a record that has a duplicate should be kept and the rest deleted.

The header can be identified either by the first 5 characters = 00000 or by the next 4 = UHL1, and the trailer by the first 5 characters = 99999 or by the next 4 = UTL1. The RECFM is FB and the LRECL is 60. Please let me know if you need any further information.
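
To make the rules above concrete, here is a minimal Python sketch of the logic only; it is not a DFSORT solution. It keeps a set of keys already seen, so no sort is involved and the input order is untouched. Two things are assumptions inferred from the sample rather than stated in the thread: the trailer count is taken to be the 8 digits in positions 10 to 17, and it is taken to count data records only. The function name `dedupe_file` is hypothetical.

```python
def dedupe_file(lines):
    """Single pass: header, first occurrence of each key, rebuilt trailer."""
    seen = set()
    out = []
    trailer = None
    data_count = 0
    for line in lines:
        record_type = line[0:5]
        if record_type == "00000":        # header: pass through unchanged
            out.append(line)
        elif record_type == "99999":      # trailer: hold until count is known
            trailer = line
        else:
            key = line[5:25]              # positions 6-25, compared as characters
            if key not in seen:           # first occurrence wins
                seen.add(key)
                out.append(line)
                data_count += 1
    if trailer is not None:
        # Assumption: positions 10-17 hold the data record count.
        out.append(trailer[:9] + f"{data_count:08d}" + trailer[17:])
    return out


sample_lines = [
    "00000UHL1091109JBHCF",
    "1000167578932000003896161000000000000200704041101103",
    "1000167578932000001424032000000000000200612011101001",
    "1000567578932000001628561000020061203200612011101103",
    "1000567578932000001722282000020061202200612011101112",
    "1000567578932000001424032000020060406200604061101103",
    "1000567578932000001430882000020060406200604061101005",
    "99999UTL100000006",
]
result = dedupe_file(sample_lines)
```

In the sample, the second and fifth data records carry the same value in positions 6 to 25, so under these assumptions the fifth is dropped and the trailer count becomes 00000005. The set holds every distinct key in memory, which is the trade-off for avoiding the sort.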
Frank Yaeger

DFSORT Moderator


Joined: 15 Feb 2005
Posts: 7130
Location: San Jose, CA

Posted: Tue Nov 17, 2009 12:47 am

Please show what you expect for the output records.
All times are GMT + 6 Hours