Removing duplicates using ICETOOL without sorting the file


jebbin

New User


Joined: 13 Nov 2009
Posts: 6
Location: Bangalore

Posted: Fri Nov 13, 2009 4:40 pm

I need to remove records that have the same value in a particular position, based on the value in another field. For example, if two records have "AAAAAAAA" in the first 8 bytes, I need to remove the record that has the greater value in the 20th byte. I also need to update the trailer count depending on how many records are left in the output file. Would this be possible without sorting the file, i.e. when the records with "AAAAAAAA" in the first 8 bytes are not grouped together?
enrico-sorichetti

Superior Member


Joined: 14 Mar 2007
Posts: 10873
Location: italy

Posted: Fri Nov 13, 2009 4:45 pm

If the unordered sequence is so important that it must not be disturbed by a sort,
how will you define which one of the duplicates should be kept?
expat

Global Moderator


Joined: 14 Mar 2007
Posts: 8797
Location: Welsh Wales

Posted: Fri Nov 13, 2009 4:55 pm

Because the solution for sort related questions may vary from product to product, please ensure that you state clearly which sort product you are using.

If you are not sure, you can find out for yourself by running the simple sort step shown below.

If the messages start with ICE then your product is DFSORT. Please also post the complete line of output that has the message code ICE201I, as this will enable our DFSORT experts to determine which release of DFSORT you have installed. This may also affect the solution offered.

If the messages start with WER then the product is SYNCSORT and should be posted in the JCL forum. Please also post the information telling which version of SYNCSORT is installed, as this may also affect the solution offered.

Thank you for taking the time to ensure that the valuable time of others is not wasted by offering inappropriate solutions which are not relevant due to the sort product being used and/or the release that is installed at your site.

Code:
//SORTSTEP EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD *
ABC
//SORTOUT  DD SYSOUT=*
//SYSIN    DD *
  SORT     FIELDS=COPY
jebbin

New User


Joined: 13 Nov 2009
Posts: 6
Location: Bangalore

Posted: Fri Nov 13, 2009 5:00 pm

The file is sorted on other criteria, and it is important not to disturb that order. However, if we do find duplicates, I want to keep the first occurrence of the record and delete the rest.
jebbin

New User


Joined: 13 Nov 2009
Posts: 6
Location: Bangalore

Posted: Fri Nov 13, 2009 5:59 pm

The messages start with ICE. The line you requested is:

Code:
ICE201I F RECORD TYPE IS F - DATA STARTS IN POSITION 1
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Fri Nov 13, 2009 10:39 pm

jebbin,

Please show an example of the records in your input file (relevant fields only) and what you expect for output. Explain the "rules" for getting from input to output. Give the starting position, length and format of each relevant field. Give the RECFM and LRECL of the input file.
vasanthz

Global Moderator


Joined: 28 Aug 2007
Posts: 1742
Location: Tirupur, India

Posted: Fri Nov 13, 2009 10:41 pm

Hi,

If you could show what the file looks like (a sample input and output test file),
then I think this can be achieved.
Quote:

The file is sorted based on another criteria

Not a problem

First we can add sequence numbers to all records,
then sort on the fields that need duplicate elimination and drop the duplicates,
and finally sort the records back to the original order based on the sequence numbers added earlier.
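
The three steps above can be sketched outside DFSORT to make the logic concrete. The Python below only illustrates the idea and is not a DFSORT or ICETOOL job. The function name dedupe_keep_order and the key_start/key_len parameters are hypothetical, and the default key position is an assumption, since no record layout has been given at this point in the thread.

```python
def dedupe_keep_order(records, key_start=5, key_len=20):
    """Drop later duplicates of a key while preserving the original order.

    key_start is a 0-based offset; both parameters are illustrative
    assumptions, not values confirmed by the thread at this point.
    """
    # Step 1: tag every record with a sequence number (its original position).
    numbered = list(enumerate(records))

    # Step 2: sort on the duplicate key. The sequence number breaks ties,
    # so the earliest occurrence of each key comes first in its group.
    numbered.sort(key=lambda pair: (pair[1][key_start:key_start + key_len], pair[0]))

    # Step 3: keep only the first record of each run of equal keys.
    kept = []
    previous_key = None
    for seq, rec in numbered:
        key = rec[key_start:key_start + key_len]
        if key != previous_key:
            kept.append((seq, rec))
            previous_key = key

    # Step 4: sort back on the sequence number to restore the original order.
    kept.sort(key=lambda pair: pair[0])
    return [rec for _, rec in kept]


# Toy example: the key is the first character of each record.
print(dedupe_keep_order(["B1", "A1", "B2", "A2", "C1"], key_start=0, key_len=1))
# prints ['B1', 'A1', 'C1']
```

The sequence number does two jobs here: it decides which duplicate counts as "first", and it lets the surviving records be put back in their original order.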
vasanthz

Global Moderator


Joined: 28 Aug 2007
Posts: 1742
Location: Tirupur, India

Posted: Fri Nov 13, 2009 10:44 pm

He he he... you beat me to the reply, with the same comments. :)

Although Posts: 169 is nowhere near Posts: 5667,
I can't resist saying "Wise men think alike". ;)
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Fri Nov 13, 2009 11:16 pm

I always preferred:

Great minds "sink" in the same channels. :D
jebbin

New User


Joined: 13 Nov 2009
Posts: 6
Location: Bangalore

Posted: Mon Nov 16, 2009 12:26 pm

Hi, the following is sample data from my file:

Code:
00000UHL1091109JBHCF
1000167578932000003896161000000000000200704041101103
1000167578932000001424032000000000000200612011101001
1000567578932000001628561000020061203200612011101103
1000567578932000001722282000020061202200612011101112
1000567578932000001424032000020060406200604061101103
1000567578932000001430882000020060406200604061101005
99999UTL100000006


The relevant fields that make a record a duplicate are in positions 6 to 25, so it would be SORT FIELDS=(6,20,CH). The first 19 characters of this key are alphanumeric and the last one is numeric, but for identifying duplicates they can be treated together as CH. The first occurrence of a record that has a duplicate should be kept and the rest deleted. The header can be identified either by the first 5 characters = 00000 or the next 4 = UHL1, and the trailer by the first 5 characters = 99999 or the next 4 = UTL1. RECFM=FB and LRECL=60. Please let me know if you need any further information.
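
As a way to pin down the rules just described, here is a minimal Python sketch applied to the sample data. It is one possible reading of the requirements, not a DFSORT solution and not output confirmed by the poster. It assumes that the file has exactly one header (first record) and one trailer (last record), that the trailer count is the 8-digit field in positions 10 to 17, and that it counts data records only (the sample trailer shows 6 for six data records). The name dedupe_with_trailer is hypothetical.

```python
SAMPLE = [
    "00000UHL1091109JBHCF",
    "1000167578932000003896161000000000000200704041101103",
    "1000167578932000001424032000000000000200612011101001",
    "1000567578932000001628561000020061203200612011101103",
    "1000567578932000001722282000020061202200612011101112",
    "1000567578932000001424032000020060406200604061101103",
    "1000567578932000001430882000020060406200604061101005",
    "99999UTL100000006",
]


def dedupe_with_trailer(lines):
    """Keep the first record for each key in positions 6-25 and fix the trailer count.

    Assumes one header line first and one trailer line last (an assumption,
    not something the thread states explicitly).
    """
    header, *data, trailer = lines

    seen = set()
    kept = []
    for rec in data:
        key = rec[5:25]  # positions 6-25, 1-based, as 0-based slice
        if key not in seen:
            seen.add(key)
            kept.append(rec)  # first occurrence wins; input order is untouched

    # Assumed trailer layout: 9-byte identifier, then an 8-digit record count.
    new_trailer = trailer[:9] + f"{len(kept):08d}" + trailer[17:]
    return [header, *kept, new_trailer]


result = dedupe_with_trailer(SAMPLE)
print(len(result) - 2)  # data records kept: prints 5
print(result[-1])       # prints 99999UTL100000005
```

Under these assumptions, the fifth data record is dropped because its key in positions 6 to 25 matches that of the second data record, and the trailer count changes from 6 to 5.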
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Tue Nov 17, 2009 12:47 am

Please show what you expect for the output records.
All times are GMT + 6 Hours