Portal | Manuals | References | Downloads | Info | Programs | JCLs | Master the Mainframes
IBM Mainframe Computers Forums Index
 
Register
 
IBM Mainframe Computers Forums Index Mainframe: Search IBM Mainframe Forum: FAQ Memberlist Usergroups Profile Log in to check your private messages Log in
 

 

eliminate duplicates for certain condtions

 
Post new topic   Reply to topic    IBMMAINFRAMES.com Support Forums -> DFSORT/ICETOOL
View previous topic :: :: View next topic  
Author Message
ayushT

New User


Joined: 01 Oct 2007
Posts: 23
Location: india

PostPosted: Sun Jul 25, 2010 12:51 pm    Post subject: eliminate duplicates for certain condtions
Reply with quote

Hi,

I have a file containing having two columns. Column 1 has duplicates. I need to filter out records if Column 2 is greater than 999999.

Input file

Column1Coulmn2
abcdefgh123456
abcdefgh123451
abcdefgh1234567
zaqwersd9876546
zaqwersd9876541
zaqwersd987654

Output file
abcdefgh123456
abcdefgh123451
zaqwersd987654
Back to top
View user's profile Send private message

gcicchet

Senior Member


Joined: 28 Jul 2006
Posts: 1703
Location: Australia

PostPosted: Sun Jul 25, 2010 1:34 pm    Post subject:
Reply with quote

Hi,

I'm not sure the significance of duplicate colums, but here is a way of omitting records

Code:
//STEP0100 EXEC PGM=SORT                     
//SYSOUT   DD SYSOUT=*                       
//SORTIN   DD *                             
ABCDEFGH123456                               
ABCDEFGH123451                               
ABCDEFGH1234567                             
ZAQWERSD9876546                             
ZAQWERSD9876541                             
ZAQWERSD987654                               
/*                                           
//SORTOUT  DD SYSOUT=*                       
//SYSIN    DD *                             
  SORT FIELDS=COPY                           
  OMIT COND=(9,7,UFF,GT,999999)             
/*                                           



Gerry
Back to top
View user's profile Send private message
ayushT

New User


Joined: 01 Oct 2007
Posts: 23
Location: india

PostPosted: Mon Jul 26, 2010 7:38 am    Post subject: Reply to: eliminate duplicates for certain condtions
Reply with quote

Hi Gerry,

Thank you for your reply. This would work if I wanted to removed all records where column 2 > 999999, but in my case I want to eliminate only if there are records with duplicates on column1. I want to keep a record with column2 > 999999 if there is only one instance of column1.

I think the example I put before might have misled you. Here's the latest

Column1Coulmn2
abcdefgh123456
abcdefgh123451
abcdefgh1234567
zaqwersd9876546
zaqwersd9876541
zaqwersd987654
wqewqrw8765432

Output file
abcdefgh123451
abcdefgh123456
wqewqrw8765432
zaqwersd987654

Thanks,
Ayush
Back to top
View user's profile Send private message
dick scherrer

Site Director


Joined: 23 Nov 2006
Posts: 19270
Location: Inside the Matrix

PostPosted: Mon Jul 26, 2010 7:51 am    Post subject:
Reply with quote

Hello,

Suggest you lose the "alphabet soup" and post something more usable/readable using some proper spacing and the "Code" tag. If your actual data is sensitive, make up some obvious values.

Like:
Code:
FRED     12345
FRED     123456
MARY     5656565
MARY     5656565
JIM      9876543
JIM      1234
JIM      98765


Make enough data to show different situations and how you want the output from the new input.
Back to top
View user's profile Send private message
sqlcode1

Active Member


Joined: 08 Apr 2010
Posts: 578
Location: USA

PostPosted: Mon Jul 26, 2010 8:04 am    Post subject:
Reply with quote

Ayush,
What is the LRECL of input file? FB or VB? What are the data types for column1 and column2?

Thanks,
Back to top
View user's profile Send private message
Gabriel Araujo Alves

New User


Joined: 20 Jul 2010
Posts: 38
Location: Brazil

PostPosted: Mon Jul 26, 2010 8:46 pm    Post subject: Re: Reply to: eliminate duplicates for certain condtions
Reply with quote

ayushT wrote:
Hi Gerry,

Thank you for your reply. This would work if I wanted to removed all records where column 2 > 999999, but in my case I want to eliminate only if there are records with duplicates on column1. I want to keep a record with column2 > 999999 if there is only one instance of column1.

I think the example I put before might have misled you. Here's the latest

Column1Coulmn2
abcdefgh123456
abcdefgh123451
abcdefgh1234567
zaqwersd9876546
zaqwersd9876541
zaqwersd987654
wqewqrw8765432

Output file
abcdefgh123451
abcdefgh123456
wqewqrw8765432
zaqwersd987654

Thanks,
Ayush



Hi,
first you need separe the duplicated records in conditions of first column, and before this in another step you remove all records(duplicated) with column 2 greater 999999, like gcicchet do. if you don't know how to do this, i can post it here...
Back to top
View user's profile Send private message
ayushT

New User


Joined: 01 Oct 2007
Posts: 23
Location: india

PostPosted: Mon Jul 26, 2010 9:04 pm    Post subject:
Reply with quote

Thanks Gabriel. I thought of the same, but it involves two steps. I wanted to do it in one step.

Posting below a sample input dataset (FB,LRECL=15 bytes) which is sorted by Column1

Column1 - 5 bytes
Filler - 3 bytes
Coulmn2- 7 bytes

INPUT FILE
Code:

----5---9---------
AYUSH   444444
AYUSH   5555555
BECKY   6666666
RICKY   123456
RICKY   1234567
RICKY   987654

OUTPUT FILE
----5---9---------
AYUSH   444444
BECKY   6666666
RICKY   123456
RICKY   987654


Thanks,
Ayush
Back to top
View user's profile Send private message
sqlcode1

Active Member


Joined: 08 Apr 2010
Posts: 578
Location: USA

PostPosted: Mon Jul 26, 2010 9:43 pm    Post subject:
Reply with quote

Ayush,
What if there are only 2 records for the keys and both has field2 values less than 999999. Do you need to keep both the records? If not which one do you need to keep 1st or 2nd?

What is an expected output from below input?

Code:

AYUSH   444444
AYUSH   333333
BECKY   6666666
RICKY   123456
RICKY   1234567
RICKY   987654


Thanks,
Back to top
View user's profile Send private message
Frank Yaeger

DFSORT Moderator


Joined: 15 Feb 2005
Posts: 7130
Location: San Jose, CA

PostPosted: Mon Jul 26, 2010 9:45 pm    Post subject:
Reply with quote

Ayush,

Here's a DFSORT/ICETOOL job that will do what you asked for in one step (although NOT in one pass):

Code:

//S1    EXEC  PGM=ICETOOL                                           
//TOOLMSG DD SYSOUT=*                                               
//DFSMSG  DD SYSOUT=*                                               
//IN DD DSN=&&IN,DISP=(OLD,PASS)                                     
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)           
//T2 DD DSN=&&T2,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)           
//CON DD DSN=*.T1,VOL=REF=*.T1,DISP=(OLD,PASS)                       
//    DD DSN=*.T2,VOL=REF=*.T2,DISP=(OLD,PASS)                       
//OUT DD SYSOUT=*                                                   
//TOOLIN DD *                                                       
SELECT FROM(IN) TO(T1) DISCARD(T2) ON(1,5,CH) ALLDUPS USING(CTL1)   
SORT FROM(CON) TO(OUT) USING(CTL2)                                   
/*
//CTL1CNTL DD *                                                     
  OUTFIL FNAMES=T1,OMIT=(9,7,UFF,GT,999999)                         
/*
//CTL2CNTL DD *                                                     
  OPTION EQUALS                                                     
  SORT FIELDS=(1,5,CH,A)                                             
/*
Back to top
View user's profile Send private message
sqlcode1

Active Member


Joined: 08 Apr 2010
Posts: 578
Location: USA

PostPosted: Mon Jul 26, 2010 10:00 pm    Post subject:
Reply with quote

Frank,
Would below work as a single pass solution?

Code:
//SORT01  EXEC PGM=SORT                                               
//SORTIN   DD *                                                       
AYUSH   444444                                                       
AYUSH   5555555                                                       
BECKY   6666666                                                       
RICKY   123456                                                       
RICKY   1234567                                                       
RICKY   987654                                                       
//SORTOUT  DD SYSOUT=*                                               
//*                                                                   
//SYSIN DD *                                                         
 INREC IFTHEN=(WHEN=INIT,OVERLAY=(21:SEQNUM,8,ZD,RESTART=(1,5),       
                                  30:C'Y')),                         
       IFTHEN=(WHEN=(21,8,ZD,GT,1,AND,9,7,UFF,GT,0999999),           
               OVERLAY=(30:C'N'))                                     
 SORT FIELDS=COPY                                                     
 OUTFIL INCLUDE=(30,1,CH,EQ,C'Y'),BUILD=(1,15)                       
/*                                                                   
//SYSOUT DD SYSOUT=*                                                 
//*



Thanks,
Back to top
View user's profile Send private message
Frank Yaeger

DFSORT Moderator


Joined: 15 Feb 2005
Posts: 7130
Location: San Jose, CA

PostPosted: Mon Jul 26, 2010 10:19 pm    Post subject:
Reply with quote

It doesn't appear to work correctly for these input records:

Code:

BILL    6666666       
BILL    6666666       
RICKY   1234567       
RICKY   1234567       
RICKY   1234567       


My job correctly gives no output records for this case, whereas your job incorrectly gives these output records:

Code:

BILL    6666666
RICKY   1234567
Back to top
View user's profile Send private message
sqlcode1

Active Member


Joined: 08 Apr 2010
Posts: 578
Location: USA

PostPosted: Mon Jul 26, 2010 10:40 pm    Post subject:
Reply with quote

Frank,
Thanks for the feedback.

I assumed this based on OP's expected output where he had hinted that he would still want to keep atleast 1 record even if its above 999999.

Code:

----5---9---------
AYUSH   444444
AYUSH   5555555
BECKY   6666666
RICKY   123456
RICKY   1234567
RICKY   987654

OUTPUT FILE
----5---9---------
AYUSH   444444
BECKY   6666666 
RICKY   123456
RICKY   987654


BECKY 6666666 --> This part in expected output led me to believe that OP wanted to keep atleast single record if all the record has field2 above limit.Basically I assumed he wanted to keep atleast single record for each input key regardless of the field2 condition.

Well this was an assumption and I could be wrong.

Thanks,
Back to top
View user's profile Send private message
Frank Yaeger

DFSORT Moderator


Joined: 15 Feb 2005
Posts: 7130
Location: San Jose, CA

PostPosted: Mon Jul 26, 2010 11:58 pm    Post subject:
Reply with quote

Quote:
I assumed this based on OP's expected output where he had hinted that he would still want to keep atleast 1 record even if its above 999999.


Quote:
BECKY 6666666 --> This part in expected output led me to believe that OP wanted to keep atleast single record if all the record has field2 above limit.Basically I assumed he wanted to keep atleast single record for each input key regardless of the field2 condition.


I took this to mean he wanted to keep any nonduplicate records regardless of whether they met the < 999999 criteria or not. That's diffferent then always keeping one record of duplicates, which I don't see stated or implied anywhere.

I guess the OP will have to let us know (if he's still interested).
Back to top
View user's profile Send private message
Gabriel Araujo Alves

New User


Joined: 20 Jul 2010
Posts: 38
Location: Brazil

PostPosted: Tue Jul 27, 2010 12:04 am    Post subject:
Reply with quote

WHERE I CAN SEE MORE ABOUT
(WHEN=INIT)/WHEN=GROUP
Back to top
View user's profile Send private message
Frank Yaeger

DFSORT Moderator


Joined: 15 Feb 2005
Posts: 7130
Location: San Jose, CA

PostPosted: Tue Jul 27, 2010 12:09 am    Post subject:
Reply with quote

See the discussion of "IFTHEN" under "OUTFIL Control Statements" in z/OS DFSORT Application Programming Guide:

http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/ICE1CA40/3.13?DT=20090527161936
Back to top
View user's profile Send private message
Gabriel Araujo Alves

New User


Joined: 20 Jul 2010
Posts: 38
Location: Brazil

PostPosted: Tue Jul 27, 2010 12:11 am    Post subject:
Reply with quote

THANK'S FRANK.
Back to top
View user's profile Send private message
ayushT

New User


Joined: 01 Oct 2007
Posts: 23
Location: india

PostPosted: Tue Jul 27, 2010 1:21 am    Post subject: Reply to: eliminate duplicates for certain condtions
Reply with quote

I think what Frank put works for me. Thanks Frank!!!
Back to top
View user's profile Send private message
sqlcode1

Active Member


Joined: 08 Apr 2010
Posts: 578
Location: USA

PostPosted: Tue Jul 27, 2010 1:43 am    Post subject:
Reply with quote

ayushT,
Thanks for the clarification on requirement.


Thanks,
Back to top
View user's profile Send private message
ayushT

New User


Joined: 01 Oct 2007
Posts: 23
Location: india

PostPosted: Tue Jul 27, 2010 2:03 am    Post subject:
Reply with quote

Thank you all for your help!!!!
Back to top
View user's profile Send private message
View previous topic :: :: View next topic  
Post new topic   Reply to topic    IBMMAINFRAMES.com Support Forums -> DFSORT/ICETOOL All times are GMT + 6 Hours
Page 1 of 1

 

Search our Forum:

Similar Topics
Topic Author Forum Replies Posted
No new posts Updating the counters after eliminati... PANDU1 DFSORT/ICETOOL 12 Mon Nov 21, 2016 9:47 am
No new posts how to eliminate records from operati... PUMA DFSORT/ICETOOL 5 Tue Dec 23, 2014 4:42 pm
No new posts How to eliminate duplicate records - ... Rohan saldanha DFSORT/ICETOOL 7 Tue Jul 15, 2014 7:35 pm
No new posts SORT CARD to remove duplicates & ... mkk157 DFSORT/ICETOOL 17 Tue Jul 08, 2014 9:04 pm
No new posts Remove the duplicates without sorting... maki_psg JCL & VSAM 13 Tue May 13, 2014 10:32 pm


Facebook
Back to Top
 
Mainframe Wiki | Forum Rules | Bookmarks | Subscriptions | FAQ | Tutorials | Contact Us