Removing duplicate record based on threshold limit

 
Nilanjan Sikdar

New User


Joined: 26 Feb 2016
Posts: 2
Location: India

Posted: Mon Jul 22, 2019 8:29 pm    Post subject: Removing duplicate record based on threshold limit

Hi,

I have a requirement to remove duplicate records, subject to a threshold limit. The limit will be parameterized and supplied in the control card. If the number of duplicates is more than the limit, the job should fail. Is it possible to do this using DFSORT? Please help.

Thanks,
Nilanjan

Rohit Umarjikar

Senior Member


Joined: 21 Sep 2010
Posts: 2299
Location: NY,USA

Posted: Tue Jul 23, 2019 1:05 am

I would think of this as a one-step solution.
1. Add the duplicate count per key at the end of each record using INREC.
2. Using OUTFIL, include only the records whose count from the INREC is greater than the threshold limit (use JP1), and set the RC using NULLOFL.
Nic Clouston

Global Moderator


Joined: 10 May 2007
Posts: 2247
Location: Hampshire, UK

Posted: Tue Jul 23, 2019 12:45 pm    Post subject: Reply to: Removing duplicate record based on threshold limit

I think that should be BELOW the limit, not GREATER than the limit.
Also, I am not clear on 2 points:
1 - are you removing ALL duplicates or just the duplicates over the limit?
2 - does the limit refer to the total number of duplicates in the data set or to the number of duplicates per record?
Nilanjan Sikdar

New User


Joined: 26 Feb 2016
Posts: 2
Location: India

Posted: Tue Jul 23, 2019 2:06 pm

Hi Rohit,

I don't want a count for each individual key, but rather the overall duplicate count.

Hi Nic,

1 - if the duplicates are over the limit, the job should fail (in the sense that the file is not correct).
2 - the limit refers to the total number of duplicates.

For example, say the max duplicate limit is set to 3 in the sort card and the records are as below:
Input:

AAAAA
BBBBB
AAAAA
CCCCC
DDDDD
AAAAA
EEEEE

Output should be:
AAAAA
BBBBB
CCCCC
DDDDD
EEEEE

But if the input is like below:
AAAAA
BBBBB
CCCCC
CCCCC
DDDDD
CCCCC
CCCCC

then the step should fail, as the number of duplicates here is 4, which is more than the threshold limit.

Thanks,
Nilanjan
Rohit Umarjikar

Senior Member


Joined: 21 Sep 2010
Posts: 2299
Location: NY,USA

Posted: Wed Jul 24, 2019 1:24 am

Nilanjan,
Try this. The limit is the DUPLIMIT value under //SYMNAMES DD * in the second step, so you can change it there without touching the control statements.
Code:
//*                                                 
//*GET THE TOTAL DUPLICATE COUNT ACROSS KEYS AND UNIQUE RECORDS
//*                                                 
//STEP0100 EXEC PGM=SORT                           
//SYSOUT   DD SYSOUT=*                             
//SYSPRINT DD SYSOUT=*                             
//SORTIN   DD *                                     
AAAAA                                               
AAAAA                                               
AAAAA                                               
AAAAA                                               
AAAAA                                               
AAAAA                                               
CCCCC                                               
BBBBB                                               
DDDDD                                               
//GOOD     DD SYSOUT=*                                             
//BAD      DD DSN=&&S1,UNIT=SYSDA,SPACE=(TRK,(1,1)),DISP=(,PASS)   
//SORTLIST DD SYSOUT=*                                             
//SYSIN    DD *                                                     
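  SORT FIELDS=(1,5,CH,A)
* APPEND A COUNTER OF 1 TO EVERY RECORD; SUM ADDS IT UP PER KEY
  INREC OVERLAY=(20:C'00000001')
  SUM FIELDS=(20,8,ZD)
* BAD: ONE TRAILER WITH THE TOTAL FOR KEYS OCCURRING 2+ TIMES.
* 12: PINS THE TOTAL TO COLUMNS 12-19, WHERE STEP0200 READS IT.
  OUTFIL FNAMES=BAD,REMOVECC,NODETAIL,INCLUDE=(20,8,ZD,GE,00000002),
  TRAILER1=(C'TOTAL    :',12:TOT=(20,8,ZD,EDIT=(TTTTTTTT)))
  OUTFIL FNAMES=GOOD,BUILD=(1,5)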
//*                                                                 
//*SET RC=04 IF DUPLICATES ARE BEYOND THE THRESHOLD SUPPLIED IN SYM
//*                                                                 
//STEP0200 EXEC PGM=SORT,PARM='NULLOUT=RC4'                         
//SYMNAMES DD *                                                     
DUPLIMIT,00000002                                                   
//SYSOUT   DD SYSOUT=*                                             
//SYSPRINT DD SYSOUT=*                                             
//SORTIN   DD DSN=&&S1,DISP=(OLD,PASS)   
//SORTOUT  DD SYSOUT=*               
//SORTLIST DD SYSOUT=*               
//SYSIN    DD *                       
  OPTION COPY                         
  INCLUDE COND=(12,8,ZD,LE,DUPLIMIT)   
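
For comparison, the same check could probably be done in one ICETOOL step. This is an untested sketch, not part of the job above: the input data set name YOUR.INPUT.FILE and the DD names IN, DUPS and GOOD are made up for illustration, the key is assumed to be in columns 1-5 as in the sample data, and the limit of 3 is hard-coded in the COUNT operator rather than taken from SYMNAMES. SELECT with FIRST keeps one record per key, SELECT with ALLDUPS collects every record whose key occurs more than once, and COUNT with HIGHER(3) sets RC=12 for the step when DUPS holds more than 3 records.
Code:
//STEP0100 EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//* YOUR.INPUT.FILE IS A PLACEHOLDER NAME
//IN       DD DSN=YOUR.INPUT.FILE,DISP=SHR
//DUPS     DD DSN=&&DUPS,UNIT=SYSDA,SPACE=(TRK,(1,1)),DISP=(,PASS)
//GOOD     DD SYSOUT=*
//TOOLIN   DD *
* KEEP THE FIRST RECORD OF EACH KEY
  SELECT FROM(IN) TO(GOOD) ON(1,5,CH) FIRST
* COLLECT EVERY RECORD WHOSE KEY OCCURS MORE THAN ONCE
  SELECT FROM(IN) TO(DUPS) ON(1,5,CH) ALLDUPS
* SET RC=12 IF DUPS HOLDS MORE THAN 3 RECORDS (THE LIMIT)
  COUNT FROM(DUPS) HIGHER(3)
/*

With the two sample inputs from earlier in the thread, DUPS would hold 3 records (step ends with RC=0) and 4 records (step ends with RC=12) respectively. COUNT is placed last so that GOOD is always written; the return code then tells the following steps whether to trust it.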
All times are GMT + 6 Hours