Remove duplicates if it occurs more than 2 times


nagarajan.dharani

New User


Joined: 27 Dec 2006
Posts: 36
Location: Chennai

Posted: Mon Jan 14, 2008 1:01 pm

Hi,

Can anyone help me with the following requirement?

I have a fixed-length record file with an employee number as one of the fields.

I need to sort the file and remove all records for an employee number if that number occurs more than 2 times in the input file.

Is this possible through DFSORT?

For example

INPUT FILE:

88888
88888
77777
88888
77777
66666

My output should be
77777
77777
66666

Employee number 88888 occurs 3 times, so all of its records should be removed.

Thanks,
Dharani
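
For readers who want the requirement stated precisely before looking at the DFSORT jobs, here is a minimal Python sketch of it. It is only a model of the expected result, not a DFSORT solution; the function name `drop_frequent_keys` and its parameters are hypothetical, and the key is assumed to be the first 5 bytes of each record as in the example.

```python
from collections import Counter

def drop_frequent_keys(records, key_len=5, max_allowed=2):
    """Keep records whose key occurs at most max_allowed times, in input order."""
    # Count how often each key (assumed: first key_len characters) occurs.
    counts = Counter(rec[:key_len] for rec in records)
    # Drop every record of a key that occurs too often; order is preserved.
    return [rec for rec in records if counts[rec[:key_len]] <= max_allowed]

records = ["88888", "88888", "77777", "88888", "77777", "66666"]
print(drop_frequent_keys(records))  # ['77777', '77777', '66666']
```

Note that all three 88888 records disappear, not just the third one.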
ParagChouguley

Active User


Joined: 03 Feb 2007
Posts: 175
Location: PUNE(INDIA)

Posted: Mon Jan 14, 2008 3:54 pm

Here is a job that does what you asked.
Code:

//S1      EXEC PGM=ICETOOL                         
//TOOLMSG DD SYSOUT=*                               
//DFSMSG  DD SYSOUT=*                               
//*                                                 
//IN1     DD *                                     
88888                                               
88888                                               
77777                                               
88888                                               
77777                                               
66666                                               
/*                                                 
//*                                                 
//OUT1    DD DSN=OUTPUT-FILE-NAME,                 
//      DSORG=PS,RECFM=FB,                         
//      DISP=(NEW,CATLG,DELETE)                     
//TEMP1   DD DSN=&&TEMP1,DISP=(MOD,DELETE,DELETE),DSORG=PS,RECFM=FB   
//TEMP2   DD DSN=&&TEMP2,DISP=(MOD,DELETE,DELETE),DSORG=PS,RECFM=FB   
//TOOLIN  DD *                                                       
    SORT FROM(IN1) TO(TEMP1) USING(SRT1)                             
    COPY FROM(TEMP1) TO(TEMP2) USING(SRT2)                           
    COPY FROM(IN1) TO(OUT1) USING(SRT3)                               
/*                                                                   
//SRT1CNTL DD *                                                       
    SORT FIELDS=(1,5,CH,A)                                           
    OUTREC FIELDS=(SEQNUM,10,ZD,START=1,INCR=1,RESTART=(1,5),1,5)     
/*                                                                   
//SRT2CNTL DD *                                                       
    OPTION COPY                                                       
    INCLUDE COND=(1,10,ZD,GT,+2)                                     
    OUTREC FIELDS=(3X,C'1,5,CH,EQ,C''',11,5,C''',OR,',80:X)           
/*                                                                   
//SRT3CNTL DD *                                               
    OPTION COPY                                               
    OMIT COND=(1,5,CH,NE,1,5,CH,OR,                           
/*                                                             
//         DD DSN=*.TEMP2,VOL=REF=*.TEMP2,DISP=(OLD,PASS)     
//         DD *                                               
    1,5,CH,NE,1,5,CH)                                         
/*                                                             


Output:
Code:

77777
77777
66666


Here's the thread I used as a reference when developing this job:
ibmmainframes.com/viewtopic.php?t=24453

--Parag
Ajay Baghel

Active User


Joined: 25 Apr 2007
Posts: 206
Location: Bangalore

Posted: Mon Jan 14, 2008 7:16 pm

Quote:
OUTREC FIELDS=(SEQNUM,10,ZD,START=1,INCR=1,RESTART=(1,5),


Parag,

Can you please explain what RESTART does?

Thanks,
Ajay
ParagChouguley

Active User


Joined: 03 Feb 2007
Posts: 175
Location: PUNE(INDIA)

Posted: Mon Jan 14, 2008 8:39 pm

Quote:

Can you please explain what RESTART does?

As its name suggests, RESTART restarts the sequence number from its starting value each time the control field named in the parentheses changes.
Code:

Control Field    Seq Number
aaaa               0001
aaaa               0002
aaaa               0003
bbbb               0001
bbbb               0002
ccccc              0001


--Parag
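
The numbering behavior described above can be modeled in a few lines of Python. This is only an illustration of the idea, not DFSORT itself; the function name `seqnum_with_restart` is hypothetical, and the input is assumed to be already sorted on the control field (as it is after the SORT in SRT1).

```python
def seqnum_with_restart(sorted_keys, start=1, incr=1):
    """Number the keys, restarting at `start` whenever the key changes."""
    numbered = []
    previous = None
    seq = start
    for key in sorted_keys:
        if key != previous:
            seq = start        # control field changed: restart the numbering
        else:
            seq += incr        # same control field: keep counting
        numbered.append((key, seq))
        previous = key
    return numbered

for key, seq in seqnum_with_restart(["aaaa", "aaaa", "aaaa", "bbbb", "bbbb", "cccc"]):
    print(key, f"{seq:04d}")
```

With this numbering, any record whose sequence number is greater than 2 belongs to a key that occurs more than twice, which is what the INCLUDE in SRT2 tests for.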
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Mon Jan 14, 2008 10:21 pm

Dharani,

Here's a better way to do what you want with DFSORT's ICETOOL:

Code:

//S1    EXEC  PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG  DD SYSOUT=*
//IN DD *
88888
88888
77777
88888
77777
66666
/*
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)
//OUT DD SYSOUT=*
//TOOLIN DD *
SELECT FROM(IN) TO(T1) ON(1,5,CH) LOWER(3) USING(CTL1)
SORT FROM(T1) TO(OUT) USING(CTL2)
/*
//CTL1CNTL DD *
  INREC OVERLAY=(81:SEQNUM,8,ZD)
/*
//CTL2CNTL DD *
  SORT FIELDS=(81,8,ZD,A)
  OUTREC BUILD=(1,80)
/*
Ajay Baghel

Active User


Joined: 25 Apr 2007
Posts: 206
Location: Bangalore

Posted: Tue Jan 15, 2008 1:46 pm

Thanks Parag.


-Ajay
ParagChouguley

Active User


Joined: 03 Feb 2007
Posts: 175
Location: PUNE(INDIA)

Posted: Tue Jan 15, 2008 4:46 pm

Hi Frank,
It's great! That's so simple to do with SELECT!

But I have 2 questions.
1) What's the use of the sequence number in CTL1? Is it because CTL1 cannot be left blank?
2) I ran your job and looked at the intermediate results in T1. The record sequence is changed; I guess that's why you sort it again. But what if the user wants to retain the original sequence from the input file and just remove the unwanted records?

--Parag
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Tue Jan 15, 2008 10:06 pm

Quote:
what if user wants to retain original sequence from input file and just remove unwanted records ?


That's what my job does. (Did you try it?)

I had to add the sequence numbers and do the extra sort in order to retain the original sequence of the records as shown by your example output.

SELECT sorts the records by the ON field. So if we just used:

Code:

//S1    EXEC  PGM=ICETOOL                       
//TOOLMSG DD SYSOUT=*                           
//DFSMSG  DD SYSOUT=*                           
//IN DD *                                       
88888                                           
88888                                           
77777                                           
88888                                           
77777                                           
66666                                       
/*   
//OUT DD SYSOUT=*                               
//TOOLIN DD *                                   
SELECT FROM(IN) TO(OUT) ON(1,5,CH) LOWER(3)     
/*     


OUT would have:

Code:

66666
77777
77777


with the records in sorted order.

But you wanted the records in their original order with the 77777 records before the 66666 record. So I had to add the sequence numbers for the original order of the records, do the SELECT which sorts by the ON field, and then do the extra sort on the sequence numbers to get the records back in their original order.
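
The three steps just described (add sequence numbers, SELECT with LOWER(3), sort back on the sequence numbers) can be sketched in Python as follows. This is a model of the logic under stated assumptions, not of DFSORT internals: the function name `select_lower_keep_order` is hypothetical, the key is assumed to be the first 5 bytes, and Python's stable sort stands in for the EQUALS behavior.

```python
from itertools import groupby

def select_lower_keep_order(records, key_len=5, limit=3):
    """Return (records after the SELECT step, records restored to input order)."""
    key_of = lambda tagged: tagged[0][:key_len]
    # Step 1: tag each record with its original position (models 81:SEQNUM,8,ZD).
    tagged = [(rec, seq) for seq, rec in enumerate(records, start=1)]
    # Step 2: sort by key, then keep only keys occurring fewer than `limit`
    # times (models SELECT ... ON(1,5,CH) LOWER(3)).
    kept = []
    for _, group in groupby(sorted(tagged, key=key_of), key=key_of):
        group = list(group)
        if len(group) < limit:
            kept.extend(group)
    after_select = [rec for rec, _ in kept]
    # Step 3: sort on the sequence number to restore input order, then drop
    # the tag (models SORT FIELDS=(81,8,ZD,A) and OUTREC BUILD=(1,80)).
    final = [rec for rec, _ in sorted(kept, key=lambda tagged: tagged[1])]
    return after_select, final

after_select, final = select_lower_keep_order(
    ["88888", "88888", "77777", "88888", "77777", "66666"])
print(after_select)  # ['66666', '77777', '77777']
print(final)         # ['77777', '77777', '66666']
```

The first list matches what OUT contains when SELECT is used alone; the second matches the output the original poster asked for.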
gcicchet

Senior Member


Joined: 28 Jul 2006
Posts: 1702
Location: Australia

Posted: Wed Jan 16, 2008 7:56 am

Hi Frank,
I copied your code and ran the job.

The results are not what I was expecting.

OUT contains:
Code:
66666   
77777   
77777   

The T1 file looks like this:
Code:
------------------------------------------------------------------------------
=COLS> --2----+----3----+----4----+----5----+----6----+----7----+----8----+----
       --F----+----F----+----F----+----F----+----F----+----F----+----F----+----
       --2----+----3----+----4----+----5----+----6----+----7----+----8----+----
------------------------------------------------------------------------------
000001                                                                   
       444444444444444444444444444444444444444444444444444444444444444000000004
       000000000000000000000000000000000000000000000000000000000000000000000000
------------------------------------------------------------------------------
000002                                                                77777   
       444444444444444444444444444444444444444444444444444444444444444FFFFF4444
       000000000000000000000000000000000000000000000000000000000000000777770000
------------------------------------------------------------------------------
000003                                                                   
       444444444444444444444444444444444444444444444444444444444444444000000004
       000000000000000000000000000000000000000000000000000000000000000000000000
------------------------------------------------------------------------------


i.e., the sequence number does not seem to be working.

Please assist.


Thanks

Gerry
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Wed Jan 16, 2008 10:28 pm

I don't see how you could have gotten that output in T1 with my job. Show me YOUR job and the //TOOLMSG and //DFSMSG messages you received.
gcicchet

Senior Member


Joined: 28 Jul 2006
Posts: 1702
Location: Australia

Posted: Thu Jan 17, 2008 3:07 am

Hi Frank,
The job I ran is below:
Code:
//S1    EXEC  PGM=ICETOOL                                             
//TOOLMSG DD SYSOUT=*                                                 
//DFSMSG  DD SYSOUT=*                                                 
//IN DD *                                                             
88888                                                                 
88888                                                                 
77777                                                                 
88888                                                                 
77777                                                                 
66666                                                                 
/*                                                                     
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)             
//OUT DD SYSOUT=*                                                     
//TOOLIN DD *                                                         
SELECT FROM(IN) TO(T1) ON(1,5,CH) LOWER(3) USING(CTL1)                 
SORT FROM(T1) TO(OUT) USING(CTL2)                                     
/*                                                                     
//CTL1CNTL DD *                                                       
  INREC OVERLAY=(81:SEQNUM,8,ZD)                                       
/*                                                                     
//CTL2CNTL DD *                                     
  SORT FIELDS=(81,8,ZD,A)                           
  OUTREC BUILD=(1,80)                               
/*                                                   
//PRINT#IT EXEC PGM=IEBGENER                         
//SYSPRINT DD SYSOUT=*                               
//SYSIN    DD DUMMY                                 
//SYSUT1   DD DSN=&&T1,DISP=(SHR)                   
//SYSUT2   DD SYSOUT=*                               


TOOLMSG output:
    1ICE600I 0 DFSORT ICETOOL UTILITY RUN STARTED

    ICE650I 0 VISIT www.ibm.com/storage/dfsort FOR ICETOOL PAPERS, EXAMPLES AND MORE

    ICE632I 0 SOURCE FOR ICETOOL STATEMENTS: TOOLIN


    ICE630I 0 MODE IN EFFECT: STOP

    SELECT FROM(IN) TO(T1) ON(1,5,CH) LOWER(3) USING(CTL1)
    ICE606I 0 DFSORT CALL 0001 FOR SORT FROM IN TO T1 USING CTL1CNTL COMPLETED
    ICE628I 0 RECORD COUNT: 000000000000006
    ICE638I 0 NUMBER OF RECORDS RESULTING FROM CRITERIA: 000000000000003
    ICE602I 0 OPERATION RETURN CODE: 00

    SORT FROM(T1) TO(OUT) USING(CTL2)
    ICE606I 0 DFSORT CALL 0002 FOR SORT FROM T1 TO OUT USING CTL2CNTL COMPLETED
    ICE602I 0 OPERATION RETURN CODE: 00


    ICE601I 0 DFSORT ICETOOL UTILITY RUN ENDED - RETURN CODE: 00


DFSMSG output:
    ICE200I 0 IDENTIFIER FROM CALLING PROGRAM IS 0001
    ICE143I 0 BLOCKSET SORT TECHNIQUE SELECTED
    ICE250I 0 VISIT www.ibm.com/storage/dfsort FOR DFSORT PAPERS, EXAMPLES AND MORE
    ICE000I 0 - CONTROL STATEMENTS FOR 5694-A01, Z/OS DFSORT V1R5 - 13:15 ON WED JAN 16, 2008 -
    0 INREC OVERLAY=(81:SEQNUM,8,ZD)
    ICE146I 0 END OF STATEMENTS FROM CTL1CNTL - PARAMETER LIST STATEMENTS FOLLOW
    DEBUG NOABEND,ESTAE
    OPTION MSGDDN=DFSMSG,LIST,MSGPRT=ALL,RESINV=0,SORTDD=CTL1,SORTIN=IN,SOR*
    TOUT=T1,DYNALLOC,SZERO,EQUALS,NOVLSHRT,LOCALE=NONE,NOCHE*
    CK
    SORT FIELDS=(1,5,CH,A)
    MODS E35=(ICE35DU,12288)
    ICE201I 0 RECORD TYPE IS F - DATA STARTS IN POSITION 1
    ICE751I 0 C5-K05352 C6-Q95214 C7-K90000 C8-K05352 E9-K06751 C9-BASE E5-K10929 E6-K90000 E7-K11698
    ...


Hopefully this helps.

Thanks

Gerry
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Thu Jan 17, 2008 3:45 am

Ok, I see the difference. I'm running with the latest DFSORT PTF which supports INREC with SELECT. You don't have that PTF, so INREC with SELECT is not giving you the correct results. You need z/OS V1R5 PTF UK90007. That PTF has been available since April, 2006. Ask your System Programmer to install it (it's free).

In the meantime, you can use an extra pass to avoid using INREC with SELECT, as follows:

Code:

//S1    EXEC  PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG  DD SYSOUT=*
//IN DD *
88888
88888
77777
88888
77777
66666
/*
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)
//T2 DD DSN=&&T2,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)
//OUT DD SYSOUT=*
//TOOLIN DD *
COPY FROM(IN) TO(T1) USING(CTL1)
SELECT FROM(T1) TO(T2) ON(1,5,CH) LOWER(3)
SORT FROM(T2) TO(OUT) USING(CTL2)
/*
//CTL1CNTL DD *
  INREC OVERLAY=(81:SEQNUM,8,ZD)
/*
//CTL2CNTL DD *
  SORT FIELDS=(81,8,ZD,A)
  OUTREC BUILD=(1,80)
/*


For complete details on all of the new DFSORT and ICETOOL functions available with the April, 2006 PTF, see:

Use [URL] BBCode for External Links
gcicchet

Senior Member


Joined: 28 Jul 2006
Posts: 1702
Location: Australia

Posted: Thu Jan 17, 2008 4:10 am

Hi Frank,
thanks for your prompt response.

Gerry
All times are GMT + 6 Hours