Eliminate duplicate fields in a record as well as the file


IBM Mainframe Forums -> DFSORT/ICETOOL
sujithsamuel

New User


Joined: 22 Dec 2006
Posts: 11
Location: Chennai

Posted: Wed Apr 27, 2011 11:14 am

Hi

I have a requirement to eliminate duplicate fields, both within a record and across the whole file, retaining only the first occurrence of each field.
Each field is 6 bytes long.

Example

Input
111111222222333333111111222222
111111444444555555222222
333333666666

Output
111111222222333333
444444555555
666666

Kindly help out with the same.

Thanks
Sujith.
expat

Global Moderator


Joined: 14 Mar 2007
Posts: 8797
Location: Welsh Wales

Posted: Wed Apr 27, 2011 12:53 pm

Kindly explain, in understandable terms, exactly what you want to do.
sujithsamuel

New User


Joined: 22 Dec 2006
Posts: 11
Location: Chennai

Posted: Wed Apr 27, 2011 1:21 pm

Hi

I want to eliminate duplicate fields, both within each record and throughout the file.

This can be broken down into 2 stages:

1) Eliminate duplicates within the same record

e.g.
Let's say there is a 24-byte record divided into 4 fields of 6 bytes each:

111111222222333333222222

In the above record, 222222 occurs twice, so the routine should eliminate the repeated 222222 and give the output below:

111111222222333333

2) If 222222 occurs anywhere else in the file, it should be eliminated there as well.


So in the whole file there should be only unique fields.


Input file

111111222222333333222222
222222555555333333555555
333333777777888888111111

The output file should be as follows:

111111222222333333
555555
777777888888
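
For reference, the two stages above can be modelled off the mainframe. This is only a minimal Python sketch of the expected result, not a DFSORT solution; the function name `dedupe_fields` and its `width` parameter are hypothetical, and it assumes each record holds whole 6-byte fields.

```python
def dedupe_fields(records, width=6):
    """Keep only the first occurrence of each fixed-width field in the file.

    Hypothetical helper: it models the requested output, not any DFSORT feature.
    """
    seen = set()          # every field value already written somewhere
    output = []
    for record in records:
        # Cut the record into consecutive width-byte fields.
        fields = [record[i:i + width] for i in range(0, len(record), width)]
        kept = []
        for field in fields:
            # Skip blank fields and anything seen earlier (same record or not).
            if field.strip() and field not in seen:
                seen.add(field)
                kept.append(field)
        if kept:          # records left with no fields are dropped entirely
            output.append("".join(kept))
    return output


sample = [
    "111111222222333333222222",
    "222222555555333333555555",
    "333333777777888888111111",
]
for line in dedupe_fields(sample):
    print(line)
```

Run against the three input records above, this prints the three output records listed above.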

I hope the requirement is clear now.

Thanks
Escapa

Senior Member


Joined: 16 Feb 2007
Posts: 1399
Location: IL, USA

Posted: Wed Apr 27, 2011 2:49 pm

What I understand from your long explanation is that you don't want duplicate six-byte fields anywhere in the file (within a record or across records).

What if I give you the output below? I believe that should still satisfy your business requirement.
Code:

111111
222222
333333
555555
777777
888888
sujithsamuel

New User


Joined: 22 Dec 2006
Posts: 11
Location: Chennai

Posted: Wed Apr 27, 2011 2:56 pm

Yes

That should also do it

Thanks
Escapa

Senior Member


Joined: 16 Feb 2007
Posts: 1399
Location: IL, USA

Posted: Wed Apr 27, 2011 3:08 pm

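You can use the job below. The first step writes each 6-byte field as its own record; the second step sorts those records, removes duplicates, and drops the blank ones.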
Code:

//STEP0100 EXEC PGM=SORT                     
//SYSOUT   DD SYSOUT=*                       
//SORTIN   DD *                             
111111222222333333111111222222               
111111444444555555222222                     
333333666666                                 
//SORTOUT  DD DSN=&&TEMP,DISP=(NEW,PASS)     
//SYSIN    DD *                             
 OPTION COPY                                 
 OUTFIL BUILD=(1,6,/,7,6,/,13,6,/,19,6,/,25,6)
/*                                           
//STEP0200 EXEC PGM=SORT                     
//SYSOUT   DD SYSOUT=*                       
//SORTIN   DD DSN=&&TEMP,DISP=(MOD,PASS)     
//SORTOUT  DD SYSOUT=*                       
//SYSIN    DD *                             
 SORT FIELDS=(1,6,CH,A)             
 SUM FIELDS=(NONE)                 
 OUTFIL OMIT=(1,6,CH,EQ,C'     ')   
/*                                 
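
As a cross-check, the two steps can be mimicked in a few lines of Python. This is a rough model under stated assumptions, not what DFSORT does internally: the names `unique_fields_sorted`, `slots` and `lrecl` are hypothetical, in-stream records are assumed to be blank-padded to 80 bytes, and the number of 6-byte slots to split (five, matching the longest sample record) is an assumption. Python's string ordering matches a CH ascending sort only because the sample data is all digits; EBCDIC collates letters and digits differently.

```python
def unique_fields_sorted(records, width=6, slots=5, lrecl=80):
    """Model: one field per output record, sorted, duplicates and blanks removed."""
    pieces = []
    for record in records:
        padded = record.ljust(lrecl)   # in-stream data arrives blank-padded
        # Step 1 equivalent: write each fixed-position slot as its own record.
        for i in range(slots):
            pieces.append(padded[i * width:(i + 1) * width])
    # Step 2 equivalent: drop all-blank records, remove duplicates, sort ascending.
    non_blank = [p for p in pieces if p != " " * width]
    return sorted(set(non_blank))


sample = [
    "111111222222333333111111222222",
    "111111444444555555222222",
    "333333666666",
]
for field in unique_fields_sorted(sample):
    print(field)
```

The blank filter matters because short input records yield empty slots, which would otherwise surface as one blank record in the output.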
sujithsamuel

New User


Joined: 22 Dec 2006
Posts: 11
Location: Chennai

Posted: Wed Apr 27, 2011 3:52 pm

Thanks a lot... it's working.

Regards
Escapa

Senior Member


Joined: 16 Feb 2007
Posts: 1399
Location: IL, USA

Posted: Wed Apr 27, 2011 4:02 pm

sujithsamuel wrote:
Thanks a lot... it's working.

Regards

Cheers.
Actually, the OUTFIL in the second step is not required. You can use OMIT COND instead, which drops the blank records before the sort:
Code:

//STEP0200 EXEC PGM=SORT                     
//SYSOUT   DD SYSOUT=*                       
//SORTIN   DD DSN=&&TEMP,DISP=(MOD,PASS)     
//SORTOUT  DD SYSOUT=*                       
//SYSIN    DD *                               
 OMIT COND=(1,6,CH,EQ,C'     ')               
 SORT FIELDS=(1,6,CH,A)                       
 SUM FIELDS=(NONE)                           
/*                                           
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Wed Apr 27, 2011 10:16 pm

Escapa/Sambhaji,

You do NOT need to specify 6 blanks for the constant. You just need one blank. DFSORT will pad CH constants on the right with blanks to match the length of the field.

Code:

   OMIT COND=(1,6,CH,EQ,C' ')


will work fine. This becomes important for fields with large lengths; for example, you wouldn't want to have to code 40 blanks for a 40-byte CH field.
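
The padding rule described above can be pictured with a small Python model. This is a simplified sketch, not DFSORT's actual comparison logic; `ch_equals` is a hypothetical name, and the sketch assumes the constant is never longer than the field.

```python
def ch_equals(field, constant):
    """Model a CH EQ test where a short constant is blank-padded on the right.

    Assumption: len(constant) <= len(field); longer constants are not modelled.
    """
    return field == constant.ljust(len(field))


print(ch_equals("      ", " "))    # 6-byte blank field against a 1-blank constant
print(ch_equals("111111", " "))    # non-blank field against the same constant
print(ch_equals(" " * 40, " "))    # 40-byte blank field, still a 1-blank constant
```

A one-blank constant therefore behaves the same for a 6-byte field as for a 40-byte one.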
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

Posted: Thu Apr 28, 2011 11:22 am

The TS originally wanted the output as:
Code:

111111222222333333
555555
777777888888

in case the TS lost spacing, and wanted it like this:
Code:

111111222222333333
      555555
      777777888888


I created a little 19-part ICETOOL job to get what he wanted, and more,
though the output of Escapa's solution makes the most logical sense.
I have many SYSOUTs that I used in debugging;
they, and the TOOLIN operators used to populate them, can be deleted.
I also added a few extra input items to check all my code.

Also, in a BUILD parm, what is the shortcut for
Code:
 
OUTFIL BUILD=(01,06,C'                        ',SEQNUM,8,ZD,C' 1B'/,

so that the literal of 24 spaces can be written more briefly?
I have used SYMNAMES in the past, but there must be a shortcut notation.

I have not bothered to show the output; this post is long enough.

For me, it was a nice, relaxing coding experience.

Code:

//STEP0100 EXEC PGM=ICETOOL
//SYSOUT   DD SYSOUT=*
//SORTMSG  DD SYSOUT=*
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//SORTIN   DD *
222222111111333333111111
666666111111888888
666666444444555555222222
111111444444555555777777
333333666666
//T1       DD  DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(75,50)),
//             DISP=(MOD,PASS)
//T2       DD  DSN=&&T2,UNIT=SYSDA,SPACE=(CYL,(75,50)),
//             DISP=(MOD,PASS)
//T3       DD  DSN=&&T3,UNIT=SYSDA,SPACE=(CYL,(75,50)),
//             DISP=(MOD,PASS)
//T4       DD  DSN=&&T4,UNIT=SYSDA,SPACE=(CYL,(75,50)),
//             DISP=(MOD,PASS)
//T5       DD  DSN=&&T5,UNIT=SYSDA,SPACE=(CYL,(75,50)),
//             DISP=(MOD,PASS)
//T6       DD  DSN=&&T6,UNIT=SYSDA,SPACE=(CYL,(75,50)),
//             DISP=(MOD,PASS)
//TA       DD  DSN=&&TA,UNIT=SYSDA,SPACE=(CYL,(75,50)),
//             DISP=(MOD,PASS)
//SORTOUT  DD SYSOUT=*
//SORTT1   DD SYSOUT=*
//SORTT2   DD SYSOUT=*
//SORTT3   DD SYSOUT=*
//SORTT4   DD SYSOUT=*
//SORTT5   DD SYSOUT=*
//SORTT6   DD SYSOUT=*
//SORTTA   DD SYSOUT=*
//TOOLOUT  DD SYSOUT=*
//SYSIN    DD *
//TOOLIN   DD *
*  SPLIT INPUT INTO 4 RECORDS REF IS ALWAYS IN POS 1
  COPY FROM(SORTIN) TO(T1) USING(CTL1)
*  SPLIT INPUT INTO 4 RECORDS REF IS IN POS
  COPY FROM(SORTIN) TO(TA) USING(CTLA)
*
*  REMOVE DUPLICATES OF REF IN POS 1
  SORT FROM(T1) TO(T2) USING(CTL2)
*
*  SAVE THE TEMP DDS TO SYSOUT
  COPY FROM(T2) TO(SORTT2) USING(CTL5)
  COPY FROM(TA) TO(SORTTA) USING(CTL5)
*
*  JOINKEYS,  4 RECORDS REF IS IN POS
  COPY JKFROM TO(T3) USING(CTL3)
*
*  SAVE THE TEMP DDS TO SYSOUT
  COPY FROM(T3) TO(SORTT3) USING(CTL5)
*
  SPLICE FROM(T3) TO(T4) ON(31,08,CH) WITHANY KEEPNODUPS-
         WITH(01,6) WITH(07,6) WITH(13,6) WITH(19,6)
*
*  SAVE THE TEMP DDS TO SYSOUT
  COPY FROM(T4) TO(SORTT4) USING(CTL5)
*
*  FORMAT WITHOUT LEFT JUSTIFY
  COPY FROM(T4) TO(T5) USING(CTL7)
*
*  FORMAT WITH LEFT JUSTIFY
  COPY FROM(T4) TO(T6) USING(CTL8)
*
*  SAVE THE TEMP DDS TO SYSOUT
  COPY FROM(T5) TO(SORTT5) USING(CTL5)
  COPY FROM(T6) TO(SORTT6) USING(CTL5)
/*
//CTLACNTL DD *
 OPTION COPY
 OUTFIL BUILD=(01,06,C'                        ',SEQNUM,8,ZD,C' 1A'/,
           C'      ',07,06,C'                  ',SEQNUM,8,ZD,C' 2A'/,
           C'            ',13,06,C'            ',SEQNUM,8,ZD,C' 3A'/,
           C'                  ',19,06,C'      ',SEQNUM,8,ZD,C' 4A')
/*
//CTL1CNTL DD *
 OPTION COPY
 OUTFIL BUILD=(01,06,C'                        ',SEQNUM,8,ZD,C' 1B'/,
               07,06,C'                        ',SEQNUM,8,ZD,C' 2B'/,
               13,06,C'                        ',SEQNUM,8,ZD,C' 3B'/,
               19,06,C'                        ',SEQNUM,8,ZD,C' 4B')
/*
//CTL2CNTL DD *
 OMIT COND=(1,6,CH,EQ,C'      ')
 SORT FIELDS=(1,6,CH,A)
 SUM FIELDS=(NONE)
/*
//CTL3CNTL DD *
*JOINKEYS
 JOINKEYS F1=T2,FIELDS=(31,10,A)
 JOINKEYS F2=TA,FIELDS=(31,10,A)
 REFORMAT FIELDS=(F2:1,41)
/*
//CTL4CNTL DD *
 SORT FIELDS=(35,10,CH,A)
/*
//CTL5CNTL DD *
 OPTION COPY
/*
//CTL7CNTL DD *
 OPTION COPY
 INREC PARSE=(%01=(ABSPOS=1,FIXLEN=24)),
       BUILD=(1:%01)
/*
//CTL8CNTL DD *
 OPTION COPY
 INREC PARSE=(%01=(ABSPOS=1,FIXLEN=24)),
       BUILD=(1:%01,SQZ=(SHIFT=LEFT))
/*
gcicchet

Senior Member


Joined: 28 Jul 2006
Posts: 1702
Location: Australia

Posted: Thu Apr 28, 2011 12:01 pm

Hi Dick,

for 24 spaces you can use
Code:
OUTFIL BUILD=(01,06,24X,SEQNUM,8,ZD,C' 1B'/
or
Code:
OUTFIL BUILD=(01,06,24C' ',SEQNUM,8,ZD,C' 1B'/
or
Code:
OUTFIL BUILD=(01,06,24X'40',SEQNUM,8,ZD,C' 1B'/
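
All three forms stand for the same 24 bytes, since X'40' is the EBCDIC blank. A quick way to convince yourself off the mainframe is the Python sketch below; the variable names are hypothetical, and code page 037 is assumed as a representative EBCDIC code page.

```python
EBCDIC = "cp037"  # assumption: a common EBCDIC code page shipped with Python

# Models 24X and 24C' ': a repetition count applied to a single blank.
spaces_from_count = " " * 24

# Models 24X'40': the hex byte 40 repeated 24 times, decoded as EBCDIC.
spaces_from_hex = (b"\x40" * 24).decode(EBCDIC)

print(spaces_from_count == spaces_from_hex)
print(len(spaces_from_hex))
```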



Gerry
dbzTHEdinosauer

Global Moderator


Joined: 20 Oct 2006
Posts: 6966
Location: porcelain throne

Posted: Thu Apr 28, 2011 1:33 pm

gerry,

thx
Skolusu

Senior Member


Joined: 07 Dec 2007
Posts: 2205
Location: San Jose

Posted: Fri Apr 29, 2011 1:33 am

dbzTHEdinosauer wrote:
The TS originally wanted the output as:
Code:

111111222222333333
555555
777777888888

in case the TS lost spacing, and wanted it like this:
Code:

111111222222333333
      555555
      777777888888


I created a little 19-part ICETOOL job to get what he wanted, and more.


dbzTHEdinosauer,

I ran your job as is and I did not get the output. Your sample input is

Code:

222222111111333333111111       
666666111111888888             
666666444444555555222222       
111111444444555555777777       
333333666666                   


The output from this should be:

Code:

222222111111333333     
666666      888888     
      444444555555     
                  777777


What am I missing here? The following DFSORT/ICETOOL job gives the desired results in 3 passes over the data. You don't need additional passes to route the temp files to SYSOUT; you can use OUTFIL FNAMES to write to multiple files at the same time.

PASS1OUT and T1 will have the output from the COPY operator.
PASS2OUT and T2 will have the output from the SELECT operator, which removes the duplicates.

OUT will have the final output.

Code:

//STEP0100 EXEC PGM=ICETOOL                                     
//TOOLMSG  DD SYSOUT=*                                         
//DFSMSG   DD SYSOUT=*                                         
//IN       DD *                                                 
222222111111333333111111                                       
666666111111888888                                             
666666444444555555222222                                       
111111444444555555777777                                       
333333666666                                                   
//T1       DD DSN=&&T1,DISP=(,PASS),SPACE=(CYL,(1,1),RLSE)     
//T2       DD DSN=&&T2,DISP=(,PASS),SPACE=(CYL,(1,1),RLSE) 
//PASS1OUT DD SYSOUT=*                                         
//PASS2OUT DD SYSOUT=*                                         
//OUT      DD SYSOUT=*                                         
//TOOLIN   DD *                                               
  COPY FROM(IN) USING(CTL1)                                   
  SELECT FROM(T1) FIRST ON(33,6,CH) TO(T2) USING(CTL2)
  SPLICE FROM(T2) TO(OUT) ON(25,8,CH) USING(CTL3) KEEPNODUPS -
  WITHANY WITH(1,6) WITH(7,6) WITH(13,6) WITH(19,6)           
//*                                                           
//CTL1CNTL DD *                                               
  OUTREC OVERLAY=(25:SEQNUM,8,ZD)                             
  OUTFIL FNAMES=(T1,PASS1OUT),                                 
  BUILD=(01,6,18X,25,8,01,06,/,                               
         06X,07,6,12X,25,8,07,06,/,                           
         12X,13,6,06X,25,8,13,06,/,                           
         18X,19,6,25,8,19,06)                                 
//*                                                           
//CTL2CNTL DD *                                               
  OMIT COND=(33,6,CH,EQ,C' ')                                 
  SORT FIELDS=(33,6,CH,A,25,8,CH,A),EQUALS
  OUTFIL FNAMES=(T2,PASS2OUT)                                 
//*                                                           
//CTL3CNTL DD *                                               
  INREC BUILD=(1,32)                                           
  OUTFIL FNAMES=OUT,BUILD=(1,24)                               
//*
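
To check the expected output independently of the job, the "first occurrence stays in its original column" rule can be modelled in Python. This is a hedged sketch of the intended result only, not of how SELECT and SPLICE work; `keep_first_in_place`, `width` and `slots` are hypothetical names, and four 6-byte slots per record are assumed, as in the sample.

```python
def keep_first_in_place(records, width=6, slots=4):
    """Blank out every repeated field, leaving first occurrences in position."""
    seen = set()
    output = []
    for record in records:
        padded = record.ljust(width * slots)
        kept = []
        for i in range(slots):
            field = padded[i * width:(i + 1) * width]
            if field.strip() and field not in seen:
                seen.add(field)              # first time this value appears
                kept.append(field)
            else:
                kept.append(" " * width)     # duplicate or empty slot
        line = "".join(kept)
        if line.strip():                     # drop records left entirely blank
            output.append(line)
    return output


sample = [
    "222222111111333333111111",
    "666666111111888888",
    "666666444444555555222222",
    "111111444444555555777777",
    "333333666666",
]
for line in keep_first_in_place(sample):
    print(line)
```

For the five sample records this yields the four lines shown in the expected output above; the fifth record disappears because both of its fields already occurred earlier.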
All times are GMT + 6 Hours