IBM Mainframe Forum Index
 
 

Syncsort: Eliminating the first occurrence of dup record.


nareshkareti

New User


Joined: 22 Jul 2008
Posts: 33
Location: Chennai

Posted: Thu Feb 12, 2009 3:33 pm

Hi,

Here is my requirement.
I have a file with four columns, and I am sorting on the first three columns.
c1 c2 c3 c4
------------------------
a1 a2 a3 v1
a1 a2 a3 v2
a1 a2 a3 v3

I have to sort on only the first three columns and eliminate the duplicates, but I want the third row (the last duplicate) in my sorted output file:
a1 a2 a3 v3 (Required row)

Please help.
nelson.pandian

Active User


Joined: 09 Apr 2008
Posts: 133
Location: Phoenix, AZ

Posted: Thu Feb 12, 2009 4:16 pm

Hi Naresh Kareti,

The following DFSORT/ICETOOL job will give you the desired output.

Code:
//S1   EXEC  PGM=ICETOOL                   
//TOOLMSG   DD  SYSOUT=*                   
//DFSMSG    DD  SYSOUT=*                   
//IN1 DD *                                 
A1 A2 A3 V1                                 
A1 A2 A3 V2                                 
A1 A2 A3 V3                                 
/*                                         
//OUT DD SYSOUT=*                           
//TOOLIN DD *                               
SELECT FROM(IN1) TO(OUT) ON(1,8,CH) LASTDUP
/*                                               


Output:
Code:
A1 A2 A3 V3


Hope this helps you
Arun Raj

Moderator


Joined: 17 Oct 2006
Posts: 2481
Location: @my desk

Posted: Thu Feb 12, 2009 7:24 pm

Naresh Kareti,

What if your input has records without duplicates? Say something like this.
Code:
A1 A2 A3 V1
A1 A2 A3 V2
A1 A2 A3 V3
A1 A2 A4 V1
A1 A2 A5 V1
nareshkareti

New User


Joined: 22 Jul 2008
Posts: 33
Location: Chennai

Posted: Thu Feb 12, 2009 7:55 pm

Arcvns,

You're right. My file can contain some duplicates and some non-duplicates. My output should always contain the last occurrence of each duplicate.

If a key has 'n' duplicates in the file, I want the n'th record in the output file.

The values in my first post were only an example; we cannot supply the values in-stream in the input DD.

I have to give the input file to SORTIN and want the last occurrence of each duplicate record in SORTOUT. I have to sort on a few columns, and note that all the rows will be at the same level (refer to the example in the first post).

Please let me know if the requirement is not clear.
Arun Raj

Moderator


Joined: 17 Oct 2006
Posts: 2481
Location: @my desk

Posted: Thu Feb 12, 2009 11:41 pm

Naresh,

Post the expected output for the above input data.
nareshkareti

New User


Joined: 22 Jul 2008
Posts: 33
Location: Chennai

Posted: Fri Feb 13, 2009 8:10 am

Arun,

From your example, I want the following in the sorted file:

A1 A2 A3 V3   (last duplicate of the top three rows, as we are sorting on the first three columns)
A1 A2 A4 V1
A1 A2 A5 V1
gcicchet

Senior Member


Joined: 28 Jul 2006
Posts: 1702
Location: Australia

Posted: Fri Feb 13, 2009 8:27 am

Hi,

try
Code:
//S1        EXEC PGM=ICETOOL                                     
//TOOLMSG   DD SYSOUT=*                                           
//DFSMSG    DD SYSOUT=*                                           
//IN1       DD *                                                 
A1 A2 A3 V1                                                       
A1 A2 A3 V2                                                       
A1 A2 A3 V3                                                       
A1 A2 A4 V1                                                       
A1 A2 A5 V1                                                       
/*                                                               
//OUT      DD SYSOUT=*                                           
//TOOLIN   DD *                                                   
SELECT FROM(IN1) TO(OUT) ON(1,8,CH) LAST                         
/*                                                               



Gerry
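
For anyone comparing this with the earlier LASTDUP job: as far as I understand the SELECT operator, LASTDUP keeps the last record only for ON values that occur more than once, while LAST keeps the last record for every ON value, duplicated or not. Below is a minimal sketch that runs both against the same input so the two outputs can be compared. The dataset name YOUR.INPUT.FILE and the DD names OUTDUP and OUTLAST are placeholders of mine, not something from this thread.

```jcl
//S1       EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//* YOUR.INPUT.FILE is a placeholder for a dataset holding the
//* five sample records used in the job above
//IN1      DD DSN=YOUR.INPUT.FILE,DISP=SHR
//OUTDUP   DD SYSOUT=*
//OUTLAST  DD SYSOUT=*
//TOOLIN   DD *
* LASTDUP: last record of each key that has duplicates
SELECT FROM(IN1) TO(OUTDUP) ON(1,8,CH) LASTDUP
* LAST: last record of every key, duplicated or not
SELECT FROM(IN1) TO(OUTLAST) ON(1,8,CH) LAST
/*
```

With the five-record sample, OUTDUP should contain only A1 A2 A3 V3, while OUTLAST should also contain A1 A2 A4 V1 and A1 A2 A5 V1, which is what the requirement asks for.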
nareshkareti

New User


Joined: 22 Jul 2008
Posts: 33
Location: Chennai

Posted: Fri Feb 13, 2009 12:52 pm

Let me explain what I am doing currently.

The job is as follows:
Code:

//STEP01 EXEC SORTD                                       
//SORTIN   DD DSN=naresh.sample.file1,
//              DISP=SHR                                     
//SORTOUT  DD DSN=naresh.sample.file2,           
//            DISP=(NEW,CATLG,DELETE),                     
//            UNIT=SYSDA,                                 
//            SPACE=(CYL,(200,100),RLSE),                 
//            DCB=(RECFM=FB,LRECL=1000,BLKSIZE=0)         
//SYSIN    DD  *                                           
  SORT FIELDS=(1,3,CH,A,4,3,CH,A,7,3,CH,A)           
  SUM FIELDS=NONE                                         
/*                                 

The SORTIN file (naresh.sample.file1) can contain millions of records. Take these, for example:

c1(1-3) c2(4-6) c3(7-9) c4(10-11) c5(12-13)
----------------------------------------------------------------------
.
.
.
123 101 121 23 45
123 101 121 74 14
123 101 121 10 89
.
.
.
------------------------------------------------------------------------

The above job writes the following row to the output file:
123 101 121 23 45


But I want the last row of the duplicates:
123 101 121 10 89

Note: I have to sort on only the first three columns.
Arun Raj

Moderator


Joined: 17 Oct 2006
Posts: 2481
Location: @my desk

Posted: Fri Feb 13, 2009 1:04 pm

Naresh,

For your requirement, you can slightly modify Gerry's card as
Code:
SELECT FROM(IN1) TO(OUT) ON(1,9,CH) LAST
nareshkareti

New User


Joined: 22 Jul 2008
Posts: 33
Location: Chennai

Posted: Fri Feb 13, 2009 4:24 pm

If I want to sort on the 1st, 3rd and 4th fields, how should I specify them in the ON fields?
Arun Raj

Moderator


Joined: 17 Oct 2006
Posts: 2481
Location: @my desk

Posted: Fri Feb 13, 2009 8:50 pm

Naresh Kareti,

Quote:
If I want to sort on the 1st, 3rd and 4th fields, how should I specify them in the ON fields?
The posted solution works for your initial requirement. If this is a change in requirement, you need to post some sample data that reflects the change.

If you're asking just out of curiosity: when the key fields are not contiguous, you can specify multiple ON fields like this.
Code:
SELECT FROM(IN1) TO(OUT) ON(p1,l1,f1) ON(p2,l2,f2) .... LAST
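
As a concrete illustration, assuming the layout from the earlier post (c1 in columns 1-3, c3 in 7-9, c4 in 10-11), the 1st, 3rd and 4th fields could be given as three separate ON fields. This is only a sketch: the three input records are made up for the example and are not from the original file.

```jcl
//S1       EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//IN1      DD *
1231011212345
1239991212399
1231011217414
/*
//OUT      DD SYSOUT=*
//TOOLIN   DD *
* Keys: c1 = 1-3, c3 = 7-9, c4 = 10-11 (c2 in 4-6 is ignored)
SELECT FROM(IN1) TO(OUT) ON(1,3,CH) ON(7,3,CH) ON(10,2,CH) LAST
/*
```

The first two records share the same key even though c2 differs, so only the second of them should be kept, along with the third record.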
nareshkareti

New User


Joined: 22 Jul 2008
Posts: 33
Location: Chennai

Posted: Mon Feb 16, 2009 9:35 am

Thanks everyone, this is working absolutely fine. But can't we achieve the same result using plain 'SORT' techniques?

In my project I may not be able to use ICETOOL.
Arun Raj

Moderator


Joined: 17 Oct 2006
Posts: 2481
Location: @my desk

Posted: Mon Feb 16, 2009 10:32 am

Quote:
In my project I may not be able to use ICETOOL
Is there a business reason behind this? The solution posted above IS a 'SORT technique'. The SYNCTOOL package is shipped with SyncSort, so if you have SyncSort, you have SYNCTOOL as well. BTW, let me remind you that you are running SYNCTOOL and not ICETOOL, even though both names invoke the same module in your shop.

You can use the SyncSort job below for the above requirement. I have assumed FB input with LRECL=80; modify it to match your file attributes.
Code:
//STEP1    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD *
1231011212345
1231011217414
1231011211089 LAST
1231011227414 NO DUPLICATE
1231011237414
1231011231166 LAST
/*
//SORTOUT  DD SYSOUT=*
//SYSIN    DD *
* EQUALS keeps records with equal keys in their input order
  OPTION EQUALS
  SORT FIELDS=(1,9,CH,A)
* NODETAIL drops the data records; TRAILER3 writes one line per
* key section, built from the last record of that section
  OUTFIL REMOVECC,NODETAIL,
         SECTIONS=(1,9,TRAILER3=(1,80))
/*
SORTOUT:
Code:
1231011211089 LAST       
1231011227414 NO DUPLICATE
1231011231166 LAST
nareshkareti

New User


Joined: 22 Jul 2008
Posts: 33
Location: Chennai

Posted: Mon Feb 16, 2009 11:42 am

Hi,

I am actually sorting on the following fields:
SORT FIELDS=(1,7,CH,A,26,18,CH,A,472,18,CH,A)

So how do I specify these fields in SECTIONS()? I am using an LRECL of 1000 for the output file.
Arun Raj

Moderator


Joined: 17 Oct 2006
Posts: 2481
Location: @my desk

Posted: Mon Feb 16, 2009 1:04 pm

Naresh,

As per your latest post, you would need the below SyncSort job to achieve this.
Code:
//STEP1    EXEC PGM=SORT
//SYSOUT   DD SYSOUT=*
//SORTIN   DD DSN= Input  file -- FB/1000
//SORTOUT  DD DSN= Output file -- FB/1000
//SYSIN    DD *
  OPTION EQUALS
  SORT FIELDS=(1,7,CH,A,26,18,CH,A,472,18,CH,A)
  OUTFIL REMOVECC,NODETAIL,
         SECTIONS=(1,7,26,18,472,18,
         TRAILER3=(1,255,256,255,511,255,766,235))
/*

nareshkareti

New User


Joined: 22 Jul 2008
Posts: 33
Location: Chennai

Posted: Mon Feb 16, 2009 2:25 pm

Arun,

Thank you very much, it is working as I expected.
Could you please explain how you arrived at these numbers in the TRAILER3 parameter?
Arun Raj

Moderator


Joined: 17 Oct 2006
Posts: 2481
Location: @my desk

Posted: Mon Feb 16, 2009 2:55 pm

Quote:
Thank you very much, it is working as I expected.
Could you please explain how you arrived at these numbers in the TRAILER3 parameter?
Naresh,

You're welcome. It's very simple :) . The length of a field in the TRAILER3 parameter can only be in the range 1-255, so I just split your LRECL (1000) into pieces of at most 255 bytes.
Go through the SyncSort manual for more details.
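
To spell out the arithmetic, here is the same OUTFIL statement with the split written out. The comment lines (asterisk in column 1) are annotation added for illustration, on the assumption that your sort product accepts comment statements in SYSIN; the statement itself is unchanged from the job above.

```jcl
* LRECL 1000 split into TRAILER3 pieces of at most 255 bytes:
*   start   1, length 255  covers bytes   1-255
*   start 256, length 255  covers bytes 256-510
*   start 511, length 255  covers bytes 511-765
*   start 766, length 235  covers bytes 766-1000
* 255 + 255 + 255 + 235 = 1000
  OUTFIL REMOVECC,NODETAIL,
         SECTIONS=(1,7,26,18,472,18,
         TRAILER3=(1,255,256,255,511,255,766,235))
```

Each start position is the previous start plus the previous length, and the final piece is simply whatever remains of the record.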
nareshkareti

New User


Joined: 22 Jul 2008
Posts: 33
Location: Chennai

Posted: Tue Feb 24, 2009 5:31 pm

I have a small issue again. I used the same logic and it works fine when the input file has some records.

But when the input file is empty, this logic produces one row of all spaces in the sorted output.

Can anyone please tell me the reason for this?
All times are GMT + 6 Hours