IBM Mainframe Forum Index
 
copy and delete duplicate set of records !!


IBM Mainframe Forums -> JCL & VSAM
ramsri

Active User


Joined: 18 Oct 2008
Posts: 380
Location: India

Posted: Thu Mar 03, 2011 9:13 pm

Hi,

We have an input file (LRECL=80, RECFM=FB). It contains sets of records as shown below:

Code:

----+----1----+----2----+----3----+----4
HDR1

HDR2 -- AUTO F1 1234 04
RECORD1
RECORD2
RECORD3
RECORD4
TLR
HDR1

HDR2 -- PETR F2 2345 03
RECORD11
RECORD12
RECORD13
TLR
HDR1

HDR2 -- AUTO F1 1234 04
RECORD1
RECORD2
RECORD3
RECORD4
TLR
HDR1

HDR2 -- SELT F3 1096 02
RECORD21
RECORD22
TLR
HDR1

HDR2 -- AUTO F1 1234 04
RECORD1
RECORD2
RECORD3
RECORD4
TLR


Each set starts with HDR1 and ends with TLR. At present we delete duplicate sets of records manually, after checking the information on the HDR2 record described below.

Taken on their own, the HDR2 records look like this:
Code:

----+----1----+----2----+----3----+----4
HDR2 -- AUTO F1 1234 04
HDR2 -- PETR F2 2345 03
HDR2 -- AUTO F1 1234 04
HDR2 -- SELT F3 1096 02
HDR2 -- AUTO F1 1234 04


If the combination of the four values at positions 9, 14, 17 and 22 appears only once, we consider it a valid set of records. If the same combination appears more than once, we treat the repeats as duplicate sets, delete them manually and retain only one set, which gives the final file supplied to the batch job. The four fields are described below, followed by the expected output after the duplicate sets have been deleted:

Quote:

9th position - name associated with the number at the 17th position
14th position - unique code assigned to the name at the 9th position
17th position - number assigned to the name at the 9th position
22nd position - number of records in each set (excluding HDR1, HDR2 & TLR)
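
For illustration only, the four fields could be pulled out of an HDR2 record roughly as follows. This is a Python sketch, not something from our job; the function name `hdr2_key` is made up, and the field lengths (4, 2, 4 and 2 bytes) are an assumption read off the sample data, since only the starting positions are given above.

```python
def hdr2_key(record: str) -> tuple[str, str, str, str]:
    """Return the four HDR2 fields that identify a set of records."""
    # Positions in the post are 1-based; Python slices are 0-based.
    name = record[8:12]     # positions 9-12  (length 4 is an assumption)
    code = record[13:15]    # positions 14-15 (length 2 is an assumption)
    number = record[16:20]  # positions 17-20 (length 4 is an assumption)
    count = record[21:23]   # positions 22-23 (length 2 is an assumption)
    return name, code, number, count


print(hdr2_key("HDR2 -- AUTO F1 1234 04"))  # ('AUTO', 'F1', '1234', '04')
```

Two sets count as duplicates when this tuple is identical for both.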


Code:

----+----1----+----2----+----3----+----4
HDR1

HDR2 -- AUTO F1 1234 04
RECORD1
RECORD2
RECORD3
RECORD4
TLR
HDR1

HDR2 -- PETR F2 2345 03
RECORD11
RECORD12
RECORD13
TLR
HDR1

HDR2 -- SELT F3 1096 02
RECORD21
RECORD22
TLR


Is it possible to achieve this using the SORT utility? In this example I have shown only one set occurring more than once, but in reality any set may occur more than once. In that case, how do we get rid of the duplicate sets of records?

Please help.

Thanks in advance.
Kjeld

Active User


Joined: 15 Dec 2009
Posts: 365
Location: Denmark

Posted: Thu Mar 03, 2011 10:00 pm

You might be able to do it with a SORT utility, but it will possibly require some exit coding or multiple passes to be performed.

Just write a validation application program that stores the header keys every time it reads one that has not been encountered before, and writes the detail records. If it reads a header that has been read before, it just skips records until the next header is encountered.
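
A minimal sketch of that single-pass idea, in Python purely for illustration (on the mainframe this would more likely be COBOL or REXX). The name `dedupe_sets` is hypothetical, the key field lengths are assumed from the sample data, and the input is assumed to start with an HDR1 record. Because HDR1 arrives before the HDR2 that carries the key, the sketch holds records back until HDR2 has been read.

```python
def dedupe_sets(records):
    """Yield records, dropping every set whose HDR2 key was already seen."""
    seen = set()     # HDR2 keys of the sets written so far
    pending = []     # records held back between HDR1 and HDR2
    decided = False  # True once the HDR2 of the current set has been read
    keep = False     # whether the current set is being written
    for rec in records:
        if rec.startswith("HDR1"):
            pending = [rec]          # new set: decision waits for HDR2
            decided = False
        elif rec.startswith("HDR2") and not decided:
            # Key fields at positions 9, 14, 17, 22 (lengths are assumptions).
            key = (rec[8:12], rec[13:15], rec[16:20], rec[21:23])
            keep = key not in seen
            seen.add(key)
            decided = True
            if keep:
                yield from pending   # release HDR1 and anything after it
                yield rec
            pending = []
        elif not decided:
            pending.append(rec)      # e.g. a blank record before HDR2
        elif keep:
            yield rec                # detail records and TLR of a kept set


sample = [
    "HDR1", "HDR2 -- AUTO F1 1234 01", "RECORD1", "TLR",
    "HDR1", "HDR2 -- PETR F2 2345 01", "RECORD11", "TLR",
    "HDR1", "HDR2 -- AUTO F1 1234 01", "RECORD1", "TLR",
]
for rec in dedupe_sets(sample):
    print(rec)
```

The input is read only once, but the `seen` table grows with the number of distinct header keys.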
Arun Raj

Moderator


Joined: 17 Oct 2006
Posts: 2481
Location: @my desk

Posted: Fri Mar 04, 2011 12:39 am

You can achieve this in a couple of data passes using Syncsort. But the optimum solution, as Kjeld suggested, would be to write some code that reads the input only once.
ramsri

Active User


Joined: 18 Oct 2008
Posts: 380
Location: India

Posted: Fri Mar 04, 2011 9:10 am

Hi Kjeld and Arun,

Thanks for the suggestions. I'll try that approach.
enrico-sorichetti

Superior Member


Joined: 14 Mar 2007
Posts: 10873
Location: italy

Posted: Fri Mar 04, 2011 1:12 pm

something to consider and think over is the REAL record pattern
Quote:
any set may occur more than once
in a near worst case, for example, every block might have one duplicate
200000 blocks ==> 100000 good blocks
that is a pretty large table of keys to keep in memory!

so, after all, the simpler and GENERAL way would be a three-step LOGICAL approach

1) sort in order to make the duplicate blocks contiguous
2) get rid of the duplicates
3) sort again to restore the original sequence!

I said LOGICAL steps, a smart sort might do it in one JCL step
as far as data passes are concerned ...
1 data pass over the input dataset to read it and augment each record with the data needed for the process
after that little can be said... the rest is left to the internals of the sort's data mangling
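
The three logical steps above can be sketched as follows. This is Python and runs entirely in memory, so it only illustrates the logic; it is not what a sort product does internally and it is not DFSORT/Syncsort syntax. The helper names (`split_blocks`, `block_key`, `dedupe_by_sorting`) are made up, and the key field lengths are assumed from the sample data. The block number plays the role of the "augmented" data added in the first pass.

```python
from itertools import groupby


def split_blocks(records):
    """Group records into blocks; a new block starts at each HDR1 record."""
    blocks = []
    for rec in records:
        if rec.startswith("HDR1") or not blocks:
            blocks.append([])
        blocks[-1].append(rec)
    return blocks


def block_key(block):
    """Key fields of the block's HDR2 record (field lengths are assumptions)."""
    for rec in block:
        if rec.startswith("HDR2"):
            return (rec[8:12], rec[13:15], rec[16:20], rec[21:23])
    raise ValueError("block without an HDR2 record")


def dedupe_by_sorting(records):
    # Augment: tag every block with its key and its original block number.
    tagged = [(block_key(b), n, b) for n, b in enumerate(split_blocks(records))]
    # 1) sort so that duplicate blocks become contiguous (earliest block first)
    tagged.sort(key=lambda t: (t[0], t[1]))
    # 2) get rid of the duplicates: keep the first block of each run of equal keys
    firsts = [next(group) for _, group in groupby(tagged, key=lambda t: t[0])]
    # 3) sort again on the block number to restore the original sequence
    firsts.sort(key=lambda t: t[1])
    return [rec for _, _, block in firsts for rec in block]
```

Because the block number is the tie-breaker in step 1, the copy that survives is the first occurrence in the input.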
ramsri

Active User


Joined: 18 Oct 2008
Posts: 380
Location: India

Posted: Fri Mar 04, 2011 4:57 pm

Hi enrico-sorichetti, is it possible to sort sets of records and delete the duplicates? I know SUM FIELDS=NONE would do the trick on individual records, but I have never tried sorting and deleting whole blocks.

Thanks.
ramsri

Active User


Joined: 18 Oct 2008
Posts: 380
Location: India

Posted: Fri Mar 04, 2011 6:54 pm

Kjeld / Arun,

May I request the pseudocode to achieve this? Sorry, I am not getting a clear picture of how it can be achieved.

Thanks.
Kjeld

Active User


Joined: 15 Dec 2009
Posts: 365
Location: Denmark

Posted: Fri Mar 04, 2011 7:04 pm

Kjeld wrote:
You might be able to do it with a SORT utility, but it will possibly require some exit coding or multiple passes to be performed.

Just write a validation application program that stores the header keys every time it reads one that has not been encountered before, and writes the detail records. If it reads a header that has been read before, it just skips records until the next header is encountered.

I would think my answer above is sufficient for structuring a program. Providing code would imply that I get paid to do your job...
All times are GMT + 6 Hours