IBM Mainframe Forum Index
 
Log In
 
IBM Mainframe Forum Index Mainframe: Search IBM Mainframe Forum: FAQ Register
 

Remove Duplicate set of records


IBM Mainframe Forums -> DFSORT/ICETOOL
Post new topic   Reply to topic
View previous topic :: View next topic  
Author Message
srinut123

New User


Joined: 11 Oct 2005
Posts: 62
Location: India

PostPosted: Wed Jun 01, 2011 4:48 pm
Reply with quote

I've a file with sample records as show below.. The data is in batch format.. with first record being the batch header and last record being the batch trailer. Each batch can have multiple invoices. Inv hdr starts with '1' and inv line starts with '2'. Is there any way to remove the duplicate invoices(inlcuding its line items)? In my example Inv# 111 details repeated twice. I wanted them to be removed so that I wont face duplicate inv error while loading them.

Code:

BATCH HDR - BATCH DATE
1 - INV HDR INV# 111
2 - INV LINE1
2 - INV LINE2
1 - INV HDR INV# 112
2 - INV LINE1
2 - INV LINE2
1 - INV HDR INV# 111
2 - INV LINE1
2 - INV LINE2
1 - INV HDR INV# 113
2 - INV LINE1
2 - INV LINE2
BATCH TRA - REC CNT#
Back to top
View user's profile Send private message
sqlcode1

Active Member


Joined: 08 Apr 2010
Posts: 577
Location: USA

PostPosted: Wed Jun 01, 2011 7:15 pm
Reply with quote

srinut123,
Is there a limit to maximum number of detail records (i.e. inv line starts with '2')? Also please provide LRECL and RECFM for the input file and expected output as well.

Out of curiosity, you mentioned you wanted to remove duplicates, so that you won't face any problem loading them? Are you loading DB2 table?

Thanks,
Back to top
View user's profile Send private message
Skolusu

Senior Member


Joined: 07 Dec 2007
Posts: 2205
Location: San Jose

PostPosted: Wed Jun 01, 2011 10:09 pm
Reply with quote

srinut123,

Use the following DFSORT JCL which will give you the desired results. I assumed that your input lrecl is FB recfm and LRECL=80

Code:

//STEP0100 EXEC PGM=SORT                                           
//SYSOUT   DD SYSOUT=*                                             
//SORTIN   DD *                                                   
BATCH HDR - BATCH DATE                                             
----+----1----+----2----+----3----+----4----+----5----+----6----+--
1 - INV HDR INV# 111                                               
2 - INV LINE1                                                     
2 - INV LINE2                                                     
1 - INV HDR INV# 112                                               
2 - INV LINE1                                                     
2 - INV LINE2                                                     
1 - INV HDR INV# 111                                               
2 - INV LINE1                                                     
2 - INV LINE2                                                     
1 - INV HDR INV# 113                                               
2 - INV LINE1                                                     
2 - INV LINE2                                                     
BATCH TRA - REC CNT#                                               
//SORTOUT  DD SYSOUT=*                                             
//SYSIN    DD *                                                   
  INREC IFTHEN=(WHEN=INIT,OVERLAY=(81:C'D')),                     
  IFTHEN=(WHEN=GROUP,BEGIN=(1,1,CH,EQ,C'1'),PUSH=(82:18,3,ID=4)), 
  IFTHEN=(WHEN=(7,3,CH,EQ,C'HDR'),OVERLAY=(81:C'A')),             
  IFTHEN=(WHEN=(7,3,CH,EQ,C'TRA'),OVERLAY=(81:C'T'))   
           
  SORT FIELDS=(81,4,CH,A),EQUALS   
                               
  OUTREC IFTHEN=(WHEN=GROUP,KEYBEGIN=(81,4),PUSH=(89:85,4))       
  OUTFIL BUILD=(1,80),INCLUDE=(85,4,ZD,EQ,89,4,ZD)                 
//*
Back to top
View user's profile Send private message
sqlcode1

Active Member


Joined: 08 Apr 2010
Posts: 577
Location: USA

PostPosted: Wed Jun 01, 2011 11:12 pm
Reply with quote

Skolusu,
May be I have misunderstood the requirement but based on below comment from OP, I think he wanted to check of duplicate set of records
Quote:
Is there any way to remove the duplicate invoices(inlcuding its line items)?

For example, If I change input records as below and re-run the same job, it still produces the same output even though entire "recordset" wasn't a duplicate.The job,in turn would remove second set of records with header '111'.

Like I said, may I have misunderstood the requirement but I thought OP wanted to compare line items as well.

Code:
BATCH HDR - BATCH DATE
1 - INV HDR INV# 111   
2 - INV LINE1         
2 - INV LINE2         
1 - INV HDR INV# 112   
2 - INV LINE1         
2 - INV LINE2         
1 - INV HDR INV# 111   
2 - INV LINE1         
2 - INV LINE3      --> LINE2 changed to LINE3.   
1 - INV HDR INV# 113   
2 - INV LINE1         
2 - INV LINE2         
BATCH TRA - REC CNT#   

OUTPUT with changed input data
Code:
BATCH HDR - BATCH DATE
1 - INV HDR INV# 111 
2 - INV LINE1         
2 - INV LINE2         
1 - INV HDR INV# 112 
2 - INV LINE1         
2 - INV LINE2         
1 - INV HDR INV# 113 
2 - INV LINE1         
2 - INV LINE2         
BATCH TRA - REC CNT# 

Thanks,
Back to top
View user's profile Send private message
Skolusu

Senior Member


Joined: 07 Dec 2007
Posts: 2205
Location: San Jose

PostPosted: Thu Jun 02, 2011 1:21 am
Reply with quote

Sqlcode1,

I think OP wanted to remove set of records where the header has the invoice number and detail records does not have the invoice number.

what would be the output from the sample you shown?

Code:


1 - INV HDR INV# 111   
2 - INV LINE1         
2 - INV LINE2         

1 - INV HDR INV# 111   
2 - INV LINE1         
2 - INV LINE3      --> LINE2 changed to LINE3.   


Do you want to retain this group entirely as all the records under this group are not a 1 to 1 match with the earlier group? and the output will be a copy of input to output as is?

Another way of eliminating the duplicates is in the second set of records the header and the first detail record is a perfect match with the earlier group and then you delete them and move the mismatched detail record to prior group like this?

Code:


1 - INV HDR INV# 111   
2 - INV LINE1         
2 - INV LINE2         
2 - INV LINE3      --> LINE2 changed to LINE3.   


or do you want to leave the not match record as an orphan without the header?

You open a can of worms if you really want to do a 1 to 1 match within a group.

Code:

1 - INV HDR INV# 111   
2 - INV LINE1         
2 - INV LINE2       
2 - INV LINE3

       
1 - INV HDR INV# 111   
2 - INV LINE3         
2 - INV LINE1       
2 - INV LINE2


Now do you consider the above as a duplicate set? Under each header does the sequence of detail records matter? If it matters then you don't have a duplicate at all.
Back to top
View user's profile Send private message
sqlcode1

Active Member


Joined: 08 Apr 2010
Posts: 577
Location: USA

PostPosted: Thu Jun 02, 2011 8:01 pm
Reply with quote

Kolusu,
I should have originally provided expected output based on my understanding of the requirement.

Yes, from sample input I provided, I was thinking all the records needed to be retained as atleast one of the detail records didn't match. I afraid to say but yes I thought OP was asking about 1 to 1 match. I know, if this is what OP wanted, then its going to be tricky and that's why I asked OP about max. number of detail records.

Again,this is my understanding of the requirement and I could be wrong.

Thanks,
Back to top
View user's profile Send private message
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Thu Jun 02, 2011 8:15 pm
Reply with quote

You get given an invoice by a car-repair company. It has a number on it, which will be unique for that company. Their copy of the invoice will be needed to produce their accounts/book-keeping and for their tax purposes (potentially many other things as well, stock for instance, blah, blah).

There can be more than one "thing" on the invoice. Like replacement parts, labour. Each "thing" will be one "line" on the invoice.

So, if there is a file with duplicate invoices, it should have all the lines matching the original invoice.

So, to this point, the requirment is the option as understood by Kolusu.

What if the invoice numbers are not unique? Data-entry error? Or the fact that a three-digit invoice doesn't give a lot of room for continuing uniqueness. Or maybe data-entry error with the "lines"?

TS hasn't asked for any of this to be covered in any way, so matching the invoice number and removing everything relating to the "duplicate" if it exists, is what he says he wants. Still with Kolusu's understanding.
Back to top
View user's profile Send private message
View previous topic :: :: View next topic  
Post new topic   Reply to topic View Bookmarks
All times are GMT + 6 Hours
Forum Index -> DFSORT/ICETOOL

 


Similar Topics
Topic Forum Replies
No new posts Duplicate transid's declared using CEDA CICS 3
No new posts Sortjoin and Search for a String and ... DFSORT/ICETOOL 1
No new posts Compare only first records of the fil... SYNCSORT 7
No new posts Pulling a fixed number of records fro... DB2 2
No new posts Remove leading zeroes SYNCSORT 4
Search our Forums:

Back to Top