IBM Mainframe Forum Index
 
Log In
 
IBM Mainframe Forum Index Mainframe: Search IBM Mainframe Forum: FAQ Register
 

Sort,remove & capture dup recs having header/Trailer in rec


IBM Mainframe Forums -> DFSORT/ICETOOL
Post new topic   Reply to topic
View previous topic :: View next topic  
Author Message
subratarec

Active User


Joined: 22 Dec 2007
Posts: 126
Location: Bangalore

PostPosted: Tue Apr 28, 2015 9:28 pm
Reply with quote

Hi,

Need info/suggestions on below mentioned problem..

My input file format
---------------------------------------------------------------------------
Code:
HEARDER1
01 12345 A  ADF12345  ASDFGHJKGL
01 23456 A  ASF12097  HJGKUTIOTP
01 12345 A  WER50345 HJTYERWOPP
01 78906 B  KLI09876   LOPIOOOOUP
01 78906 B  OUT11876  LOYRETFDOUP
TRAILER1
TRAILER2


By using ICETOOL/SORT can we get below result?

My output file format
----------------------------------------------------------------------------
Code:
HEARDER1
01 12345 A  ADF12345  ASDFGHJKGL
01 23456 A  ASF12097  HJGKUTIOTP
01 78906 B  KLI09876   LOPIOOOOUP
TRAILER1
TRAILER2


Means HEADER and TRAILER records will be copied to the output file as it was in input file. duplicates will be checked on below conditions...
1. 4 to 8 (5 places)& 10 (1 place)& 18 to 20 (3 places)

Duplicates will be captured in another DS

I know datasort works when we don't want to touch HEADER and TRAILER but detail records will be sorted out and ICETOOL SORTXSUM is the one through which we can capture all the duplicate values...

Now can these 2 be merged together to get the desired result? if anyone can give me any hint or share his/her thought it would be great..

Thanks
Subrata
Back to top
View user's profile Send private message
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Tue Apr 28, 2015 10:30 pm
Reply with quote

Have a look at ICETOOL's SELECT operator and the DFSORT Smart Trick "Keep dropped duplicate records (XSUM)".
Back to top
View user's profile Send private message
subratarec

Active User


Joined: 22 Dec 2007
Posts: 126
Location: Bangalore

PostPosted: Tue Apr 28, 2015 10:43 pm
Reply with quote

Hi Bill..

Thanks for your reply.. Yes I have already used
Code:
SELECT FROM(IN) TO(OUT) ON(CONDITION 1) ON(CONDITION2) ON(CONDITION 3) FIRST DISCARD(SORTXSUM)


Now the problem I am getting with this is with both the HEADER and TRAILERs.. they are also getting sorted comes in somewhere middle of the sorted details rec.

Now I know if I use below code then it would work fine with HEADER and TRAILER recs (but then how will I discard those duplicates) but can both these 2 be used together (means SELECT..... and DATASORT)?

Code:
DATASORT FROM(IN) TO(OUT) HEADER TRAILER(2) USING(CTL1)
Back to top
View user's profile Send private message
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Tue Apr 28, 2015 11:36 pm
Reply with quote

Use the SELECT. In INREC, you need to set a key which contains X'00' for the header, X'FF' for the trailer and X'10' for the data-records. Make that the major part of the key. It will not affect the SORT of the data, but the header and trailer will remain the first and last records.
Back to top
View user's profile Send private message
RahulG31

Active User


Joined: 20 Dec 2014
Posts: 446
Location: USA

PostPosted: Wed Apr 29, 2015 7:44 am
Reply with quote

May I know who deleted my post from this thread and why?
Back to top
View user's profile Send private message
subratarec

Active User


Joined: 22 Dec 2007
Posts: 126
Location: Bangalore

PostPosted: Wed Apr 29, 2015 8:51 am
Reply with quote

Hi Rahul,

I am unaware of who has deleted your post. But if possible u can try to repost the same.
Back to top
View user's profile Send private message
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Wed Apr 29, 2015 2:58 pm
Reply with quote

Rahul, guess who. Guess why. If still unclear, contact me by PM please.

subratarec,

The process was something like:

Copy the file once preserving the header (first record)

Copy the file again preserving the trailer (last two records)

Copy the file again ignoring the header and trailer

SELECT that last file to do the de-deplication, keeping the dropped records

Copy the header to an output file

Copy the sorted file to the same output file (JCL not shown but must be DISP=MOD)

Copy the trailer to the same output file

Copy the entire file again, for some reason

If you're happy with that, and it's only an e-mail, so it's not like it does a lot... until someone remembers it exists, and copies the code to do something else. And that gets repeated. And repeated.

Then your client has, scattered througout their system, this code for files will 10s or 100s of millions of records, day in, day out.
Back to top
View user's profile Send private message
RahulG31

Active User


Joined: 20 Dec 2014
Posts: 446
Location: USA

PostPosted: Wed Apr 29, 2015 8:15 pm
Reply with quote

@Bill: You are exaggerating the things.

1. There were 5 steps and NOT 8 that you mentioned above.

2. The files used were the temporary files and only the last file is output file.

3. I used SUBSET, that makes the person aware of it after reading. SUBSET can also be used in Header Trailer record. The person reading may use it later for some other issues that they may face. I believe, by deleting the post, you are robbing the people to have an alternate approach.


Now, as per your solution, please elaborate how do you plan to identify Header and Trailer records and put some High values and Low values in the key?

I don't see the questioner has mentioned anywhere on how to identify Header and Trailer

I would definitely like to see how this can be achieved with a lot less effort
.
Back to top
View user's profile Send private message
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Wed Apr 29, 2015 8:33 pm
Reply with quote

Here's the data sample which was provided. I feel it reasonable to be able to identify the header and trailers from this:

Code:
HEARDER1
01 12345 A  ADF12345  ASDFGHJKGL
01 23456 A  ASF12097  HJGKUTIOTP
01 12345 A  WER50345 HJTYERWOPP
01 78906 B  KLI09876   LOPIOOOOUP
01 78906 B  OUT11876  LOYRETFDOUP
TRAILER1
TRAILER2
Back to top
View user's profile Send private message
RahulG31

Active User


Joined: 20 Dec 2014
Posts: 446
Location: USA

PostPosted: Wed Apr 29, 2015 9:08 pm
Reply with quote

I think it's better if subratarec can provide the solution he has used.

@subratarec: Could you please let me know how did you achieve the result?
.
Back to top
View user's profile Send private message
subratarec

Active User


Joined: 22 Dec 2007
Posts: 126
Location: Bangalore

PostPosted: Thu Apr 30, 2015 10:34 am
Reply with quote

Hi All,

Frankly I still haven't achieved the solution yet. I had gone through what Bill had posted and now I saw ur post. After reading both the post I am thinking how should go ahead and achieve the solution.. The same data I had posted it was not the exact data from our Prod file.. for Ref I will try to post the snapshot of exact Prod file into this blog.. so then it would be easy for us to find the correct way..

@RahulG31: If you have faced this kind of scenario in the past (may be not exactly the same but far similar to it) then plz let me know what was your approach or hint with some code snippet..

Thanks
Subrata
Back to top
View user's profile Send private message
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Thu Apr 30, 2015 12:53 pm
Reply with quote

Code:
  INREC IFTHEN=(WHEN=(identifyheader),
                      OVERLAY=(position:X'00')),
          IFTHEN=(WHEN=(identifytrailer),
                      OVERLAY=(position:X'FF')),
          IFTHEN=(WHEN=NONE,
                      OVERLAY=(position:X'10'))
  SORT FIELDS=(onebytefieldstartingatposition,A,key,A)
  OUTFIL FNAMES=yourname,
          BUILD=(originalrecorddatattodropextension)


The detail depends on whether your data is fixed-length (extend at the end of each record) or variable-length (extend at the beginning, using WHEN=INIT to make space for the extension whilst preserving the variable nature of the record, ie BUILD=(1,4,X,5) where X is a blank, and can be preceded by the number of bytes you want (you only want one byte, so only need the X).

Similar change to "undo" the extension when it is finished with, in OUTFIL.

This all goes in the CNTL file for the SELECT. The SELECT operates normally, but you can do, in addition, INCLUDE, INREC, SORT, SUM, OUTFIL. Not OUTREC, nor MODS, as that would mess with the way things work.
Back to top
View user's profile Send private message
RahulG31

Active User


Joined: 20 Dec 2014
Posts: 446
Location: USA

PostPosted: Thu Apr 30, 2015 8:58 pm
Reply with quote

@subratarec: If you haven't tried yet, below is what it should look like as per me (considering a fixed length of 34 bytes and key as 1-10). Keep in mind that SORTXSUM will contain the sequence number if you let SORT do the LRECL. So, you may want to code the DCB in JCL or you may want to remove the seq num in additional step/CNTL:
Code:
//TOOLIN DD *                                                           
 SELECT FROM(IN1) TO(T1) ON(1,10,CH) FIRST DISCARD(SORTXSUM) USING(CTL1)
 COPY FROM(T1) TO(OUT1) USING(CTL2)                                     
/*                                                                     
//CTL1CNTL DD *                                                         
 INREC IFTHEN=(WHEN=INIT,OVERLAY=(35:SEQNUM,8,ZD))                     
/*                                                                     
//CTL2CNTL DD *                                                         
 SORT FIELDS=(35,8,CH,A)                                               
 OUTREC BUILD=(1,34)                                                   
/*                                                                     
Back to top
View user's profile Send private message
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Thu Apr 30, 2015 9:51 pm
Reply with quote

There is no strong indication that data needs to be returned to the original sequence. So ask, rather than doing extra. At the moment you have two SORTs, and two passes of the data, whereas that may not be needed. You're also increasing the size of the data by nearly a quarter to cater for this.

Using LRECL in the JCL to cut down records gives you two places to look when something needs change, and to have them at different values will cause confusion. Not good advice, and neither is another step to drop the data.

You use the COPY operator, and then use SORT in the CNTL file. If you are going to SORT, use the SORT operator.
Back to top
View user's profile Send private message
subratarec

Active User


Joined: 22 Dec 2007
Posts: 126
Location: Bangalore

PostPosted: Thu Apr 30, 2015 10:36 pm
Reply with quote

Hi All,

Thanks for all the efforts. I really appreciate your help.. As I had told that I will provide the excerpt of my prod data.. here it is..

Code:
01  82244  B           1-2M056         20        0        0        0
01  82244  B           2-4M062         19        0        0        0
01  85481  A        7.5-8.5040        177        0        0        0
01  85481  A           9-10042        339        0        0        0
01  85481  A        10-11.5044        221        0        0        0
01  83560  C           8-9Y134         75        0        0       75
01  83560  C          9-10Y140         57        0        0       57
01  83560  C         10-11Y146         57        0        0       57
01  83560  C         11-12Y152         79        0        0       79
01  83560  C         12-13Y158         53        0        0       53
99 58725                           
00US12015042910486722457142 0     


I know you all will say.. that is where is the Header rec. Actually for specifically for this file (for which we are facing issue in prod) doesn't have header data in them (it has 2 trailer recs) BUT I wanted to get an idea how the approach should be when Header will also be there.. cause many of the DSs I can see has both Header and Trailer recs..

I am going all the posts provided by you both.. and I will try them in my test env this above prod data excerpt will give you a clear pics...here the first trailer starts with HIGH value 99 and the last with Low vale 00.

Thanks
Subrata
Back to top
View user's profile Send private message
subratarec

Active User


Joined: 22 Dec 2007
Posts: 126
Location: Bangalore

PostPosted: Thu Apr 30, 2015 10:38 pm
Reply with quote

Sorry I forgot to add one more thing.. that is the duplicate rec search criteria.

from 5 to 9 (5 digits)
from 12 to 12 (1 digit)
from 28 to 30 (3 digits)
Back to top
View user's profile Send private message
RahulG31

Active User


Joined: 20 Dec 2014
Posts: 446
Location: USA

PostPosted: Thu Apr 30, 2015 10:53 pm
Reply with quote

@subratarec: I think you have more than enough to proceed. I'll be out for a couple of days. Good luck with your testing.
Back to top
View user's profile Send private message
subratarec

Active User


Joined: 22 Dec 2007
Posts: 126
Location: Bangalore

PostPosted: Thu Apr 30, 2015 10:59 pm
Reply with quote

Hi Rahul,

Yep Thanks!! I will check and test. Will let you both know what is the result.

Thanks
Subrata
Back to top
View user's profile Send private message
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Thu Apr 30, 2015 11:40 pm
Reply with quote

Can you provide expected output for your latest sample, please? That will answer the question of what order you want the data in.
Back to top
View user's profile Send private message
subratarec

Active User


Joined: 22 Dec 2007
Posts: 126
Location: Bangalore

PostPosted: Sat May 02, 2015 11:57 pm
Reply with quote

Hi Bill,

Actually I followed the approach being described by you and Rahul. What I did is as follows..

Code:
//*-----------------------------------------------------------       
//*  COPYING ONLY DATA FROM INPUT FILE                               
//*-----------------------------------------------------------       
//SORT1    EXEC PGM=ICETOOL                                         
//TOOLMSG  DD  SYSOUT=*                                             
//DFSMSG   DD  SYSOUT=*                                             
//SORTIN   DD  DSN=<Main input file with both Trailer rec & duplicates>,                     
//             DISP=SHR                                             
//SORTOUT  DD  DSN=&&TEMP1, <data with dup recs no trailer>                                   
//             DISP=(NEW,KEEP),                                     
//             LIKE=<Main input file>                 
//SYSOUT   DD  SYSOUT=*                                             
//TOOLIN   DD *                                                     
  SUBSET FROM(SORTIN) TO(SORTOUT) INPUT REMOVE TRAILER(2) USING(CTL1)
//CTL1CNTL DD *                                                     
  SORT FIELDS=COPY                                                   
/*                                                                   
//*-----------------------------------------------------------     
//*  COPYING ONLY TRAILER REC FROM INPUT FILE                     
//*-----------------------------------------------------------     
//SORT2    EXEC PGM=ICETOOL                                       
//TOOLMSG  DD  SYSOUT=*                                           
//DFSMSG   DD  SYSOUT=*                                           
//SORTIN   DD  DSN=<Main input file with both Trailer rec & duplicates>,                   
//             DISP=SHR                                           
//SORTOUT  DD  DSN=&&TEMP2,   <only Trailer recs>                                     
//             DISP=(NEW,KEEP)                                     
//SYSOUT   DD  SYSOUT=*                                           
//TOOLIN   DD *                                                   
  SUBSET FROM(SORTIN) TO(SORTOUT) INPUT KEEP TRAILER(2)           
/*                                                                 
//*-----------------------------------------------------------       
//*  REMOVE DUPLICATE RECS,CREATE NEW O/P FILE AND KEEP DUPLICATE     
//*  TO ANOTHER FILE                                                 
//*-----------------------------------------------------------       
//SORT3    EXEC PGM=ICETOOL                                           
//TOOLMSG  DD  SYSOUT=*                                               
//DFSMSG   DD  SYSOUT=*                                               
//SORTIN   DD  DSN=&&TEMP1, <file dup recs main data. no trailer rec>                                         
//             DISP=SHR                                               
//SORTOUT  DD  DSN=&&TEMP3, <file non dup recs. only distinct data>                                     
//             DISP=(NEW,KEEP),                                       
//             LIKE=HXSUCHK.Q.HWW080.TRANS.INPUT                     
//SORTXSUM DD  DSN=HXSUCHK.STCKRECN.OUTPUT.DUP, <dup recs>                       
//             DISP=OLD                                               
//SYSOUT   DD  SYSOUT=*                                               
//TOOLIN   DD *                                                       
  SELECT FROM(SORTIN) TO(SORTOUT) ON(5,5,CH) ON(12,1,CH) ON(28,3,CH)-
  FIRST DISCARD(SORTXSUM)                                             
/*                                                                   
//*----------------------------------------------------------- 
//*  WILL APPEND TRAILER TO PURIFIED DATA AND WILL CREATE NEW   
//*  O/P FILE                                                   
//*----------------------------------------------------------- 
//MERGE    EXEC PGM=SORT                                       
//SORTIN   DD  DSN=&&TEMP3, <file non dup recs. only distinct data>                                 
//             DISP=SHR                                         
//         DD  DSN=&&TEMP2,   <only Trailer recs>                                 
//             DISP=SHR                                         
//SORTOUT  DD  DSN=HXSUCHK.STCKRECN.OUTPUT,  <pure data with trailer recs>                   
//             DISP=OLD                                         
//SYSOUT   DD  SYSOUT=*                                         
//SYSIN    DD *                                                 
  SORT FIELDS=COPY                                             
/*                             


From my end I have checked and it looks ok to me. But I need to send both the files for further checkup. Plz check and let me know if this can be done in a much better way (may be within less number of steps).

Thanks
Subrata
Back to top
View user's profile Send private message
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Sun May 03, 2015 5:52 am
Reply with quote

When I described that approach, I was attempting an approximation to RahulG31's approach from memory. I was not recommending that, and I wouldn't (reason I deleted it).

If you look back a few posts, you'll find my suggestion for the CNTL file for the SELECT. This will get you your results, assuming that the order you want the output in is key order, not the original order with duplicates removed.
Back to top
View user's profile Send private message
RahulG31

Active User


Joined: 20 Dec 2014
Posts: 446
Location: USA

PostPosted: Mon May 04, 2015 10:41 pm
Reply with quote

@subratarec: Please read the posts which were 'Posted: Thu Apr 30, 2015'
Back to top
View user's profile Send private message
View previous topic :: :: View next topic  
Post new topic   Reply to topic View Bookmarks
All times are GMT + 6 Hours
Forum Index -> DFSORT/ICETOOL

 


Similar Topics
Topic Forum Replies
No new posts How to split large record length file... DFSORT/ICETOOL 10
No new posts Sortjoin and Search for a String and ... DFSORT/ICETOOL 1
No new posts JCL sort card - get first day and las... JCL & VSAM 9
No new posts Sort First/last record of a subset th... DFSORT/ICETOOL 7
No new posts how to calculate SUM value for VB fil... DFSORT/ICETOOL 1
Search our Forums:

Back to Top