View previous topic :: View next topic
|
Author |
Message |
subratarec
Active User
Joined: 22 Dec 2007 Posts: 126 Location: Bangalore
|
|
|
|
Hi,
Need info/suggestions on below mentioned problem..
My input file format
---------------------------------------------------------------------------
Code: |
HEARDER1
01 12345 A ADF12345 ASDFGHJKGL
01 23456 A ASF12097 HJGKUTIOTP
01 12345 A WER50345 HJTYERWOPP
01 78906 B KLI09876 LOPIOOOOUP
01 78906 B OUT11876 LOYRETFDOUP
TRAILER1
TRAILER2 |
By using ICETOOL/SORT can we get below result?
My output file format
----------------------------------------------------------------------------
Code: |
HEARDER1
01 12345 A ADF12345 ASDFGHJKGL
01 23456 A ASF12097 HJGKUTIOTP
01 78906 B KLI09876 LOPIOOOOUP
TRAILER1
TRAILER2 |
Means HEADER and TRAILER records will be copied to the output file as it was in input file. duplicates will be checked on below conditions...
1. 4 to 8 (5 places)& 10 (1 place)& 18 to 20 (3 places)
Duplicates will be captured in another DS
I know datasort works when we don't want to touch HEADER and TRAILER but detail records will be sorted out and ICETOOL SORTXSUM is the one through which we can capture all the duplicate values...
Now can these 2 be merged together to get the desired result? if anyone can give me any hint or share his/her thought it would be great..
Thanks
Subrata |
|
Back to top |
|
|
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
Have a look at ICETOOL's SELECT operator and the DFSORT Smart Trick "Keep dropped duplicate records (XSUM)". |
|
Back to top |
|
|
subratarec
Active User
Joined: 22 Dec 2007 Posts: 126 Location: Bangalore
|
|
|
|
Hi Bill..
Thanks for your reply.. Yes I have already used
Code: |
SELECT FROM(IN) TO(OUT) ON(CONDITION 1) ON(CONDITION2) ON(CONDITION 3) FIRST DISCARD(SORTXSUM) |
Now the problem I am getting with this is with both the HEADER and TRAILERs.. they are also getting sorted comes in somewhere middle of the sorted details rec.
Now I know if I use below code then it would work fine with HEADER and TRAILER recs (but then how will I discard those duplicates) but can both these 2 be used together (means SELECT..... and DATASORT)?
Code: |
DATASORT FROM(IN) TO(OUT) HEADER TRAILER(2) USING(CTL1) |
|
|
Back to top |
|
|
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
Use the SELECT. In INREC, you need to set a key which contains X'00' for the header, X'FF' for the trailer and X'10' for the data-records. Make that the major part of the key. It will not affect the SORT of the data, but the header and trailer will remain the first and last records. |
|
Back to top |
|
|
RahulG31
Active User
Joined: 20 Dec 2014 Posts: 446 Location: USA
|
|
|
|
May I know who deleted my post from this thread and why? |
|
Back to top |
|
|
subratarec
Active User
Joined: 22 Dec 2007 Posts: 126 Location: Bangalore
|
|
|
|
Hi Rahul,
I am unaware of who has deleted your post. But if possible u can try to repost the same. |
|
Back to top |
|
|
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
Rahul, guess who. Guess why. If still unclear, contact me by PM please.
subratarec,
The process was something like:
Copy the file once preserving the header (first record)
Copy the file again preserving the trailer (last two records)
Copy the file again ignoring the header and trailer
SELECT that last file to do the de-deplication, keeping the dropped records
Copy the header to an output file
Copy the sorted file to the same output file (JCL not shown but must be DISP=MOD)
Copy the trailer to the same output file
Copy the entire file again, for some reason
If you're happy with that, and it's only an e-mail, so it's not like it does a lot... until someone remembers it exists, and copies the code to do something else. And that gets repeated. And repeated.
Then your client has, scattered througout their system, this code for files will 10s or 100s of millions of records, day in, day out. |
|
Back to top |
|
|
RahulG31
Active User
Joined: 20 Dec 2014 Posts: 446 Location: USA
|
|
|
|
@Bill: You are exaggerating the things.
1. There were 5 steps and NOT 8 that you mentioned above.
2. The files used were the temporary files and only the last file is output file.
3. I used SUBSET, that makes the person aware of it after reading. SUBSET can also be used in Header Trailer record. The person reading may use it later for some other issues that they may face. I believe, by deleting the post, you are robbing the people to have an alternate approach.
Now, as per your solution, please elaborate how do you plan to identify Header and Trailer records and put some High values and Low values in the key?
I don't see the questioner has mentioned anywhere on how to identify Header and Trailer
I would definitely like to see how this can be achieved with a lot less effort
. |
|
Back to top |
|
|
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
Here's the data sample which was provided. I feel it reasonable to be able to identify the header and trailers from this:
Code: |
HEARDER1
01 12345 A ADF12345 ASDFGHJKGL
01 23456 A ASF12097 HJGKUTIOTP
01 12345 A WER50345 HJTYERWOPP
01 78906 B KLI09876 LOPIOOOOUP
01 78906 B OUT11876 LOYRETFDOUP
TRAILER1
TRAILER2 |
|
|
Back to top |
|
|
RahulG31
Active User
Joined: 20 Dec 2014 Posts: 446 Location: USA
|
|
|
|
I think it's better if subratarec can provide the solution he has used.
@subratarec: Could you please let me know how did you achieve the result?
. |
|
Back to top |
|
|
subratarec
Active User
Joined: 22 Dec 2007 Posts: 126 Location: Bangalore
|
|
|
|
Hi All,
Frankly I still haven't achieved the solution yet. I had gone through what Bill had posted and now I saw ur post. After reading both the post I am thinking how should go ahead and achieve the solution.. The same data I had posted it was not the exact data from our Prod file.. for Ref I will try to post the snapshot of exact Prod file into this blog.. so then it would be easy for us to find the correct way..
@RahulG31: If you have faced this kind of scenario in the past (may be not exactly the same but far similar to it) then plz let me know what was your approach or hint with some code snippet..
Thanks
Subrata |
|
Back to top |
|
|
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
Code: |
INREC IFTHEN=(WHEN=(identifyheader),
OVERLAY=(position:X'00')),
IFTHEN=(WHEN=(identifytrailer),
OVERLAY=(position:X'FF')),
IFTHEN=(WHEN=NONE,
OVERLAY=(position:X'10'))
SORT FIELDS=(onebytefieldstartingatposition,A,key,A)
OUTFIL FNAMES=yourname,
BUILD=(originalrecorddatattodropextension) |
The detail depends on whether your data is fixed-length (extend at the end of each record) or variable-length (extend at the beginning, using WHEN=INIT to make space for the extension whilst preserving the variable nature of the record, ie BUILD=(1,4,X,5) where X is a blank, and can be preceded by the number of bytes you want (you only want one byte, so only need the X).
Similar change to "undo" the extension when it is finished with, in OUTFIL.
This all goes in the CNTL file for the SELECT. The SELECT operates normally, but you can do, in addition, INCLUDE, INREC, SORT, SUM, OUTFIL. Not OUTREC, nor MODS, as that would mess with the way things work. |
|
Back to top |
|
|
RahulG31
Active User
Joined: 20 Dec 2014 Posts: 446 Location: USA
|
|
|
|
@subratarec: If you haven't tried yet, below is what it should look like as per me (considering a fixed length of 34 bytes and key as 1-10). Keep in mind that SORTXSUM will contain the sequence number if you let SORT do the LRECL. So, you may want to code the DCB in JCL or you may want to remove the seq num in additional step/CNTL:
Code: |
//TOOLIN DD *
SELECT FROM(IN1) TO(T1) ON(1,10,CH) FIRST DISCARD(SORTXSUM) USING(CTL1)
COPY FROM(T1) TO(OUT1) USING(CTL2)
/*
//CTL1CNTL DD *
INREC IFTHEN=(WHEN=INIT,OVERLAY=(35:SEQNUM,8,ZD))
/*
//CTL2CNTL DD *
SORT FIELDS=(35,8,CH,A)
OUTREC BUILD=(1,34)
/* |
|
|
Back to top |
|
|
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
There is no strong indication that data needs to be returned to the original sequence. So ask, rather than doing extra. At the moment you have two SORTs, and two passes of the data, whereas that may not be needed. You're also increasing the size of the data by nearly a quarter to cater for this.
Using LRECL in the JCL to cut down records gives you two places to look when something needs change, and to have them at different values will cause confusion. Not good advice, and neither is another step to drop the data.
You use the COPY operator, and then use SORT in the CNTL file. If you are going to SORT, use the SORT operator. |
|
Back to top |
|
|
subratarec
Active User
Joined: 22 Dec 2007 Posts: 126 Location: Bangalore
|
|
|
|
Hi All,
Thanks for all the efforts. I really appreciate your help.. As I had told that I will provide the excerpt of my prod data.. here it is..
Code: |
01 82244 B 1-2M056 20 0 0 0
01 82244 B 2-4M062 19 0 0 0
01 85481 A 7.5-8.5040 177 0 0 0
01 85481 A 9-10042 339 0 0 0
01 85481 A 10-11.5044 221 0 0 0
01 83560 C 8-9Y134 75 0 0 75
01 83560 C 9-10Y140 57 0 0 57
01 83560 C 10-11Y146 57 0 0 57
01 83560 C 11-12Y152 79 0 0 79
01 83560 C 12-13Y158 53 0 0 53
99 58725
00US12015042910486722457142 0 |
I know you all will say.. that is where is the Header rec. Actually for specifically for this file (for which we are facing issue in prod) doesn't have header data in them (it has 2 trailer recs) BUT I wanted to get an idea how the approach should be when Header will also be there.. cause many of the DSs I can see has both Header and Trailer recs..
I am going all the posts provided by you both.. and I will try them in my test env this above prod data excerpt will give you a clear pics...here the first trailer starts with HIGH value 99 and the last with Low vale 00.
Thanks
Subrata |
|
Back to top |
|
|
subratarec
Active User
Joined: 22 Dec 2007 Posts: 126 Location: Bangalore
|
|
|
|
Sorry I forgot to add one more thing.. that is the duplicate rec search criteria.
from 5 to 9 (5 digits)
from 12 to 12 (1 digit)
from 28 to 30 (3 digits) |
|
Back to top |
|
|
RahulG31
Active User
Joined: 20 Dec 2014 Posts: 446 Location: USA
|
|
|
|
@subratarec: I think you have more than enough to proceed. I'll be out for a couple of days. Good luck with your testing. |
|
Back to top |
|
|
subratarec
Active User
Joined: 22 Dec 2007 Posts: 126 Location: Bangalore
|
|
|
|
Hi Rahul,
Yep Thanks!! I will check and test. Will let you both know what is the result.
Thanks
Subrata |
|
Back to top |
|
|
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
Can you provide expected output for your latest sample, please? That will answer the question of what order you want the data in. |
|
Back to top |
|
|
subratarec
Active User
Joined: 22 Dec 2007 Posts: 126 Location: Bangalore
|
|
|
|
Hi Bill,
Actually I followed the approach being described by you and Rahul. What I did is as follows..
Code: |
//*-----------------------------------------------------------
//* COPYING ONLY DATA FROM INPUT FILE
//*-----------------------------------------------------------
//SORT1 EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//SORTIN DD DSN=<Main input file with both Trailer rec & duplicates>,
// DISP=SHR
//SORTOUT DD DSN=&&TEMP1, <data with dup recs no trailer>
// DISP=(NEW,KEEP),
// LIKE=<Main input file>
//SYSOUT DD SYSOUT=*
//TOOLIN DD *
SUBSET FROM(SORTIN) TO(SORTOUT) INPUT REMOVE TRAILER(2) USING(CTL1)
//CTL1CNTL DD *
SORT FIELDS=COPY
/*
//*-----------------------------------------------------------
//* COPYING ONLY TRAILER REC FROM INPUT FILE
//*-----------------------------------------------------------
//SORT2 EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//SORTIN DD DSN=<Main input file with both Trailer rec & duplicates>,
// DISP=SHR
//SORTOUT DD DSN=&&TEMP2, <only Trailer recs>
// DISP=(NEW,KEEP)
//SYSOUT DD SYSOUT=*
//TOOLIN DD *
SUBSET FROM(SORTIN) TO(SORTOUT) INPUT KEEP TRAILER(2)
/*
//*-----------------------------------------------------------
//* REMOVE DUPLICATE RECS,CREATE NEW O/P FILE AND KEEP DUPLICATE
//* TO ANOTHER FILE
//*-----------------------------------------------------------
//SORT3 EXEC PGM=ICETOOL
//TOOLMSG DD SYSOUT=*
//DFSMSG DD SYSOUT=*
//SORTIN DD DSN=&&TEMP1, <file dup recs main data. no trailer rec>
// DISP=SHR
//SORTOUT DD DSN=&&TEMP3, <file non dup recs. only distinct data>
// DISP=(NEW,KEEP),
// LIKE=HXSUCHK.Q.HWW080.TRANS.INPUT
//SORTXSUM DD DSN=HXSUCHK.STCKRECN.OUTPUT.DUP, <dup recs>
// DISP=OLD
//SYSOUT DD SYSOUT=*
//TOOLIN DD *
SELECT FROM(SORTIN) TO(SORTOUT) ON(5,5,CH) ON(12,1,CH) ON(28,3,CH)-
FIRST DISCARD(SORTXSUM)
/*
//*-----------------------------------------------------------
//* WILL APPEND TRAILER TO PURIFIED DATA AND WILL CREATE NEW
//* O/P FILE
//*-----------------------------------------------------------
//MERGE EXEC PGM=SORT
//SORTIN DD DSN=&&TEMP3, <file non dup recs. only distinct data>
// DISP=SHR
// DD DSN=&&TEMP2, <only Trailer recs>
// DISP=SHR
//SORTOUT DD DSN=HXSUCHK.STCKRECN.OUTPUT, <pure data with trailer recs>
// DISP=OLD
//SYSOUT DD SYSOUT=*
//SYSIN DD *
SORT FIELDS=COPY
/* |
From my end I have checked and it looks ok to me. But I need to send both the files for further checkup. Plz check and let me know if this can be done in a much better way (may be within less number of steps).
Thanks
Subrata |
|
Back to top |
|
|
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
When I described that approach, I was attempting an approximation to RahulG31's approach from memory. I was not recommending that, and I wouldn't (reason I deleted it).
If you look back a few posts, you'll find my suggestion for the CNTL file for the SELECT. This will get you your results, assuming that the order you want the output in is key order, not the original order with duplicates removed. |
|
Back to top |
|
|
RahulG31
Active User
Joined: 20 Dec 2014 Posts: 446 Location: USA
|
|
|
|
@subratarec: Please read the posts which were 'Posted: Thu Apr 30, 2015' |
|
Back to top |
|
|
|