Using DFSORT/ICETOOL, I would like to generate an output using the following filters:
Include the current record if the string "USER001" is found in any column of the record. If "USER001" is found in the current record, also include the succeeding records that contain the string "---" at column 1. There can be 0 to 3 succeeding records which contain the string "---" at column 1.
1) LRECL/RECFM have absolutely nothing to do with this issue
2) What if:
Code:
DELETE USER008 AAAAAAAAA
DELETE USER001 BBBBBBBBB
DELETE USER007 CCCCCCCC
DEFINE USER001 DDDDDDDD
---NAME JOHN
---DEPT ABC
EEEEEEEEEEEEEEEEEEEEEEEEE
LIST USER111
---TO BE OR NOT TO BE?
LIST USER001
---ALL
DELETE USER009 FFFFFFFFFFFF
Can do if temporary sequence numbers have to be used.
The main problem is that the author has not the slightest idea of an applicable approach for this task. At this level of understanding, LRECL/RECFM do not play any role at all...
Joined: 10 May 2007 Posts: 2454 Location: Hampshire, UK
The understanding is that LRECL and RECFM MAY be needed for an accurate solution, and it is better to have them up front rather than requesting them in a day or two's time. Frank and Kolusu (IBM DFSORT developers) always requested them.
Usually, I use my basic knowledge of Rexx for this kind of requirement. However, this time a yearly job will run to merge & move my DASD input into a large multi-cartridge file. As I learned from this site, experts discourage using REXX for a large input file. I only know the basics of DFSORT, but on this site I have found elegant DFSORT/ICETOOL solutions for various requirements.
I will pursue the direction I got here, "temporary sequence numbers have to be used", and will post here if I find the solution.
I look forward to, and would much appreciate, more clues toward a final solution.
I suspect that you will have to use GROUP starting the group when USER001 is found and ending when the following records no longer have '---' in cc 1 - 3. But I am no expert.
From the original posts it is not clear what logic is used to select the required groups/headers/trailers/data records.
The TS ignores the request to state his real requirements more clearly. That's why Nic's suggestion may or may not be correct...
In general, GROUP processing is needed, for sure. All other tricks depend on more specific record selection rules (and depend very, very little on LRECL/RECFM!)
The major problem I see (especially for newbies) is that the input record fields are not aligned. That's why some extra tricks are needed. I avoid giving any working example unless I see that the author has tried at least something by himself.
I think it is pretty clear: a group starts with USER001 somewhere in the record (SS - search string) and contains 0, 1, 2 or 3 lines beginning '---' (1,3,EQ,'---'), and a group ends when the first 3 bytes of the next record are not '---'.
OK, could there EVER be more than 3 consecutive records beginning '---'? TS says there can be UP TO 3, so if there are more, then more than 3 will be selected unless counting is used; but the stated problem says there will not be 4 or more. So give him that.
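To make that reading of the rule concrete, here is a minimal sketch in plain Python. It is not DFSORT code, only a small model for checking what the expected output should be. The function name `select_user_groups`, its parameters, and the assumption that a '---' record never starts a group on its own are mine, not from the thread; the sample records are the ones posted earlier.

```python
def select_user_groups(records, user_id="USER001", marker="---"):
    """Keep each record containing user_id plus the marker records right after it."""
    selected = []
    in_group = False  # True while the most recent non-marker record contained user_id
    for rec in records:
        if rec.startswith(marker):
            # Continuation record: kept only while a user_id group is open.
            # Assumption: a marker record never opens a group by itself.
            if in_group:
                selected.append(rec)
        else:
            # Any other record closes the previous group and may open a new one.
            in_group = user_id in rec
            if in_group:
                selected.append(rec)
    return selected


SAMPLE = [
    "DELETE USER008 AAAAAAAAA",
    "DELETE USER001 BBBBBBBBB",
    "DELETE USER007 CCCCCCCC",
    "DEFINE USER001 DDDDDDDD",
    "---NAME JOHN",
    "---DEPT ABC",
    "EEEEEEEEEEEEEEEEEEEEEEEEE",
    "LIST USER111",
    "---TO BE OR NOT TO BE?",
    "LIST USER001",
    "---ALL",
    "DELETE USER009 FFFFFFFFFFFF",
]

if __name__ == "__main__":
    for line in select_user_groups(SAMPLE):
        print(line)
```

Because no counting is done, this model also keeps a fourth or later '---' record if one ever appears directly after a USER001 record.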
1) From my point of view, a group must start not from the word USER001, but from one of the valid "control statements": 'DELETE', 'DEFINE', 'LIST', something else.
2) UserID must be considered as the first parameter of this "control statement": 'USER001', 'USER002', . . ., 'USER199', which is not always in the same position of the record.
3) Only those groups with desired UserID(s) must be selected; I don't think in real life only 'USER001' will be needed, that's why I asked for detailed requirements.
4) Next, one needs to detect: which of selected groups do include line(s) '---'?
5) Finally, only those groups selected by UserID, with extra limitation on '---' presence, must be copied to output.
This is the so-called "business logic". The author is welcome to try implementing it using SORT code.
After exploring some of the suggestions here, I was able to generate the desired output using the code shown above. I'm sure there is a much better solution than what I have been able to code so far.
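The five steps above can be modelled in a few lines of plain Python (again not SORT code, just a way to pin down the expected output before writing control statements). Everything named here is hypothetical: `CONTROL_VERBS` lists only the three verbs mentioned in the thread, and the sketch assumes the UserID is the second blank-delimited word of a control statement.

```python
# Assumed list of control statements; the thread says others may exist.
CONTROL_VERBS = ("DELETE", "DEFINE", "LIST")


def split_groups(records, marker="---"):
    """Step 1: cut the input into groups; each non-marker record opens a new group."""
    groups = []
    for rec in records:
        if rec.startswith(marker) and groups:
            groups[-1].append(rec)
        else:
            groups.append([rec])
    return groups


def group_user(header):
    """Step 2: UserID is taken as the first parameter of a control statement."""
    words = header.split()
    if len(words) >= 2 and words[0] in CONTROL_VERBS:
        return words[1]
    return None  # not a recognised control statement


def select_groups(records, wanted_users, require_detail=True):
    """Steps 3-5: keep groups for the wanted UserIDs, optionally only those with '---' lines."""
    out = []
    for group in split_groups(records):
        if group_user(group[0]) not in wanted_users:
            continue
        if require_detail and len(group) == 1:
            continue  # header without any '---' record
        out.extend(group)
    return out


SAMPLE = [
    "DELETE USER008 AAAAAAAAA",
    "DELETE USER001 BBBBBBBBB",
    "DELETE USER007 CCCCCCCC",
    "DEFINE USER001 DDDDDDDD",
    "---NAME JOHN",
    "---DEPT ABC",
    "EEEEEEEEEEEEEEEEEEEEEEEEE",
    "LIST USER111",
    "---TO BE OR NOT TO BE?",
    "LIST USER001",
    "---ALL",
    "DELETE USER009 FFFFFFFFFFFF",
]

if __name__ == "__main__":
    for line in select_groups(SAMPLE, {"USER001"}):
        print(line)
```

Calling `select_groups(SAMPLE, {"USER001"}, require_detail=False)` also keeps the lone `DELETE USER001` record, which is the difference between the two readings of the requirement discussed in this thread.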
Thanks for the replies, my replies below:
I think it is pretty clear: a group starts with USER001 somewhere in the record (SS - search string) and contains 0, 1, 2 or 3 lines beginning '---' (1,3,EQ,'---'), and a group ends when the first 3 bytes of the next record are not '---'.
Reply: I believe you described it more clearly.
OK, could there EVER be more than 3 consecutive records beginning '---'? TS says there can be UP TO 3, so if there are more, then more than 3 will be selected unless counting is used; but the stated problem says there will not be 4 or more. So give him that.
Reply: I checked several months of input data, and so far I have not found more than 4. I was also looking for a solution that can handle more than 4; I believe the above code can.
From my point of view, a group must start not from the word USER001, but from one of the valid "control statements": 'DELETE', 'DEFINE', 'LIST', something else.
Reply: Usually I would use the control statement. When I checked the input, I found some records with a userid but no control statement. Neither the control statement nor the userid is always in the same position in the record. I chose the userid since it is more pertinent to this requirement. This report will be used upon request as an audit trail for a particular user. The actual records include a time stamp and audit-related data.
My original post was reformatted on my behalf.
Also, I can no longer update my original post.
Please allow me a recap, showing my original post again below:
Using DFSORT/ICETOOL, I would like to generate an output using the following filter:
Include the current record if the string "USER001" is found in any column of the record. If "USER001" is found in the current record, also include the succeeding records that contain the string "---" at column 1. There can be 0 to 3 succeeding records which contain the string "---" at column 1.
Desired Output:
DELETE USER001 BBBBBBBBBB
DEFINE USER001 DDDDDDDDDD
---NAME JOHN
---DEPT ABC
LIST USER001
---ALL
All of the above was my original post. I omitted the LRECL/RECFM.
When I posted this topic I had no idea where to start.
Then I pursued one idea from here:
"I suspect that you will have to use GROUP starting the group when USER001 is found and ending when the following records no longer have '---' in cc 1 - 3."
After some browsing of the DFSORT Application Programming Guide and some trial and error, I was able to generate the desired output using the code shown below.
I'm sure there is a much better solution than what I have been able to code so far. I look forward to, and would much appreciate, more suggestions on improving my code.
Just to make it more general, you can use a parameter (JPn) to pass in the user id from the EXEC JCL statement so that you do not need to change the DFSORT control statements.
1) If you had tested your own example even once, you would have noticed the result:
Code:
********************************* TOP OF DATA ****
DELETE USER001 BBBBBBBBBB
DEFINE USER001 DDDDDDDDDD
---NAME JOHN
---DEPT ABC
LIST USER001
---ALL
******************************** BOTTOM OF DATA **
2) If the input data changed just a little bit, which is entirely possible (the "step right / step left" case):
Code:
DELETE USER008 AAAAAAAAA
DELETE USER001 BBBBBBBBB
DELETE USER007 CCCCCCCC
DEFINE USER001 DDDDDDDD
XXXXXXXXXXXXXXXXXXXXXXXXX
---NAME JOHN
---DEPT ABC
EEEEEEEEEEEEEEEEEEEEEEEEE
LIST USER111
---TO BE OR NOT TO BE?
LIST USER001
---ALL
DELETE USER009 FFFFFFFFFFFF
the result would be even worse:
Code:
********************************* TOP OF DATA ****
DELETE USER001 BBBBBBBBB
DEFINE USER001 DDDDDDDD
LIST USER001
---ALL
******************************** BOTTOM OF DATA **
That's why I hate any absolutely inflexible, straightforward solutions which may work well only for one particular sample of input data. Every "new comma" in the input data will require a new "patch", or "upgrade", of such code.
1) If you had tested your own example even once, you would have noticed the result:
Code:
********************************* TOP OF DATA ****
DELETE USER001 BBBBBBBBBB
DEFINE USER001 DDDDDDDDDD
---NAME JOHN
---DEPT ABC
LIST USER001
---ALL
******************************** BOTTOM OF DATA **
Nic, the first record does not need to be part of the output since it does not have any subsequent '---' records associated with it; even though it is USER001, it shouldn't be there. If it were that straightforward, then a simple INCLUDE would do.
You will note, now that I have corrected the coding in the original post, that ALL records pertaining to a particular user are to be selected even if not followed by '---' records. Judging by the comments from the OP, this is an extract from a larger report of all records pertaining to a user.
Quote:
In my opinion it makes more sense to talk to the guys who produce such crappy input.
Probably have to talk to CA or IBM then as it is probably a security audit report.
********************************* TOP OF DATA ********
DEFINE USER001 DDDDDDDD
---NAME JOHN
---DEPT ABC
LIST USER001
---ALL
******************************** BOTTOM OF DATA ******