Joined: 07 Feb 2009 Posts: 1306 Location: Vilnius, Lithuania
I need to convert some big report files to leave only heading-of-table rows, so that a post-processing program can be fed with those for RTF-ification.
I've started with the obvious(?):
Code:
//SYSIN DD *
  ALTSEQ CODE=(F040,F140,F240,F340,F440,
               F540,F640,F740,F840,F940,
               4B40,7A40)
  INREC FIELDS=(1,218,TRAN=ALTSEQ)
  OMIT COND=(2,2,CH,EQ,C'+-')
  SORT FIELDS=(1,218,CH,A)
  SUM FIELDS=NONE
  END
//*
The above reduces the number of lines by about 95%, but it still leaves me a hell of a lot of lines just containing '|', which I cannot blank out as they're also contained in heading lines. Ditto for some multi-character strings like
Code:
| - - |
| - |
| - |
In essence, I'm only interested in the lines that are enclosed between two separator lines (of the type +-----+---+-------+...), but not having kept up to date with SORT, I don't know how (or even if) it is possible to keep only the manually '**' marked lines from the sample below:
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
That's correct. Kind of. But unfortunately the END= will kick in straight away with the first line that is the start of the group. The thing will be to be able to END it, or to not be concerned about the END.
Joined: 07 Feb 2009 Posts: 1306 Location: Vilnius, Lithuania
Bill Woodger wrote:
That's correct. Kind of. But unfortunately the END= will kick in straight away with the first line that is the start of the group. The thing will be to be able to END it, or to not be concerned about the END.
That's what I already found out. My problem is that some tables have the format
while others do not have the f1/f2 footer. I need to capture both the always present h1/h2 line, and the only sometimes present f1/f2 line, and there isn't really anything that tells me if a table has a footer.
Joined: 07 Feb 2009 Posts: 1306 Location: Vilnius, Lithuania
Bill Woodger wrote:
OK :-)
Get one bit working and the spec changes...
Is it true that blank lines only appear between the different reports, ie even across pagination?
Code:
b
+-
¦ want
¦
+-
¦ don't want
+-
b
+-
¦ want
¦
+-
¦ don't want
+-
¦ want when present
+-
b
And then exclude all the +- lines, and all those lines that only contain ¦ or -.
Yes there are blank lines between the tables, or ASA control characters, but in both of those cases column 2 is blank. And the spec in the first post may not have been 100% clear, but the examples were.
And your comment about zapping all lines that only contain '|' (and occasionally '-') characters is correct.
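To make the rule agreed above concrete, here is a rough Python sketch of the selection logic only (not a SORT solution). It assumes the simplified layout of the sample: separator lines start with `+-` in the first column, blank lines split reports, and there is no carriage-control column. The function name `wanted_lines` and the constants are made up for illustration.

```python
SEPARATOR_PREFIX = "+-"
NOISE_CHARS = set("¦|- ")   # characters that do not count as real content


def wanted_lines(lines):
    """Return heading rows, plus footer rows when present, from boxed reports."""
    out = []
    section = -1                      # -1 = before the first separator of a report
    for line in lines:
        if not line.strip():
            section = -1              # blank line: a new report follows
            continue
        if line.startswith(SEPARATOR_PREFIX):
            section += 1              # each +- line opens the next section
            continue
        # section 0 = heading, 1 = data, 2 = footer (only sometimes present)
        if section in (0, 2) and not set(line) <= NOISE_CHARS:
            out.append(line)
    return out
```

A table without a footer simply never produces any lines in section 2, so nothing has to announce in advance whether a footer exists.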
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
For your 1,3,NE,C' ' I don't think there's a position you can rely upon for sure. There may be, but because there are lots of different reports from a massive amount of data, we can't tell.
Although ICE201I F has WHEN=GROUP, KEYBEGIN does not appear until ICE201I H.
That's discouraging.
So, the idea was to make a group starting at the row that is all spaces and write the first line. For multiple header lines, another group is created, starting with the earlier group key and ending with column 3 not equal to spaces.
Since KEYBEGIN doesn't work, we need to find another way to identify the group start.
For the footer, I think it would be a lot easier if the input data could be modified so that the footer is separated and looks something like this:
Joined: 07 Feb 2009 Posts: 1306 Location: Vilnius, Lithuania
Bill Woodger wrote:
For your 1,3,NE,C' ' I don't think there's a position you can rely upon for sure. There may be, but because there are lots of different reports from a massive amount of data, we can't tell.
The third column for data rows is always a blank, all data is contained within the boxes with (at least) one space between the box and the data on either side, so I should have used
I was using the below to mark the end of the group (for multiple header lines):
Code:
END=(3,1,CH,NE,C' ')
Since I was starting with a KEYBEGIN and ending with a non-space at position 3 this was giving multiple header lines as output.
It doesn't matter whether the data rows have something in column 3 as the group would have ended before that (assuming the header line will always have data in column 3).
Joined: 07 Feb 2009 Posts: 1306 Location: Vilnius, Lithuania
Thanks guys, I will look at this later, not before the weekend. Just shut down my z/OS system as I will be busy with more pressing matters tomorrow, and on Thursday/Friday I will be hitchhiking back to Vilnius, where our car will hopefully be back in a drivable state.
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
Rahul,
Consider the effect of each report being preceded by a blank line. The first report has no blank line, so you make one (single-record file, concatenate the real data to it).
Then you should be able to identify start/end of the heading group, the data group, and footer group if present.
Don't forget the bit about ignoring lines with only | and - on them (once the TRAN=ALTSEQ has done some cleaning). But get the main part up-and-running first.
Oh, reminds me. Prino, you have a typo in your ALTSEQ.
The problem for WHEN=GROUP is that the logical start and end of the group have the same data (the +- lines), which means the group ends immediately (both the BEGIN and END are satisfied by the same line).
So you need to change one of those lines before you can do the WHEN=GROUP.
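A toy Python model may make the effect easier to see. This is only a simplified imitation of how a BEGIN/END group tags records, under my assumption that a record satisfying END closes the group it belongs to (itself included); it is not SORT code, and `group_ids` is a made-up name.

```python
def group_ids(lines, begins, ends):
    """Tag each line with a group number; 0 means 'not in any group'."""
    ids = []
    current = 0      # group the current line belongs to
    last_id = 0      # highest group number handed out so far
    for line in lines:
        if begins(line):
            last_id += 1
            current = last_id
        ids.append(current)
        if current and ends(line):
            current = 0          # the END line is the last line of its group
    return ids


def is_sep(line):
    return line.startswith("+-")


def is_marked(line):
    return line.startswith(" -")


# Same test for BEGIN and END: every separator is a one-line group,
# and the heading line in between is never tagged.
same = group_ids(["+-", "| h1 |", "+-", "| d |", "+-"], is_sep, is_sep)

# First separator changed to blank-then-dash: the heading is now inside group 1.
changed = group_ids([" -", "| h1 |", "+-", "| d |", "+-"], is_marked, is_sep)
```

With identical BEGIN and END tests `same` comes out as `[1, 0, 2, 0, 3]`; once the first separator is altered, `changed` is `[1, 1, 1, 0, 0]`.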
All reports except the first are prefixed by a blank line, which could be used to amend the first line of the group.
So make a blank line precede the first report as well, by concatenating the real data to a single-record file which contains a blank line.
Then use WHEN=GROUP for the blank line to clobber part of the first line of the logical group.
Identifying the footer-group is then also possible.
The detritus lines, containing only vertical bars and dashes, are removed by first saving the line (with PARSE, into a PARSEd field) and then FINDREPing the characters to blank.
If the resultant line is not blank, put the PARSEd field back onto the record.
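The save / blank out / restore idea is independent of SORT, so here is a small Python sketch of it. The saved copy stands in for the PARSEd field and `str.translate` stands in for FINDREP; `drop_detritus` and its `unwanted` parameter are names invented for this example.

```python
def drop_detritus(line, unwanted="|-"):
    """Blank a line that holds only unwanted characters; otherwise leave it alone."""
    saved = line                                            # the "PARSEd field"
    blanked = line.translate(str.maketrans(unwanted, " " * len(unwanted)))
    # Anything left after blanking means the line carried real data: restore it.
    return saved if blanked.strip() else blanked
```

So `drop_detritus("| - - |")` gives back seven blanks, while `drop_detritus("| Name | Qty |")` is returned unchanged, bars included.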
It is then a case of arranging the INCLUDE on OUTFIL to only get the records which are actually required.
I suspect there is a requirement for the multi-line headings to remain together, but we'll see.
I also suspect the SORT and SUM FIELDS=NONE are not required, but the answer to that lies in the data: is there more than one copy of the same report, so that duplicate headings are not logically contiguous?
Tested with 80-byte lines, adjust for actual lengths.
In detail:
Identify blank lines with WHEN=GROUP and PUSH the blank identified onto the first +- line for a report, giving blank-then-dash.
Do Prino's required translation to get various things in heading lines to blank (does not matter one jot that it affects data lines as well, as they are not required in the output) with presumed typos fixed.
Code:
IFTHEN=(WHEN=INIT,
OVERLAY=(1:1,80,TRAN=ALTSEQ)),
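For anyone not fluent in hex, this is roughly what the translation does, modelled in Python. The byte pairs are the ones from the ALTSEQ statement in the first post (X'F0' to X'F9', X'4B' and X'7A', all mapped to X'40'); using code page 037 for the demonstration is my assumption, as the thread never names the code page.

```python
SPACE = 0x40  # EBCDIC blank

# (from, to) byte pairs, as in ALTSEQ CODE=(F040,...,F940,4B40,7A40)
ALTSEQ_PAIRS = [(0xF0 + d, SPACE) for d in range(10)] + [(0x4B, SPACE), (0x7A, SPACE)]


def altseq_table(pairs):
    """Build a 256-byte translate table: identity except for the given pairs."""
    table = bytearray(range(256))
    for src, dst in pairs:
        table[src] = dst
    return bytes(table)


def tran_altseq(record: bytes) -> bytes:
    """Apply the table to one record, like TRAN=ALTSEQ over the whole line."""
    return record.translate(altseq_table(ALTSEQ_PAIRS))


# Demonstration, assuming code page 037 for encoding and decoding.
ebcdic = "| Page 12: 3.5 |".encode("cp037")
cleaned = tran_altseq(ebcdic).decode("cp037")
```

In `cleaned` the digits, the full stop and the colon have become blanks; the bars and the word "Page" are untouched.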
Save the current state of the record in a PARSEd field (ignoring the extensions to the record from PUSH).
Code:
IFTHEN=(WHEN=INIT,
PARSE=(%01=(FIXLEN=80))),
Use FINDREP to get rid of the characters that would be unwanted if there is no other data on the line.
If, having got rid of the unwanted characters, the line is not blank, restore the original data from the PARSEd field.
Code:
IFTHEN=(WHEN=(1,80,CH,NE,C' '),
BUILD=(%01,81,2))
The SORT and SUM as per Prino's original code.
Code:
SORT FIELDS=(1,80,CH,A)
SUM FIELDS=NONE
Ignore blank lines and otherwise use the flags set from the second two GROUPs to identify the header content and footer content that is required. Probably don't need to ignore the blank lines because all the GROUPs are limited (by RECORDS or by END) but I had that there whilst developing it.
With SORT solutions, concentrate thought on the data. When you think "if only the data were like that, I could do this", work out how to get the data like that.
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
And add one to all positions for the A in the RECFM (so ignore it for the PARSE, and on the FINDREP use STARTPOS=2 and adjust the ENDPOS for the last byte of data) :-)
To create the single-record file with a blank line:
Code:
OPTION COPY,STOPAFT=1
INCLUDE COND=(2,1,CH,EQ,C' ')
Joined: 07 Feb 2009 Posts: 1306 Location: Vilnius, Lithuania
Thanks guys, this works nicely. As for the blank line at the top issue, there actually is one, every page starts with a line just containing a '1' ASA control character.