I've got a dataset with a key and 1 to n lines of text, and it looks something like this:
KEY1 Text
KEY2 Text (ends with X'0D')
Text
KEY3 Text (ends with X'0D')
Text (ends with X'0D')
Text
The Text will vary in length, to a maximum of 80 chars per line. I want to merge all lines of text into one record per key, removing trailing spaces and the X'0D' indicator in the process. Can I do this with SORT?
Joined: 15 Feb 2005 Posts: 7129 Location: San Jose, CA
Is the input RECFM=VB and LRECL=80 or something else?
You show some of the lines ending with X'0D' and some not. Is that correct or does every line end with X'0D'? If not, what are the rules for which lines end with X'0D' and which lines don't?
Would the output RECFM be VB? What would the output LRECL be?
Every key has at least one record, containing the key value (20 chars) and the first 80 chars of the text. If the text is longer than 80 chars, the last char is X'0D' and the text continues in the next record, but WITHOUT the key value. A record carries the X'0D' indicator only if the text continues in the next record.
The output will be VB 424 (as a max. of 5 textlines of 80 chars can be found).
I still need you to clarify some things about the structure of the various records.
KEY1 text1
No continuation, so it has the 4-byte RDW, a 20-byte key, and 80 characters of text, or can it be less than 80 characters of text? In other words, would this type of record always be 104 bytes (4+20+80) or could it be less than 104 bytes (e.g. 4+20+50)?
KEY2 text2 (ends with X'0D')
text3
Continuation, so the first line has the 4-byte RDW, a 20 byte key, 80 characters of text and a X'0D'? That would be 105 bytes so the LRECL would have to be VB 105? Or would it only have 79 characters of text followed by the X'0D' to get VB 104? Would this type of line always have the X'0D' in position 105 (or 104?) or could it have less than 80 characters with the X'0D' earlier on, e.g. 4-byte RDW, a 20 byte key, 50 characters of text and a X'0D'?
Would the second line always be padded out to 80 characters (4-byte RDW + 80 bytes), or could it have less than 80 characters (e.g. 4-byte RDW + 50 bytes)?
You don't actually want a space between the key and each text segment - right?
Finally, is the key identifiable in some way (e.g. it starts with 'KEY')? If not, how do we know when we have the start of a line of text? Is it just that the previous line did not end with X'0D'?
The text length can vary, to a maximum of 80 chars per record (including the X'0D'). If it has a continuation, it can be anywhere in the text. Here's an example (where # denotes a X'0D' ):
KEY1Text1
KEY2Text2#
continuation of text2#
blanks can even be padded before X'0D' #
endofthiskey
KEY3#
This is possible too...!#
#
<blank line>
KEY4End of example
So none of the lines are padded out to the full record length.
I can identify the key, let's just keep it fairly simple and use KEY in the first three letters.
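To make the rules above concrete, here is a minimal Python sketch of the merge logic only. It is not a DFSORT job and not the solution posted in this thread. Everything in it is an assumption for illustration: the name merge_records, the 4-character key length taken from the KEY1..KEY4 example (the real key was described as 20 chars), the 'KEY' prefix check, and the choice to strip trailing blanks from every text segment and join the segments without a separator.

```python
CONT = "\r"  # X'0D', the continuation indicator at the end of a record


def merge_records(records, key_prefix="KEY", key_len=4):
    """Merge continuation records into one (key, text) pair per key.

    Assumptions (hypothetical, for illustration only):
    - a record starts a new key when the previous record did not end with X'0D'
    - the key occupies the first key_len characters and starts with key_prefix
    - trailing blanks are stripped from every segment, segments are joined as-is
    """
    merged = []          # list of [key, text], kept in input order
    continuing = False   # True when the previous record ended with X'0D'
    for rec in records:
        if continuing:
            text = rec   # continuation records carry no key
        else:
            if not rec.startswith(key_prefix):
                raise ValueError("expected a key record, got: %r" % rec)
            merged.append([rec[:key_len], ""])
            text = rec[key_len:]
        continuing = text.endswith(CONT)
        if continuing:
            text = text[:-1]              # drop the X'0D' indicator
        merged[-1][1] += text.rstrip(" ")  # drop trailing blanks
    return [(key, text) for key, text in merged]


# The example from the post, with '#' written as "\r" (X'0D').
sample = [
    "KEY1Text1",
    "KEY2Text2\r",
    "continuation of text2\r",
    "blanks can even be padded before X'0D' \r",
    "endofthiskey",
    "KEY3\r",
    "This is possible too...!\r",
    "\r",
    "",
    "KEY4End of example",
]
result = merge_records(sample)
for key, text in result:
    print(key, repr(text))
```

Because a continuation record is recognised by the state of the previous record rather than by its content, text that happens to begin with KEY in a continuation line is still treated as text.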
In fact, the output must be slightly different than I first explained. Still VB 424, but I'd like the following layout:
Here's a DFSORT/ICETOOL job that will do what you asked for. You'll need z/OS DFSORT V1R5 PTF UK90007 or DFSORT R14 PTF UK90006 (April, 2006) in order to use DFSORT's PARSE function. If you don't have the April, 2006 PTF, ask your System Programmer to install it (it's free). For complete details on all of the new DFSORT and ICETOOL functions available with the April, 2006 PTF, see:
Quote:
I hope I challenged you a little bit there...? ;-)
Claes,
Yes, you did. It was an interesting problem and took me a while to solve. Good thing I added PARSE and VLENMAX to DFSORT/ICETOOL as I needed them for the solution.
Joined: 18 Nov 2006 Posts: 3156 Location: Tucson AZ
Look at the Sort manual; there is a parameter that handles that.
Quote:
If the variable-length record was too short to contain all SORT, MERGE, or SUM fields, use the VLSHRT option to prevent DFSORT from terminating.
If a variable-length record was too short to contain all INCLUDE or OMIT fields, use the VLSCMP or VLSHRT option to prevent DFSORT from terminating.