IBM Mainframe Forum Index
 
Merging lines of text. Can SORT do this?


IBM Mainframe Forums -> DFSORT/ICETOOL
Claes Norreen

Active User


Joined: 20 Dec 2005
Posts: 137
Location: Denmark

Posted: Fri Apr 27, 2007 5:38 pm

Hi experts!

I've got a dataset with a key and 1 to n lines of text, and it looks something like this:

KEY1 Text
KEY2 Text (ends with X'0D')
Text
KEY3 Text (ends with X'0D')
Text (ends with X'0D')
Text

The Text will vary in length, to a maximum of 80 chars per line. I want to merge all lines of text into one record per key, removing trailing spaces and the X'0D' indicator in the process. Can I do this with SORT?
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Fri Apr 27, 2007 8:22 pm

Is the input RECFM=VB and LRECL=80 or something else?

You show some of the lines ending with X'0D' and some not. Is that correct or does every line end with X'0D'? If not, what are the rules for which lines end with X'0D' and which lines don't?

Would the output RECFM be VB? What would the output LRECL be?

If the input is:

KEY1 Text1
KEY2 Text2
Text3
KEY3 Text4
Text5
Text6

What would the expected output be?
Claes Norreen

Active User


Joined: 20 Dec 2005
Posts: 137
Location: Denmark

Posted: Fri Apr 27, 2007 10:05 pm

Hi Frank,

Let's say the input is VB 104.

Every key has at least one record, showing the key value (20 chars) and the first 80 chars of the text. If the text is longer than 80 chars, the last char is X'0D', and the text continues in the next record, but WITHOUT the key value. A record carries the X'0D' indicator only if its text continues in the next record.

The output will be VB 424 (as a max. of 5 textlines of 80 chars can be found).

Sample output:
KEY1 Text1
KEY2 Text2 Text3
KEY3 Text4 Text5 Text6
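
The two record lengths quoted in this post follow from the layout described. A quick arithmetic check in Python, assuming the usual 4-byte RDW in front of every VB record; the constant and function names are hypothetical and chosen only for illustration:

```python
RDW_LEN = 4        # record descriptor word that prefixes every VB record
KEY_LEN = 20       # key length stated in the post
TEXT_LEN = 80      # maximum text per input record
MAX_SEGMENTS = 5   # maximum number of text lines per key


def input_lrecl() -> int:
    """Longest input record: RDW + key + one text segment."""
    return RDW_LEN + KEY_LEN + TEXT_LEN


def output_lrecl(segments: int = MAX_SEGMENTS) -> int:
    """Longest output record: RDW + key + all text segments."""
    return RDW_LEN + KEY_LEN + segments * TEXT_LEN


print(input_lrecl(), output_lrecl())  # prints "104 424"
```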
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Fri Apr 27, 2007 10:39 pm

I still need you to clarify some things about the structure of the various records.

KEY1 text1

No continuation, so it has the 4-byte RDW, a 20-byte key, and 80 characters of text. Or can it be less than 80 characters of text? In other words, would this type of record always be 104 bytes (4+20+80) or could it be less than 104 bytes (e.g. 4+20+50)?

KEY2 text2 (ends with X'0D')
text3

Continuation, so the first line has the 4-byte RDW, a 20 byte key, 80 characters of text and a X'0D'? That would be 105 bytes so the LRECL would have to be VB 105? Or would it only have 79 characters of text followed by the X'0D' to get VB 104? Would this type of line always have the X'0D' in position 105 (or 104?) or could it have less than 80 characters with the X'0D' earlier on, e.g. 4-byte RDW, a 20 byte key, 50 characters of text and a X'0D'?

Would the second line always be padded out to 80 characters (4-byte RDW + 80 bytes), or could it have less than 80 characters (e.g. 4-byte RDW + 50 bytes)?

You don't actually want a space between the key and each text segment - right?

Finally, is the key identifiable in some way (e.g. it starts with 'KEY')? If not, how do we know when we have the start of a line of text? Is it just that the previous line did not end with X'0D'?
Claes Norreen

Active User


Joined: 20 Dec 2005
Posts: 137
Location: Denmark

Posted: Sat Apr 28, 2007 12:12 am

The text length can vary, to a maximum of 80 chars per record (including the X'0D'). If it has a continuation, it can be anywhere in the text. Here's an example (where # denotes a X'0D' ):

KEY1Text1
KEY2Text2#
continuation of text2#
blanks can even be padded before X'0D' #
endofthiskey
KEY3#
This is possible too...!#
#
<blank line>
KEY4End of example

So none of the lines are padded out to the full record length.

I can identify the key; let's keep it fairly simple and say the key starts with the three letters KEY.

In fact, the output must be slightly different than I first explained. Still VB 424, but I'd like the following layout:

RDW + Key (20) + Text1 (80) + [Text2 (80)] + [Text3 (80)] + [Text4 (80)] + [Text5 (80)].

Thanks for your time, Frank. I hope this clarifies it?
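
To make the requested layout concrete, here is a minimal reference model in Python of the transformation described above. It is a sketch under stated assumptions, not the DFSORT solution: records are given without their RDW, the key always fills the first 20 bytes of a key record and starts with `KEY`, bytes are ASCII rather than EBCDIC, and the function name `merge_records` is hypothetical. A model like this can be useful for checking the output of whatever job ends up doing the work.

```python
from typing import Iterable, List

KEY_LEN = 20
TEXT_LEN = 80
CONT = b"\x0d"  # continuation marker, X'0D'


def merge_records(records: Iterable[bytes]) -> List[bytes]:
    """Merge each key record with the continuation records that follow it."""
    merged: List[bytes] = []
    for rec in records:
        if rec.startswith(b"KEY"):
            # Assumption: the key occupies the first 20 bytes of the record.
            merged.append(rec[:KEY_LEN].ljust(KEY_LEN))
            text = rec[KEY_LEN:]
        elif merged:
            text = rec
        else:
            raise ValueError("continuation record before any key record")
        # Keep only what precedes the marker, then pad to a fixed 80 bytes.
        segment = text.split(CONT, 1)[0][:TEXT_LEN].ljust(TEXT_LEN)
        merged[-1] += segment
    return merged


sample = [
    b"KEY1".ljust(KEY_LEN) + b"Text1",
    b"KEY2".ljust(KEY_LEN) + b"Text2\r",
    b"\r",      # a line holding nothing but X'0D' becomes a blank segment
    b"Text3",
]
out = merge_records(sample)
print([len(r) for r in out])  # prints "[100, 260]"
```

Each output record is the 20-byte key followed by one 80-byte slot per input line, so the record length (before the RDW is added) is 20 + 80 × the number of lines for that key.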
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Sat Apr 28, 2007 2:32 am

Here's a DFSORT/ICETOOL job that will do what you asked for. You'll need z/OS DFSORT V1R5 PTF UK90007 or DFSORT R14 PTF UK90006 (April, 2006) in order to use DFSORT's PARSE function. If you don't have the April, 2006 PTF, ask your System Programmer to install it (it's free). Complete details on all of the new DFSORT and ICETOOL functions available with the April, 2006 PTF were linked here in the original post; the link was not preserved.

Code:

//S1    EXEC  PGM=ICETOOL
//TOOLMSG   DD  SYSOUT=*
//DFSMSG    DD  SYSOUT=*
//IN DD DSN=...  input file (VB/104)
//T1 DD DSN=&&T1,UNIT=SYSDA,SPACE=(CYL,(5,5)),DISP=(,PASS)
//OUT DD DSN=...  output file (VB/424)
//TOOLIN   DD    *
COPY FROM(IN) USING(CTL1)
SPLICE FROM(T1) TO(OUT) ON(5,8,ZD) KEEPNODUPS -
  WITHEACH WITH(113,80) WITH(193,80) WITH(273,80) WITH(353,80) -
  VLENMAX USING(CTL2)
/*
//CTL1CNTL DD *
  INREC IFTHEN=(WHEN=(5,3,CH,EQ,C'KEY'),
          PARSE=(%01=(ABSPOS=25,ENDBEFR=X'0D',FIXLEN=80)),
          BUILD=(1,24,%01)),
        IFTHEN=(WHEN=(5,3,CH,NE,C'KEY'),
          PARSE=(%02=(ABSPOS=5,ENDBEFR=X'0D',FIXLEN=80)),
          BUILD=(1,4,%02,20X))
   OUTREC IFTHEN=(WHEN=INIT,BUILD=(1,4,5:SEQNUM,8,ZD,29:5)),
          IFTHEN=(WHEN=(29,3,CH,EQ,C'KEY'),
                 OVERLAY=(5:SEQNUM,8,ZD)),
          IFTHEN=(WHEN=NONE,
                  OVERLAY=(13:SEQNUM,8,ZD,
                          5:5,8,ZD,SUB,13,8,ZD,M11,LENGTH=8,
                          21:SEQNUM,8,ZD,RESTART=(5,8)))
   OUTFIL FNAMES=T1,
     IFTHEN=(WHEN=(21,8,CH,EQ,C' '),
             BUILD=(1,12,13:29,100)),
     IFTHEN=(WHEN=(21,8,ZD,EQ,+1),
             BUILD=(1,12,113:29,80)),
     IFTHEN=(WHEN=(21,8,ZD,EQ,+2),
             BUILD=(1,12,193:29,80)),
     IFTHEN=(WHEN=(21,8,ZD,EQ,+3),
             BUILD=(1,12,273:29,80)),
     IFTHEN=(WHEN=(21,8,ZD,EQ,+4),
             BUILD=(1,12,353:29,80))
/*
//CTL2CNTL DD *
  OUTFIL FNAMES=OUT,BUILD=(1,4,5:13)
/*
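
One way to read the column numbers in the job above: in the T1 records, columns 5-12 hold a group number, the key record's 100 data bytes are placed from column 13, and the n-th continuation is placed at column 113 + 80 × (n − 1), which are the same columns named in the WITH operands of SPLICE. The Python sketch below only reproduces that arithmetic. It is an interpretation of the control statements rather than something stated in the post, and the names are hypothetical.

```python
RDW_LEN = 4
SEQ_LEN = 8      # group number the job keeps in columns 5-12 of T1
KEY_LEN = 20
TEXT_LEN = 80
MAX_CONT = 4     # the job has one WITH field per continuation record


def first_text_column() -> int:
    """1-based column where the key record's own text starts in T1."""
    return RDW_LEN + SEQ_LEN + KEY_LEN + 1


def continuation_column(n: int) -> int:
    """1-based column where the n-th continuation (1..4) lands in T1."""
    if not 1 <= n <= MAX_CONT:
        raise ValueError("the job handles at most 4 continuation records")
    return first_text_column() + n * TEXT_LEN


def output_max_length() -> int:
    """Longest OUT record once CTL2 drops the 8-byte group number."""
    t1_max = continuation_column(MAX_CONT) + TEXT_LEN - 1
    return t1_max - SEQ_LEN


print([continuation_column(n) for n in range(1, 5)])  # [113, 193, 273, 353]
print(output_max_length())                            # 424
```

The last figure agrees with the VB/424 output file requested earlier in the thread.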
Claes Norreen

Active User


Joined: 20 Dec 2005
Posts: 137
Location: Denmark

Posted: Sat Apr 28, 2007 12:00 pm

Wow, almost can't wait till Monday...!

Thanks Frank! I hope I challenged you a little bit there...? ;-)
Frank Yaeger

DFSORT Developer


Joined: 15 Feb 2005
Posts: 7129
Location: San Jose, CA

Posted: Sat Apr 28, 2007 8:15 pm

Quote:
I hope I challenged you a little bit there...? ;-)


Claes,

Yes, you did. It was an interesting problem and took me a while to solve. Good thing I added PARSE and VLENMAX to DFSORT/ICETOOL as I needed them for the solution.
Claes Norreen

Active User


Joined: 20 Dec 2005
Posts: 137
Location: Denmark

Posted: Mon Apr 30, 2007 12:23 pm

Hi Frank,

It works very well, except for lines containing nothing but X'0D', in which case SORT gives CC=16 and the error message is:

Code:
ICE218A 6 5 BYTE VARIABLE RECORD IS SHORTER THAN 24 BYTE MINIMUM FOR          FIELDS

This must be because of the test for the key in position 5,20...? Can it be fixed?

These records do matter, since each one generates an empty line in the resulting file.
William Thompson

Global Moderator


Joined: 18 Nov 2006
Posts: 3156
Location: Tucson AZ

Posted: Mon Apr 30, 2007 12:52 pm

Look at the Sort manual; there is a parameter that handles that.
Quote:
If the variable-length record was too short to contain all SORT, MERGE, or SUM fields, use the VLSHRT option to prevent DFSORT from terminating.
If a variable-length record was too short to contain all INCLUDE or OMIT fields, use the VLSCMP or VLSHRT option to prevent DFSORT from terminating.
Claes Norreen

Active User


Joined: 20 Dec 2005
Posts: 137
Location: Denmark

Posted: Mon Apr 30, 2007 12:55 pm

Thanks, VLSCMP did the trick ;-)
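
For readers wondering why the option helps: with VLSCMP in effect, DFSORT temporarily pads a compare field that runs past the end of a short record with binary zeros, so the comparison can go ahead instead of ending in ICE218A. The Python sketch below is a simplified model of that behaviour, not DFSORT's implementation. The function name is hypothetical, and the thread does not show where the option was specified.

```python
def compare_field(record: bytes, pos: int, length: int, vlscmp: bool) -> bytes:
    """Return the compare field at 1-based pos, modelling short records.

    `record` includes the 4-byte RDW, so pos=5 is the first data byte.
    """
    end = pos - 1 + length
    if len(record) >= end:
        return record[pos - 1:end]
    if not vlscmp:
        # Model of the failure case: the record cannot hold the whole field.
        raise ValueError(
            f"{len(record)} byte record is shorter than {end} byte minimum"
        )
    # Model of VLSCMP: missing bytes are treated as binary zeros.
    return record[pos - 1:].ljust(length, b"\x00")


# A 5-byte record: 4-byte RDW followed by a lone X'0D'.
rec = b"\x00\x05\x00\x00\r"
field = compare_field(rec, 5, 3, vlscmp=True)
print(field == b"KEY")  # prints "False": treated as a continuation line
```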
Claes Norreen

Active User


Joined: 20 Dec 2005
Posts: 137
Location: Denmark

Posted: Thu May 03, 2007 2:44 pm

DFSORT reduced CPU usage by a factor of 8 compared to the application program that handled this task before!

Thanks again!
All times are GMT + 6 Hours