Not quite sure how best to explain this scenario. Records may span multiple lines within a file, and I'm looking for a SORT solution that gives one complete logical record per line.
We have a couple of vendors from which we pick up FTP data. From time to time, the files they provide have the records as one single contiguous line, which ends up truncated to whatever our LOCSITE LRECL is set to. Setting the FTP NOTRUNC option allows us to capture the transmitted data into 80-byte FB records, with the data wrapped from line to line. The issue is how best to reconstruct this data into single logical records, one record per line, within a file.
Getting the vendors to change their method of creating the files has not been very successful.
The end-of-record characters may vary, say X'0A' or X'0D'. One option that has been suggested was to pull the file in binary and use our encryption software to convert the data. That works fine; however, our sysprog has hinted that we may be changing encryption software vendors in the near future, so I'm not too keen on that as a permanent solution.
We did develop a COBOL program that handles formatting the input retrieved with the NOTRUNC option into the desired output format. That's a working solution, but it obviously adds another step to the proc, as the encryption software solution would, too.
I accumulate all of our daily transmissions from these vendors into accum files as we pick up their files throughout the day. Therefore, I've been trying to figure out how to incorporate a solution into the SORT step I use for the accum action, rather than running a COBOL step for this.
I have made attempts but just don't quite have the experience with DFSORT to figure it out to the end... Is it possible to do this with SORT?
My various attempts at putting the data together with OUTREC and OUTFIL have not been successful, though, and my start as shown above may very well be flawed, too.
The typical input data coming to us has a minimum of 50 bytes and a maximum of 80 bytes. The logical records can be of varying lengths, but the input is FB 80. The number of records is typically a couple hundred, and can be as many as just over 1,000.
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
PARSE and GROUP, yes. You're going to need to write up to two records per input record (if you are accurate about the minimum of 50, despite your sample data showing shorter), so you'll need BUILD on OUTFIL with the / (slash) operator.
You shouldn't be able to get more than two ends-of-record per record. An 80-byte record plus end-of-record is going to span two records, with the end-of-record in byte one of the second. Another special case is where the end-of-record is in byte 80; in all other cases you have data to carry from one record to another (hence the GROUP).
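A minimal sketch of the control statements being described, assuming an X'0A' end-of-record byte (the delimiter, field positions, and DD names here are assumptions, and the carry-over between input records still needs the GROUP logic):

```
//SYSIN    DD *
* Split each 80-byte input at the first X'0A' end-of-record byte:
* %01 = data before the delimiter, %02 = whatever follows it.
  INREC PARSE=(%01=(ENDBEFO=X'0A',FIXLEN=80),
               %02=(FIXLEN=80)),
        BUILD=(%01,%02)
* Write up to two records per input with the / (slash) operator.
  OUTFIL BUILD=(1,80,/,81,80)
/*
```

As written this also produces a blank second record when an input line has no delimiter, so a real solution would still need to filter those out and splice the carried-over pieces back together.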
Interesting task. Doable I think.
Oops, sorry, I typed 50 for the minimum, I meant to type 40.
The actual record length is rarely ever less than 50, as the field elements are delimited in the files themselves, with one required field being 23 bytes. Parsing for that is already handled further on in the cycle - as long as all records are one per line.
Joined: 07 Dec 2007 Posts: 2205 Location: San Jose
John del,
It is easier to split the records based on the delimiter into separate records; the tricky part is joining the pieces of a logical record that are split across input records.
I am not sure why you need WHEN=GROUP in your original JCL, as you are not using the ID anywhere.
Here is an ICETOOL JCL which will give you the desired results. I assumed you have a max of 6 records within a single record.
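The JCL itself isn't reproduced in this excerpt, but for anyone following along, the RESIZE operator it relies on behaves roughly like this (a generic sketch with made-up DD names, not the posted solution):

```
//TOOLIN   DD *
* Combine consecutive 80-byte FB records into single 480-byte
* records (6 x 80), so the pieces of a spanned logical record
* sit side by side and can be split on the delimiter afterwards.
  RESIZE FROM(IN) TO(OUT) TOLEN(480)
/*
```

RESIZE with a TOLEN larger than the input LRECL glues consecutive records together; with a smaller TOLEN it splits longer records into shorter ones, which is what makes it handy for this kind of reconstruction.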
Hi Skolusu -
Excellent, thank you. I would not have thought of using RESIZE for this. I also need to remember how useful OVERLAY is and how to go about using it.
Since my original post, I had changed to using ICETOOL with a temp DSN and tinkered with the parameters. I separated the records in the first CNTL similarly to what you did, but I kept falling down trying to get the records back together in the second CNTL. My main issue was how to reliably distinguish the single-line records from the others.
I see how and why you used OVERLAY in the first INREC to identify the records WITHOUT the character string, as opposed to those that do have it - which is what I had been doing. Using OVERLAY that way makes it easier both to identify the records contained wholly on one line and to group the records with parts that spanned multiple lines.
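The flagging technique being described reads roughly like this (the column position and the X'0A' delimiter are my assumptions, not the posted code):

```
* Put a flag in column 81: '1' when the 80-byte record contains
* an end-of-record byte, '0' when it is a middle piece of a
* logical record that wrapped across lines.
  INREC IFTHEN=(WHEN=INIT,OVERLAY=(81:C'0')),
        IFTHEN=(WHEN=(1,80,SS,EQ,X'0A'),OVERLAY=(81:C'1'))
```

With that flag in place, the complete one-line records and the spanned groups can be handled separately in the later control statements.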
Note: my use of WHEN=GROUP to assign an ID in the original post was from when I had tried concatenating the input file with itself to create a key, using an ID that I could use for grouping later. In the JCL I posted, it had no bearing on the rest of the parameters or the output. Sorry if it caused confusion.