I would like to split my input file into several output files using the following rules:
1. Split every group of lines that starts with a "B" line, followed by several "C", "D", "E" lines, into a new file.
2. If columns 23-25 of the "C" line are not equal to "203", omit the whole group of "B", "C", "D", "E", "C", "D", "E" lines.
"B" defines the start of a group; are there no other types of group?
Why isn't your entire input included in output file 1?
Does a "C" record immediately follow a "B" record?
What do you want to do when there are two (or more) "C" records within a "B" group?
Probably more, but it is difficult to tell when so little is known.
Here are my requirements: the file is FB with a record length of 512, and records start with "B", "C", "D" or "E". It is OK not to split the file, and instead to reformat it by copying the B line before every C line.
Here is my group definition: a group starts with a B line and ends with an E line, like BCDECDE...CDE or BCDE.
Within a group, if columns 23-25 of the C line are not '203', omit the whole BCDE or BCDECDE...CDE group.
Within a group, if columns 23-25 of the C line are '203', reformat BCDECDECDE to BCDEBCDEBCDE by copying the B line before every CDE set.
The output needs to look like this: the B line is copied in front of the second CDE set, and the last "BCDE" group, with 227 in columns 23-25 of the C line, is omitted.
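To pin the requirement down, here is a small Python model of the rules above. It is only a sketch of the intended result, not sort control statements. The helper names (`make`, `split_groups`, `reformat`) are made up for illustration, the records are short stand-ins rather than 512 bytes, and it assumes a group is kept only when every C line in it carries '203'.

```python
def make(kind, code="   "):
    """Stand-in record: type in column 1, code in columns 23-25."""
    return kind + " " * 21 + code

def split_groups(records):
    """A group starts at each B record and runs until the next B."""
    groups = []
    for rec in records:
        if rec[0] == "B":
            groups.append([rec])
        elif groups:
            groups[-1].append(rec)
    return groups

def reformat(records):
    """Drop non-203 groups; copy the B record in front of every C record."""
    out = []
    for group in split_groups(records):
        header, body = group[0], group[1:]
        c_lines = [r for r in body if r[0] == "C"]
        # Assumption: every C line in the group must have '203' in columns 23-25.
        if not c_lines or any(r[22:25] != "203" for r in c_lines):
            continue
        for rec in body:
            if rec[0] == "C":
                out.append(header)
            out.append(rec)
    return out

sample = [make("B"), make("C", "203"), make("D"), make("E"),
          make("C", "203"), make("D"), make("E"),
          make("B"), make("C", "227"), make("D"), make("E")]
print("".join(r[0] for r in reformat(sample)))  # BCDEBCDE
```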
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
OK, that makes things clearer, and easier.
Use OPTION COPY, INREC IFTHEN=(WHEN=GROUP to identify the B-record and PUSH the entire record to a temporary extension of your data, PUSH=(513:1,512).
The B record is now tucked away safely for when you need it.
In OUTFIL, use OMIT= to get rid of all the original B records. Use IFTHEN=(WHEN=(logical expression) to identify the C records, then use the / (Slash Operator) in BUILD to create two records for output, the B record first, followed by the C: BUILD=(513,512,/,1,512). Use IFOUTLEN=512 to set the record length (which chops the B record, from position 513, off all the other records).
I would write something to check, 100%, the structure of your file, and consider how it remains valid, but you've probably already dealt with this...
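The record flow described above can be modelled in a few lines of Python. This is a sketch of the logic only, not sort syntax: `inrec_push` and `outfil` are hypothetical names standing in for the INREC and OUTFIL steps, and it assumes the file starts with a B record.

```python
LRECL = 512

def inrec_push(records):
    """Model of WHEN=GROUP with PUSH=(513:1,512): extend each record
    with a copy of the most recent B record."""
    extended, current_b = [], " " * LRECL
    for rec in records:
        if rec[:1] == "B":
            current_b = rec
        extended.append(rec + current_b)  # positions 513-1024 hold the B record
    return extended

def outfil(extended):
    """Model of OMIT=, BUILD with the slash operator, and IFOUTLEN=512."""
    out = []
    for rec in extended:
        if rec[:1] == "B":
            continue                           # OMIT= drops the original Bs
        if rec[:1] == "C":
            out.append(rec[LRECL:2 * LRECL])   # stored B record first ...
            out.append(rec[:LRECL])            # ... then, after '/', the C record
        else:
            out.append(rec[:LRECL])            # truncated back to 512 bytes
    return out

records = [kind.ljust(LRECL) for kind in "BCDECDE"]
result = outfil(inrec_push(records))
print("".join(r[0] for r in result))  # BCDEBCDE
```

Every output record is back to 512 bytes, and each C is preceded by a generated copy of its B.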
Hi Bill,
Thanks very much for the guidance. I tried the code below, but received the error message "END OF SORTOUT FIELD BEYOND MAXIMUM RECORD LENGTH".
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
The original solution was removing all the B records, as they were being inserted when a C record was encountered.
Input
Code:
B
C
D
E
C
D
E
After INREC
Code:
B+B
C+B
D+B
E+B
C+B
D+B
E+B
After OMIT= on OUTFIL
Code:
C+B
D+B
E+B
C+B
D+B
E+B
After IFTHEN with BUILD and slash operator
Code:
B
C
D+B
E+B
B
C
D+B
E+B
As the records are written from OUTFIL (the other extended Bs disappear due to the IFOUTLEN=512)
Code:
B
C
D
E
B
C
D
E
This would be for all B groups, which is not what you wanted. You only want extra B records inserted if a C is 203. For the first sub-group within a B, the C which is a 203 already has a B, so you need to code for that.
So, the OMIT= must be removed, as the B records for other groups are not going to be inserted before each C. This is nothing to do with omitting groups, it just happens to be the same word.
Because the OMIT= goes, you have this for your IFTHEN in OUTFIL:
Code:
B+B
C+B
D+B
E+B
C+B
D+B
E+B
Now we pretend that both of those Cs have 203 on, so we want to generate a B for them.
Except, for the first C, we don't. It already has the B.
So we need to differentiate between the first C in a group, and other Cs in a group.
This is what is adding the entire B record to each record. Now we need something else. If you look at PUSH, you'll see there is not much else we can add: an ID (a sequence number at group level) or a SEQ (a sequence number within the group).
With only one of each of D and E, and exactly one, we could always predict a number for a C. We don't need to do that. We only need to be able to identify the first C, and, from what you have said, we can always guarantee that by testing for the value 002.
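As a rough illustration of the SEQ idea, here is a Python model (not sort syntax; the function names are invented for this sketch). It assumes, as stated in the thread, that a C record always follows its B immediately, so the first C in a group always has sequence number 2.

```python
def push_with_seq(records):
    """Model PUSH of the B record plus SEQ: return (record, b_record, seq)."""
    tagged, header, seq = [], None, 0
    for rec in records:
        if rec[:1] == "B":
            header, seq = rec, 0      # a B record restarts the group
        if header is not None:
            seq += 1                  # SEQ counts records within the group
        tagged.append((rec, header, seq))
    return tagged

def insert_b_before_later_cs(records):
    """Keep the original B; generate a B only for Cs that are not the first."""
    out = []
    for rec, header, seq in push_with_seq(records):
        if rec[:1] == "C" and seq != 2 and header is not None:
            out.append(header)        # later C: generate its B record
        out.append(rec)
    return out

sample = ["B", "C", "D", "E", "C", "D", "E"]
print([seq for _, _, seq in push_with_seq(sample)])  # [1, 2, 3, 4, 5, 6, 7]
print("".join(insert_b_before_later_cs(sample)))     # BCDEBCDE
```

The first C (sequence 2) keeps the original B in front of it, and only the second C (sequence 5) gets a generated one.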
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
I think you've got that, including the NE.
I missed/mis-interpreted the need to drop some stuff.
So back to dropping all the Bs, and generating all the Bs, since it is only the 203 groups you need.
Add a new WHEN=GROUP with PUSH for the position of the 203. BEGIN this group only when the SEQ added in the first PUSH is "two", and END it when a 'B' is found. Ensure that the SEQ is big enough for the maximum number of records in a group, and then add another digit.
Use the 203 (which is now on all relevant records if present).
Note that other than the first B, each B will have the 203-position-value of the previous group (as they will end the group), but this does not matter, as all the Bs are ignored.
Your INCLUDE=/OMIT= should:
Ignore all Bs in position 1.
Keep all other records which have 203 in your new PUSHed field.
In the OUTFIL, you need to generate a B with the Slash Operator for every C.
Although you no longer need to reference the sequence number in the OUTFIL, it is now needed earlier for the new WHEN=GROUP.
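Putting those steps together, a Python model of the whole flow might look like the sketch below. It is not sort syntax, and `make`, `tag` and `outfil` are hypothetical names. It assumes the file starts with a B record and, as in the thread, that the C at sequence 2 decides for the whole group. The records are short stand-ins with the code in columns 23-25.

```python
def make(kind, code="   ", note=""):
    """Stand-in record: type in column 1, code in columns 23-25."""
    return (kind + note).ljust(22) + code

def tag(records):
    """Model of both WHEN=GROUP clauses: return (record, b_record, code)."""
    tagged, header, seq, code = [], None, 0, "   "
    for rec in records:
        if rec[:1] == "B":
            header, seq = rec, 0          # first group: push the B, restart SEQ
        if header is not None:
            seq += 1
        if seq == 2:
            code = rec[22:25]             # second group begins: push cols 23-25
        # A B record still carries the previous group's code here; harmless,
        # because every B is dropped in outfil().
        tagged.append((rec, header, code))
    return tagged

def outfil(tagged):
    """Drop all Bs, keep only '203' records, generate a B before every C."""
    out = []
    for rec, header, code in tagged:
        if rec[:1] == "B" or code != "203":
            continue
        if rec[:1] == "C":
            out.append(header)
        out.append(rec)
    return out

sample = [make("B", note="1"), make("C", "203"), make("D"), make("E"),
          make("C", "203"), make("D"), make("E"),
          make("B", note="2"), make("C", "227"), make("D"), make("E"),
          make("B", note="3"), make("C", "203"), make("D"), make("E")]
result = outfil(tag(sample))
print("".join(r[0] for r in result))  # BCDEBCDEBCDE
```

The 227 group disappears completely, and each remaining C is preceded by the B of its own group.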
Hi Bill,
I didn't clearly understand your suggestion to add the new WHEN=GROUP with PUSH for the position of the 203.
I tried the code below, but it seems the group is not determined correctly: only the 'B' lines for '227' were deleted.
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
OK. Good going. I'd have used NE 203, but if there are only 227 and 203 the results would be the same, if a little less clear (as that knowledge of the data is required to understand the code).
If you have a new question, please post it as a new question.