Can you review the scenario below and give me your suggestions?
My input file contains a set of groups. Each group begins with a record that starts with the characters ISA and ends with a record that starts with IEA. Everything between these records, including the two records themselves, is a GROUP.
My requirement is to identify these groups and apply one further condition: if the 3rd record of a group starts with the characters GS, that whole group should be written to an output file.
I have tried the code below but got stuck partway. Any assistance would be greatly appreciated.
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
You are testing 1,1 for C'ISA'. That isn't going to work, although it may look like it works if there are no other "C"s in the first column of the records.
You don't have any code to END the group. You will also need to check that the end of the group is present: at the top of a group, SORT has no way of knowing that an ending record is going to follow.
You don't test for the value on the third record, you are just adding a constant to the third record of a group, all of them.
Can there be a starting record without an ending record in your data? Can you provide representative sample data, and expected output?
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
Try this. You will need to understand it. If you do have a group without the end record but with GS in the third position, it will be extracted, and remember that the group will then run on until the next start record. You said that can't happen, so this is a solution.
Hi Bill, the code ran fine, but the output looks exactly the same as the input. There seems to be a condition missing to eliminate the groups whose 3rd record does not have "GS" in its first 2 characters. I'm checking at my end for any gap, but would you please let me know if you see an issue with the code? Here are the messages from the spool:
Code:
SYNCSORT LICENSED FOR CPU SERIAL NUMBER XXXXX, MODEL XXXX XX LICENSE/PRODUCT EXPIRATION DATE: 31 DEC 2016
SYSIN :
  INREC IFTHEN=(WHEN=GROUP,
                BEGIN=(1,3,CH,EQ,C'ISA'),
                END=(1,3,CH,EQ,C'IEA'),
                PUSH=(81:ID=8,90:SEQ=6)),
        IFTHEN=(WHEN=GROUP,
                BEGIN=(90,6,CH,EQ,C'000001'),
                PUSH=(100:1,80),
                RECORDS=3),
        IFTHEN=(WHEN=GROUP,
                BEGIN=(90,6,CH,EQ,C'000002'),
                PUSH=(180:1,80),
                RECORDS=2),
        IFTHEN=(WHEN=GROUP,
                BEGIN=(90,6,CH,EQ,C'000003',&,1,2,CH,EQ,C'GS'),
                END=(1,3,CH,EQ,C'IEA'),
                PUSH=(97:1,2))
  SORT FIELDS=COPY
  OUTFIL INCLUDE=(90,6,CH,GE,C'000003',&,97,2,CH,EQ,C'GS'),
         IFTHEN=(WHEN=(90,6,CH,EQ,C'000003'),
                 BUILD=(100,80,/,
                        180,80,/,
                        1,80)),
         IFTHEN=(WHEN=NONE,
                 BUILD=(1,80))
WER108I SORTIN : RECFM=FB ; LRECL= 80; BLKSIZE= 27920
WER073I SORTIN : DSNAME=D956.EDISQ.PBACKUP.INBOUND.EDIDATA.G3172V00
WER257I INREC RECORD LENGTH = 259
WER238I POTENTIALLY INEFFICIENT USE OF INREC
WER110I SORTOUT : RECFM=FB ; LRECL= 80; BLKSIZE= 27920
WER074I SORTOUT : DSNAME=#956.EDISQ.PBACKUP.INBOUND.EDID
WER405I SORTOUT : DATA RECORDS OUT 23133; TOTAL RECORDS OUT 23507
WER449I SYNCSORT GLOBAL DSM SUBSYSTEM ACTIVE
WER054I RCD IN 23507, OUT 23507
WER169I RELEASE 1.4 BATCH 0520 TPF LEVEL 0.1
WER052I END SYNCSORT - #9101164,SORT001,,DIAG=8200,51CE,AA00,00E6,CAFE,6DE2,AA08,2CE4
My apologies. Every group in my input data had a GS record. I have now changed the data and tested several more times, and it is working perfectly. Thank you very much for your patience and your time.
However, I understood the logic only up to SORT FIELDS=COPY. I did not completely understand the BUILD statement in the OUTFIL option.
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
No problem. I know that. You'll only get output if GS is present on the third record of a group, and you were getting everything output, so GS was present in all groups.
Records one and two of a group will be ignored - their data has been appended to record three (strictly, record one's data is appended to records one through three, and record two's data appended to records two through three - record three is used because it has (or doesn't have) the GS on it).
When record three gets past the INCLUDE=, we need to strip out the extra data from that record, and make it into new records one and two again.
This is done with the slash operator (/). This, in BUILD in OUTFIL only, says "finished with the previous, now make a new record". If you look at the reporting functions of OUTFIL you'll find uses of it to create multiple lines for headers, trailers and control breaks as well.
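To see the slash on its own, here is a minimal sketch that is separate from the solution in this thread. The 40-byte split is only an assumed example: each 80-byte input record would become two 40-byte output records, so the output dataset would need LRECL=40.

```text
  OPTION COPY
* Each / ends the current output record and starts a new one.
  OUTFIL BUILD=(1,40,/,41,40)
```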
So, from the third record of the original group (now the first record of the group) three records are created, firstly from the data on the original first record, then the data on the original second record, then the actual data belonging to the current record.
Then, for records which aren't the third, a simple BUILD drops off the excess bytes (group ID, group sequence, GS marker, and 160 bytes of blanks).
You can change that simple IFTHEN and BUILD to IFOUTLEN=80, once you understand the process. This will set the record-length to 80 at the end of IFTHEN processing.
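A sketch of how the OUTFIL might look with that change. It is untested and assumes the INREC statements stay exactly as posted and that your SyncSort level accepts IFOUTLEN on OUTFIL. The WHEN=NONE clause is gone; records that do not match the first IFTHEN are simply cut back to 80 bytes by IFOUTLEN.

```text
  SORT FIELDS=COPY
* Third record of a selected group: rebuild records one, two and three.
* All other selected records: left alone, then truncated by IFOUTLEN.
  OUTFIL INCLUDE=(90,6,CH,GE,C'000003',&,97,2,CH,EQ,C'GS'),
         IFOUTLEN=80,
         IFTHEN=(WHEN=(90,6,CH,EQ,C'000003'),
                 BUILD=(100,80,/,
                        180,80,/,
                        1,80))
```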
If you want to see the data as it was created out of INREC, drop the IFTHEN processing on the OUTFIL and send the SORTOUT to a dataset. You can then see the extended records and the data they contain.
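For that experiment the OUTFIL could be cut down to something like the sketch below, with the INREC statements unchanged. This is an assumption about how you would set it up, not something taken from your job: the records will be written at the INREC length (259 bytes according to your WER257I message), so the SORTOUT dataset should not be forced to LRECL=80.

```text
  SORT FIELDS=COPY
* No IFTHEN/BUILD here, so the extended 259-byte records are written as-is:
*   81-88   group ID           90-95    sequence within the group
*   97-98   GS marker          100-179  copy of record one
*   180-259 copy of record two
  OUTFIL INCLUDE=(90,6,CH,GE,C'000003',&,97,2,CH,EQ,C'GS')
```

Removing the INCLUDE as well would show every record, including records one and two of each group and the groups that were not selected.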
Thank you very much, Bill. Everything makes sense now, and thank you for the explanation. I understand the first part even better now. Yes, I would like to experiment with it in practice by dropping the IFTHEN and directing the data to SORTOUT. Have a good day ahead.