ABINAYATHULASI
New User
Joined: 28 May 2016 Posts: 3 Location: India
I have a file with a single record spanning multiple lines. There are many such records in that file.
E.g. (note: the single record spans from H1 to S1; it may run to 15,000 lines in a single record):
H1.......
H2........
H3......
D1......
D2.......
DN ST .........
DN BZ.........
DH.......
D1....
D2.....
S1........
I want to validate whether the DN ST line alone matches what I pass in the control card. If it matches, I have to write the entire block to the output file; if it doesn't match, ignore the block.
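To make the requirement concrete, here is a minimal Python sketch of the logic being asked for (illustrative only; the actual solution on z/OS would be COBOL or DFSORT). The record prefixes `H1`, `DN ST`, and `S1` are taken from the example above; the `key` parameter stands in for the value passed via the control card.

```python
def filter_groups(records, key):
    """Keep only the groups (H1 .. S1) whose 'DN ST' record contains key."""
    kept, group, matched = [], [], False
    for rec in records:
        if rec.startswith("H1"):            # header: start a new group
            group, matched = [rec], False
        else:
            group.append(rec)
            if rec.startswith("DN ST") and key in rec:
                matched = True              # this group satisfies the test
            if rec.startswith("S1"):        # trailer: group is complete
                if matched:
                    kept.extend(group)      # write the entire block
                group = []
    return kept

data = ["H1 a", "D1 x", "DN ST ABC", "S1 end",
        "H1 b", "D1 y", "DN ST XYZ", "S1 end"]
result = filter_groups(data, "ABC")
```

This buffers one group at a time, which is the crux of the problem: the `DN ST` record may appear anywhere in a group of up to 15,000 records, so a decision cannot be made until the matching record (or the trailer) is seen.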
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10889 Location: italy
Quote:
I have a file with a single record spanning multiple lines. There are many such records in that file.
it might be wiser for You to review Your understanding of the terminology used for z/OS/MVS data management.
z/OS data management is RECORD oriented,
so ...
You have RECORDS, not LINES,
which can be identified as header/trailer/detail records,
and which, according to the header/trailer/detail, can be seen as different logical groups.
Nic Clouston
Global Moderator
Joined: 10 May 2007 Posts: 2454 Location: Hampshire, UK
More terminology - it is a 'data set', not a 'file'.
Is that really one record, or is it a group of related records? Perhaps with header(s) and maybe a trailer?
If it is a group of records then you should peruse the DFSORT forum for similar requirements.
Abid Hasan
New User
Joined: 25 Mar 2013 Posts: 88 Location: India
Hello,
Based on what is shared, and if I understand it correctly, the requirement is to read a dataset and test each record for a certain value. All of the records being tested WILL BE part of a group, which can be uniquely identified by an existing header and trailer record. There can be multiple groups, and each group can have as many as 15k records.
If the value being tested matches, then the entire group of records needs to be written to output; otherwise the group is discarded.
If the above understanding is correct, AND you're looking for a COBOL solution - which will be much simpler to code than DFSORT - the underlying challenge is to hold the group of records in place until the test is complete. So far the TS has not stated where the record to be tested will be present, so the assumption is that it can occur anywhere in the group.
Define a table large enough to hold the entire group of records, say 15000.
Read each record; when you identify a header, start writing to the table. At the same time, test whether YOUR condition holds true. If it does, set a flag, write the complete table accumulated to this point to the output DS, and then write subsequent records directly to output (no need to fill the table any more until the trailer record for this group is encountered).
Once the trailer record is encountered, stop writing to output, initialize the table, and repeat the complete process as stated earlier.
In the case of DFSORT, I think at best 2 passes of the data will be required, and presumably the code will be a little complex; the reason remains the same - finding an algorithm to uniquely identify the data to be grouped and written.
Edit: The complexity increases if the position of the value to be checked is not fixed in the record.
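The buffer-then-stream algorithm described above can be sketched as follows (a Python simulation of the COBOL table logic, for illustration; the `H1`/`DN ST`/`S1` prefixes are assumptions carried over from the TS's example):

```python
def copy_matching_groups(records, key, out):
    """Buffer each group in a table until the test record is seen; once the
    condition matches, flush the table and stream the rest of the group."""
    table, streaming = [], False
    for rec in records:
        if rec.startswith("H1"):            # header: start a fresh table
            table, streaming = [rec], False
        elif streaming:
            out.append(rec)                 # condition already met: write directly
        else:
            table.append(rec)
            if rec.startswith("DN ST") and key in rec:
                out.extend(table)           # flush the buffered group
                table, streaming = [], True
        if rec.startswith("S1"):            # trailer: reset for the next group
            table, streaming = [], False
```

The flag (`streaming`) plays the same role as the COBOL switch described above: once it is set, records bypass the table entirely until the trailer arrives.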
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10889 Location: italy
Quote:
Define a table large enough to hold the entire group of records, say 15000.
simpler to access the same dataset using two file definitions:
synchronize the file1/file2 reads on the same header;
use file1 to keep reading until the next header, remembering whether the current block must be copied;
if so, read file2 and write until the next header,
otherwise read file2 (without writing) until the next header;
repeat until done.
beware ...
You have to process the last block of records remembering that You have an end of file instead of a new header.
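The two-reader idea can be simulated like this (an illustrative Python sketch; on z/OS the two "readers" would be two file definitions/DD statements over the same dataset, not list indices, and the `H1`/`DN ST` prefixes are assumptions from the TS's example):

```python
def block_bounds(records):
    """Return (start, end) index pairs, one per block, where each block
    runs from an H1 header up to the next H1 (or end of file)."""
    starts = [i for i, r in enumerate(records) if r.startswith("H1")]
    return list(zip(starts, starts[1:] + [len(records)]))

def two_reader_copy(records, key):
    out = []
    for start, end in block_bounds(records):
        block = records[start:end]          # reader 1: scan the whole block
        if any(r.startswith("DN ST") and key in r for r in block):
            out.extend(block)               # reader 2: re-read and copy it
    return out
```

The advantage over the table approach is that no working-storage table is needed at all: reader 1 only has to remember one yes/no decision per block, and the last block falls out naturally because end of file closes it instead of a new header.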
RahulG31
Active User
Joined: 20 Dec 2014 Posts: 446 Location: USA
Abid Hasan, I would say we use both DFSORT and a COBOL program.
Step 1: Use DFSORT with IFTHEN=(WHEN=GROUP,BEGIN=(header identifier),PUSH=(header record, or the part of the header record that uniquely identifies it among the different headers)).
Then OUTFIL with INCLUDE= for the condition you want to check, and BUILD to write out the PUSHed header.
The output of this step should be the uniquely identifiable header records of the groups that satisfy the required condition.
Step 2: Create a COBOL program that uses the original input file as well as the file we got from Step 1 as input. The first step in the COBOL program is to load the file from Step 1 into an internal table, then READ the original input file and match each header against this internal table. If a match is found, write to output.
The advantage over a COBOL-only solution is that you don't have to write and initialize the table multiple times, and it won't be as large a table as it otherwise could be, I suppose.
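The two-step flow can be sketched in Python for illustration (step 1 stands in for the DFSORT WHEN=GROUP/PUSH pass, step 2 for the COBOL program with its internal table; the `H1`/`DN ST` prefixes are assumptions from the TS's example):

```python
def step1_matching_headers(records, key):
    """Step 1 (DFSORT's role): collect the headers of groups whose
    'DN ST' record satisfies the condition."""
    header, hits = None, set()
    for rec in records:
        if rec.startswith("H1"):
            header = rec                    # the PUSHed header for this group
        elif rec.startswith("DN ST") and key in rec:
            hits.add(header)                # this group satisfies the condition
    return hits

def step2_copy_groups(records, headers):
    """Step 2 (the COBOL program's role): copy a group only when its
    header is found in the internal table built from step 1."""
    out, keep = [], False
    for rec in records:
        if rec.startswith("H1"):
            keep = rec in headers           # look the header up in the table
        if keep:
            out.append(rec)
    return out
```

Note the internal table here holds only one header per qualifying group, which is exactly the size advantage described above.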
Arun Raj
Moderator
Joined: 17 Oct 2006 Posts: 2481 Location: @my desk
I think we discussed similar topics a few months ago in the DFSORT forum. If we are bringing the sort product into this, then I would think JOINing the original input with a shortened version of itself (created as Rahul mentioned), using a JNFnCNTL, would do this.
But well then, we are discussing a COBOL solution here, and I see enrico's logic has eliminated the need for any working-storage table.
Abid Hasan
New User
Joined: 25 Mar 2013 Posts: 88 Location: India
Hello Rahul/Arun,
You are both spot on with the DFSORT approaches; the only problem I could think of at the time of posting was the need for at least 2 passes of the data. The reason remains the same - a first pass is required to traverse the data once, identify the groups, and pad on an identifier; the rest is an algorithm to play with that and segregate the groups.
Mr. Sorichetti,
Aah, you got me there - a much better approach indeed. I didn't like the solution I'd posted because of the 'undefined/large' size of the table (going by what the TS posted initially). Revisiting the table is a costly affair, though it would have been quick since all operations are in memory. Your solution solves it in a much better way.
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
With Enterprise COBOL V4, there will be no problem storing 15,000 8000-byte records in a table. With V5+, there is no problem storing 15,000 records of any length from a sequential data set.