Design Conundrum

vasan2 · New User Joined: 09 Apr 2010 Posts: 3 Location: uk

I’m working on an application design and have a tricky design issue to handle.

I have 2 input files, say File A and File B. Both files are huge in terms of volume (millions of records) and record length (each record consists of about 2000 fields).

The application design is for a migration project. File A is the data from the old system, File B is the data from the new system, and File C is the output, which will contain a combination of both.

The COBOL program has to read File A and File B and produce File C. This is simple enough. But the tricky part is:

- The program has to dynamically determine where it should get the data for each field on the output file.

- Say A1 through A9999 are my File A fields, B1 through B9999 are my File B fields, and C1 through C9999 are my output File C fields. At runtime the program has to determine:

1. Whether the data for C1 should come from File A or File B

2. If File A, which field?

3. If File B, which field?

Point 1 can be controlled via a DB2 table, so the program can read the table and determine which file to use.

But I don't have a clear idea of how points 2 and 3 can be achieved. To me it sounds like dynamic COBOL MOVE statements, which I believe COBOL does not support.

Any thoughts on how this can be done?

CICS Guy · Posted: Thu Apr 15, 2010 5:03 pm

vasan2 · New User Joined: 09 Apr 2010 Posts: 3 Location: uk

To answer CICS Guy,

Well, File C will be a replacement for File A, with some formatting, at the end of the migration project.

But the business is looking at rolling out the new system in phases. So we know what data will go into the output file for some fields, but we don't know the data source for the rest at this stage.

The requirement is to have a flexible design that dynamically determines which field's data should map to a particular field on the output file (the mapping can perhaps be maintained as a DB2 table?). Given that situation, I'm wondering whether it's possible to have such a flexible design in a COBOL program.

Robert Sample · Posted: Thu Apr 15, 2010 6:38 pm

Reference modification allows you to specify a starting location and length for a move and both location and length can be variables, so what you are wanting to do can be done. However, if you're not sure at this point what fields you are wanting to move from which file then I suggest using reference modification would be the height of silliness. The design must be complete before coding. How do you know when the design is complete? When the source -- which file and which bytes of that file -- of every field in the output file is known and documented.
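As an illustration of the mechanism described above, here is a minimal sketch of a reference-modified MOVE whose start positions and length are variables. The program name, data names, and sample values are all hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. REFMOD1.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical record areas standing in for the real records.
       01  WS-IN-REC            PIC X(15) VALUE 'AAAAABBBBBCCCCC'.
       01  WS-OUT-REC           PIC X(15) VALUE SPACES.
      * Start positions (1-based) and length are plain variables,
      * so they could be loaded from a mapping table at run time.
       01  WS-FROM-POS          PIC 9(4) COMP VALUE 6.
       01  WS-TO-POS            PIC 9(4) COMP VALUE 1.
       01  WS-LEN               PIC 9(4) COMP VALUE 5.
       PROCEDURE DIVISION.
      * Copy bytes 6-10 of the input into bytes 1-5 of the output.
           MOVE WS-IN-REC (WS-FROM-POS:WS-LEN)
             TO WS-OUT-REC (WS-TO-POS:WS-LEN)
           DISPLAY WS-OUT-REC
           GOBACK.
```

With these sample values the first five bytes of WS-OUT-REC become 'BBBBB'. A reference-modified item is treated as alphanumeric, so this is a plain byte copy with no data conversion.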

Ronald Burr · Posted: Thu Apr 15, 2010 6:51 pm

Hopefully, neither File-A nor File-B has fields that are subject to OCCURS DEPENDING ON clauses. If not, here is what I would suggest.

1) In the program, construct an internal table for each file (File-A, File-B, and File-C) which will contain (max-no-of-fields) entries of 3 elements each: an argument of Field-Number (1 to max), and results of Field-Offset in the file (relative to 1) and Field-Length. These tables ( Table-A, Table-B, and Table-C ) can be defined either in the program source, or constructed at run-time from external definition files.
2) In the program, allocate a fourth table ( Table-D ) to be built at run-time, which will consist of 4 elements for each entry: File-A-Field-Offset, File-B-Field-Offset, File-C-Field-Offset, and File-C-Field-Length (it is assumed that the field length for File-A, File-B, and File-C are the same).
3) At run time, read in your "dynamic" requirements, which will contain records defining, for each output field, the File-C-Field-Number, the File-A-Field-Number (0 if the field is to come from File-B), and the File-B-Field-Number (0 if the field is to come from File-A).
4) As you read in each "dynamic" requirement record, use the File-C-Field-Number as a subscript into Table-C to retrieve the Field-Offset and Field-Length for File-C. Store these values into Table-D using the File-C-Field-Number as a subscript. Likewise, use the File-A-Field-Number (if not zero) as a subscript into Table-A to retrieve the Field-Offset for File-A and the File-B-Field-Number (if not zero) as a subscript into Table-B to retrieve the Field-Offset for File-B. Store the resulting values (zero for the offset of the unused file) into Table-D as appropriate.
5) Now, as you populate the input file records and prepare to construct File-C, loop thru Table-D from beginning to end for each output record, like the following.
In working storage:
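A minimal sketch of how the Table-D loop in step 5 might look. This is an illustration only, not the poster's own listing: the data names, the two-entry table, and the hard-coded entries are assumptions, and steps 1 to 4 (building the tables from external definitions) are skipped.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TABLED.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical stand-ins for the three record areas.
       01  FILE-A-REC           PIC X(10) VALUE 'aaaaAAAAAA'.
       01  FILE-B-REC           PIC X(10) VALUE 'bbbbbbBBBB'.
       01  FILE-C-REC           PIC X(10) VALUE SPACES.
      * Table-D: one entry per output field (only two here).
       01  TABLE-D.
           05  TD-ENTRY OCCURS 2 TIMES INDEXED BY TD-IX.
               10  TD-A-OFFSET  PIC 9(5) COMP.
               10  TD-B-OFFSET  PIC 9(5) COMP.
               10  TD-C-OFFSET  PIC 9(5) COMP.
               10  TD-C-LENGTH  PIC 9(5) COMP.
       01  WS-FROM              PIC 9(5) COMP.
       01  WS-TO                PIC 9(5) COMP.
       01  WS-LEN               PIC 9(5) COMP.
       PROCEDURE DIVISION.
      * Steps 3 and 4 would build Table-D from the requirement
      * records; the entries are hard-coded here for illustration.
      * Output field 1 (bytes 1-4) comes from File-B bytes 7-10.
           MOVE 0 TO TD-A-OFFSET (1)
           MOVE 7 TO TD-B-OFFSET (1)
           MOVE 1 TO TD-C-OFFSET (1)
           MOVE 4 TO TD-C-LENGTH (1)
      * Output field 2 (bytes 5-10) comes from File-A bytes 5-10.
           MOVE 5 TO TD-A-OFFSET (2)
           MOVE 0 TO TD-B-OFFSET (2)
           MOVE 5 TO TD-C-OFFSET (2)
           MOVE 6 TO TD-C-LENGTH (2)
      * Step 5: walk Table-D once for each output record.
           PERFORM VARYING TD-IX FROM 1 BY 1 UNTIL TD-IX > 2
               MOVE TD-C-OFFSET (TD-IX) TO WS-TO
               MOVE TD-C-LENGTH (TD-IX) TO WS-LEN
               IF TD-A-OFFSET (TD-IX) > 0
                   MOVE TD-A-OFFSET (TD-IX) TO WS-FROM
                   MOVE FILE-A-REC (WS-FROM:WS-LEN)
                     TO FILE-C-REC (WS-TO:WS-LEN)
               ELSE
                   MOVE TD-B-OFFSET (TD-IX) TO WS-FROM
                   MOVE FILE-B-REC (WS-FROM:WS-LEN)
                     TO FILE-C-REC (WS-TO:WS-LEN)
               END-IF
           END-PERFORM
           DISPLAY FILE-C-REC
           GOBACK.
```

With the sample data, FILE-C-REC ends up as 'BBBBAAAAAA'. In a real program the loop would sit inside the record-processing logic and run against the FD record areas, and the table would have one entry per output field.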

dbzTHEdinosauer · Posted: Thu Apr 15, 2010 7:07 pm

One of the goals of a migration is to normalize entities (fields representing balance, number of..., last date of ....) - usually to convert from one file representation to another.

reference modification does not allow for the numeric conversion of fields (display to comp, comp-3, etc...).

I would prefer to drive from db2 tables, but external files read in to internal cobol tables will do also.

I would CODE a MOVE for every file a field to file c - without ref mod, each move within its own paragraph/section. Likewise for every file b field to file c.

in addition, I would have routines to make adjustments (rounding, adding, subtracting factors); the table would carry an additional code to indicate what needs to be done to migrate from the file a/b field to file c.

then have two rather large EVALUATEs, one each for file a and file b (or a GO TO ... DEPENDING ON), to identify which field is involved and perform a routine to generate the required file c field.

you could drive it the other way and use file c fields as the determining factor for the EVALUATE with the subroutines based on file a or b.

either way, I would stay away from reference modification.
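A minimal sketch of the EVALUATE-driven approach described above, with one hand-coded MOVE per paragraph. The record layouts, field numbers, and paragraph names are hypothetical, and only two fields are shown; the field number is assumed to come from the mapping table.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. EVALMAP.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical layouts; the real records have ~2000 fields.
       01  FILE-A-REC.
           05  A-BALANCE        PIC S9(7)V99 VALUE 1234.56.
           05  A-NAME           PIC X(10)    VALUE 'SMITH'.
       01  FILE-C-REC.
           05  C-BALANCE        PIC S9(9)V99 COMP-3.
           05  C-NAME           PIC X(10).
      * Field number that would come from the mapping table.
       01  WS-A-FIELD-NO        PIC 9(4).
       01  WS-SHOW-BALANCE      PIC -(9)9.99.
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE 1 TO WS-A-FIELD-NO
           PERFORM MAP-FROM-FILE-A
           MOVE 2 TO WS-A-FIELD-NO
           PERFORM MAP-FROM-FILE-A
           MOVE C-BALANCE TO WS-SHOW-BALANCE
           DISPLAY C-NAME ' ' WS-SHOW-BALANCE
           GOBACK.
      * One WHEN per source field; each performs a hand-coded MOVE.
       MAP-FROM-FILE-A.
           EVALUATE WS-A-FIELD-NO
               WHEN 1
                   PERFORM A-BALANCE-TO-C
               WHEN 2
                   PERFORM A-NAME-TO-C
               WHEN OTHER
                   DISPLAY 'UNMAPPED FIELD ' WS-A-FIELD-NO
           END-EVALUATE.
      * An elementary MOVE converts zoned decimal to packed decimal.
       A-BALANCE-TO-C.
           MOVE A-BALANCE TO C-BALANCE.
       A-NAME-TO-C.
           MOVE A-NAME TO C-NAME.
```

Because each MOVE names real elementary items, the compiler handles the display to COMP-3 conversion and the differing field sizes, which a byte-level copy would not. The cost is one WHEN and one paragraph per mapped field.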

dick scherrer · Posted: Thu Apr 15, 2010 7:29 pm

Hello,

Both files need to be in order by the same "key".

Then use a match/merge to position the process within the files. If there is an entry in fileA and not B, then all of the data in C would come from A. The same with fileB.

When there is a match, then the decision whether to use fileA or fileB data comes into play.

Suggest that a member be made in a pds that has an entry for each field. These entries would contain the field name and an A/B indicator. This member would be read into the program at the beginning and used to determine the "source" for each field of the matched records.

As the phases progress, this member would be changed to reflect the current "rules". The code would simply be:
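A minimal sketch of what such a per-field test could look like for a matched pair of records. It is an illustration with hypothetical names throughout, not the poster's own listing, and the A/B indicators are set by VALUE clauses here instead of being read from the member.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SRCFLAG.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical matched records with corresponding fields.
       01  FILE-A-REC.
           05  A-NAME           PIC X(10) VALUE 'OLD NAME'.
           05  A-CITY           PIC X(10) VALUE 'OLD CITY'.
       01  FILE-B-REC.
           05  B-NAME           PIC X(10) VALUE 'NEW NAME'.
           05  B-CITY           PIC X(10) VALUE 'NEW CITY'.
       01  FILE-C-REC.
           05  C-NAME           PIC X(10).
           05  C-CITY           PIC X(10).
      * A/B indicators, as if loaded from the PDS member at start-up.
       01  WS-SOURCE-FLAGS.
           05  SRC-NAME         PIC X VALUE 'A'.
           05  SRC-CITY         PIC X VALUE 'B'.
       PROCEDURE DIVISION.
      * One test per output field: take it from File A or File B.
           IF SRC-NAME = 'A'
               MOVE A-NAME TO C-NAME
           ELSE
               MOVE B-NAME TO C-NAME
           END-IF
           IF SRC-CITY = 'A'
               MOVE A-CITY TO C-CITY
           ELSE
               MOVE B-CITY TO C-CITY
           END-IF
           DISPLAY C-NAME C-CITY
           GOBACK.
```

With these indicators, C-NAME is taken from File A and C-CITY from File B. The sketch assumes that each output field has exactly one candidate field in each input file, fixed at compile time; only the choice of file is made at run time.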

vasan2 · New User Joined: 09 Apr 2010 Posts: 3 Location: uk

Many thanks for taking time to read my post and posting valuable suggestions.

Dick, I did think about this before. But the issue is, I wouldn't know at run time which field on File B will hold the data that I want to move to File C. Though I can specify the mapping in the data member or a file/table, I can't figure out how to use that physical field name in the MOVE statement. Well, I could possibly use EVALUATE to test all the fields in the files, which is a massive task, considering the volume is in millions/billions and each record contains about 2000 fields.

Ronald, thanks for the well-explained solution. I'm slightly apprehensive about using reference modification. Considering it's dynamic, any slight format variation will produce unpredictable results.

Thanks again for your valuable suggestions. It has definitely given me something to think about.

enrico-sorichetti · Posted: Thu Apr 15, 2010 8:28 pm

dick scherrer · Posted: Thu Apr 15, 2010 10:01 pm

Hello,

enrico-sorichetti · Posted: Thu Apr 15, 2010 10:03 pm