I have a huge file with 60 million records that have to be loaded into a target table. Before loading, for each and every record I have to read some other tables to pick up additional information.
Which is quicker?
- Read the input file in a COBOL program, fetch the additional information from the other tables, write an output file in the layout of the target table, and then load that file into the target table with the BMC utility, or
- Load the input file into a DB2 table, get the necessary information from the other tables, and then insert row by row into the target table?
In other words: is sequentially reading a table quicker than a join in a cursor?
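To make option 2 concrete, I mean something like the following cursor (table and column names are made up for the example):

```sql
-- Option 2 sketch (hypothetical names): the 60 million input records
-- sit in STAGING_TABLE, and the extra information comes from the join.
DECLARE C1 CURSOR FOR
  SELECT S.KEY_COL, S.PAYLOAD, R.EXTRA_INFO
  FROM   STAGING_TABLE S
  JOIN   REF_TABLE    R ON R.KEY_COL = S.REF_KEY;

-- ...then FETCH from C1 and INSERT each row into TARGET_TABLE...
```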
I'd strongly suspect that getting all the data you need first and then loading it with a utility would be quicker.
For a definitive answer for your set-up, the best approach would be a "mock-up" with a reasonable sample of test data for the task.
Another advantage of doing it in two logical pieces is that if the pre-processing of the file throws up errors, you haven't disturbed the DB at all yet. You could also consider splitting the file into more manageable chunks for the DB load, rather than loading all 60 million records at once.
If your target table is partitioned, then I would suggest splitting your input file into multiple smaller files using SORT, and running the jobs in parallel so that different partitions are inserted/updated at the same time.
If you try to insert/update all 60 million records in one go, without working partition by partition, you might well end up with SQLCODE -911 (rollback due to deadlock or timeout).
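As a rough sketch, each parallel job could then drive its own partition with a key-range predicate; the table, column, and host-variable names here are made up, and the ranges would come from however you split the file:

```sql
-- Hypothetical sketch: one job per partition key range, so parallel
-- jobs do not compete for locks on the same partition.
INSERT INTO TARGET_TABLE (KEY_COL, PAYLOAD, EXTRA_INFO)
  SELECT S.KEY_COL, S.PAYLOAD, R.EXTRA_INFO
  FROM   STAGING_TABLE S
  JOIN   REF_TABLE    R ON R.KEY_COL = S.REF_KEY
  WHERE  S.KEY_COL BETWEEN :part_low AND :part_high;
COMMIT;
```

In practice you would also want intermediate commits (e.g. every few thousand rows if inserting row by row) so that a -911 rollback doesn't undo hours of work.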