I was wondering if anybody could help me with a minor niggle I have....
I've created a Data Mining job that starts with a list of account numbers and builds up the different characteristics for each account from lots of different input files.
As not all of the accounts have a data on the other files, when I update a field on the base file, I create a new file of matched records and then have to re-merge (sort-dedupe) the original base file with the new file so I've always got the same number of records I started out with :-
Will you have duplicate keys on 1,16 in either of your input data sets here : MINED.DATA.BASE/ORDER.TOTALS? If no, then the below might help.
If the field from ORDER.TOTALS has to be written to output only if you find a match in it, and if you like to retain the original data when no match is found you could do it with a JOIN UNPAIRED,F1 in your first step. Currently you do not have a JOIN statement that means - return only matching records. With JOIN UNPAIRED,F1 you get all the records from F1 plus F2 fields when a match is found.
And you might need an INREC IFTHEN to selectively OVERLAY the F2 field only when a match is found. To know whether a match is found, you might want to use the matchmarker ("?") in your REFORMAT that assumes values "1" - File1 Only, "2" - File2 Only OR 'B' - Both.
Yes, you should retain the original value in REFORMAT and then in the INREC IFTHEN, evaluate if it is a matched record, and then OVERLAY the data from F2. So only for matched records the field will get overwritten. For non-matching records, the field will remain unchanged.
This is untested, but it would look something like this: