A 6-byte alphanumeric field from File2 could be found anywhere on a File1 record. SORT should output those records from File1 where a matching string is found. The string can be anywhere on the record; there is no fixed position.
Output File3 contains the File1 records on which the strings 123456 and 654321 from File2 are found.
Actual File2 will have approx 40K records. File1 is also large.
Yes, and without further explanation, I thought it looked so silly that I'd question it. It is important because you'll potentially get false hits with it. It would also be faster if you only had to match six-against-six.
Presumably this is a one-off exercise?
For SORT, OMIT the stuff you know you don't want, and have an OUTFIL with BUILD which generates 128 records for each input record, using the slash operator (/). Include a generated record number on each output record.
Use JOINKEYS on that file (omit any records containing blanks or any non-numeric in the JNFnCNTL for that file) to do the match.
Output from the JOINKEYS is the matched records from the generated file.
Then another JOINKEYS to get back to the original file and extract, noting the (possibly multiple) matches.
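The logic of the steps above can be sketched outside SORT. This is only an illustration in Python, not SORT control statements: the function names (windows, match_records) are made up for the example, the record length is whatever the input happens to be, and the all-digits filter follows the JNFnCNTL suggestion above rather than the "alphanumeric" wording of the question.

```python
KEY_LEN = 6  # width of the File2 field, per the question


def windows(record, key_len=KEY_LEN):
    """Yield (offset, substring) for every key_len-wide window of a record.

    This plays the role of the OUTFIL BUILD with '/' that turns one input
    record into many short ones.
    """
    for offset in range(len(record) - key_len + 1):
        yield offset, record[offset:offset + key_len]


def match_records(file1_records, file2_keys, key_len=KEY_LEN):
    """Return (record, [(offset, key), ...]) for File1 records with a hit."""
    keys = set(file2_keys)
    hits = {}  # generated record number -> list of (offset, key)
    for rec_no, record in enumerate(file1_records, start=1):
        for offset, candidate in windows(record, key_len):
            # Assumption: keys are all digits, so windows holding blanks
            # or any non-numeric are dropped before matching.
            if not (candidate.isascii() and candidate.isdigit()):
                continue
            if candidate in keys:  # first "join": window against File2
                hits.setdefault(rec_no, []).append((offset, candidate))
    # Second "join": back to the original records via the record number.
    return [
        (record, hits[rec_no])
        for rec_no, record in enumerate(file1_records, start=1)
        if rec_no in hits
    ]


if __name__ == "__main__":
    file1 = ["ABC123456XYZ", "NOTHING HERE", "XX654321", "12 3456"]
    file2 = ["123456", "654321"]
    for record, found in match_records(file1, file2):
        print(record, found)
```

Note that a record such as "9123456" would also be reported for key 123456, which is the kind of false hit mentioned earlier.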
For a programming language, you'll need a binary search, and only do that search for the data you'd have got after the JNFnCNTL OMIT/INCLUDE mentioned above.
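A minimal sketch of that binary-search approach, in Python for illustration only. The names contains and find_matches are hypothetical, the File2 keys are assumed to fit in memory as a sorted list, and the digits-only test stands in for the OMIT/INCLUDE filtering.

```python
from bisect import bisect_left


def contains(sorted_keys, candidate):
    """Binary search: True if candidate is present in the sorted key list."""
    i = bisect_left(sorted_keys, candidate)
    return i < len(sorted_keys) and sorted_keys[i] == candidate


def find_matches(record, sorted_keys, key_len=6):
    """Return every File2 key found at any position in one File1 record."""
    found = []
    for offset in range(len(record) - key_len + 1):
        candidate = record[offset:offset + key_len]
        # Skip the search entirely for windows that could never match
        # (assumption: keys are all digits).
        if not (candidate.isascii() and candidate.isdigit()):
            continue
        if contains(sorted_keys, candidate):
            found.append(candidate)
    return found


if __name__ == "__main__":
    keys = sorted(["654321", "123456", "000001"])  # File2 keys, sorted once
    print(find_matches("AB123456CD654321", keys))
```

Sorting the keys once and filtering before each search keeps the per-record cost down, which matters with roughly 40K keys and a large File1.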