zh_lad
Active User
Joined: 06 Jun 2009 Posts: 115 Location: UK
|
|
|
|
Hi,
I have a list of strings (6-byte field) in an FB, LRECL=6 file. I want to find each of them in a different FB, LRECL=133 file; if a string is found, that record should be written to another file.
Could you please give me a hint on how to do it?
Thanks. |
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
You need to provide sample input and expected output that demonstrate what you mean.
Also, does the FB 6 file contain fixed data, or can it change? How many records does it have, or what is the maximum number?
The most likely approach with SORT is to generate an SS (substring field-type) search, but there is a limit to the amount of storage available for control cards.
You would have to test whether your maximum is close to that limit. |
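To get a feel for how much control-card text such a generated SS search needs, here is a rough Python sketch. It assumes one `p,m,SS,EQ,C'...'` test per search value, joined with `OR`, scanning columns 1 to 133; the helper name `build_ss_include` is made up for illustration, and splitting the text into 80-byte continuation lines is deliberately left out.

```python
def build_ss_include(keys, start=1, length=133):
    """Build one INCLUDE statement that ORs an SS (substring) test per key.

    Hypothetical helper: the real statement would also have to be split
    into 80-byte control cards with continuation, which is not done here.
    """
    tests = [f"{start},{length},SS,EQ,C'{key}'" for key in keys]
    return "  INCLUDE COND=(" + ",OR,".join(tests) + ")"


# Two search values give a statement that is easy to read.
card = build_ss_include(["123456", "654321"])
print(card)

# 40,000 six-digit values: measure how much condition text is generated.
big = build_ss_include([f"{n:06d}" for n in range(40_000)])
print(len(big), "characters of control-card text")
```

Each value costs about 25 characters, so 40,000 values need on the order of a million characters before any continuation formatting. Whether that fits is exactly what would have to be tested against the storage limit.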
zh_lad
Active User
Joined: 06 Jun 2009 Posts: 115 Location: UK
|
|
|
|
File1
FBA 134 file; it is ISPF 3.14 output:
Code: |
ACP01001 --------- STRING(S) FOUND -------------------
1 P0 000 000000001 123456999
ACP01002 --------- STRING(S) FOUND -------------------
1 P6 000 0 4S001S001
ACP01501 --------- STRING(S) FOUND -------------------
1 P1 654321XX 8000700 LS001S001
ACP01502 --------- STRING(S) FOUND -------------------
1 P1 000 008000701 LS001S001
ACP01503 --------- STRING(S) FOUND -------------------
1 P1 000 008000702 LS001S001 |
File2, FB 6:
Code: |
987654
333334
123456
876890
102343
654321
997766 |
Output File3 should be:
Code: |
1 P0 000 000000001 123456999
1 P1 654321XX 8000700 LS001S001 |
The 6-byte alphanumeric field from File2 could be found anywhere in a File1 record. SORT should output those records from File1 where a matching string is found; the string can be anywhere in the record, with no fixed position.
Output File3 contains the File1 records in which the strings 123456 and 654321 from File2 are found.
The actual File2 will have approximately 40K records. File1 is also large.
Thanks. |
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
I don't think Plan A is going to work with 40,000 6-byte search values.
An unthinking program to do this is going to run like a dog (128 six-byte positions per 133-byte record, each tested against 40,000 values), so you need some thinking.
Can you, if not already done, give representative samples of what will be searched for? If you are only looking for six-digit numbers, do they have leading zeros when less than 100000?
Is the "hit" for the search always the start of a string of characters? I.e., should 123456789 and 789123456 both be matched by 123456, or only the first? In all instances?
We need as full a description as possible of the data.
In a program, I'd expect to use a binary search of the 40,000 values, applied only to those 6-byte values from the 134 bytes that could possibly match (the 134 bytes presumably being 133 bytes of data plus a control character?).
With SORT, I'd look to extract all the possible hits from the 134-byte file, along with a record-number, then JOINKEYS, and then JOINKEYS back to the original on a generated sequence number. |
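A minimal Python sketch of the program approach, assuming the search values are always six digits and that the whole record is scanned (control character included). The function names are made up for illustration; the sample data is taken from the earlier post.

```python
from bisect import bisect_left

WINDOW = 6  # length of each search value


def candidate_windows(record):
    """Yield each distinct all-digit 6-byte window found in the record."""
    seen = set()
    for i in range(len(record) - WINDOW + 1):
        window = record[i:i + WINDOW]
        # Only all-digit windows can possibly equal a six-digit value.
        if window.isascii() and window.isdigit() and window not in seen:
            seen.add(window)
            yield window


def in_sorted(sorted_keys, value):
    """Binary search: True if value is present in the sorted list."""
    i = bisect_left(sorted_keys, value)
    return i < len(sorted_keys) and sorted_keys[i] == value


def matching_records(records, keys):
    """Return the records that contain at least one search value."""
    sorted_keys = sorted(set(keys))
    return [
        rec for rec in records
        if any(in_sorted(sorted_keys, w) for w in candidate_windows(rec))
    ]


file1 = [
    "ACP01001 --------- STRING(S) FOUND -------------------",
    "1 P0 000 000000001 123456999",
    "1 P6 000 0 4S001S001",
    "1 P1 654321XX 8000700 LS001S001",
    "1 P1 000 008000701 LS001S001",
]
file2 = ["987654", "333334", "123456", "876890", "102343", "654321", "997766"]

for rec in matching_records(file1, file2):
    print(rec)
```

With 40,000 values, each binary search needs about 16 comparisons, and most records yield only a handful of all-digit windows, so the work per record stays small.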
zh_lad
Active User
Joined: 06 Jun 2009 Posts: 115 Location: UK
|
|
|
|
We will search for:
Code: |
987654
123456
997766
000001
000022 |
I am only looking for 6-digit numbers; they can have leading zeroes, e.g. 000001.
123456789 and 789123456
In this case, both are matched. We will output both records.
We are deleting 6-digit numbers from our system (database, DB2 tables). In the above exercise we are targeting JCL, SORT cards, and other control cards for hard-coded 6-digit values.
The first step was an ISPF 3.14 search for 2 consecutive digits. File1 is the output of this 3.14 search.
We have the list of entries to be deleted (6-digit numbers, 40K in total) in File2. In step 2, we search for these 6-digit numbers in the file from step 1.
Thanks. |
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
123456789 and 789123456 only coincidentally contain your six-digit number. Do you really want to match those, or are you just saying that it does match? |
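The difference between the two readings can be sketched in Python. `loose_match` is a plain substring test; `strict_match` is a hypothetical stricter rule that rejects a hit when the six digits are part of a longer run of digits. Neither function comes from SORT, they only illustrate the question.

```python
import re


def loose_match(key, text):
    """Plain substring test: the key may sit inside a longer number."""
    return key in text


def strict_match(key, text):
    """Hit only if the key is not directly preceded or followed by a digit."""
    pattern = rf"(?<![0-9]){re.escape(key)}(?![0-9])"
    return re.search(pattern, text) is not None


for text in ("123456789", "789123456", "ID=123456,"):
    print(text, loose_match("123456", text), strict_match("123456", text))
```

Under the loose rule all three strings match 123456; under the strict rule only the last one does.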
zh_lad
Active User
Joined: 06 Jun 2009 Posts: 115 Location: UK
|
|
|
|
I wrote...
We will output both records. |
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
Yes, and without further explanation, I thought it looked so silly that I'd question it. It matters because you'll potentially get false hits with it. It would also be faster if you only had to match six-against-six.
Presumably this is a one-off exercise?
For SORT, OMIT the stuff you know you don't want, and have an OUTFIL with BUILD which generates 128 records for each input record, using the slash operator (/). Include a generated record number on each output record.
Use JOINKEYS on that file (omit any records containing blanks or any non-numeric in the JNFnCNTL for that file) to do the match.
Output from the JOINKEYS is the matched records from the generated file.
Then another JOINKEYS to get back to the original file and extract, noting the (possibly multiple) matches.
For a programming language, you'll need a binary search, and only do that search for the data you would have left after the JNFnCNTL OMIT/INCLUDE mentioned above. |
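The data flow of the SORT steps above can be emulated in Python to check the logic before writing any control cards. This is only a sketch under assumptions: the function names are invented, a Python set lookup stands in for the JOINKEYS match, and the whole record is scanned.

```python
def explode(records, width=6):
    """Step 1: one (record number, 6-byte window) row per window position."""
    for seq, rec in enumerate(records, start=1):
        for i in range(len(rec) - width + 1):
            yield seq, rec[i:i + width]


def numeric_only(rows):
    """Step 2 filter: drop windows containing blanks or any non-digit."""
    return ((seq, w) for seq, w in rows if w.isascii() and w.isdigit())


def join_on_key(rows, keys):
    """Step 2 join: keep rows whose window equals a search value."""
    key_set = set(keys)
    return [(seq, w) for seq, w in rows if w in key_set]


def join_back(records, matched):
    """Step 3: use the record number to recover the original records."""
    hits = {}
    for seq, w in matched:
        found = hits.setdefault(seq, [])
        if w not in found:
            found.append(w)  # note each distinct match once
    return [(records[seq - 1], hits[seq]) for seq in sorted(hits)]


file1 = [
    "1 P0 000 000000001 123456999",
    "1 P6 000 0 4S001S001",
    "1 P1 654321XX 8000700 LS001S001",
]
file2 = ["987654", "123456", "654321", "997766"]

matched = join_on_key(numeric_only(explode(file1)), file2)
for rec, found in join_back(file1, matched):
    print(rec, found)
```

A 133-byte record produces 133 - 6 + 1 = 128 windows, which is where the 128 generated records per input record come from.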