Matching whole key from one file to any position in other

southee · New User Joined: 17 Jun 2012 Posts: 20 Location: INDIA

Hi,

I have two files FILEA and FILEB and their details are as below

FILEA:-

dbzTHEdinosauer · Posted: Sat Aug 04, 2012 1:01 pm

during INREC of FILEB,
if you parse the field2 into two new fields of 40 and 40
you could then JOINKEYS with field2 of filea and the new field2 of fileb.

That way during your format you can refer to the original field2 of fileb.

southee · New User Joined: 17 Jun 2012 Posts: 20 Location: INDIA

dbzTHEdinosauer · Posted: Sat Aug 04, 2012 2:58 pm

well, I assumed (my error, i realize) that the first part of the name field was the short name.

using inrec build you can essentially 'dynamically' increase the length
and outrec/outfil build you can 'dynamically' decrease the length.
(you really should remove the word 'dynamically' from your vocabulary)

what is field1 in both files?
if this is some kind of account number, you could use that for JOINKEYS and then IFTHEN yourself into oblivion.

you should look at the parse instruction in the manual.
I have seen examples in the forum where the return was zero length.
(i.e. nothing to parse)

if the shortname (field2 of filea) is really capable of being any portion of the long name (field2 of fileb)
yes, this exercise could be fun.

keep in mind, in sort,
you cannot compare a value in one record to a value in another record.
if there is some way to join the records (field1?)
you could then do your compare of field2-filea to a substring of field2-fileb.
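To make the join-then-compare idea concrete, here is a minimal sketch in Python rather than in sort control statements. It only illustrates the logic. The `(field1, field2)` tuple layout, the function name, and the assumption of one FILEB record per key are all hypothetical.

```python
# Hypothetical layout: each record is a (field1, field2) tuple.
def flag_short_in_long(file_a, file_b):
    """Join on field1, then test whether FILEA's field2 occurs anywhere in FILEB's field2."""
    # Assumes at most one FILEB record per field1 value.
    long_names = {acct: name for acct, name in file_b}
    results = []
    for acct, short in file_a:
        long_name = long_names.get(acct)
        if long_name is None:
            continue  # unmatched key: no FILEB record to compare against
        # Strip the padding from the short name before the substring test.
        results.append((acct, short.strip() in long_name))
    return results


file_a = [("123456789", "BAKER".ljust(40))]
file_b = [("123456789", "JOSEPH BAKER".ljust(40))]
print(flag_short_in_long(file_a, file_b))
```

Note that a plain substring test also accepts partial words (BAKER inside a longer surname), which may or may not be wanted.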

at this point, i am at the end of my suggestions,
somebody else may come along this weekend and add theirs.

good luck.

knickraj · New User Joined: 11 Jun 2007 Posts: 50 Location: Euro

You may try this..
Here I am assuming that FIELD1 (9 bytes) in both files identifies the same person, i.e.

123456789 BAKER
123456789 JOSEPH BAKER

You may build a dynamic INCLUDE statement from file A, using a sort step that writes it to a temp file.

knickraj · New User Joined: 11 Jun 2007 Posts: 50 Location: Euro

CORRECTION:

Instead of INCLUDE COND, you may use IFTHEN with OVERLAY.

southee · New User Joined: 17 Jun 2012 Posts: 20 Location: INDIA

dbzTHEdinosauer · Posted: Sat Aug 04, 2012 9:58 pm

i can't help but feel that I am missing something
(as well as apologizing for re-entering the conversation).

does the short name field only contain one (1) word?
word is defined as a string of characters bounded by one of the following:

Beginning of field and space
space and space
space and End of field

If there is indeed only one (1) word to be found in short name,
then continue reading,
else disregard the rest of this post

JOINKEYS on acct no - field1 of both filea and fileb.

SS cannot be used because there are too many records to create the constant needed for the SS construct.

so, parse the short name field so that it contains no leading or trailing spaces.
parse the long name field into 20 parse units also so that none contain leading or trailing spaces.

on output (OUTFIL), INCLUDE the record if the parsed short name is equal to any of the 20 parsed long-name words.
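The comparison described here, once the two records have been joined on the account number, amounts to a word-membership test. A rough sketch of that logic in Python (not sort syntax) follows. The function name and the `max_words` parameter are illustrative assumptions, with 20 mirroring the 20 parse units mentioned above.

```python
def short_name_is_word_of_long(short_name, long_name, max_words=20):
    """True if the one-word short name equals any of the first max_words words of the long name."""
    short = short_name.strip()        # drop leading/trailing spaces
    words = long_name.split()         # split on runs of spaces, no empty words
    return short in words[:max_words]


print(short_name_is_word_of_long("BAKER".ljust(40), "JOSEPH BAKER".ljust(40)))
print(short_name_is_word_of_long("BAKE".ljust(40), "JOSEPH BAKER".ljust(40)))
```

Unlike a substring search, this only matches whole words, so BAKE does not match JOSEPH BAKER.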

southee · New User Joined: 17 Jun 2012 Posts: 20 Location: INDIA

dbzTHEdinosauer ,

Your solution will work perfectly, like a charm, if my short name contains only one word.

But in my file a few of those lakhs of records have a short name of more than one word.

Thanks for your idea; it would work really well for a one-word short name. But in some records my short name contains two words, like
Baker II (for example).

Bill Woodger · Posted: Sun Aug 05, 2012 1:34 am

If you have two "words", parse them into individual fields like the others. Word1 will always precede Word2. So, if Word2 is not blank, do the match on pairs, else singly.
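As a sketch of the pair matching (in Python, purely to show the logic; the function name is hypothetical), the short-name words are compared against every run of consecutive long-name words of the same length. A one-word short name falls out as the single-word case.

```python
def matches_consecutively(short_name, long_name):
    """True if the short-name words appear, in order and adjacent, in the long name."""
    short_words = short_name.split()
    long_words = long_name.split()
    n = len(short_words)
    if n == 0:
        return False  # a blank short name matches nothing
    # Slide a window of n words across the long name.
    return any(
        long_words[i:i + n] == short_words
        for i in range(len(long_words) - n + 1)
    )


print(matches_consecutively("BAKER II", "JOSEPH BAKER II"))
print(matches_consecutively("BAKER II", "JOSEPH II BAKER"))
```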

southee · New User Joined: 17 Jun 2012 Posts: 20 Location: INDIA

someway to join the records (field1?)
you could then do your compare of field2-filea to substring of field2-fileb.

dbZ,

If I join the two files with FIELD1 as the key, how could I compare field2-filea to a substring of field2-fileb?

knickraj · New User Joined: 11 Jun 2007 Posts: 50 Location: Euro

Hello,
even PARSE seems tricky, because your short name and full name are both 40 bytes long.

Even if you parse the short name with STARTAT=string or ENDAT=string, or something like that, to remove leading and trailing spaces,
what would you give as the FIXLEN value, since your short name
can be up to 40 bytes, which is also the length of the full name?

Another way may be to SQZ right and then Compare byte by byte..

but still...how to compare from which byte to which byte.

Is there any specific reason for both short name and full name to be of length 40?

Bill Woodger · Posted: Sun Aug 05, 2012 5:21 pm

Why would length be a problem? If you give FIXLEN=40, the parsed field can contain zero to 40 non-space bytes, the number depending on the rest of the PARSE.

I do think we need more details of the requirement: if a number is available, why is a match on names needed?

dbzTHEdinosauer · Posted: Sun Aug 05, 2012 5:35 pm

Bill,

for some reason, the TS wants to differentiate between those accounts
where the short name is a subset of long name
and those
where the short name is not part of the long name.

enrico-sorichetti · Posted: Sun Aug 05, 2012 5:53 pm

time to lock the topic,
there is no utility solution until the TS tells which <word> of the <name> constitutes the short name ...
in the first example it is BAKER ( second <word> ) in the second it is TIM ( first <word> )

until the <word> chosen will be clearly identified we are just wasting time

southee · New User Joined: 17 Jun 2012 Posts: 20 Location: INDIA

enrico-sorichetti · Posted: Sun Aug 05, 2012 6:44 pm

the last description does not match the initial description of the requirement!

since we are not dentists, why is getting reasonable and clear info like pulling teeth?

repost a COMPLETE description of the requirement with a proper sample of the data
long name , short name , honorific/title ( Mr, Mrs, Dr. ... )
if the honorific/title cannot be parsed by comparison Your requirement DOES NOT HAVE a solution

southee · New User Joined: 17 Jun 2012 Posts: 20 Location: INDIA

Bill Woodger · Posted: Mon Aug 06, 2012 1:00 am

Field1 can be used for a match. Why would the names be different, other than the short vs long?

What I was trying to say before is that if the short name contains two (or more) elements, there is no problem with parsing each element to 40 bytes, even though the source field is only 40 bytes (or whatever size).

If, like "BAKER II", the elements appear in order and consecutively in the long name, then the match can be done reasonably easily.

However, if "MR BAKER" is expected to match "MR JOHN BAKER" then the match becomes more problematic, in that there will be a large number of IFTHENs. The IFTHENs can be generated, but you have to know the limits of elements in the long and short names.
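To show why this case is harder, here is a hedged Python sketch of the looser rule, where the short-name words must appear in the long name in the same order but not necessarily next to each other. This is one possible reading of the requirement, not a confirmed one, and the function name is made up for illustration.

```python
def matches_in_order(short_name, long_name):
    """True if the short-name words occur in the long name in order, gaps allowed."""
    short_words = short_name.split()
    if not short_words:
        return False  # a blank short name matches nothing
    remaining = iter(long_name.split())
    # "word in remaining" consumes the iterator up to and including the match,
    # so each later word is searched for only after the previous one.
    return all(word in remaining for word in short_words)


print(matches_in_order("MR BAKER", "MR JOHN BAKER"))
print(matches_in_order("BAKER MR", "MR JOHN BAKER"))
```

A general-purpose language can loop over however many words there are. A set of generated IFTHEN clauses has to enumerate the combinations up front, which is why the limits on the number of elements need to be known.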

Before 5pm, European time, you need to have an exact explanation of how you want it to work with a full set of input samples and expected output. That will give Kolusu a chance to look at it without a lot more toing-and-froing.

If you get to something yourself, that will be very refreshing and you'll have our heartfelt congratulations.

dick scherrer · Posted: Mon Aug 06, 2012 1:29 am

Hello,

southee · New User Joined: 17 Jun 2012 Posts: 20 Location: INDIA

I have been successful in completing this requirement, but I used a combination of DFSORT and Easytrieve.

First I parsed the full name file, e.g.

Bill Woodger · Posted: Mon Aug 06, 2012 5:11 pm

But this won't work with two elements in the short name.

The code is very similar in DFSORT anyway.

Are you happy with what you have?

southee · New User Joined: 17 Jun 2012 Posts: 20 Location: INDIA

Bill Woodger · Posted: Mon Aug 06, 2012 7:11 pm

Very well done then. It is not often that someone here just seeks a bit of advice and then gets on with it. Congratulations and good luck.

southee · New User Joined: 17 Jun 2012 Posts: 20 Location: INDIA