View previous topic :: View next topic
|
Author |
Message |
southee
New User
Joined: 17 Jun 2012 Posts: 20 Location: INDIA
|
|
|
|
Hi,
I have two files FILEA and FILEB and their details are as below
FILEA:-
Code: |
FIELD1(9) FIELD2(40) FIELD3(15)
123456789 BAKER 0000000000000123
123456799 TIM 0000000000000123 |
FIELB:-
Code: |
FIELD1(9) FIELD2(40) FIELD3(20)
123456789 JOSEPH BAKER 100750000000000000123
123456799 TIM MENDAHIL 100750000000000000123 |
Now i need to match FILEA with FILEB on FIELD2 .
FIELD2 in in FILEA is short name where as FIELD2 in FILEB is Full name.
Is tthere a way to specify IN dfsort so that whole key from FILEA will be Matched with anyposition in FILEB
I woul like to have all the records from two files.
Both the files are FB with record length of 80.
Code'd and aligned finally |
|
Back to top |
|
|
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
|
|
|
|
during INREC of FILEB,
if you parse the field2 into two new fields of 40 and 40
you could then JOINKEYS with field2 of filea and the new field2 of fileb.
That way during your format you can refer to the original field2 of fileb. |
|
Back to top |
|
|
southee
New User
Joined: 17 Jun 2012 Posts: 20 Location: INDIA
|
|
|
|
dbzTHEdinosauer wrote: |
during INREC of FILEB,
if you parse the field2 into two new fields of 40 and 40
you could then JOINKEYS with field2 of filea and the new field2 of fileb.
That way during your format you can refer to the original field2 of fileb. |
Thank you very much for your quick reply.
The number of words in Full name might be more than two words. This way based on the number of words i need to parse full name dynamically which in turn i need to increase the Record length greater than 80
What if i have more than 2 words in full name like
MR JOSEPH BAKER II
This way i need to parse this into 4 fields of 40 length and need to increase record length.
Is there a way where i can dynimaccly parse and increase LRECL dynamically to satisfy above method. |
|
Back to top |
|
|
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
|
|
|
|
well, I assumed (my error, i realize) that the first part of the name field was the short name.
using inrec build you can essentially 'dynamically' increase the length
and outrec/outfil build you can 'dynamically' decrease the length.
(you really should remove the word 'dynamically' from you vocabulary)
what is field1 in both files?
if this is some kind of account number, you could use that for JOINKEYS and then IFTHEN yourself into oblivion.
you should look at the parse instruction in the manual.
I have seen examples in the forum where the return was zero length.
(i.e. nothing to parse)
if the shortname (field2 of filea) is really capable of being any portion of the long name (field2 of fileb)
yes, this exercise could be fun.
keep in mind, in sort,
you can not compare a value in one record to a value in another record.
if there is someway to join the records (field1?)
you could then do your compare of field2-filea to substring of field2-fileb.
at this point, i am at the end of my suggestions,
somebody else may come along this weekend and add theirs.
good luck. |
|
Back to top |
|
|
knickraj Warnings : 1 New User
Joined: 11 Jun 2007 Posts: 50 Location: Euro
|
|
|
|
You may try this..
here I am assuming your FIELD1(9) in both the files are assigned to same person i.e
123456789 BAKER
123456789 JOSEPH BAKER
you may build a dynamic include statement from file A using a sort in a temp file.
Code: |
outfil removecc,
header( INCLUDE COND=((Dummy condition for syntax),OR,
Build ( (11,40,SS,EQ,C'BAKER'),OR,
Trailer((Dummy condition for syntax )) |
you will get a o/p file like for all the records in file A
Code: |
INCLUDE COND=(dummy condition ),OR
(11,40,SS,EQ,C'BAKER'),OR,
(11,40,SS,EQ,C'TIM'),OR,
(11,40,SS,EQ,C'XX'),OR |
use this temp file as sortcard sysin (inlclude) for sorting with file B to another temp file .
now you can join on file A and this second temp file on FIELD1(9)
123456789
This also wont give correct result, if a short name is someone else first name, e.g a full name as TIM BAKER, or BAKER TIM |
|
Back to top |
|
|
knickraj Warnings : 1 New User
Joined: 11 Jun 2007 Posts: 50 Location: Euro
|
|
|
|
CORRECTION:
instead of include COND you may use IFTHEN overlay
Code: |
IFTHEN=(WHEN=(11,40,SS,EQ,C'BAKER'),
OVERLAY=(81:C'BAKER') |
this would pull out matching short name from full name, the file may look like
Code: |
123456789 JOSEPH BAKER 100750000000000000123 81:BAKER
123456799 TIM MENDAHIL 100750000000000000123 81:TIM |
then it can be joined with file A, using all fields and short name,exclluding full name. |
|
Back to top |
|
|
southee
New User
Joined: 17 Jun 2012 Posts: 20 Location: INDIA
|
|
|
|
knickraj wrote: |
CORRECTION:
instead of include COND you may use IFTHEN overlay
Code: |
IFTHEN=(WHEN=(11,40,SS,EQ,C'BAKER'),
OVERLAY=(81:C'BAKER') |
this would pull out matching short name from full name, the file may look like
Code: |
123456789 JOSEPH BAKER 100750000000000000123 81:BAKER
123456799 TIM MENDAHIL 100750000000000000123 81:TIM |
then it can be joined with file A, using all fields and short name,exclluding full name. |
Hi,
I can use field1 in both files to take the matched records but at the same time i need to take those records only where in the condition " compare of field2-filea to substring of field2-fileb. " has to be satisfied.
As sample i provided only two records for understanding purpose but there are some lakhs of records.
knickraj i couldn't understand the steps you mentioned for above problem. could you please eloborate. Sorry for that. |
|
Back to top |
|
|
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
|
|
|
|
i can't help but feel that I am missing something
(as well as apologizing for re-entering the conversation).
does the short name field only contain one (1) word?
word is defined as a string of characters bounded by one of the following:- Begining of field and space
- space and space
- space and End of field
If there is indeed only one (1) word to be found in short name,
then continue reading,
else disregard the rest of this post
JOINKEYS on acct no - field1 of both filea and fileb.
SS can not be use because there are too many records to create the constant needed for the SS construct.
so, parse the short name field so that it contains no leading or trailing spaces.
parse the long name field into 20 parse units also so that none contain leading or trailing spaces.
outrec INCLUDE if short name parse eq to any of the 20 long name parses. |
|
Back to top |
|
|
southee
New User
Joined: 17 Jun 2012 Posts: 20 Location: INDIA
|
|
|
|
dbzTHEdinosauer ,
Your solution will work perferct like charm if my short name contains only one word.
But my file also contains (Only few in those lakhs of records) greater than one word.
Thanks for your Idea this would work really fine for one word in my short name. But my short name contains two words like
Baker II(for e.g.) in some records. |
|
Back to top |
|
|
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
If you have two "words", parse them into individuals like the other. Word1 will always precede Word2. So, if Word2 not blank, do the match on pairs, else singly. |
|
Back to top |
|
|
southee
New User
Joined: 17 Jun 2012 Posts: 20 Location: INDIA
|
|
|
|
someway to join the records (field1?)
you could then do your compare of field2-filea to substring of field2-fileb.
dbZ,
If i join two files with FIEL1 as key how could i compare of field2-filea to substring of field2-fileb. |
|
Back to top |
|
|
knickraj Warnings : 1 New User
Joined: 11 Jun 2007 Posts: 50 Location: Euro
|
|
|
|
Hello,
even parse seems tricky, coz your Short name and full name is/are same length 40.
even if you parse to the short name string with STARTAT=string or ENDAT=String,or something like that to remove leading trailing spaces ,
what you would give the FIXLEN value, as your short name
can go upto 40 which is the length, also for the full name .
Another way may be to SQZ right and then Compare byte by byte..
but still...how to compare from which byte to which byte.
Is there any specific reason for both short name and full name to be of length 40? |
|
Back to top |
|
|
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
Why would length be a problem? If you give FIXLEN=40, can contain zero to 40 non-space bytes, the number depending on the rest of the PARSE.
I do think we need more details of the requirement, why if a number is available is there a match on names needed? |
|
Back to top |
|
|
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
|
|
|
|
Bill,
for some reason, the TS wants to differentiate between those accounts
where the short name is a subset of long name
and those
where the short name is not part of the long name. |
|
Back to top |
|
|
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10873 Location: italy
|
|
|
|
time to lock the topic,
there is no utility solution until the TS tells which <word> of the <name> constitutes the short name ...
in the first example it is BAKER ( second <word> ) in the second it is TIM ( first <word> )
until the <word> chosen will be clearly identified we are just wasting time |
|
Back to top |
|
|
southee
New User
Joined: 17 Jun 2012 Posts: 20 Location: INDIA
|
|
|
|
enrico-sorichetti wrote: |
time to lock the topic,
there is no utility solution until the TS tells which <word> of the <name> constitutes the short name ...
in the first example it is BAKER ( second <word> ) in the second it is TIM ( first <word> )
until the <word> chosen will be clearly identified we are just wasting time |
The Full name consists of
Title(like MR,DR,.......)+First name+Middle Name+Last Name
Where as the Short name is any TITLE(IF available) + (Any of the First name+Middle Name+Last Name) |
|
Back to top |
|
|
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10873 Location: italy
|
|
|
|
the last description does not match the initial description of the requirement!
since we are not dentist for what reason getting reasonable and clear info is like pulling a tooth ?
repost a COMPLETE description of the requirement with a proper sample of the data
long name , short name , honorific/title ( Mr, Mrs, Dr. ... )
if the honorific/title cannot be parsed by comparison Your requirement DOES NOT HAVE a solution |
|
Back to top |
|
|
southee
New User
Joined: 17 Jun 2012 Posts: 20 Location: INDIA
|
|
|
|
enrico-sorichetti wrote: |
the last description does not match the initial description of the requirement!
since we are not dentist for what reason getting reasonable and clear info is like pulling a tooth ?
repost a COMPLETE description of the requirement with a proper sample of the data
long name , short name , honorific/title ( Mr, Mrs, Dr. ... )
if the honorific/title cannot be parsed by comparison Your requirement DOES NOT HAVE a solution |
Enrico,
SINCE there are lakhs of records with different permutations it is difficult to tel all the records structre.
I will try with solutions given by dBZ of parsing full name and comparing them with the short name using or clause.
Thank you very much for the soluion dbZ . I will post the code if i'm successful. |
|
Back to top |
|
|
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
Field1 can be used for a match. Why would the names be different, other than the short vs long.
What I was trying to say before is that if the short name contains two (or more) elements, there is no problem with parsing each element to 40 bytes, even though the source field is only 40 bytes (or whatever size).
If, like "BAKER II", the elements appear in order and consecutively in the long name, then the match can be done reasonably easily.
However, if "MR BAKER" is expected to match "MR JOHN BAKER" then the match becomes more problematic, in that there will be a large number of IFTHENs. The IFTHENs can be generated, but you have to know the limits of elements in the long and short names.
Before 5pm, European time, you need to have an exact explanation of how you want it to work with a full set of input samples and expected output. That will give Kolusu a chance to look at it without a lot more toing-and-froing.
If you get to something yourself, that will be very refreshing and you'll have our heartfelt congratulations. |
|
Back to top |
|
|
dick scherrer
Moderator Emeritus
Joined: 23 Nov 2006 Posts: 19244 Location: Inside the Matrix
|
|
|
|
Hello,
Quote: |
it is difficult to tel all the records structre. |
And this is some reason to be imcomplete. . .
How long do you intend to "discover" data patterns that also need consideration. . .? |
|
Back to top |
|
|
southee
New User
Joined: 17 Jun 2012 Posts: 20 Location: INDIA
|
|
|
|
I Have been successful in completing this requirement but i have used the Combination of DFSORT and Easytrieve
First i have parsed the Full name file for e.g.
Code: |
Key full name
010653607 WHITE FAMILY FOUNDATION
020166013 DE AMORIM MARIA C
033448266 WALLS TRAMMELL SANDRA C
|
into
Code: |
Key ful-name1 full-name2 full name3 full name4
010653607 WHITE FAMILY FOUNDATION
020166013 DE AMORIM MARIA C
033448266 WALLS TRAMMELL SANDRA C
|
Till this point of parsing i used DFSORt but later to comapre short name with either Full name1 or Fullname2 or Fullname3 i used easytrieve logic to match and pull the recrods to output...
Then i have used Easytrieve
Code: |
SORT BNRFILE TO BNRFILE USING (IN1-KEY)
SORT TRANFILE TO TRANFILE USING (IN2-KEY)
JOB INPUT (BNRFILE KEY(IN1-KEY) +
TRANFILE KEY(IN2-KEY))
IF MATCHED
IF ((BN-SHORT-NAME = TX-NAME1) OR +
(BN-SHORT-NAME = TX-NAME2) OR +
(BN-SHORT-NAME = TX-NAME3)OR +
(BN-SHORT-NAME = TX-NAME4))
PUT OUTFILE
END-IF |
NOTE: as my logic depends on only these two fields i haven't pasted the rest of the columns in each record here |
|
Back to top |
|
|
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
But this won't work with two elements in the short name.
The code is very similar in DFSORT anyway.
Are you happy with what you have? |
|
Back to top |
|
|
southee
New User
Joined: 17 Jun 2012 Posts: 20 Location: INDIA
|
|
|
|
Bill Woodger wrote: |
But this won't work with two elements in the short name.
The code is very similar in DFSORT anyway.
Are you happy with what you have? |
Yes Bill as i said i just posted you the method not the actual code i actually parsed my full name into 10 subnames because i have few names like ESTATE OF JUDITH JANE MARSHALL-DE COSMO like for e.g.
Code: |
OPTION COPY
INREC PARSE=(%01=(ENDBEFR=C' ',FIXLEN=10),
%02=(ENDBEFR=C' ',FIXLEN=20),
%03=(ENDBEFR=C' ',FIXLEN=20),
%04=(ENDBEFR=C' ',FIXLEN=20),
%05=(ENDBEFR=C' ',FIXLEN=20),
%06=(ENDBEFR=C' ',FIXLEN=20),
%07=(ENDBEFR=C' ',FIXLEN=20),
%08=(ENDBEFR=C' ',FIXLEN=20),
%09=(ENDBEFR=C' ',FIXLEN=20),
%10=(ENDBEFR=C' ',FIXLEN=20)),
BUILD=(%01,%02,%03,%04,%05,%06,%07,%08,%09,%10) |
and for the shorter name i have parsed them into 3 subanmes and during imatched them using Easytrieve.
It worked like a charm.
In easytrieve logic i have used two outputs one for matched and the other for unmatched this way i have checked the unmatched too... and eded up with perfect result....
Thank you for all your help and support...... |
|
Back to top |
|
|
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
Very well done then. It is not often that someone here just seeks a bit of advice and then gets on with it. Congratulations and good luck. |
|
Back to top |
|
|
southee
New User
Joined: 17 Jun 2012 Posts: 20 Location: INDIA
|
|
|
|
Bill Woodger wrote: |
Very well done then. It is not often that someone here just seeks a bit of advice and then gets on with it. Congratulations and good luck. |
Thank you very much for all giving some valuable sugestions. |
|
Back to top |
|
|
|