IBM Mainframe Forum Index
 
 

Find the occurrence of keywords using INSPECT


IBM Mainframe Forums -> COBOL Programming
suraaj

New User


Joined: 16 Apr 2009
Posts: 69
Location: Canada

PostPosted: Fri Aug 30, 2013 8:27 pm

Hi

I am trying to check for the occurrence of any of the keywords from a list defined in a record, using INSPECT. Please see the code below.

Code:

WS-APARTMENT.
 10  WS-APT-TYPE        PIC X(6).
     88  APT-TYPE        VALUE 'APT'
                               'APP'
                               'PH'
                               'SUITE'
                               'UNIT'
                               'A TERR'
                               'BUREAU'
                               'UNITÉ'
                               '#'.


Processing:

Code:

INSPECT ADDRESS-LINE TALLYING WS-CHAR-COUNTER-1
         FOR CHARACTERS BEFORE INITIAL WS-APT-TYPE


Once I have found the position where the keyword starts, I need to move the data from the keyword onward into another field. Please advise.

Thanks Suraaj
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Fri Aug 30, 2013 9:42 pm

Well, depending on your data it probably won't work, but do you have a specific question?

Searching for "#" like that will really be searching for

Code:
"#     "


and the other short ones likewise.
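To illustrate the padding point, here is a minimal sketch. The 40-byte ADDRESS-LINE, the sample value and the two counter names are assumptions made up for the example, not taken from the original program. It runs the same INSPECT twice, once against the six-byte field and once against a three-byte literal.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PADDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  ADDRESS-LINE        PIC X(40)
               VALUE '31 HEATHER STREET APT2013'.
       01  WS-APT-TYPE         PIC X(6) VALUE 'APT'.
       01  WS-COUNT-FIELD      PIC 9(4) COMP VALUE 0.
       01  WS-COUNT-LITERAL    PIC 9(4) COMP VALUE 0.
       PROCEDURE DIVISION.
      * Compared as the full six bytes 'APT   ' (three trailing
      * spaces), which does not occur in the line.
           INSPECT ADDRESS-LINE TALLYING WS-COUNT-FIELD
               FOR CHARACTERS BEFORE INITIAL WS-APT-TYPE
      * Compared as the three bytes 'APT' only.
           INSPECT ADDRESS-LINE TALLYING WS-COUNT-LITERAL
               FOR CHARACTERS BEFORE INITIAL 'APT'
           DISPLAY 'FIELD   COUNT: ' WS-COUNT-FIELD
           DISPLAY 'LITERAL COUNT: ' WS-COUNT-LITERAL
           GOBACK.
```

With this sample value the field version should count all 40 bytes (the delimiter is never found), while the literal version should count the 18 bytes in front of 'APT'. Note also that the INSPECT only ever sees whatever is currently stored in WS-APT-TYPE; the VALUE list on the 88-level is not searched.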
suraaj

New User


Joined: 16 Apr 2009
Posts: 69
Location: Canada

PostPosted: Fri Aug 30, 2013 10:00 pm

If not INSPECT, how can this be done in COBOL? Basically, I need to parse the line to find the keywords.
Robert Sample

Global Moderator


Joined: 06 Jun 2008
Posts: 8697
Location: Dubuque, Iowa, USA

PostPosted: Fri Aug 30, 2013 10:24 pm

Two approaches are generally used:
1. Redefine the data as an array of PIC X(01) and look for matches one byte at a time.
2. Use reference modification of the data to match against the values.
dick scherrer

Moderator Emeritus


Joined: 23 Nov 2006
Posts: 19244
Location: Inside the Matrix

PostPosted: Sat Aug 31, 2013 12:15 am

Hello,

How much input data is to be searched/parsed?

You will probably want 2 "arrays". The first is the input data that is to be parsed. The second would contain the apt-types and their lengths.

For each apt-type, use reference modification to compare that many bytes of the input, starting at the "current byte", against the apt-type. If all types are checked and none is found at this byte, move on to the next byte in the input. Make sure the compare does not go beyond the end of the text.

Keep in mind this will use LOTS of CPU per input record...
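The two-table scan described above could be sketched roughly as follows. This is only an illustration under assumptions: the table layout (a two-digit length in front of each six-byte keyword), the shortened keyword list, and the names WS-KEYWORD-TABLE, WS-KW-LEN, WS-KW-TEXT, WS-POS, WS-KW, WS-LEN and WS-FOUND-POS are all invented for the example, and the input is assumed to be upper case already.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. APTSCAN.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  ADDRESS-LINE        PIC X(40)
               VALUE '45 CARLVIEW STREET UNIT 2512'.
      * Keyword table: each entry carries its own length so that
      * short keywords are not compared with trailing spaces.
       01  WS-KEYWORD-VALUES.
           05  FILLER          PIC X(8) VALUE '03APT   '.
           05  FILLER          PIC X(8) VALUE '04UNIT  '.
           05  FILLER          PIC X(8) VALUE '05SUITE '.
           05  FILLER          PIC X(8) VALUE '01#     '.
       01  WS-KEYWORD-TABLE REDEFINES WS-KEYWORD-VALUES.
           05  WS-KEYWORD-ENTRY OCCURS 4 TIMES.
               10  WS-KW-LEN   PIC 9(2).
               10  WS-KW-TEXT  PIC X(6).
       01  WS-POS              PIC 9(4) COMP.
       01  WS-KW               PIC 9(4) COMP.
       01  WS-LEN              PIC 9(4) COMP.
       01  WS-FOUND-POS        PIC 9(4) COMP VALUE 0.
       PROCEDURE DIVISION.
           PERFORM VARYING WS-POS FROM 1 BY 1
                   UNTIL WS-POS > LENGTH OF ADDRESS-LINE
                      OR WS-FOUND-POS > 0
               PERFORM VARYING WS-KW FROM 1 BY 1
                       UNTIL WS-KW > 4 OR WS-FOUND-POS > 0
                   MOVE WS-KW-LEN (WS-KW) TO WS-LEN
      * Compare only if the keyword fits within the line.
                   IF WS-POS + WS-LEN - 1 <= LENGTH OF ADDRESS-LINE
                       IF ADDRESS-LINE (WS-POS : WS-LEN)
                               = WS-KW-TEXT (WS-KW) (1 : WS-LEN)
                           MOVE WS-POS TO WS-FOUND-POS
                       END-IF
                   END-IF
               END-PERFORM
           END-PERFORM
           DISPLAY 'KEYWORD STARTS AT: ' WS-FOUND-POS
           GOBACK.
```

For the sample line this should report position 20, where 'UNIT' begins. WS-FOUND-POS stays zero when nothing matches. The sketch stops at the first match from the left, so a street name that happens to contain a keyword would be picked up before the real unit designator.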
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Sat Aug 31, 2013 4:34 am

There are a number of ways to do this. Why don't you show some sample input, expected output, and describe the input as fully as you can?
suraaj

New User


Joined: 16 Apr 2009
Posts: 69
Location: Canada

PostPosted: Sun Sep 01, 2013 6:37 pm

Input:

Code:

31 heather street apt2013
45 carlview street unit 2512


For this input, I need the position where the keyword ('apt' in the first line, 'unit' in the second) starts:

Code:


19
20


Once I have found the position, I will strip out the data starting from the keyword.


Thanks Suraaj
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Sun Sep 01, 2013 9:56 pm

I think it is a stretch to call "apt" a keyword in this case.


Code:
31 clapton pond apt2013
31 apthill road apt2013


You have a lot of work to do if you want to find that apt2013 (or the 2013) in a "string" of text.
don.leahy

Active Member


Joined: 06 Jul 2010
Posts: 765
Location: Whitby, ON, Canada

PostPosted: Tue Sep 03, 2013 6:25 pm

Have you tried UNSTRING?
Code:
unstring address-line                     
    delimited by               'APT'   
              or               'APP'   
              or               'PH'   
              or               'SUITE'
              or               'UNIT' 
              or               'A TERR'
              or               'BUREAU'
              or               'UNITÉ'
              or               '#'     
    into ws-part-1                     
         ws-part-2                             
 end-unstring                         
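
Since the original question asks for the position of the keyword, a variation on the UNSTRING above can use the COUNT IN and DELIMITER IN phrases. The following is a sketch under assumptions: the field sizes, the shortened delimiter list and the names WS-UPPER-LINE, WS-DELIM, WS-LEN-1 and WS-POS are made up for the example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. UNSTRPOS.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  ADDRESS-LINE        PIC X(40)
               VALUE '45 carlview street unit 2512'.
       01  WS-UPPER-LINE       PIC X(40).
       01  WS-PART-1           PIC X(40).
       01  WS-PART-2           PIC X(40).
       01  WS-DELIM            PIC X(6).
       01  WS-LEN-1            PIC 9(4) COMP.
       01  WS-POS              PIC 9(4) COMP.
       PROCEDURE DIVISION.
      * The sample input is lower case, the delimiters upper case.
           MOVE FUNCTION UPPER-CASE (ADDRESS-LINE) TO WS-UPPER-LINE
           MOVE SPACES TO WS-PART-1 WS-PART-2 WS-DELIM
           MOVE ZERO TO WS-LEN-1
           UNSTRING WS-UPPER-LINE
               DELIMITED BY 'APT' OR 'UNIT' OR 'SUITE' OR '#'
               INTO WS-PART-1 DELIMITER IN WS-DELIM
                                COUNT IN WS-LEN-1
                    WS-PART-2
           END-UNSTRING
      * WS-DELIM is left as spaces when no delimiter was found.
           IF WS-DELIM = SPACES
               DISPLAY 'NO KEYWORD FOUND'
           ELSE
      * COUNT IN holds the bytes in front of the delimiter.
               COMPUTE WS-POS = WS-LEN-1 + 1
               DISPLAY 'KEYWORD ' WS-DELIM ' AT ' WS-POS
           END-IF
           GOBACK.
```

For the sample line this should find 'UNIT' at position 20. The keyword itself lands in WS-DELIM rather than in WS-PART-2, so it has to be put back if the stripped-out data must include it.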
don.leahy

Active Member


Joined: 06 Jul 2010
Posts: 765
Location: Whitby, ON, Canada

PostPosted: Tue Sep 03, 2013 7:27 pm

Note that the UNSTRING approach that I outlined will not handle cases like the one that Bill pointed out:
Code:
31 apthill road apt2013 
31 suite street suite 2013
28-31 suite street  (this is a format recommended by Canada Post, where the apartment number precedes the street number)

Whatever programming language you are using, parsing addresses is not a trivial task.
suraaj

New User


Joined: 16 Apr 2009
Posts: 69
Location: Canada

PostPosted: Tue Sep 03, 2013 11:50 pm

The apartment number in my case would come at the end of the line and not at the beginning. By this I mean if the

Code:

31 apthill road apt2013


is the input then the "apt" that is present at the end is the one that we should consider.

Thanks Suraaj
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Wed Sep 04, 2013 1:04 am

Start at the "back" of the line. Search for a non-blank, to ignore trailing spaces. After finding a non-blank, continue searching backwards for a blank.

Ensure that the field being searched is not entirely blank, and that your code works when there are no trailing blanks.

Having isolated the start and end, it is easy to get hold of the data and see if it is what you want.

Here's an example which is much more complicated than your task, but could be easily adapted.

What country's addresses are you looking at?

I see you have need of one with two words. When you have the first word from above, it is easy to get the 2nd-last word as well. Just remember not to assume that there is one.
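
The backward scan for the last word described above might look something like this. It is a minimal sketch, not a finished routine: the 40-byte field, the sample value and the names WS-END, WS-START and WS-LAST-WORD are assumptions for the example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LASTWORD.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  ADDRESS-LINE        PIC X(40)
               VALUE '31 APTHILL ROAD APT2013'.
       01  WS-END              PIC 9(4) COMP.
       01  WS-START            PIC 9(4) COMP.
       01  WS-LAST-WORD        PIC X(40) VALUE SPACES.
       PROCEDURE DIVISION.
      * An entirely blank field has no last word to find.
           IF ADDRESS-LINE = SPACES
               DISPLAY 'LINE IS BLANK'
               GOBACK
           END-IF
      * Step back over trailing spaces. If the field is full, the
      * loop body never runs and WS-END stays at the last byte.
           MOVE LENGTH OF ADDRESS-LINE TO WS-END
           PERFORM UNTIL ADDRESS-LINE (WS-END : 1) NOT = SPACE
               SUBTRACT 1 FROM WS-END
           END-PERFORM
      * Keep going back until a space or the first byte is reached.
           MOVE WS-END TO WS-START
           PERFORM UNTIL WS-START = 1
                      OR ADDRESS-LINE (WS-START : 1) = SPACE
               SUBTRACT 1 FROM WS-START
           END-PERFORM
      * If the scan stopped on a space, the word starts one byte on.
           IF ADDRESS-LINE (WS-START : 1) = SPACE
               ADD 1 TO WS-START
           END-IF
           MOVE ADDRESS-LINE (WS-START : WS-END - WS-START + 1)
               TO WS-LAST-WORD
           DISPLAY 'LAST WORD: ' WS-LAST-WORD
           GOBACK.
```

For the sample line the last word should come out as 'APT2013', starting at byte 17, and the 'APT' inside 'APTHILL' is never looked at. Getting the second-last word is the same pair of loops again, starting from WS-START - 1, after checking that WS-START is greater than 1.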
suraaj

New User


Joined: 16 Apr 2009
Posts: 69
Location: Canada

PostPosted: Wed Sep 04, 2013 7:35 pm

Bill,

I am looking at addresses worldwide.

Thanks Suraaj
Nic Clouston

Global Moderator


Joined: 10 May 2007
Posts: 2455
Location: Hampshire, UK

PostPosted: Wed Sep 04, 2013 8:34 pm

Quote:
I am looking at addresses worldwide.

Good luck, as a lot of countries seem to have their own format!
Robert Sample

Global Moderator


Joined: 06 Jun 2008
Posts: 8697
Location: Dubuque, Iowa, USA

PostPosted: Wed Sep 04, 2013 9:09 pm

My first observation is that your unit list is short -- the US Post Office has officially 23 different abbreviations for the secondary unit name, which implies you need 46 names / abbreviations in your search algorithm. There will be others for non-US addresses.

My second observation is that someone needs to do a LOT of due diligence on the address lists, since not all countries use the same format (Costa Rica, for example, does not use street numbers / names in addresses but rather distances from local landmarks).
Bill Woodger

Moderator Emeritus


Joined: 09 Mar 2011
Posts: 7309
Location: Inside the Matrix

PostPosted: Wed Sep 04, 2013 10:14 pm

200+ countries and territories?

If you are trying to fully parse addresses from around the world (or even 10-20 major countries) you have a huge task on your hands.

I'd suggest you look at a commercial service. They have taken years developing and refining their systems, so you don't have to...
All times are GMT + 6 Hours