vinu78
Active User
Joined: 02 Oct 2008 Posts: 179 Location: India
Hi,
My requirement is as follows.
I need to replace the Post Box text when it arrives as 'P.O.BOX', 'P. O. BOX', or 'PO BOX' with the standard format 'P.O. BOX'.
E.g.: "P.O.BOX 123, ROUTE MM" should be "P.O. BOX 123, ROUTE MM"
"P. O. BOX M, STATE HWY" should be "P.O. BOX M, STATE HWY"
"PO BOX 455 FRONT ROAD" should be "P.O. BOX 455 FRONT ROAD"
i.e., if the address contains Post Box in any of the above forms, standardize it as 'P.O. BOX' followed by the remaining values.
The Post Box text can arrive in other forms too, but the majority of the records in the input file use these 3.
Algorithm
1. Store the address in WS-ADDRESS
2. INSPECT WS-ADDRESS REPLACING ALL 'P.O.BOX ' BY 'P.O. BOX'
The problem with this method is that INSPECT REPLACING requires the text being replaced and the replacement text to be the same length.
Can anyone please help with an algorithm to accomplish this requirement?
Thanks
Vinu
CICS Guy
Senior Member
Joined: 18 Jul 2007 Posts: 2146 Location: At my coffee table
Been there, done that....grin....
Set up a PERFORM loop that scans for the expected bad text using reference modification and does the replacement.
Tedious, but programmable.
Show us some code so we can see whether we can advance your solution.
vinu78
Active User
Joined: 02 Oct 2008 Posts: 179 Location: India
Hi,
I started with the following logic.
The input string is 30 bytes in length.
- Declare a numeric displacement WS-D and move 1 to it
Code: |
IF INPUT-STRING(WS-D:7) = 'P.O.BOX'
    MOVE 'P.O. BOX' TO OUTPUT-STRING(1:8)
    *> WS-LENGTH: position of the first byte after 'P.O.BOX'
    COMPUTE WS-LENGTH = WS-D + 7
    *> WS-TOT: number of bytes to copy from there
    COMPUTE WS-TOT = 30 - WS-LENGTH
    MOVE INPUT-STRING(WS-LENGTH:WS-TOT) TO OUTPUT-STRING(9:22)
END-IF |
Likewise, I need to add an IF condition for each of the other cases, and WS-LENGTH changes for each case.
I am stuck somewhere in this logic. If the Post Box comes in the middle of the string, this logic won't work.
Thanks
Chidam
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
Robert Sample
Global Moderator
Joined: 06 Jun 2008 Posts: 8696 Location: Dubuque, Iowa, USA
Create an output variable and move data from the input variable to it, byte by byte, until you find one of the strings that needs to be changed. Move the standard text to the output instead, then move the rest of the input variable to the output variable.
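A minimal sketch of that byte-by-byte approach, written as a small stand-alone program. The program name, field names and lengths are made up for illustration; only the three variants from the first post are checked, and only the first match is replaced. WS-WORK carries 9 spare bytes so the reference modification never runs past the end of the field.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. POBOXSTD.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * All names and lengths here are assumptions for the sketch.
       01  WS-INPUT        PIC X(35)
               VALUE '1510 SURREY PO BOX 123 ROUTE 5'.
      * Input plus 9 spare bytes, so a 9-byte compare at
      * position 35 still stays inside the field.
       01  WS-WORK         PIC X(44) VALUE SPACES.
       01  WS-OUTPUT       PIC X(40) VALUE SPACES.
       01  WS-IN-POS       PIC 9(2)  VALUE 1.
       01  WS-OUT-POS      PIC 9(2)  VALUE 1.
       01  WS-MATCH-LEN    PIC 9(2)  VALUE 0.
       01  WS-FOUND        PIC X     VALUE 'N'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE WS-INPUT TO WS-WORK
           PERFORM UNTIL WS-IN-POS > 35 OR WS-FOUND = 'Y'
      * Longest variant first; 0 means no variant starts here.
               EVALUATE TRUE
                   WHEN WS-WORK(WS-IN-POS:9) = 'P. O. BOX'
                       MOVE 9 TO WS-MATCH-LEN
                   WHEN WS-WORK(WS-IN-POS:7) = 'P.O.BOX'
                       MOVE 7 TO WS-MATCH-LEN
                   WHEN WS-WORK(WS-IN-POS:6) = 'PO BOX'
                       MOVE 6 TO WS-MATCH-LEN
                   WHEN OTHER
                       MOVE 0 TO WS-MATCH-LEN
               END-EVALUATE
               IF WS-MATCH-LEN > 0
      * Write the standard text and skip the variant.
                   MOVE 'P.O. BOX' TO WS-OUTPUT(WS-OUT-POS:8)
                   ADD 8 TO WS-OUT-POS
                   ADD WS-MATCH-LEN TO WS-IN-POS
                   MOVE 'Y' TO WS-FOUND
               ELSE
      * No match: copy one byte across unchanged.
                   MOVE WS-WORK(WS-IN-POS:1)
                     TO WS-OUTPUT(WS-OUT-POS:1)
                   ADD 1 TO WS-IN-POS
                   ADD 1 TO WS-OUT-POS
               END-IF
           END-PERFORM
      * After a match, move the rest of the input in one go.
           IF WS-FOUND = 'Y' AND WS-IN-POS <= 35
               MOVE WS-WORK(WS-IN-POS:)
                 TO WS-OUTPUT(WS-OUT-POS:)
           END-IF
           DISPLAY '<' WS-OUTPUT '>'
           STOP RUN.
```

With the sample value, the text on either side of the Post Box should be kept and only 'PO BOX' replaced. A plain scan like this would also match 'PO BOX' inside a longer word (for example 'EXPO BOX'), so a real version would probably want a check on the preceding byte as well.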
vinu78
Active User
Joined: 02 Oct 2008 Posts: 179 Location: India
Hi Dick,
You got it right. It was the same requirement that I posted earlier.
Earlier the requirement was to SPACE OUT the text whenever we found 'P.O. BOX <number>' or 'PO. BOX <num>' etc., so at that time we used INSPECT REPLACING with spaces.
But now the requirement is to standardize Post Box as 'P.O. BOX' followed by the remaining values. Here I can't use INSPECT REPLACING BY 'P.O. BOX' since the source and target strings are not the same length.
Robert - Thanks for the logic. I will try implementing it.
Thanks
Chidam
CICS Guy
Senior Member
Joined: 18 Jul 2007 Posts: 2146 Location: At my coffee table
Heck, talk about a VERY deep IF loop...
Code: |
If byte = P
    then if byte+1 = O
        then found 'PO'
    else if byte+1 = .
        then found 'P.'
    else if byte+1 = space
        then found 'P '
etc........ |
Reference modification and a 'BIG' PERFORM loop is your answer. Have fun, and let us know what you discover in your quest.
Marso
REXX Moderator
Joined: 13 Mar 2006 Posts: 1353 Location: Israel
You can easily cover all cases with the following code:
Code: |
MOVE 0 TO counter
UNSTRING old-address DELIMITED BY 'BOX '
    INTO work-area COUNT IN counter
END-UNSTRING
IF counter = LENGTH OF old-address THEN
    DISPLAY ' BOX not found. No change'
    MOVE old-address TO new-address
ELSE
    MOVE 'P. O. ' TO new-address
    ADD 1 TO counter
    MOVE old-address (counter:) TO new-address (7:)
END-IF |
However, the text preceding "BOX" is not checked, so it can lead to funny results:
Code: |
Before process: <P.O.BOX 123, ROUTE MM >
After process: <P. O. BOX 123, ROUTE MM >
Before process: <P.O..BOX 123, ROUTE MM >
After process: <P. O. BOX 123, ROUTE MM >
Before process: <P. O. BOX M, STATE HWY >
After process: <P. O. BOX M, STATE HWY >
Before process: <P. A. BOX M, STATE HWY >
After process: <P. O. BOX M, STATE HWY >
Before process: <PO BOX 455 FRONT ROAD >
After process: <P. O. BOX 455 FRONT ROAD >
Before process: <P.O. BIX 123, ROUTE MM >
BOX not found. No change
Before process: <LETTERBOX M, STATE HWY >
After process: <P. O. BOX M, STATE HWY >
Before process: <PO NO BOX 455 FRONT ROAD >
After process: <P. O. BOX 455 FRONT ROAD >
Before process: <PA NO BOX 455 FRONT ROAD >
After process: <P. O. BOX 455 FRONT ROAD >
Before process: <NO BOX IN FRONT ROAD >
After process: <P. O. BOX IN FRONT ROAD > |
As the text preceding BOX is placed in the field "work-area" (by the UNSTRING statement), it can be checked.
There are many ways to do that...
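One possible way to do that check, as a hedged sketch rather than a finished routine: compare the bytes that end right before 'BOX ' against a short list of accepted prefixes, and only rewrite the address when one of them matches. The program name, field names, lengths and the prefix list are all assumptions, and this version standardizes to 'P.O. BOX'. The work area sits behind 6 filler bytes so that looking back never needs a start position below 1.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. POBOXCHK.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * All names and lengths here are assumptions for the sketch.
       01  WS-OLD          PIC X(35)
               VALUE '1510 SURREY PO BOX 123 ROUTE 5'.
       01  WS-NEW          PIC X(40) VALUE SPACES.
      * 6 leading spaces let the checks below look back up to
      * 6 bytes without a start position below 1.
       01  WS-PADDED.
           05  FILLER          PIC X(6)  VALUE SPACES.
           05  WS-WORK-AREA    PIC X(35) VALUE SPACES.
       01  WS-COUNT        PIC 9(2)  VALUE 0.
       01  WS-PFX-LEN      PIC 9(2)  VALUE 0.
       01  WS-KEEP         PIC 9(2)  VALUE 0.
       01  WS-REST         PIC 9(2)  VALUE 0.
       01  WS-PTR          PIC 9(2)  VALUE 1.
       PROCEDURE DIVISION.
       MAIN-PARA.
           UNSTRING WS-OLD DELIMITED BY 'BOX '
               INTO WS-WORK-AREA COUNT IN WS-COUNT
           END-UNSTRING
      * A count of 35 means 'BOX ' was not found at all.
           IF WS-COUNT < 35
      * Compare the bytes that end right before 'BOX '.
               EVALUATE TRUE
                   WHEN WS-PADDED(WS-COUNT + 1:6) = 'P. O. '
                       MOVE 6 TO WS-PFX-LEN
                   WHEN WS-PADDED(WS-COUNT + 2:5) = 'P.O. '
                       MOVE 5 TO WS-PFX-LEN
                   WHEN WS-PADDED(WS-COUNT + 3:4) = 'P.O.'
                       MOVE 4 TO WS-PFX-LEN
                   WHEN WS-PADDED(WS-COUNT + 4:3) = 'PO '
                       MOVE 3 TO WS-PFX-LEN
                   WHEN OTHER
                       MOVE 0 TO WS-PFX-LEN
               END-EVALUATE
           END-IF
           IF WS-PFX-LEN = 0
      * No accepted prefix: leave the address alone.
               MOVE WS-OLD TO WS-NEW
           ELSE
               COMPUTE WS-KEEP = WS-COUNT - WS-PFX-LEN
               COMPUTE WS-REST = WS-COUNT + 5
      * Text in front of the prefix, if there is any.
               IF WS-KEEP > 0
                   STRING WS-OLD(1:WS-KEEP) DELIMITED BY SIZE
                       INTO WS-NEW WITH POINTER WS-PTR
                   END-STRING
               END-IF
               STRING 'P.O. BOX ' DELIMITED BY SIZE
                   INTO WS-NEW WITH POINTER WS-PTR
               END-STRING
      * Text after 'BOX ', if there is any.
               IF WS-REST <= 35
                   STRING WS-OLD(WS-REST:) DELIMITED BY SIZE
                       INTO WS-NEW WITH POINTER WS-PTR
                   END-STRING
               END-IF
           END-IF
           DISPLAY '<' WS-NEW '>'
           STOP RUN.
```

With this check, inputs such as 'LETTERBOX M' or 'PO NO BOX 455' fall through to the no-change branch because the bytes before 'BOX ' match none of the listed prefixes. It still only looks at the first 'BOX ' in the address.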
vinu78
Active User
Joined: 02 Oct 2008 Posts: 179 Location: India
Thanks Marso for the logic.
I have achieved almost 70% of the requirement using your logic.
The input-string and output-string are each 35 bytes in length.
Code: |
MOVE 0 TO A-COUNT
INSPECT INPUT-STRING TALLYING A-COUNT FOR ALL
    'P.O. BOX' 'P. O. BOX' 'PO BOX' 'P O BOX'
IF A-COUNT > 0
    UNSTRING INPUT-STRING DELIMITED BY 'BOX ' INTO WS-WORK-AREA
        COUNT IN B-COUNT
    END-UNSTRING
    *> B-COUNT now points at the space that follows BOX
    ADD 4 TO B-COUNT
    COMPUTE WS-LENGTH = 35 - B-COUNT
    MOVE 'P.O. BOX' TO OUTPUT-STRING(1:8)
    MOVE INPUT-STRING(B-COUNT:WS-LENGTH) TO OUTPUT-STRING(9:27)
ELSE
    MOVE INPUT-STRING TO OUTPUT-STRING
END-IF |
Now if the Post Box comes at the beginning of the string, the above code works perfectly. Otherwise it drops the part of the address that comes before the Post Box.
So the address "1510 SURREY PO BOX 123 ROUTE 5" becomes
"P.O. BOX 123 ROUTE 5" using the above code.
It should have been "1510 SURREY P.O. BOX 123 ROUTE 5".
I am actively working on the logic but somehow not getting it.
Can anybody give a helping hand with the above logic?
The variable WS-WORK-AREA contains the whole address before 'BOX' and B-COUNT contains its length.
Thanks
Chidam
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
When you do the UNSTRING, I would also collect the remainder in another field.
Code: |
05  First-half   pic x(30).
05  Second-half  pic x(30).
UNSTRING INPUT-STRING
    delimited by 'BOX '
    INTO First-half
        ,Second-half
end-unstring |
Then reverse first-half (FUNCTION REVERSE) into a work field and INSPECT it for the first P, tallying p-position.
You now know the position of the P (of P. O. or PO or ...),
so now
Code: |
STRING first-half(1:length of first-half - p-position)  *> may need +1
       'P.O. BOX '
       second-half
       DELIMITED BY SIZE
  INTO Final-Area
END-STRING |
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
And in the event that p-position = 30 (when the PO BOX comes first),
then
Code: |
IF P-POSITION = 30
THEN
    STRING 'P.O. BOX '
           SECOND-HALF
           DELIMITED BY SIZE
      INTO FINAL-AREA
    END-STRING
ELSE
    STRING first-half(1:length of first-half - p-position)  *> may need +1
           'P.O. BOX '
           second-half
           DELIMITED BY SIZE
      INTO Final-Area
    END-STRING
END-IF |
Of course, always space out first-half, second-half and final-area
before doing the UNSTRING.
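To help settle the off-by-one doubt in the STRING above, here is a small sketch of one way to locate the last P and work out how many bytes to keep in front of it. It uses INSPECT ... TALLYING FOR CHARACTERS BEFORE INITIAL on the reversed field, which is my own choice of technique; the program name and field names are hypothetical, and the 30-byte length follows the declarations above.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FINDLSTP.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Hypothetical names; 30-byte fields as in the post above.
       01  WS-FIRST-HALF   PIC X(30) VALUE '1510 SURREY PO '.
       01  WS-REVERSE      PIC X(30) VALUE SPACES.
       01  WS-TALLY        PIC 9(2)  VALUE 0.
       01  WS-KEEP-LEN     PIC 9(2)  VALUE 0.
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE FUNCTION REVERSE(WS-FIRST-HALF) TO WS-REVERSE
           MOVE 0 TO WS-TALLY
      * Count the bytes in front of the first 'P' of the
      * reversed text, i.e. the bytes after the last 'P'.
           INSPECT WS-REVERSE TALLYING WS-TALLY
               FOR CHARACTERS BEFORE INITIAL 'P'
           IF WS-TALLY = 30
               DISPLAY 'No P found in first half'
           ELSE
      * The last 'P' sits at byte 30 - WS-TALLY, so the text
      * to keep in front of it is one byte shorter.
               COMPUTE WS-KEEP-LEN = 30 - WS-TALLY - 1
               DISPLAY 'Bytes to keep: ' WS-KEEP-LEN
           END-IF
           STOP RUN.
```

For the sample value the tally is 17 and the program displays a keep length of 12, which is the length of '1510 SURREY '. Here the tally counts the bytes before the P in the reversed text, so it is one less than the 1-based position of that P; with a 1-based p-position the keep length is simply 30 - p-position, with no +1. A keep length of 0 (Post Box first) still needs its own branch, since a reference modification length of 0 is not valid.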
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10873 Location: italy
I dare say that your expectations are quite unreasonable.
You cannot expect to implement a parser for a transformation
without defining a <grammar>;
the parser will succeed only for the strings that satisfy the <grammar>,
which for such cases is usually expressed as a sequence of regular expression rules.
First express the rules in plain language, forgetting about the associative logic of the human brain.
How are you going to parse the words before BOX?
State the rules and everything will be easy.
Visual parsing cannot be implemented in a programming language.
Define the acceptable percentage of unparsable strings and proceed from that.
In any case you will always have a residue that must be normalized by hand.
vinu78
Active User
Joined: 02 Oct 2008 Posts: 179 Location: India
Thanks Dick for the logic.
I will try that and let the group know the results.
Thanks
Chidam
dneufarth
Active User
Joined: 27 Apr 2005 Posts: 419 Location: Inside the SPEW (Southwest Ohio, USA)
Side note question:
I remember doing an address scrubbing project that used some data provided by the USPS for proper ZIP codes, and I thought the USPS wanted no punctuation in any part of the address. Is that an old requirement, or am I just off track?
vinu78
Active User
Joined: 02 Oct 2008 Posts: 179 Location: India
Hi,
Thanks all for the help.
I did the coding as follows and got the desired results.
Code: |
MOVE 0 TO A-COUNT
INSPECT WS-INPUT TALLYING A-COUNT FOR ALL
    'P.O.BOX ' 'P. O. BOX' 'PO BOX' 'P.O. BOX' 'P O BOX'
IF A-COUNT > 0
    UNSTRING WS-ADDRESS-POST DELIMITED BY 'BOX '
        INTO WS-FIRST-HALF, WS-SECOND-HALF
    END-UNSTRING
    MOVE FUNCTION REVERSE(WS-FIRST-HALF) TO WS-REVERSE
    *> Find the last 'P' of the first half (first 'P' of the reversed text)
    PERFORM VARYING WS-J FROM 1 BY 1 UNTIL WS-J > 30
        IF WS-REVERSE(WS-J:1) = 'P'
            MOVE WS-J TO P-POSITION
            MOVE 31 TO WS-J
        END-IF
    END-PERFORM
    IF P-POSITION = 30
        STRING 'P.O. BOX ', WS-SECOND-HALF DELIMITED BY SIZE
            INTO WS-OUTPUT
        END-STRING
    ELSE
        STRING WS-FIRST-HALF (1:30 - P-POSITION), 'P.O. BOX ',
               WS-SECOND-HALF DELIMITED BY SIZE
            INTO WS-OUTPUT
        END-STRING
    END-IF
ELSE
    MOVE WS-INPUT TO WS-OUTPUT
END-IF. |
I can keep adding different formats of P.O. Box to the first INSPECT statement.
Thanks
Chidam