abhit007
New User
Joined: 18 Oct 2006 Posts: 10 Location: Bangalore
|
|
|
|
Hi Guys,
I am facing a problem here. I want to write a REXX program to pick up all the email IDs present in a report. For example:
XYZ.ABC.DEF.GIH (PS)
**************************************************************
THIS REPORT IS USED TO FIND OUT THE NEW EMAIL IDS
INTRODUCED
**************************************************************
%EMAIL (ABC@GMAIL.COM,ABC.XYZ@GMAIL.COM ) -
%EMAIL (SUPPORT@GMAIL.COM CARDSSD@GMAIL.COM) -
%EMAIL (REPORT@GMAIL.COM ) -
%EMAIL (SUPPORT@GMAIL.COM ) -
%EMAIL (XXXXXXXXXXX@GMAIL.COM ) -
%EMAIL (SUPPORT@GMAIL.COM ) -
%EMAIL (BONUS@GMAIL.COM, PRODUCTS@YAHOO.CO.IN) -
From the above report I want to pick out all the email IDs and write them to another PDS as:
ABC@GMAIL.COM
ABC.XYZ@GMAIL.COM
SUPPORT@GMAIL.COM
CARDSSD@GMAIL.COM
REPORT@GMAIL.COM
SUPPORT@GMAIL.COM
XXXXXXXXXXX@GMAIL.COM
SUPPORT@GMAIL.COM
BONUS@GMAIL.COM
PRODUCTS@YAHOO.CO.IN
Any idea??
Thanks,
Abhishek |
|
|
|
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10886 Location: italy
|
|
|
|
where are You stuck ....
in the logic or the code ? |
|
|
|
abhit007
New User
Joined: 18 Oct 2006 Posts: 10 Location: Bangalore
|
|
|
|
To start with, I am stuck at the logic itself. :P |
|
|
|
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
|
|
|
|
1.
access a record
isolate and store email-address(es)
more records? goto 1
output all stored email-addresses
what would your logic in cobol be?
why did you pick REXX for this project?
how many lines in the report? |
|
|
|
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
Identify the lines from first word, or by location, being %EMAIL.
Get rid of first word, opening and closing brackets, commas.
Take each remaining word and treat as an e-mail address. |
|
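The three steps above can be sketched in a few lines of REXX. This is only an illustration on one line taken from the report: the stem name `found.` is made up for the example, and skipping a lone "-" is an assumption about the trailing continuation character, not something the steps mention.

```rexx
/* REXX - sketch of the three steps on one sample line from the report */
line = "%EMAIL (BONUS@GMAIL.COM, PRODUCTS@YAHOO.CO.IN) -"
found.0 = 0
if word(line, 1) = "%EMAIL" then do
  rest = subword(line, 2)                /* get rid of the first word   */
  rest = translate(rest, " ", "(),")     /* brackets, commas -> blanks  */
  do k = 1 to words(rest)
    w = word(rest, k)
    if w == "-" then iterate             /* assumed continuation mark   */
    n = found.0 + 1
    found.n = w                          /* each remaining word is an ID */
    found.0 = n
  end
end
do k = 1 to found.0
  say found.k
end
```

Only the brackets and commas are blanked out, so an address that itself contains a hyphen is left intact.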
|
|
prino
Senior Member
Joined: 07 Feb 2009 Posts: 1314 Location: Vilnius, Lithuania
|
|
|
|
Translate everything that is not alphanumeric, '@' or '.' to a space, and use the REXX "word" and "words" functions. Any "word" containing an '@' sign is (probably) an email address. |
|
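A minimal sketch of that suggestion follows. The `keep` set and the `hits.` stem are names invented for the example, and the blanking is done with a `verify`/`overlay` loop, which is just one portable way to get the same effect as a translate table.

```rexx
/* REXX - sketch: blank out everything that cannot be part of an address */
keep = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789@."
line = "%EMAIL (SUPPORT@GMAIL.COM CARDSSD@GMAIL.COM) -"
clean = line
do forever
  p = verify(clean, keep || " ")     /* first character outside the set */
  if p = 0 then leave
  clean = overlay(" ", clean, p)     /* replace it with a blank         */
end
hits.0 = 0
do k = 1 to words(clean)
  w = word(clean, k)
  if pos("@", w) > 0 then do         /* only words with "@" qualify     */
    n = hits.0 + 1
    hits.n = w
    hits.0 = n
  end
end
do k = 1 to hits.0
  say hits.k
end
```

With `keep` as written, an address containing "_" or "-" would be split into pieces; adding those characters to `keep` is probably needed for real data.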
|
|
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10886 Location: italy
|
|
|
|
faster to write/test and post the code than to try to explain
Code: |
#!/usr/bin/rexx
/* sample input: the seven %EMAIL cards from the report */
list.1 = "%EMAIL (ABC@GMAIL.COM,ABC.XYZ@GMAIL.COM ) "
list.2 = "%EMAIL (SUPPORT@GMAIL.COM CARDSSD@GMAIL.COM) "
list.3 = "%EMAIL (REPORT@GMAIL.COM ) "
list.4 = "%EMAIL (SUPPORT@GMAIL.COM ) "
list.5 = "%EMAIL (XXXXXXXXXXX@GMAIL.COM ) "
list.6 = "%EMAIL (SUPPORT@GMAIL.COM ) "
list.7 = "%EMAIL (BONUS@GMAIL.COM, PRODUCTS@YAHOO.CO.IN) "
list.0 = 7
addr.0 = 0
do i = 1 to list.0
  buff = list.i
  if left(buff,6) = "%EMAIL" then do
    /* check for a properly delimited list */
    if (pos("(",buff)) = 0 | ,
       (pos(")",buff)) = 0 then do
      say " bad EMAIL record" right(i,2) buff
      iterate
    end
    /* keep only the text between the parentheses */
    parse var buff . "(" list ")" .
    /* provide both for comma/wsp delimited addresses */
    list = translate(list," ",",")
    do j = 1 to words(list)
      a = addr.0 + 1
      addr.a = strip(word(list,j))
      addr.0 = a
    end
  end
end
/* show the collected addresses */
do a = 1 to addr.0
  say a addr.a
end
|
if the input format changes, do a bit of work Yourself
for example You might need to build from multi card list |
|
|
|
abhit007
New User
Joined: 18 Oct 2006 Posts: 10 Location: Bangalore
|
|
|
|
Hi Enrico,
The code you have provided is working perfectly after a little tweaking.
I am reading the report from my dataset into a stem and then running the piece of code you have given to filter out the mail addresses. Comma and spaces were the only delimiters.
Since %EMAIL is not always the first thing on a line, I have removed the first part of your code. However, there are still some minor glitches: it picks up special characters like '-' and writes them on a new line.
Hi All,
Thanks for your help as well.
Many thanks for all the help.
Abhishek |
|
|
|
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10886 Location: italy
|
|
|
|
post an example of where MY code fails,
my snippet as posted will parse correctly ANY list within parentheses
and split out the individual addresses correctly, unless an address itself contains commas or spaces |
|
|
|
don.leahy
Active Member
Joined: 06 Jul 2010 Posts: 765 Location: Whitby, ON, Canada
|
|
|
|
Enrico, re your code sample: on my display, the lowercase "l" and the numeral "1" appear identical in the code window.
ie the line "do l = 1 to list.0" looks like "do l = l to list.0" |
|
|
|
Bill Woodger
Moderator Emeritus
Joined: 09 Mar 2011 Posts: 7309 Location: Inside the Matrix
|
|
|
|
1l1l1l1l1l1l1l1l1l1
Code: |
1l1l1l1l1l1l1l1l1l1 |
1l1l1l1l1l1l1l1l1l1
It is not just enrico's stuff, it seems. How many years has it taken for anyone to spot that (well done Don!)? When you copy/paste from the coded stuff, you get the right ones, as above (first typed, copy/paste, coded, preview, copy/paste from Coded stuff on Preview).
Or is this a recent change? How could we tell :-)
EDIT: This is with Firefox, anyway. |
|
|
|
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10886 Location: italy
|
|
|
|
Quote: |
Enrico, re your code sample: on my display, the lowercase "l" and the numeral "1" appear identical in the code window. |
that's a common problem with some fonts
my code sample worked well AS CODED
just as a courtesy to Your eyes I edited the code to use j for the inner loop and i for the outer |
|
|
|
dbzTHEdinosauer
Global Moderator
Joined: 20 Oct 2006 Posts: 6966 Location: porcelain throne
|
|
|
|
i personally never use the single alpha i, l, or o as a variable
since people tend to see what they want
instead of what is there. |
|
|
|
abhit007
New User
Joined: 18 Oct 2006 Posts: 10 Location: Bangalore
|
|
|
|
Hi Enrico,
Your code is working perfectly. I just had to alter it to suit my needs.
Hi All,
Next part of the problem:
Now I have to take each of these IDs from one dataset and search for it in my database (another dataset). If an address is not present, it has to be written to another file.
This database is again not a formatted one; the email IDs appear at random positions, so we might have to use POS or something like that.
Can you please help?
Thanks,
Abhishek |
|
|
|
Pedro
Global Moderator
Joined: 01 Sep 2006 Posts: 2593 Location: Silicon Valley
|
|
|
|
Quote: |
This database is again not a formatted one, having email Ids randomly, so we might have to use POS or something like that.
|
1. Change all '(', ')', and ',' to blanks.
2. for each word of each line, use POS function to see if it has the '@' character. |
|
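Applied to the lookup question, those two steps might look like the sketch below. The stems `ids.` and `db.` are hypothetical stand-ins for the two datasets (filled however the datasets are read), the sample database lines are invented, and treating addresses as case-insensitive is an assumption.

```rexx
/* REXX - sketch: which extracted IDs are missing from the "database"? */
/* ids. and db. are hypothetical stems standing in for the datasets    */
ids.1 = "SUPPORT@GMAIL.COM"
ids.2 = "BONUS@GMAIL.COM"
ids.3 = "NEWUSER@GMAIL.COM"
ids.0 = 3
db.1 = "JOB01 NOTIFY(SUPPORT@GMAIL.COM,REPORT@GMAIL.COM)"
db.2 = "misc text BONUS@GMAIL.COM more text"
db.0 = 2

known. = 0                             /* default: address not seen    */
do i = 1 to db.0
  line = translate(db.i, " ", "(),")   /* step 1: delimiters -> blanks */
  do j = 1 to words(line)
    w = translate(word(line, j))       /* uppercase for comparison     */
    if pos("@", w) > 0 then known.w = 1  /* step 2: word holds an "@"  */
  end
end

missing.0 = 0
do i = 1 to ids.0
  key = translate(ids.i)
  if known.key = 0 then do             /* never seen in the database   */
    n = missing.0 + 1
    missing.n = ids.i
    missing.0 = n
  end
end
do i = 1 to missing.0
  say missing.i
end
```

Using the address itself as the stem tail means the database is scanned only once, instead of running POS over every database line for every ID.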
|
|
abhit007
New User
Joined: 18 Oct 2006 Posts: 10 Location: Bangalore
|
|
|
|
enrico-sorichetti wrote: |
post an example of where MY code fails,
my snippet as posted will parse correctly ANY list within parentheses
and split out the individual addresses correctly, unless an address itself contains commas or spaces |
Hi Enrico,
There is one problem. There are certain instances where parentheses are missing. For example:
%YMAIL (XYZVITY@GMAIL.COM,XG.RPCZPYRATIZNX@GMAIL.COM, -
CZYRYL.CZIAM@GMAIL.COM,XYNGCZAI.CZAN@GMAIL.COM, -
(DZZZA.AZMAD@GMAIL.COM,MARIYTTA.BZZCZRRYJADZ@GMAIL.COM, -
DYVAXXY.ATTZKARYN@GMAIL.COM,ALKA.JZA@GMAIL.COM, -
AAAT.XINGZ3@GMAIL.COM,VIXZAL.ZJZA@GMAIL.COM) -
FRZM (XYZVITY@GMAIL.COM) -
How can this be parsed?
Thanks,
Abhishek. |
|
|
|
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10886 Location: italy
|
|
|
|
I made a note about ....
Quote: |
for example You might need to build from multi card list |
implied some kind of continuation ...
but You must set some rules ...
a string concatenation/sequence from a parsing point of view is an expression
and in any language in an expression the parentheses must be balanced
the example You posted is NOT parsable because the open and close parentheses are not balanced
You cannot blame the parsing if Your input is wrong! |
|
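One way to check that rule before parsing is to count the parentheses in the joined statement. The sketch below is an illustration only: `stmt` is a shortened, made-up stand-in for the posted example (three opening and two closing parentheses), and the variable names are invented.

```rexx
/* REXX - sketch: are the parentheses in a joined statement balanced? */
stmt = "%YMAIL (A@GMAIL.COM, (B@GMAIL.COM, C@GMAIL.COM) FRZM (A@GMAIL.COM)"
depth = 0
ok = 1
do p = 1 to length(stmt)
  c = substr(stmt, p, 1)
  if c == "(" then depth = depth + 1
  if c == ")" then depth = depth - 1
  if depth < 0 then do            /* a ")" with no matching "("  */
    ok = 0
    leave
  end
end
if depth \= 0 then ok = 0         /* a "(" that was never closed */
if ok then say "balanced"
else say "NOT balanced:" stmt
```

Rejecting such a statement up front keeps the extraction logic simple, since it can then rely on every "(" having its ")".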
|
|
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10886 Location: italy
|
|
|
|
here is a snippet that will parse a properly parenthesized string
Code: |
#!/usr/bin/rexx
/* sample input: a trailing "-" continues the statement on the next card */
list.1 = "%EMAIL (ABC@GMAIL.COM,ABC.XYZ@GMAIL.COM ) "
list.2 = "%EMAIL (SUPPORT@GMAIL.COM CARDSSD@GMAIL.COM -"
list.3 = "REPORT@GMAIL.COM - "
list.4 = "SUPPORT@GMAIL.COM ) "
list.5 = "%EMAIL (XXXXX_XXX.XXX@some-strange.address.COM ) "
list.6 = "%EMAIL (SUPPORT@GMAIL.COM ) "
list.7 = "%EMAIL (BONUS@GMAIL.COM, PRODUCTS@YAHOO.CO.IN) "
list.0 = 7
addr.0 = 0
indx = 1
buff = ""
do while ( indx <= list.0 )
  buff = strip(list.indx)
  say indx "*1 " buff
  indx = indx + 1
  /* join continued cards into one logical statement */
  do while ( right(buff,1) = "-" & indx <= list.0 )
    /* drop only the trailing continuation character */
    buff = strip(buff,"T","-") strip(list.indx)
    say indx "*2 " buff
    indx = indx + 1
  end
  /* do some checking for a wrongful continuation on last card */
  say indx "*3 " buff
  if left(buff,6) = "%EMAIL" then do
    /* check for a properly delimited list */
    if (pos("(",buff)) = 0 | ,
       (pos(")",buff)) = 0 then do
      /* indx already points at the next card */
      say "bad EMAIL record" right(indx - 1,2) buff
      iterate
    end
    /* keep only the text between the parentheses */
    parse var buff . "(" list ")" .
    /* provide both for comma/wsp delimited addresses */
    list = translate(list," ",",")
    do i = 1 to words(list)
      a = addr.0 + 1
      addr.a = strip(word(list,i))
      addr.0 = a
    end
  end
end
/* show the collected addresses */
do a = 1 to addr.0
  say a addr.a
end
|
|
|
|
|
abhit007
New User
Joined: 18 Oct 2006 Posts: 10 Location: Bangalore
|
|
|
|
Quote: |
the example You posted is NOT parsable because the open and close parentheses are not balanced
|
Hi Enrico,
%EMAIL (XYZVITY@GMAIL.COM,XG.RPCZPYRATIZNX@GMAIL.COM, -
CZYRYL.CZIAM@GMAIL.COM,XYNGCZAI.CZAN@GMAIL.COM, -
DZZZA.AZMAD@GMAIL.COM,MARIYTTA.BZZCZRRYJADZ@GMAIL.COM, -
DYVAXXY.ATTZKARYN@GMAIL.COM,ALKA.JZA@GMAIL.COM, -
AAAT.XINGZ3@GMAIL.COM,VIXZAL.ZJZA@GMAIL.COM) -
FRZM (XYZVITY@GMAIL.COM) -
Ignore the example in the previous post; consider this example, where the parentheses are balanced.
Since there are multiple IDs, the list spans multiple lines.
Thanks
Abhishek |
|
|
|
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10886 Location: italy
|
|
|
|
and what about the
Quote: |
FRZM (XYZVITY@GMAIL.COM) |
? |
|
|
|
Anuj Dhawan
Superior Member
Joined: 22 Apr 2006 Posts: 6248 Location: Mumbai, India
|
|
|
|
abhit007 wrote: |
Ignore the example in the previous post; consider this example, where the parentheses are balanced.
Since there are multiple IDs, the list spans multiple lines. |
Enrico, Working on a distant-contract, by any chance? |
|
|
|
Pedro
Global Moderator
Joined: 01 Sep 2006 Posts: 2593 Location: Silicon Valley
|
|
|
|
I think this is simpler logic:
Quote: |
1. Change all '(', ')', and ',' to blanks.
2. for each word of each line, use POS function to see if it has the '@' character.
|
|
|
|
|
enrico-sorichetti
Superior Member
Joined: 14 Mar 2007 Posts: 10886 Location: italy
|
|
|
|
Quote: |
Enrico, Working on a distant-contract, by any chance? |
nope!
writing scanners and parsers is one of my preferred subjects
so when I see something related I like to explore all the alternatives, not only just working ,
but also not trivial from a scanning/parsing algorithmic point of view
stay tuned, I have a nice snippet coming! |
|
|
|
|