[geeklog-users] fix in COM_isemail
Bob Apthorpe
apthorpe+geeklog at cynistar.net
Tue Dec 9 16:43:42 EST 2003
Hi,
On Tue, 9 Dec 2003, Lucas Gonze wrote:
> I added the '+' character to the list of allowed chars on the left side
> of an email address. This permits email addresses like
> joe+geeklog@site.com, which some people use to flag the source of spam.
>
> New code:
> function COM_isemail( $email )
> {
>     if( eregi(
>         "^([-_0-9a-z+])+([-._0-9a-z+])*@[0-9a-z+]([-.]?[0-9a-z])*.[a-z]{2,3}$",
>         $email, $check ))
>     // was:
>     // if( eregi(
>     //     "^([-_0-9a-z])+([-._0-9a-z])*@[0-9a-z]([-.]?[0-9a-z])*.[a-z]{2,3}$",
>     //     $email, $check ))
>     {
>         return TRUE;
>     }
>     else
>     {
>         return FALSE;
>     }
> }
You can reduce your regex to:
"^[-_0-9a-z+]+[-._0-9a-z+]*@[0-9a-z+]([-.]?[0-9a-z])*.[a-z]{2,3}$",
I've stripped out some of the unnecessary parens. I'm not sure you want
to allow addresses of the form +++++++@example.com, but that's
technically allowed.
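As a rough illustration of two things the reduced pattern lets through (a
local part made only of '+', and any character where the '.' before the TLD
should be, because that '.' is unescaped), here is a minimal sketch. It
assumes preg_match() with the /i flag as a stand-in for eregi(), which was
removed in PHP 7.0; the variable names are made up for the example.

```php
<?php
// Sketch only: preg_match() with /i is assumed as a replacement for eregi().
// $reduced is the simplified pattern from above; $escaped differs only in
// escaping the dot before the TLD.
$reduced = '/^[-_0-9a-z+]+[-._0-9a-z+]*@[0-9a-z+]([-.]?[0-9a-z])*.[a-z]{2,3}$/i';
$escaped = '/^[-_0-9a-z+]+[-._0-9a-z+]*@[0-9a-z+]([-.]?[0-9a-z])*\.[a-z]{2,3}$/i';

// A local part consisting only of '+' is accepted.
$plusOnly = preg_match($reduced, '+++++++@example.com');   // 1

// The unescaped '.' matches any character, so a domain with no dot passes.
$noDotReduced = preg_match($reduced, 'joe@examplexcom');   // 1
$noDotEscaped = preg_match($escaped, 'joe@examplexcom');   // 0

// An ordinary address still passes once the dot is escaped.
$normalEscaped = preg_match($escaped, 'joe@example.com');  // 1
```

Escaping the dot (\\. inside a double-quoted PHP string, \. inside a
single-quoted one) is the only change between the two patterns.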
Here's what I'm using:
if( eregi(
    "^[-_0-9a-z][-_.0-9a-z]*\\+?[-_.0-9a-z]*@[0-9a-z]([-.]?[0-9a-z])*\\.[a-z]{2,6}$",
    $email, $check))
Breaking it down, here's what it does:
"^[-_0-9a-z]         # starts with one of [-_0-9a-z]
  [-_.0-9a-z]*       # followed by 0 or more of [-_.0-9a-z]
  \\+?               # then 0 or 1 '+'
  [-_.0-9a-z]*       # then 0 or more of [-_.0-9a-z]
  @[0-9a-z]          # then '@' followed by one of [0-9a-z]
  ([-.]?[0-9a-z])*   # then 0 or more of ( 0 or 1 of [-.]
                     #   and 1 of [0-9a-z])
  \\.                # a literal '.'
  [a-z]{2,6}$"       # terminated with 2-6 of [a-z]
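For anyone who wants to try the breakdown directly, here is a hedged sketch
that keeps the comments inside the pattern. It assumes preg_match() with the
/x (extended) and /i flags rather than eregi(), which does not support inline
comments and was removed in PHP 7.0. The function name is_plausible_email()
is hypothetical and not part of Geeklog.

```php
<?php
// Hypothetical helper; preg_match() is assumed as a stand-in for eregi().
// With the x flag, whitespace and #-comments in the pattern are ignored.
function is_plausible_email($email)
{
    $pattern = '/^
        [-_0-9a-z]          # starts with one of [-_0-9a-z]
        [-_.0-9a-z]*        # then 0 or more of [-_.0-9a-z]
        \+?                 # then 0 or 1 plus sign
        [-_.0-9a-z]*        # then 0 or more of [-_.0-9a-z]
        @[0-9a-z]           # an at sign followed by one of [0-9a-z]
        ([-.]?[0-9a-z])*    # then 0 or more of (optional [-.], one of [0-9a-z])
        \.                  # a literal dot
        [a-z]{2,6}          # terminated with 2-6 of [a-z]
    $/ix';

    return preg_match($pattern, $email) === 1;
}

$tagged   = is_plausible_email('joe+geeklog@site.com');    // true
$plusOnly = is_plausible_email('+++++++@example.com');     // false
$twoPlus  = is_plausible_email('joe+a+b@site.com');        // false
$longTld  = is_plausible_email('curator@example.museum');  // true
$noDot    = is_plausible_email('joe@examplexcom');         // false
```

The comments avoid the '/' character on purpose, since it is the pattern
delimiter and would end the pattern early.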
The major differences are these. I only allow one '+' (technically you can
have more, but most people only use one). The [something][otherthing]*
pattern is better behaved than [something]+[otherthing]* because the regex
engine doesn't backtrack as much; this is an efficiency tweak that matters
when [something] and [otherthing] are very similar. The TLD is 2-6
characters rather than 2-3, to take into account the newer, longer TLDs
.aero, .coop, and .museum (not that these matter much in practice, but they
are legal). I also escape the '.' before the TLD so that it matches only a
literal dot; unescaped, it matches any character.
I still haven't found the time to implement a more robust email address
validator in PEAR's Mail module. Brave people should look at
http://www.faqs.org/rfcs/rfc822.html to see the pain involved in parsing
email addresses.
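Until something RFC-aware exists, one possible stopgap on later PHP versions
(5.2 and up) is the built-in filter extension. This is only a sketch: it
assumes filter_var() with FILTER_VALIDATE_EMAIL is available, and that filter
does not claim to cover everything RFC 822 permits.

```php
<?php
// Sketch only: assumes PHP 5.2+ with the filter extension enabled.
// filter_var() returns the address on success and false on failure.
$ok  = filter_var('joe+geeklog@site.com', FILTER_VALIDATE_EMAIL);
$bad = filter_var('not an email', FILTER_VALIDATE_EMAIL);

$okIsValid  = ($ok !== false);   // true
$badIsValid = ($bad !== false);  // false
```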
hth,
--
Bob Apthorpe