How can I prevent spammers from using the php form on my website?

Nov 28th, 2006 14:56
jean korte,


Because php is more 'forgiving' than some other languages, it is easier
for a beginner to learn.  On the other hand, it is also easier for a
beginner to inadvertently create a security or spam problem for themselves.
When php emails the contents of a web page form to the recipient, you
can think of it as sending just one big message.  The separate fields
from the form are divided by control characters (line breaks) that tell
the receiving email program where each item begins and ends.
Unscrupulous spammers know this and they look for forms where they can
add extra control characters, extra email addresses and extra messages.
If you want to know more about how they do this, here is a page that
shows samples of the kind of data spammers enter:
http://www.securephpwiki.com/index.php/Email_Injection
The suggestion on this page as to how to prevent this problem - namely
by searching for \n control characters - does not by itself solve the
problem; for example, it misses a bare carriage return (\r).
The problem is solved by tightly validating the data entered into the
form - particularly any single line entry from an html <input type='text'>
element.  Rather than worrying about what the form should not contain,
check the data for what it should contain.
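To make the mechanism concrete, here is a minimal sketch of what happens when a submitted value is pasted straight into a mail header. The function name, the X-Mailer line and the addresses are made up for illustration, and nothing is actually sent.

```php
<?php
// Hypothetical illustration only: builds a header string, sends nothing.

// Builds the extra-headers string the way a naive script might,
// by pasting the submitted value straight into the From: line.
function build_naive_headers(string $submittedEmail): string
{
    return "From: " . $submittedEmail . "\r\n" . "X-Mailer: contact-form";
}

// What an honest visitor types.
$honest = build_naive_headers("visitor@example.com");

// What a spammer submits: a line break followed by a header of their own.
$hostile = build_naive_headers("visitor@example.com\r\nBcc: victim@example.org");

// Each CRLF-separated line is read by the mail system as a separate header.
$honestLines  = explode("\r\n", $honest);
$hostileLines = explode("\r\n", $hostile);

echo count($honestLines), " header lines from the honest value\n";
echo count($hostileLines), " header lines from the hostile value\n";
echo "Smuggled header: ", $hostileLines[1], "\n";
```

The hostile value produces one more header line than the honest one. A value that has been checked for what it should contain never reaches the header with a line break in it.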
I do this using regular expressions.  Here are some examples that have
worked well for me:
The following regular expression will validate most email addresses.
Top-level domains longer than four characters, such as the 6 character
museum domains, will not validate - if you expect visitors from these
domains you can change {2,4} to {2,6}
// eregi() was removed in PHP 7, so this uses preg_match() instead
if (!preg_match('/^[A-Z0-9._-]+@[A-Z0-9._-]+\.[A-Z]{2,4}$/iD',
$_POST['emailAddress']))
{
die('invalid email address');
}
Note that I have used the i modifier, which makes the match case
insensitive, so I only had to specify one of either a-z or A-Z and the
other case is taken care of.  The D modifier makes $ match only at the
very end of the string, so a value ending in a line break is rejected.
This check also allows only one email address to be entered.  This is so
that if you automatically put the entered email address into the
reply-to field, the recipient will not inadvertently just hit reply
without noticing that the reply is going to multiple addresses.  This is
one way spammers collect email addresses from online forms.
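As an alternative to a hand-written pattern, PHP's built-in filter_var() with FILTER_VALIDATE_EMAIL can do the same job. This is a swapped-in technique, not the approach used in this entry, and the function name is_single_email is hypothetical.

```php
<?php
// Sketch: validate a submitted address with PHP's built-in email filter.
// The function name is hypothetical.
function is_single_email(string $value): bool
{
    // filter_var() returns the address on success and false on failure.
    return filter_var($value, FILTER_VALIDATE_EMAIL) !== false;
}

var_dump(is_single_email("visitor@example.com"));                            // bool(true)
var_dump(is_single_email("visitor@example.com, other@example.org"));         // bool(false)
var_dump(is_single_email("visitor@example.com\r\nBcc: victim@example.org")); // bool(false)
```

The filter places no four-character limit on the top-level domain, so longer domains such as museum pass without adjusting a length range.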
The following expression will ensure that a name field will contain only
characters you would expect to be a part of someone's name:
// preg_match() with the i modifier replaces eregi(), removed in PHP 7;
// the D modifier stops $ from matching before a trailing line break
if (!preg_match("/^[A-Z '-]+$/iD", $_POST['realname']))
{
   die ('Bad characters in name entry');
}
This expression allows only letters, no numbers.  It also allows spaces,
the - for hyphenated names and the ' for names such as O'Toole.
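The A-Z range turns away accented letters, so a visitor named José would be rejected. If that matters for your site, a Unicode-aware variant is one option. This sketch assumes the form is submitted as UTF-8, and the function name is hypothetical.

```php
<?php
// Sketch of a name check that also accepts accented letters.
// Assumes UTF-8 input; the function name is hypothetical.
function is_plausible_name(string $value): bool
{
    // \p{L} = any letter, \p{M} = combining marks.
    // u = treat pattern and input as UTF-8.
    // D = $ matches only at the very end, so a trailing newline fails.
    return preg_match('/^[\p{L}\p{M} \'-]+$/uD', $value) === 1;
}

var_dump(is_plausible_name("Mary O'Toole"));            // bool(true)
var_dump(is_plausible_name("Anne-Marie"));              // bool(true)
var_dump(is_plausible_name("R2D2"));                    // bool(false)
var_dump(is_plausible_name("Bob\r\nBcc: x@example.org")); // bool(false)
```

The whitelist idea is unchanged: the set of letters is wider, but line breaks, colons and @ signs are still refused.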
The subject field needs to allow a bit more so I have used the following:
// preg_match() with the i modifier replaces eregi(), removed in PHP 7
if (!preg_match("/^[A-Z0-9 ':!?-]+$/iD", $_POST['subject']))
{
   die ('Bad characters in subject entry');
}
This regular expression uses letters, numbers and a limited amount of
punctuation.  It does NOT allow the % character as it can precede the
hex equivalents of control characters and is used by spammers for this
purpose.  So it would prevent a subject such as 50%off -- but the user
can spell out percent if they really want to input this into a form.
None of these regular expressions allow the backslash character as it
precedes control characters.  Note where the hyphen sits in the subject
expression: it comes last inside the brackets so that it is taken as a
plain hyphen.  Placed between two other characters, as in '-: , it would
define a range and let through every character from ' to : including
( ) * + , . and /.  The ? and the ! have no special meaning inside the
brackets, so they do not need a backslash in front of them.
Of course, if you have progressed to the point of giving your users a
chance to re-enter their data, you would direct them with an error
message rather than simply exiting the script with die().
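One way to do that is to gather the problems into an array and show them next to the form. The sketch below is only an outline under a few assumptions: it reuses the field names from the examples above, uses preg_match() because eregi() is not available in current PHP, places the hyphen last in the subject pattern so that it is read literally, and the function names are hypothetical.

```php
<?php
// Sketch: gather validation errors so the form can be shown again.
// Function names are hypothetical; field names follow the examples above.

// Returns the submitted value as a string, or '' if missing or not a string.
function field(array $post, string $key): string
{
    return isset($post[$key]) && is_string($post[$key]) ? $post[$key] : '';
}

function validate_contact_form(array $post): array
{
    $errors = [];

    if (!preg_match('/^[A-Z0-9._-]+@[A-Z0-9._-]+\.[A-Z]{2,4}$/iD', field($post, 'emailAddress'))) {
        $errors['emailAddress'] = 'Please enter a single valid email address.';
    }
    if (!preg_match("/^[A-Z '-]+$/iD", field($post, 'realname'))) {
        $errors['realname'] = 'Names may contain letters, spaces, hyphens and apostrophes.';
    }
    if (!preg_match("/^[A-Z0-9 ':!?-]+$/iD", field($post, 'subject'))) {
        $errors['subject'] = 'Subjects may contain letters, numbers and basic punctuation.';
    }

    return $errors; // an empty array means every field passed
}

// Example with sample data instead of $_POST.
$errors = validate_contact_form([
    'emailAddress' => 'visitor@example.com',
    'realname'     => "Mary O'Toole",
    'subject'      => '50%off',
]);

foreach ($errors as $fieldName => $message) {
    echo $fieldName, ': ', $message, "\n";
}
```

In a real page you would pass $_POST to validate_contact_form() and, if the result is not empty, print the form again with the messages beside the matching fields.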
You will find all sorts of scattered advice suggesting that you replace
\r\n characters, replace to: bcc: and cc:, or search for strings such
as "content-type" that are used by spammers.  This is fighting a losing
battle and further restricts the legitimate use of these words by a
user of your form.
The last vulnerability is at the very end of the comments or message
entered with a <textarea> tag.  Even with the above in place, there is
a small chance that a spammer could enter control characters and
extra email addresses at the end of the message.  There are a couple of
very easy ways to prevent this.
One is to use the php function nl2br(), which inserts <br /> in front of
every new line (\n) character, so that \n becomes <br />\n.  This
extinguishes the usefulness of the \n character to a spammer.  However,
if you don't want to send html tags with your email this is not an option.
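For reference, this is roughly what nl2br() produces. The call to htmlspecialchars() is an addition of mine, on the assumption that the result is sent as an html email and any tags typed by the user should be shown as text; the function name is hypothetical.

```php
<?php
// Sketch: preparing a textarea message for an html email.
// htmlspecialchars() is an assumption added here; the function name is hypothetical.
function message_to_html(string $message): string
{
    // Escape <, >, & and quotes first, then insert <br /> before each newline.
    return nl2br(htmlspecialchars($message, ENT_QUOTES, 'UTF-8'));
}

echo message_to_html("Line one\nLine <two>"), "\n";
// prints:
// Line one<br />
// Line &lt;two&gt;
```

Note that the \n itself is still present after the <br />, so the tag is added rather than swapped in.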
The other is to just manually add something at the very end of the
message yourself.  For example, appending "end of message" to whatever
the user entered would prevent this.  Use this as an opportunity to add
something useful.
I add the IP address of the sender using the $_SERVER['REMOTE_ADDR']
variable.  Because I display a copy of the message sent, the user knows
that their IP address was sent. 
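A minimal sketch of that footer follows. The function name and the footer wording are hypothetical, and the check with FILTER_VALIDATE_IP is an extra precaution I have assumed rather than something the approach requires.

```php
<?php
// Sketch: append a fixed footer, including the sender's IP, to the message.
// The function name and footer wording are hypothetical.
function append_footer(string $message, string $ip): string
{
    // Fall back to a placeholder if the value is not a valid IP address.
    $safeIp = filter_var($ip, FILTER_VALIDATE_IP) !== false ? $ip : 'unknown';

    // Trim trailing whitespace so the footer always ends the message.
    return rtrim($message) . "\n\nSender IP address: " . $safeIp . "\nEnd of message\n";
}

// Example with sample data; in a form handler the second argument
// would be $_SERVER['REMOTE_ADDR'].
echo append_footer("Hello, I have a question.  \n", "203.0.113.7");
```

Whatever the user typed, the last lines of the email are now ones written by the script.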
This has a potential added benefit.  If someone who has a grudge
against you decides to try to send you anonymous messages using your
contact form, they will know that you have their IP address and can
trace or block them, and hopefully they will not bother with this
juvenile tactic.
I hope that this is helpful in the overall fight against spam.
PS
I hope that it goes without saying that your email address should be in
the script and not on the web page itself so that it is not available to
robots harvesting email addresses.