Web Application Security Guide/Print version

Intro

This guide attempts to provide a comprehensive overview of web application security. It explains common web application security issues and how to prevent them. Web server and operating system security are not covered. The guide is intended mainly for web application developers, but can also provide useful information for web application reviewers.

The checklist gives a short summary containing only the individual guidelines. It is recommended to take the time and read the full version, where the guidelines are explained in detail, especially if any questions arise.

Most web application developers probably (hopefully) already know some or even most of the points mentioned in this guide. However, there will probably be something new for every developer. Remember, as a developer it is your responsibility to develop your application securely, and a single mistake may be enough to allow an attack.

Checklist

Miscellaneous points

Do not rely on Web Application Firewalls for security (however, consider using them to improve security)
If external libraries (e.g. for database access, XML parsing) are used, always use current versions
If you need random numbers, obtain them from a secure/cryptographic random number generator
For every action or retrieval of data, always check access rights
Do not, under any circumstances, attempt to implement cryptographic algorithms yourself. Use high-level libraries for cryptography.
Ensure debug output and error messages do not leak sensitive information
Mark problematic debug output in your code (e.g. //TODO DEBUG REMOVE) even if you intend to remove it after just one test
Do not use “eval()” and similar functions
- Avoid “system()” and similar functions if possible
Ensure database servers are not directly reachable from the outside
Consider blocking old browsers from using your application

File inclusion and disclosure

Do not take file names for inclusions from user input, only from trusted lists or constants.
- If user input is to be used, validate it against a whitelist. Checking if the file exists or if the input matches a certain format is not sufficient.
Avoid having scripts read and pass through files if possible.
If you read and deliver files using user-supplied file names, thoroughly validate the file names to avoid directory traversal and similar attacks and ensure the user is allowed to read the file.
Ensure the application runs with no more privileges than required.

File upload vulnerabilities

Avoid unnecessary file uploads
Ensure that files uploaded by the user cannot be interpreted as script files by the web server, e.g. by checking the file extension (or whatever means your web server uses to identify script files)
Ensure that files cannot be uploaded to unintended directories (directory traversal)
Try to disable script execution in the upload directory
Ensure that the file extension matches the actual type of the file content
If only images are to be uploaded, consider re-compressing them using a secure library to ensure they are valid
Ensure that uploaded files are specified with the correct Content-type when delivered to the user
Prevent users from uploading problematic file types like HTML, CSS, JavaScript, XML, SVG and executables using a whitelist of allowed file types
Prevent users from uploading special files (e.g. .htaccess, web.config, robots.txt, crossdomain.xml, clientaccesspolicy.xml)
Prevent users from overwriting application files
Consider delivering uploaded files with the “Content-disposition: attachment” header

SQL injection

Use prepared statements to access the database – or –
use stored procedures, accessed using appropriate language/library methods or prepared statements
Always ensure the DB login used by the application has only the rights that are needed

Cross-site scripting (XSS)

Escape anything that is not a constant before including it in a response as close to the output as possible (i.e. right in the line containing the “echo” or “print” call)
If not possible (e.g. when building a larger HTML block), escape when building and indicate the fact that the variable content is pre-escaped and the expected context in the name
Consider the context when escaping: Escaping text inside HTML is different from escaping HTML attribute values, and very different from escaping values inside CSS or JavaScript, or inside HTTP headers.
- This may mean that you need to escape for multiple contexts and/or multiple times. For example, when passing an HTML fragment as a JS constant for later inclusion in the document, you need to escape for JS string inside HTML when writing the constant to the JavaScript source, then escape again for HTML when your script writes the fragment to the document. (See rationale for examples)
- The attacker must not be able to put anything where it is not supposed to be, even if you think it is not exploitable (e.g. because attempts to exploit it result in broken JavaScript).
Explicitly set the correct character set at the beginning of the document (i.e. as early as possible) and/or in the header.
Ensure that URLs provided by the user start with an allowed scheme (whitelisting) to avoid dangerous schemes (e.g. javascript: URLs)
- Do not forget URLs in redirector scripts
A Content Security Policy may be used as an additional security measure, but is not sufficient by itself to prevent attacks.

XML and internal data escaping

Avoid XML if possible.
For XML, use well-tested, high-quality libraries, and pay close attention to the documentation. Know your library – some libraries have functions that allow you to bypass escaping without knowing it.
If you parse (read) XML, ensure your parser does not attempt to load external references (e.g. entities and DTDs).
For other internal representations of data, make sure correct escaping or filtering is applied. Try to use well-tested, high-quality libraries if available, even if it seems to be more difficult.
If escaping is done manually, ensure that it handles null bytes, unexpected charsets, invalid UTF-8 characters etc. in a secure manner.

XML, JSON and general API security

Ensure proper access control to the API
Do not forget that you need to correctly escape all output to prevent XSS attacks, that data formats like XML require special consideration, and that protection against Cross-site request forgery (CSRF) is needed in many cases.
Use standard data formats like JSON with proven libraries, and use them correctly. This will probably take care of all your escaping needs.
Make sure browsers do not misinterpret your document or allow cross-site loading
- Ensure your document is well-formed
- Send the correct content type
- Use the X-Content-Type-Options: nosniff header
- For XML, provide a charset and ensure attackers cannot insert arbitrary tags
- For JSON, ensure the top-level data structure is an object and all characters with special meaning in HTML are escaped

(Un)trusted input

Thoroughly filter/escape any untrusted content
If the allowed character set for certain input fields is limited, check that the input is valid before using it
If in doubt about a certain kind of data (e.g. server variable), treat it as untrusted
If you are sure, but there is no real need to treat it as trusted, treat it as untrusted
The request URL (e.g. in environment variables) is untrusted
Data coming from HTTP headers is untrusted
- Referer
- X-Forwarded-For
- Cookies
- Server name (!)
All POST and GET data is untrusted
- includes non-user-modifiable input fields like select
All content validation is to be done server side

Cross-site request forgery (CSRF)

Include a hidden form field with a random token bound to the user’s session (and preferably the action to be performed), and check this token when the request is received
Make sure the token is non-predictable and cannot be obtained by the attacker
- do not include it in files the attacker could load into his site using <script> tags
Referer checks are not secure, but can be used as an additional measure

Clickjacking

Prevent (i)framing of your application in current browsers by including the HTTP response header “X-Frame-Options: deny”
Prevent (i)framing in outdated browsers by including a JavaScript frame breaker which checks for (i)framing and refuses to show the page if it is detected
For applications with high security requirements where you expect users to use outdated browsers with JavaScript disabled, consider requiring users of older browsers to enable JavaScript

Insecure data transfer

Use SSL/TLS (https) for any and all data transfer
Do not start communicating via http, only redirecting to https when “needed”
Mark cookies with the “secure” attribute
Use the Strict-Transport-Security header where possible
Educate users to visit the https:// URL directly
If your web application performs HTTPS requests, make sure it verifies the certificate and host name
- Consider limiting trusted CAs if connecting to internal servers

Session fixation

Regenerate (change) the session ID as soon as the user logs in (destroying the old session)
Prevent the attacker from making the user use his session by accepting session IDs only from cookies, not from GET or POST parameters (PHP: php.ini setting “session.use_only_cookies”)

Session stealing

Set the “HttpOnly” attribute for session cookies
Generate random session IDs with secure randomness and sufficient length
Do not leak session IDs

Truncation attacks, trimming attacks

Avoid truncating input. Treat overlong input as an error instead.
If truncation is necessary, make sure to check the value after truncation and use only the truncated value
Make sure trimming does not occur or checks are done consistently
Introduce length checks
- care about different lengths due to encoding
Make sure SQL treats truncated queries as errors by setting an appropriate SQL MODE

Password security

Do not store plain-text passwords, store only hashes
Use Argon2, scrypt, bcrypt, or some other secure hashing algorithm specifically designed for secure password "storage".^[1]^[2]
Use per-user salts
Use strengthening (i.e. multi-iteration hashing to slow down brute force attempts)
Limit login attempts per IP (not per user account)
Enforce reasonable, but not too strict, password policies
If a password reset process is implemented, make sure it has adequate security. Questions like “mother’s maiden name” can often be guessed by attackers and are not sufficient.

Comparison issues

Know comparison types in your programming language and use the correct one
When in doubt (especially with PHP), use a strict comparison (PHP: "===")
When comparing strings for equality, make sure you actually check that the strings are equal and not that one string contains the other

PHP-specific issues

Do not use the short form “<?”, always use the full form “<?php”
When using the nginx web server, make sure to correctly follow the official installation instructions and pay attention to the "Pitfalls" page. Beware of tutorials that often contain working but insecure configuration examples.
preg_replace can act as eval() in certain cases. Avoid passing user input to it. If you must, correctly filter and escape it.
Use Suhosin (the extension and, if possible, the patch) and configure it with strict rules
- Enable suhosin.executor.disable_emodifier
- Enable suhosin.executor.disable_eval if possible
- Set suhosin.mail.protect to 2 if possible
When updating PHP to PHP 5.4 from an older version, ensure legacy applications do not rely on magic quotes for security.

Prefetching and Spiders

Use POST requests instead of GETs for anything that triggers an action

Special files

Know the meaning of these files
Ensure robots.txt does not disclose "secret" paths
Ensure crossdomain.xml and clientaccesspolicy.xml do not exist unless needed
If used, ensure crossdomain.xml and clientaccesspolicy.xml allow access from trusted domains only
Prevent users from uploading/changing special files (see file upload vulnerabilities section)

SSL, TLS and HTTPS basics

Follow SSLLabs best practices including:
- Ensure SSLv2 is disabled
- Generate private keys for certificates yourself, do not let your CA do it
- Use an appropriate key length (usually 2048 bit in 2013)
- If possible, disable client-initiated renegotiation
- Consider manually limiting/setting cipher suites

Miscellaneous points

This section contains some general security hints for web applications.

Always remember

Do not rely on Web Application Firewalls for security (however, consider using them to improve security)
If external libraries (e.g. for database access, XML parsing) are used, always use current versions
If you need random numbers, obtain them from a secure/cryptographic random number generator
For every action or retrieval of data, always check access rights
Do not, under any circumstances, attempt to implement cryptographic algorithms yourself. Use high-level libraries for cryptography.
Ensure debug output and error messages do not leak sensitive information
Mark problematic debug output in your code (e.g. //TODO DEBUG REMOVE) even if you intend to remove it after just one test
Do not use “eval()” and similar functions
- Avoid “system()” and similar functions if possible
Ensure database servers are not directly reachable from the outside
Consider blocking old browsers from using your application

Rationale

Web Application Firewalls (WAFs) can prevent existing security holes from being abused. They will make attacking your web application significantly harder and more annoying for the attacker, increasing the probability that a non-determined attacker will move on to a different target. However, they can usually be bypassed by a determined attacker. Your actual defense is to secure your applications. The WAF is there to provide some additional protection against mistakes in doing so. Having a Web Application Firewall does not allow you to skimp on securing your applications. A WAF that is not precisely tuned to an application will often block legitimate requests and pass attacks through/allow bypassing. This is especially true of the often-used free Core Rule Set of mod_security.

Outdated library versions may contain security issues.

If low-quality random numbers are used, for example for the generation of password reset tokens, attackers may be able to guess the value and circumvent security measures.
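
As a minimal sketch of this point, assuming a Python application: the standard library's `secrets` module draws from the operating system's cryptographic random source, whereas the `random` module is predictable and intended for simulations. The function name `new_reset_token` is hypothetical and only serves the illustration.

```python
import secrets


def new_reset_token(num_bytes: int = 32) -> str:
    """Return an unguessable, URL-safe token, e.g. for a password reset link."""
    # secrets uses the OS CSPRNG; the `random` module must not be used here.
    return secrets.token_urlsafe(num_bytes)


token = new_reset_token()
```

With 32 random bytes, guessing a valid token by brute force is not practical; the token should still be stored server-side and expire after use.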

Not checking access rights at every step leads to significant vulnerabilities, for example users being able to look at data for which they have no permission (e.g. membership database supposed to show a logged-in member his information – changing the ID in the URL gives information about other members due to missing check).
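
The membership example can be sketched as follows. This is an illustration under assumptions, not a complete access control layer: `MEMBERS` is a hypothetical in-memory stand-in for the database, and `session_user_id` is assumed to come from the server-side session rather than from the request.

```python
class AccessDenied(Exception):
    """Raised when the logged-in user may not see the requested record."""


# Hypothetical in-memory stand-in for a membership database.
MEMBERS = {
    1: {"name": "Alice", "email": "alice@example.org"},
    2: {"name": "Bob", "email": "bob@example.org"},
}


def get_member_record(session_user_id: int, requested_id: int) -> dict:
    """Return a member record only if it belongs to the logged-in user."""
    # The ID from the URL is attacker-controlled, so it is compared with the
    # identity stored in the server-side session on every single request.
    if requested_id != session_user_id:
        raise AccessDenied(f"user {session_user_id} may not read record {requested_id}")
    return MEMBERS[requested_id]
```

Changing the ID in the URL then raises `AccessDenied` instead of returning another member's data.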

Cryptography is extremely complicated and mistakes are hard to avoid or discover even for cryptography experts. Secure ciphers are developed over months of work by multiple experts and reviewed by hundreds of them. Do not try to invent a secure cipher. Do not attempt to implement existing ciphers, either: mistakes can go unnoticed and make your result insecure. Use existing, reliable libraries.

Debug output and error messages can give attackers valuable information. Notably, there have been multiple instances where debug output of the following form compromised security: “Provided token 1234 was invalid, expected value 5678” (the attacker gets the correct answer which he just needs to supply in his next attempt). For production versions, displaying of error messages should usually be suppressed. Consider replacing HTTP error pages to hide even basic information like paths.
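
One way to avoid the leak described above is to keep details in a server-side log and return only a generic message. The sketch below assumes Python; the function name `check_token` and the logger name are hypothetical.

```python
import hmac
import logging
from typing import Tuple

logger = logging.getLogger("app.security")


def check_token(provided: str, expected: str) -> Tuple[bool, str]:
    """Validate a token and return (ok, message_for_the_user)."""
    if hmac.compare_digest(provided.encode(), expected.encode()):
        return True, "Token accepted."
    # Details go to the server-side log only; the expected value is
    # neither logged nor echoed back to the client.
    logger.warning("Invalid token submitted (length %d)", len(provided))
    return False, "The provided token was invalid."
```

The message shown to the user never contains the expected value, so a failed attempt gives the attacker nothing to reuse.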

Marking any debug output that is supposed to be removed ensures that you cannot forget removing it – just search for “TODO” and “REMOVE” before release. Make it a habit to mark it always, even if you intend to remove it “immediately”. You can always get distracted and forget.

Using dynamic code via “eval()” and similar functions is usually unnecessary, and small mistakes tend to cause code injection issues. Therefore, these dangerous functions are to be avoided. The same applies to “system()”, although it cannot always be avoided. If it is used, input to “system()” has to be correctly escaped, using existing shell-escape functions or a function that automatically escapes the parameters.
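
In Python, the equivalent advice might look like the sketch below: pass arguments as a list so that no shell parses them, and fall back to `shlex.quote` only when a shell command string cannot be avoided. The helper `echo_argument` is hypothetical; it starts a child Python interpreter purely to show that the input arrives unchanged.

```python
import shlex
import subprocess
import sys


def echo_argument(user_input: str) -> str:
    """Pass untrusted text to a child process as a single argument."""
    # With an argument list and no shell, the input is never parsed as
    # shell syntax, so characters such as ';' or '$(...)' stay literal.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", user_input],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")


# If a shell command string is truly unavoidable, quote each argument.
quoted = shlex.quote("report; rm -rf /")
```

The list form is preferable because it removes the shell from the picture entirely, rather than relying on quoting being applied everywhere.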

Keeping database servers unreachable from the outside, e.g. by binding them to 127.0.0.1 if they run on the same machine as the web application or by using firewalls with IP white lists, prevents attackers from using stolen database passwords to actually access the database.

Browsers that are no longer supported by the vendor tend to have critical security issues. They are a sign of a badly maintained client that is very prone to malware attack and they often lack security features relevant to web applications. Blocking can be done using the User-Agent header or for IE using conditional comments. Blocking outdated browsers can force clients to use a secure browser; however, it can prevent people who can’t update their browser from using the application. Unless you know that all clients have IE8 or newer, blocking anything newer than IE6 is not advised. Note that the Mozilla Firefox 3.6.x branch is still supported (as of September 2011), while the Firefox 4 branch is not. Obviously, intentional blocking of older browsers is mainly relevant to web applications that require very high security.

File inclusion and disclosure

If the names of files that are to be included or sent in response to a request are coming from user input (e.g. in menu systems or download scripts), attackers may be able to request files that they are not supposed to. If user-supplied names are used for inclusions, this can even lead to code execution on the server.

To prevent this type of attack

Do not take file names for inclusions from user input, only from trusted lists or constants.
- If user input is to be used, validate it against a whitelist. Checking if the file exists or if the input matches a certain format is not sufficient.
Avoid having scripts read and pass through files if possible.
If you read and deliver files using user-supplied file names, thoroughly validate the file names to avoid directory traversal and similar attacks and ensure the user is allowed to read the file.
Ensure the application runs with no more privileges than required.

Rationale

If the attacker is able to upload a script file and get a part of the application to include it, he can execute arbitrary code. As this poses an extreme risk, it has to be carefully avoided. Thus, only the strictest kind of verification (checking against a whitelist) is appropriate for this task. If files are to be offered for download, simply putting them in a directory and letting the web server handle the rest is often the best choice: Not only is it faster than having a script read the file; it also avoids risky interpretation of user-supplied file names. In some cases, delivering files through a script is unavoidable, e.g. if a script is needed to enforce that only logged-in users can download files or to set special headers (see next section).

In that case, make sure you correctly perform the access checks that make the script-based approach necessary and that you thoroughly validate the file names to stop the attacker from downloading files he is not supposed to download. In particular, make sure that the attacker cannot specify files other than those intended, especially not files outside of the intended directory, e.g. by using the “..” pseudo-directory name. (Note that a “../” can be encoded in many ways!)

Running the application with limited privileges (usually done by limiting the privileges of the web server or script interpreter) limits the impact of such (and other) issues.
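
The file name validation described in this section can be sketched as below. It assumes Python, POSIX-style paths, and a name that the framework has already URL-decoded exactly once; `resolve_download` and `InvalidPath` are hypothetical names. Access checks for the logged-in user would still have to happen separately.

```python
import os


class InvalidPath(Exception):
    """Raised when a requested file lies outside the download directory."""


def resolve_download(base_dir: str, user_name: str) -> str:
    """Map a user-supplied name to a path guaranteed to be inside base_dir."""
    base = os.path.realpath(base_dir)
    # realpath collapses '..' components and follows symlinks, so the check
    # below is made on the path the file system would actually open.
    candidate = os.path.realpath(os.path.join(base, user_name))
    if os.path.commonpath([base, candidate]) != base or candidate == base:
        raise InvalidPath(user_name)
    return candidate
```

Checking the resolved path, rather than searching the input for “../”, avoids having to anticipate every way the sequence can be encoded.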

File upload vulnerabilities

Web servers apply specific criteria (e.g. file extension) to decide how to process a file. If an application allows file uploads (e.g. for profile pictures, attached documents), ensure that the uploaded files cannot be interpreted as script files by the web server. Otherwise, the attacker may upload a script in your application’s programming language and run the arbitrary code contained therein by requesting the uploaded file.

Additionally, an attacker could upload custom HTML or JavaScript files and direct a victim to them. Since they come from a directory inside your application, this can be used to subvert the same-origin policy protection by the victim’s browser, for example to steal cookies. Some broken browsers (notably Internet Explorer) ignore the MIME type of files in some cases and detect the file type based on the file content.

To prevent this type of attack

Avoid unnecessary file uploads
Ensure that files uploaded by the user cannot be interpreted as script files by the web server, e.g. by checking the file extension (or whatever means your web server uses to identify script files)
Ensure that files cannot be uploaded to unintended directories (directory traversal)
Try to disable script execution in the upload directory
Ensure that the file extension matches the actual type of the file content
If only images are to be uploaded, consider re-compressing them using a secure library to ensure they are valid
Ensure that uploaded files are specified with the correct Content-type when delivered to the user
Prevent users from uploading problematic file types like HTML, CSS, JavaScript, XML, SVG and executables using a whitelist of allowed file types
Prevent users from uploading special files (e.g. .htaccess, web.config, robots.txt, crossdomain.xml, clientaccesspolicy.xml)
Prevent users from overwriting application files
Consider delivering uploaded files with the “Content-disposition: attachment” header

Rationale

File upload facilities are hard to protect correctly. If they are provided to support “gimmick” functions, they may not be worth the risk.

It is crucial that the web server will not attempt to interpret the uploaded files as scripts, as this could result in arbitrary code execution. Make sure to use the same method as your web server for deciding whether a file will be interpreted as a script or not.

Directory traversal attacks could allow an attacker to overwrite application or server files. Preventing these is also necessary to ensure that disabling script execution for the upload directory is actually effective.

Disabling script execution ensures that if attackers manage to upload a script file, it will still not be executed. However, this should not be relied upon: If the application gets transferred to a different server, the setting could get lost.

Mismatched file names/extensions can be used to upload forbidden data types (e.g. HTML, XML, SVG - see below). Even if the server sets the Content-type according to the extensions, some browsers may ignore this, analyze the file contents (MIME sniffing) and parse the file as HTML.

Re-compressing images ensures that any malicious content is destroyed. However, the image processing library needs to be secure, as it is exposed to user content and could be attacked using e.g. buffer overflow exploits.

Specifying the correct Content-type when delivering the files ensures that the file will be handled correctly by most browsers. This is required for correct functionality, but also relevant for security as incorrect handling of the file could lead to MIME sniffing, resulting in security issues.

User-uploaded HTML, CSS, JavaScript and similar files can contain scripts that run in the origin of the web site and thus are allowed to access cookies or web site content. XML and SVG files are often overlooked, but can also execute scripts. This can lead to various attacks like session stealing, CSRF etc. Executables can be dangerous to the user and should therefore be blocked. A whitelist should be used as creating a reliable and complete list of dangerous extensions is not possible. ZIP files can be dangerous for outdated browsers (notably Firefox 2.x). Note that various files are technically also ZIP files, notably documents from OpenOffice (e.g. odt, ods) and Microsoft Office 2007 and newer (e.g. docx, xlsx).

Special files like .htaccess, web.config, robots.txt, crossdomain.xml and clientaccesspolicy.xml could allow attackers to change security settings (.htaccess, web.config), cause load (robots.txt) or allow cross-site scripting/cross-site request forgery attacks using plugins (crossdomain.xml and clientaccesspolicy.xml). Note that crossdomain.xml files are also valid if they appear in subdirectories.

Allowing the user to overwrite files belonging to the application can not only damage the application, but also allow other attacks, e.g. make code execution possible or enable the attacker to change critical settings.

The Content-disposition: attachment header forces browsers to save the file instead of immediately opening it, thus reducing the risk for some of the attacks. Note that this can significantly annoy the users and is not possible in all situations.
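
Several of the upload rules above can be combined in a small sketch. It assumes Python and a hypothetical whitelist `ALLOWED_TYPES`; the names `plan_upload` and `UploadRejected` are illustrative only. It covers the extension whitelist, special files, and overwriting, but not the check that the content matches the extension.

```python
import os
import secrets

# Hypothetical whitelist: extension -> Content-type to send on delivery.
ALLOWED_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".gif": "image/gif",
    ".pdf": "application/pdf",
}


class UploadRejected(Exception):
    """Raised when an uploaded file name is not acceptable."""


def plan_upload(client_name: str) -> tuple:
    """Return (stored_name, content_type) for an upload, or raise UploadRejected."""
    # Only the final extension of the client-supplied name is used; nothing
    # else from that name reaches the file system.
    ext = os.path.splitext(client_name)[1].lower()
    if ext not in ALLOWED_TYPES:
        raise UploadRejected(client_name)
    # A random server-chosen name cannot collide with application files,
    # contain '../', or be a special file such as .htaccess.
    stored_name = secrets.token_hex(16) + ext
    return stored_name, ALLOWED_TYPES[ext]
```

Generating the stored name on the server addresses directory traversal, overwriting, and special files in one step; the original name can be kept in the database if it needs to be shown to users.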

The following resources provide additional information on this topic:

Guide about MIME sniffing: http://h-online.com/-746229
MediaWiki resources about upload protection:

SQL injection

An SQL injection vulnerability occurs if user input included in database queries is not escaped correctly. This type of vulnerability allows attackers to change database queries, which can allow them to obtain or modify database contents.

To prevent this type of attack

Use prepared statements to access the database – or –
use stored procedures, accessed using appropriate language/library methods or prepared statements
Always ensure the DB login used by the application has only the rights that are needed

Rationale

Escaping input manually is error-prone and can be forgotten. With prepared statements, the correct escaping is automatically applied. This also avoids issues with different input interpretation (charset, null byte handling etc.) which can lead to hard-to-find vulnerabilities. Using a database login with limited access rights limits the impact of successful attacks.
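
A minimal sketch of a parameterized query, assuming Python's built-in `sqlite3` module and a hypothetical `pages` table; other database drivers offer the same mechanism with their own placeholder syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pages (title TEXT, body TEXT)")
conn.execute("INSERT INTO pages VALUES (?, ?)", ("Main", "Welcome"))


def find_page(title: str) -> list:
    """Look up a page by title using a parameterized query."""
    # The '?' placeholder sends the value separately from the SQL text,
    # so quotes or UNION clauses in `title` are treated as plain data.
    cur = conn.execute("SELECT title, body FROM pages WHERE title = ?", (title,))
    return cur.fetchall()
```

Input such as `' OR '1'='1` simply matches no title here, because it is never interpreted as SQL.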

Exploitation

SQL injection can compromise any information in the database and even lead to full system compromise. It can be used to add PHP, HTML, and JavaScript code to web pages and create files. Arbitrary content added to the website can be used for malicious attacks against users and to gain shell access to the server.

Example

If the input for the title of the page on this website were vulnerable to SQL injection then the URL that would be used for the attack is https://en.wikibooks.org/w/index.php?title=. A simple test to reveal if the input is vulnerable would be to add https://en.wikibooks.org/w/index.php?title=' because this SQL syntax would break the query and show an SQL error on the page. The next query could be to select usernames and hashed passwords with something like https://en.wikibooks.org/w/index.php?title=1%20UNION%20ALL%20SELECT%20user_pass%20FROM%20wiki_user;--. The ;-- on the end ends the query and makes the remaining query a comment. Files containing password salts could be dumped to allow an attacker to begin cracking passwords and gain access to administrator accounts using the select load_file() query. A query like this one could be used to gain shell access to the server: https://en.wikibooks.org/w/index.php?title=UNION%20SELECT%20<? system($_REQUEST['cmd']); ?>,2,3%20INTO%20OUTFILE%20"shell.php";--

Cross-site scripting (XSS)

XSS vulnerabilities occur if user input included in the output of a web application is not escaped correctly. This type of vulnerability allows attackers to inject content into the web application output. This can be used to inject a false login form (reporting the input to an attacker) or malicious JavaScript code which can steal cookies and information or execute actions using the user’s permissions. XSS vulnerabilities are separated into two main categories, reflected (non-persistent) and persistent vulnerabilities.

Reflected XSS vulnerabilities include the user input only in the output directly following the request. Thus, the attacker needs the user to follow a malicious link or make a malicious POST request. The former can be done by including the link as an IFRAME; the latter can be done using JavaScript. Both approaches require that the user visits a malicious or compromised site, but they do not necessarily require user interaction.

Persistent XSS vulnerabilities store the user input and include it in later outputs (e.g. a posting in a forum). This means that the users do not need to visit a malicious/compromised site.

To prevent this type of attack

Escape anything that is not a constant before including it in a response as close to the output as possible (i.e. right in the line containing the “echo” or “print” call)
If not possible (e.g. when building a larger HTML block), escape when building and indicate the fact that the variable content is pre-escaped and the expected context in the name
Consider the context when escaping: Escaping text inside HTML is different from escaping HTML attribute values, and very different from escaping values inside CSS or JavaScript, or inside HTTP headers.
- This may mean that you need to escape for multiple contexts and/or multiple times. For example, when passing an HTML fragment as a JS constant for later inclusion in the document, you need to escape for JS string inside HTML when writing the constant to the JavaScript source, then escape again for HTML when your script writes the fragment to the document. (See rationale for examples)
- The attacker must not be able to put anything where it is not supposed to be, even if you think it is not exploitable (e.g. because attempts to exploit it result in broken JavaScript).
Explicitly set the correct character set at the beginning of the document (i.e. as early as possible) and/or in the header.
Ensure that URLs provided by the user start with an allowed scheme (whitelisting) to avoid dangerous schemes (e.g. javascript: URLs)
- Do not forget URLs in redirector scripts
A Content Security Policy may be used as an additional security measure, but is not sufficient by itself to prevent attacks.

Rationale

Escaping data directly at the output location makes it easier to check that all outputs are escaped – each and every variable used as a parameter for an output method must either be marked as pre-escaped or be wrapped in a corresponding escape command.

Different contexts require completely different escaping rules. A “)” character with no dangerous meaning in HTML and HTML attributes can signify the end of a URL path in CSS. See the example at the bottom for a complex but common case where HTML and JavaScript are used together and create countless opportunities for XSS. Note that many simple XSS attempts are "accidentally" blocked even by the wrong escaping (e.g. HTML escaping mangles quotes required for a JavaScript string injection, or newlines creating invalid JavaScript in case of injection attempts). Do NOT rely on this. The attacker may know a trick you are not thinking about. If it is possible to place anything in a place of the document structure where it is not supposed to go (e.g. outside a JavaScript string literal), it is a security issue that must be fixed. It might not be exploitable - or you may simply not be seeing the way to exploit it. Don't take that risk!
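
To illustrate how escaping differs by context, here is a sketch with two helpers, assuming Python's standard `html` and `json` modules. The function names are hypothetical, and the JavaScript helper covers only string literals inside a script block, not other JavaScript, CSS, or header contexts.

```python
import html
import json


def escape_html(value: str) -> str:
    """Escape for HTML text and quoted attribute values."""
    return html.escape(value, quote=True)


def js_string_literal(value: str) -> str:
    """Return a quoted JS string literal that is safe inside a <script> block."""
    # json.dumps handles quotes, backslashes and newlines; the extra
    # replacements keep '</script>' and '<!--' from ending the script element.
    literal = json.dumps(value)
    for char in "<>&":
        literal = literal.replace(char, "\\u%04x" % ord(char))
    return literal
```

Neither helper can stand in for the other: HTML escaping does not make a value safe inside JavaScript, and a value written to the page by a script later still needs HTML escaping at that point.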

Not setting the character set may lead to guessing by the browser. Such guessing can be exploited to pass a string that seems harmless in your intended encoding, but is interpreted as a script tag in the encoding assumed by the browser. For HTML5, use <meta charset="utf-8" /> as the first element in the head section.

URLs can be dangerous, too. User-provided links should be checked against a scheme whitelist, as the javascript scheme is not the only dangerous one. Other schemes can trigger possibly unwanted action. If only web links are to be allowed, require the URLs to start with “http://” or “https://”.
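
A scheme whitelist check might look like the following sketch, assuming Python's `urllib.parse`; the name `is_allowed_link` and the set `ALLOWED_SCHEMES` are hypothetical.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def is_allowed_link(url: str) -> bool:
    """Accept only absolute URLs whose scheme is on the whitelist."""
    try:
        parts = urlsplit(url.strip())
    except ValueError:
        return False
    # A host is required as well, so scheme-only oddities are rejected.
    return parts.scheme.lower() in ALLOWED_SCHEMES and bool(parts.netloc)
```

The same check applies to redirect targets; a redirector would additionally want to restrict the allowed hosts, which this sketch does not do.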

A Content Security Policy can prevent certain kinds of injection. Only some browsers support it; others simply ignore it. It is a powerful secondary defense to limit the impact of security issues, but cannot be used as the primary way to prevent XSS - the primary way to prevent XSS is correct escaping, which will not only prevent XSS, but also ensure that your page displays correctly even in the presence of uncommon input. Implementing a CSP may require significant changes to your code. Notably, you cannot include any inline JavaScript (unless you explicitly allow inline JS in your CSP - which removes most of the protection CSPs provide).

Complex XSS example with JS inside HTML

Often overlooked issues include the complex interaction between HTML and JavaScript. An often-used construct is something like this:

<script>
  var CURRENT_VALUE = 'test';
  document.getElementById("valueBox").innerHTML = CURRENT_VALUE; // INSECURE CODE - DO NOT USE.
</script>

The content of CURRENT_VALUE (in this example, the word test) is inserted into the page source dynamically by the server according to e.g. user input or a value from a database. The second line, which actually writes the data to the document, is often part of a script included from a file. There are many different ways to perform XSS attacks against such a construct, unless proper escaping is used in every step. In our examples, the attacker wants to execute the code alert(1);.

First, if proper escaping for JavaScript is missing, the attacker can simply provide the appropriate quote symbol to terminate the string, a semicolon, his code, and then comment out the rest of the line. For example, the attacker could provide the value ';alert(1);//, resulting in the following HTML code, executing his code:

<script>
  var CURRENT_VALUE = '';alert(1);//';
  document.getElementById("valueBox").innerHTML = CURRENT_VALUE;
</script>

Note that this will work even if the value is escaped using an HTML-escaping function like htmlspecialchars(), if that function doesn't touch the single quote used in this example.

Assuming the attacker cannot use the appropriate quote, because it is filtered, he can use the value </script><script>alert(1);</script>. Inside a regular JavaScript file, the resulting line would not immediately cause a problem (though assigning it to innerHTML would), since the following is a perfectly safe variable assignment:

var CURRENT_VALUE = '</script><script>alert(1);</script>';

Since, however, this appears in an inline script block, the HTML parser will interpret the "script-end" tag, resulting in a broken piece of JavaScript, followed by a second script block containing the attacker's code, some text, and a spurious script-end tag:

<script>
  var CURRENT_VALUE = '</script><script>alert(1);</script>';
  document.getElementById("valueBox").innerHTML = CURRENT_VALUE;
</script>

Or, reindented for clarity:

<script>
  var CURRENT_VALUE = '
</script>
<script>alert(1);</script>
'; document.getElementById("valueBox").innerHTML = CURRENT_VALUE;
</script>

The attacker can also simply break the JavaScript by inserting a backslash at the end of the string, thus escaping the quote at the end:

var CURRENT_VALUE = 'text\';

A simple newline anywhere in the string will also cause a syntax error (unterminated string literal). While these attacks do not allow direct XSS in this example, they may break critical security features, render the site unusable (Denial of Service), or allow XSS if another value can be manipulated - here the attacker supplies text\ and ;alert(1);' to a variant of this construct that passes two values:

var CURRENT_VALUE1 = 'text\'; var CURRENT_VALUE2 = ';alert(1);'';

Since the string-ending quote was escaped, the quote that is supposed to start the second string instead closes the first, turning the remaining content into JavaScript. This brings us to the statement above: If it is possible to place anything in a place of the document structure where it is not supposed to go (e.g. outside a JavaScript string literal), it is a security issue that must be fixed. It might not be exploitable - or you may simply not be seeing the way to exploit it. Don't take that risk!

These are only issues with the first line in our example. The second line directly inserts the value into the document as HTML, thus allowing XSS. To exploit this, the attacker must avoid the script end tag due to the issue mentioned above, so he uses a non-existing image with an error handler. His input <img src=1 onerror=alert(1)> results in:

<script>
  var CURRENT_VALUE = '<img src=1 onerror=alert(1)>';
  document.getElementById("valueBox").innerHTML = CURRENT_VALUE;
</script>

The innerHTML assignment puts the image tag into the document, and since "1" is not a valid URL, the error handler is executed. Note that this is not perfectly valid HTML, since the quotes around the attributes are missing. It is still valid enough to work, and avoids the quotes being mangled due to escaping.

Simply HTML-escaping the output value using functions like htmlspecialchars() on the server side (when writing it to the variable assignment line) will prevent some of these attacks and might make others unexploitable or harder to exploit. However, it is incorrect and dangerous, and it leaves other means of attack open!

Most notably, the attacker might decide to do what you should have done, and properly escape his attack sequence for you. This will leave the backslash \ as the only special character, giving an input like \u003Cimg src=1 onerror=alert(1)\u003E (note that any remaining character, i.e. the spaces, braces, equals signs and letters could also be escaped). This will be unharmed by your escape function, resulting in the following code:

<script>
  var CURRENT_VALUE = '\u003Cimg src=1 onerror=alert(1)\u003E';
  document.getElementById("valueBox").innerHTML = CURRENT_VALUE;
</script>

The JavaScript parser will interpret the escape sequences and insert the XSS code into your document.

There are two correct ways to escape in this situation:

Method 1 - JS escaping server side, HTML escaping client side (recommended)
- On the server, properly (see below) escape the value using JavaScript escape values.
- In the client-side JavaScript, ensure your code escapes the text before inserting it into the document, using e.g. the .text() setter of jQuery.

Method 2 - HTML escaping, then JS escaping, both server side (not recommended)
- On the server, first escape the value for HTML
- On the server, then properly (see below) escape the value using JavaScript escape values before inserting it into the document.

Method 2 allows you to deliver server-generated custom HTML to the client. You need to escape the HTML like any other HTML output (e.g. using htmlspecialchars in PHP). The escaped content then gets passed to the client side, which directly dumps it into the document. This means the client side cannot use the text for any non-HTML context, and attempting to do so may lead to a security issue. As you can see, the escaping is done in reverse order: The format that gets interpreted last (HTML, in this case) gets escaped first, then the entire string is "wrapped" by escaping in the outer format.

The recommended approach is to keep text unescaped until it is ready for output, then escape right before it is output (i.e. when the context is known). Consistently following this approach will also avoid double-encoding (i.e. showing your users HTML entities like &amp; in the text).

How to properly escape for JavaScript inside HTML: Ensure that characters like < which have no special meaning in JavaScript but do have a special meaning in HTML also get escaped. Do not write your own escaping routines, you will most likely miss something. Use existing libraries. For current versions of PHP, you may want to consider using json_encode() with the additional flags set:

...
<script>
  var CURRENT_VALUE = <?php echo json_encode($text,
        JSON_HEX_QUOT | JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS); ?>;
  $("#valueBox").text(CURRENT_VALUE);
</script>
...

The text will now be correctly rendered, even if it includes weird special characters.
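If the server side is written in JavaScript (Node.js) rather than PHP, a roughly comparable effect can be sketched on top of the built-in JSON.stringify(). The helper name jsLiteralForInlineScript is hypothetical, and the list of replaced characters is an assumption modelled on the JSON_HEX_* flags above; a well-tested library should be preferred where one is available.

```javascript
// Hypothetical helper: turn a value into a JavaScript string literal that is
// safe to place inside an inline <script> block.
function jsLiteralForInlineScript(value) {
  return JSON.stringify(String(value)) // handles quotes, backslashes, newlines
    .replace(/</g, "\\u003C")          // prevents "</script>" from ending the block
    .replace(/>/g, "\\u003E")
    .replace(/&/g, "\\u0026")
    .replace(/'/g, "\\u0027")
    .replace(/\u2028/g, "\\u2028")     // line separators break older JS parsers
    .replace(/\u2029/g, "\\u2029");
}

const attack = "</script><script>alert(1);</script>";
console.log("var CURRENT_VALUE = " + jsLiteralForInlineScript(attack) + ";");
```

The result is still plain text once the browser has parsed it, so the client side must continue to insert it with a text setter and not with innerHTML.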

XML and internal data escaping

Escaping is required in internal data representations, too. For example, incorrectly escaped strings in XML could allow attackers to close the enclosing tag and inject arbitrary XML.

XML is a very complex format which can bear many unpleasant surprises.

To prevent this type of attack

Avoid XML if possible.
For XML, use well-tested, high-quality libraries, and pay close attention to the documentation. Know your library – some libraries have functions that allow you to bypass escaping without knowing it.
If you parse (read) XML, ensure your parser does not attempt to load external references (e.g. entities and DTDs).
For other internal representations of data, make sure correct escaping or filtering is applied. Try to use well-tested, high-quality libraries if available, even if it seems to be more difficult.
If escaping is done manually, ensure that it handles null bytes, unexpected charsets, invalid UTF-8 characters etc. in a secure manner.

Rationale

XML is a highly complex format with many surprising features - did you know that XML can load other content via HTTP? If you just want to store/pass a few structured values, the powerful features of XML are often unnecessary. JSON is a less complex alternative, but requires its own safety measures (like avoiding arrays at top level and hex-encoding special characters that may be interpreted by broken browsers).

XML is too complex to “just quickly” write code that handles all possibilities correctly and safely. Do not rely on the security of “home-made” minimal libraries. Even some “official” XML libraries are known to have escaping issues in some functions or to explicitly allow content to be passed into the XML without escaping. (Notably the addChild method in PHP’s SimpleXML does partial escaping, see comments for PHP bug 36795) Libraries can contain critical issues, too. Read the documentation of your library carefully and consider searching the internet for known issues. If you are not sure, quickly test at least some basic cases.

XML has features that allow loading of external data like entities and DTDs. Some parsers enable this by default. If you parse untrusted XML files (remember, everything that comes from a user is untrusted), this may be used to read local files, make requests to internal systems not accessible from outside the firewall, and in some cases, even execute code. See OWASP article for details.

Doing escaping manually is very difficult to do correctly, as all problematic cases (e.g. partial UTF8 characters or different charsets) need to be considered. Writing a solution that works correctly with regular input may be fast and easy, but writing a solution that works correctly with any intentionally malformed input is difficult.

XML, JSON and general API security

APIs can provide additional security challenges. At the same time, basic security rules (like output escaping) must not be overlooked.

To prevent this type of attack

Ensure proper access control to the API
Do not forget that you need to correctly escape all output to prevent XSS attacks, that data formats like XML require special consideration, and that protection against Cross-site request forgery (CSRF) is needed in many cases.
Use standard data formats like JSON with proven libraries, and use them correctly. This will probably take care of all your escaping needs.
Make sure browsers do not misinterpret your document or allow cross-site loading
- Ensure your document is well-formed
- Send the correct content type
- Use the X-Content-Type-Options: nosniff header
- For XML, provide a charset and ensure attackers cannot insert arbitrary tags
- For JSON, ensure the top-level data structure is an object and all characters with special meaning in HTML are escaped

Rationale

Certain actions are often restricted to users with appropriate privileges. However, some developers forget to properly restrict their API, thus allowing users without proper privileges to perform these actions. Ensure that the API properly enforces access controls. Remember that you still need CSRF protection! A separate client can easily fetch a token (but will need the user's credentials to do so), while a malicious JavaScript can't (due to the same-origin policy).

Even if your application is not displaying the API output, the attacker may use it for XSS attacks by directly linking to it. For this reason, you must follow proper escaping rules and keep browsers from misinterpreting your output.

If you use standard data formats like JSON, you can use standard libraries which have been thoroughly checked by many professionals. This will make it easier for you to correctly escape content, and save you a lot of time (and potential security issues).

Certain browsers love to interpret anything that looks like it may be HTML as HTML. This is especially true for XML documents (which may also represent other script-bearing formats like SVG). Sending a well-formed document and setting the correct content type makes it less probable that browsers will start guessing. The X-Content-Type-Options: nosniff header will stop browsers from attempting to guess the content type (most importantly, it will disable the aggressive guessing in Internet Explorer).

Providing the correct charset in XML is important because different charsets can cause vastly different interpretations of the data. For example, what is harmless text in UTF-8 or other common charsets can turn into a script tag in UTF-7.

JSON uses JavaScript syntax and could possibly be loaded across domain boundaries using <script> tags. Together with creative modification of the Array prototype, this can give access to the data (bypassing the same-origin policy) in outdated browsers. Passing an object instead of an array prevents this (as of 2013).

Escaping special characters in JSON is recommended to avoid content sniffing. In PHP, it can be done by passing the JSON_HEX_TAG flag to json_encode.
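The JSON-related points above can be combined in a small sketch for a JavaScript (Node.js) backend. The function name buildJsonResponse and the wrapper key data are hypothetical; the function only prepares headers and body and leaves sending them to whatever HTTP framework is in use.

```javascript
// Hypothetical helper: prepare a JSON API response that browsers are
// unlikely to misinterpret as HTML or load as a script.
function buildJsonResponse(payload) {
  // Wrapping the payload guarantees that the top level is an object,
  // even if the payload itself is an array.
  const body = JSON.stringify({ data: payload })
    .replace(/</g, "\\u003C") // hex-encode characters with meaning in HTML
    .replace(/>/g, "\\u003E")
    .replace(/&/g, "\\u0026");
  return {
    headers: {
      "Content-Type": "application/json; charset=utf-8",
      "X-Content-Type-Options": "nosniff",
    },
    body,
  };
}

console.log(buildJsonResponse(["<b>bold</b>", 1]).body);
```

Clients parsing the body with a normal JSON parser receive the original characters back, because the hex escapes are valid JSON.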

(Un)trusted input

All user input is to be considered untrusted. Seemingly “trusted/safe” input, like some $_SERVER variables in PHP, can be easily manipulated by attackers.

To prevent this type of attack

Thoroughly filter/escape any untrusted content
If the allowed character set for certain input fields is limited, check that the input is valid before using it
If in doubt about a certain kind of data (e.g. server variable), treat it as untrusted
If you are sure, but there is no real need to treat it as trusted, treat it as untrusted
The request URL (e.g. in environment variables) is untrusted
Data coming from HTTP headers is untrusted
- Referer
- X-Forwarded-For
- Cookies
- Server name (!)
All POST and GET data is untrusted
- includes non-user-modifiable input fields like select
All content validation is to be done server side

Rationale

Escaping or filtering “trusted” input that should not contain any characters that require escaping will only give you a negligible performance penalty, but you will be on the safe side if the input turns out to be untrusted.

Validating input data using a character whitelist can avoid attacks using unexpected characters (null bytes, UTF-8, control characters used as delimiters in internal representations etc.). Ensure your validation is not too strict, for example you will need to allow both UTF-8 and characters like ' in person name fields.

An attacker is not constrained by the constraints a browser puts on him. Just because an input field is specified with maxlength=20 does not mean that an attacker cannot craft a request with 200 KB of data. The same goes for any JavaScript based constraints.
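A minimal sketch of such server-side validation in JavaScript (Node.js) is shown below. The field names, the option list ALLOWED_COUNTRIES and the name pattern are assumptions chosen for illustration; a real application has to pick its own limits.

```javascript
// Hypothetical options of a <select> field; the browser restricts the
// choice, but an attacker can send any value, so the server checks again.
const ALLOWED_COUNTRIES = new Set(["de", "fr", "us"]);

// Letters of any script, combining marks, apostrophe, space, dot and hyphen,
// at most 100 characters. Control characters and null bytes are excluded.
const NAME_PATTERN = /^[\p{L}\p{M}' .-]{1,100}$/u;

function validateSignup(form) {
  const errors = [];
  if (typeof form.name !== "string" || !NAME_PATTERN.test(form.name)) {
    errors.push("name");
  }
  if (typeof form.country !== "string" || !ALLOWED_COUNTRIES.has(form.country)) {
    errors.push("country");
  }
  return errors; // an empty list means the input passed validation
}

console.log(validateSignup({ name: "O'Brien", country: "de" }));
console.log(validateSignup({ name: "x".repeat(200000), country: "zz" }));
```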

Cross-site request forgery (CSRF)

Cross-site request forgery occurs if a third-party web site causes the browser of the logged-in user to make a request to your service. With GET forms, this can be done using IFRAMEs or IMG tags. With POST forms, this is possible using a FORM element with the action attribute pointed to your site, possibly submitted using JavaScript. Both methods require no user interaction. The browser automatically submits the session cookie of the user. This can allow an attacker to trigger unwanted action with the permissions of the logged-in user.

To prevent this type of attack

Include a hidden form field with a random token bound to the user’s session (and preferably the action to be performed), and check this token when the form is submitted
Make sure the token is non-predictable and cannot be obtained by the attacker
- do not include it in files the attacker could load into his site using <script> tags
Referer checks are not secure, but can be used as an additional measure

Rationale

CSRF attacks allow attackers to abuse existing user sessions. The same-origin policy of web browsers prevents the attacking web site from reading the content (and thus the token) of the targeted site. As the token is bound to the session, the attacker cannot gain the token by simply visiting the web site himself. The token needs to be non-predictable (secure randomness), as otherwise the attacker could simply guess it.

Referer checks are unreliable, as some user agents do not send the header and some personal firewalls filter or falsify it for privacy reasons. Additionally the attacker can avoid sending a Referer, for example (tested with IE8 and Firefox 6) simply by setting window.location using JavaScript.
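The token mechanism can be sketched as follows in JavaScript (Node.js). The session parameter stands for a hypothetical server-side session object, and the function names are made up for this example; only the crypto calls are part of Node.js itself.

```javascript
const crypto = require("crypto");

// Create an unpredictable token and remember it in the user's session.
// The returned value goes into a hidden form field.
function issueCsrfToken(session) {
  session.csrfToken = crypto.randomBytes(32).toString("hex");
  return session.csrfToken;
}

// Compare the submitted token with the one stored in the session.
function isValidCsrfToken(session, submitted) {
  const expected = session.csrfToken;
  if (typeof expected !== "string" || typeof submitted !== "string") return false;
  const a = Buffer.from(expected, "utf8");
  const b = Buffer.from(submitted, "utf8");
  // timingSafeEqual throws on different lengths, so check the length first.
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}

const session = {};
const token = issueCsrfToken(session);
console.log(isValidCsrfToken(session, token));     // token from our own form
console.log(isValidCsrfToken(session, "guessed")); // forged request
```

Because the token lives in the session, a token the attacker obtains by visiting the site himself does not match the victim's session.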

Clickjacking

In Clickjacking attacks, the target site is embedded in an IFRAME on the attacking site and either kept in the background, but mostly covered by other elements or made transparent and kept in the foreground. The user is then incited to click a certain location (e.g. when using the transparency method by placing a button in the background). Instead of the visible button, the click hits the invisible window. The placement of the IFRAME and button is chosen so that the click triggers the action wanted by the attacker (e.g. change settings). As the user is logged into the target site, the click can trigger actions that would otherwise be unreachable for the attacker. Multiple Facebook spam waves were generated using this method.

To prevent this type of attack

Prevent (i)framing of your application in current browsers by including the HTTP response header “X-Frame-Options: deny”
Prevent (i)framing in outdated browsers by including a JavaScript frame breaker which checks for (i)framing and refuses to show the page if it is detected
For applications with high security requirements where you expect users to use outdated browsers with JavaScript disabled, consider requiring users of older browsers to enable JavaScript

Rationale

The X-Frame-Options header is required as JavaScript frame breakers could be ineffective in some newer browsers that allow undetectable framing. However, older, still common browsers ignore the header and thus require additional protection using classic JavaScript based frame breakers. Since (as opposed to the header method) those do not work if JavaScript is disabled, additional measures may be necessary.

Insecure data transfer

Data transferred unencrypted can be sniffed. This can not only give an attacker valuable information, but also the content of session cookies, allowing him to hijack a session. Additionally, non-secure communication can be modified by an attacker.

To prevent this type of attack

Use SSL/TLS (https) for any and all data transfer
Do not start communicating via http, only redirecting to https when “needed”
Mark cookies with the “secure” attribute
Use the Strict-Transport-Security header where possible
Educate users to visit the https:// URL directly
If your web application performs HTTPS requests, make sure it verifies the certificate and host name
- Consider limiting trusted CAs if connecting to internal servers

Rationale

Using https ensures all data transfer is encrypted and the server is authenticated. Redirects sent on unencrypted pages can be removed or modified by the attacker. Thus, the transition from plain http to https can be sabotaged, making any plain http communication before switching to https dangerous. Marking the cookies secure-only ensures they are never transferred via unencrypted connections to prevent sniffing.

The STS header ensures that after the first visit, even if users visit the http:// URL, the request is performed via secure https. This prevents attacks like the SSLstrip attack on the unencrypted redirect. Educating the user to visit the https:// URL directly provides this protection for the first request and browsers that do not support STS and thus ignore the header. This education can be supported by serving nothing or only an information page without a clickable link on port 80 to force users to enter the correct URL and remove the incentive to be lazy and omit the “https://”.

In some web applications, the web server performs HTTPS requests (for example when fetching or pushing data to APIs or running the OpenID or OAuth protocols). HTTPS is only secure if the software initiating the connection (i.e. your web application) correctly verifies the remote certificate:

Checks if the certificate is still valid
Checks if the certificate is signed by a trusted CA (a list of trusted CAs is needed)
Checks if the hostname you are connecting to matches the name in the certificate (the wrapper performing the SSL handling needs access to the host name)

Some libraries do not do this by default, making HTTPS connections insecure! Consider it suspicious if you are not required to provide a list of trusted CAs, or it looks like the SSL wrapper does not have access to the host name you are connecting to. To test for this issue, attempt to connect to a host that uses a non-expired self-signed certificate, then attempt to connect to a host that uses a valid certificate, but use a different hostname (e.g. address the host by its IP address) than the one specified in the certificate. If either of these connections succeeds, your library/configuration is insecure.

In PHP, both standard ways to perform HTTP(S) requests have issues: The cURL library doesn't check certificates by default if used with cURL below version 7.10. The Stream API always requires explicit configuration (affecting all functions using url_fopen, e.g. fopen(), file(), file_get_contents()). For cURL, set CURLOPT_SSL_VERIFYPEER and CURLOPT_CAINFO. For the Stream API, use a stream context with the verify_peer, CN_match and cafile SSL context options.

If you are connecting to internal servers, consider limiting the list of trusted CAs to the CA you are using. This reduces the risk from compromised/malicious CAs. The default CA bundles often include CAs which you may not consider trustworthy, e.g. the Chinese internet authority CNNIC.
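For comparison, a web application written in JavaScript (Node.js) could pin its own CA roughly as sketched below. Node.js verifies certificates and host names by default, so the main points are to never switch that off and to pass a restricted CA list. The function name, the host internal.example and the caPem parameter are hypothetical.

```javascript
const https = require("https");

// Build request options that keep certificate checks switched on.
// `caPem` is assumed to hold the PEM text of the only CA you want to trust.
function verifiedRequestOptions(hostname, path, caPem) {
  return {
    hostname,                 // also used for the host name check
    path,
    method: "GET",
    rejectUnauthorized: true, // refuse expired, self-signed or untrusted certificates
    ca: caPem,                // replaces the default CA bundle when provided
  };
}

// Usage sketch (not executed here):
// https.get(verifiedRequestOptions("internal.example", "/status", caPem), (res) => { /* ... */ });
```

The two test connections described above (self-signed certificate, mismatching host name) remain a sensible check for any such setup.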

Session fixation

In a session fixation attack, an attacker creates an unauthenticated session and then tricks a user into using and authenticating that session. As soon as the user has authenticated, the attacker can use the session, as he knows the session ID.

To prevent this type of attack

Regenerate (change) the session ID as soon as the user logs in (destroying the old session)
Prevent the attacker from making the user use his session by accepting session IDs only from cookies, not from GET or POST parameters (PHP: php.ini setting “session.use_only_cookies”)

Rationale

Regenerating the ID makes the old session ID worthless to the attacker. Even if the attacker manages to fix a session, his session will never be authenticated. The second countermeasure is aimed at making it impossible to fix the session. However, XSS or similar issues with other applications on the same domain (not necessarily sub-domain!) may allow attackers to set false cookies.
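The regeneration step can be illustrated with a deliberately simplified in-memory session store in JavaScript (Node.js). The store and the function names are hypothetical; real applications would normally use the session handling of their framework, which usually offers an equivalent "regenerate" call.

```javascript
const crypto = require("crypto");

// Hypothetical in-memory store: session ID -> session data.
const sessions = new Map();

function createSession() {
  const id = crypto.randomBytes(32).toString("hex");
  sessions.set(id, { userId: null }); // not authenticated yet
  return id;
}

function logIn(oldId, userId) {
  // Destroy the pre-login session so that an ID fixed by an attacker
  // never becomes an authenticated one.
  sessions.delete(oldId);
  const newId = crypto.randomBytes(32).toString("hex");
  sessions.set(newId, { userId });
  return newId; // send this to the browser in a fresh cookie
}

const before = createSession();
const after = logIn(before, 42);
console.log(before !== after, sessions.has(before));
```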

Session stealing

An attacker who is able to obtain or guess the session ID can steal the session and abuse the privileges of the user.

To prevent this type of attack

Set the “HttpOnly” attribute for session cookies
Generate random session IDs with secure randomness and sufficient length
Do not leak session IDs

Rationale

Setting the “HttpOnly” attribute on cookies prevents them from being read using JavaScript. This makes it harder to perform successful XSS attacks. Random, secure session IDs prevent the attacker from guessing a valid session ID. Ensuring that session IDs do not leak, for example in Referer information, copied links and HTML content from the site etc. makes sure that the attacker cannot obtain the session ID in this way.
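As an illustration, a session cookie with these properties could be produced like this in JavaScript (Node.js). The cookie name sid and the function name are assumptions; the Secure attribute is included because session cookies should only travel over https.

```javascript
const crypto = require("crypto");

// Hypothetical helper: create a session ID and the matching Set-Cookie value.
function newSessionCookie() {
  // 32 bytes (256 bits) from the cryptographic random number generator.
  const id = crypto.randomBytes(32).toString("hex");
  return {
    id,
    // HttpOnly hides the cookie from JavaScript, Secure keeps it off plain http.
    header: "sid=" + id + "; Path=/; HttpOnly; Secure",
  };
}

console.log(newSessionCookie().header);
```

The ID is only placed in the cookie, never in URLs or page content, so it does not leak through Referer headers or copied links.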

Truncation attacks, trimming attacks

Truncating input can be problematic if the truncation affects comparisons (e.g. checking users against a blacklist before truncation, and then truncating the name to perform the login). SQL queries can be truncated if they exceed a certain length. This can be used to execute a query with significantly different meaning (e.g. cutting off a part of a WHERE clause). Strings can also be automatically trimmed (leading/trailing whitespace removed), leading to the same vulnerabilities (e.g. checking the input "eviluser␣" against the blacklist, then logging in "eviluser"). SQL may do such trimming automatically.

To prevent this type of attack

Avoid truncating input. Treat overlong input as an error instead.
If truncation is necessary, make sure to check the value after truncation and use only the truncated value
Make sure trimming does not occur or checks are done consistently
Introduce length checks
- care about different lengths due to encoding
Make sure SQL treats truncated queries as errors by setting an appropriate SQL MODE

Rationale

Avoiding truncation makes sure no issues can arise. If truncation is applied, performing all necessary checks after the truncation and using only the truncated value is equivalent to receiving the value in truncated condition. The same rules apply for trimming. Length checks prevent unexpected truncation due to length limits. Encoding needs to be taken into account because the byte-lengths and character-lengths of a UTF-8 string may be different. Setting the SQL MODE so that truncation causes errors ensures that truncation cannot be abused to modify queries. However, the resulting errors can still cause queries to fail unexpectedly, which should be handled in a secure manner.
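A sketch of such checks in JavaScript (Node.js) follows. The limit MAX_USERNAME_BYTES is a hypothetical column size; the point is that overlong or padded input is rejected instead of being silently shortened, and that the length is measured in bytes of the encoding actually stored.

```javascript
// Hypothetical storage limit, in bytes of UTF-8.
const MAX_USERNAME_BYTES = 32;

function checkUsername(input) {
  if (typeof input !== "string") {
    throw new Error("invalid input");
  }
  // Reject instead of trimming, so later comparisons see the same value.
  if (input !== input.trim()) {
    throw new Error("leading or trailing whitespace is not allowed");
  }
  // Byte length and character length differ for non-ASCII text.
  if (Buffer.byteLength(input, "utf8") > MAX_USERNAME_BYTES) {
    throw new Error("input too long");
  }
  return input;
}

console.log(checkUsername("alice"));
console.log("\u00e9".repeat(20).length, Buffer.byteLength("\u00e9".repeat(20), "utf8"));
```

Blacklist or uniqueness checks should run on the value returned by this function, which is exactly the value that gets stored.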

Password security

Most web applications use username/password combinations to manage access.

To keep password-based login mechanisms secure

Do not store plain-text passwords, store only hashes
Use Argon2, scrypt, bcrypt, or some other secure hashing algorithm specifically designed for secure password "storage".^[3]^[4]
Use per-user salts
Use strengthening (i.e. multi-iteration hashing to slow down brute force attempts)
Limit login attempts per IP (not per user account)
Enforce reasonable, but not too strict, password policies
If a password reset process is implemented, make sure it has adequate security. Questions like “mother’s maiden name” can often be guessed by attackers and are not sufficient.

Rationale

Users re-use passwords for multiple services. If an attacker gains access to one server and can obtain a list of passwords, he may be able to use these passwords to attack other services. Therefore, only password hashes may be stored. Secure hashing algorithms are easy to use in most languages and ensure the original password cannot be easily recovered and that wrong passwords are not falsely accepted.

Adding salts to the password hashes prevents the use of rainbow tables and significantly slows down brute-force attempts. Strengthening slows both off-line brute-force attacks against stolen hashes and on-line brute-force in case the rate limiting fails. However, it increases CPU load on the server and would open a vector for DDoS attacks if not prevented with login attempt limiting. A good strengthening can slow down off-line brute-force attacks by a factor of 10000 or more.
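A minimal sketch of salted, strengthened password hashing in JavaScript (Node.js), using the built-in scrypt implementation with its default cost parameters. The function names and the "salt:hash" storage format are assumptions made for this example; a maintained password-hashing library may be the more convenient choice in practice.

```javascript
const crypto = require("crypto");

// Hash a password with a fresh per-user salt. scrypt is deliberately slow
// and memory-hard, which provides the strengthening.
function hashPassword(password) {
  const salt = crypto.randomBytes(16);
  const hash = crypto.scryptSync(password, salt, 64);
  return salt.toString("hex") + ":" + hash.toString("hex");
}

// Recompute the hash with the stored salt and compare in constant time.
function verifyPassword(password, stored) {
  const [saltHex, hashHex] = stored.split(":");
  const expected = Buffer.from(hashHex, "hex");
  const actual = crypto.scryptSync(password, Buffer.from(saltHex, "hex"), expected.length);
  return crypto.timingSafeEqual(actual, expected);
}

const stored = hashPassword("correct horse battery staple");
console.log(verifyPassword("correct horse battery staple", stored));
console.log(verifyPassword("123456", stored));
```

Because every call uses a new salt, two users with the same password end up with different stored values.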

Limiting login attempts is necessary to prevent on-line brute-force attacks and DoS via the CPU usage of the password strengthening procedure. Without a limit, an attacker can try a very large number of passwords directly against the server. Assuming 100 attempts per second, which is reasonable for a normal web server, no significant strengthening and an attacker working with multiple threads, this would result in 259,200,000 passwords tried in a single month!

Not enforcing any password policies will lead to too many users choosing “123456”, “qwerty” or “password” as their password, opening the system up for attack. Enforcing too strict password policies will force users to save passwords or write them down, generally annoy them and foster re-using the same password for all services. Furthermore, users using secure passwords not matching the policies may be forced to use passwords which are harder to remember, but not necessarily secure. A password consisting of 5 concatenated, randomly (!) chosen lowercase dictionary words is significantly more secure than an eight-character password consisting of mixed case letters, numbers and punctuation. Take this into account if you do not get a password policy to implement, but have to design your own.
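The comparison between the two kinds of password can be made concrete with a short calculation, here in JavaScript. The word list size of 7776 and the pool of 94 printable ASCII characters are assumptions for this example, and the figures only hold if every word or character is chosen uniformly at random.

```javascript
// Bits of entropy for `count` independent, uniformly random picks
// from a pool of `poolSize` possibilities.
function entropyBits(poolSize, count) {
  return count * Math.log2(poolSize);
}

const passphrase = entropyBits(7776, 5);  // five words from an assumed 7776-word list
const shortPassword = entropyBits(94, 8); // eight printable ASCII characters

console.log(passphrase.toFixed(1));    // 64.6
console.log(shortPassword.toFixed(1)); // 52.4
```

Under these assumptions the passphrase has about 12 bits more entropy, so a brute-force search takes roughly 2^12 (about 4000) times as long.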

If an attacker cannot obtain the password, he may try to reset it. Often, answers to password reset questions are easy to find or guess. Questions alone are not sufficient protection. Consider using a question together with e-mail verification by sending a new temporary password, for example.

↑ Patrick Mylund Nielsen. "Storing Passwords Securely".
↑ Wikibook Cryptography/Secure Passwords describes more of the history and theory behind designing a hashing algorithm for password storage.

Comparison issues

When comparing values, know the behaviour of your programming language. For example in PHP, "==" is a loose comparison that ignores the type and may give you unexpected behaviour. "===" is used for exact comparison. Using the wrong type of comparison can lead to security issues.

To prevent comparison issues

Know comparison types in your programming language and use the correct one
When in doubt (especially with PHP), use a strict comparison (PHP: "===")
When comparing strings for equality, make sure you actually check that the strings are equal and not that one string contains the other

Rationale

Using a too loose comparison can easily cause security issues. For example, in PHP, the following will evaluate to TRUE:

 "a97e8342f0" == 0

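The hex string, which could be a token or hash, is automatically converted to an integer for the comparison. As it starts with a letter, it cannot be parsed as a number, so the result of the conversion is 0. (This applies to PHP versions before 8.0, which changed how strings are compared with numbers.)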

Accidentally checking for strings being contained instead of checking for strings being equal can allow attackers to bypass e.g. whitelist checks.
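The same pitfalls exist outside PHP. The following JavaScript sketch contrasts a "contains" check with an exact comparison; the whitelist ALLOWED_HOSTS and both function names are hypothetical.

```javascript
// Hypothetical whitelist of host names.
const ALLOWED_HOSTS = ["example.org", "static.example.org"];

// Wrong: a substring test also accepts "example.org.evil.test".
function isAllowedHostLoose(host) {
  return ALLOWED_HOSTS.some((allowed) => host.includes(allowed));
}

// Better: exact, type-strict equality against each whitelist entry.
function isAllowedHost(host) {
  return ALLOWED_HOSTS.some((allowed) => host === allowed);
}

console.log(isAllowedHostLoose("example.org.evil.test")); // true (bypass)
console.log(isAllowedHost("example.org.evil.test"));      // false
console.log("" == 0, "" === 0);                           // true false
```

The last line shows that JavaScript, like PHP, has a loose "==" operator that converts types before comparing, which is why "===" is the safer default.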

PHP-specific issues

When using the PHP language, several issues need to be considered.

When using PHP...

Do not use the short form “<?”, always use the full form “<?php”
When using the nginx web server, make sure to correctly follow the official installation instructions and pay attention to the "Pitfalls" page. Beware of tutorials that often contain working but insecure configuration examples.
preg_replace can act as eval() in certain cases. Avoid passing user input to it. If you must, correctly filter and escape it.
Use Suhosin (including the patch, if possible) and configure it with strict rules
- Enable suhosin.executor.disable_emodifier
- Enable suhosin.executor.disable_eval if possible
- Set suhosin.mail.protect to 2 if possible
When updating PHP to PHP 5.4 from an older version, ensure legacy applications do not rely on magic quotes for security.

Rationale

PHP can support shortened PHP code start tags. If the option is enabled, both "<?php" and "<?" alone can start a PHP code block. However, if the option is disabled, "<?" will not be detected and the code will be delivered to the browser instead. This can lead to code disclosure. Using the full form ensures that the code will work correctly and won’t disclose the code if the server does not support short tags.

When using the nginx server, it is very easy to make critical configuration mistakes that allow users to pass image files to the PHP interpreter. See the "Pitfalls" page for more information. It also provides valuable tips that will probably save you some time hunting down phantom issues, so you should read it if you use nginx.

preg_replace evaluates the replacement text as PHP code if the non-standard "e" modifier is given in the search RegExp. (The modifier was deprecated in PHP 5.5 and removed in PHP 7.0, but older installations still support it.) If an attacker can influence the RegExp to add this modifier and provide a custom replacement text, preg_replace allows arbitrary code execution. Be extremely careful when using this function. When possible, escape input with preg_quote and set its delimiter parameter correctly. If you must accept RegExp code from the user, ensure it cannot contain the delimiter (also consider attacks using malformed UTF-8, null bytes etc.), but if possible, avoid it completely.

Suhosin can prevent certain attacks on web applications and disable insecure functions. The patch also protects internal memory structures against certain memory corruption attacks. (Also see the feature list for a complete list of features and the official explanation why Suhosin is useful.) Suhosin improves your security, but like Web Application Firewalls, it does not magically make all applications secure.

Disabling the e modifier prevents the above-mentioned vulnerabilities in preg_replace from being used by an attacker even if an application is vulnerable. The e modifier should never be used; an application that does not work with the e modifier disabled is broken. Banning eval may break legitimate applications. Consider running Suhosin in simulation mode first to discover (badly coded) applications that use eval. Setting suhosin.mail.protect can prevent attacks that use your mail forms to send spam. (Again, use simulation mode first to determine if your applications are compatible with it.)

Magic quotes have been removed in PHP 5.4. An application that relies on them for security will become vulnerable if the update is installed. Note that this does not mean you should not update; instead, you should fix (i.e. rewrite or delete) the application. Magic quotes are not a suitable way to escape input and in most cases will not protect against all attack vectors. An application that relies on magic quotes is probably ancient and/or written without security in mind. Simply adding code that will emulate magic quotes is a bad idea.

Prefetching and Spiders

GET requests are not supposed/expected to trigger actions/changes and are happily followed by various browser mechanisms like Prefetching or Session Restore and by crawlers. This can cause unwanted actions to be triggered completely without user interaction and without the need for an attack.

To prevent this

Use POST requests instead of GETs for anything that triggers an action

Rationale

GET requests can be automatically and unintentionally triggered, for example by crawlers. For example, if “delete” actions are reachable via GET links, a single user with aggressive prefetching can accidentally delete everything just by opening a listing page. POST requests are expected to trigger actions and are handled accordingly by browsers.
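A minimal sketch of this rule in Python, assuming a hypothetical handler function that receives the request method and an in-memory dictionary of items. It is not tied to any particular web framework; real frameworks usually offer their own way to restrict a route to POST.

```python
from http import HTTPStatus


def handle_delete(method: str, item_id: int, items: dict) -> HTTPStatus:
    """Delete an item only if the request used POST (hypothetical handler)."""
    if method != "POST":
        # Prefetchers and crawlers issue GET requests; refuse to act on them.
        return HTTPStatus.METHOD_NOT_ALLOWED
    items.pop(item_id, None)
    # Redirect back to the listing page after the action was performed.
    return HTTPStatus.SEE_OTHER


items = {1: "first note", 2: "second note"}
handle_delete("GET", 1, items)   # rejected, nothing is deleted
handle_delete("POST", 1, items)  # item 1 is deleted
print(sorted(items))             # [2]
```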

Special files

Special files like .htaccess, robots.txt, crossdomain.xml and clientaccesspolicy.xml have special meanings that must be considered before deploying them.

To prevent this type of attack

Know the meaning of these files
Ensure robots.txt does not disclose "secret" paths
Ensure crossdomain.xml and clientaccesspolicy.xml do not exist unless needed
If used, ensure crossdomain.xml and clientaccesspolicy.xml allow access from trusted domains only
Prevent users from uploading/changing special files (see file upload vulnerabilities section)

Rationale

Special files like .htaccess, robots.txt, crossdomain.xml and clientaccesspolicy.xml define security relevant settings and rules. Knowing their meaning is necessary to use them securely.

.htaccess influences the behaviour and security relevant settings of the web server (e.g. access rights, executable file types, ...).

robots.txt can be ignored by malicious or badly written robots. As this file is publicly available, an attacker can gain valuable information about "interesting" paths (like administration interfaces) if they are mentioned in the robots.txt file. Attackers do check this file for such content.

crossdomain.xml and clientaccesspolicy.xml can disable the same-origin policy in some plug-ins. Incorrect configuration leaves the site open for cross-site scripting/cross-site request forgery attacks using plugins. Note that crossdomain.xml files are also valid if they appear in subdirectories.
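As a rough illustration of the "trusted domains only" guideline, the following Python sketch lists every domain in a crossdomain.xml policy that is not on a list of trusted domains. The trusted domain names and the function name are hypothetical; the sketch only looks at allow-access-from entries and is meant for reviewing your own policy files, not for parsing untrusted XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of domains that may legitimately appear in the policy.
TRUSTED_DOMAINS = frozenset({"www.example.com", "static.example.com"})


def untrusted_policy_domains(policy_xml: str) -> list:
    """Return all allow-access-from domains that are not explicitly trusted."""
    root = ET.fromstring(policy_xml)
    return [
        entry.get("domain", "")
        for entry in root.iter("allow-access-from")
        if entry.get("domain", "") not in TRUSTED_DOMAINS
    ]


policy = """<cross-domain-policy>
  <allow-access-from domain="www.example.com"/>
  <allow-access-from domain="*"/>
</cross-domain-policy>"""

print(untrusted_policy_domains(policy))  # ['*']
```

Wildcard entries such as "*" are reported because they are not an exact match for a trusted domain, which is the intended behaviour for a review aid.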

SSL, TLS and HTTPS basics

SSL/TLS provides encryption and authentication for HTTPS. [1]

For maximum security

Follow SSLLabs best practices including:
- Ensure SSLv2 is disabled
- Generate private keys for certificates yourself, do not let your CA do it
- Use an appropriate key length (usually 2048 bit in 2013)
- If possible, disable client-initiated renegotiation
- Consider manually limiting/setting cipher suites

Rationale

SSL is easy to do and hard to do right. SSLLabs provide good guidelines that are updated when new attacks are discovered.

The CA has no need-to-know for your private key. Depending on the cipher suite used, the private key can allow adversaries to decrypt passively eavesdropped communications. Thus, even if you trust the CA, it is better to avoid any risk. Generate a key and a CSR and provide only the CSR to the CA.

Increasing key length increases security, but also significantly increases the CPU load for connection establishment. 1024 bit keys will not be accepted by Mozilla Firefox anymore for certificates that expire after the year 2013. 2048 bit keys should be enough for all applications for quite a few years – using larger key sizes seems to be overkill. (All information based on 2013.) Note: The large CPU overhead of connection establishment can be used by (D)DoS attackers. Such DDoS attacks are harder to detect and defend against when client-initiated renegotiation is supported.

SSL/TLS supports a large set of “cipher suites”, each defining a set of cryptographic mechanisms used to secure the connection. Some of them do provide perfect forward secrecy, some do not. (Perfect forward secrecy means that if the private key becomes available to an attacker, he cannot decrypt data that was eavesdropped before he got the key). Usually, the client (browser) and server choose a cipher suite by first exchanging which suites are mutually supported, and the client’s preferred suite is then chosen. Depending on setup, the server may choose the cipher suite, ignoring the client’s preference. Most defaults are reasonably sane, but for either high-speed or high-security applications, you may want to consider restricting the supported/preferred suites to fast or high-security suites. If you want to exclude clients that do not support sufficient security (e.g. ancient “export control” limited clients), make sure to disable those cipher suites. When configuring cipher suites, carefully check the setup to make sure you do not allow “ADH” suites that do not authenticate the server! If you are unsure, keep the default, and always verify the effects of your settings!
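The following Python sketch shows one possible way to restrict cipher suites and then verify the result, using the standard ssl module. The cipher string, the TLS 1.2 minimum version and the function names are assumptions chosen for illustration, not recommendations from this guide; the certificate paths are hypothetical and therefore commented out.

```python
import ssl


def make_server_context() -> ssl.SSLContext:
    """Build a server-side TLS context with a restricted cipher list."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Assumption: clients without TLS 1.2 support are excluded on purpose.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Forward-secret suites only; explicitly exclude unauthenticated (aNULL)
    # and unencrypted (eNULL) suites.
    ctx.set_ciphers("ECDHE+AESGCM:!aNULL:!eNULL")
    # Let the server choose the suite instead of following client preference.
    ctx.options |= ssl.OP_CIPHER_SERVER_PREFERENCE
    # Disable renegotiation where the installed Python/OpenSSL supports it.
    ctx.options |= getattr(ssl, "OP_NO_RENEGOTIATION", 0)
    # ctx.load_cert_chain("server.crt", "server.key")  # hypothetical paths
    return ctx


def anonymous_suites(ctx: ssl.SSLContext) -> list:
    """Return the names of enabled suites that do not authenticate the server."""
    return [
        cipher["name"]
        for cipher in ctx.get_ciphers()
        if cipher["name"].startswith(("ADH-", "AECDH-"))
    ]


context = make_server_context()
# Always verify the effect of the settings instead of trusting the string.
for cipher in context.get_ciphers():
    print(cipher["name"], cipher["protocol"])
print("anonymous suites enabled:", anonymous_suites(context))
```

The list of suites printed depends on the installed OpenSSL version, which is why the sketch inspects the resulting configuration rather than assuming it.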

Authors

The initial version of the Web Application Security Guide was written in 2011 by Jan Schejbal.

The main contributors to the current version are:

Jan Schejbal
(list yourself if you make significant contributions)

Other contributors who can be seen on the version tab of each page have helped to improve this guide.

[1] "About SSL/TLS". Instantssl.com. Retrieved 2016-04-29.

[1] Patrick Mylund Nielsen. "Storing Passwords Securely".

[2] Wikibook Cryptography/Secure Passwords describes more of the history and theory behind designing a hashing algorithm for password storage.

