Web Application Security Guide/XML, JSON and general API security

APIs pose additional security challenges. At the same time, basic security rules (like output escaping) must not be overlooked.

To secure your API

  • Ensure proper access control to the API
  • Escape all output correctly to prevent XSS attacks. Data formats like XML require special consideration, and protection against cross-site request forgery (CSRF) is needed in many cases.
  • Use standard data formats like JSON with proven libraries, and use them correctly. This will probably take care of all your escaping needs.
  • Make sure browsers do not misinterpret your document or allow cross-site loading
    • Ensure your document is well-formed
    • Send the correct content type
    • Use the X-Content-Type-Options: nosniff header
    • For XML, provide a charset and ensure attackers cannot insert arbitrary tags
    • For JSON, ensure the top-level data structure is an object and all characters with special meaning in HTML are escaped

Rationale

Certain actions are often restricted to users with appropriate privileges. However, some developers forget to restrict their API in the same way, which allows users without proper privileges to perform these actions. Ensure that the API properly enforces access controls. Remember that you still need CSRF protection! A legitimate separate client can easily fetch a token (but will need the user's credentials to do so), while malicious JavaScript running on another site cannot read it (due to the same-origin policy).
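
The combination of access control and CSRF protection described above could be sketched as follows. This is a minimal illustration, not a complete implementation: the session dictionary, its "roles" and "csrf_token" keys, and the function names are all hypothetical, and a real application would normally rely on its framework's session and CSRF facilities.

```python
import hmac
import secrets


def issue_csrf_token(session: dict) -> str:
    """Create a random token and store it in the (hypothetical) server-side session."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token


def is_request_allowed(session: dict, required_role: str, submitted_token: str) -> bool:
    """Allow an API call only if the CSRF token matches and the user has the role."""
    expected = session.get("csrf_token")
    if not expected or not submitted_token:
        return False
    # Compare as bytes in constant time to avoid leaking the token via timing.
    if not hmac.compare_digest(expected.encode("utf-8"), submitted_token.encode("utf-8")):
        return False
    # Access control: the role list is an assumed part of the session data.
    return required_role in session.get("roles", ())
```

Both checks are performed on every state-changing request, so a forged cross-site request fails even when the victim holds the required privileges.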

Even if your application is not displaying the API output, the attacker may use it for XSS attacks by directly linking to it. For this reason, you must follow proper escaping rules and keep browsers from misinterpreting your output.

If you use standard data formats like JSON, you can use standard libraries which have been thoroughly checked by many professionals. This will make it easier for you to correctly escape content, and save you a lot of time (and potential security issues).

Certain browsers love to interpret anything that looks like it may be HTML as HTML. This is especially true for XML documents (which may also represent other script-bearing formats like SVG). Sending a well-formed document and setting the correct content type makes it less probable that browsers will start guessing. The X-Content-Type-Options: nosniff header will stop browsers from attempting to guess the content type (most importantly, it will disable the aggressive guessing in Internet Explorer).
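
As a rough sketch of the headers discussed above, the following helper builds a JSON response with an explicit content type and the nosniff header. The function name and the (headers, body) return shape are assumptions made for illustration; how the headers are actually sent depends on your web framework.

```python
import json
from typing import Dict, Tuple


def build_json_response(payload: dict) -> Tuple[Dict[str, str], bytes]:
    """Return response headers and body for a JSON API reply."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        # Declare the real content type and charset so browsers need not guess.
        "Content-Type": "application/json; charset=utf-8",
        # Tell browsers not to sniff the content type.
        "X-Content-Type-Options": "nosniff",
        "Content-Length": str(len(body)),
    }
    return headers, body
```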

Providing the correct charset in XML is important because different charsets can cause vastly different interpretations of the data. For example, what is harmless text in UTF-8 or other common charsets can turn into a script tag in UTF-7.
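
The charset problem can be illustrated with a short sketch. The byte string below is harmless text when read as UTF-8 but becomes a script tag when read as UTF-7. The build_xml helper and the "message" element name are hypothetical; the helper shows one way to emit XML with a declared encoding while letting the library escape user-supplied text.

```python
import xml.etree.ElementTree as ET

# The same bytes, interpreted under two different charsets.
raw = b"+ADw-script+AD4-alert(1)+ADw-/script+AD4-"
as_utf8 = raw.decode("utf-8")  # stays plain, harmless text
as_utf7 = raw.decode("utf-7")  # becomes "<script>alert(1)</script>"


def build_xml(user_text: str) -> bytes:
    """Serialize user text into XML with an explicit encoding declaration."""
    root = ET.Element("message")
    # ElementTree escapes <, > and & in text when serializing,
    # so user input cannot introduce new tags.
    root.text = user_text
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)
```

The declaration only helps if the Content-Type header of the response names the same charset.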

JSON uses JavaScript syntax and could possibly be loaded across domain boundaries using <script> tags. Together with creative modification of the Array prototype, this can give access to the data (bypassing the same-origin policy) in outdated browsers. Passing an object instead of an array prevents this (as of 2013).
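
One simple way to follow this advice is to wrap any non-object result in an object before serializing it. The sketch below assumes a hypothetical wrapper key named "data"; any key agreed on with your API clients would do.

```python
import json


def to_json_document(result) -> str:
    """Serialize a result so that the top-level JSON value is always an object."""
    if not isinstance(result, dict):
        # Wrap arrays and scalars; "data" is an arbitrary, hypothetical key.
        result = {"data": result}
    return json.dumps(result)
```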

Escaping special characters in JSON is recommended to avoid content sniffing. In PHP, it can be done by passing the JSON_HEX_TAG flag to json_encode.
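
Python's json module has no direct counterpart to that flag, so a similar effect can be approximated by post-processing the serialized output. The sketch below is an assumption about how one might do this, not a standard API: the helper name is hypothetical, and it also escapes the ampersand, which in PHP would correspond to the separate JSON_HEX_AMP flag.

```python
import json

# Characters with special meaning in HTML, mapped to JSON unicode escapes.
_HTML_ESCAPES = {"<": "\\u003c", ">": "\\u003e", "&": "\\u0026"}


def json_dumps_html_safe(value) -> str:
    """Serialize to JSON with <, > and & replaced by \\uXXXX escapes."""
    text = json.dumps(value)
    # In valid JSON these characters can only occur inside strings,
    # so replacing them globally does not change the decoded data.
    for char, escape in _HTML_ESCAPES.items():
        text = text.replace(char, escape)
    return text
```

A JSON parser decodes the escapes back to the original characters, so clients receive the same data.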

Further reading