1, Escaping must be performed at the last possible place: when writing HTML output.
Only when writing the content of a Java variable to the output stream do we know whether it goes into a JavaScript variable, an HTML attribute, or plain text (the HTML body).
2, Every variable of a String type deserves escaping. Skip escaping only when you NEED to print HTML.
You can never be sure how possibly malicious content got into your variable. It may be persistent XSS from the DB, a session attribute, or a request parameter.
3, Use the right escaping for the right situation.
See OWASP's XSS_Prevention_Rules_Summary
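To make rules 1 and 2 concrete, here is a minimal sketch of escaping a String at the moment it is written into the HTML body. The class `HtmlEscaper` and the method `escapeHtml` are hypothetical names made up for this illustration, and the sketch only covers the HTML body and quoted-attribute contexts. A JavaScript context needs different escaping, and in a real project a maintained encoder library is probably a better choice than hand-written code.

```java
// Hypothetical helper for illustration only; not part of any library.
final class HtmlEscaper {
    private HtmlEscaper() {}

    /** Escapes the five characters that are significant in HTML body text and quoted attributes. */
    public static String escapeHtml(String input) {
        if (input == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The value could come from the DB, the session or a request parameter;
        // we do not care where, we escape it when writing the output.
        String fromDb = "<script>alert('xss')</script>";
        System.out.println("<p>" + escapeHtml(fromDb) + "</p>");
        // prints: <p>&lt;script&gt;alert(&#x27;xss&#x27;)&lt;/script&gt;</p>
    }
}
```

The variable is stored and passed around unescaped; the escaping happens only in the one place that knows the output context.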
Side note: Don't mix up URL encoding with URL escaping.
See HTML appendix B.2.1 and B.2.2
URL encoding
Though a URL may contain non-ASCII (UTF-8) characters, URLs should be transmitted in US-ASCII.
A valid URL uses percent-encoding for such characters, as described in Percent encoding of URI characters.
When a URL is not encoded, browsers translate UTF-8 characters into US-ASCII out of the box. But if you want to be sure (you can't lose here), you can use new java.net.URI(url).toASCIIString(). In JavaScript there is a similar function, window.encodeURI().
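A small sketch of the java.net.URI approach. The class `AsciiUrlExample`, the method `toAscii` and the example.com URL are assumptions made for this illustration. Note that the URI constructor rejects some input outright (a raw space, for example), so this only helps with URLs that are otherwise well formed.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical wrapper for illustration only.
final class AsciiUrlExample {
    private AsciiUrlExample() {}

    /** Returns the URL with non-ASCII characters percent-encoded as UTF-8 octets. */
    public static String toAscii(String url) {
        try {
            return new URI(url).toASCIIString();
        } catch (URISyntaxException e) {
            // Thrown for input the URI parser cannot accept, e.g. an unencoded space.
            throw new IllegalArgumentException("Not a parsable URI: " + url, e);
        }
    }

    public static void main(String[] args) {
        // \u00e9 is the letter e with an acute accent
        System.out.println(toAscii("http://example.com/caf\u00e9"));
        // prints: http://example.com/caf%C3%A9
    }
}
```

A URL that is already pure US-ASCII comes back unchanged, which is why calling it is harmless.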
What MUST be encoded is a URL parameter name and value. It's confusing, but to encode a parameter use java.net.URLEncoder.encode(). JavaScript has a better name for it: window.encodeURIComponent().
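For illustration, a sketch of building a query string with URLEncoder. The class `QueryParamExample`, its methods and the example.com URL are hypothetical names for this example. One thing to be aware of: URLEncoder produces form encoding, so a space becomes `+` rather than `%20`.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for illustration only.
final class QueryParamExample {
    private QueryParamExample() {}

    /** Encodes a single parameter name or value, never a whole URL. */
    public static String encodeParam(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    /** Appends one encoded name=value pair to a base URL that has no query string yet. */
    public static String buildUrl(String base, String name, String value) {
        return base + "?" + encodeParam(name) + "=" + encodeParam(value);
    }

    public static void main(String[] args) {
        System.out.println(buildUrl("http://example.com/search", "q", "tom & jerry"));
        // prints: http://example.com/search?q=tom+%26+jerry
    }
}
```

Without the encoding, the `&` inside the value would be read as a separator and start a new parameter.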
Don't use URL encoding to perform HTML escaping! (Yes, I've seen that :) )
URL escaping
Let's say we already have a valid (correctly encoded) URL which we want to write into HTML.
Then there are 2 important steps:
1, The ampersand character (&), which may be part of a URL, is used to prefix HTML entity references. When writing valid HTML we should escape it the same way as any other ampersand written there. Use the ampersand entity: &amp;
2, To write safe HTML, that is, to be sure the URL won't break out of an HTML attribute or JavaScript variable (a URL is allowed to contain an apostrophe character), it's good to escape the whole URL as described in XSS_Prevention_Rules_Summary, not only the query string part.
When the URL is not properly encoded, we must still escape the whole URL, i.e. perform step 2.
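The two steps can be sketched as follows for the HTML attribute case. The class `UrlAttributeExample` and its methods are hypothetical names for this illustration, and the sketch assumes the attribute value is always written inside double quotes. Writing a URL into a JavaScript variable needs JavaScript escaping instead, which is not shown here.

```java
// Hypothetical helper for illustration only.
final class UrlAttributeExample {
    private UrlAttributeExample() {}

    /** HTML-escapes a value for a double-quoted attribute. The ampersand is replaced first
     *  so that the entities produced by the later replacements are not escaped again. */
    public static String escapeForAttribute(String s) {
        return s.replace("&", "&amp;")
                .replace("\"", "&quot;")
                .replace("'", "&#x27;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
    }

    /** Writes an already encoded URL into an href attribute; the WHOLE URL is escaped. */
    public static String link(String encodedUrl, String label) {
        return "<a href=\"" + escapeForAttribute(encodedUrl) + "\">"
                + escapeForAttribute(label) + "</a>";
    }

    public static void main(String[] args) {
        // A valid URL may still contain '&' and an apostrophe.
        System.out.println(link("http://example.com/?a=1&b='x'", "go"));
        // prints: <a href="http://example.com/?a=1&amp;b=&#x27;x&#x27;">go</a>
    }
}
```

The browser turns `&amp;` back into `&` when it reads the attribute, so the link target is still the original URL.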