Rails XSS protection and helpers

Hi,

I'm trying the new XSS protection from Rails 2.3.5 + rails_xss plugin, and I have some remarks/questions about it.

The button_to helper doesn't return an html_safe string, even if it does escape its inputs. My patch is on this ticket: #3018 Introduce notion of 'html_safe?' to Strings in preparation for on-by-default XSS - Ruby on Rails - rails, but I don't know how to encourage a member of the core team to apply it.

The label_tag doesn't escape its input, but returns an html_safe string. Is it the expected behaviour ?

At the opposite, textilize("This is worded <strong>strongly</strong>", :filter_html) returns a string that is not html_safe, so this string will be escaped a second time.

How can I know when a string given to an helper will be escaped or not? How can I know when a string returned by an helper is html_safe or not?

For the moment, I read Rails code, but I suppose there is a rule of thumb for these two questions, just that I'm not clever enough to find it. Thanks for your help :slight_smile:

I'd recommend that you create a new ticket for this - I know it's a tiny patch (add some parens), but it helps to have a separate ticket to track status on.

--Matt Jones

The label_tag doesn't escape its input, but returns an html_safe string. Is it the expected behaviour ?

No

At the opposite, textilize("This is worded <strong>strongly</strong>", :filter_html) returns a string that is not html_safe, so this string will be escaped a second time.

:filter_html isn't enough here some css properties (-moz-binding f.ex) are an attack vector too. You must use sanitize on the resulting output if you want to be sure your code will be save.

How can I know when a string given to an helper will be escaped or not?

Escaping is now idempotent, strings will not be double escaped. So in general you shouldn't need to know whether it's going to be escaped

How can I know when a string returned by an helper is html_safe or not?

With the exception of textilize, simple_format and friends, all our helpers should escape all the output, and return safe strings. Any cases where I missed this (which I'm sure there are many) are bugs and should be reported via lighthouse. Can you open a new ticket for your case and assign it to me with a tag of xss?

The reason I released a plugin for 2.3.x is so we can do this kind of investigation before shipping XSS protection as a headline feature.

I have a couple of questions regarding the way XSS protection is implemented. From what I know, there are two kinds of such vulnerabilities:

a. non-persistent (data does not come from the database) b. persistent

I do not fully understand why we do not protect against case (b) by escaping the data *prior to its entry in the database*, eventually writing audit migrations when we discover new vulnerabilities. I see two main advantages of the *pre-database* pattern:

1. it is much easier to track the paths writing to the database then those displaying content 2. the performance hit is divided by the write/read ratio which is very high on high volume websites

The only drawback is that you have to audit all the content when new rules are discovered, but that could be easily done offsite with a rake task (auto-discovery of string/text columns, parsing, alert).

Could the *pre-database* vs *rendering* filtering be a choice that the developer could make depending on the complexity and number of the data sources ?

Is the performance hit so low that all this is useless ?

Gaspard

Because something/someone may write to the database with unescaped data. In your proposal we'd be assuming that all database content is safe which is a risky proposition. Additionally, content in the database may be used in non-HTML context like plain text emails - in those cases we'd have to unescape the content.

Andrew

I do not fully understand why we do not protect against case (b) by escaping the data *prior to its entry in the database*, eventually writing audit migrations when we discover new vulnerabilities. I see two main advantages of the *pre-database* pattern:

Because something/someone may write to the database with unescaped data. In your proposal we'd be assuming that all database content is safe which is a risky proposition. Additionally, content in the database may be used in non-HTML context like plain text emails - in those cases we'd have to unescape the content.

Andrew

I understood these two issues, but I think they do not really hold because:

1. it can be easier to make sure the database content is safe in some applications (connection.quote for example) then to ensure all rendering paths are properly escaped. 2. if you only escape (or strip) evil code, it does not matter if this content appears escaped or not at all in plain text emails or such

Gaspard

You’re also assuming that all dangerous data comes from a database, which is not a generally safe assumption. Trying to sanitize all possible inputs to the system BEFORE they become inputs in the first place is a problem that is far outside the scope of Rails. However, sanitizing all inputs AFTER they become inputs but before they are OUTPUT is perfectly within the scope.

– Yehuda

I am not saying that the use case I am talking about should be the general way to deal with output sanitization, I just consider that there is a (possibly large) category of rails applications for which filtering content in:

connection.quote

would be enough to filter any and all evil code (in the apps I consider, not in every situation). I see this route as both safer and less performance hungry and am quite surprised it is not considered.

Gaspard

Gaspard Bucher wrote:

I am not saying that the use case I am talking about should be the general way to deal with output sanitization, I just consider that there is a (possibly large) category of rails applications for which filtering content in:

connection.quote

would be enough to filter any and all evil code (in the apps I consider, not in every situation). I see this route as both safer and less performance hungry and am quite surprised it is not considered.

Gaspard

Hi,

I don't think that your proposition will be safer. I think of many cases where data don't go in the database: informations in the session, data retrieved from API, files, ActiveResource, submitting invalid inputs to a form, params from URL...

And it has some nasty effects when the output is not XML/HTML (text emails, PDF). For example, the characters & < and > are transformed to &amp; &lt; and &gt;.

If your app really benefits from this performance boost, you should consider using a plugin like xssterminate instead of the default Rails filtering.

Michael Koziarski wrote:

The label_tag doesn't escape its input, but returns an html_safe string. Is it the expected behaviour ?

No

OK, I've opened a ticket: #3449 Label tag doesn't escape its input - Ruby on Rails - rails.

:filter_html isn't enough here some css properties (-moz-binding f.ex) are an attack vector too. You must use sanitize on the resulting output if you want to be sure your code will be save.

OK, it makes sense. But, is there a good reason that the textilize helper doesn't santize its output by default? I don't see why a developer wants to call textilize without sanitizing it, since the output will be escaped else.

Can you open a new ticket for your case and assign it to me with a tag of xss?

Done: #3448 Button_to does not return an HTML-safe string - Ruby on Rails - rails.

Thanks for your answer.

OK, I've opened a ticket: #3449 Label tag doesn't escape its input - Ruby on Rails - rails.

Thanks.

OK, it makes sense. But, is there a good reason that the textilize helper doesn't santize its output by default? I don't see why a developer wants to call textilize without sanitizing it, since the output will be escaped else.

sanitize will strip some content that users may actually be expecting, so it's not safe to apply it by default. In reality textilize is a great case where you're probably better off caching the results of the transformation like:

after_save :transform_content

def transform_content   self.content_html = HTML::WhiteListSanitizer.new.sanitize(RedCloth.new(content).to_html) end

Then just adding a safe accessor for it:

def content_html   read_attribute(:content_html).html_safe! end

Then in your views you just need

<%= @post.content_html %>