Rails XSS protection and helpers

Hi,

I'm trying the new XSS protection from Rails 2.3.5 + rails_xss plugin,
and I have some remarks/questions about it.

The button_to helper doesn't return an html_safe string, even if it does
escape its inputs. My patch is on this ticket:
https://rails.lighthouseapp.com/projects/8994/tickets/3018-introduce-notion-of-html_safe-to-strings-in-preparation-for-on-by-default-xss,
but I don't know how to encourage a member of the core team to apply it.

The label_tag doesn't escape its input, but returns an html_safe string.
Is it the expected behaviour ?

At the opposite, textilize("This is worded <strong>strongly</strong>",
:filter_html) returns a string that is not html_safe, so this string
will be escaped a second time.

How can I know when a string given to an helper will be escaped or not?
How can I know when a string returned by an helper is html_safe or not?

For the moment, I read Rails code, but I suppose there is a rule of
thumb for these two questions, just that I'm not clever enough to find
it. Thanks for your help :slight_smile:

I'd recommend that you create a new ticket for this - I know it's a tiny patch (add some parens), but it helps to have a separate ticket to track status on.

--Matt Jones

The label_tag doesn't escape its input, but returns an html_safe string.
Is it the expected behaviour ?

No

At the opposite, textilize("This is worded <strong>strongly</strong>",
:filter_html) returns a string that is not html_safe, so this string
will be escaped a second time.

:filter_html isn't enough here some css properties (-moz-binding f.ex)
are an attack vector too. You must use sanitize on the resulting
output if you want to be sure your code will be save.

How can I know when a string given to an helper will be escaped or not?

Escaping is now idempotent, strings will not be double escaped. So in
general you shouldn't need to know whether it's going to be escaped

How can I know when a string returned by an helper is html_safe or not?

With the exception of textilize, simple_format and friends, all our
helpers should escape all the output, and return safe strings. Any
cases where I missed this (which I'm sure there are many) are bugs and
should be reported via lighthouse. Can you open a new ticket for your
case and assign it to me with a tag of xss?

The reason I released a plugin for 2.3.x is so we can do this kind of
investigation before shipping XSS protection as a headline feature.

I have a couple of questions regarding the way XSS protection is
implemented. From what I know, there are two kinds of such
vulnerabilities:

a. non-persistent (data does not come from the database)
b. persistent

I do not fully understand why we do not protect against case (b) by
escaping the data *prior to its entry in the database*, eventually
writing audit migrations when we discover new vulnerabilities. I see
two main advantages of the *pre-database* pattern:

1. it is much easier to track the paths writing to the database then
those displaying content
2. the performance hit is divided by the write/read ratio which is
very high on high volume websites

The only drawback is that you have to audit all the content when new
rules are discovered, but that could be easily done offsite with a
rake task (auto-discovery of string/text columns, parsing, alert).

Could the *pre-database* vs *rendering* filtering be a choice that the
developer could make depending on the complexity and number of the
data sources ?

Is the performance hit so low that all this is useless ?

Gaspard

Because something/someone may write to the database with unescaped data. In your proposal we'd be assuming that all database content is safe which is a risky proposition. Additionally, content in the database may be used in non-HTML context like plain text emails - in those cases we'd have to unescape the content.

Andrew

I do not fully understand why we do not protect against case (b) by
escaping the data *prior to its entry in the database*, eventually
writing audit migrations when we discover new vulnerabilities. I see
two main advantages of the *pre-database* pattern:

Because something/someone may write to the database with unescaped
data. In your proposal we'd be assuming that all database content is
safe which is a risky proposition. Additionally, content in the
database may be used in non-HTML context like plain text emails - in
those cases we'd have to unescape the content.

Andrew

I understood these two issues, but I think they do not really hold because:

1. it can be easier to make sure the database content is safe in some
applications (connection.quote for example) then to ensure all
rendering paths are properly escaped.
2. if you only escape (or strip) evil code, it does not matter if this
content appears escaped or not at all in plain text emails or such

Gaspard

You’re also assuming that all dangerous data comes from a database, which is not a generally safe assumption. Trying to sanitize all possible inputs to the system BEFORE they become inputs in the first place is a problem that is far outside the scope of Rails. However, sanitizing all inputs AFTER they become inputs but before they are OUTPUT is perfectly within the scope.

– Yehuda

I am not saying that the use case I am talking about should be the
general way to deal with output sanitization, I just consider that
there is a (possibly large) category of rails applications for which
filtering content in:

connection.quote

would be enough to filter any and all evil code (in the apps I
consider, not in every situation). I see this route as both safer and
less performance hungry and am quite surprised it is not considered.

Gaspard

Gaspard Bucher wrote:

I am not saying that the use case I am talking about should be the
general way to deal with output sanitization, I just consider that
there is a (possibly large) category of rails applications for which
filtering content in:

connection.quote

would be enough to filter any and all evil code (in the apps I
consider, not in every situation). I see this route as both safer and
less performance hungry and am quite surprised it is not considered.

Gaspard

Hi,

I don't think that your proposition will be safer. I think of many cases
where data don't go in the database: informations in the session, data
retrieved from API, files, ActiveResource, submitting invalid inputs to
a form, params from URL...

And it has some nasty effects when the output is not XML/HTML (text
emails, PDF). For example, the characters & < and > are transformed to
&amp; &lt; and &gt;.

If your app really benefits from this performance boost, you should
consider using a plugin like xssterminate instead of the default Rails
filtering.

Michael Koziarski wrote:

The label_tag doesn't escape its input, but returns an html_safe string.
Is it the expected behaviour ?

No

OK, I've opened a ticket:
https://rails.lighthouseapp.com/projects/8994-ruby-on-rails/tickets/3449-label-tag-doesnt-escape-its-input.

:filter_html isn't enough here some css properties (-moz-binding f.ex)
are an attack vector too. You must use sanitize on the resulting
output if you want to be sure your code will be save.

OK, it makes sense. But, is there a good reason that the textilize
helper doesn't santize its output by default? I don't see why a
developer wants to call textilize without sanitizing it, since the
output will be escaped else.

Can you open a new ticket for your
case and assign it to me with a tag of xss?

Done:
https://rails.lighthouseapp.com/projects/8994-ruby-on-rails/tickets/3448-button_to-does-not-return-an-html-safe-string.

Thanks for your answer.

OK, I've opened a ticket:
https://rails.lighthouseapp.com/projects/8994-ruby-on-rails/tickets/3449-label-tag-doesnt-escape-its-input.

Thanks.

OK, it makes sense. But, is there a good reason that the textilize
helper doesn't santize its output by default? I don't see why a
developer wants to call textilize without sanitizing it, since the
output will be escaped else.

sanitize will strip some content that users may actually be expecting,
so it's not safe to apply it by default. In reality textilize is a
great case where you're probably better off caching the results of the
transformation like:

after_save :transform_content

def transform_content
  self.content_html =
HTML::WhiteListSanitizer.new.sanitize(RedCloth.new(content).to_html)
end

Then just adding a safe accessor for it:

def content_html
  read_attribute(:content_html).html_safe!
end

Then in your views you just need

<%= @post.content_html %>