Page title is sanitized 2 times (or what is happening!?)

collimarco · September 6, 2023, 9:24pm

I found a strange behavior and I wonder if it’s a Rails bug.

View:

<h1><%= title "Let's go" %></h1>

Helper:

module ApplicationHelper
  def title(t)
    content_for :title, t
    t
  end
end

Layout:

<!DOCTYPE html>
<html lang="en">
<head>
  <title><%= content_for?(:title) ? "#{yield(:title)} - MyWebsite" : "MyWebsite" %></title>
  ...

The problem with this code, which apparently is correct, is that the resulting title displayed to the user is Let's go.

One could simply add .html_safe, but it seems a security risk if the title may contain user-generated content.

Is this a Rails bug? What is the correct (and safe) solution?

collimarco · September 7, 2023, 7:21am

I found a solution:

yield(:title) + " - MyWebsite"

instead of using "#{yield(:title)} - MyWebsite"

walterdavis · September 7, 2023, 12:52pm

There are a few other ways to treat this, too. First, what you are seeing is the basic “strings aren’t safe by default” behavior that has been standard since early on in Rails (v 2.x, IIRC). As you noted, this is for your protection from UGC attacks, and also just to make you stop and think.

Your solution will fail if you are in a setting where strings are immutable, though, like in a model, where you probably have the # frozen_string_literal: true declaration at the top of your file. There, you’ll want to do something like this:

[yield(:title), 'MyWebsite'].join(' - ')

or

"#{yield(:title)} - MyWebsite"

The former deals with the possibility that title is not set, too, and doesn’t leave you with a hanging hyphen.

In place of yield, I have often used content_for[:some_key] as well.

Finally, I often recommend wrapping sanitize around such a concatenated string. That gives you a default set of code-stripping which you can further refine, say to remove inline style marks and other nonsense that you don’t want to see in your page. ActionView::Helpers::SanitizeHelper

Walter

hakunin · September 7, 2023, 1:44pm

Your solution will fail if you are in a setting where strings are immutable

+ doesn’t mutate strings, just creates a new one. It should still work.

I often recommend wrapping sanitize around such a concatenated string.

I would argue it’s a little heavy-handed for situations where you know what you’re printing (i.e. programmer supplies the string). Loofah sanitization is not free.

walterdavis · September 7, 2023, 2:00pm

Both good points. I must have been mis-remembering +, maybe I was thinking of <<. Sanitize is an acceptable cost for places where you can’t guarantee the pedigree of the string being assigned, but you’re right. When it’s your own work and a known safe value, then it would be overkill. Using raw() or html_safe or even the ERB shortcut of <%== would be fine and faster.

Walter

collimarco · September 7, 2023, 2:25pm

This post is about the fact that sanitization was happening 2 times, producing the wrong output.

My new solution still has normal erb sanitization and produces the right result. This is not true for the code in the original post.

hakunin · September 7, 2023, 2:59pm

To address the double escaping, it’s quite interesting. Not sure why this is happening.

irb(main):001:0> a = 'a'.html_safe
=> "a"
irb(main):002:0> a.html_safe?
=> true
irb(main):003:0> (a + 'b').html_safe?
=> true
irb(main):004:0> ("#{a}b").html_safe?
=> false

But that explains why it double-escaped, and why the solution worked. Need to investigate why the difference between + and #{}.

walterdavis · September 7, 2023, 3:09pm

Thanks. I suppose I glossed right over that. Too much coffee! By default, ERB with <%= will sanitize any string that is not already marked as safe. A normal string is not safe, no matter where it came from, until it has been marked safe by one mechanism or another. What I should have said more clearly was that inside an ERB output tag, any strings that are not already marked as safe (which is to say, are not instances of the SafeBuffer class) will be escaped to protect you. You can find out which is which by this trick:

<%= 'hello world'.class %>
<%= tag.div('').class %>
<%= render_to_string('some_partial').class %>

Some examples (imagine all are inside an ERB context):

SafeBuffer.new('hello world') # what it says on the tin
raw('hello world &amp; stuff') # ditto
'hello world <script>alert("howdy!")</script>'.html_safe # ditto
tag.div 'hello world' # the div tag itself is safe, but its content is not!
tag.div @post.body # the data will be escaped, because it's definitely suspect
SafeBuffer.new('hello') + 'world' # is the whole thing safe? I don't remember...
'wakka' + SafeBuffer.new('wakka') # pretty sure this product is no longer safe

The issue is that until it is marked safe by you, even if it’s a hard-coded string, and completely safe, Rails assumes it is not safe. Any HTML or entiies will be escaped (or double-escaped, as you noted).

If you disable this practice using <%== instead of <%=, then sure, anything already encoded will pass through without further safety practices.

The reason why yield worked for you is not because the resulting string is marked safe, but because you literally are stepping outside of the ERB context when you use that method. This rock is what Rails partials, templates, and layouts are built upon. Each one opens a new block “context” like a wormhole into the surrounding ERB context, and so any escaping that your current template/partial/layout is already engaged in is suspended until that block closes. Each part is assumed to know what it is doing. Use this cheat code with your eyes wide open!

Walter

walterdavis · September 7, 2023, 3:12pm

That’s because the surrounding double-quotes created a new string, which is not trustworthy (yet). It would also need to be blessed.

hakunin · September 7, 2023, 3:22pm

Both + and interpolation created a new string, so that’s not it. The difference I think stems from SafeBuffer having special handling for + that it cannot have for interpolation. I wonder if Ruby maintainers ever explored making #{} call + under the hood. What would be the implications.

And what’s funny is that + should behave like #{}. After +, it seems the result should become unsafe. Rails has control over this, and deliberately changes it to safe. I wonder why.

hakunin · September 7, 2023, 3:37pm

Okay, it’s starting to make sense now.

If you start out with a SafeBuffer, then anything you concatenate with it using one of its methods like + or concat (regardless whether a new object is produced or not) gets auto-escaped too.

If you inject a SafeBuffer into an interpolation, then ruby calls to_s on it, and loses its safety, you’re going back to plain strings. However, SafeBuffer to_s is already an escaped version of the string that went into safe buffer. So now a resulting plain string would get escaped, since it’s unsafe. The already-escaped substring is going to get escaped a second time, along with it.

And of course, content_for produces a SafeBuffer, not a plain string, so you use its method +, which leaves the left side intact, and escapes the right side only.

Topic		Replies	Views
concatenation of a html_safe string and a html_UNsafe string rubyonrails-core	4	431	March 16, 2011
Let's use <%== %> instead of <%= raw() %> rubyonrails-core patch	9	2555	November 8, 2010
Trouble with titleize rubyonrails-talk	4	147	March 7, 2011
rails 3.0.4 broke yield :javascript ? rubyonrails-core	16	482	February 17, 2011
On-By-Default XSS escaping rubyonrails-core	8	247	August 9, 2009

Page title is sanitized 2 times (or what is happening!?)

Related topics

More Resources