Should a blank string be html_safe?

Just trying to implement a simple helper over the past few days had me
really confused.

messages = ''
messages << content_tag(:p, 'dave')
#=> &lt;p&gt;dave&lt;/p&gt;

Eventually I realised the original empty string was not html_safe:

message = ''.html_safe
message << content_tag(:p, 'dave')
#=> <p>dave</p>

Is this intentional behaviour?

I see how you got confused, but this is intentional. All strings start out not html_safe, since there’s no way of telling whether they came from the author or from user input. I don’t agree that Rails should special-case blank strings as html_safe, since I don’t really think the way you’re building content here should be encouraged. Depending on your helper as a whole, there must be better ways.

Also, what if you’re appending user input instead of just content tags:

query = ""

query << params[:query]

query << content_tag(…)

If blank strings were safe to begin with, and users grew accustomed to that, doing this would suddenly expose you to XSS via GET/POST params.

Remember you can use raw to output unsafe strings without escaping them.

From your example

query = "".html_safe

query << params[:query] # This WOULD be escaped. The << operator is overridden to check whether what is being appended is html_safe, and to escape it if it isn’t, so the result stays html_safe.

query << content_tag(…)

query # Is still html_safe
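
To make the escaping-on-append behaviour concrete without needing Rails loaded, here is a minimal pure-Ruby sketch. ToySafeBuffer and its ESCAPES table are hypothetical names made up for illustration; this is not Rails's actual implementation, only a toy model of the idea that << escapes anything not already marked safe.

```ruby
# Hypothetical toy class for illustration; NOT Rails's real safe-string class.
class ToySafeBuffer < String
  ESCAPES = {
    "&" => "&amp;", "<" => "&lt;", ">" => "&gt;",
    '"' => "&quot;", "'" => "&#39;"
  }.freeze

  def html_safe?
    true
  end

  # Append +other+, escaping it first unless it is already marked safe.
  def <<(other)
    safe = other.respond_to?(:html_safe?) && other.html_safe?
    super(safe ? other : other.to_s.gsub(/[&<>"']/, ESCAPES))
  end
end

query = ToySafeBuffer.new("")
query << "<script>alert(1)</script>"        # stands in for params[:query]
query << ToySafeBuffer.new("<p>dave</p>")   # stands in for content_tag output

puts query
# prints: &lt;script&gt;alert(1)&lt;/script&gt;<p>dave</p>
```

The user-supplied piece comes out escaped while the piece already marked safe passes through untouched, and the buffer itself still reports html_safe? as true.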

Ah, then my example was totally mistaken. Thank you. However, I still stand behind consistency, and not special-casing a practice that I don’t find optimal for widespread use.

Mislav’s example was "" without .html_safe.

There’s really no obvious way to make "" become html_safe without modifying Ruby. What about:

x = ""

y = "#{x}#{safe_string}"

And even in the case of:

x = ""

x << safe_string

We’d have to override every single << in the system (a serious performance problem) to achieve this.

In the end, the rule is simple and consistent. Direct instances of String are never html_safe. This means that concatenating safe Strings onto a plain String results in an unsafe String.
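
The two cases above can be checked in plain Ruby. This is only a sketch: ToySafeString is a hypothetical stand-in for a safe string type, and the html_safe? defined on String here is a simplified assumption for the demo rather than Rails's own code. The point is that both << on a plain String and interpolation hand back a plain String, so the safe marker is lost.

```ruby
# Hypothetical stand-in for a "safe" string type; not Rails code.
class ToySafeString < String
  def html_safe?
    true
  end
end

# Simplified assumption for this demo: plain Strings always report unsafe.
class String
  def html_safe?
    false
  end
end

safe_string = ToySafeString.new("<p>dave</p>")

x = ""
x << safe_string               # String#<< mutates x, which stays a plain String

y = "#{''}#{safe_string}"      # interpolation always builds a plain String

puts safe_string.html_safe?    # prints: true
puts x.html_safe?              # prints: false
puts y.html_safe?              # prints: false
```

Neither result carries the safe flag, even though the only content in each came from a safe string.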

Yehuda Katz
Architect | Engine Yard
(ph) 718.877.1325

Thanks for the clarification, guys.

RobL