Using Rails 3 (git master) and Ruby 1.9.2-head I noticed I have to
treat EVERY SINGLE STRING in my app, even things like
link_to " bla", path
with raw(). This is crazy! It is a FIXED string! I understand it when
variables are concerned, but this is taking it a little too far. One
might even say the escaping only is necessary if STRING variables are
introduced, so including number-variables in a(n otherwise fixed)
string should not trigger the need to use raw().
I only just started but the amount of "raw()" I have to insert into my
app seems excessive.
Unfortunately, there's no way to know if a string is a literal or not,
short of parsing the Ruby, and considering the sheer number of string
literals in Ruby, that seems very infeasible.
/Jonas
P.S.: off the top of my head:
"word"
'word'
%(word)
%[word]
%{word}
%q(word)
%q[word]
%q{word}
%Q(word)
%Q[word]
%Q{word}
<<STRING
word
STRING
<<-STRING
word
STRING
etc...
Making the switch to HTML-safety is quite a pain. The grass is greener
on the other side, though!
You mark just a handful of strings as <%= raw ... %> instead of almost
every string as <%= h ... %> -- less work down the line, plus no
lingering XSS worries.
What is the reason that #tainted? wouldn't suffice? Basically, all
tainted strings are HTML escaped; if you wish to circumvent this, mark
your strings as untainted with String#untaint. It seems to me that
#html_safe tries to replicate functionality which is already present
(and more powerful) within Ruby itself.
If you are certain that your string is safe, then you can mark it as html_safe by doing this:
<%= link_to " bla".html_safe, path %>
Prem, yes. The original poster's concern is that literal strings are
already known to be HTML-safe, so having to state it again is annoying
and not DRY.
What is the reason that #tainted? wouldn't suffice? Basically, all
tainted strings are HTML escaped; if you wish to circumvent this, mark
your strings as untainted with String#untaint. It seems to me that
#html_safe tries to replicate functionality which is already present
(and more powerful) within Ruby itself.
Should I cook up a patch for this?
The problem with tainted is that it's a single flag for everybody to
share. It may have different meanings in different libraries.
Say, a database driver checks taint to see whether strings should be
SQL-escaped.
I think the benefits of using tainting in the common case, 99% of
usage, far outweigh these unlikelihoods though.
That's a valid concern, but I think trying it out is the best approach
-- then we'll se if any problems arise. Having a #html_safe wrapper
around the tainting also yields some flexibility.
That seems reasonable. Perhaps the tainting could be moved higher up
the hierarchy, the ActiveRecord itself? All getters would then return
tainted strings. Would that not suffice?
The main issue with taint is that it’s a blacklist operation. This means that we’d be relying on all libraries that could retrieve persisted user data to properly taint their Strings. In contrast, the html_safe approach assumes that all Strings are unsafe, until they are marked safe.
Rails internally marks its own Strings as safe, which is why you don’t need to mark form_for, link_to, etc. as html_safe.
That seems reasonable. Perhaps the tainting could be moved higher up
the hierarchy, the ActiveRecord itself? All getters would then return
tainted strings. Would that not suffice?
If we used tainting then all it would take is one library to not
return tainted strings, and your application is completely hosed.
It's not just a question of ActiveRecord either. Apps read strings
from memcached, redis, http (dozens of libraries here), one bug here
and you're hosed. The risk outweighs the slim rewards.
Tainting was an alluring option initially, but the more you drill down
into it the more of a red-herring it becomes.