I have found a Rails plugin which automatically strips XSS from
models when they are saved. This is great. My question is which is
the better choice: 1) use a plugin like this, or 2) store the content
in the DB as entered and escape it later in the view using the h
method or sanitize. I ask because the latest Railscast (204) says
Rails 3 automatically escapes HTML output. But why not use this type
of plugin so that malicious user input never enters the database at
all? Please share your thoughts.


I agree; it has never made sense to me to have to sanitize the output.

Escaping everything as you display it does have the benefit of letting you see exactly what is in the DB. You can also change which tags are allowed after the fact by using sanitize() instead of h().

The downside is that you have to escape it every time you display the page. Granted, this isn't a heavy operation, but it does happen repeatedly. It seems to me that if you are always going to have to use h() anyway, things should just be sanitized before insertion into the DB so you can forgo the h().

Just my opinion. I still use h() and sanitize().

Two problems with that:

The first and smallest is an annoyance.
If I want to save my blog in a DB, and I write a post that has the content:
  "Never use '<' in your HTML; use '&lt;' instead"
...this will get written to the DB as:
  "Never use '&lt;' in your HTML; use '&lt;' instead"
...which then gets encoded with h() in a view as:
  "Never use '&amp;lt;' in your HTML; use '&amp;lt;' instead"
...or, if just output straight to the view "because it was sanitized
before putting it in the DB", is displayed as:
  "Never use '<' in your HTML; use '<' instead"

You'll have seen this happen on *loads* of bulletin boards and
feedback comments all over the web.

Adjusting the user's input before storing it in the DB is "bad",
because you can never reverse it without jumping through all sorts of
unreliable hoops. Just store what they typed, and whenever you deal
with it, assume it's highly toxic.
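
To make the blog-post example concrete, here is a rough Ruby sketch. The naive_sanitize helper is a hypothetical stand-in for a save-time sanitizer (it is not the plugin's actual code), and the sample text drops the single quotes to keep the output easy to read:

```ruby
require 'cgi'

# Hypothetical save-time sanitizer: escapes bare angle brackets and
# leaves entities the user already typed untouched. An assumption for
# illustration only, not how any particular plugin works.
def naive_sanitize(text)
  text.gsub('<', '&lt;').gsub('>', '&gt;')
end

typed  = "Never use < in your HTML; use &lt; instead"
stored = naive_sanitize(typed)

# Escaping again at render time (what h() does) double-encodes.
rendered = CGI.escapeHTML(stored)

# Decoding the stored value does not give back what was typed:
# both occurrences come out as a bare '<'.
decoded = CGI.unescapeHTML(stored)

puts stored
puts rendered
puts decoded
puts decoded == typed
```

Once the two different inputs have collapsed into the same stored text, nothing in the DB says which one the user originally typed.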

The second problem is an arrogant presumption that the only place that
will ever use this user-supplied data is in the rendering of an HTML
page. But what happens when you're storing details, say of an order
placed, and the user enters their special delivery comments:
  "Please knock & wait for >5mins"
You store this as:
  "Please knock &amp; wait for &gt;5mins"
...because you *know* you're going to have to display it in a
confirmation page on the web site and you don't want to worry about
encoding it there every time, but you forget that you might want it
put into a PDF that's generated for the delivery driver, or use it in
a JS function on the web page, or include it in a field of a CSV
export. In each of these instances, you're going to have to decode it
back from the "safe" HTML-encoded version to the user input (I refer
you to my first point: that you cannot reliably do this :slight_smile:)
before encoding it however you need for your new use.

Life is much easier if you just store what they typed and deal with it
when you use it...
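
As a rough illustration of "deal with it when you use it", the sketch below keeps the delivery comment exactly as typed and encodes it separately for each destination. It uses only Ruby's standard library; ERB::Util.html_escape stands in for the h() view helper, and the variable names are made up for the example:

```ruby
require 'erb'
require 'json'

# Stored exactly as the user typed it.
raw = "Please knock & wait for >5mins"

# HTML confirmation page: escape at render time.
html_text = ERB::Util.html_escape(raw)

# JS on the page: encode as a JSON string literal instead.
# (Embedding this inside an inline <script> tag needs further care.)
js_literal = raw.to_json

# PDF for the driver, or any other plain-text use: no HTML encoding.
pdf_text = raw

puts html_text
puts js_literal
puts pdf_text
```

Each output gets the encoding its format needs, and none of them has to undo an encoding that was applied for a different format.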

And again, going through the plugin docs, I found an example like:

class Message < ActiveRecord::Base
  xss_terminate :except => [ :body ]
end

This means we can exempt some fields from sanitization. So isn't
that sufficient? Any other thoughts?


So instead of messing with *all* of the user-supplied input, you only
mess with *some* of it? And that won't end up in confusion for the
developers trying to re-render the DB content to PDF, etc., when some
of the data renders fine, and some has to be "decoded" back to plain
text (but doesn't go back to *exactly* what the user typed)?

I didn't think I was ambiguous: fiddling with users' data before you
store it is going to end up in confusion and pain somewhere [1]. It's
perfectly easy to assume that all DB content is tainted, and treat it
appropriately for whatever purpose you want to put it to.

My 2p... YMMV :slight_smile:

[1] Of course, you need to "fiddle" with it to prevent SQL injection -
but the end result should be that the content in the DB is exactly
what the user typed even if they typed "Robert'); DROP TABLE


Excellent points.