Storing URL in database -- h() or sanitize() ?

Am I crazy?

It seems only smart to escape fields before they're put into the
database if that is possible. If somebody injects SQL in their url or
first name where better to escape it than in the model before_save ?
Even attr_protected assumes that code is written to fix injection and
so on.

I ended up with this really irritating code:

    include ActionView::Helpers::SanitizeHelper
    HTML_ESCAPE = { '&' => '&amp;', '"' => '&quot;', '>' => '&gt;', '<' => '&lt;' }
    def html_escape(s)
      s.to_s.gsub(/[&\"><]/) { |special| HTML_ESCAPE[special] }
    end

    def before_save
      for c in [:lastname, :firstname, :login, :email]
        self[c] = html_escape(self[c])
      end
      self[:homepage] = sanitize(self[:homepage])
    end

Several things:
1. Maybe I should have used strip_tags.
2. Is sanitize the right thing to use on a URL that will later be made
into a link on a web page?
3. Why the !@#$ do I have to search 4 or more places to find how to
reference this code. Escaping is not sensibly restricted to the view,
is it?
4. And wouldn't it be more sensible to be able to experiment with
these routines in the console?

This is one of those places where the Rails framework seems terribly
unprofessional. I have three books and several web sites to search.
And not one of them told me the incantation to make this work in the
model. I shouldn't have to be groveling around in the source code to
make simple things work.

I should be able to figure out correct code (at least the syntax) from
the documentation, or is that crazy?

F

If you're using ActiveRecord for your ORM, it takes care of quoting
all your fields whenever you create or update. Furthermore, if you
use the standard find or its derivative find_by_xxx methods and let
it do the substitution on the conditions, then it quotes those values as
well. Both of these provisions are intended to prevent SQL injection
attacks and are fairly well publicized in Rails texts.
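
To make the substitution point concrete, here is a minimal plain-Ruby sketch of what placeholder quoting accomplishes. The quote and bind helpers are hypothetical stand-ins written only for illustration; they are not ActiveRecord's code, and a real database adapter handles many more cases.

```ruby
# Hypothetical helper: wrap a value in single quotes and double any
# embedded single quote, the usual SQL string-literal rule.
def quote(value)
  "'" + value.to_s.gsub("'", "''") + "'"
end

# Hypothetical helper: replace each ? in the template with the next
# quoted value, in order.
def bind(template, *values)
  remaining = values.dup
  template.gsub("?") { quote(remaining.shift) }
end

malicious = "x' OR '1'='1"
sql = bind("SELECT * FROM users WHERE login = ?", malicious)
puts sql
```

The attempted injection ends up inside a single string literal instead of changing the shape of the query, which is why no extra escaping in before_save is needed for SQL safety.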

Similarly, the h (html_escape) method is usually one of the very first
ones you're introduced to when you meet any explanation of Rails
views. Its purpose, as the long name suggests, is to HTML-escape
anything captured from a form so that it can be rendered without
destroying the page. If you want to use it in a view you simply
say...

<%= h my_object.some_field_with_user_data %>

That field will be neatly html-escaped for you.
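
If you'd like to try this outside a Rails view (for example in irb), the standard library's ERB ships an equivalent h in ERB::Util. The sketch below assumes nothing about Rails; the render_greeting method and its template are made up for the example.

```ruby
require 'erb'

# Hypothetical helper: render a one-line template, escaping the
# user-supplied name with ERB::Util.h (an alias of html_escape).
def render_greeting(name)
  template = '<p>Hello, <%= ERB::Util.h(name) %></p>'
  ERB.new(template).result(binding)
end

page = render_greeting('<b>Fred & "friends"</b>')
puts page
```

The markup in the name comes out as text (&lt;b&gt; and so on) rather than being interpreted by the browser.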

Fred Talpiot wrote:

Am I crazy?

It seems only smart to escape fields before they're put into the
database if that is possible. If somebody injects SQL in their url or
first name where better to escape it than in the model before_save ?
Even attr_protected assumes that code is written to fix injection and
so on.

Sanitizing/escaping data before it enters the database is a valid way of
handling things. The reason it's not usually recommended is that you may
want to do things with the data other than just display it to the user.
However, if the only thing you are going to do with the data is display
it on a webpage, escaping the HTML before putting it in the database may
work well. You do need to make sure that you aren't also escaping it
when it is displayed.
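
A quick sketch of that last caveat, using the standard library's CGI.escapeHTML. The variable names are only for illustration.

```ruby
require 'cgi'

raw = 'Tom & Jerry'

stored    = CGI.escapeHTML(raw)     # escaped once, before saving
displayed = CGI.escapeHTML(stored)  # escaped again when rendered

puts stored
puts displayed
```

After the second pass the ampersand has become &amp;amp;, so the browser would show the literal text "Tom &amp; Jerry" to the user instead of "Tom & Jerry".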

I ended up with this really irritating code:

    include ActionView::Helpers::SanitizeHelper
    HTML_ESCAPE = { '&' => '&amp;', '"' => '&quot;', '>' => '&gt;', '<' => '&lt;' }
    def html_escape(s)
      s.to_s.gsub(/[&\"><]/) { |special| HTML_ESCAPE[special] }
    end

    def before_save
      for c in [:lastname, :firstname, :login, :email]
        self[c] = html_escape(self[c])
      end
      self[:homepage] = sanitize(self[:homepage])
    end

You should probably use CGI for the html_escape[1]:

require 'cgi'
CGI.escapeHTML('<"&">') # => "&lt;&quot;&amp;&quot;&gt;"

Several things:
1. Maybe I should have used strip_tags.
2. Is sanitize the right thing to use on a URL that will later be made
into a link on a web page?
3. Why the !@#$ do I have to search 4 or more places to find how to
reference this code. Escaping is not sensibly restricted to the view,
is it?
4. And wouldn't it be more sensible to be able to experiment with
these routines in the console?

This is one of those places where the Rails framework seems terribly
unprofessional. I have three books and several web sites to search.
And not one of them told me the incantation to make this work in the
model. I shouldn't have to be groveling around in the source code to
make simple things work.

If you want to know how things actually work in Rails, the source is
generally the best documentation (assuming you know Ruby fairly well).
The reason you aren't finding much information/documentation for what
you want to do is that you aren't following standard Rails practice
(i.e. escape data on display, not on input).
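
For what it's worth, here is a rough plain-Ruby sketch of the escape-on-display approach applied to the homepage link. Everything in it is an assumption for illustration: the hash stands in for a model record, and safe_http_url? and homepage_link are hypothetical helpers, not Rails methods. The scheme check is one possible way to treat a stored URL, not something the thread settled on.

```ruby
require 'cgi'
require 'uri'

# Hypothetical helper: accept only absolute http/https URLs with a host.
# Other schemes (javascript:, data:) and malformed input are rejected.
def safe_http_url?(url)
  uri = URI.parse(url.to_s)
  uri.is_a?(URI::HTTP) && !uri.host.nil?
rescue URI::InvalidURIError
  false
end

# Hypothetical helper: values are stored raw and escaped only here,
# at the point where they are turned into HTML.
def homepage_link(user)
  name = CGI.escapeHTML(user[:firstname].to_s)
  url  = user[:homepage].to_s
  return name unless safe_http_url?(url)
  %(<a href="#{CGI.escapeHTML(url)}">#{name}</a>)
end

good = homepage_link({ firstname: 'Fred <b>', homepage: 'http://example.com/?a=1&b=2' })
bad  = homepage_link({ firstname: 'Eve', homepage: 'javascript:alert(1)' })
puts good
puts bad
```

The record keeps exactly what the user typed, so it stays usable for things other than display, and the escaping happens once, in the code that builds the markup.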

I should be able to figure out correct code (at least the syntax) from
the documentation, or is that crazy?

It's not crazy, just idealistic.

Jeremy

[1]
http://www.ruby-doc.org/stdlib/libdoc/cgi/rdoc/classes/CGI.html#M000094