Need ActiveRecord bind variable support; offering effort/patches

Hey, Josh Peek suggested I ping Rails Core about this.

I'm at the point that I really need true bind variable support in AR.
Currently we make heavy use of Oracle and PostgreSQL. Both databases,
especially Oracle, get big performance gains from bind variables.
This is important enough that I can devote time and energy to getting
it implemented in AR so that it works across multiple adapters.

I've looked thru the AR code and it seems like much of it boils down
to sanitize_sql and friends. The reasons Josh suggested that I ping
Rails Core are:

1) Make sure my approach to a potential patch sounds good.

2) Make sure there aren't any pending AR changes coming up that would
cause a huge refactor of any patches.

Regarding #1, here are my thoughts:

Start out by reorganizing AR so sanitize_sql is an adapter method. It
could live in AbstractAdapter so that existing adapters that rely on
the current behavior wouldn't have to change. This would make
sanitize_sql and friends a bit of an API, so maybe there would be a
minor refactor beyond just moving the methods, perhaps into a
SanitizeSql class similar to the PrimaryKey class that just hit
github.

Then, run the tests and ensure that moving sanitize_sql maintains
existing adapter functionality. For people using
AR::Base.sanitize_sql, we can follow the same pattern as other adapter
methods to make sanitize_sql available in AR::Base but have it be
taken from the adapter (like the existing column/table methods).

Next, take an adapter, for example OracleEnhanced, and modify it so
that it accepts the unmodified SQL and list of bind parameters from
the AR Base layer. For standard :conditions this should be pretty easy
looking at AR. The slightly more complex/sticky cases of lazy fetch
and :include will need a bit of refactoring, but nothing that seems
crazy. Basically a string that's concatenated repeatedly and an array
of bind << params works in most cases. I've talked to Raimonds a bit
about this, so seems like OracleEnhanced could be a good guinea pig.

This approach is basically the same approach DataMapper has taken with
their DataObjects layer. AR already has much of the same structure in
place, just named AR Base and AR Adapters. We can't jump ship to
using DM because it doesn't support DB views/sprocs well, and we have
tons and tons of AR code too.

So, I guess my questions are:

1) Thoughts on this approach?

2) Any AR changes in the works that would affect this?

Thanks,
Nate Wiger
PlayStation

1) Thoughts on this approach?

Your approach is exactly what we'd have to do. It could become
slightly tricky because (as you've mentioned) the existing code
'instantly'

2) Any AR changes in the works that would affect this?

Yes, but in a postive way. Miloops' ARel branch at least centralizes
all the query generation, hopefully you can leverage some of the
tidying he's done to get a head start on your work.

> 2) Any AR changes in the works that would affect this?

Yes, but in a postive way. Miloops' ARel branch at least centralizes
all the query generation, hopefully you can leverage some of the
tidying he's done to get a head start on your work.

Cool, I spoke with miloops via email about this and he was
enthusiastic like me.

What should the next steps be? Should I start putting bind variable
support into ARel, and then when that's all merged upstream in to AR,
AR will get bind variables?

Or should I wait until after the merge and look at it then? (I guess
my real question is: Is ARel going into AR "as-is" or will it be
changed/refactored significantly?)

-Nate

As-is. ARel will use bind vars throughout and AR can take advantage.

Either way, we can attack the problem from both ends. Check out
miloops' rails fork (which will be merging to master in the near
future) and take a look at how relations are built and composed.

Best,
jeremy

I discussed this some with dbussink, who maintains DO. One of the
problems is that you’d have to be able to convert “?” into the native
format, while avoiding ? in Strings. So for instance:

“SELECT * from foo where name=”"?"" and id = ?"

This is a simple case–it can be a lot trickier than that. One possible
solution is to ban the use of literal strings where bind variables are
also used. Thoughts?

– Yehuda

Jeremy Kemper wrote:

I discussed this some with dbussink, who maintains DO. One of the
problems is that you'd have to be able to convert "?" into the native
format, while avoiding ? in Strings. So for instance:

"SELECT * from foo where name=""?"" and id = ?"

I could be missing your point, but isn't that problem already
addressed in AR? AR already parses ?'s and :bind's itself and makes
those decisions currently, since it emulates bind variables. So that
functionality (and perhaps edge case bugs) should already be in the
current "id = ?" => "id = '1'" AR code.

So, we should be able to take the current sanitize_sql, and change it
slightly so that when AR makes the pass to decide which ? would
normally be replaced with 'quoted values', instead it would be
replaced with :b1, @x, or whatever that driver decides.

Just FYI, your example actually fails in AR currently:

p = Player.find_by_sql(['SELECT * from players where username=""?"" and id = ?', 'hey', 1])

ActiveRecord::StatementInvalid: OCIError: ORA-01741: illegal zero-
length identifier: SELECT * from players where username=""'hey'"" and
id = 1

So it's not functionality that anyone is relying on as-is.

-Nate

Understood. It’s part of why a more general purpose solution is
difficult (for DO). But I think that it’s reasonable to require the
usage of String literals or bind variables, but not both.

– Yehuda

Nate W wrote:

By "string literals" do you mean the question mark - ?

And by "bind variables" do you mean things starting with a colon
- :var1

If so, the problem with that distinction is that ? is a perfectly
valid bind variable for Oracle. The OCI driver natively supports the
AR syntax of ["select * from users where id = ? and name = ?, 1",
'Nate']

In fact in the old school Perl DBI that statement would be passed
directly to OCI. Any ? is bound in order left-to-right by Oracle

So a ? and a :var have to be treated the same way.

If I totally missed your point please let me know.

-Nate

Understood. It's part of why a more general purpose solution is difficult
(for DO). But I think that it's reasonable to require the usage of String
literals *or* bind variables, but not both.

It's pretty easy to work with both, you simply don't do any
interpolation if you've not been told there are bind variables

:conditions=>"description = 'Ruby is awesome?' # no interpolation
:conditions=>["description = ?", params[:whatever]]

A bare string with no associated 'things to bind' should obviously not
be 'bound'.