I'd start with using bound variables first, though. They're always
beneficial and are a natural stepping stone to prepared statements
which can be added later for those queries which may benefit (many do
not).
database) bind variables. For a while now I've had a working (passes all tests) implementation of AR that uses bind variables for inserts and updates, those being by far the easiest to tweak. Given how widely spread throughout the codebase the selects are, I haven't had the time to think through a reasonable approach there.
Though I like what was suggested (by you I think) back in Feb, around the idea of SqlString < String. Has anyone looked into that further?
Are you attending RailsConf this week?
Perhaps a group of us could get together to dig into this?
I like the gist of your idea and approach. Thanks for working on it.
I'd start with using bound variables first, though. They're always
beneficial and are a natural stepping stone to prepared statements
which can be added later for those queries which may benefit (many do
not).
Thanks. Though I don't exactly understand what you mean - aren't
prepared statements and bounded variables the same thing? That is, is
it not so that one cannot exist without the other?
My patch uses both prepared statements and bounded variables, in case
that wasn't clear.
Sorry I didn't comment on your earlier postgres patch: it includes a
lot of superfluous style edits and untested changes, so I didn't
review further. I salute your effort but it could use a lot of cleanup
before committing.
No need to apologize. And I'd like to add a minor correction: the
PostgreSQL patch isn't mine, I'm merely trying to get attention on
that patch, because it has a lot of fixes that I depend on.
I have no experience with Oracle, but according to some blogs, such as
this one:
Rails performs badly because of the lack of use of prepared
statements. Here is a quote from Greg Luck:
"We have two production applications running on Ruby. And how is it.
Well, despite being perhaps no more than 5% of the functionality of
our applications, Ruby on Rails is the number one consumer of Oracle
CPU and logical gets. Why? Rails does not support prepared statements,
so Oracle has to reparse every time."
And I forgot to mention this last time, but there's one more reason
for using prepared statements/bounded variables: binary support in
PostgreSQL. I have a Rails app which uses PostgreSQL, and its
Person.password_password column was of the type bytea. It contains a
raw hash of the user's password. When a user logs in, the app
performed the following query:
Person.find_by_username_and_password_hash(username,
calculate_hash(password))
This will send a query to PostgreSQL, and this query will contain a
binary string. However, in PostgreSQL, binary strings must be escaped
with escape_bytea() - the regular string escaping function will not
do. Because the binary string isn't escaped properly, the whole
database connection breaks (it throws an exception, something about
"character not allowed in Unicode mode" or something).
One may wonder why the binary string isn't escaped correctly. The
reason is this:
1. It seems impossible in Ruby to determine whether a String contains
binary data without scanning through the entire String (expensive), so
the current PostgreSQL adapter assumes that strings are text strings,
and escapes them as text strings, unless it knows that this string
belongs to a binary column.
2. During ActiveRecord::Base.save, ActiveRecord will now which
attributes belong to which columns, so saving records that contain
binary data is no problem. The string escaper will have knowledge
about which argument belongs to which column.
3. However, things go wrong during ActiveRecord::Base.find. In there,
it is impossible to find out which arguments belong to which table
columns.
By using prepared statements/bounded variables, we can eliminate this
entire problem. Strings will not be escaped, but be directly bounded
to a prepared statements.
I'd have to disagree with you there. Rails actually performs very well
with Oracle. A change made a while back to default to "similar"
cursor_sharing resulted in much of the benefit of bind variables.
"performs badly" is definitely too strong.
I have no experience with Oracle, but according to some blogs, such as
this one:
Uncommented Bytes - Jeff Sheets: Major Enterprise Ruby on Rails Issues?
Rails performs badly because of the lack of use of prepared
statements. Here is a quote from Greg Luck:
"We have two production applications running on Ruby. And how is it.
Well, despite being perhaps no more than 5% of the functionality of
our applications, Ruby on Rails is the number one consumer of Oracle
CPU and logical gets. Why? Rails does not support prepared statements,
so Oracle has to reparse every time."
This was before Rails 1.2 which introduced the cursor_sharing=similar session parameter. It's oracle-specific and makes the database rewrite the queries on the fly to use bind variables instead of literal values.
Btw even before 1.2 you may had used that parameter (only database-wide instead of on a single application).
In any case using real bind variable would represent a further optimization step, as it'd allow a fine tuning of which bind variables vs literal values use in each query.
The former refers to your intention to drop prepared statements after each query. The latter is what would happen if you didn’t cleanup these prepared statements - you’d suddenly have hundreds of them in the database. I think databases would blindly allow it to grow to unreasonable levels - can they do otherwise? Prepared statements were designed to be made manually; I don’t think any of us can predict what happens when you have an insane number of them active, eg. would it be a hog, would it take up memory, etc.
Hongli Lai wrote:
In any case using real bind variable would represent a further optimization step, as it'd allow a fine tuning of which bind variables vs literal values use in each query.
Agreed. In my tests the _client_ performance impact of using real bind variables instead of relying on cursor_sharing of similar was basically non-existent, but there was some (small, but still some) server improvement. Plus some Oracle query plans worked better (ie., required fewer hints) when constants were left as constants.
Jeremy Kemper wrote:
> I'd start with using bound variables first, though. They're always
> beneficial and are a natural stepping stone to prepared statements
> which can be added later for those queries which may benefit (many do
> not).
From my tests w/ Oracle, all the benefit was in the use of (real, aka
database) bind variables. For a while now I've had a working (passes all
tests) implementation of AR that uses bind variables for inserts and
updates, those being by far the easiest to tweak. Given how widely
spread throughout the codebase the selects are, I haven't had the time
to think through a reasonable approach there.
Though I like what was suggested (by you I think) back in Feb, around
the idea of SqlString < String. Has anyone looked into that further?
I wish
> Are you attending RailsConf this week?
Perhaps a group of us could get together to dig into this?
That'd be awesome. I'll be at the hackfest at the Jupiter hotel the
nights of the 17th-19th.
It's at 800 E Burnside St, about a half mile from the convention center.
Though I like what was suggested (by you I think) back in Feb, around
the idea of SqlString < String. Has anyone looked into that further?
I wish
I actually started this morning to look into it
Are you attending RailsConf this week?
Perhaps a group of us could get together to dig into this?
That'd be awesome. I'll be at the hackfest at the Jupiter hotel the
nights of the 17th-19th.
It's at 800 E Burnside St, about a half mile from the convention center.
aahh I won't be at the conf... any way to meet in a virtual space? perhaps the #railsconf irc channel?
I tested this with MySQL. I can't have unlimited number of prepared
statements. After a certain threshold, queries will fail.
My MySQLAdapter implementation caches up to 32 prepared statements
(this number can be changed). If the cache is full, the least recently
used one will be removed. Though, according to benchmarks, caching
prepared statements doesn't improve performance at all on MySQL.
I tested this with MySQL. I can’t have unlimited number of prepared
statements. After a certain threshold, queries will fail.
It would be good to know what
is this treshold.
… according to benchmarks, caching
prepared statements doesn’t improve performance at all on MySQL.
It is possible that there is a performance boost on raw queries, but that it’s not visible because the ActiveRecord object instantiation time is an order of magnitude higher.
Also, simple queries against a database that has a miniscule amount of records don’t benefit much from having a cached execution plan. You could first benchmark several hundred thousand INSERTs, then UPDATE those records, then query those tables with SELECTs with as much conditions and joins you could throw in. Don’t forget the database indexes which play an important role in laying out the execution plan.
For some reason I'm unable to reproduce that problem now. Perhaps I
was mistaken.
It is possible that there is a performance boost on raw queries, but that
it's not visible because the ActiveRecord object instantiation time is an
order of magnitude higher.
Also, simple queries against a database that has a miniscule amount of
records don't benefit much from having a cached execution plan. You could
first benchmark several hundred thousand INSERTs, then UPDATE those records,
then query those tables with SELECTs with as much conditions and joins you
could throw in. Don't forget the database indexes which play an important
role in laying out the execution plan.
I've searched Google for benchmarks on prepared-vs-unprepared
statements, and to my surprise I couldn't find any. I'll try to setup
a good benchmark.
Thank you so much for working on this feature. Prepared statement/
bind variable support is critical for Oracle projects, especially in
the (evil) enterprise world. I know that Rails isn't targeted for
large enterprise apps, but that doesn't mean it isn't targeted at
_enterprises_. Lots of small and medium-sized internal apps would
benefit from Rails, but interoperability/ROI may dictate Oracle as the
database. Enterprise DBAs may (rightly, IMHO) scoff at a framework
that lacks prepared statement support.
Very interesting to see you working on this project. I work a lot for
Enterprises and it makes a lot of difference when it comes to Oracle
with prepared statements and data binding. In case of loops etc. we
had find the performance difference to be in multiples.
I was just wondering how far this development has reached and when I
would be able to use it. Planning a large rails project around Oracle
soon. Good thing I know that the existing rails work and in future we
would be able to get much better performance.
Good to see someone appreciates my work. MySQL support is probably
done. I was still working on PostgreSQL support. There's no Oracle
support though as I don't have access to an Oracle database, but
modifying the Oracle adapter should be fairly trivial.
The work is currently on hold as I'm busy with other things, but I
intent to resume this work in the near future.
> Very interesting to see you working on this project. I work a lot for
> Enterprises and it makes a lot of difference when it comes to Oracle
> with prepared statements and data binding. In case of loops etc. we
> had find the performance difference to be in multiples.
> I was just wondering how far this development has reached and when I
> would be able to use it. Planning a large rails project around Oracle
> soon. Good thing I know that the existing rails work and in future we
> would be able to get much better performance.
Good to see someone appreciates my work. MySQL support is probably
done. I was still working on PostgreSQL support. There's no Oracle
support though as I don't have access to an Oracle database, but
modifying the Oracle adapter should be fairly trivial.
The work is currently on hold as I'm busy with other things, but I
intent to resume this work in the near future.
Did I hear you live in the Enschede region? I do too and would
like to offer my help to implement the PostgreSQL side of things. Send
me a note if you'd like to meet up sometime and I'd gladly share
drinks and thoughts.