Prepared SQL statements

Dear all great developers.

I wonder if there is any work being done or thoughts being shared regarding prepared SQL statements.

Currently I am in heavy need of tuning SQL INSERTS that insert large (3MB) images into blobs.

I am using rails 2.3.5 with postgres adapter and pg driver.

Any information would be appreciated.

Jarl

hi for what purpose u stored image in database?) in most cases, its bad choise. store at file system, and send it with super-puper fast nginx =)

Ivan Nastyukhin dieinzige@me.com

I don’t quite see how this relates to Ruby on Rails Core, unless you’re wanting to discuss a patch / issue for Rails core. This seems like an application support question. For Ruby on Rails assistance / discussion, please discuss on the rubyonrails-talk mailing list.

I agree with Ivan: Do not store images in the database. Files go on the filesystem.

Looking at the title, it seems he is suggesting the use of Prepared Statements in ActiveRecord, which seems pretty reasonable, even if storing images on database is not...

Rodrigo.

Jarl Friis <jarl@gavia.dk> writes:

Dear all great developers.

I wonder if there is any work being done or thoughts being shared regarding prepared SQL statements.

OK. Let that be it (thanks Rodrigo Rosenfeld Rosas). I am interested in any comments on work/thoughs regarding prepared statements. This was not intended as a support discussion on whether or not images in DB is a good choice or not.

If I had a patch that used exploited prepared statements (I don't currently have), would the rails community be interested?

Jarl

Here's some info on prepared statements in Postgres:

http://developer.postgresql.org/pgdocs/postgres/sql-prepare.html

I haven't used them often, but Sequel has them, and I recall on one occasion being able to shave about 15-20% off the run time of a job that had to do a few hundred thousand large inserts; this falls under the category of a "single session is being used to execute a large number of similar statements" that the doc describes.

While sometimes quite useful, I'm not sure where this would really fit into AR. AR is almost exclusively an ORM by design, while Sequel is a database manipulation library that also happens to provide an ORM. Sequel's ORM doesn't automatically use them last time I checked.

It would be interesting to see the overall performance impact if, for example, the typical "SELECT * FROM table WHERE id = ? LIMIT 1" that AR generates were prepared rather than just executed directly. I would guess that Postgres already has sufficient logic to cache query plans for something like this even without using prepared statements, but I'm not sure.

Regards,

Norman

Ok, I was curious so I took a look into it.

http://gist.github.com/468116

Unless there's some logical flaw in my benchmark code, it looks like prepared statements run about 50% faster overall for these types of queries.

"Gist has been deleted"

Konstantin

Ah, sorry - let me try again:

http://gist.github.com/468126

Hi Norman.

Thanks for comments.

Norman Clarke <norman@njclarke.com> writes:

Looking at the title, it seems he is suggesting the use of Prepared Statements in ActiveRecord, which seems pretty reasonable, even if storing images on database is not...

Here's some info on prepared statements in Postgres:

http://developer.postgresql.org/pgdocs/postgres/sql-prepare.html

Yes Postgres has it, almost every (maybe not SQLite) SQL databases on earth has the feature. Even the (ruby) pg driver for postgres supports the feature.

My point is the Rails adapters (at least not postgres) does not exploit it. Even the design of AR (ActiveRecord::Base) does not make it easy to make a new Rails DB adapter that exploits the feature. It requires som hacking on ActiveRecord(::Base) it self.

I am pretty sure that putting binary data in the database makes a huge performace difference when comparing prepared statements and direct exec() calls. The major difference is AR has to quote, that is, escape all the byte. Meaning handing over 4 times as many bytes (each non-ascii is represented by a backslash and the octal representation). After that it has to be passed to the DB engine (through the ruby driver) and the parsed by the DB parser.

If prepared statements where used, the binary data (in memory of the Rails app) could be passed as a binary type to the DB driver (using pg for PostgreSQL).

I am pretty confident that could improve performance. I'll get back if/when I have some measurements.

Ok, I was curious so I took a look into it.

http://gist.github.com/468116

Seems like this is the wrong link.

Jarl

$ ruby -e 'puts((17.639 - 11.099) / 17.639)' 0.37076931798854806

That's 37% faster for prepared (or direct is 59% slower)

-Rob

Thanks, that looks like the kind of logical flaw I was talking about. :slight_smile:

http://gist.github.com/468116

Unless there's some logical flaw in my benchmark code, it looks like prepared statements run about 50% faster overall for these types of queries.

There's a logical flaw in your benchmark code. The situation is a little more nuanced

And for postgresql it can actually be significantly worse when using prepared statements:

http://wiki.postgresql.org/wiki/FAQ#Why_is_my_query_much_slower_when_run_as_a_prepared_query.3F

However having said that, it's definitely something that could be investigated as a feature in arel, then something we used with auto generated methods like find_by_code_and_status. This has come up a bunch of times but it never really got dragged over the line because the benefits are marginal for a lot of use cases so people lose the motivation to do the hard & fiddly work of finishing up.

Yes, I guess, but I can't speak for the entire Rails community :slight_smile:

But I don't think prepared statements would be used by default on ActiveRecord, but this should be explicited unless setting some option for enabling its use in all cases.

Usually, it is better not to use prepared statements, unless some statement will be batch executed where we can have some performance gains, but that will not be always true. There is also support for the situation that you described regarding binary data.

One of the bad things of using prepared statements if for debugging queries. Under development environment, we can enable config to see the SQL generated queries already with all conversions performed by Rails. When using prepared statements, we would see an output like '["... where a=?", true]' instead of "... where a=1" or "... where a='t'", etc. So it is harder to guess how the driver is converting the objects to SQL values. Specially if you want to run the entire generated SQL in some tool...

In Sequel, it is possible to call a "sql" method to see the generated SQL from a Sequel object. I missed this in ActiveRecord for a long time. I guess this is possible for ActiveRecord in Rails 3 since it is using Arel which takes a similar approach to Sequel, but I'm not sure... Do someone know?

Rodrigo.

Yes (sort of). You can call #to_sql on an ActiveRecord::Relation to get the SQL it would most likely generate -- however, that breaks down when eager loading is involved. An excerpt from an article I wrote a while back (http://metautonomo.us/2010/05/11/activerecord-relation-vs-arel/):

If eager loading is a factor in your query, either due to explicit eager_loading values, orincludes values that are also used in your where conditions, then you’ll notice that the value returned by to_sql doesn’t reflect what actually gets run on the server:

   

In Sequel, it is possible to call a "sql" method to see the generated SQL from a Sequel object. I missed this in ActiveRecord for a long time. I guess this is possible for ActiveRecord in Rails 3 since it is using Arel which takes a similar approach to Sequel, but I'm not sure... Do someone know?

Yes (sort of). You can call #to_sql on an ActiveRecord::Relation to get the SQL it would most likely generate -- however, that breaks down when eager loading is involved. An excerpt from an article I wrote a while back (metautonomo.us):    

Hi Ernie, nice article, thanks. That remembers me that ActiveRecord still doesn't deal properly with database schemes, such as those present in PostgreSQL and Oracle (I guess).

For instance, this is probably not possible in ActiveRecord:

class SomeTable < ActiveRecord::Base      set_table 'some_scheme.some_table' end

If eager loading is a factor in your query, either due to explicit eager_loading values, orincludes values that are also used in your where conditions, then you�ll notice that the value returned by to_sql doesn�t reflect what actually gets run on the server:

...

Of course, given the nature of how eager loading and including works, you�d expect the query to differ when they�re involved. So to_sql is a useful tool to display what�s being generated in your queries � just be aware of its limitations and remember that it�s no substitute for checking the logs when confusion arises.     Is it difficult to add some method to ActiveRecord that displays exactly the SQL that will be run, or that would be run if eager loading was set? I mean, maybe some methods like "eager_sql" and "lazy_sql"? I don't know ActiveRecord's code, but I guess that sometime it will need to generate the full sql string, so the method should be already there... Is there any reasons why we can't just see the full sql statement without actually running the query? Sometimes it would even easy some test/spec writing or even make some of them to run faster...

Thanks for the article,

Rodrigo.

I don't think it would be terribly difficult, no -- it'd basically be doing arel.to_sql or a partial version of find_with_associations depending on the result of eager_loading? that looks something like this and sticking it in relation.rb (not tested):

def real_sql   if eager_loading?     including = (@eager_load_values + @includes_values).uniq     join_dependency = ActiveRecord::Associations::ClassMethods::JoinDependency.new(@klass, including, nil)     construct_relation_for_association_find(join_dependency).to_sql   else     arel.to_sql   end end

I think it's just been a matter of there not being a real use case requiring it to be in core. I could throw it into MetaWhere easily enough if enough people thought it'd be useful though. I'm trying to position it as a sort of "AR Query Interface 3.0 on steroids" anyway.

-Ernie

Ernie, maybe if the sql generation part was extracted in a separate method and called by find_with_associations, this could simplify understanding the code as well as allowing the generated sql to be used outside this method. For instance, someone could think it is useful to log the SQL queries in database for taking statistics and deciding which queries to invest effort to optimize based on frequency or just to help understanding what is happening to their relations.

Suppose you have some real slow query (takes up more than a minute to run) and you're trying to optimize it. Instead of waiting for the query to finish, if you are only interested in seeing the generated SQL for some analysis, the possibility to just view the SQL would be great. Specially if you are going to make several attempts to rewrite the relations and observe the results in SQL generation...

Well, I'm saying that because I used this feature a lot while working with Sequel and I thought it would be a good addition to ActiveRecord and I don't see any disadvantages in exposing this method to ActiveRecord.

If you decide to write this helpful method, I would be extremely glad! I'm already glad you took the path to integrate Arel in ActiveRecord and provided a much better interface to AR anyway :wink:

Best regards,

Rodrigo.

Just a clarification -- I appreciate the credit, but while I did the MetaWhere plugin, and contributed a tiny patch to Arel and AR here and there, the overwhelming majority of the work you're crediting was done by Emilio Tagua and Pratik Naik. :slight_smile:

Just as a quick update: I tried my untested code above, and by some miracle it actually worked, so I just pushed version 0.5.2 of MetaWhere to GitHub and Rubygems.org. The method's been renamed debug_sql, though. real_sql just reminded me too much of PHP's mysql_real_* functions and anything that makes me think of PHP tends to make me rethink it... :slight_smile:

Happy debugging! :slight_smile:

-Ernie