How I've Monkeypatched AR

Following up on Koz's suggestion to discuss how plugin developers are monkey patching AR, I'd like to share some of things my coworkers and I have done.

1.) Truncate. Currently implemented in ActiveWarehouse ETL for MySQL. 2.) Bulk import and export using the DB's tool (for example, bulk import with LOAD DATA INFILE for MySQL or BCP for SQL Server). ActiveWarehouse ETL currently implements the import functionality for MySQL only. 3.) Views support. I've redefined tables, created a new method for views and a new method to get the SQL query used for the view, and added a supports_views? method. This is implemented in the Rails SQL Views plugin.

One other thing I would like to see is for the following methods to be moved out of AR::Base if possible and moved into the adapter layer or into their own module so they can be reused easier:

sanitize_sql replace_bind_variables replace_named_bind_variables quote_bound_value raise_if_bind_arity_mismatch

V/r Anthony Eden

Following up on Koz's suggestion to discuss how plugin developers are monkey patching AR, I'd like to share some of things my coworkers and I have done.

1.) Truncate. Currently implemented in ActiveWarehouse ETL for MySQL.

truncate as in 'truncate table foo'?

2.) Bulk import and export using the DB's tool (for example, bulk import with LOAD DATA INFILE for MySQL or BCP for SQL Server). ActiveWarehouse ETL currently implements the import functionality for MySQL only.

This stuff should definitely remain in plugins

3.) Views support. I've redefined tables, created a new method for views and a new method to get the SQL query used for the view, and added a supports_views? method. This is implemented in the Rails SQL Views plugin.

One other thing I would like to see is for the following methods to be moved out of AR::Base if possible and moved into the adapter layer or into their own module so they can be reused easier:

sanitize_sql replace_bind_variables replace_named_bind_variables quote_bound_value raise_if_bind_arity_mismatch

I think this is one of the messier areas of the current query generation code. Because we do replace_bind_variables on each of the [string, val, val] blocks we're passed, we can't use prepared statements. Now while this doesn't matter for mysql or postgresql, in the big commercial databases it makes a huge difference.

If we were instead to build up some in memory structure representing the query, then pass it to the adapters to convert to a SQL string, adapters which need prepared statements could utilise them. Chances are the code would improve too.

> > Following up on Koz's suggestion to discuss how plugin developers are > monkey patching AR, I'd like to share some of things my coworkers and > I have done. > > 1.) Truncate. Currently implemented in ActiveWarehouse ETL for MySQL.

truncate as in 'truncate table foo'?

Yes.

> 2.) Bulk import and export using the DB's tool (for example, bulk > import with LOAD DATA INFILE for MySQL or BCP for SQL Server). > ActiveWarehouse ETL currently implements the import functionality for > MySQL only.

This stuff should definitely remain in plugins

So be it. :slight_smile: I just think that since it is something that is fairly standard (in the sense that most RDBMs that I've seen implement some sort of bulk facility) it would be nice if it was defined at the adapter level with a common method or set of methods to get at it. But no worries one way or another.

> 3.) Views support. I've redefined tables, created a new method for > views and a new method to get the SQL query used for the view, and > added a supports_views? method. This is implemented in the Rails SQL > Views plugin.

> One other thing I would like to see is for the following methods to be > moved out of AR::Base if possible and moved into the adapter layer or > into their own module so they can be reused easier: > > sanitize_sql > replace_bind_variables > replace_named_bind_variables > quote_bound_value > raise_if_bind_arity_mismatch

I think this is one of the messier areas of the current query generation code. Because we do replace_bind_variables on each of the [string, val, val] blocks we're passed, we can't use prepared statements. Now while this doesn't matter for mysql or postgresql, in the big commercial databases it makes a huge difference.

Indeed.

If we were instead to build up some in memory structure representing the query, then pass it to the adapters to convert to a SQL string, adapters which need prepared statements could utilise them. Chances are the code would improve too.

Agreed. Personally I'd like to see ezwhere integrated and used as the interface over this in-memory query structure.

V/r Anthony

Following up on Koz's suggestion to discuss how plugin developers are monkey patching AR, I'd like to share some of things my coworkers and I have done.

Calling all plugin authors. If you've been frustrated when adding functionality to Active Record, speak here, or forever ... yeah :slight_smile:

Things I think I've nicely monkeypatched...

- adding temporary table support - adding query support for regular expressions, and having better hash support (adding prefix and suffix based modifies, ie: field_like or field_contains or matches_field) - adding efficient mass import functionality for MySQL (PostgreSQL 8.2 support is coming as well) for handling multiple value insert statements - adding fulltext index searching support for MySQL - adding block based foreign key enable/disable support - adding to_csv support which supports has_one, belongs_to and has_many relationships - adding custom query object support using duck typing and to_sql

I'd like to see all SQL generation moved into components which are registered for one or more adapters. I would like to see AbstractAdapter be registered for all of the common queries which are handled by all adapters ( LIKE, BETWEEN, etc ).

I'd like to see query components be handled LIFO style. This will keep the adapters registered on AbstractAdapter as the last resort, so a plugin or adapter specific behavior can override the generic support should it need to, without having to change AbstractAdapter code.

I'd like to see a nice way to handle prefix and suffix modifiers when using a Hash in #find related queries. This will allow you to support easier to read queries using suffixes like, "fieldname_matches" or "fieldname_contains" or "fieldname_is_not", etc.

What I'd like to see I've added to ActiveRecord::Extensions, but it'd be nice to have ActiveRecord support these things because I believe it will make it easier for contributors to add functionality for different databases since they won't have to understand the internals of ActiveRecord, instead they can focus on their query component, and simply add one line of code to register it.

My last request is that it is easier and more inviting for people to contribute.

Zach

Following up on Koz’s suggestion to discuss how plugin developers are monkey patching AR, I’d like to share some of things my coworkers and

I have done.

Calling all plugin authors. If you’ve been frustrated when adding functionality to Active Record, speak here, or forever … yeah :slight_smile:

Things I think I’ve nicely monkeypatched…

  • adding temporary table support

  • adding query support for regular expressions, and having better hash support (adding prefix and suffix based modifies, ie: field_like or field_contains or matches_field)

  • adding efficient mass import functionality for MySQL (PostgreSQL 8.2 support is coming as well) for handling multiple value insert statements

  • adding fulltext index searching support for MySQL

  • adding block based foreign key enable/disable support

  • adding to_csv support which supports has_one, belongs_to and has_many relationships

  • adding custom query object support using duck typing and to_sql

I’d like to see all SQL generation moved into components which are

registered for one or more adapters. I would like to see AbstractAdapter be registered for all of the common queries which are handled by all adapters ( LIKE, BETWEEN, etc ).

include Feature

include AnotherFeature

I really don’t see the need for this complexity.

I’d like to see a nice way to handle prefix and suffix modifiers when using a Hash in #find related queries. This will allow you to support easier to read queries using suffixes like, “fieldname_matches” or

“fieldname_contains” or “fieldname_is_not”, etc.

Agreed. There is similar support for attribute methods using attribute_method_suffix.

What I’d like to see I’ve added to ActiveRecord::Extensions, but it’d be nice to have ActiveRecord support these things because I believe it will make it easier for contributors to add functionality for

different databases since they won’t have to understand the internals of ActiveRecord, instead they can focus on their query component, and simply add one line of code to register it.

Again, the ‘registering components’ bit seems very strange to me. It’s not solving any problems IMO.

I think a stable interface and plugins are all you need.

My last request is that it is easier and more inviting for people to

contribute.

Now that’s wide open :slight_smile: Is it hard or uninviting?

jeremy

Ok, my patches are quite limited to get the InterBase driver going.

At the moment, the InterBase driver is a patch on ActiveRecord, I clearly (after my discussions with you) have to turn this into a plugin. I do need one thing:

InterBase uses UPPER for case insignifcance, it doesn’t have LOWER built in (not ANSI-SQL compliant) so I added in a property to abstract_adapter.rb:

has_many_polymorphs ( Evan Weaver ) use a custom AR association/reflection type now. I am pushing towards eliminating the "can't eager load polymorphic associations error" for all polymorphic relationships.

Also, re. Zach:

- adding query support for regular expressions, and having better - adding block based foreign key enable/disable support

Have you made these patches publicly available?

- adding fulltext index searching support for MySQL

I have a Sphinx model for use with the SphinxSE MySQL special engine type. It doesn't patch AR though.

Evan Weaver

Hey buddy,   please do elaborate on that last sentence.

Postgres doesn't benefit from prepared statements? What kind of performance improvements have you (not) been seeing? What kind of data/queries?

As you mention mysql and postgres in the same breath I'm thinking this is just FUD, but if there's really something here I'm eager to know.

Thanks, Isak

Koz has just been stating (I guess) that MySQL and Postgres don’t suffer as much when you don’t use prepared statements. Naturally, repeated queries (like AR model inserts, updates) would benefit from prepared statements even in those database, but in DB2 and Oracle the difference is much more stressed.

As you mention mysql and postgres in the same breath I'm thinking this is just FUD, but if there's really something here I'm eager to know.

Foo.find(:all, :conditions=>["bar = ?", bar)

Without changing our API there's no way we could specify the data type of a parameter to a prepared statement. As the postgres optimizer creates the execution plan when the prepared statement is created, it wouldn't be able to decide (sanely) which indexes to use, and query execution performance suffers.

has_many_polymorphs (Evan Weaver ) use a custom AR association/reflection type now. I am pushing towards eliminating the "can't eager load polymorphic associations error" for all polymorphic relationships.

Also, re. Zach:

> - adding query support for regular expressions, and having better > - adding block based foreign key enable/disable support

Have you made these patches publicly available?

rubyforge site:http://rubyforge.org/projects/arext/ official project blog: http://continuousthinking.com/tags/arext download link: http://rubyforge.org/frs/?group_id=2113 RDOC: http://continuousthinking.com/tags/arext/rdoc/index.html (scroll down to the ActiveRecord::Extensions:Regexp.... classes)

> - adding fulltext index searching support for MySQL

I have a Sphinx model for use with the SphinxSE MySQL special engine type. It doesn't patch AR though.

I haven't used Sphinx. Is it alot better then MySQL's builtin support?

Zach

> > > Following up on Koz's suggestion to discuss how plugin developers are > > > monkey patching AR, I'd like to share some of things my coworkers and > > > I have done.

> > Calling all plugin authors. If you've been frustrated when adding > > functionality to Active Record, speak here, or forever ... yeah :slight_smile:

> Things I think I've nicely monkeypatched...

> - adding temporary table support > - adding query support for regular expressions, and having better > hash support (adding prefix and suffix based modifies, ie: field_like > or field_contains or matches_field) > - adding efficient mass import functionality for MySQL (PostgreSQL > 8.2 support is coming as well) for handling multiple value insert > statements > - adding fulltext index searching support for MySQL > - adding block based foreign key enable/disable support > - adding to_csv support which supports has_one, belongs_to and > has_many relationships > - adding custom query object support using duck typing and to_sql

> I'd like to see all SQL generation moved into components which are > registered for one or more adapters. I would like to see > AbstractAdapter be registered for all of the common queries which are > handled by all adapters ( LIKE, BETWEEN, etc ).

include Feature include AnotherFeature

I really don't see the need for this complexity.

This depends on how things are implemented. I think we need to avoid implicitly encouraging people doing things like:

class MyAdapter < AbstractAdapter    alias :some_method :some_method_orig    def some_method       ....    end end

If we can avoid that then I'll all for it, but if that's how we're relying on mixins to provide additional functionality I think we can come up with a better way.

[snip]

Again, the 'registering components' bit seems very strange to me. It's not solving any problems IMO.

For me it's more greatly decoupling query generation from ActiveRecord::Base and actual adapter implementation.

I think a stable interface and plugins are all you need.

My last request is that it is easier and more inviting for people to

> contribute.

Now that's wide open :slight_smile: Is it hard or uninviting?

I think it could be easier. For me it'd be nice to give some developer some simple rules like:   - create class   - implement "process_query" method that takes x arguments   - return SQL string or nil   - register class for the adapters that support it's SQL syntax

It completely avoids having people go read through AR source to find out what sanitize_sql and quote_attribute_conditions, etc.. methods do. Rather then don't have to care about implementation, they can rather focus on the functionality they want.

Zach

This depends on how things are implemented. I think we need to avoid implicitly encouraging people doing things like:

class MyAdapter < AbstractAdapter alias :some_method :some_method_orig def some_method

  ....

end end

If we can avoid that then I’ll all for it, but if that’s how we’re relying on mixins to provide additional functionality I think we can come up with a better way.

I understand where you’re coming from; however:

It completely avoids having people go read through AR source to find

out what sanitize_sql and quote_attribute_conditions, etc… methods do. Rather then don’t have to care about implementation, they can rather focus on the functionality they want.

there just isn’t that much code here! It’s a handful of methods.

Adding this query handler registration scheme is easily more complex than the code it’s mean to simplify.

That said, I like the ideas you’re wrapping within these handlers.

jeremy

Thanks for the links.

The Sphinx storage engine actually creates fake tables - you add some custom parameters and it calls out to the Sphinx searchd process which can be on another box entirely. So you perform a SELECT * WHERE query = 'your search query;your=custom;params=here'; and get back a rowset. Because it operates within MySQL itself you can join at the source to your actual full records.

Sphinx itself is somewhat awkward to get running. Performance is stellar, though. Once I deploy my Sphinx wrappers on our high-volume site (mid next week probably), I'll make a post on my blog, if you want to keep an eye on that.

Evan Weaver

Foo.find(:all, :conditions => { :bar => bar })

Now we can, because AR knows about the “bar” column and its type. AR adapters could use prepared statements when applicable. Executes could fall back to plain queries when execution would suffer (like in your example).

Or am I overseeing something?

It's certainly possible to support prepared statements for a subset of our api, and i'd expect people will provide patches and benchmarks once the adapters make it feasible.

Just don't expect them to provide some magic performance boosts on all adapters. It's entirely possible that the code complexity would outweigh any performance improvements :).

What do you all think of adding a new key to find's option hash, for adapter specific extensions, optimizer hints and similar? Unrecognized keys would just be ignored.

E.g.: find(:all, :conditions => query, :ext => { :prepare => false, :db2 => {:prepare => true}, :mxyzptlk => false })

Isak

Well, I have a plugin called find_condition that I've had to monkeypatch AR for.

With find_condition installed, any AR class that defines find_condition() as a class method will insert that condition into *any* find() on the class, whether it is direct or through an association from some other class.

The direct reason for this is to allow me to implement my acts_as_invisible class, which conditionally hides rows of a table based on various conditions, generally if the current User is not in the group identified by the visible_to field.

Is this equivalent to creating role-driven views in the application, rather then enforcing them with a view at the database level? Any particular reason(s) why you did one over the other?

Zach