How I've Monkeypatched AR

Following up on Koz's suggestion to discuss how plugin developers are
monkey patching AR, I'd like to share some of things my coworkers and
I have done.

1.) Truncate. Currently implemented in ActiveWarehouse ETL for MySQL.
2.) Bulk import and export using the DB's tool (for example, bulk
import with LOAD DATA INFILE for MySQL or BCP for SQL Server).
ActiveWarehouse ETL currently implements the import functionality for
MySQL only.
3.) Views support. I've redefined tables, created a new method for
views and a new method to get the SQL query used for the view, and
added a supports_views? method. This is implemented in the Rails SQL
Views plugin.

One other thing I would like to see is for the following methods to be
moved out of AR::Base if possible and moved into the adapter layer or
into their own module so they can be reused easier:

sanitize_sql
replace_bind_variables
replace_named_bind_variables
quote_bound_value
raise_if_bind_arity_mismatch

V/r
Anthony Eden

Following up on Koz's suggestion to discuss how plugin developers are
monkey patching AR, I'd like to share some of things my coworkers and
I have done.

1.) Truncate. Currently implemented in ActiveWarehouse ETL for MySQL.

truncate as in 'truncate table foo'?

2.) Bulk import and export using the DB's tool (for example, bulk
import with LOAD DATA INFILE for MySQL or BCP for SQL Server).
ActiveWarehouse ETL currently implements the import functionality for
MySQL only.

This stuff should definitely remain in plugins

3.) Views support. I've redefined tables, created a new method for
views and a new method to get the SQL query used for the view, and
added a supports_views? method. This is implemented in the Rails SQL
Views plugin.

One other thing I would like to see is for the following methods to be
moved out of AR::Base if possible and moved into the adapter layer or
into their own module so they can be reused easier:

sanitize_sql
replace_bind_variables
replace_named_bind_variables
quote_bound_value
raise_if_bind_arity_mismatch

I think this is one of the messier areas of the current query
generation code. Because we do replace_bind_variables on each of the
[string, val, val] blocks we're passed, we can't use prepared
statements. Now while this doesn't matter for mysql or postgresql,
in the big commercial databases it makes a huge difference.

If we were instead to build up some in memory structure representing
the query, then pass it to the adapters to convert to a SQL string,
adapters which need prepared statements could utilise them. Chances
are the code would improve too.

>
> Following up on Koz's suggestion to discuss how plugin developers are
> monkey patching AR, I'd like to share some of things my coworkers and
> I have done.
>
> 1.) Truncate. Currently implemented in ActiveWarehouse ETL for MySQL.

truncate as in 'truncate table foo'?

Yes.

> 2.) Bulk import and export using the DB's tool (for example, bulk
> import with LOAD DATA INFILE for MySQL or BCP for SQL Server).
> ActiveWarehouse ETL currently implements the import functionality for
> MySQL only.

This stuff should definitely remain in plugins

So be it. :slight_smile: I just think that since it is something that is fairly
standard (in the sense that most RDBMs that I've seen implement some
sort of bulk facility) it would be nice if it was defined at the
adapter level with a common method or set of methods to get at it. But
no worries one way or another.

> 3.) Views support. I've redefined tables, created a new method for
> views and a new method to get the SQL query used for the view, and
> added a supports_views? method. This is implemented in the Rails SQL
> Views plugin.

> One other thing I would like to see is for the following methods to be
> moved out of AR::Base if possible and moved into the adapter layer or
> into their own module so they can be reused easier:
>
> sanitize_sql
> replace_bind_variables
> replace_named_bind_variables
> quote_bound_value
> raise_if_bind_arity_mismatch

I think this is one of the messier areas of the current query
generation code. Because we do replace_bind_variables on each of the
[string, val, val] blocks we're passed, we can't use prepared
statements. Now while this doesn't matter for mysql or postgresql,
in the big commercial databases it makes a huge difference.

Indeed.

If we were instead to build up some in memory structure representing
the query, then pass it to the adapters to convert to a SQL string,
adapters which need prepared statements could utilise them. Chances
are the code would improve too.

Agreed. Personally I'd like to see ezwhere integrated and used as the
interface over this in-memory query structure.

V/r
Anthony

Following up on Koz's suggestion to discuss how plugin developers are
monkey patching AR, I'd like to share some of things my coworkers and
I have done.

Calling all plugin authors. If you've been frustrated when adding
functionality to Active Record, speak here, or forever ... yeah :slight_smile:

Things I think I've nicely monkeypatched...

- adding temporary table support
- adding query support for regular expressions, and having better
hash support (adding prefix and suffix based modifies, ie: field_like
or field_contains or matches_field)
- adding efficient mass import functionality for MySQL (PostgreSQL
8.2 support is coming as well) for handling multiple value insert
statements
- adding fulltext index searching support for MySQL
- adding block based foreign key enable/disable support
- adding to_csv support which supports has_one, belongs_to and
has_many relationships
- adding custom query object support using duck typing and to_sql

I'd like to see all SQL generation moved into components which are
registered for one or more adapters. I would like to see
AbstractAdapter be registered for all of the common queries which are
handled by all adapters ( LIKE, BETWEEN, etc ).

I'd like to see query components be handled LIFO style. This will
keep the adapters registered on AbstractAdapter as the last resort, so
a plugin or adapter specific behavior can override the generic support
should it need to, without having to change AbstractAdapter code.

I'd like to see a nice way to handle prefix and suffix modifiers when
using a Hash in #find related queries. This will allow you to support
easier to read queries using suffixes like, "fieldname_matches" or
"fieldname_contains" or "fieldname_is_not", etc.

What I'd like to see I've added to ActiveRecord::Extensions, but it'd
be nice to have ActiveRecord support these things because I believe it
will make it easier for contributors to add functionality for
different databases since they won't have to understand the internals
of ActiveRecord, instead they can focus on their query component, and
simply add one line of code to register it.

My last request is that it is easier and more inviting for people to
contribute.

Zach

Following up on Koz’s suggestion to discuss how plugin developers are
monkey patching AR, I’d like to share some of things my coworkers and

I have done.

Calling all plugin authors. If you’ve been frustrated when adding
functionality to Active Record, speak here, or forever … yeah :slight_smile:

Things I think I’ve nicely monkeypatched…

  • adding temporary table support

  • adding query support for regular expressions, and having better
    hash support (adding prefix and suffix based modifies, ie: field_like
    or field_contains or matches_field)

  • adding efficient mass import functionality for MySQL (PostgreSQL
    8.2 support is coming as well) for handling multiple value insert
    statements

  • adding fulltext index searching support for MySQL

  • adding block based foreign key enable/disable support

  • adding to_csv support which supports has_one, belongs_to and
    has_many relationships

  • adding custom query object support using duck typing and to_sql

I’d like to see all SQL generation moved into components which are

registered for one or more adapters. I would like to see
AbstractAdapter be registered for all of the common queries which are
handled by all adapters ( LIKE, BETWEEN, etc ).

include Feature

include AnotherFeature

I really don’t see the need for this complexity.

I’d like to see a nice way to handle prefix and suffix modifiers when
using a Hash in #find related queries. This will allow you to support
easier to read queries using suffixes like, “fieldname_matches” or

“fieldname_contains” or “fieldname_is_not”, etc.

Agreed. There is similar support for attribute methods using attribute_method_suffix.

What I’d like to see I’ve added to ActiveRecord::Extensions, but it’d
be nice to have ActiveRecord support these things because I believe it
will make it easier for contributors to add functionality for

different databases since they won’t have to understand the internals
of ActiveRecord, instead they can focus on their query component, and
simply add one line of code to register it.

Again, the ‘registering components’ bit seems very strange to me. It’s not solving any problems IMO.

I think a stable interface and plugins are all you need.

My last request is that it is easier and more inviting for people to

contribute.

Now that’s wide open :slight_smile: Is it hard or uninviting?

jeremy

Ok, my patches are quite limited to get the InterBase driver going.

At the moment, the InterBase driver is a patch on ActiveRecord, I clearly (after my discussions with you) have to turn this into a plugin. I do need one thing:

InterBase uses UPPER for case insignifcance, it doesn’t have LOWER built in (not ANSI-SQL compliant) so I added in a property to abstract_adapter.rb:

has_many_polymorphs ( http://blog.evanweaver.com/pages/has_many_polymorphs
) use a custom AR association/reflection type now. I am pushing
towards eliminating the "can't eager load polymorphic associations
error" for all polymorphic relationships.

Also, re. Zach:

- adding query support for regular expressions, and having better
- adding block based foreign key enable/disable support

Have you made these patches publicly available?

- adding fulltext index searching support for MySQL

I have a Sphinx model for use with the SphinxSE MySQL special engine
type. It doesn't patch AR though.

Evan Weaver

Hey buddy,
  please do elaborate on that last sentence.

Postgres doesn't benefit from prepared statements? What kind of
performance improvements have you (not) been seeing? What kind of
data/queries?

As you mention mysql and postgres in the same breath I'm thinking this
is just FUD, but if there's really something here I'm eager to know.

Thanks,
Isak

Koz has just been stating (I guess) that MySQL and Postgres don’t suffer as much when you don’t use prepared statements. Naturally, repeated queries (like AR model inserts, updates) would benefit from prepared statements even in those database, but in DB2 and Oracle the difference is much more stressed.

As you mention mysql and postgres in the same breath I'm thinking this
is just FUD, but if there's really something here I'm eager to know.

Foo.find(:all, :conditions=>["bar = ?", bar)

Without changing our API there's no way we could specify the data type
of a parameter to a prepared statement. As the postgres optimizer
creates the execution plan when the prepared statement is created, it
wouldn't be able to decide (sanely) which indexes to use, and query
execution performance suffers.

has_many_polymorphs (http://blog.evanweaver.com/pages/has_many_polymorphs
) use a custom AR association/reflection type now. I am pushing
towards eliminating the "can't eager load polymorphic associations
error" for all polymorphic relationships.

Also, re. Zach:

> - adding query support for regular expressions, and having better
> - adding block based foreign key enable/disable support

Have you made these patches publicly available?

rubyforge site:http://rubyforge.org/projects/arext/
official project blog: http://continuousthinking.com/tags/arext
download link: http://rubyforge.org/frs/?group_id=2113
RDOC: http://continuousthinking.com/tags/arext/rdoc/index.html
(scroll down to the ActiveRecord::Extensions:Regexp.... classes)

> - adding fulltext index searching support for MySQL

I have a Sphinx model for use with the SphinxSE MySQL special engine
type. It doesn't patch AR though.

I haven't used Sphinx. Is it alot better then MySQL's builtin support?

Zach

> > > Following up on Koz's suggestion to discuss how plugin developers are
> > > monkey patching AR, I'd like to share some of things my coworkers and
> > > I have done.

> > Calling all plugin authors. If you've been frustrated when adding
> > functionality to Active Record, speak here, or forever ... yeah :slight_smile:

> Things I think I've nicely monkeypatched...

> - adding temporary table support
> - adding query support for regular expressions, and having better
> hash support (adding prefix and suffix based modifies, ie: field_like
> or field_contains or matches_field)
> - adding efficient mass import functionality for MySQL (PostgreSQL
> 8.2 support is coming as well) for handling multiple value insert
> statements
> - adding fulltext index searching support for MySQL
> - adding block based foreign key enable/disable support
> - adding to_csv support which supports has_one, belongs_to and
> has_many relationships
> - adding custom query object support using duck typing and to_sql

> I'd like to see all SQL generation moved into components which are
> registered for one or more adapters. I would like to see
> AbstractAdapter be registered for all of the common queries which are
> handled by all adapters ( LIKE, BETWEEN, etc ).

include Feature
include AnotherFeature

I really don't see the need for this complexity.

This depends on how things are implemented. I think we need to avoid
implicitly encouraging people doing things like:

class MyAdapter < AbstractAdapter
   alias :some_method :some_method_orig
   def some_method
      ....
   end
end

If we can avoid that then I'll all for it, but if that's how we're
relying on mixins to provide additional functionality I think we can
come up with a better way.

[snip]

Again, the 'registering components' bit seems very strange to me. It's not
solving any problems IMO.

For me it's more greatly decoupling query generation from
ActiveRecord::Base and actual adapter implementation.

I think a stable interface and plugins are all you need.

My last request is that it is easier and more inviting for people to

> contribute.

Now that's wide open :slight_smile: Is it hard or uninviting?

I think it could be easier. For me it'd be nice to give some developer
some simple rules like:
  - create class
  - implement "process_query" method that takes x arguments
  - return SQL string or nil
  - register class for the adapters that support it's SQL syntax

It completely avoids having people go read through AR source to find
out what sanitize_sql and quote_attribute_conditions, etc.. methods
do. Rather then don't have to care about implementation, they can
rather focus on the functionality they want.

Zach

This depends on how things are implemented. I think we need to avoid
implicitly encouraging people doing things like:

class MyAdapter < AbstractAdapter
alias :some_method :some_method_orig
def some_method

  ....

end
end

If we can avoid that then I’ll all for it, but if that’s how we’re
relying on mixins to provide additional functionality I think we can
come up with a better way.

I understand where you’re coming from; however:

It completely avoids having people go read through AR source to find

out what sanitize_sql and quote_attribute_conditions, etc… methods
do. Rather then don’t have to care about implementation, they can
rather focus on the functionality they want.

there just isn’t that much code here! It’s a handful of methods.

Adding this query handler registration scheme is easily more complex than the code it’s mean to simplify.

That said, I like the ideas you’re wrapping within these handlers.

jeremy

Thanks for the links.

The Sphinx storage engine actually creates fake tables - you add some
custom parameters and it calls out to the Sphinx searchd process which
can be on another box entirely. So you perform a SELECT * WHERE query
= 'your search query;your=custom;params=here'; and get back a rowset.
Because it operates within MySQL itself you can join at the source to
your actual full records.

Sphinx itself is somewhat awkward to get running. Performance is
stellar, though. Once I deploy my Sphinx wrappers on our high-volume
site (mid next week probably), I'll make a post on my blog, if you
want to keep an eye on that.

Evan Weaver
http://blog.evanweaver.com

Foo.find(:all, :conditions => { :bar => bar })

Now we can, because AR knows about the “bar” column and its type. AR adapters could use prepared statements when applicable. Executes could fall back to plain queries when execution would suffer (like in your example).

Or am I overseeing something?

It's certainly possible to support prepared statements for a subset of
our api, and i'd expect people will provide patches and benchmarks
once the adapters make it feasible.

Just don't expect them to provide some magic performance boosts on
all adapters. It's entirely possible that the code complexity would
outweigh any performance improvements :).

What do you all think of adding a new key to find's option hash, for
adapter specific extensions, optimizer hints and similar? Unrecognized
keys would just be ignored.

E.g.:
find(:all, :conditions => query, :ext => { :prepare => false, :db2 =>
{:prepare => true}, :mxyzptlk => false })

Isak

Well, I have a plugin called find_condition that I've had to
monkeypatch AR for.

With find_condition installed, any AR class that defines
find_condition() as a class method will insert that condition into
*any* find() on the class, whether it is direct or through an
association from some other class.

The direct reason for this is to allow me to implement my
acts_as_invisible class, which conditionally hides rows of a table
based on various conditions, generally if the current User is not in
the group identified by the visible_to field.

Is this equivalent to creating role-driven views in the application,
rather then enforcing them with a view at the database level? Any
particular reason(s) why you did one over the other?

Zach