ActiveRecord Refactoring

Starting Monday I'll be working for myself and one of the things I'm
looking forward to is working on ActiveRecord more. I've written and
released some ActiveRecord extensions/plugins [0] as well as
posted some thoughts on ActiveRecord's modularity [1]. With this in
mind I want to work up a patch to refactor how ActiveRecord itself
handles queries.

A lot of what I'd like to do is very similar to my approach with better
finder support in ActiveRecord::Extensions [1].

Since now is the time that a lot of changes are happening for Rails
2.0... what's the core group's feedback on making query support for
ActiveRecord more modular?

Do you see the current AR implementation limiting what AR can do since
anytime someone needs to add functionality they have to write a plugin
which overrides AR core methods?

In short, I think AR would benefit a lot if it included things
like declarative-style registration of query support:
  register GenericQuerySupport, :adapters => :all

This way someone can easily register special functionality for MySQL,
PostgreSQL, MS SQL, Oracle, etc, and release the plugin without
overriding AR. They can just call:
  ActiveRecord::Base.register PostgreSQLRegexpSupport, :adapters => :postgresql
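
As a rough illustration only, a register call along these lines could be backed by a small registry. Everything in this sketch (QuerySupportRegistry, query_support_for, FakeBase) is a made-up name and not existing ActiveRecord API:

```ruby
# Hypothetical sketch: none of these names exist in ActiveRecord.
module QuerySupportRegistry
  # Maps an adapter name (or :all) to the support modules registered for it.
  def query_support_registry
    @query_support_registry ||= Hash.new { |hash, key| hash[key] = [] }
  end

  # register SomeSupport, :adapters => :mysql
  # register SomeSupport, :adapters => [:mysql, :postgresql]
  def register(support, options = {})
    Array(options.fetch(:adapters, :all)).each do |adapter|
      query_support_registry[adapter.to_sym] << support
    end
    support
  end

  # Everything registered for one adapter, plus anything registered for :all.
  def query_support_for(adapter)
    (query_support_registry[adapter.to_sym] + query_support_registry[:all]).uniq
  end
end

# Empty placeholder modules standing in for real query support.
module GenericQuerySupport; end
module PostgreSQLRegexpSupport; end

# Stand-in for ActiveRecord::Base.
class FakeBase
  extend QuerySupportRegistry
end

FakeBase.register GenericQuerySupport, :adapters => :all
FakeBase.register PostgreSQLRegexpSupport, :adapters => :postgresql
```

With that in place, FakeBase.query_support_for(:postgresql) would return both modules, while FakeBase.query_support_for(:mysql) would only see GenericQuerySupport.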

Thoughts?

Zach

0 - http://www.continuousthinking.com/tags/arext
1 - http://www.rubyinside.com/advent2006/17-extendingar.html

> Starting Monday I'll be working for myself and one of the things I'm
> looking forward to is working on ActiveRecord more.

Congratulations, it's a bunch of fun :).

> I've written and
> released some ActiveRecord extensions/plugins [0] as well as
> posted some thoughts on ActiveRecord's modularity [1]. With this in
> mind I want to work up a patch to refactor how ActiveRecord itself
> handles queries.

We've spoken about this several times in the past, and it's still one
of our goals for the rails 2.0 release. The query generation code
is kinda ugly, and frankly a lot of it is in the wrong place.

We should be generating the queries inside the adapters, that way we
can just use polymorphism to provide adapter specific behaviour.
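
A minimal sketch of that idea, assuming hypothetical adapter classes (these are not the real connection adapters) and using regular-expression matching as the adapter-specific behaviour:

```ruby
# Hypothetical adapter classes, not the real ActiveRecord connection adapters.
class SketchAbstractAdapter
  # Builds a regular-expression condition; subclasses supply the operator.
  def regexp_condition(column, pattern)
    "#{column} #{regexp_operator} #{quote(pattern)}"
  end

  def regexp_operator
    raise NotImplementedError, "#{self.class} does not support regular expressions"
  end

  # Naive quoting, for illustration only.
  def quote(value)
    "'#{value.to_s.gsub("'", "''")}'"
  end
end

class SketchMysqlAdapter < SketchAbstractAdapter
  def regexp_operator
    "REGEXP"
  end
end

class SketchPostgresqlAdapter < SketchAbstractAdapter
  def regexp_operator
    "~"
  end
end

mysql_sql = SketchMysqlAdapter.new.regexp_condition("name", "^Z")
postgresql_sql = SketchPostgresqlAdapter.new.regexp_condition("name", "^Z")
```

The calling code never checks which database it is talking to; it asks whatever adapter it holds, and each subclass answers with its own syntax.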

> A lot of what I'd like to do is very similar to my approach with better
> finder support in ActiveRecord::Extensions [1]

> Since now is the time that a lot of changes are happening for Rails
> 2.0... what's the core group's feedback on making query support for
> ActiveRecord more modular.

What do you mean 'more modular', what are the specific problems
you're hitting? I think there's probably a place for adding some
stable APIs to make some common-customisations easier. So long as it
doesn't materially impact the complexity of ActiveRecord, or the
performance of it, then I think everyone wins when plugins don't need
to break every release.

Perhaps we need to get a thread going with AR plugin authors, see what
stuff they're monkeypatching, find a few common cases, and build an
API?

I'd like to see the condition methods moved out of ActiveRecord and
into a more general location...

condition_block?
evaluate_condition

They currently reside in validations.rb

The functionality they provide can be used nicely in ActionController
and possibly other places where class methods accept a "method" type
parameter like the validation methods currently do with the :if
parameter.

I've developed a plugin for ActionController and "borrowed" the code
from ActiveRecord with some minor tweaks to remove the ActiveRecord
specific exception code.
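
For anyone who hasn't looked at these helpers, here is a simplified sketch of what they might look like once pulled out into a standalone module. The module name ConditionEvaluation and the Post struct are hypothetical, and this is an approximation of the behaviour, not a copy of the code in validations.rb:

```ruby
# Hypothetical standalone module; an approximation, not the validations.rb code.
module ConditionEvaluation
  # Lets the helpers also be called directly on the module.
  extend self

  # True if the condition can be invoked with #call (a Proc or Method).
  def condition_block?(condition)
    condition.respond_to?(:call)
  end

  # Evaluates an :if-style condition against +target+.
  def evaluate_condition(condition, target)
    case condition
    when Symbol then target.send(condition)
    when String then target.instance_eval(condition)
    else
      unless condition_block?(condition)
        raise ArgumentError, "condition must be a Symbol, String, Proc or Method"
      end
      condition.call(target)
    end
  end
end

# Hypothetical record-like object used only to exercise the helpers.
Post = Struct.new(:published) do
  def published?
    published
  end
end

post = Post.new(true)
by_symbol = ConditionEvaluation.evaluate_condition(:published?, post)
by_string = ConditionEvaluation.evaluate_condition("published?", post)
by_proc   = ConditionEvaluation.evaluate_condition(lambda { |p| !p.published? }, post)
```

Nothing here depends on ActiveRecord, which is the point: a controller-level macro could accept the same kinds of :if arguments.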

Thoughts?

Thanks,
Andrew

Will the change make the Abstract Adapter really abstract while moving all generic SQL to “Abstract SQL Adapter” (which the current concrete adapters would subclass instead)? Removing all SQL-specific code from AR::Base would also be neat.

That change would make room for, say, an adapter for XML databases in the future!

One thing I experimented with a while back was an explicit Query object, returned by the Adapter.

http://urlx.org/opendarwin.org/73585

The idea is that rather than always passing direct SQL in, developers would typically call appropriate methods on the query object, and pass that.
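
A very small sketch of the shape this could take. The class and method names (QuerySketch, QuerySketchAdapter, where, limit, to_sql) are assumptions made for illustration; they are not taken from the experiment linked above:

```ruby
# Hypothetical query object; the names here are illustrative assumptions.
class QuerySketch
  def initialize(table)
    @table = table
    @conditions = []
    @limit = nil
  end

  # Adds a raw condition fragment and returns self so calls can be chained.
  def where(fragment)
    @conditions << fragment
    self
  end

  def limit(count)
    @limit = Integer(count)
    self
  end

  # Assembles the final SQL only when it is actually needed.
  def to_sql
    parts = ["SELECT * FROM #{@table}"]
    parts << "WHERE #{@conditions.join(' AND ')}" unless @conditions.empty?
    parts << "LIMIT #{@limit}" if @limit
    parts.join(" ")
  end
end

# The adapter hands out query objects, so it could return its own subclass.
class QuerySketchAdapter
  def query(table)
    QuerySketch.new(table)
  end
end

sql = QuerySketchAdapter.new.query("posts").where("published = 1").limit(10).to_sql
```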

Alas, I never got very far. However, if this is compatible with the direction things are going, I’d be happy to help out.

– Ernie P.

It's highly unlikely that we'll make ActiveRecord anything other than
a SQL database persistence tool. But if someone comes up with a nice
XML database persistence thing, we'll refactor out common code into
something like 'active model'.

[snip]

> We've spoken about this several times in the past, and it's still one
> of our goals for the rails 2.0 release. The query generation code
> is kinda ugly, and frankly a lot of it is in the wrong place.

> We should be generating the queries inside the adapters, that way we
> can just use polymorphism to provide adapter specific behaviour.

There seem to be two ways to handle query generation that are better
than the existing methods:
   1 - Generate queries inside the adapters
   2 - Register queries to be associated with certain adapters.

Currently I prefer #2 because it allows you to have only one spot
where a particular query is generated and you can associate (register)
that query with one or more database adapters. #1 all by itself makes
sense, but since some db vendors do support the same semantics for
certain functions or queries you should be able to support DRY for
those overlapping queries. I also like #2 better because you can
simply add a supported adapter once it has been tested.

For example, let's say I register RegexpQuerySupport which currently
has been tested on MySQL.
   register RegexpQuerySupport, :adapters=>:mysql

Now if you download it and test it against PostgreSQL by changing
":mysql" to "[ :mysql, :postgresql ]" and it works, that's all the
effort involved. If it didn't work, you could write a
RegexpQuerySupport object inside the PostgreSQL adapter and register
it for :postgresql.

> A lot of what I'd like to do is very similar to my approach with better
> finder support in ActiveRecord::Extensions [1]

> Since now is the time that a lot of changes are happening for Rails
> 2.0... what's the core group's feedback on making query support for
> ActiveRecord more modular.

> What do you mean 'more modular', what are the specific problems
> you're hitting?

I think query generation should be separated from ActiveRecord::Base.
Right now, to add custom query support you have to hardcode a method/
query, or you have to override ActiveRecord::Base methods like
sanitize_sql or quote to get it to work across the board, depending on
the functionality you're adding. The query support for ActiveRecord
should be more modular so adding a new query won't necessarily break
other code in ActiveRecord. Given this, making the query support more
component-driven also isolates and limits the ripple effect that
people's plugins would cause.

> I think there's probably a place for adding some
> stable APIs to make some common-customisations easier. So long as it
> doesn't materially impact the complexity of ActiveRecord, or the
> performance of it, then I think everyone wins when plugins don't need
> to break every release.

Implementing a behavioral pattern similar to chain of responsibility
seems to be a good route. I use this in ActiveRecord::Extensions and
so far it's worked great. If you happen to pull down the source [0],
look in the extensions.rb file and check out how the MySQL, PostgreSQL
and SQLite regular expression support is handled.
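
To give a flavor of the pattern, here is a stripped-down sketch. It is not how ActiveRecord::Extensions actually implements it; the handler and chain classes are hypothetical, and quoting is left out to keep it short:

```ruby
# Hypothetical handlers: each one either builds SQL for a value or returns nil.
class EqualityHandler
  def process(column, value)
    "#{column} = '#{value}'" if value.is_a?(String)
  end
end

class MysqlRegexpHandler
  def process(column, value)
    "#{column} REGEXP '#{value.source}'" if value.is_a?(Regexp)
  end
end

class ConditionChain
  def initialize(*handlers)
    @handlers = handlers
  end

  # Asks each handler in turn and returns the first non-nil result.
  def sql_for(column, value)
    @handlers.each do |handler|
      sql = handler.process(column, value)
      return sql if sql
    end
    raise ArgumentError, "no handler for #{value.inspect}"
  end
end

chain = ConditionChain.new(EqualityHandler.new, MysqlRegexpHandler.new)
equality_sql = chain.sql_for("name", "Zach")
regexp_sql = chain.sql_for("name", /^Zach/)
```

Adding support for a new kind of condition means adding one more handler to the chain; none of the existing handlers, and nothing like sanitize_sql, has to be aliased or reopened.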

> Perhaps we need to get a thread going with AR plugin authors, see what
> stuff they're monkeypatching, find a few common cases, and build an
> API?

That sounds like a great idea.

There could be, and probably is, a better way to handle query
generation in ActiveRecord. I look forward to seeing what everyone can
come up with together.

Zach

If you go the route of registering components for different adapters,
you could support XML-based adapters, because query generation would
be confined to the adapters it supports, without having to muck
up ActiveRecord to hack on XML support. I think this would be
possible.

Zach

> There seem to be two ways to handle query generation that are better
> than the existing methods:
> 1 - Generate queries inside the adapters
> 2 - Register queries to be associated with certain adapters.

> Currently I prefer #2 because it allows you to have only one spot
> where a particular query is generated and you can associate (register)
> that query with one or more database adapters. #1 all by itself makes
> sense, but since some db vendors do support the same semantics for
> certain functions or queries you should be able to support DRY for
> those overlapping queries. I also like #2 better because you can
> simply add a supported adapter once it has been tested.

> For example, let’s say I register RegexpQuerySupport which currently
> has been tested on MySQL.
> register RegexpQuerySupport, :adapters=>:mysql

> Now if you download it and test it against PostgreSQL by changing
> “:mysql” to “[ :mysql, :postgresql ]” and it works, that’s all the
> effort involved. If it didn’t work, you could write a
> RegexpQuerySupport object inside the PostgreSQL adapter and register
> it for :postgresql.

How is this different from include RegexpQuery in the adapter?

> What do you mean ‘more modular’, what are the specific problems
> you’re hitting?

> I think query generation should be separated from ActiveRecord::Base.
> Right now to add custom query support you have to hardcode a method/
> query, or you have to override ActiveRecord::Base methods like
> sanitize_sql, quote to get it to work across the board depending on
> the functionality you’re adding. The query support for ActiveRecord
> should be more modular so adding a new query won’t necessarily break
> other code in ActiveRecord. Given this making the query support more
> component driven also isolates and limits the ripple effect that
> people’s plugins would cause.

Do you have a concrete example? What is ‘adding a new query’ and does it happen often?

> I think there’s probably a place for adding some
> stable APIs to make some common-customisations easier. So long as it
> doesn’t materially impact the complexity of ActiveRecord, or the
> performance of it, then I think everyone wins when plugins don’t need
> to break every release.

> Implementing a behavioral pattern similar to chain of responsibility
> seems to be a good route. I use this in ActiveRecord::Extensions and
> so far it’s worked great. If you happen to pull down the source [0]
> look in the extensions.rb file and check out how the MySQL, PostgreSQL
> and Sqlite regular expression support is handled.

This looks like mixing in a module, still.

> Perhaps we need to get a thread going with AR plugin authors, see what
> stuff they’re monkeypatching, find a few common cases, and build an
> API?

> That sounds like a great idea.

> There could be and there probably is a better way to handle this query
> generation in ActiveRecord, I look forward to seeing what everyone can
> come up with together.

Me too!

I like Ernie’s suggestion of a SqlStatement-like object, or even better, settling on some duck-typing conventions for sql-like strings.

SqlString < String could also hold its bind variables. The database adapter could translate that to whatever native support for bound parameters, even prepare and cache the statement.

jeremy

> I like Ernie's suggestion of a SqlStatement-like object, or even better,
> settling on some duck-typing conventions for sql-like strings.

> SqlString < String could also hold its bind variables. The database adapter
> could translate that to whatever native support for bound parameters, even
> prepare and cache the statement.

I tried something like this as an experiment 6 months ago and it
looked like a promising approach. It's pretty simple to add a factory
method to abstract adapter to provide a new SqlString, and get active
record to use this method when building queries. Different adapters
have the opportunity to provide their own SqlString class, and the
responsibility for adding limits/offsets, joins, conditions,
parameters, can be shared between the adapter and this special string.
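
Roughly, and only as a sketch under my own assumptions (the add method, the binds reader and the new_sql_string factory are illustrative names, not code from that experiment), it could look something like this:

```ruby
# Hypothetical SqlString: a String that also carries its bind variables.
class SqlString < String
  attr_reader :binds

  def initialize(sql = "", binds = [])
    super(sql)
    @binds = binds.dup
  end

  # Appends a fragment together with the values for its "?" placeholders.
  def add(fragment, *values)
    self << " " << fragment
    @binds.concat(values)
    self
  end
end

# Hypothetical adapter with a factory method, so a concrete adapter
# could override it and return its own SqlString subclass.
class SqlStringSketchAdapter
  def new_sql_string(sql = "")
    SqlString.new(sql)
  end
end

statement = SqlStringSketchAdapter.new.new_sql_string("SELECT * FROM posts")
statement.add("WHERE author = ?", "Zach").add("LIMIT ?", 10)
```

Because the object is still a String, existing code that expects plain SQL keeps working, while an adapter that knows about binds can hand the text and the values to the database separately.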

Tom

> There seem to be two ways to handle query generation that are better
> than the existing methods:
> 1 - Generate queries inside the adapters
> 2 - Register queries to be associated with certain adapters.

> Currently I prefer #2 because it allows you to have only one spot
> where a particular query is generated and you can associate (register)
> that query with one or more database adapters. #1 all by itself makes
> sense, but since some db vendors do support the same semantics for
> certain functions or queries you should be able to support DRY for
> those overlapping queries. I also like #2 better because you can
> simply add a supported adapter once it has been tested.

> For example, let's say I register RegexpQuerySupport which currently
> has been tested on MySQL.
> register RegexpQuerySupport, :adapters=>:mysql

> Now if you download it and test it against PostgreSQL by changing
> ":mysql" to "[ :mysql, :postgresql ]" and it works, that's all the
> effort involved. If it didn't work, you could write a
> RegexpQuerySupport object inside the PostgreSQL adapter and register
> it for :postgresql.

> How is this different from include RegexpQuery in the adapter?

Includes are going to include the functionality within the adapter and
override methods (or cause infinite alias loops). There are cases when
you need to do this, but I don't think every additional include of new
query support needs to alias and reimplement sanitize_sql or similar
methods. Registering query support (aka: RegexpQuery) isolates the
overall effect new query support/functionality would have. I am
thinking of a 90/10 rule here. Perhaps I am wrong in my
thinking, though.

> What do you mean 'more modular', what are the specific problems
> you're hitting?

> I think query generation should be separated from ActiveRecord::Base.
> Right now to add custom query support you have to hardcode a method/
> query, or you have to override ActiveRecord::Base methods like
> sanitize_sql, quote to get it to work across the board depending on
> the functionality you're adding. The query support for ActiveRecord
> should be more modular so adding a new query won't necessarily break
> other code in ActiveRecord. Given this making the query support more
> component driven also isolates and limits the ripple effect that
> people's plugins would cause.

> Do you have a concrete example? What is 'adding a new query' and does it
> happen often?

Take regular expressions for example. Most database engines support
them, but each has a syntactically different way of expressing them
in an SQL statement. I don't want to alias/reimplement methods that I
shouldn't have to just to get the functionality. I think aliasing is a
powerful feature of Ruby, but I do not think it is always the best way
to go for having people add SQL functionality to AR. What are you
thinking here in terms of how to allow users to add support for things
like Regexps?

[snip]

> I like Ernie's suggestion of a SqlStatement-like object, or even better,
> settling on some duck-typing conventions for sql-like strings.

> SqlString < String could also hold its bind variables. The database adapter
> could translate that to whatever native support for bound parameters, even
> prepare and cache the statement.

This sounds like a good idea. Can you guys expand on this more for how
it'd handle queries in practice?

Zach

Hi Zach,

> I like Ernie's suggestion of a SqlStatement-like object, or even better,
> settling on some duck-typing conventions for sql-like strings.

> SqlString < String could also hold its bind variables. The database adapter
> could translate that to whatever native support for bound parameters, even
> prepare and cache the statement.

> This sounds like a good idea. Can you guys expand on this more for how
> it'd handle queries in practice?

I pulled together my efforts from last year into a blog post:

http://www.opendarwin.org/~drernie/C499496031/E20070226153152/index.html

Not very complete, but hopefully it gives you a flavor of what I was attempting -- and how much I had to patch ActiveRecord to get the behavior I wanted.

Best,
-- Ernie P.