ActiveRecord Refactoring

Starting Monday I'll be working for myself, and one of the things I'm looking forward to is working on ActiveRecord more. I've written and released some ActiveRecord extensions/plugins [0], as well as posted some thoughts on ActiveRecord's modularity [1]. With this in mind I want to work up a patch to refactor how ActiveRecord itself handles queries.

A lot of what I'd like to do is very similar to my approach with better finder support in ActiveRecord::Extensions [1].

Since now is the time that a lot of changes are happening for Rails 2.0... what's the core group's feedback on making query support for ActiveRecord more modular?

Do you see the current AR implementation as limiting what AR can do, since any time someone needs to add functionality they have to write a plugin which overrides AR core methods?

In short, I think AR would get a lot of benefit if it included things like declarative-style registration of query support:

    register GenericQuerySupport, :adapters => :all

This way someone can easily register special functionality for MySQL, PostgreSQL, MS SQL, Oracle, etc., and release the plugin without overriding AR. They can just call:

    ActiveRecord::Base.register PostgreSQLRegexpSupport, :adapters => :postgresql
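To make the proposal concrete, here is a minimal, self-contained sketch of what such a registry could look like. Nothing in it exists in ActiveRecord: QuerySupportRegistry, its register and components_for methods, and the two empty support modules are hypothetical names used only for illustration.

```ruby
# Hypothetical registry sketch; not part of ActiveRecord.
module QuerySupportRegistry
  # Maps an adapter name (or :all) to the components registered for it.
  def self.registrations
    @registrations ||= Hash.new { |hash, key| hash[key] = [] }
  end

  # Associate a query-support component with one or more adapters.
  def self.register(component, options = {})
    adapters = Array(options.fetch(:adapters, :all))
    adapters.each { |name| registrations[name] << component }
    component
  end

  # Components registered for :all, followed by adapter-specific ones.
  def self.components_for(adapter_name)
    registrations[:all] + registrations[adapter_name]
  end
end

# Placeholder components standing in for real query support.
module GenericQuerySupport; end
module PostgreSQLRegexpSupport; end

QuerySupportRegistry.register GenericQuerySupport, :adapters => :all
QuerySupportRegistry.register PostgreSQLRegexpSupport, :adapters => :postgresql
```

With this shape, a plugin only adds an entry to the registry; it never reopens or aliases a core method.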

Thoughts?

Zach

0 - http://www.continuousthinking.com/tags/arext
1 - http://www.rubyinside.com/advent2006/17-extendingar.html

> Starting Monday I'll be working for myself and one of the things I'm looking forward to is working on ActiveRecord more.

Congratulations, it's a bunch of fun :).

> I've written and released some ActiveRecord extensions/plugins [0], as well as posted some thoughts on ActiveRecord's modularity [1]. With this in mind I want to work up a patch to refactor how ActiveRecord itself handles queries.

We've spoken about this several times in the past, and it's still one of our goals for the Rails 2.0 release. The query generation code is kinda ugly, and frankly a lot of it is in the wrong place.

We should be generating the queries inside the adapters; that way we can just use polymorphism to provide adapter-specific behaviour.
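As a rough illustration of the polymorphic approach, the sketch below lets each adapter class build its own regular-expression condition. The class names and the regexp_condition method are hypothetical and are not the real ActiveRecord adapters; the only facts assumed are that MySQL spells the operator REGEXP and PostgreSQL spells it ~.

```ruby
# Hypothetical adapter hierarchy; not the real ActiveRecord adapters.
class AbstractSqlAdapter
  # Naive quoting for illustration only: doubles embedded single quotes.
  def quote(value)
    "'" + value.to_s.gsub("'", "''") + "'"
  end

  # Subclasses override this with their own SQL dialect.
  def regexp_condition(column, pattern)
    raise NotImplementedError, "#{self.class} does not support regular expressions"
  end
end

class MysqlLikeAdapter < AbstractSqlAdapter
  def regexp_condition(column, pattern)
    "#{column} REGEXP #{quote(pattern)}"
  end
end

class PostgresqlLikeAdapter < AbstractSqlAdapter
  def regexp_condition(column, pattern)
    "#{column} ~ #{quote(pattern)}"
  end
end

# The caller asks whichever adapter is active and never branches on its type.
[MysqlLikeAdapter.new, PostgresqlLikeAdapter.new].each do |adapter|
  puts adapter.regexp_condition("name", "^Z")
end
```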

> A lot of what I'd like to do is very similar to my approach with better finder support in ActiveRecord::Extensions [1].

> Since now is the time that a lot of changes are happening for Rails 2.0... what's the core group's feedback on making query support for ActiveRecord more modular?

What do you mean by 'more modular'? What are the specific problems you're hitting? I think there's probably a place for adding some stable APIs to make some common customisations easier. So long as it doesn't materially impact the complexity of ActiveRecord, or the performance of it, then I think everyone wins when plugins don't need to break every release.

Perhaps we need to get a thread going with AR plugin authors, see what stuff they're monkeypatching, find a few common cases, and build an API?

I'd like to see the condition methods moved out of ActiveRecord and into a more general location...

    condition_block?
    evaluate_condition

They currently reside in validations.rb

The functionality they provide can be used nicely in ActionController and possibly other places where class methods accept a "method" type parameter like the validation methods currently do with the :if parameter.
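For readers who have not looked at those helpers, the following is a sketch of the general idea in a standalone module. It is not a copy of the code in validations.rb: the ConditionEvaluation module, the Account struct, and the choice to raise ArgumentError instead of an ActiveRecord-specific exception are all assumptions made for illustration.

```ruby
# Hypothetical standalone version of the condition helpers.
module ConditionEvaluation
  # True for procs, lambdas and Method objects that can take one argument.
  def self.condition_block?(condition)
    condition.respond_to?(:call) &&
      condition.respond_to?(:arity) &&
      (condition.arity == 1 || condition.arity == -1)
  end

  # Evaluate an :if-style option against a target object.
  def self.evaluate_condition(condition, target)
    case condition
    when Symbol then target.send(condition)           # name of a method on the target
    when String then target.instance_eval(condition)  # code evaluated on the target
    else
      unless condition_block?(condition)
        raise ArgumentError, "condition must be a Symbol, a String, or a one-argument callable"
      end
      condition.call(target)
    end
  end
end

# Example target; any object (a record, a controller) would do.
Account = Struct.new(:admin) do
  def admin?
    admin
  end
end

account = Account.new(true)
ConditionEvaluation.evaluate_condition(:admin?, account)
ConditionEvaluation.evaluate_condition(lambda { |a| !a.admin? }, account)
```

Because nothing here refers to ActiveRecord, the same helper could sit behind an :if option on a controller-level class method.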

I've developed a plugin for ActionController and "borrowed" the code from ActiveRecord with some minor tweaks to remove the ActiveRecord specific exception code.

Thoughts?

Thanks, Andrew

Will the change make the Abstract Adapter really abstract while moving all generic SQL to “Abstract SQL Adapter” (which the current concrete adapters would subclass instead)? Removing all SQL-specific code from AR::Base would also be neat.

That change would make room for, say, an adapter for XML databases in the future!

One thing I experimented with a while back was an explicit Query object, returned by the Adapter.

http://urlx.org/opendarwin.org/73585

The idea is that rather than always passing direct SQL in, developers would typically call appropriate methods on the query object, and pass that.
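A minimal sketch of that idea follows. It is not the code behind the link above; the SelectQuery class and its where, limit, and to_sql methods are hypothetical names chosen for illustration, and a real adapter would return its own subclass to handle dialect differences.

```ruby
# Hypothetical query object; in the proposal an adapter would hand this out.
class SelectQuery
  attr_reader :binds

  def initialize(table)
    @table = table
    @conditions = []
    @binds = []
    @limit = nil
  end

  # Add a condition fragment with "?" placeholders and its bind values.
  def where(fragment, *values)
    @conditions << fragment
    @binds.concat(values)
    self
  end

  def limit(count)
    @limit = Integer(count)
    self
  end

  # Render the SQL text; bind values stay separate for the adapter to apply.
  def to_sql
    parts = ["SELECT * FROM #{@table}"]
    parts << "WHERE " + @conditions.join(" AND ") unless @conditions.empty?
    parts << "LIMIT #{@limit}" if @limit
    parts.join(" ")
  end
end

query = SelectQuery.new("people").where("name = ?", "Zach").where("age > ?", 21).limit(5)
puts query.to_sql
```

The caller builds the query through method calls and passes the object along, so only the adapter needs to know how the final SQL string is spelled.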

Alas, I never got very far. However, if this is compatible with the direction things are going, I’d be happy to help out.

– Ernie P.

It's highly unlikely that we'll make ActiveRecord anything other than a SQL database persistence tool. But if someone comes up with a nice XML database persistence thing, we'll refactor out common code into something like 'active model'.

[snip]

> We've spoken about this several times in the past, and it's still one of our goals for the Rails 2.0 release. The query generation code is kinda ugly, and frankly a lot of it is in the wrong place.

> We should be generating the queries inside the adapters; that way we can just use polymorphism to provide adapter-specific behaviour.

There seem to be two ways to handle query generation which are better than the existing methods:

    1 - Generate queries inside the adapters
    2 - Register queries to be associated with certain adapters

Currently I prefer #2 because it gives you only one spot where a particular query is generated, and you can associate (register) that query with one or more database adapters. #1 all by itself makes sense, but since some DB vendors do support the same semantics for certain functions or queries, you should be able to stay DRY for those overlapping queries. I also like #2 better because you can simply add a supported adapter once it has been tested.

For example, let's say I register RegexpQuerySupport, which currently has been tested on MySQL:

    register RegexpQuerySupport, :adapters => :mysql

Now if you download it and test it against PostgreSQL by changing ":mysql" to "[ :mysql, :postgresql ]" and it works, that's all the effort involved. If it didn't work, you could write a RegexpQuerySupport object inside the PostgreSQL adapter and register it for :postgresql.

> > A lot of what I'd like to do is very similar to my approach with better finder support in ActiveRecord::Extensions [1].

> > Since now is the time that a lot of changes are happening for Rails 2.0... what's the core group's feedback on making query support for ActiveRecord more modular?

> What do you mean by 'more modular'? What are the specific problems you're hitting?

I think query generation should be separated from ActiveRecord::Base. Right now, to add custom query support you have to hardcode a method/query, or you have to override ActiveRecord::Base methods like sanitize_sql and quote to get it to work across the board, depending on the functionality you're adding. The query support for ActiveRecord should be more modular so that adding a new query won't necessarily break other code in ActiveRecord. Given this, making the query support more component-driven also isolates and limits the ripple effect that people's plugins would cause.

> I think there's probably a place for adding some stable APIs to make some common customisations easier. So long as it doesn't materially impact the complexity of ActiveRecord, or the performance of it, then I think everyone wins when plugins don't need to break every release.

Implementing a behavioral pattern similar to chain of responsibility seems to be a good route. I use this in ActiveRecord::Extensions and so far it's worked great. If you happen to pull down the source [0], look in the extensions.rb file and check out how the MySQL, PostgreSQL and SQLite regular expression support is handled.
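To show the shape of that pattern without reproducing the plugin, here is a small sketch. It is not the code from extensions.rb: the handler classes, their process method, and build_condition are hypothetical, and the only dialect fact assumed is MySQL's REGEXP operator.

```ruby
# Hypothetical handlers; each returns [sql_fragment, bind_value] or nil.
class MysqlRegexpHandler
  def process(column, value)
    return nil unless value.is_a?(Regexp)
    ["#{column} REGEXP ?", value.source]
  end
end

class EqualityHandler
  def process(column, value)
    return nil if value.is_a?(Regexp)
    ["#{column} = ?", value]
  end
end

# Walk the chain and return the first handler's answer.
def build_condition(handlers, column, value)
  handlers.each do |handler|
    result = handler.process(column, value)
    return result if result
  end
  raise ArgumentError, "no handler for #{value.inspect}"
end

chain = [MysqlRegexpHandler.new, EqualityHandler.new]
build_condition(chain, "name", /^Z/)
build_condition(chain, "name", "Zach")
```

Adding support for a new kind of value means adding one handler to the chain; the existing handlers and the code that walks the chain stay untouched.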

> Perhaps we need to get a thread going with AR plugin authors, see what stuff they're monkeypatching, find a few common cases, and build an API?

That sounds like a great idea.

There could be, and there probably is, a better way to handle this query generation in ActiveRecord. I look forward to seeing what everyone can come up with together.

Zach

If you go the route of registering components for different adapters, you could support XML-based adapters, because query generation would be contained to the adapters that it supports, without having to muck up ActiveRecord to hack on XML support. I think this would be possible.

Zach

> There seem to be two ways to handle query generation which are better than the existing methods:
>    1 - Generate queries inside the adapters
>    2 - Register queries to be associated with certain adapters

> Currently I prefer #2 because it gives you only one spot where a particular query is generated, and you can associate (register) that query with one or more database adapters. #1 all by itself makes sense, but since some DB vendors do support the same semantics for certain functions or queries, you should be able to stay DRY for those overlapping queries. I also like #2 better because you can simply add a supported adapter once it has been tested.

> For example, let’s say I register RegexpQuerySupport, which currently has been tested on MySQL:
>    register RegexpQuerySupport, :adapters => :mysql

> Now if you download it and test it against PostgreSQL by changing “:mysql” to “[ :mysql, :postgresql ]” and it works, that’s all the effort involved. If it didn’t work, you could write a RegexpQuerySupport object inside the PostgreSQL adapter and register it for :postgresql.

How is this different from include RegexpQuery in the adapter?

> > What do you mean by ‘more modular’? What are the specific problems you’re hitting?

> I think query generation should be separated from ActiveRecord::Base. Right now, to add custom query support you have to hardcode a method/query, or you have to override ActiveRecord::Base methods like sanitize_sql and quote to get it to work across the board, depending on the functionality you’re adding. The query support for ActiveRecord should be more modular so that adding a new query won’t necessarily break other code in ActiveRecord. Given this, making the query support more component-driven also isolates and limits the ripple effect that people’s plugins would cause.

Do you have a concrete example? What is ‘adding a new query’ and does it happen often?

> > I think there’s probably a place for adding some stable APIs to make some common customisations easier. So long as it doesn’t materially impact the complexity of ActiveRecord, or the performance of it, then I think everyone wins when plugins don’t need to break every release.

> Implementing a behavioral pattern similar to chain of responsibility seems to be a good route. I use this in ActiveRecord::Extensions and so far it’s worked great. If you happen to pull down the source [0], look in the extensions.rb file and check out how the MySQL, PostgreSQL and SQLite regular expression support is handled.

This looks like mixing in a module, still.

> > Perhaps we need to get a thread going with AR plugin authors, see what stuff they’re monkeypatching, find a few common cases, and build an API?

> That sounds like a great idea.

> There could be, and there probably is, a better way to handle this query generation in ActiveRecord. I look forward to seeing what everyone can come up with together.

Me too!

I like Ernie’s suggestion of a SqlStatement-like object, or even better, settling on some duck-typing conventions for sql-like strings.

SqlString < String could also hold its bind variables. The database adapter could translate that to whatever native support for bound parameters, even prepare and cache the statement.

jeremy

> I like Ernie's suggestion of a SqlStatement-like object, or even better, settling on some duck-typing conventions for sql-like strings.

> SqlString < String could also hold its bind variables. The database adapter could translate that to whatever native support for bound parameters, even prepare and cache the statement.

I tried something like this as an experiment 6 months ago and it looked like a promising approach. It's pretty simple to add a factory method to the abstract adapter to provide a new SqlString, and get Active Record to use this method when building queries. Different adapters have the opportunity to provide their own SqlString class, and the responsibility for adding limits/offsets, joins, conditions, and parameters can be shared between the adapter and this special string.
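A minimal sketch of how a SqlString with bind variables and an adapter-side factory method might fit together is shown below. It is not the experiment described above: the add_condition and add_limit methods, the SketchAdapter class, and its new_sql_string factory are hypothetical names, and the WHERE detection is deliberately naive.

```ruby
# Hypothetical SQL string that carries its bind variables with it.
class SqlString < String
  attr_reader :binds

  def initialize(sql = "", binds = [])
    super(sql)
    @binds = binds.dup
  end

  # Append a condition fragment and remember its bind values.
  # Naive: checks for an existing WHERE by substring match.
  def add_condition(fragment, *values)
    keyword = include?(" WHERE ") ? " AND " : " WHERE "
    self << keyword << fragment
    @binds.concat(values)
    self
  end

  # An adapter-specific subclass could override this for other dialects.
  def add_limit(count)
    self << " LIMIT #{Integer(count)}"
    self
  end
end

# Hypothetical adapter exposing the factory method.
class SketchAdapter
  def new_sql_string(sql = "")
    SqlString.new(sql)
  end
end

sql = SketchAdapter.new.new_sql_string("SELECT * FROM people")
sql.add_condition("name = ?", "Zach").add_limit(10)
```

Because SqlString is still a String, code that expects plain SQL text keeps working, while an adapter that understands it can read binds and use native bound parameters.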

Tom

> > There seem to be two ways to handle query generation which are better than the existing methods:
> >    1 - Generate queries inside the adapters
> >    2 - Register queries to be associated with certain adapters

> > Currently I prefer #2 because it gives you only one spot where a particular query is generated, and you can associate (register) that query with one or more database adapters. #1 all by itself makes sense, but since some DB vendors do support the same semantics for certain functions or queries, you should be able to stay DRY for those overlapping queries. I also like #2 better because you can simply add a supported adapter once it has been tested.

> > For example, let's say I register RegexpQuerySupport, which currently has been tested on MySQL:
> >    register RegexpQuerySupport, :adapters => :mysql

> > Now if you download it and test it against PostgreSQL by changing ":mysql" to "[ :mysql, :postgresql ]" and it works, that's all the effort involved. If it didn't work, you could write a RegexpQuerySupport object inside the PostgreSQL adapter and register it for :postgresql.

> How is this different from include RegexpQuery in the adapter?

Includes are going to put the functionality inside the adapter and override methods (or cause infinite alias loops). There are cases when you need to do this, but I don't think every additional include of new query support needs to alias and reimplement sanitize_sql or similar methods. Registering query support (e.g. RegexpQuery) isolates the overall effect that new query support/functionality would have. I am thinking of a 90/10 rule here. Perhaps I am wrong in my thinking, though.

> > > What do you mean by 'more modular'? What are the specific problems you're hitting?

> > I think query generation should be separated from ActiveRecord::Base. Right now, to add custom query support you have to hardcode a method/query, or you have to override ActiveRecord::Base methods like sanitize_sql and quote to get it to work across the board, depending on the functionality you're adding. The query support for ActiveRecord should be more modular so that adding a new query won't necessarily break other code in ActiveRecord. Given this, making the query support more component-driven also isolates and limits the ripple effect that people's plugins would cause.

> Do you have a concrete example? What is 'adding a new query' and does it happen often?

Take regular expressions, for example. Most database engines support them, but each has its own syntax for expressing them in an SQL statement. I don't want to alias/reimplement methods that I shouldn't have to just to get the functionality. I think aliasing is a powerful feature of Ruby, but I do not think it is always the best way for people to add SQL functionality to AR. What are you thinking here in terms of how to allow users to add support for things like Regexps?

[snip]

> I like Ernie's suggestion of a SqlStatement-like object, or even better, settling on some duck-typing conventions for sql-like strings.

> SqlString < String could also hold its bind variables. The database adapter could translate that to whatever native support for bound parameters, even prepare and cache the statement.

This sounds like a good idea. Can you guys expand on this more for how it'd handle queries in practice?

Zach

Hi Zach,

> > I like Ernie's suggestion of a SqlStatement-like object, or even better, settling on some duck-typing conventions for sql-like strings.

> > SqlString < String could also hold its bind variables. The database adapter could translate that to whatever native support for bound parameters, even prepare and cache the statement.

> This sounds like a good idea. Can you guys expand on this more for how it'd handle queries in practice?

I pulled together my efforts from last year into a blog post:

http://www.opendarwin.org/~drernie/C499496031/E20070226153152/index.html

Not very complete, but hopefully it gives you a flavor of what I was attempting -- and how much I had to patch ActiveRecord to get the behavior I wanted.

Best,
-- Ernie P.