'columns' in Active Resource - proposed patch

So, one of the things I wanted to add to Active Resource is the ability to have knowledge of expected 'columns'. ie a set of things that ARes knows are attributes of the given resource.

Why do I want this? First let me explain the current situation.

Right now - ARes sets attributes based on whatever comes back from the remote system when you do a fetch. This is fine and dandy and need not be changed... but what if we're creating a brand new resource and want to run local validations over it?

Say I'm implementing a flickr-image-submitting service. The flickr API is known and published. Here's an example that shows up the problem:

  class FlickrPhoto << ActiveResource::Base     validates_presence_of :photo   end

  # this works   my_photo = FlickrPhoto.new(:photo => 'photo_stream_here', :title => 'eben', :tags => ['cat'])   my_photo.valid? # => true   my_photo.save # => true   my_photo.photo # 'photo_stream_here'

  # this doesn't   my_photo = FlickrPhoto.new(:title => 'snooch', :tags => ['cat']) # note: no photo   my_photo.photo # throws NoMethodError exception   my_photo.valid? # throws NoMethodError exception   my_photo.save # throws NoMethodError exception

All the latter three don't work because we're calling 'photo' which doesn't exist as a method on FlickrPhoto - even though we know in advance (due to the published API) that photo is a required column. This is pretty annoying when the validates_presence_of method should be responding to exactly this situation...

We *could* overload validates_presence_of to respond to a NoMethodException... but we'd probably also have to update errors.on and the view-methods and various other locations... when all we really want is to add something into method_missing to return 'nil' if we know it's a column... but don't have a value yet.

Active Record uses a 'columns' method that is populated by reflecting on the db table... clearly Active Resource doesn't have that luxury.

We could get all fancy and do a remote lookup on the remote system... or we could just set the columns ourselves given that we know what they should contain. eg:

  class FlickrPhoto << ActiveResource::Base     columns = [:photo, :title, :description, :tags]     validates_presence_of :photo   end

  # now this does...   my_photo = FlickrPhoto.new(:title => 'snooch', :tags => ['cat']) # note: no photo   my_photo.photo # 'nil'   my_photo.valid? # returns 'false' with errors.on(:photo) == ['can't be blank']   my_photo.save # returns 'false' with errors.on(:photo) == ['can't be blank']

The latter seems like the simplest solution and AFAICS will not interfere with how ARes currently runs.

It also has a nice side-effect of allowing plugins that assume the existence of the 'columns' method to work with ARes as well. The plugins shouldn't care that the columns come from the db-table or otherwise...

I've provided a working proposed patch (below) to implement this and would value opinion on a) the patch as is and b) any other options.

Note - this follows from discussion (and my original proposal) in the following thread: http://groups.google.com/group/rubyonrails-core/browse_thread/thread/2afb6dcb7c555eb1

Cheers, Taryn

From 1cb4d7178f716d3c865ab8c36819c0f772fbfa01 Mon Sep 17 00:00:00 2001

I've provided a working proposed patch (below) to implement this and would value opinion on a) the patch as is and b) any other options.

Note - this follows from discussion (and my original proposal) in the following thread: http://groups.google.com/group/rubyonrails-core/browse_thread/thread/2afb6dcb7c555eb1

I can't make the patch apply thanks to email clients inserting newlines somewhere along the way. However I really like the implementation you have here.

My only thought is a simple one, should these be called 'columns'? Why not attributes?

If you could give us a patch on a lighthouse ticket I think this could be good to go.

> I've provided a working proposed patch (below) to implement this and > would value opinion on a) the patch as is and b) any other options.

> Note - this follows from discussion (and my original proposal) in the > following thread: >http://groups.google.com/group/rubyonrails-core/browse_thread/thread/

I can't make the patch apply thanks to email clients inserting newlines somewhere along the way. However I really like the implementation you have here.

damn - how annoying!

I've sent the patch to Josh as well - so he has a working copy. Because it's a bit of a radical departure (in theory) from how ARes works right now, he understandably wanted more comment on it from the rails-core team before he was willing to apply this one. :slight_smile:

My only thought is a simple one, should these be called 'columns'? Why not attributes?

Attributes already exists on ARes - but is used in a different way. It doesn't represent a canonical list of expected attributes that you will either find (or not) on any given instance. It's more of a list of 'what attributes have been set so far on this instance'. Repurposing an existing, working accessor would be more of a change than just adding a new one... thus a different name was necessary. I used 'columns' because that's what AR uses... and that has its advantages when you want to use AR and ARes interchangeably (which we do). Plugins that assume you have a 'columns' function... will then Just Work.

There's an argument for having a commonly-named accessor interface that does not have a db-inspired name... but that's another argument :wink:

If you could give us a patch on a lighthouse ticket I think this could be good to go.

see previous comment :wink: Josh has a copy - but I need the big guns to run their eyes over this to make sure it's ok for go. Any chance you could poke the right people to come look?

Cheers, Taryn

Hey Taryn, sorry it took so long for me to get back to you.

My only concern is the "column" name, as Koz mentioned. We really should have a common term other than "column" that could apply to any ActiveModel.

I'm trying to contact DHH or Rick Olson who drafted some of this stuff already.

-- Josh

Fantastic. Obviously from a functionality POV - it doesnt matter what it's called, so yeah - it's good that this is all that's missing.

Hmmm. attributes really is quite natural for this... but attributes covers all the ephemeral ones as well.

How about 'persisted_attributes' or something similar?

As a bit of a contrast to, say, 'attributes_before_typecast' and other similar stuff.

Cheers, Taryn

So David and I decided we both like "schema" for containing the list of attributes and types for the model. It seems like a good fit for any "ActiveModel".

In ActiveResource's case, it will have an "undefined schema" by default where it just uses any attribute it receives in the response. This basically the current behavior and we still want to support it.

Its still half baked, but we've been talking for a while now about controllers providing the model schema. So "GET /people/new.xml => PeopleController#schema" would return a xml or json schema that ActiveResource could configure its "auto schema" with.

The third option is a "defined schema" where you can specify exactly what attributes and types you want.

Here is DHH's example, its similar to migration definitions.

  class Person < ActiveRecord::Base     schema do       string :type, :description       string :name, :url_name, :limit => 50       integer :project_id       integer :elements_count, :default => 0, :null => false     end   end

So David and I decided we both like "schema" for containing the list of attributes and types for the model. It seems like a good fit for any "ActiveModel".

In ActiveResource's case, it will have an "undefined schema" by default where it just uses any attribute it receives in the response. This basically the current behavior and we still want to support it.

Yep - makes sense

Its still half baked, but we've been talking for a while now about controllers providing the model schema. So "GET /people/new.xml => PeopleController#schema" would return a xml or json schema that ActiveResource could configure its "auto schema" with.

Yeah, a few people have suggested something of this sort - good idea. The idea people have is that they'd send back some sort of xml chunk that defines all the columns in the schema... I'm not sure how we'd want that to be formatted to point out column-types and any other restrictions etc.

As a bare minimum - just name-type key pairs could work quite well to begin with eg:

http://my_remote_site/people/schema.xml

returns:

<person>   <name type="string" />   <description type="text" />   <project_id type="integer" /> </person>

The third option is a "defined schema" where you can specify exactly what attributes and types you want.

Here is DHH's example, its similar to migration definitions.

class Person < ActiveRecord::Base schema do string :type, :description string :name, :url_name, :limit => 50 integer :project_id integer :elements_count, :default => 0, :null => false end end

Nice - yes, that also takes it all to the next level of allowing us to know what the column *types* are too. I had some similar thoughts along that line, but it was all hazy in my head how we could do it. This example firms it up nicely.

I'm a little uncertain about how to treat the limits/'not null' etc... Are they relevant/necessary? They are pretty databasey and we don't necessarily have a database here... I'm just a bit concerned they may interfere with validations, unless you envisage some sort of composite validation thing springing from them. (eg a limit becomes validates_length_of)

Defaults are probably pretty simple to implement - just add them to an accessor if the attribute is null/blank. :slight_smile:

The others are also do-able (of course), but I would like some clarification on whether you reckon they would serve as a complement to the validations, or can be *implemented* using validations.. or whatever you have in mind. :slight_smile:

I'll go ahead and start to alter my existing code to match the above (unless otherwise told).

Taryn

As a bare minimum - just name-type key pairs could work quite well to begin with eg:

http://my_remote_site/people/schema.xml

returns:

<person> <name type="string" /> <description type="text" /> <project_id type="integer" /> </person>

The problem with this is we're reinventing a schema language and people are going to (rightly) ask why not just use relaxng or xsd. Supporting them will be way more work.

new.xml will give you the fields, the defaults, and it's pretty straight forward. Fundamentally though I just don't think there's much utility to programmatic discovery like this, the rest of your code makes assumptions about what attributes are there, and there's no harm in you codifying this in your local app.

Having said that I think it'd be a neat feature for simple lo-fi APIs where you control both ends of the wire. I'm a little uncertain about how to treat the limits/'not null'

Are they relevant/necessary? They are pretty databasey and we don't necessarily have a database here... I'm just a bit concerned they may interfere with validations, unless you envisage some sort of composite validation thing springing from them. (eg a limit becomes validates_length_of)

The others are also do-able (of course), but I would like some clarification on whether you reckon they would serve as a complement to the validations, or can be *implemented* using validations.. or whatever you have in mind. :slight_smile:

For now I'd suggest just relying on the user to provide validations to pick them up. At some stage in the future we can look at tying auto-validations in from the schema definitions.

> As a bare minimum - just name-type key pairs could work quite well to > begin with eg:

>http://my_remote_site/people/schema.xml

> returns:

> <person> > <name type="string" /> > <description type="text" /> > <project_id type="integer" /> > </person>

The problem with this is we're reinventing a schema language and people are going to (rightly) ask why not just use relaxng or xsd. Supporting them will be way more work.

This was considered more of a 'future idea' than something to do right away anyway, I can see the point, though, in supporting something that already exists rather than re-inventing yet another language. Possibly even supporting just an extremely stripped-down subset of either of the above... Is there a gem that tracks either of the above fairly well?

new.xml will give you the fields, the defaults, and it's pretty straight forward.

Not if the remote API is not written in Rails... I'm trying to steer clear of the 'remote API is Rails' assumption that seems inherent in a lot of ARes design. Our system is written to interface with a remote Cold Fusion site - so I've butted up against a number of these assumptions. :slight_smile:

I've definitely considered the idea of using a blank new command as the schema-definition - it's just as good an idea as schema.xml I guess I consider that schema.xml is more explicit for what you want.

In either case, in any system other than Rails, this function has to be written explicitly so having a vague idea of the kind of thing we'd expect to be returned would be a good thing to know. Having it named something different to 'new' means you steer clear of conflicts in systems that aren't truly RESTful - it gives one more layer of certainty that we don't accidentally create a new, empty widget every time we just want to fetch the schema :wink:

Fundamentally though I just don't think there's much utility to programmatic discovery like this, the rest of your code makes assumptions about what attributes are there, and there's no harm in you codifying this in your local app.

Actually I agree :slight_smile: I think it's much more likely to be useful that the person that writes the API call up the people that own the remote system and ask for a copy of their API and then just codify it into the app... but there seems to be a desire for - and who am I to argue :wink: That being said, it won't be the first feature I write.

Having said that I think it'd be a neat feature for simple lo-fi APIs where you control both ends of the wire.

I think in this case it's even easier to do the schema-definition in the app. The only time I see it being more useful is if the remote system is prone to frequent change, and we want to track that change... and we *don't* want to use the auto-discovery based on attributes that is what the current system already does (eg because we have validates_presence_of on some of the fields).

Still - I'd be looking for a sensible use-case that wasn't covered by existing ideas...

I'm a little uncertain about how to treat the limits/'not null'

<snippage>

For now I'd suggest just relying on the user to provide validations to pick them up. At some stage in the future we can look at tying auto-validations in from the schema definitions.

Yeah, I figure that's the simplest solution - start with the basics and move on from there.

Cheers, Taryn

Just to continue the thread... I'm back on this (now that NaNoWriMo is over).

I've ported my columns over to schema and added tests. The current patch is still using an array of names for the schema (so obviously not the eventual requirement), but it's getting there.

If anybody wants to check it to see if it makes sense, the patch is available here: http://pastie.org/725959

Currently it: - accepts/returns the schema if set - if not set - will return the attributes of the instance - will 'respond_to?' the attributes in the schema - will return nil instead of MethodNotFound for attributes in the schema

Next work: - change it to accept on a hash, not an array - start doing funky things depending on the hash options (ie attributes_before_typecast ).

Taryn

and now I've got some baseline code working the discussion has moved to a new thread: http://groups.google.com/group/rubyonrails-core/browse_thread/thread/6b9631841a4daea2

Taryn