'columns' in Active Resource - proposed patch

So, one of the things I wanted to add to Active Resource is the
ability to have knowledge of expected 'columns'. ie a set of things
that ARes knows are attributes of the given resource.

Why do I want this?
First let me explain the current situation.

Right now - ARes sets attributes based on whatever comes back from the
remote system when you do a fetch. This is fine and dandy and need not
be changed... but what if we're creating a brand new resource and
want to run local validations over it?

Say I'm implementing a flickr-image-submitting service. The flickr API
is known and published. Here's an example that shows up the problem:

  class FlickrPhoto << ActiveResource::Base
    validates_presence_of :photo
  end

  # this works
  my_photo = FlickrPhoto.new(:photo => 'photo_stream_here', :title =>
'eben', :tags => ['cat'])
  my_photo.valid? # => true
  my_photo.save # => true
  my_photo.photo # 'photo_stream_here'

  # this doesn't
  my_photo = FlickrPhoto.new(:title => 'snooch', :tags => ['cat']) #
note: no photo
  my_photo.photo # throws NoMethodError exception
  my_photo.valid? # throws NoMethodError exception
  my_photo.save # throws NoMethodError exception

All the latter three don't work because we're calling 'photo' which
doesn't exist as a method on FlickrPhoto - even though we know in
advance (due to the published API) that photo is a required column.
This is pretty annoying when the validates_presence_of method should
be responding to exactly this situation...

We *could* overload validates_presence_of to respond to a
NoMethodException... but we'd probably also have to update errors.on
and the view-methods and various other locations... when all we really
want is to add something into method_missing to return 'nil' if we
know it's a column... but don't have a value yet.

Active Record uses a 'columns' method that is populated by reflecting
on the db table... clearly Active Resource doesn't have that luxury.

We could get all fancy and do a remote lookup on the remote system...
or we could just set the columns ourselves given that we know what
they should contain. eg:

  class FlickrPhoto << ActiveResource::Base
    columns = [:photo, :title, :description, :tags]
    validates_presence_of :photo
  end

  # now this does...
  my_photo = FlickrPhoto.new(:title => 'snooch', :tags => ['cat']) #
note: no photo
  my_photo.photo # 'nil'
  my_photo.valid? # returns 'false' with errors.on(:photo) == ['can't
be blank']
  my_photo.save # returns 'false' with errors.on(:photo) == ['can't
be blank']

The latter seems like the simplest solution and AFAICS will not
interfere with how ARes currently runs.

It also has a nice side-effect of allowing plugins that assume the
existence of the 'columns' method to work with ARes as well. The
plugins shouldn't care that the columns come from the db-table or
otherwise...

I've provided a working proposed patch (below) to implement this and
would value opinion on a) the patch as is and b) any other options.

Note - this follows from discussion (and my original proposal) in the
following thread:
http://groups.google.com/group/rubyonrails-core/browse_thread/thread/2afb6dcb7c555eb1

Cheers,
Taryn

From 1cb4d7178f716d3c865ab8c36819c0f772fbfa01 Mon Sep 17 00:00:00 2001

I've provided a working proposed patch (below) to implement this and
would value opinion on a) the patch as is and b) any other options.

Note - this follows from discussion (and my original proposal) in the
following thread:
http://groups.google.com/group/rubyonrails-core/browse_thread/thread/2afb6dcb7c555eb1

I can't make the patch apply thanks to email clients inserting
newlines somewhere along the way. However I really like the
implementation you have here.

My only thought is a simple one, should these be called 'columns'?
Why not attributes?

If you could give us a patch on a lighthouse ticket I think this could
be good to go.

> I've provided a working proposed patch (below) to implement this and
> would value opinion on a) the patch as is and b) any other options.

> Note - this follows from discussion (and my original proposal) in the
> following thread:
>http://groups.google.com/group/rubyonrails-core/browse_thread/thread/

I can't make the patch apply thanks to email clients inserting
newlines somewhere along the way. However I really like the
implementation you have here.

damn - how annoying!

I've sent the patch to Josh as well - so he has a working copy.
Because it's a bit of a radical departure (in theory) from how ARes
works right now, he understandably wanted more comment on it from the
rails-core team before he was willing to apply this one. :slight_smile:

My only thought is a simple one, should these be called 'columns'?
Why not attributes?

Attributes already exists on ARes - but is used in a different way. It
doesn't represent a canonical list of expected attributes that you
will either find (or not) on any given instance. It's more of a list
of 'what attributes have been set so far on this instance'.
Repurposing an existing, working accessor would be more of a change
than just adding a new one... thus a different name was necessary.
I used 'columns' because that's what AR uses... and that has its
advantages when you want to use AR and ARes interchangeably (which we
do). Plugins that assume you have a 'columns' function... will then
Just Work.

There's an argument for having a commonly-named accessor interface
that does not have a db-inspired name... but that's another
argument :wink:

If you could give us a patch on a lighthouse ticket I think this could be good to go.

see previous comment :wink:
Josh has a copy - but I need the big guns to run their eyes over this
to make sure it's ok for go.
Any chance you could poke the right people to come look?

Cheers,
Taryn

Hey Taryn, sorry it took so long for me to get back to you.

My only concern is the "column" name, as Koz mentioned. We really
should have a common term other than "column" that could apply to any
ActiveModel.

I'm trying to contact DHH or Rick Olson who drafted some of this stuff
already.

-- Josh

Fantastic. Obviously from a functionality POV - it doesnt matter what
it's called, so yeah - it's good that this is all that's missing.

Hmmm. attributes really is quite natural for this... but attributes
covers all the ephemeral ones as well.

How about 'persisted_attributes' or something similar?

As a bit of a contrast to, say, 'attributes_before_typecast' and other
similar stuff.

Cheers,
Taryn

So David and I decided we both like "schema" for containing the list
of attributes and types for the model. It seems like a good fit for
any "ActiveModel".

In ActiveResource's case, it will have an "undefined schema" by
default where it just uses any attribute it receives in the response.
This basically the current behavior and we still want to support it.

Its still half baked, but we've been talking for a while now about
controllers providing the model schema. So "GET /people/new.xml =>
PeopleController#schema" would return a xml or json schema that
ActiveResource could configure its "auto schema" with.

The third option is a "defined schema" where you can specify exactly
what attributes and types you want.

Here is DHH's example, its similar to migration definitions.

  class Person < ActiveRecord::Base
    schema do
      string :type, :description
      string :name, :url_name, :limit => 50
      integer :project_id
      integer :elements_count, :default => 0, :null => false
    end
  end

So David and I decided we both like "schema" for containing the list
of attributes and types for the model. It seems like a good fit for
any "ActiveModel".

In ActiveResource's case, it will have an "undefined schema" by
default where it just uses any attribute it receives in the response.
This basically the current behavior and we still want to support it.

Yep - makes sense

Its still half baked, but we've been talking for a while now about
controllers providing the model schema. So "GET /people/new.xml =>
PeopleController#schema" would return a xml or json schema that
ActiveResource could configure its "auto schema" with.

Yeah, a few people have suggested something of this sort - good idea.
The idea people have is that they'd send back some sort of xml chunk
that defines all the columns in the schema... I'm not sure how we'd
want that to be formatted to point out column-types and any other
restrictions etc.

As a bare minimum - just name-type key pairs could work quite well to
begin with eg:

http://my_remote_site/people/schema.xml

returns:

<person>
  <name type="string" />
  <description type="text" />
  <project_id type="integer" />
</person>

The third option is a "defined schema" where you can specify exactly
what attributes and types you want.

Here is DHH's example, its similar to migration definitions.

class Person < ActiveRecord::Base
schema do
string :type, :description
string :name, :url_name, :limit => 50
integer :project_id
integer :elements_count, :default => 0, :null => false
end
end

Nice - yes, that also takes it all to the next level of allowing us to
know what the column *types* are too.
I had some similar thoughts along that line, but it was all hazy in my
head how we could do it.
This example firms it up nicely.

I'm a little uncertain about how to treat the limits/'not null'
etc...
Are they relevant/necessary? They are pretty databasey and we don't
necessarily have a database here...
I'm just a bit concerned they may interfere with validations, unless
you envisage some sort of composite validation thing springing from
them. (eg a limit becomes validates_length_of)

Defaults are probably pretty simple to implement - just add them to an
accessor if the attribute is null/blank. :slight_smile:

The others are also do-able (of course), but I would like some
clarification on whether you reckon they would serve as a complement
to the validations, or can be *implemented* using validations.. or
whatever you have in mind. :slight_smile:

I'll go ahead and start to alter my existing code to match the above
(unless otherwise told).

Taryn

As a bare minimum - just name-type key pairs could work quite well to
begin with eg:

http://my_remote_site/people/schema.xml

returns:

<person>
<name type="string" />
<description type="text" />
<project_id type="integer" />
</person>

The problem with this is we're reinventing a schema language and
people are going to (rightly) ask why not just use relaxng or xsd.
Supporting them will be way more work.

new.xml will give you the fields, the defaults, and it's pretty
straight forward. Fundamentally though I just don't think there's
much utility to programmatic discovery like this, the rest of your
code makes assumptions about what attributes are there, and there's no
harm in you codifying this in your local app.

Having said that I think it'd be a neat feature for simple lo-fi APIs
where you control both ends of the wire.
I'm a little uncertain about how to treat the limits/'not null'

Are they relevant/necessary? They are pretty databasey and we don't
necessarily have a database here...
I'm just a bit concerned they may interfere with validations, unless
you envisage some sort of composite validation thing springing from
them. (eg a limit becomes validates_length_of)

The others are also do-able (of course), but I would like some
clarification on whether you reckon they would serve as a complement
to the validations, or can be *implemented* using validations.. or
whatever you have in mind. :slight_smile:

For now I'd suggest just relying on the user to provide validations to
pick them up. At some stage in the future we can look at tying
auto-validations in from the schema definitions.

> As a bare minimum - just name-type key pairs could work quite well to
> begin with eg:

>http://my_remote_site/people/schema.xml

> returns:

> <person>
> <name type="string" />
> <description type="text" />
> <project_id type="integer" />
> </person>

The problem with this is we're reinventing a schema language and
people are going to (rightly) ask why not just use relaxng or xsd.
Supporting them will be way more work.

This was considered more of a 'future idea' than something to do right
away anyway, I can see the point, though, in supporting something that
already exists rather than re-inventing yet another language.
Possibly even supporting just an extremely stripped-down subset of
either of the above...
Is there a gem that tracks either of the above fairly well?

new.xml will give you the fields, the defaults, and it's pretty
straight forward.

Not if the remote API is not written in Rails...
I'm trying to steer clear of the 'remote API is Rails' assumption that
seems inherent in a lot of ARes design.
Our system is written to interface with a remote Cold Fusion site - so
I've butted up against a number of these assumptions. :slight_smile:

I've definitely considered the idea of using a blank new command as
the schema-definition - it's just as good an idea as schema.xml
I guess I consider that schema.xml is more explicit for what you want.

In either case, in any system other than Rails, this function has to
be written explicitly so having a vague idea of the kind of thing
we'd expect to be returned would be a good thing to know.
Having it named something different to 'new' means you steer clear of
conflicts in systems that aren't truly RESTful - it gives one more
layer of certainty that we don't accidentally create a new, empty
widget every time we just want to fetch the schema :wink:

Fundamentally though I just don't think there's
much utility to programmatic discovery like this, the rest of your
code makes assumptions about what attributes are there, and there's no
harm in you codifying this in your local app.

Actually I agree :slight_smile:
I think it's much more likely to be useful that the person that writes
the API call up the people that own the remote system and ask for a
copy of their API and then just codify it into the app... but there
seems to be a desire for - and who am I to argue :wink:
That being said, it won't be the first feature I write.

Having said that I think it'd be a neat feature for simple lo-fi APIs where you control both ends of the wire.

I think in this case it's even easier to do the schema-definition in
the app. The only time I see it being more useful is if the remote
system is prone to frequent change, and we want to track that
change... and we *don't* want to use the auto-discovery based on
attributes that is what the current system already does (eg because we
have validates_presence_of on some of the fields).

Still - I'd be looking for a sensible use-case that wasn't covered by
existing ideas...

I'm a little uncertain about how to treat the limits/'not null'

<snippage>

For now I'd suggest just relying on the user to provide validations to
pick them up. At some stage in the future we can look at tying
auto-validations in from the schema definitions.

Yeah, I figure that's the simplest solution - start with the basics
and move on from there.

Cheers,
Taryn

Just to continue the thread... I'm back on this (now that NaNoWriMo is
over).

I've ported my columns over to schema and added tests. The current
patch is still using an array of names for the schema (so obviously
not the eventual requirement), but it's getting there.

If anybody wants to check it to see if it makes sense, the patch is
available here: http://pastie.org/725959

Currently it:
- accepts/returns the schema if set
- if not set - will return the attributes of the instance
- will 'respond_to?' the attributes in the schema
- will return nil instead of MethodNotFound for attributes in the
schema

Next work:
- change it to accept on a hash, not an array
- start doing funky things depending on the hash options (ie
attributes_before_typecast ).

Taryn

and now I've got some baseline code working the discussion has moved
to a new thread:
http://groups.google.com/group/rubyonrails-core/browse_thread/thread/6b9631841a4daea2

Taryn