HasAndBelongsToManyAssociation.build bug or limitation?

Say categories and products have a habtm relationship:

product = @category.products.build(attributes)
product.categories

Why does product.categories return an empty array instead of the
associated category?

Of course in this case the code can just use @category, but I'd like
to access the category inside of an after_save callback within the
product model and I just get an empty array.

-peter

Does “product.categories(true)” return the same for you?

Yes, passing in true for force_reload still returns an empty array.

I've tried patching HasAndBelongsToManyAssociation.build with
something like this:
record.send(@reflection.active_record.name.tableize) << @owner

But this results in an endless loop during save.

-peter

Can you post your schema/migrations for the tables, and the relevant
assertions from the model classes?

The problem, I believe, stems from the fact that ActiveRecord
associations are not bidirectional. In other words, doing
@category.products << some_product does not automatically populate
the corresponding some_product.categories collection with @category,
until after you save.

The following should work (and if it doesn't, would be a bug):

   @category.products << some_product
   @category.save
   assert some_product.categories(:reload).include?(@category)

The fact that AR does not treat associations as bidirectional is not
a bug, merely a consequence of how it was designed.
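
To make the one-way behaviour concrete, here is a minimal sketch in plain Ruby rather than ActiveRecord. ToyCategory, ToyProduct and the link helper are hypothetical stand-ins, not Rails API. The include? guard is one possible way to keep a two-sided update from calling itself forever; it is not a claim about what caused the loop in the patch above.

```ruby
# Plain-Ruby stand-ins for the two models; nothing here is ActiveRecord API.
class ToyCategory
  attr_reader :name, :products

  def initialize(name)
    @name = name
    @products = [] # in-memory collection, like an unsaved association
  end

  # Hypothetical helper: update both sides, guarding against re-entry.
  def link(product)
    return if products.include?(product)
    products << product
    product.link(self)
  end
end

class ToyProduct
  attr_reader :name, :categories

  def initialize(name)
    @name = name
    @categories = []
  end

  # Mirror image of ToyCategory#link, with the same guard.
  def link(category)
    return if categories.include?(category)
    categories << category
    category.link(self)
  end
end

category = ToyCategory.new('test')
one_way  = ToyProduct.new('one-way')
two_way  = ToyProduct.new('two-way')

category.products << one_way # only one side is told about the link
category.link(two_way)       # both sides are updated

puts one_way.categories.size # the reverse collection stays empty
puts two_way.categories.size # the reverse collection holds the category
```

Without the guard, each link call would invoke the other indefinitely, so any real implementation would need some equivalent check.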

- Jamis

Here's the code from the test rails app I created to verify this
problem:

Migration:
create_table :products do |t|
  t.column :name, :string
end
create_table :categories do |t|
  t.column :name, :string
end
create_table :categories_products, :id => false do |t|
  t.column :category_id, :integer
  t.column :product_id, :integer
end

Models:
class Product < ActiveRecord::Base
  has_and_belongs_to_many :categories
end
class Category < ActiveRecord::Base
  has_and_belongs_to_many :products
end

Controller:
render :text => Category.new(:name => 'test').products.build(:name =>
'test').categories(true).size

0 is always rendered by the controller. I also tried saving the new
category first, but that made no difference.

-peter

Yes, it does work after save.

I've reworked (hacked) my code to work around this behavior, but I
still believe that AR should faithfully report the state of the model
for new and saved records and I'd be happy to contribute time to
making it so.

-peter

This is definitely something which could be fixed, but don't
underestimate how much work it is to support bi-directional
associations completely.

habtm-habtm
belongs_to - has_many
belongs_to - has_one
has_many - through

etcetcetc

There are a large number of permutations here, and it'll be a lot of
work to get it right. It should probably start as a plugin, though
I'm happy to add some hooks if it needs them.

Best of luck :)

The problem, I believe, stems from the fact that ActiveRecord
associations are not bidirectional. In other words, doing
@category.products << some_product does not automatically populate
the corresponding some_product.categories collection with @category,
until after you save.

That's true; perhaps a more graphic example of the design is this:

p = Page.find(1) # select * from pages where id = 1
=> #<Page:0xb7381ec4 @attributes={...}>

p.page_fragments.first # select * from page_fragments where page_id = 1
=> #<PageFragment:0xb73498f8 @attributes={...}>

p.page_fragments.first.page # select * from pages where id = 1
=> #<Page:0xb72f89bc @attributes={...}>

Note that the Page object returned by the last expression is a different
object from 'p'. And, as I've annotated above, the last expression will
do another database query, even though the 'correct' object is already
available in memory. Objects created by loading an association don't
have back references to the object they are associated with.

One place where this can bite is if you're trying to use callbacks to
update an associated record in the same way you might have used database
triggers in the past. For example, imagine an order table and a line
item table. You might want the order table to have a total field that is
maintained as the total value of the associated line items. One way that
you might implement part of this would be to have the LineItem
before_destroy callback subtract the line item value from the order
total:

def before_destroy
  order.total -= total
end

and you might expect to write something like

o = Order.find(id)
o.line_items.first.destroy

However, within the callback 'order' does not refer to the same object
as 'o', which has two implications:

1) the instance of order that the callback updates is never saved.
2) if you add a 'save!' call to the callback the total gets updated but
'o' then refers to an object that is out of date, so you need to reload
it before you use the value of 'total'.
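
Both implications can be reproduced without a database. The following is a rough sketch in plain Ruby, not ActiveRecord: ROWS and ToyOrder are hypothetical stand-ins, and the only assumption is that every find builds a fresh copy of the row, as described above.

```ruby
# ROWS plays the part of the orders table; ToyOrder is a hypothetical
# stand-in for a model without an identity map.
ROWS = { 1 => { :total => 100 } }

class ToyOrder
  attr_reader :id
  attr_accessor :total

  # Every find builds a fresh in-memory copy of the row.
  def self.find(id)
    new(id, ROWS.fetch(id)[:total])
  end

  def initialize(id, total)
    @id = id
    @total = total
  end

  # Write this copy's total back to the "table".
  def save!
    ROWS[id][:total] = total
  end

  # Re-read the total from the "table".
  def reload
    @total = ROWS.fetch(id)[:total]
    self
  end
end

o = ToyOrder.find(1)

# What the callback effectively does: load its own copy and change it.
callback_copy = ToyOrder.find(1)
callback_copy.total -= 30

unsaved_total = ROWS[1][:total] # implication 1: nothing has been written
callback_copy.save!
stale_total = o.total           # implication 2: 'o' is now out of date
fresh_total = o.reload.total    # a reload brings 'o' back in line

puts [unsaved_total, stale_total, fresh_total].inspect
```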

I found it difficult to implement model objects that transparently
maintained a cached total like this in a way that worked reliably: most
of the ways I tried would only work properly when the client code was
written in certain ways (i.e. based on an understanding of what the
implementation was doing). I wondered whether it was a bug, but as Jamis
said it seems to be more an artifact of the current design and
implementation. I do wonder whether it would be possible to make the
behaviour more consistent with what might be expected, though.

-Mark.

1) ActiveRecord does not have an editing context
2) ActiveRecord does not have an identity map
3) ActiveRecord objects try their best to be stateless (having only one bit of state information, which is called new_record?) and sometimes it sucks :)

The problem you described is a design decision of ActiveRecord, and sadly
some very big stuff has to happen to change that (should you really want to). If you start listing "problems" which stem from the fact
that there is no context and no identity map in AR, you can pretty much flood the trac straight away.

I wish there was a different answer to that. To verify, try the following:

folder = Folder.find(4) # let's pretend a Folder has_many :documents
first_document = folder.documents.first
another_copy_of_the_document = Document.find(first_document.id)
first_document.object_id == another_copy_of_the_document.object_id # false

You can take a look at this:
http://blog.ianbicking.org/sqlobject-api-redesign.html
and just think of ActiveRecord in terms of "each object has its own editing context."
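
For comparison, here is a minimal sketch of what an identity map changes, in plain Ruby. ToyDocument, its ROWS table and find_mapped are all made up for illustration; this is not how ActiveRecord works, only the general pattern.

```ruby
# Hypothetical sketch: ToyDocument and its class-level cache are made up.
class ToyDocument
  ROWS = { 7 => 'report.txt' } # stands in for the documents table

  attr_reader :id, :name

  def initialize(id, name)
    @id = id
    @name = name
  end

  # No identity map: a new object on every call.
  def self.find(id)
    new(id, ROWS.fetch(id))
  end

  # With an identity map: one object per id for the life of the map.
  def self.find_mapped(id)
    @identity_map ||= {}
    @identity_map[id] ||= find(id)
  end
end

a = ToyDocument.find(7)
b = ToyDocument.find(7)
c = ToyDocument.find_mapped(7)
d = ToyDocument.find_mapped(7)

puts a.equal?(b) # separate copies of the same row
puts c.equal?(d) # one shared instance
```

The sketch leaves out the hard parts, such as when the map is cleared and what happens when the row changes underneath it.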

I have to say that this result really doesn't surprise me.

AssociationProxy is a bit of magic that was blessed upon us by the
core team. However, knowing how any ORM works is key to knowing why
these super-helpful callbacks are not infallible. I think the
behaviour is completely expected. How unexpected would *this* be!

u = User.find(1)
u.quote = "Boo Urns"
m = User.find(1)
m.first_name = "Hampton"
m.save

And somehow, the User would end up having the quote of "Boo Urns" and
the first name "Hampton".

Each representation of data from the DB is unique and separate. They
aren't the DB row, but just the object representation until you save
back to the ACID database.

That's why your responses made sense to me. If I'm calling another
association on something, then I expect it to be a separate object. It
would break my expectations if I asked for User.find(1) twice and got
the *same* object. I mean, who gets the right data? If I'm not certain
of who is king, then who do I replace?

u = User.find(1)
u.quote = "Boo Urns"

# someone updates the quote on the db from another thread to be "Hello"

u.save

Personally, I'd expect that the quote would then become "Boo Urns"...
but if we are actually connecting IDs in the database to single Object
instances, then in theory, we would assume some sort of direct
correlation between the object-representation and the database that is
*direct*.

Just my rants after a bottle of wine.

There are limits to ORMs that allow the complexity to be addressed.

-hampton.

There are limits to ORMs that allow the complexity to be addressed.

Well, there are some nice benefits to having an identity map. You can
have transient state used for the duration of a request. However it
introduces a lot more error conditions to consider.

u = User.find(1)
u.name="Koz"
u2 = User.find(1) # you really have to either return the same instance
                  # as u, or raise an error

It'd be nice if we can solve the common situations such as:

u.posts.first.user == u

But a full on identity map is probably a way off yet.
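
As a rough idea of the narrower fix, the sketch below sets a back reference at the moment the collection is built, in plain Ruby. ToyUser and ToyPost are hypothetical stand-ins and this is not how ActiveRecord loads associations; it only shows that the common case does not need a full identity map.

```ruby
# Hypothetical stand-ins; ToyUser#posts sets the back reference by hand.
ToyPost = Struct.new(:title, :user)

class ToyUser
  attr_reader :name

  def initialize(name, titles)
    @name = name
    @titles = titles
  end

  # Build the posts once and point each one back at this exact user object.
  def posts
    @posts ||= @titles.map { |title| ToyPost.new(title, self) }
  end
end

u = ToyUser.new('Koz', ['first post'])
puts u.posts.first.user.equal?(u) # the post refers back to the same object
```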