HasAndBelongsToManyAssociation.build bug or limitation?

Say categories and products have a habtm relationship:

   product = @category.products.build(attributes)
   product.categories

Why does product.categories return an empty array instead of the associated category?

Of course in this case the code can just use @category, but I'd like to access the category inside of an after_save callback within the product model and I just get an empty array.

-peter

Does “product.categories(true)” return the same for you?

Yes, passing in true for force_reload still returns an empty array.

I've tried patching HasAndBelongsToManyAssociation.build with something like this:

   record.send(@reflection.active_record.name.tableize) << @owner

But this results in an endless loop during save.

-peter

Can you post your schema/migrations for the tables, and the relevant
assertions from the model classes?

The problem, I believe, stems from the fact that ActiveRecord
associations are not bidirectional. In other words, doing
@category.products << some_product does not automatically populate
the corresponding some_product.categories collection with @category,
until after you save.

The following should work (and if it doesn't, would be a bug):

   @category.products << some_product
   @category.save
   assert some_product.categories(:reload).include?(@category)

The fact that AR does not treat associations as bidirectional is not
a bug, merely a consequence of how it was designed.
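
As a rough illustration of what "not bidirectional" means in memory, here is a plain-Ruby sketch. It does not use ActiveRecord at all; ToyCategory, ToyProduct and link are made-up names for this example only, and link is just one possible way of keeping both sides in sync by hand.

```ruby
# Toy stand-ins for the two models; plain Ruby objects, no database.
class ToyCategory
  attr_reader :name, :products

  def initialize(name)
    @name = name
    @products = []
  end
end

class ToyProduct
  attr_reader :name, :categories

  def initialize(name)
    @name = name
    @categories = []
  end
end

# Hypothetical helper that updates both in-memory collections together.
def link(category, product)
  category.products << product unless category.products.include?(product)
  product.categories << category unless product.categories.include?(category)
end

category = ToyCategory.new("test")
product  = ToyProduct.new("test")

# One-directional: only the category's collection learns about the product.
category.products << product
one_way = product.categories.include?(category)   # false

# Both directions, maintained explicitly.
link(category, product)
two_way = product.categories.include?(category)   # true
```

The first append mirrors the behaviour described above: the other side of the relationship stays empty until something populates it.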

- Jamis

Here's the code from the test rails app I created to verify this problem:

Migration:

   create_table :products do |t|
     t.column :name, :string
   end

   create_table :categories do |t|
     t.column :name, :string
   end

   create_table :categories_products, :id => false do |t|
     t.column :category_id, :integer
     t.column :product_id, :integer
   end

Models:

   class Product < ActiveRecord::Base
     has_and_belongs_to_many :categories
   end

   class Category < ActiveRecord::Base
     has_and_belongs_to_many :products
   end

Controller:

   render :text => Category.new(:name => 'test').products.build(:name => 'test').categories(true).size

0 is always rendered by the controller. I also tried saving the new category first, but that made no difference.

-peter

Yes, it does work after save.

I've reworked (hacked) my code to work around this behavior, but I still believe that AR should faithfully report the state of the model for new and saved records and I'd be happy to contribute time to making it so.

-peter

This is definitely something which could be fixed, but don't underestimate how much work it is to support bi-directional associations completely.

habtm - habtm
belongs_to - has_many
belongs_to - has_one
has_many - through

etcetcetc

There are a large number of permutations here, and it'll be a lot of work to get it right. It should probably start as a plugin, though I'm happy to add some hooks if it needs them.

Best of luck :)

The problem, I believe, stems from the fact that ActiveRecord
associations are not bidirectional. In other words, doing
@category.products << some_product does not automatically populate
the corresponding some_product.categories collection with @category,
until after you save.

That's true; perhaps a more graphic example of the design is this:

   p = Page.find(1)               # select * from pages where id = 1
   => #<Page:0xb7381ec4 @attributes={...}>

   p.page_fragments.first         # select * from page_fragments where page_id = 1
   => #<PageFragment:0xb73498f8 @attributes={...}>

   p.page_fragments.first.page    # select * from pages where id = 1
   => #<Page:0xb72f89bc @attributes={...}>

Note that the Page object returned by the last expression is a different object from 'p'. And, as I've annotated above, the last expression will do another database query, even though the 'correct' object is already available in memory. Objects created by loading an association don't have back references to the object they are associated with.

One place where this can bite is if you're trying to use callbacks to update an associated record in the same way you might have used database triggers in the past. For example, imagine an order table and a line item table. You might want the order table to have a total field that is maintained as the total value of the associated line items. One way that you might implement part of this would be to have the LineItem before_destroy callback subtract the line item value from the order total:

   def before_destroy
     order.total -= total
   end

and you might expect to write something like

   o = Order.find(id)
   o.line_items.first.destroy

However, within the callback, 'order' does not refer to the same object as 'o', which has two implications:

1) The instance of order that the callback updates is never saved.
2) If you add a 'save!' call to the callback, the total gets updated, but 'o' then refers to an object that is out of date, so you need to reload it before you use the value of 'total'.

I found it difficult to implement model objects that transparently maintained a cached total like this in a way that worked reliably: most of the ways I tried would only work properly when the client code was written in certain ways (i.e. based on an understanding of what the implementation was doing). I wondered whether it was a bug, but as Jamis said it seems to be more an artifact of the current design and implementation. I do wonder whether it would be possible to make the behaviour more consistent with what might be expected, though.
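
To make the second implication concrete, here is a small plain-Ruby sketch. It is not ActiveRecord code: ORDERS_TABLE is a hash standing in for the database, and ToyOrder with its find, save! and reload methods is a hypothetical toy class that only imitates the behaviour described above (a finder that builds a fresh object on every call).

```ruby
# A hash standing in for the orders table (made-up data).
ORDERS_TABLE = { 1 => { total: 100 } }

class ToyOrder
  attr_reader :id
  attr_accessor :total

  def initialize(id, total)
    @id = id
    @total = total
  end

  # Builds a new object from the stored row on every call,
  # the way a finder without an identity map does.
  def self.find(id)
    new(id, ORDERS_TABLE.fetch(id)[:total])
  end

  # Writes this copy's total back to the "table".
  def save!
    ORDERS_TABLE[id][:total] = total
    self
  end

  # Re-reads the total from the "table" into this copy.
  def reload
    @total = ORDERS_TABLE.fetch(id)[:total]
    self
  end
end

o = ToyOrder.find(1)                  # the copy the client code holds
order_in_callback = ToyOrder.find(1)  # the separate copy a callback would see

order_in_callback.total -= 30
order_in_callback.save!

stale = o.total          # still 100: 'o' never saw the change
fresh = o.reload.total   # 70 after an explicit reload
```

The client code only sees the updated total if it knows to reload, which is the dependence on implementation details mentioned above.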

-Mark.

1) ActiveRecord does not have an editing context
2) ActiveRecord does not have an identity map
3) ActiveRecord objects try their best to be stateless (having only one bit of state information, which is called new_record?), and sometimes it sucks :)

The problem you described is a design decision of ActiveRecord, and sadly some very big stuff has got to happen to change that (should you really want to). If you start listing "problems" which stem from the fact that there is no context and no identity map in AR, you can pretty much flood the trac straight away.

I wish there was a different answer to that. To verify, try the following:

   folder = Folder.find(4)   # let's pretend a Folder has_many :documents
   first_document = folder.documents.first
   another_copy_of_the_document = Document.find(first_document.id)
   first_document.object_id == another_copy_of_the_document.object_id   # false

You can take a look at this: http://blog.ianbicking.org/sqlobject-api-redesign.html and just think of ActiveRecord in terms of "each object has its own editing context."
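
For anyone unfamiliar with the term, this is roughly what an identity map does, sketched in plain Ruby. ToyDocument and ToyIdentityMap are hypothetical names for this example; nothing like this exists in ActiveRecord as discussed in this thread, and a real one would also have to deal with expiry, transactions and so on.

```ruby
# Toy record class; plain Ruby, no database.
class ToyDocument
  attr_reader :id

  def initialize(id)
    @id = id
  end
end

# Hypothetical identity map: at most one in-memory object per (class, id).
class ToyIdentityMap
  def initialize
    @objects = {}
  end

  # Returns the cached object for this key, or builds it with the
  # block on first access and remembers it.
  def fetch(klass, id)
    @objects[[klass, id]] ||= yield
  end
end

# Without a map: two loads of "row 7" give two distinct objects.
without_map_a = ToyDocument.new(7)
without_map_b = ToyDocument.new(7)
distinct = !without_map_a.equal?(without_map_b)   # true

# With a map: the second lookup returns the first object.
map = ToyIdentityMap.new
with_map_a = map.fetch(ToyDocument, 7) { ToyDocument.new(7) }
with_map_b = map.fetch(ToyDocument, 7) { ToyDocument.new(7) }
shared = with_map_a.equal?(with_map_b)            # true
```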

I have to say that this result really doesn't surprise me.

AssociationProxy is a bit of magic that was blessed upon us by the core team. However, knowing how any ORM works is key to knowing why these super-helpful callbacks are not infallible. I think the behaviour is completely expected. How unexpected would *this* be!

   u = User.find(1)
   u.quote = "Boo Urns"
   m = User.find(1)
   m.first_name = "Hampton"
   m.save

And somehow, the User would end up having the quote of "Boo Urns" and the first name "Hampton".

Each representation of data from the DB is unique and separate. They aren't the DB row, but just the object representation until you save back to the ACID database.

That's why your responses made sense to me. If I'm calling another association on something, then I expect it to be a separate object. It would break my expectations if I asked for User.find(1) twice and got the *same* object. I mean, who gets the right data? If I'm not certain of who is king, then who do I replace?

u = User.find(1) u.quote = "Boo Urns"

# someone updates the quote on the db from another thread to be "Hello"

u.save

Personally, I'd expect that the quote would then become "Boo Urns"... but if we are actually connecting IDs in the database to single Object instances, then in theory, we would assume some sort of direct correlation between the object-representation and the database that is *direct*.

Just my rants after a bottle of wine.

There are limits to ORMs that allow the complexity to be addressed.

-hampton.

Well, there are some nice benefits to having an identity map. You can have transient state used for the duration of a request. However it introduces a lot more error conditions to consider.

   u = User.find(1)
   u.name = "Koz"
   u2 = User.find(1)

(Here you really have to either return the same instance as u, or raise an error.)

It'd be nice if we can solve the common situations such as:

u.posts.first.user == u

But a full on identity map is probably a way off yet.
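
A minimal sketch of that common case, in plain Ruby rather than ActiveRecord: the collection sets a back reference on each child as it builds it. ToyUser and ToyPost are hypothetical classes for illustration, and the list of titles stands in for whatever a query would return.

```ruby
class ToyPost
  attr_reader :title
  attr_accessor :user

  def initialize(title)
    @title = title
  end
end

class ToyUser
  attr_reader :name

  def initialize(name, post_titles)
    @name = name
    @post_titles = post_titles   # stands in for rows a query would return
  end

  # Builds the posts on first access and points each one back at this
  # exact user object, so no second lookup of the owner is needed.
  def posts
    @posts ||= @post_titles.map do |title|
      post = ToyPost.new(title)
      post.user = self
      post
    end
  end
end

u = ToyUser.new("Koz", ["first post", "second post"])
same_object = u.posts.first.user.equal?(u)   # true
```

This only covers the owner-to-child direction for one association type, which is why the full set of permutations listed earlier is so much more work.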