has_many seems funky when handling duplicate records

The following test fails on the last line as firm.clients.length == 2, not 4:

  def test_replace_with_duplicates     firm = Firm.find(:first)     c1, c2 = companies(:first_client),companies(:second_client)     assert_equal firm.clients, [c1, c2]     firm.clients = [c1, c1, c2, c2]     firm.save     firm.reload     assert_equal 4, firm.clients.length   end

Looking at the implementation of the "replace" method in ActiveRecord::Associations::AssociationCollection I can see where this behavior is coming from. Not what i was expecting really. Is this expected?

Jeff,

I wonder if it would make a difference if you changed line 3 to be:

c1, c2 = Company.find(1), Company.find(2)

I know that I have seen a difference in the past between loading the fixture from the db vs. loading with the fixture method.

--Tom

Hi Tom,

I gave the above a shot, but it gave the same results.

-J

Why would you want to associate the same client to the same firm multiple times anyway? The idea is that for a has_many you have unique records. For example if I had a post and it had a comment on it, if I did @blog.comments = [Comment.find(:first),Comment.find(:first)] it should only show ONE comment.

I agree with you in the case where a domain model expects a unique set of associations, but what about situation that doesn't? It's a similar difference between Arrays and Sets: Arrays can be unique or have duplicates, but Sets must be unique. I would understand if it was clear that has_many associations are treated like Sets, but nothing describes them that way. And if they were, what would be the point of the ":uniq" option on has_many and HABTM? It would be redundant, right?

What is the SQL in the test.log telling you about the objects? Is it actually creating two additional clients with the same attributes? If not, I don't see how it would be possible after the reload for the firm to 4 clients?

It is not creating two new clients with the same attributes. The "replace" method in AssoctiationCollection, which is what is called when you do something like model_instance.collection = [other_model1, other_model2], looks like this:

      # Replace this collection with +other_array+       # This will perform a diff and delete/add only records that have changed.       def replace(other_array)         other_array.each { |val| raise_on_type_mismatch(val) }

        load_target         other = other_array.size < 100 ? other_array : other_array.to_set         current = @target.size < 100 ? @target : @target.to_set

        @owner.transaction do           delete(@target.select { |v| !other.include?(v) })           concat(other_array.select { |v| !current.include?(v) })         end       end

Certainly if the size of the existing or supplied array is > 100, it's going to ignore duplicates, but also the delete and concat calls, because they don't check for duplicate instances of the same object, will ignore them as well. That means that if @target and other_array contain the same unique set of objects replace will do nothing. For example, consider ModelA which has_many ModelB. Assume instance mod_a.model_bs == [ mod_b ]. If that were the case, the following line would have no effect:

mod_a.model_bs = [mod_b, mod_b, mod_b, mod_b, mod_b, mod_b, mod_b, mod_b, mod_b]

Similarly, if mod_a.model_bs already was set to [mod_b, mod_b, mod_b, mod_b, mod_b, mod_b, mod_b, mod_b, mod_b], and we did the following, it would have no effect:

mod_a.model_bs = [mod_b]

What I'm suggesting is that if we created the has_many relationship with :uniq => true I would expect this behavior. Instead it seems that :uniq => true only has an effect on database reads. "collection=" is documented in the API as, "replaces the collections content by deleting and adding objects as appropriate", which doesn't seem true if the current or new collection contain duplicates

jeff b wrote:

The following test fails on the last line as firm.clients.length == 2, not 4:

  def test_replace_with_duplicates     firm = Firm.find(:first)     c1, c2 = companies(:first_client),companies(:second_client)     assert_equal firm.clients, [c1, c2]     firm.clients = [c1, c1, c2, c2]     firm.save     firm.reload     assert_equal 4, firm.clients.length   end

Looking at the implementation of the "replace" method in ActiveRecord::Associations::AssociationCollection I can see where this behavior is coming from. Not what i was expecting really. Is this expected?

c1,c1 is the SAME db record, and it's a primary key in your db.. unless you dup c1 and save with a different id, you will still only have one model in your db with that id.

The only way to do the functionality that you desire is by adding in a link table and exploring has and belongs to many relationships as the link table can keep "duplicate" associations..

Forget about rails and look at it from the db side of things..

hth

ilan

What I'm suggesting is that if we created the has_many relationship with :uniq=> true I would expect this behavior. Instead it seems that :uniq=> true only has an effect on database reads. "collection=" is documented in the API as, "replaces the collections content by deleting and adding objects as appropriate", which doesn't seem true if the current or new collection contain duplicates

yes. if one says

class Users < ActiveRecord::Base   has_and_belongs_to_many :roles, :uniq => true end

then, at minimum, rails should do something like

before_save do |record|   record.send(:roles).uniq! end

under the hood for you. anything else is very non-POLS, to put it mildly.

kind regards.