has_many seems funky when handling duplicate records

The following test fails on the last line as firm.clients.length == 2,
not 4:

  def test_replace_with_duplicates
    firm = Firm.find(:first)
    c1, c2 = companies(:first_client),companies(:second_client)
    assert_equal firm.clients, [c1, c2]
    firm.clients = [c1, c1, c2, c2]
    firm.save
    firm.reload
    assert_equal 4, firm.clients.length
  end

Looking at the implementation of the "replace" method in
ActiveRecord::Associations::AssociationCollection I can see where this
behavior is coming from. Not what i was expecting really. Is this
expected?

Jeff,

I wonder if it would make a difference if you changed line 3 to be:

c1, c2 = Company.find(1), Company.find(2)

I know that I have seen a difference in the past between loading the
fixture from the db vs. loading with the fixture method.

--Tom

Hi Tom,

I gave the above a shot, but it gave the same results.

-J

Why would you want to associate the same client to the same firm multiple times anyway? The idea is that for a has_many you have unique records. For example if I had a post and it had a comment on it, if I did @blog.comments
= [Comment.find(:first),Comment.find(:first)] it should only show ONE comment.

I agree with you in the case where a domain model expects a unique set
of associations, but what about situation that doesn't? It's a
similar difference between Arrays and Sets: Arrays can be unique or
have duplicates, but Sets must be unique. I would understand if it
was clear that has_many associations are treated like Sets, but
nothing describes them that way. And if they were, what would be the
point of the ":uniq" option on has_many and HABTM? It would be
redundant, right?

What is the SQL in the test.log telling you about the objects? Is it
actually creating two additional clients with the same attributes? If
not, I don't see how it would be possible after the reload for the
firm to 4 clients?

It is not creating two new clients with the same attributes. The
"replace" method in AssoctiationCollection, which is what is called
when you do something like model_instance.collection = [other_model1,
other_model2], looks like this:

      # Replace this collection with +other_array+
      # This will perform a diff and delete/add only records that have
changed.
      def replace(other_array)
        other_array.each { |val| raise_on_type_mismatch(val) }

        load_target
        other = other_array.size < 100 ? other_array :
other_array.to_set
        current = @target.size < 100 ? @target : @target.to_set

        @owner.transaction do
          delete(@target.select { |v| !other.include?(v) })
          concat(other_array.select { |v| !current.include?(v) })
        end
      end

Certainly if the size of the existing or supplied array is > 100, it's
going to ignore duplicates, but also the delete and concat calls,
because they don't check for duplicate instances of the same object,
will ignore them as well. That means that if @target and other_array
contain the same unique set of objects replace will do nothing. For
example, consider ModelA which has_many ModelB. Assume instance
mod_a.model_bs == [ mod_b ]. If that were the case, the following
line would have no effect:

mod_a.model_bs = [mod_b, mod_b, mod_b, mod_b, mod_b, mod_b, mod_b,
mod_b, mod_b]

Similarly, if mod_a.model_bs already was set to [mod_b, mod_b, mod_b,
mod_b, mod_b, mod_b, mod_b, mod_b, mod_b], and we did the following,
it would have no effect:

mod_a.model_bs = [mod_b]

What I'm suggesting is that if we created the has_many relationship
with :uniq => true I would expect this behavior. Instead it seems
that :uniq => true only has an effect on database reads.
"collection=" is documented in the API as, "replaces the collections
content by deleting and adding objects as appropriate", which doesn't
seem true if the current or new collection contain duplicates

jeff b wrote:

The following test fails on the last line as firm.clients.length == 2,
not 4:

  def test_replace_with_duplicates
    firm = Firm.find(:first)
    c1, c2 = companies(:first_client),companies(:second_client)
    assert_equal firm.clients, [c1, c2]
    firm.clients = [c1, c1, c2, c2]
    firm.save
    firm.reload
    assert_equal 4, firm.clients.length
  end

Looking at the implementation of the "replace" method in
ActiveRecord::Associations::AssociationCollection I can see where this
behavior is coming from. Not what i was expecting really. Is this
expected?

c1,c1 is the SAME db record, and it's a primary key in your db.. unless
you dup c1 and save with a different id, you will still only have one
model in your db with that id.

The only way to do the functionality that you desire is by adding in a
link table and exploring has and belongs to many relationships as the
link table can keep "duplicate" associations..

Forget about rails and look at it from the db side of things..

hth

ilan

What I'm suggesting is that if we created the has_many relationship
with :uniq=> true I would expect this behavior. Instead it seems
that :uniq=> true only has an effect on database reads.
"collection=" is documented in the API as, "replaces the collections
content by deleting and adding objects as appropriate", which doesn't
seem true if the current or new collection contain duplicates

yes. if one says

class Users < ActiveRecord::Base
  has_and_belongs_to_many :roles, :uniq => true
end

then, at minimum, rails should do something like

before_save do |record|
  record.send(:roles).uniq!
end

under the hood for you. anything else is very non-POLS, to put it
mildly.

kind regards.