about duplication in a (HABTM) join table

I'm new to Rails and databases, so needless to say I'm pretty confused here. There's an issue I don't understand how to resolve; heck, I don't even know if it's an issue. I've searched for answers here, on forums, and on IRC, and found nothing. Anyway, I'll try to be as clear as I can, and I apologize for the many questions in one thread.

Let's say I have two tables, girls and boys (to spice up this topic), with their respective models, which in turn have a has_and_belongs_to_many relationship with each other. For that relationship to work, I have a join table called boys_girls, with two columns, boy_id and girl_id.

In Rails, I create a boy called brad and a girl called angelina. Now if I do:

brad.girls << angelina
brad.girls << angelina

not only will angelina be in his array twice (if only this stuff could happen in real life), but that relationship will also appear in two rows of the join table.

First question: as far as database performance and size goes, is this a problem?

Anyway, if I add :uniq => true to has_and_belongs_to_many in the models, ActiveRecord will successfully ignore this duplication when I reload the objects. But it will still act the way I described before: if I add a duplicate, it will show up in the existing array and it will be added to the table.

So, my second question is, how do I avoid this duplication?

I found in the Agile Web Development with Rails book that I can add an index to the join table right in the migration, and pass :unique => true to the add_index call. I have tried it, and saw no difference. I suppose it only configures the index to ignore duplicates, which would then resolve performance issues?
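For what it's worth, a unique index only covers the columns it is given, so here is a rough sketch of what such a migration might look like. It assumes the join table is named boys_girls as above and the older-style migration API that the :unique => true syntax suggests; the class name is made up for illustration. With an index over both columns, the database itself should refuse a second identical row, and Rails would surface that as a database error (ActiveRecord::StatementInvalid or one of its subclasses) rather than as a validation message.

```ruby
# Hypothetical migration; the class name is an assumption.
# Meant to live in db/migrate inside a Rails app, not to run standalone.
class AddUniqueIndexToBoysGirls < ActiveRecord::Migration
  def self.up
    # One index over BOTH columns: the pair (boy_id, girl_id) must be unique.
    add_index :boys_girls, [:boy_id, :girl_id], :unique => true
  end

  def self.down
    remove_index :boys_girls, :column => [:boy_id, :girl_id]
  end
end
```

If the table already holds duplicate rows, creating the index will fail until they are removed. Newer Rails versions expect a versioned superclass such as ActiveRecord::Migration[7.0] and instance methods instead of self.up and self.down.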

Also, I have read that validates_uniqueness_of can take other columns into account through :scope, but I'm not sure how to do that. Am I right to say that scope only limits the uniqueness check to a given set of rows? In that case it wouldn't help, right?
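One way to picture the scope question is with plain Ruby, no Rails involved. The sketch below only mimics the idea behind a uniqueness check on boy_id scoped to girl_id; the Row struct and the unique_within_scope? helper are hypothetical names, not part of any library. A candidate row is rejected only when an existing row matches on both columns, which amounts to requiring the pair to be unique.

```ruby
# Plain-Ruby illustration; Row and unique_within_scope? are made-up names.
Row = Struct.new(:boy_id, :girl_id)

# Mimics a uniqueness check on boy_id scoped to girl_id: the candidate
# clashes only with rows that share BOTH its boy_id and its girl_id.
def unique_within_scope?(rows, candidate)
  rows.none? do |row|
    row.boy_id == candidate.boy_id && row.girl_id == candidate.girl_id
  end
end

existing = [Row.new(1, 1)]

unique_within_scope?(existing, Row.new(1, 1)) # => false (same pair again)
unique_within_scope?(existing, Row.new(1, 2)) # => true  (same boy, other girl)
unique_within_scope?(existing, Row.new(2, 1)) # => true  (other boy, same girl)
```

So limiting uniqueness to "rows with the same girl_id" is exactly what rules out a repeated boy/girl pair while still allowing each boy and each girl to appear in many rows.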

And, in case validation in the model is the way to go, where should I put it? In the Girl and Boy models? Or should I create a model to represent the join table rows and do that validation there?

(Boys and girls might not exactly illustrate what I need. In my case, I really don't need to add more information to the join table, so validation would possibly be the only reason to create a third model.)

I resolved the issue with the last option, creating a join model (let's say I called it Bond), and a has_many through call within both Girl and Boy. Then all I had to do was add this to the join model:

validates_uniqueness_of :boy_id, :scope => :girl_id
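For anyone following along, the three models might look roughly like the sketch below. It assumes a table named bonds with boy_id and girl_id columns (the table name is my guess, following the usual naming convention for a model called Bond) and uses the older hash syntax from this thread. These classes are meant to sit in a Rails app's models directory, not to run on their own.

```ruby
# Sketch only; assumes a "bonds" table with boy_id and girl_id columns.
class Bond < ActiveRecord::Base
  belongs_to :boy
  belongs_to :girl

  # A given boy may be bonded to a given girl only once.
  validates_uniqueness_of :boy_id, :scope => :girl_id
end

class Boy < ActiveRecord::Base
  has_many :bonds
  has_many :girls, :through => :bonds
end

class Girl < ActiveRecord::Base
  has_many :bonds
  has_many :boys, :through => :bonds
end
```

With this in place, calling Bond.create(:boy => brad, :girl => angelina) a second time should come back as an unsaved record carrying a validation error on boy_id.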

Now everything works perfectly: the models prevent duplicates from being saved, returning a validation error if I try to save the same item twice in the same collection.

As for the issue of having duplicate items in a join table, in database terms: I talked to someone with much more database experience than me, who said that first, it's messy, and second, it could become a problem with a very large user base (thousands of users with thousands of items...). So I guess the right way is to avoid duplication, especially when deploying for a large user base, but for an internal tool with few users it's not a terrible issue.

Thanks for posting your solution. I was getting into habtm/through and I hadn't seen how naturally through works with validations.

This could qualify as the first installment of a Rails Quiz. Want to start it?

midwaltz wrote:

hi peter. a rails quiz? :) go ahead, use it..

although I won't be of much help further on: I might have a lot of questions, but no answers. I'm really new to rails and backend work, which is also a good reason not to take what I say for granted.

oh well. leaving tomorrow for holidays. only sand and trees. no join tables and validations there.