Invalid source reflection macro?

I'm trying to model a somewhat complicated genetic relationship in Rails.

class Gene < ActiveRecord::Base
  belongs_to :species, :foreign_key => "species_id", :class_name => "Species"
  has_many :genes_orthogroups, :class_name => 'GenesOrthogroups' # join table model
  has_many :orthogroups, :through => :genes_orthogroups
  has_many :orthologs, :through => :orthogroups, :source => :genes,
           :conditions => 'gene.species_id != ortholog.species_id'
end

class GenesOrthogroups < ActiveRecord::Base
  belongs_to :gene
  belongs_to :orthogroup
end

class Orthogroup < ActiveRecord::Base
  has_many :genes_orthogroups, :class_name => 'GenesOrthogroups'
  has_many :genes, :through => :genes_orthogroups
  belongs_to :species_pair, :class_name => 'SpeciesPair', :foreign_key => "species_pair_id"
end

class SpeciesPair < ActiveRecord::Base
  belongs_to :species1, :foreign_key => "species1_id", :class_name => 'Species'
  belongs_to :species2, :foreign_key => "species2_id", :class_name => 'Species'
  has_many :orthogroups
end

class Species < ActiveRecord::Base
  has_many :genes
  has_many :species_pairs, :class_name => 'SpeciesPair'
end

I'm trying to print out a list of the orthologs with a partial. The list is generated with:

def find_ortholog_list(gene_id)
  ortholog_list = []
  Gene.find(gene_id).orthologs.each do |ortholog|
    ortholog_list << ortholog
  end
  return OrthologList.new(ortholog_list)
end

I get the following error message: Invalid source reflection macro :has_many :through for has_many :orthologs, :through => :orthogroups. Use :source to specify the source reflection.

If I change :source to :gene instead of :genes, I get: Could not find the source association(s) :gene in model Orthogroup. Try 'has_many :orthologs, :through => :orthogroups, :source => <name>'. Is it one of :genes_orthogroups, :genes, or :species_pair?

Essentially, any given gene should have zero, one, or multiple orthologs in another species. These are grouped into orthogroups, each consisting of genes from two species, and representing a many-to-many relationship. Genes in an orthogroup which are of the same species would be paralogs instead. Because there are more than just two species, a gene may participate in multiple orthogroups (up to one for each additional species). The generalization for ortholog/paralog is "homolog" or "homologue," incidentally.
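To make the intended relationship concrete, here is a minimal plain-Ruby sketch of it, independent of ActiveRecord. The Struct stand-ins, the helper names (orthologs_of, paralogs_of), and the sample genes are hypothetical and only illustrate the rule described above: genes sharing an orthogroup are orthologs if their species differ and paralogs if they match.

```ruby
# Plain-Ruby stand-ins for the models (hypothetical, for illustration only).
Gene = Struct.new(:name, :species)
Orthogroup = Struct.new(:genes)

# Orthologs: genes that share an orthogroup with `gene` but belong to another species.
def orthologs_of(gene, orthogroups)
  orthogroups.select { |group| group.genes.include?(gene) }
             .flat_map(&:genes)
             .reject { |other| other.species == gene.species }
             .uniq
end

# Paralogs: genes that share an orthogroup with `gene` and belong to the same
# species, excluding the gene itself.
def paralogs_of(gene, orthogroups)
  orthogroups.select { |group| group.genes.include?(gene) }
             .flat_map(&:genes)
             .select { |other| other.species == gene.species && other != gene }
             .uniq
end

# Made-up sample data: one human/mouse orthogroup and one human/fly orthogroup.
human_a = Gene.new("hA", "human")
human_b = Gene.new("hB", "human")
mouse_a = Gene.new("mA", "mouse")
fly_a   = Gene.new("fA", "fly")
groups  = [Orthogroup.new([human_a, human_b, mouse_a]),
           Orthogroup.new([human_a, fly_a])]

p orthologs_of(human_a, groups).map(&:name)  # => ["mA", "fA"]
p paralogs_of(human_a, groups).map(&:name)   # => ["hB"]
```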

I've based most of this code on the p.369 explanation "Using Models as Join Tables" in Agile Web Development with Rails (3rd Ed). I'm not really sure why it fails, except for the fact that I'm trying to use Gene as its own source for defining orthologs. What's the proper way to define this relationship?

Apologies for the complexity of the model. I couldn't think of any layman's examples.

Thanks so much, John

The :orthologs association is the problem - Rails doesn't support nesting :through associations. I recall there being a plugin around someplace to do it, so you may want to look into that.

Depending on what you need, a simple instance method may work as well. For example (on Gene):

def orthologs
  genes_orthogroups.ortholog_groups_for(self).map { |g| g.gene }
end

On GenesOrthogroup:

named_scope :ortholog_groups_for, lambda { |g|
  { :include => :gene, :conditions => ['genes.species_id != ?', g.species_id] }
}

(not tested, but should be close to working)

A couple general things:

- model names should be singular (GenesOrthogroup rather than GenesOrthogroups). Otherwise you'll eventually run into issues.

- when writing SQL fragments in conditions, table names are plural (so genes.whatever rather than gene.whatever).

- the association macros have sensible defaults, so you can leave some options out. For instance, the :species association in Gene can be simplified to 'belongs_to :species' - Rails will find the correct FK (species_id) and class (Species).
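
As a rough illustration of the naming convention behind that last point, the toy helpers below derive the foreign key and class name that belongs_to would assume from an association name. This is not Rails's actual code, the helper names are hypothetical, and it only covers the belongs_to case.

```ruby
# Toy illustration, not Rails's implementation: the defaults belongs_to
# assumes when :foreign_key and :class_name are left out.
def default_foreign_key(association_name)
  "#{association_name}_id"
end

def default_class_name(association_name)
  association_name.to_s.split('_').map(&:capitalize).join
end

puts default_foreign_key(:species)       # prints species_id
puts default_class_name(:species)        # prints Species
puts default_class_name(:species_pair)   # prints SpeciesPair
```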

--Matt Jones

The plural model name comes from the Agile Development book. They have an example--categories_products, linking categories to products--which they turn into a model. Should it be renamed category_product? Or is categories_product correct? I'm a bit confused about this, so thanks for bringing it up.

I'll look around for the plugin. Thanks so much, Matt!

John

I'll look at the rest of your example later, but...

John Woods wrote:

The plural model name comes from the Agile Development book. They have an example--categories_products, linking categories to products--which they turn into a model. Should it be renamed category_product?

[...]

If I had to choose one of those options, I'd probably use CategoryProduct and rename the table. However, I think neither option is very good. What most people here seem to recommend is this: when your join table becomes a model, take the time to come up with an appropriate name for it.

For example, in this case, we might go from Category habtm Products (with categories_products table) to

class Category < ActiveRecord::Base
  has_many :categorizations
  has_many :products, :through => :categorizations
end

(likewise for Product)

class Categorization < ActiveRecord::Base
  belongs_to :category
  belongs_to :product
end

Choosing an appropriate name for the join model will make future development much easier and clearer.

Best,

So the plugin appears to work. FYI, it's here: git://github.com/ianwhite/nested_has_many_through.git

Unfortunately, I'm still having trouble setting up the condition. I've renamed a few things and added appropriate associations:

class Gene < ActiveRecord::Base
  belongs_to :species
  has_many :gene_orthogroup_linkers, :class_name => 'GeneOrthogroupLinker'
  has_many :orthogroups, :through => :gene_orthogroup_linkers
  has_many :ortholog_orthogroup_linkers, :class_name => 'GeneOrthogroupLinker',
           :through => :orthogroups, :source => :gene_orthogroup_linkers
  has_many :orthologs, :source => :gene, :through => :ortholog_orthogroup_linkers,
           :conditions => 'genes.species_id != genes_2.species_id'
end

GeneOrthogroupLinker is the class that used to be GenesOrthogroups.

Mysql::Error: Unknown column 'genes_2.species_id' in 'where clause':
SELECT `genes`.* FROM `genes`
INNER JOIN gene_orthogroup_linkers ON ( genes.id = gene_orthogroup_linkers.gene_id )
INNER JOIN orthogroups ON ( gene_orthogroup_linkers.orthogroup_id = orthogroups.id )
INNER JOIN gene_orthogroup_linkers gene_orthogroup_linkers_2 ON ( orthogroups.id = gene_orthogroup_linkers_2.orthogroup_id )
WHERE (gene_orthogroup_linkers_2.gene_id = 556 AND genes.species_id != genes_2.species_id)

So, it looks to me like the problem is that it selects from genes but has no inner join to genes_2, only to gene_orthogroup_linkers_2. How do I get it to INNER JOIN genes genes_2 ON ( gene_orthogroup_linkers_2.gene_id = genes_2.id AND genes.species_id != genes_2.species_id )?

It seems like as a work-around, I could add a species_id column on the GeneOrthogroupLinker model, but that doesn't seem like the cleanest solution.

Incidentally, it seems unnecessary to join with orthogroups (why not just join the two linkers with orthogroup_id?). Is this easily fixable, or better left as-is?

I also tried the named_scope, but it doesn't seem right for this situation.

Best, John

Have you tried passing the missing join directly, via the :joins option?
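
One way to supply the join by hand, sketched here as raw SQL run through find_by_sql rather than as an association option. The table and column names are taken from the error message above. The aliases (orthologs, own, ortholog_links, own_links) and the helper name orthologs_via_sql are hypothetical. It also joins the two linker rows directly on orthogroup_id, without going through orthogroups. Untested against a real schema.

```ruby
# Sketch only: self-join genes through the linker table so both the starting
# gene ("own") and the candidate orthologs are available in the WHERE clause.
ORTHOLOG_SQL = <<-SQL
  SELECT DISTINCT orthologs.*
  FROM genes orthologs
  INNER JOIN gene_orthogroup_linkers ortholog_links
    ON ortholog_links.gene_id = orthologs.id
  INNER JOIN gene_orthogroup_linkers own_links
    ON own_links.orthogroup_id = ortholog_links.orthogroup_id
  INNER JOIN genes own
    ON own.id = own_links.gene_id
  WHERE own.id = ?
    AND orthologs.species_id != own.species_id
SQL

# gene_class is expected to be the Gene model. find_by_sql accepts
# [sql, bind_value, ...] and substitutes the "?" placeholders.
def orthologs_via_sql(gene_class, gene_id)
  gene_class.find_by_sql([ORTHOLOG_SQL, gene_id])
end
```

In a Rails console this would be called as orthologs_via_sql(Gene, 556). It returns a plain array of Gene records rather than an association, so it cannot be chained with further scopes.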

--Matt Jones