nested includes in AR and to_json / to_xml

Hi,

I updated my project to Rails 3 (beta 4) and got a problem with to_json and to_xml.

I have the model structure: Projects has_many Tasks habtm Users

After the line @projects = Project.includes(:tasks => :users).where('tasks.id is not null') mysql returns:

Pr1 | task1 | Peter | Pr2 | task2 | Peter | Pr2 | task2 | David | Pr2 | task2 | Bernd | TestProject | task3 | Peter | Pr2 | task4 | David |

When I try to get the xml version with: @projects.to_xml(:include => {:tasks => {:include => :users}})

I get three times the same task for the project Pr2:

  <project>     <name>Pr2</name>     <tasks>       <task>         <name>task2</name>           <users>             <user> <name>Peter</name> </user>             <user> <name>David</name> </user>             <user> <name>Bernd</name> </user>           </users>        </task>       <task>         <name>task2</name>           <users>             <user> <name>Peter</name> </user>             <user> <name>David</name> </user>             <user> <name>Bernd</name> </user>           </users>        </task>       <task>         <name>task2</name>           <users>             <user> <name>Peter</name> </user>             <user> <name>David</name> </user>             <user> <name>Bernd</name> </user>           </users>        </task>     </tasks>   </project>

It should return one task with 3 users only, this is the same for to_json. This code works in Rails 2.3.8 (to_xml and to_json returns one task with 3 users), after updating to Rails 3 beta 4 the to_xml/ to_json methods doesn't work as expected.

So how can I get the xml/json result from an ActiveRecord::Relation set (with nested includes)?

Thanks for any help! David

Hi David,

This is what I found with rails edge.

puts Project.find(2).to_xml(:include => {:tasks => {:include => :users}})

<project>   <name>pr2</name>   <created-at type="datetime">2010-06-25T03:50:57Z</created-at>   <updated-at type="datetime">2010-06-25T03:50:57Z</updated-at>   <id type="integer">2</id>   <tasks type="array">     <task>       <name>task2</name>       <project-id type="integer">2</project-id>       <id type="integer">2</id>       <users type="array">         <user>           <name>Peter</name>           <task-id type="NilClass">2</task-id>           <id type="integer">1</id>           <user-id type="NilClass">1</user-id>         </user>         <user>           <name>David</name>           <task-id type="NilClass">2</task-id>           <id type="integer">2</id>           <user-id type="NilClass">2</user-id>         </user>         <user>           <name>Bernd</name>           <task-id type="NilClass">2</task-id>           <id type="integer">3</id>           <user-id type="NilClass">3</user-id>         </user>       </users>     </task>   </tasks> </project>

ActiveRecord::Schema.define(:version => 20100625013546) do

  create_table "projects", :force => true do |t|     t.string "name"     t.datetime "created_at"     t.datetime "updated_at"   end

  create_table "tasks", :force => true do |t|     t.string "name"     t.integer "project_id"   end

  create_table "tasks_users", :id => false, :force => true do |t|     t.integer "task_id", :null => false     t.integer "user_id", :null => false   end

  create_table "users", :force => true do |t|     t.string "name"   end

end class Project < ActiveRecord::Base   has_many :tasks end class Task < ActiveRecord::Base   belongs_to :project   has_and_belongs_to_many :users end class User < ActiveRecord::Base   has_and_belongs_to_many :tasks end

# seeds.rb task1 = Task.create!(:name => 'task1') task2 = Task.create!(:name => 'task2') task3 = Task.create!(:name => 'task3')

pr1 = Project.create!(:name => 'pr1') pr2 = Project.create!(:name => 'pr2')

task2.project_id = pr2.id task2.save!

task2.users.create(:name => 'Peter') task2.users.create(:name => 'David') task2.users.create(:name => 'Bernd')

Hi Neeraj,

could you try

puts Project.includes(:tasks => :users).where('tasks.id is not null').to_xml(:include => {:tasks => {:include => :users}})

this will return 3 times the same task (task2).

Without the where('tasks.id is not null'), it returns the correct values. The differnce is in the sql query:

For Project.includes(:tasks => :users) the sql queries are:

SELECT `projects`.* FROM `projects` SELECT `tasks`.* FROM `tasks` WHERE (`tasks`.project_id IN (1,2)) SELECT `task_users`.* FROM `task_users` WHERE (`task_users`.task_id = 2) SELECT `users`.* FROM `users` WHERE (`users`.`id` IN (1,2,3))

Project.includes(:tasks => :users).where('tasks.id is not null') creates a single sql query:

SELECT `projects`.`id` AS t0_r0, `projects`.`name` AS t0_r1, `projects`.`created_at` AS t0_r2, `projects`.`updated_at` AS t0_r3, `tasks`.`id` AS t1_r0, `tasks`.`name` AS t1_r1, `tasks`.`project_id` AS t1_r2, `tasks`.`created_at` AS t1_r3, `tasks`.`updated_at` AS t1_r4, `users`.`id` AS t2_r0, `users`.`name` AS t2_r1, `users`.`created_at` AS t2_r2, `users`.`updated_at` AS t2_r3 FROM `projects` LEFT OUTER JOIN `tasks` ON `tasks`.`project_id` = `projects`.`id` LEFT OUTER JOIN `task_users` ON `tasks`.`id` = `task_users`.`task_id` LEFT OUTER JOIN `users` ON `users`.`id` = `task_users`.`user_id` WHERE (tasks.id is not null)

I think the problem is, that the single sql query returns 3 rows for task2 (as described in my first post), since Rails edge the to_json/ to_xml return these 3 rows (and each row includes 3 users).

I am able to reproduce this problem.

Can you open a ticket so that it's documented. We will have rest of discussion on that ticket.

Notice that first one works fine but the second one fails.

# works fine puts Project.includes(:tasks).where('tasks.id is not null').to_xml(:include => {:tasks => {:include => :users}})

# causes duplicates puts Project.includes(:tasks => :users).where('tasks.id is not null').to_xml(:include => {:tasks => {:include => :users}})

Ticket: https://rails.lighthouseapp.com/projects/8994-ruby-on-rails/tickets/4971-nested-includes-in-ar-causes-duplicates-in-to_json-to_xml