Rails foreign keys and associations: when to denormalize?

I’ve been wrestling with database normalization and the “Rails Way” and I wanted to get peoples opinions as to if there’s a best practices guide to how to do it. Here’s some background:

Let’s assume we have a model/schema similar to the following:

class Student < ActiveRecord::Base

has_many :student_exams

has_many :exams, through: :student exams

end

class StudentExam < ActiveRecord::Base

belongs_to :student

belongs_to :exam

end

class Exam < ActiveRecord::Base

has_many :questions

belongs_to :professor

has_one :college, through: :professor

end

class Question < ActiveRecord::Base

belongs_to :exam

has_many :responses

question_text: text

end

class Response < ActiveRecord::Base

belongs_to :question

belongs_to :student

response_text: text

end

class Evaluation

belongs_to :response

belongs_to :professor

grade: integer

end

The idea is that a Professor creates an Exam andQuestions, and then assigns students through StudentExam to take that exam. The students then generate Responses to the exam questions, and then those Responses are graded by Evaluations done by the Professor.

Now, given the above background, I wanted to ask a few questions to get peoples opinions on how to structure this app:

Questions

  1. In general, are there any changes you guys would make with the given associations?
  2. For the Response model, does it make sense to do “belongs_to :student”, or should it instead be “belongs_to :student_exam”? It feels more logical to have the former, because we can get the exam through the question if need be (response.question.exam), and its really a student responding to it, not a student_exam. But if we used student_exam, we could use response.student_exam.exam to get back to the exam, and response.student_exam.student to get back to the student.
  3. If we were on an Evaluation and we wanted to back to a exam, it would seem quite cumbersome to do evaluation.response.question.exam just to get back to the exam.
  • Now we could always create a method on the model that just shortened it for us, but is there a point when you are chaining associations that somewhere along the line you add in another foreign_key to a table for easy lookup?
  • If we wanted to see all Evaluations for an Exam, what would be the best way to do that?
  • Only way I could see to do it easily would be to break it up, something like:
  • @questions = @exam.questions
  • @responses = Response.where(question_id: @questions.pluck(:id)
  • @evaluations = Evaluation.where(response_id: @responses.pluck(:id)
  • This seems kind of painful

Let me know your thoughts – or if you have any great books on the best strategies for normalizing/denormalizing within or without the scope of a rails app, I’d love to hear them.

Hey Liz,

Thanks for responding. Here’s some initial responses:

  1. While I understand the need in an actual application for things like a Classroom model (owned by a Professor), and then something like a StudentClassroom model (setting up Students to a Classroom) to better help organize the data, what I’m looking for really is how associations fit together when deeply nested outside of the context of the actual application. I set up this example simply as a way of exploring some of the problems it poses.
  • Ditto for a School model – we can assume that a School has_many :professors, etc.
  • The idea is that as we get deeper into our models, it becomes harder and harder to get back to the beginning.
  • In my example, From an Evaluation, we would need 5 hops to get back a School model for example (imagine evaluation.response.question.exam.professor.school)
  • Similarly, from the reverse side, it would be hard to query for all Evaluations for a School:
  • @school = School.first
  • @all_professors = @school.professors
  • @all_exams = Exam.where(professor_id: @all_professors.pluck(:id))
  • @all_exam_questions = Question.where(exam_id: @all_exams.pluck(:id))
  • @all_exam_responses = Response.where(question_id: @all_exam_questions.pluck(:id))
  • @all_exam_evaluations = Evaluation.where(response_id: @all_exam_responses.pluck(:id))
  • While the above works, it took 6 lines to get there! Not to mention the expense of doing a crazy amount of lookups
  • So instead, what’s the solution?
  • We could write an “all_evaluations” method in what would be our School model using the above code
  • Solves the multiple lines problem, but not the query complexity problem
  • Perhaps using delegate? Little more elegant than perhaps a method, but again the complexity problem
  • Adding an extra foreign key to a model somewhere and denormalizing the database a bit
  • Eg for Evaluation, maybe add exam_id to the class
  • To get back to a School now, it’s 2 less hops (evaluation.exam.professor.school)
  • To get all Evaluations for a School, we can now do the following, which is two less lines of code (and less joins):
  • @school = School.first
  • @all_professors = @school.professors
  • @all_exams = Exam.where(professor_id: @all_professors.pluck(:id))
  • @all_exam_evaluations = Evaluation.where(exam_id: @all_exams.pluck(:id))
  • This feels arbitrary though
  • if we are going to do this, why not just add a school_id to an evaluation?
  • Or a bunch of other foreign keys to all the tables to make all hops 1-2 at most In the end, I’m mostly trying to understand which strategy to use under which scenarios – I see a lot of blog posts talking about the law of demeter and simplifying your associations (usually using delegate), but most of them have pretty simple connections like business.location.city_name, nothing as complex as chaining through 4-5 associations to get back to the original object. It also feels dangerous to chain that much – if at any point in the evaluation.response.question.exam.professor.school the associated object ends up being deleted somehow, then you break your ability to get back to the original object.

Hope that helps explain my question more.

I've been wrestling with database normalization and the "Rails Way" and I wanted to get peoples opinions as to if there's a best practices guide to how to do it.

IMHO best practice is to keep it properly normalized until you've identified some need to denormalize. Usually this is due to performance... and identified through benchmarking with a profiler, as any human's guess as to why an app is slow is most likely wrong. If you do have a performance problem, and you think denormalization would help, consider other approaches first, or possibly whether your app doesn't need it to be relational in the first place and might benefit from a non-relational approach. (That's not *quite* the same thing as "NoSQL".)

class StudentExam < ActiveRecord::Base

This name tells us nothing. I'd recommend something like "ExamTaking". (Might also have this include the date or some such differentiator, in case the student takes the same exam multiple times.)

@questions = @exam.questions @responses = Response.where(question_id: @questions.pluck(:id) @evaluations = Evaluation.where(response_id: @responses.pluck(:id)

This seems kind of painful

Right. I think you need to look at the "includes" method.