I’ve been wrestling with database normalization and the “Rails Way” and I wanted to get peoples opinions as to if there’s a best practices guide to how to do it. Here’s some background:
Let’s assume we have a model/schema similar to the following:
class Student < ActiveRecord::Base
has_many :student_exams
has_many :exams, through: :student exams
end
class StudentExam < ActiveRecord::Base
belongs_to :student
belongs_to :exam
end
class Exam < ActiveRecord::Base
has_many :questions
belongs_to :professor
has_one :college, through: :professor
end
class Question < ActiveRecord::Base
belongs_to :exam
has_many :responses
question_text: text
end
class Response < ActiveRecord::Base
belongs_to :question
belongs_to :student
response_text: text
end
class Evaluation
belongs_to :response
belongs_to :professor
grade: integer
end
The idea is that a Professor creates an Exam andQuestions, and then assigns students through StudentExam to take that exam. The students then generate Responses to the exam questions, and then those Responses are graded by Evaluations done by the Professor.
Now, given the above background, I wanted to ask a few questions to get peoples opinions on how to structure this app:
Questions
- In general, are there any changes you guys would make with the given associations?
- For the Response model, does it make sense to do “belongs_to :student”, or should it instead be “belongs_to :student_exam”? It feels more logical to have the former, because we can get the exam through the question if need be (response.question.exam), and its really a student responding to it, not a student_exam. But if we used student_exam, we could use response.student_exam.exam to get back to the exam, and response.student_exam.student to get back to the student.
- If we were on an Evaluation and we wanted to back to a exam, it would seem quite cumbersome to do evaluation.response.question.exam just to get back to the exam.
- Now we could always create a method on the model that just shortened it for us, but is there a point when you are chaining associations that somewhere along the line you add in another foreign_key to a table for easy lookup?
- If we wanted to see all Evaluations for an Exam, what would be the best way to do that?
- Only way I could see to do it easily would be to break it up, something like:
-
@questions = @exam.questions
-
@responses = Response.where(question_id: @questions.pluck(:id)
-
@evaluations = Evaluation.where(response_id: @responses.pluck(:id)
- This seems kind of painful
Let me know your thoughts – or if you have any great books on the best strategies for normalizing/denormalizing within or without the scope of a rails app, I’d love to hear them.
Hey Liz,
Thanks for responding. Here’s some initial responses:
- While I understand the need in an actual application for things like a Classroom model (owned by a Professor), and then something like a StudentClassroom model (setting up Students to a Classroom) to better help organize the data, what I’m looking for really is how associations fit together when deeply nested outside of the context of the actual application. I set up this example simply as a way of exploring some of the problems it poses.
- Ditto for a School model – we can assume that a School has_many :professors, etc.
- The idea is that as we get deeper into our models, it becomes harder and harder to get back to the beginning.
- In my example, From an Evaluation, we would need 5 hops to get back a School model for example (imagine evaluation.response.question.exam.professor.school)
- Similarly, from the reverse side, it would be hard to query for all Evaluations for a School:
- @school = School.first
- @all_professors = @school.professors
- @all_exams = Exam.where(professor_id: @all_professors.pluck(:id))
- @all_exam_questions = Question.where(exam_id: @all_exams.pluck(:id))
- @all_exam_responses = Response.where(question_id: @all_exam_questions.pluck(:id))
- @all_exam_evaluations = Evaluation.where(response_id: @all_exam_responses.pluck(:id))
- While the above works, it took 6 lines to get there! Not to mention the expense of doing a crazy amount of lookups
- So instead, what’s the solution?
- We could write an “all_evaluations” method in what would be our School model using the above code
- Solves the multiple lines problem, but not the query complexity problem
- Perhaps using delegate? Little more elegant than perhaps a method, but again the complexity problem
- Adding an extra foreign key to a model somewhere and denormalizing the database a bit
- Eg for Evaluation, maybe add exam_id to the class
- To get back to a School now, it’s 2 less hops (evaluation.exam.professor.school)
- To get all Evaluations for a School, we can now do the following, which is two less lines of code (and less joins):
- @school = School.first
- @all_professors = @school.professors
- @all_exams = Exam.where(professor_id: @all_professors.pluck(:id))
- @all_exam_evaluations = Evaluation.where(exam_id: @all_exams.pluck(:id))
- This feels arbitrary though
- if we are going to do this, why not just add a school_id to an evaluation?
- Or a bunch of other foreign keys to all the tables to make all hops 1-2 at most
In the end, I’m mostly trying to understand which strategy to use under which scenarios – I see a lot of blog posts talking about the law of demeter and simplifying your associations (usually using delegate), but most of them have pretty simple connections like business.location.city_name, nothing as complex as chaining through 4-5 associations to get back to the original object. It also feels dangerous to chain that much – if at any point in the evaluation.response.question.exam.professor.school the associated object ends up being deleted somehow, then you break your ability to get back to the original object.
Hope that helps explain my question more.
I've been wrestling with database normalization and the "Rails Way" and I
wanted to get peoples opinions as to if there's a best practices guide to
how to do it.
IMHO best practice is to keep it properly normalized until you've
identified some need to denormalize. Usually this is due to
performance... and identified through benchmarking with a profiler, as
any human's guess as to why an app is slow is most likely wrong. If
you do have a performance problem, and you think denormalization would
help, consider other approaches first, or possibly whether your app
doesn't need it to be relational in the first place and might benefit
from a non-relational approach. (That's not *quite* the same thing as
"NoSQL".)
class StudentExam < ActiveRecord::Base
This name tells us nothing. I'd recommend something like
"ExamTaking". (Might also have this include the date or some such
differentiator, in case the student takes the same exam multiple
times.)
@questions = @exam.questions
@responses = Response.where(question_id: @questions.pluck(:id)
@evaluations = Evaluation.where(response_id: @responses.pluck(:id)
This seems kind of painful
Right. I think you need to look at the "includes" method.