Will Merrell wrote:
Marnen Laibow-Koser wrote:
and shallow is the proper solution.
Nope! The table is already well normalized, I think. If you disagree,
please tell me what normalization condition is being violated.
If the OP's problem really is lots of truly independent attributes,
selected from a finite list, with a limited and known number of repeated
attributes, then yes, wide and flat, (the OP's option A) is probably an
optimal solution.
That is what the OP said his problem is.
But, that is not what I read in his post. How many pools should he be
able to capture for a property? Can a property have six doormen? What
about the property with the helipad?
What about it? Those fields could be integers.
The problem with wide and flat is
that it requires a fixed problem domain.
The OP seems to think he has one.
Cleaver as I am, I can not
fully anticipate what crazy examples the real world of the users is
going to through at my system.
From what the OP said, I don't see where you get the requirement of the
users specifying custom attributes. You are trying to solve a different
problem.
To be sure, he said it was his "ingoing assumption", but I read that as
meaning he's guessing.
As I read the OP's description I saw that the list of features he needs
to capture is not fixed and that some features could occur multiple
times.
Nowhere does he say anything about features occurring multiple times.
To me that demands that features be objects in their own right
Relational databases don't have objects.
and that the database design should capture the relationship between
properties (et al) and features. There are at least two tables here,
properties and features. The exact structure (and number) of these
tables is a second question.
Yes -- if a feature could occur multiple times. But that is not the
OP's use case.
Oh, it gets used, all right -- by people who either are dealing with
Thank you for the link, I see what you're driving at. I mostly agree.
Certainly in the case discussed in the thread you linked to the
key/value pattern is a bad idea, and should be avoided.
There were really two intents in my original post. The first, and most
important, is that there are other ways to solve the OP's problem beyond
the two that he presented. His first seemed to me too simplistic and
headed for trouble (as discussed above)
Then it can be refactored. Always start with the simplest approach.
and the second seemed too
complex and also headed for trouble.
Actually, option B is better normalized than your idea. I think it's
preferable.
I simply wanted to suggest that
there may be a third choice. Whether my suggestion is that better choice
is a separate issue.
I don't think it is.
The second point i wanted to make is that the recording of a feature
need not be very complex. The presence or absence of a record may be
sufficient.
Or, for that matter, an integer in a field!
What the OP needs to record about a given feature probably
needs a good deal more thought and discussion, but the existence of the
feature can be very simple, and just naming it may be enough.
Right.
I really do agree that recording a feature as a collection of key/value
pairs is almost certainly a bad idea, and I did not intend to suggest
such a structure. If you heard that in what I posted then I did
misspeak.
Not a collection. Your proposal, though, was something very close to
the key/value schema.
I think what we both mean to say is that good database design requires a
good understanding the problem space. Duct tape and bailing wire works
for MacGuyver, but not for the rest of us.
-- Will
Best,