Models without primary key; n-ary associations

Michael_Schuerig1 · April 29, 2008, 7:27pm

Today for the second time in a short while I've come upon a case where I'd like to have a "model" without a primary key. Instances of such a model wouldn't have an identity above the unique combination of (part of) their attribute values.

The cases I've come up with involve expressing facts about two or more related objects. They can be thought of as n-ary associations, in a sense, generalizations of habtm.

Here's an example: There are multiple compilers, each of which potentially runs on many host systems and can compile code for many target systems (depending on host system!).

I'd like to express facts and queries like these

- compiler c running on host OS h can compile for target OS t - find all compilers that run on host OS h and compile for target OS t - find all compilers that match some other condition and run on host OS h

Much of this can be kludged with :through associations, but it doesn't feel right to me as these involve full-fledged association models with an identity independent of their attribute values.

I'm very much interested in suggestions.

Michael

11175 · April 29, 2008, 8:31pm

Michael Schuerig wrote:

I'm very much interested in suggestions.

Michael

--

With no primary key, it's not really an ActiveRecord::Base as all the associations will simply break anyways,

It is suggested to use the habtm if the link table only contains the 2 foreign keys, otherwise, you should make it a model with the primary key that rails expects.

If you just want to manage the association between N to N and this association doesn't go in the DB then you should just write the code yourself as all the code writers that come with rails won't help you one bit

ilan

11175 · April 29, 2008, 8:52pm

Michael Schuerig wrote:

Today for the second time in a short while I've come upon a case where I'd like to have a "model" without a primary key. Instances of such a model wouldn't have an identity above the unique combination of (part of) their attribute values.

Isn't that what all records/models are? It's not the id that's important, but all the attribute goodness present in the fields... 99.99999% of the time (accounting systems excepted), I don't care whether an id value is 1, 42, or a bazillion except as a fast way to get at all the rest of the data for that record.

I just think of those "n-way primary attribute" cases as instances where the id field is just a short-hand method of specifying the unique combination of attributes however many there might be.

Michael_Schuerig1 · April 29, 2008, 9:49pm

Yes, I know that this is how Rails works currently, but things being flexible, a plugin could add new functionality, provided that it is useful.

Michael

Robert_Walker · June 27, 2008, 4:54pm

I'm having trouble seeing your point. Databases consist of tables that contain rows, which in turn contain columns. Why do you even care if each row contains a simple incrementing primary key. In fact I've used databases (OpenBase and Oracle for example) that you cannot create a database table that does not have some form of row id. In OpenBase that would be _rowid. In fact every OpenBase table has three default (and required) columns (_rowid, _timestamp and _version).

The only valid reason I see for supporting tables that have no primary key is to support legacy databases, where the designer did not provide one. And for whatever reason you are not able, or allowed, to add one.

By the way Rails is not the only ORM with this type of limitation. I previously worked with the ORM system in WebObjects, which has a similar requirement for some form of primary key. ORM systems must have a way to identify rows uniquely in a database. That is how they map objects to table rows. How would you map a single object to two or more rows that happen to contain the exact same data. That's just bad design to begin with.

As I said I've run into situation in WebObjects where I had legacy tables with no PK. The only option I had there was to mark every column as part of the PK. That's just silly, but in an ORM system that's really your only option. I found that when the table contained duplicate rows WebObjects ORM would give multiple objects with duplicate information in their attributes, however, they were both read from the same physical row. So if I were to make changes to both objects and save, only one of the rows would receive the updates. Basically the ORM would "randomly" choose just one of the duplicate rows and give you multiple objects, but they all referenced the same physical row in the database.

This is a difficult problem to solve in ORM systems. SQL and object mapping is not exactly compatible. The requirement for a primary key solves one of the incompatibilities. Namely the issue of identity. All objects in OOP have an identity, but that is not necessary the case for SQL rows.

Michael_Schuerig1 · June 27, 2008, 8:22pm

Forget about no PK. I meant no synthetic key, but of course the table has a primary key, however, a composite one. I wouldn't mind a synthetic key that is managed and used by the DBMS only, either. The problem is that ActiveRecord holds onto that numeric ID, even though the semantic identity of the row/object depends on the other attributes.

Michael

LemmingJoel · June 28, 2008, 1:07pm

My understanding of your problem:

The attributes should be able to act as a composite PK. Their combination will always be unique, and yet you've got this synthetic key which would seem redundant.

But that's really just aesthetics. It'll still work with a synthetic PK. Except..

One implementation problem that I can see this: using a synthetic key, it would be possible to have duplicate rows in the database, differing only by the synthetic PK. In which case you'd have to have some model validations to check that the new record is unique - which would be quite messy.

Are there any other implementation problems? Or is it just aesthetics issues? (what I mean by aesthetics is it makes you feel gooie inside when your code reflects your model cleanly)

I think to do it you'd have to override your model's "id" method, and have it return an integer which is derived from the other attributes and guaranteed to be unique. Like a hash of all the attributes. But it would be very hard to guarantee that the hash could not reproduced by another combination of attributes.

Hope this makes sense. I'm tired. I'll probably read over this post later and be shamed by the nonsense I've just polluted the thread with.

Joel.

Michael_Schuerig1 · June 29, 2008, 8:23am

That's exactly the problem. There is even a composite PK plugin/gem (by Dr. Nic), but I shy away from it, as it hacks a bit too deeply into the innards of ActiveRecord to be entirely wholesome.

Why would such a problem where id-identity and attribute-identity diverge ever occur? In my case I have a ternary relationship that can be edited in a view where I can't easily keep track of which/whether ids already belong to a relationship instance. When assigning the changed association, I'd like to do this as

associated_objects = build_assoc(association_params) my_model.the_assoc.replace(associated_objects)

Association#replace out of the box takes care of only actually removing and adding changes, therefore foreign key constraints are not touched for unchanged association instances. It determines changes based on the object id, which does not work in my case. To circumvent this, I've overwritten #== in the association class. This works, but I'm not 100% sure that it won't break anything in AR's inner workings in some arcane case.

Michael

Robert_Walker · June 30, 2008, 5:21pm

There is a dead simple fix to the problem. Add a unique index across your composite key and let Rails have it's synthetic key. Indexes don't have to be PKs to be unique.

Michael_Schuerig1 · July 1, 2008, 9:02am

I already have such an index in place. As I explained, my problem *was* that ActiveRecord identifies records by their synthetic key. By that measure, two records with the same attributes, i.e. the same natural key, are not considered the same when their synthetic keys diverge. This can cause trouble if you're updating (#replace) which objects are associated through a particular association. The solution, in my case, was to override #== in the model class of the associated objects.

Michael

Pardee_Roy · July 1, 2008, 5:59pm

But if you've got a unique key constraint on the natural key, then there can't be > 1 record in your table w/the same values of your natural key columns. In which case the synthetic key is identity enough, isn't it?

bphogan · July 1, 2008, 6:07pm

Just write your own association methods. That’s how I get around these things. Leave the id column around and just ignore it, and don’t associate it :

class Foo < ActiveRecord::Base def bars @bars ||= Bar.find_all_by_some_other_thing(“foo”) end end

Michael_Schuerig1 · July 1, 2008, 7:32pm

What already *is* in the table is not the problem at all. The problem occurs when I try to *update* the table via an association. In that case, when a new record, i.e. one without an id, has the same attribute values as an already existing record, the existing record should not be touched.

This is relevant when editing all associated in a single form. Consider my original example again.

- compiler c running on host OS h can compile for target OS t - find all compilers that run on host OS h and compile for target OS t - find all compilers that match some other condition and run on host OS h

I have a single form where for a given compiler all host-target mappings can be edited. Compiler has_many HostTargetMappings; when this association is updated, it is not possible to simply delete all associated objects and insert new ones. This approach is bound to vioate foreign key constraints imposed by other objects associated to the relevant HostTargetMappings.

AR in principle does the right thing, on host_target_mappings#replace it inserts new objects, deletes the ones no longer pertinent, and should leave the others untouched. However, this latter part depends on the identity of an object being tied to its synthetic id, because #replace does the comparison using #==, which in turn compares AR objects based on their id.

Now, imagine if you will, that the user creates a mapping duplicating the attributes of an already present one. Such a duplicate ought simply to be ignored, but when comparison is based on id, this object is unlike any existing one, therefore AR tries to insert it, but fails due to the foreign key violation.

As I've explained at least three times already, the way around this is to override HostTargetMapping#== so that it is based on attribute values not id. This works perfectly in my case, but I can't guarantee that it won't blow up because some part of AR might depend on #== depending on the id.

Michael

Topic		Replies	Views
model with no primary key rubyonrails-talk	3	196	March 18, 2011
primary key? rubyonrails-talk	3	108	March 31, 2008
belongs_to associations on non-primary keys rubyonrails-talk	1	121	October 4, 2007
association without id in table rubyonrails-talk	1	128	July 24, 2008
Virtual model rubyonrails-talk	6	1514	October 28, 2010

Models without primary key; n-ary associations

Related topics

More Resources