xml-containing models

Jaap_Haagmans · August 8, 2009, 12:49pm

Because the definitions of certain objects in our models are growing out of control, we're looking into the possibility of simply defining some of our objects in XML, rather than using many models and having to use complicated join queries that make our application quite heavy.

Basically, we're defining machines that have many, many attributes. Not all machines use the same attribute definitions and some of them are very different from other machines, but we want to define them all in one "root" model called "machine". Our application builds "bigger machines" using these smaller machines, so we want the application to know specifically what it can and cannot do. In the basic MVC structure, this would mean we would have to have many models and they all would be linked one way or another. Putting them together gets you all you need to know about the machine.

What we would like to do however is define our machines like this: <Machine> <colour>Red</colour> <width>200</width> <height> ... ... <touchpad> <line id="1"> <button id="1">Power switch</button> <button id="2">Reset switch</button> </line> </touchpad> <bcu type="built-in">1029</bcu> </Machine>

This is, of course, a simplified example.

We would like to have a model with simply an id (which is the machine's id within the application) and a field to put this xml into. So far so good, I of course know how to do this.

However, this would, as far as I can tell, limit the great possibilities that Rails models have to offer. I would not be able to simply say:

@machine = Machine.find(params[:id]) num_buttons = @machine.buttons.count ...

I would have to do something with @machine.xml instead and I'm not very sure what.

So, my question is this: how can I put all that XML-data into a MySQL- table and get information from it the smart way?

Jaap_Haagmans · August 9, 2009, 11:06am

I'm pretty much stuck here. Is there anyone who has an idea? I can't seem to find anything about this.

Simon_Macneall · August 9, 2009, 11:24am

Hi,

I'd be inclined to do something with an index type table, that mapped
important bits of the XML into the DB. Maybe with machine_id,
attribute_type, attribute_name, attribute_value as the columns. It can be
a bit more work to keep it up, but you get the full flexibility of XML,
but can access the important attributes using the DB.

Then to get the button count as per your example you'd go

MachineIndex.count(:all, :conditions => {:machine_id => :id,
:attribute_type => 'button'}) or something similar.

Cheers Simon

Jaap_Haagmans · August 9, 2009, 2:26pm

Thank you for your response, Simon!

It's an option, but I prefer an option in which I could actually use one, readable definition of a machine in one single MySQL field.

I know I will have to map all those attributes some way or another, but I can think of situations in which I do not only have <type id=1>value</type>, but many more options. I'm just not sure how I can map the xml-content correctly. It has to be pretty solid and searchable. I want to be able to, for example, look for red machines that have six buttons (simplified example, of course).

I know your way will work fine, but I think it would be beautiful to do it this way, simply because the database will be very readable on itself. I know that when searching for a machine with certain attributes, performance will probably be alot higher doing it your way, but I'm fine with that.

Matt_Jones · August 9, 2009, 3:38pm

Don't. Just don't. Your database is not an XML parser; see this article:

Note that the WTF database described there is almost exactly what you've described.

On a more serious note, I'd recommend that you take a look at some of the schemaless databases, like CouchDB, for your situation. They'll give you the flexibility to define your machines, with more ability to query parts of the data.

--Matt Jones

Jaap_Haagmans · August 9, 2009, 5:11pm

But to the OP, I have another question: what exactly is getting out of control about your schema? There's nothing wrong with having lots of tables (and lots of model classes) if your data is best modelled that way. Is this simply a "fear of adding classes" situation, or is there some other reason for not wanting to do it that way?

Well... It's basically due to the fact that it's very hard to define these machines. They all have certain limitations, possibilities, functions that differ from each other quite a bit but are very similar in another way. Apart from that, the client I'm building this system for wants to be able to add functionality that has never been defined before (e.g. 3 years from now, these machines will have evolved quite a bit) without the need to call me. So we were looking at Zoo (http:// zoo.yootheme.com/) and saw that this was a very versatile way to do it. However, other suggestions are always welcome and I greatly appreciate the help!

The other thing is that we depend on MySQL for other parts of the application.

We also have an application that, in some ways, does the same things to the same data as we want to, but just isn't as powerful as our application. We have (limited) access to its database. We've browsed through it though and I'm very sure that's -not- the way we want to do this. I'm just unsure about the right way.

Simon_Macneall · August 9, 2009, 10:55pm

The other option (and I doubt it is supported in Rails yet) is Oracle's,
XML support. It is limited, but allows you to do finds by xpath. We've
used it for a JEE project, and it can make this sort of thing a bit more
manageable.

Simon

Josh5 · August 10, 2009, 9:33am

I have been working with a project which has a similar sort of issue where there are too many possible sorts of observations a doctor can make on a scan to create fields for them all. One means of solving this was to create a 'ObservationType' an ObservationData and ObservationTypeData (join) table. (Its much much more this this, but simplified is better). IIt is still easliy searched with the finders and you can skip a lot of model logic.

Observation has many ObservationDatas through ObservationTypeDatas (Join Table) ObservationTypeData belongs to ObservationData, ObservationType Observation has many ObservationTypes through ObservationDatas ObservationData has one ObservationType

So, simply put, my user can create an observation type and apply it to any observation.

You have the idea of a parent relationship going on a well, so you will need another join table which joins a parent with a child say Machine has many BiggerMachines, class = Machine, through MachineParents MachineParents (join table) belongs to Machine (fk child_id) MachineParents (join table) belongs to Machine (fk paret_id)

Really this would only require 4 to 6 tables to give you parentchild with dynamic atributes.

I use this a lot and it works very well when using flex itemreneders, datgrids, and trees to show and update your data. I personally don't think it wouldn't be a lot of fun to implement UI for this in ajax/js but I'm sure its possible! Take a good look at the nested attributes, or my addition of dangerousnestedattributes and maybe you will see some ideas there too.

Good luck, -Josh

Frederick_Cheung · August 10, 2009, 9:52am

The other option (and I doubt it is supported in Rails yet) is Oracle's,
XML support. It is limited, but allows you to do finds by xpath. We've
used it for a JEE project, and it can make this sort of thing a bit more
manageable.

and a few other products/databases do this too eg DB2 pureXML and marklogic. Probably no rails integration though.

Fred

Jaap_Haagmans · August 10, 2009, 6:14pm

Right, we've decided today that XML-parsing is not the way to go, as long as my client can easily add functions to the machines. So I'm combining all your ideas into a suitable idea (Josh's comments are especially helpful) and got to this:

A model called "machine" simply to define the unique options of a single machine:

:name :product_code

A model called "machine_attribute" to define attributes that have some overlap between different machines, linked to "machine" with a join table called "machine_machine_attribute":

:type :value

A model called "machine_constructor" to define parts of the machine, for example a touch panel or a display, linked to "machine" with a join table called "machine_machine_constructor":

:type

A model called "machine_constructor_child" to define the constructors, for example buttons or the possibilities of a (set of) button(s), linked to "machine_constructor" by a table "machine_constructor_machine_constructor_child", but it can also only be linked by its own parent_id for unlimited depth (this is very important):

:type :value :parent_id :position # needs to be sortable for consistency

This way we will be able to completely define a machine. Because we need to have unlimited depth I couldn't go deeper than this. Every machine will have attributes and have at least one constructor block, but the constructor blocks are where machines differ from each other. Some have multiple functions but no buttons, some only have a movement sensor and some functionality, some constructors are smart and have some sort of touch screen and so on.

Is this implementable? Say I would like to build the machine with product_code "207" from the database: how would that work? I know I will have to define associations for the models, but I'm unsure as to how I can use the parent_id in the last model to descend or ascend. For example, if a button is contained by a line and a line is the root, linked to constructor.type "touch_panel". How do I get from that constructor to its buttons? Will there be a method similar to this:

@buttons = MachineConstructorChild.find( ??? )

Thank you very much for reading this humongous post! I'm very eager to hear what you have to say about the model structure as well.

Jaap_Haagmans · August 11, 2009, 11:03am

Update. Again...

I've decided to deepen the machine_attribute model to include both constructors and their children. The machine_attribute model is now actually the same as my machine_constructor_child, but also has

In the XML-model example in the first post, I specified an attribute called width. Its type would be "attribute", its name "width" and its value "200". Through the model, I would set, for a machine:

has_many :machine_machine_attributes has_many :machine_attributes, :through => :machine_machine_attributes

And for a machine_attribute:

has_many :machine_machine_attributes has_many :machines, :through => :machine_machine_attributes has_many :nested_attributes, :class_name => :machine_attribute

The type "attribute" in the width example may seem confusing: I'll have to come up with a better name. What I mean is I want to be able to specify whether an attribute is an actual single attribute or a "special" one like colour, constructor (and inside constructor you'll have the special ones called "line", "button", "function" et cetera) or bcu_compatibility. A colour needs to be chosen (type=option, name=colour, value=red), but a button has to be linked to an object (e.g. conveyor_belt where id=1), so it's a droppable in my AJAX-layout (type=droppable, name=button, value=nil). A constructor is a container, just like a button line, which is no more than a way to be able to seperate certain things from each other: it contains no real information. A function is an option, again.

I hope this makes sense. Now my question: how to extend this model logic to be able to use this in my controller without having to descend manually into the machine? Is there any way I can tell my controller (or model) how it should handle an attribute with type=option or type=container or even type=attribute (which is no more than a line stating information about a machine.

The objects with type=option need to be grouped by name, the droppables need to be grouped by their container and containers need to be grouped by their container or just listed by their positions.

Jaap_Haagmans · August 11, 2009, 12:59pm

I guess the nested_attributes thingy isn't the way to go, after trying to build this. Does anyone have an idea?

Jaap_Haagmans · August 11, 2009, 1:26pm

I think you ought to go read about self-referential relationships... Social networking is all about people who have friends who have blah blah blah.

I'll look into it.

After all, a machine is built out of some number of components, components have attributes and are built from some number of other components or parts, and parts are, well parts (it has to end somewhere, parts are components with no subsidiary components?).

Sure, it ends. But at this moment I can't say where it will end, which is why I keep on stressing that theoretically, there should be endless depth.

I once worked on a system where we modeled utility rates, and were tasked with building 'the ultimately flexible rate model' engine. Basically, a rate had a name and some attributes and N components. Each Component had attributes and (possibly) some (sub-)components.

Well, that sounds like what I'm trying to do. However, I think our situations differ in that I want certain attributes to behave differently than other attributes.

Jaap_Haagmans · August 11, 2009, 1:44pm

Also, I need to be able to define a scope for the position value, so that any product can be positioned (using Prototype's sortable_element) within its containing element (parent element), along with its children, but not outside its containing element. I guess this is model logic, but I don't know where to start.

Topic		Replies	Views
multiple models questions rubyonrails-talk	24	737	August 17, 2009
Sharing a resource's schema through REST rubyonrails-core	21	371	April 25, 2007
relationship + XML rubyonrails-talk	1	117	April 26, 2009
Generic extension of an ActiveRecord model rubyonrails-talk	10	150	February 27, 2007
database design question: building features rubyonrails-talk	15	497	January 26, 2010

xml-containing models

Related topics

More Resources