Implementation of Sets in AR

I am at the very early design stages of a web service to implement a
"booking engine". I want it to be a generic in so far as the resource
being booked could be anything, a tennis court, hotel room, ski chalet
or airline seat. It's up to the client (RoR) application calling the
webservice to determine what the resource is, who can book it etc etc.

so really all I need is "booking" and "availability" where
availability is a set of resources, and a booking is a subset (not
necessarily contiguous) of resources. But the point is that a
resource is just a unique numeric key, it has no other attribute. So
it would be terrific to implement the webservice schema using set
theory in Ruby.

require 'set'
require 'yaml'
availability = Set.new [1,2,3,4,5,6,7,8,9,10,11,12,14]
booking = Set.new(4..8).to_a

and then be able to use the standard set operators to check there is
availability (is a subset) and all of the others.

So what I need is to store a set of integers as a set within the
database (in a column?) to represent a set of resources. That could
be done by serializing the Set object but then I lose the ability of
database queries (in particular indexing performance gains) to do
something like, show me all of the bookings of the resources [11, 14,
16], which is of course another set.

But a relational database is just set theory in disguise I hear you
say! But surely my webservice does not need to have a table called
Resource with a column called application_id and a column called
resource_id (and the extra id created by Active Record).

My gut feel is that I am approaching this in the wrong way. Perhaps
Active Record is the wrong approach. I have had a look at Nested Sets
but that isn't the functionality I am after. Or am I?

Does anyone have any suggestions as to a sensible design approach?

O.

Owain wrote:

I am at the very early design stages of a web service to implement a
"booking engine". I want it to be a generic in so far as the resource
being booked could be anything, a tennis court, hotel room, ski chalet
or airline seat. It's up to the client (RoR) application calling the
webservice to determine what the resource is, who can book it etc etc.

so really all I need is "booking" and "availability" where
availability is a set of resources, and a booking is a subset (not
necessarily contiguous) of resources. But the point is that a
resource is just a unique numeric key, it has no other attribute. So
it would be terrific to implement the webservice schema using set
theory in Ruby.

require 'set'
require 'yaml'
availability = Set.new [1,2,3,4,5,6,7,8,9,10,11,12,14]
booking = Set.new(4..8).to_a

and then be able to use the standard set operators to check there is
availability (is a subset) and all of the others.

So what I need is to store a set of integers as a set within the
database (in a column?) to represent a set of resources.

No, you probably don't.

That could
be done by serializing the Set object but then I lose the ability of
database queries (in particular indexing performance gains) to do
something like, show me all of the bookings of the resources [11, 14,
16], which is of course another set.

But a relational database is just set theory in disguise I hear you
say! But surely my webservice does not need to have a table called
Resource with a column called application_id and a column called
resource_id (and the extra id created by Active Record).

No, I think it does. You *want* each resource to be a separate object
in your application, and you *want* each resource to be a separate
object in your DB, I think. That will give you the queries you mention
above, and keep each resource individually addressable.

Why do you think you don't need to do this? It seems like the best way
to me.

My gut feel is that I am approaching this in the wrong way.

You absolutely are. You should be using the relational model for what
it's good at, not trying to defeat it.

Perhaps
Active Record is the wrong approach.

ActiveRecord is the right approach. Serialized arrays are the wrong
approac.

I have had a look at Nested Sets
but that isn't the functionality I am after. Or am I?

No. Those are for tree structures.

Does anyone have any suggestions as to a sensible design approach?

Yes! Keep each resource as a separate record.

O.

Best,

Marnen

Thank you for taking the trouble to reply.

Owain wrote:

> That could
> be done by serializing the Set object but then I lose the ability of
> database queries (in particular indexing performance gains) to do
> something like, show me all of the bookings of the resources [11, 14,
> 16], which is of course another set.

> But a relational database is just set theory in disguise I hear you
> say! But surely my webservice does not need to have a table called
> Resource with a column called application_id and a column called
> resource_id (and the extra id created by Active Record).

No, I think it does. You *want* each resource to be a separate object
in your application, and you *want* each resource to be a separate
object in your DB, I think. That will give you the queries you mention
above, and keep each resource individually addressable.

Why do you think you don't need to do this? It seems like the best way
to me.

The thing is that there is no other attribute of the model needed by
the booking engine other than 'id'. So imagine that the resources
being booked were coloured pencils. A booking would be for red, blue
and green. There is no need to have another table for colours. The
availability object would hold all of the available colours. I don't
really see the point of the having a table with the following rows:

application_id, resource
pencil_booker, red
pencil_booker, orange
pencil_booker, yellow
pencil_booker, green
pencil_booker, blue
pencil_booker, indigo
pencil_booker, violet
pencil_booker, dayglow_yellow

And maybe this is the difference, this table DOES exist in another
form in the RoR application that calls the webservice. So the calling
service knows that orange has an id of 1234567 and the booking engine
sees that as a resource available and for booking. The booking
service is not domain specific, it just implements some booking
logic.
Taking a real example, using our traditional approach I would need to
set up a the resource table with a row for every day that is made
available (using its Julian form as an id) with a HABTM relationship
through an associative entity table just to be able to be able to find
all bookings for a particular day or week?

A sledgehammer to crack a nut?

O.

Owain wrote:

Marnen

Thank you for taking the trouble to reply.

Owain wrote:

No, I think it does. �You *want* each resource to be a separate object
in your application, and you *want* each resource to be a separate
object in your DB, I think. �That will give you the queries you mention
above, and keep each resource individually addressable.

Why do you think you don't need to do this? �It seems like the best way
to me.

The thing is that there is no other attribute of the model needed by
the booking engine other than 'id'. So imagine that the resources
being booked were coloured pencils. A booking would be for red, blue
and green. There is no need to have another table for colours. The
availability object would hold all of the available colours.

...but wouldn't tell you anything about the unavailable colors, and
wouldn't easily let you query the availability of an individual color.

I don't
really see the point of the having a table with the following rows:

application_id, resource
pencil_booker, red
pencil_booker, orange
pencil_booker, yellow
pencil_booker, green
pencil_booker, blue
pencil_booker, indigo
pencil_booker, violet
pencil_booker, dayglow_yellow

The point is that each resource really is a different entity in your
application's terms, so it should be treated like one.

And maybe this is the difference, this table DOES exist in another
form in the RoR application that calls the webservice. So the calling
service knows that orange has an id of 1234567 and the booking engine
sees that as a resource available and for booking. The booking
service is not domain specific, it just implements some booking
logic.

Then perhaps the booking service shouldn't store resources at all, but
should just call the actual agency (say, the pencil box) responsible for
reservations and return a response code.

Taking a real example, using our traditional approach I would need to
set up a the resource table with a row for every day that is made
available (using its Julian form as an id) with a HABTM relationship
through an associative entity table just to be able to be able to find
all bookings for a particular day or week?

No. You probably wouldn't want to consider a day to be a resource.
Rather, each booking should have a start time and and an end time.

A sledgehammer to crack a nut?

No, just poor data modeling.

O.

Best,

Owain wrote:
> Marnen

> Thank you for taking the trouble to reply.

>> Owain wrote:

>> No, I think it does. You *want* each resource to be a separate object
>> in your application, and you *want* each resource to be a separate
>> object in your DB, I think. That will give you the queries you mention
>> above, and keep each resource individually addressable.

>> Why do you think you don't need to do this? It seems like the best way
>> to me.

> The thing is that there is no other attribute of the model needed by
> the booking engine other than 'id'. So imagine that the resources
> being booked were coloured pencils. A booking would be for red, blue
> and green. There is no need to have another table for colours. The
> availability object would hold all of the available colours.

...but wouldn't tell you anything about the unavailable colors, and
wouldn't easily let you query the availability of an individual color.

> I don't
> really see the point of the having a table with the following rows:

> application_id, resource
> pencil_booker, red
> pencil_booker, orange
> pencil_booker, yellow
> pencil_booker, green
> pencil_booker, blue
> pencil_booker, indigo
> pencil_booker, violet
> pencil_booker, dayglow_yellow

The point is that each resource really is a different entity in your
application's terms, so it should be treated like one.

> And maybe this is the difference, this table DOES exist in another
> form in the RoR application that calls the webservice. So the calling
> service knows that orange has an id of 1234567 and the booking engine
> sees that as a resource available and for booking. The booking
> service is not domain specific, it just implements some booking
> logic.

Then perhaps the booking service shouldn't store resources at all, but
should just call the actual agency (say, the pencil box) responsible for
reservations and return a response code.

It does need to track booked resources for the yield management
functionality of the booking engine. i.e. it does dynamic price
calculation based on availability.

> Taking a real example, using our traditional approach I would need to
> set up a the resource table with a row for every day that is made
> available (using its Julian form as an id) with a HABTM relationship
> through an associative entity table just to be able to be able to find
> all bookings for a particular day or week?

No. You probably wouldn't want to consider a day to be a resource.
Rather, each booking should have a start time and and an end time.

That doesn't cleanly work for non-contiguous periods, which I need.

> A sledgehammer to crack a nut?

No, just poor data modeling.

I think we can easily fall into the trap of confusing data modelling
with foisting a data model onto a database implementation. There is a
logical model and there is the normalised technical model. I am
really at the logical model stage where a resource is just a "thing"
and that's all I need to track in the booking engine. It has no other
attributes. It looks like MongoDB has support for arrays and
therefore sets where you can query against arrays. See
http://railstips.org/blog/archives/2009/12/18/why-i-think-mongo-is-to-databases-what-rails-was-to-frameworks/
Picking the right tool for the logical design is not always possible
but at this stage I am lucky enough to have a blank(ish) sheet of
paper.

I'll put some more effort into the de-normalised approach and see
where I get to.

O.

O.

Owain wrote:

>> object in your DB, I think. That will give you the queries you mention

> pencil_booker, green
> And maybe this is the difference, this table DOES exist in another
> form in the RoR application that calls the webservice. So the calling
> service knows that orange has an id of 1234567 and the booking engine
> sees that as a resource available and for booking. �The booking
> service is not domain specific, it just implements some booking
> logic.

Then perhaps the booking service shouldn't store resources at all, but
should just call the actual agency (say, the pencil box) responsible for
reservations and return a response code.

It does need to track booked resources for the yield management
functionality of the booking engine. i.e. it does dynamic price
calculation based on availability.

Then it at least needs to store a resource count. Perhaps that's *all*
it needs to store, not a separate record for each resource or an array
of resource IDs?

> Taking a real example, using our traditional approach I would need to
> set up a the resource table with a row for every day that is made
> available (using its Julian form as an id) with a HABTM relationship
> through an associative entity table just to be able to be able to find
> all bookings for a particular day or week?

No. �You probably wouldn't want to consider a day to be a resource.
Rather, each booking should have a start time and and an end time.

That doesn't cleanly work for non-contiguous periods, which I need.

It can, with some further tricks.

Can you get into more detail about your use case? It seems like every
time I try to simplify your data model, we discover a new wrinkle...

> A sledgehammer to crack a nut?

No, just poor data modeling.

I think we can easily fall into the trap of confusing data modelling
with foisting a data model onto a database implementation. There is a
logical model and there is the normalised technical model.

The two are interdependent.

I am
really at the logical model stage where a resource is just a "thing"
and that's all I need to track in the booking engine.

If it is a "thing", it needs a record IMHO.

[...]

I'll put some more effort into the de-normalised approach and see
where I get to.

It is the wrong approach. It will be harder to query and will not
scale. Don't waste any more time on it.

O.

O.

Best,