Best way to model arbitrary-keyed Hash

Hi all,

I've got an object that looks a lot like a hash, and its valid parameters are essentially arbitrary. I'm successfully storing these objects, but now I need to be able to query them.

This is for my project Puppet[1], which manages many different classes of objects, like files, services, packages, and users. Each of these classes has its own set of valid parameters, and I'm adding new classes and modifying existing ones all the time, so I don't want to create a separate Rails table for each class. Separate tables would also be a problem because I have a 'host' class that functions as a collection of these objects, along with a bit of other data (like its IP address); with a table per class, I'd also need a separate association for each one.

I've been storing them with the main info (class and name) in one table and the parameters (e.g., the file's owner, group, and mode) in a separate table, linked by an association.

However, I expect these queries won't work well:

    (rails_parameters.name = 'owner' AND rails_parameters.value = 'root')
    OR (rails_parameters.name = 'owner' AND rails_parameters.value = 'bin')

What is the best way to store these objects so that I can easily find them? I'm assuming I can't serialize the parameters and query against them, since SQL has no clue about the serialized YAML.
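For reference, here is a minimal sketch of how the two-table layout described above could be queried by arbitrary parameters. Only `rails_parameters` and its `name` and `value` columns come from this thread; the `rails_resources` table, the `resource_id` column, and the helper name are assumptions made for illustration. The point it tries to show is that conditions on different parameter names need one subquery (or join) per name, because a single parameter row can only ever match one name.

```ruby
# Assumed schema (only rails_parameters.name/value appear in the thread):
#   rails_resources(id, restype, title)
#   rails_parameters(id, resource_id, name, value)

# Build a SQL query that finds resources matching every condition.
# `conditions` maps a parameter name to one value or a list of
# acceptable values, e.g. { "owner" => %w[root bin], "mode" => "644" }.
# Returns [sql, binds] so values go through placeholders rather than
# string interpolation.
def resources_matching_sql(conditions)
  raise ArgumentError, "need at least one condition" if conditions.empty?

  binds = []
  clauses = conditions.map do |name, values|
    values = Array(values)
    raise ArgumentError, "no values given for #{name}" if values.empty?

    binds << name.to_s
    binds.concat(values.map(&:to_s))
    placeholders = (["?"] * values.size).join(", ")

    # One EXISTS per parameter name: a single parameter row can never
    # satisfy conditions on two different names at once.
    "EXISTS (SELECT 1 FROM rails_parameters p " \
      "WHERE p.resource_id = r.id AND p.name = ? " \
      "AND p.value IN (#{placeholders}))"
  end

  sql = "SELECT r.* FROM rails_resources r WHERE " + clauses.join(" AND ")
  [sql, binds]
end

sql, binds = resources_matching_sql("owner" => %w[root bin], "mode" => "644")
puts sql
puts binds.inspect
```

With ActiveRecord, the pair could presumably be handed to `find_by_sql` as `[sql, *binds]` on whatever model maps to the assumed resources table.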

Thanks, Luke

1 - http://reductivelabs.com/projects/puppet

Hey Luke,

You might want to take a look at ferret and the acts_as_ferret plugin. On the surface it seems like a good fit for what you are trying to accomplish.

ferret - http://ferret.davebalmain.com/trac
aaf - http://projects.jkraemer.net/acts_as_ferret/

Will Groppe

No way, you're doing a Rails frontend to Puppet? Contact me off list and I'll help you out no problem.

I think there are a few little Rails features you can use, but it sounds like you've gotta get the data model worked out to match what Puppet uses. That shouldn't be too hard, and it is possible to get them all organized into the database. It also might be that you have to generalize what all of these things actually are and use attributes to differentiate them.

Anyway, contact me since I'd love to help out on Puppet.

William Groppe wrote:

Hey Luke,

You might want to take a look at ferret and the acts_as_ferret plugin. On the surface it seems like a good fit for what you are trying to accomplish.

ferret - http://ferret.davebalmain.com/trac
aaf - http://projects.jkraemer.net/acts_as_ferret/

It's good to know those are options, but I'd prefer something that maps to the database a bit more directly.

Zed A. Shaw wrote:

No way, you're doing a Rails frontend to Puppet? Contact me off list and I'll help you out no problem.

Well, there's already a limited Rails front-end to Puppet:

http://www.reductivelabs.com/projects/puppet/documentation/puppetshow.html

But this is actually only using ActiveRecord so I don't have to deal with SQL or db agnosticism.

I think there are a few little Rails features you can use, but it sounds like you've gotta get the data model worked out to match what Puppet uses. That shouldn't be too hard, and it is possible to get them all organized into the database. It also might be that you have to generalize what all of these things actually are and use attributes to differentiate them.

Yep, that's exactly the trouble I'm having.

Every object has a type (e.g., file) and title (e.g., "/etc/passwd"), so right now I'm just limiting querying to those two fields when using Rails. I'd like to find a db model that would allow me to find any of these objects by arbitrary attributes, rather than just these two.

Anyway, contact me since I'd love to help out on Puppet.

Will do; I'd love the help.

How about using Ferret instead of the database? Ferret basically stores hash objects, which would seem to fit well for your free-form classes. You could either create a new index for each class or store all objects in a single index. Querying the index will obviously be very simple.

David Balmain wrote:

How about using Ferret instead of the database? Ferret basically stores hash objects, which would seem to fit well for your free-form classes. You could either create a new index for each class or store all objects in a single index. Querying the index will obviously be very simple.

Does Ferret do complex object queries? That is, could I do something like search for 'obj[:owner] == "root"'?

I assume that I can't easily share a ferret index across multiple servers; it's critical that I retain the ability to have multiple servers hit my database, both for performance and service availability.

Does Ferret do complex object queries? That is, could I do something like search for 'obj[:owner] == "root"'?

Yes. The query would look like this: 'owner:"root"'. But searching for 'obj1[:owner] == obj2' would be a little more difficult. Mind you, I don't think it would be any easier in a database.
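As a rough illustration of the query form just described, the sketch below turns a hash of attribute conditions into a query string. It uses only the syntax that appears in this thread (`field:"value"`, `AND`, `OR`, and parentheses). The helper name `ferret_query` is hypothetical, and values containing a double quote are rejected rather than escaped, since escaping rules are not covered here. How exact the match is for values such as file paths would depend on the analyzer the index uses.

```ruby
# Build a Ferret-style query string from attribute conditions.
# `conditions` maps a field name to one value or a list of values;
# multiple values for a field are ORed, separate fields are ANDed.
def ferret_query(conditions)
  raise ArgumentError, "need at least one condition" if conditions.empty?

  clauses = conditions.map do |field, values|
    terms = Array(values).map do |value|
      value = value.to_s
      # Escaping is out of scope for this sketch, so refuse such values.
      raise ArgumentError, "unsupported value: #{value.inspect}" if value.include?('"')
      %(#{field}:"#{value}")
    end
    raise ArgumentError, "no values given for #{field}" if terms.empty?

    terms.size == 1 ? terms.first : "(#{terms.join(' OR ')})"
  end

  clauses.join(" AND ")
end

puts ferret_query(:owner => %w[root bin], :type => "file")
# prints: (owner:"root" OR owner:"bin") AND type:"file"
```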

I assume that I can't easily share a ferret index across multiple servers; it's critical that I retain the ability to have multiple servers hit my database, both for performance and service availability.

You could either access the index over NFS (multiple readers can read the index at the same time) or set up a DRb server in front of the index. Anyway, here is a random example. I don't really know how this would fit with how Puppet works, but hopefully it gives you an idea of how Ferret works.

Cheers, Dave

    require 'rubygems'
    require 'ferret'

    obj1 = {:id => 1, :children => [3, 4], :size => "large"}
    obj2 = {:id => 2, :children => [3, 4], :size => "small"}
    obj3 = {:id => 3, :parents => [1, 2], :size => "large"}
    obj4 = {:id => 4, :parents => [1, 2], :size => "small"}

    index = Ferret::I.new

    [obj1, obj2, obj3, obj4].each {|obj| index << obj}

    index.search_each('size:small AND (parents:1 OR children:3)') do |id, score|
      puts index[id].load.inspect
    end
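To connect the example above to the objects discussed earlier in the thread (a type, a title, and arbitrary parameters), here is a small sketch of flattening such an object into one hash before adding it to an index. The helper name `resource_to_doc` and the choice of `:type` and `:title` as keys are assumptions for illustration, not anything Puppet or Ferret defines.

```ruby
# Hypothetical helper: flatten a type/title/parameters object into a
# single flat hash, the same shape as obj1..obj4 in the example above.
def resource_to_doc(type, title, params)
  doc = { :type => type.to_s, :title => title.to_s }

  params.each do |name, value|
    key = name.to_sym
    # Refuse parameters that would overwrite :type, :title, or an
    # earlier parameter with the same name.
    raise ArgumentError, "key #{key} is already set" if doc.key?(key)

    # Keep list values as lists of strings, everything else as a string.
    doc[key] = value.is_a?(Array) ? value.map(&:to_s) : value.to_s
  end

  doc
end

doc = resource_to_doc(:file, "/etc/passwd",
                      "owner" => "root", "group" => "root", "mode" => 644)
puts doc.inspect

# With Ferret loaded, the hash could then be added like the objects above:
#   index << doc
```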

David Balmain wrote:

Yes. The query would look like this: 'owner:"root"'. But searching for 'obj1[:owner] == obj2' would be a little more difficult. Mind you, I don't think it would be any easier in a database.

Ok.

You could either access the index over NFS (multiple readers can read the index at the same time) or set up a DRb server in front of the index. Anyway, here is a random example. I don't really know how this would fit with how Puppet works, but hopefully it gives you an idea of how Ferret works.

Hmmm. I'll have to look into this more; there's already a thread this week on puppet-dev about how important it is to be able to scale horizontally, so I don't want to complicate that any more than I need to.
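For what it's worth, the DRb option mentioned above might look roughly like the sketch below. The `IndexFront` class, the `serve_index` helper, and the URIs are all hypothetical; the only thing assumed about the index is that it responds to `search_each(query)` and yields an id and a score, as in the example earlier in the thread. This is one server process owning the index with other hosts calling into it, so it does not by itself address availability.

```ruby
# Front object a DRb server could expose. It wraps any index that
# responds to search_each(query) { |id, score| ... } and serialises
# access with a mutex.
class IndexFront
  def initialize(index)
    @index = index
    @lock = Mutex.new
  end

  # Return a plain array of ids so the result marshals cleanly over DRb.
  def search_ids(query)
    @lock.synchronize do
      ids = []
      @index.search_each(query) { |id, _score| ids << id }
      ids
    end
  end
end

# Start serving the wrapped index at the given URI and block forever.
# Not called here; the URI is supplied by the caller.
def serve_index(index, uri)
  require 'drb/drb'
  DRb.start_service(uri, IndexFront.new(index))
  DRb.thread.join
end

# Client side, on another host (placeholder URI):
#   require 'drb/drb'
#   DRb.start_service
#   remote = DRbObject.new_with_uri("druby://indexhost:9000")
#   remote.search_ids('owner:"root"')
```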

Okay, cool, thanks for the example. I'll look into it more closely, now that I know it will work for what I need.

Thanks!