Advice on data design idea

I'm about to embark on a project and am seeking advice on my approach. I have an automotive site that needs to have shared categories. I've checked out different nested set, ancestral, acts_as_tree, acyclic plugins but feel they don't exactly fit.

So I am about to roll my own simple solution. Any feedback as to whether this looks good is appreciated.

Essentially I have:

Sections - Automotive, Marine, Air

Groups - Cars, Trucks, Bikes

Categories - Pickup, SUV, Utility Truck

The categories need to be able to be under multiple groups and/or sections. So I am thinking of having a series of HABTM associations:

sections_groups, sections_categories, groups_categories

This will allow me to add new groups/categories/sections with highly flexible associations.

@section.groups or @section.categories etc...

Does this seem right or am I way off? I'm also concerned whether it will perform well.

Randy Clark wrote in post #966944:

I'm about to embark on a project and am seeking advice on my approach. I have an automotive site that needs to have shared categories. I've checked out different nested set, ancestral, acts_as_tree, acyclic plugins but feel they don't exactly fit.

As I said in your other thread, programming is not about what you feel.

So I am about to roll my own simple solution. Any feedback as to whether this looks good is appreciated.

Essentially I have:

Sections - Automotive, Marine, Air

Groups - Cars, Trucks, Bikes

Categories - Pickup, SUV, Utility Truck

The categories need to be able to be under multiple groups and/or sections.

In what way do you mean this? Will you actually have multiple parents for one category? Can you give an example of the way your hierarchy will look.

So I am thinking of having a series of HABTM associations:

sections_groups, sections_categories, groups_categories

This will allow me to add new groups/categories/sections with highly flexible associations.

@section.groups or @section.categories etc...

Does this seem right or am I way off? I'm also concerned whether it will perform well.

I think you're way off. Sounds like you have a hierarchy which is either a tree or an arbitrary graph (not sure which, pending your answer to my question above). In either case, all you need is one model (call it Category). Use awesome_nested_set if it's a tree. Done.

If it's a graph, things might get a bit harder, but Graphs may give you an overview of what you could do. http://www.dweebd.com/sql/modeling-bidirectional-graph-edges-in-rails/ is a Rails-specific implementation of one of those ideas (though it implements bidirectional edges and you probably only need unidirectional ones). There may be a Rails plugin that abstracts this.
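To make the "one model plus unidirectional edges" idea a bit more concrete, here is a rough plain-Ruby sketch rather than a Rails implementation. The class name CategoryGraph, its methods, and the placement of Utility Truck under both Trucks and Marine are all made up for illustration; in a real app the two collections would presumably be a categories table and an edge join table.

```ruby
# Hypothetical in-memory stand-in for a `categories` table plus an edge
# join table of (parent_id, child_id) rows. All names are illustrative.
class CategoryGraph
  def initialize
    @names = {}    # id => name, playing the role of the categories rows
    @edges = []    # [parent_id, child_id] pairs, playing the role of edge rows
    @next_id = 1
  end

  # Adds a category and returns its id.
  def add_category(name)
    id = @next_id
    @next_id += 1
    @names[id] = name
    id
  end

  # Adds a directed edge parent -> child. A child may have many parents.
  def link(parent_id, child_id)
    pair = [parent_id, child_id]
    @edges << pair unless @edges.include?(pair)
  end

  # Names of the direct children of a category, sorted alphabetically.
  def children_of(id)
    @edges.select { |parent, _| parent == id }.map { |_, child| @names[child] }.sort
  end

  # Names of the direct parents of a category, sorted alphabetically.
  def parents_of(id)
    @edges.select { |_, child| child == id }.map { |parent, _| @names[parent] }.sort
  end
end

graph   = CategoryGraph.new
trucks  = graph.add_category("Trucks")
marine  = graph.add_category("Marine")
utility = graph.add_category("Utility Truck")

# Assumed example: the same category hangs under two parents.
graph.link(trucks, utility)
graph.link(marine, utility)

puts graph.parents_of(utility).inspect
```

Sections, groups, and categories are all just rows of the same kind here; only the edges say which one sits under which.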

Best,

Correct, I would like to support multiple parents for a given category. For instance:

Automotive > Trailers > Cargo Trailer

Marine/boating > Misc. > Cargo Trailer

'Cargo Trailers' may be accessed through different hierarchies.

I have considered using a graph, but thought the implementation may be a bit overkill or complex. Curious whether there is some way to do this via awesome_nested_set or ancestry?

btw - thanks for the help.

Please quote when replying.

Randy Clark wrote in post #966962:

Correct, I would like to support multiple parents for a given category. For instance:

Automotive > Trailers > Cargo Trailer

Marine/boating > Misc. > Cargo Trailer

'Cargo Trailers' may be accessed through different hierarchies.

Er, why? Why does a cargo trailer belong in a boating category?

I have the impression, here as in your earlier post, that your category hierarchy may be in need of some normalization.

I have considered using a graph, but thought the implementation may be a bit overkill or complex.

How can it be overkill? It's the exact data structure you're talking about.

Will it be complex? Yes. If you could normalize your categories so that each has only one parent, you'd have a tree, and that would be a lot easier to implement. (Whether you can in fact do this for your data is another question.)

Curious whether there is some way to do this via awesome_nested_set

No. That only allows for each node to have one parent.

or ancestry?

I've never heard of that one. However, having looked up the docs, it appears that Ancestry uses the materialized path pattern. That generally only supports one parent per node; it's another way of doing trees. Unless you can make your data into a tree, you need a directed graph structure.
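To show why a materialized path gives each node a single parent, here is a small plain-Ruby sketch of the general pattern. It is not Ancestry's actual column format or API; the PATHS data and the helper names are invented for the example.

```ruby
# Each row stores exactly one path string, so it records exactly one chain
# of ancestors. The data below is made up for illustration.
PATHS = {
  "Automotive"    => "automotive",
  "Trailers"      => "automotive/trailers",
  "Cargo Trailer" => "automotive/trailers/cargo-trailer"
}.freeze

# Ancestor path segments, root first, derived purely from the stored string.
def ancestor_segments(name)
  PATHS.fetch(name).split("/")[0...-1]
end

# Descendants are the rows whose path starts with this row's path plus "/".
def descendant_names(name)
  prefix = PATHS.fetch(name) + "/"
  PATHS.select { |_, path| path.start_with?(prefix) }.keys
end

puts ancestor_segments("Cargo Trailer").inspect
puts descendant_names("Automotive").inspect
```

Putting Cargo Trailer under Marine/boating as well would need a second path for the same row, which a single path column cannot hold without duplicating the category.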

btw - thanks for the help.

You're welcome!

Best,

Thanks Marnen -

Er, why? Why does a cargo trailer belong in a boating category? I have the impression, here as in your earlier post, that your category hierarchy may be in need of some normalization.

I may need to rethink the data structure, but as of now the site calls for categories to be able to be found under multiple sections. Customers sometimes search for items under different categories, kind of like aliases. So in the case of the cargo trailer, it's definitely automotive, but some marinas may be used to finding tool/cargo trailers within the marine category. Rather than have duplicate categories, it would be best to share the same category.

How can it be overkill? It's the exact data structure you're talking about.

Understood, but after looking into various DAG databases and methods, I think it would be better to normalize or restructure the data so I can use something better supported, such as a one-parent-per-node setup with awesome_nested_set or ancestry.

I am wondering, what about simply manually defining the categories for my root sections? For instance:

automotive = [cat1, cat2, cat3, etc...]

Essentially create 'virtual' sections. This will keep my categories simple yet allow me to 'group' a category into multiple virtual sections when needed.

Can I get an amen or should I go stand back in the corner? :)

Randy Clark wrote in post #966977:

Thanks Marnen -

Er, why? Why does a cargo trailer belong in a boating category? I have the impression, here as in your earlier post, that your category hierarchy may be in need of some normalization.

I may need to rethink the data structure, but as of now the site calls for categories to be able to be found under multiple sections. Customers sometimes search for items under different categories, kind of like aliases. So in the case of the cargo trailer, it's definitely automotive, but some marinas may be used to finding tool/cargo trailers within the marine category. Rather than have duplicate categories, it would be best to share the same category.

Really, or would it be better to have an item belong to multiple categories?

[...]

I am wondering, what about simply manually defining the categories for my root sections? For instance:

automotive = [cat1, cat2, cat3, etc...]

Essentially create 'virtual' sections. This will keep my categories simple yet allow me to 'group' a category into multiple virtual sections when needed.

Can I get an amen or should I go stand back in the corner? :)

Go stand back in the corner. :) If you're creating a DAG, make it an actual DAG.
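What makes it an "actual DAG" is mostly that edges are directed and that no edge is ever allowed to close a cycle. A minimal plain-Ruby sketch of that check follows; CategoryDag, CycleError, and the method names are hypothetical, and in a Rails app the edges would presumably live in a join table with the check in a validation.

```ruby
# Hypothetical in-memory DAG keyed by category name.
class CategoryDag
  class CycleError < StandardError; end

  def initialize
    # parent name => array of child names
    @children = Hash.new { |hash, key| hash[key] = [] }
  end

  # Adds parent -> child unless doing so would make the graph cyclic.
  def add_edge(parent, child)
    if reachable?(child, parent)
      raise CycleError, "#{parent} -> #{child} would create a cycle"
    end
    @children[parent] << child unless @children[parent].include?(child)
  end

  # True if `target` can be reached from `start` by following edges downward
  # (a node counts as reachable from itself).
  def reachable?(start, target)
    stack = [start]
    seen = {}
    until stack.empty?
      node = stack.pop
      next if seen[node]
      seen[node] = true
      return true if node == target
      stack.concat(@children.fetch(node, []))
    end
    false
  end
end

dag = CategoryDag.new
dag.add_edge("Automotive", "Trailers")
dag.add_edge("Trailers", "Cargo Trailer")
dag.add_edge("Marine/boating", "Misc.")
dag.add_edge("Misc.", "Cargo Trailer")

begin
  dag.add_edge("Cargo Trailer", "Automotive")
rescue CategoryDag::CycleError => e
  puts "rejected: #{e.message}"
end
```

Cargo Trailer ends up with two parents, while the attempt to put Automotive underneath it is refused.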

Best,

Really, or would it be better to have an item belong to multiple categories?

This is an established use case; I have already decided that a product may optionally belong to multiple categories. I plan on doing that through a HABTM relationship between products and categories.

Go stand back in the corner. :) If you're creating a DAG, make it an actual DAG.

You're right - this keeps coming back to DAG. I am leaning towards this because:

1) Sounds fun

2) My most heavily used data is not stored in the DAG (only the categories) so I don't foresee any performance issues as the DAG will only be updated on occasion.

3) DAG will give me the most options/flexibility for this particular project

Scribbling with this to see how well it fits - thanks a ton for the direction and clear thought process.

Randy Clark wrote in post #967020:

Really, or would it be better to have an item belong to multiple categories?

This is an established use case; I have already decided that a product may optionally belong to multiple categories. I plan on doing that through a HABTM relationship between products and categories.

Right.

Go stand back in the corner. :) If you're creating a DAG, make it an actual DAG.

You're right - this keeps coming back to DAG. I am leaning towards this because:

1) Sounds fun

:)

2) My most heavily used data is not stored in the DAG (only the categories) so I don't foresee any performance issues as the DAG will only be updated on occasion.

Sure. For tree structures, you'd use a nested set to retrieve the graph efficiently. I'm not sure how to do this for general graphs.
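For comparison, here is a rough plain-Ruby sketch of both retrieval styles. The lft/rgt numbers are hand-assigned for a three-node tree, and the helper names and edge data are invented. The graph version is only the naive option of walking edges level by level; it is not a claim about the most efficient approach.

```ruby
# Nested set: every node carries a left/right number, and a node's
# descendants are exactly the rows whose numbers fall inside its range.
TREE_ROWS = [
  { name: "Automotive",    lft: 1, rgt: 6 },
  { name: "Trailers",      lft: 2, rgt: 5 },
  { name: "Cargo Trailer", lft: 3, rgt: 4 }
].freeze

# One range comparison finds the whole subtree (a single query in SQL).
def nested_set_descendants(rows, name)
  root = rows.find { |row| row[:name] == name }
  rows.select { |row| row[:lft] > root[:lft] && row[:rgt] < root[:rgt] }
      .map { |row| row[:name] }
end

# General graph: parent name => child names. Cargo Trailer has two parents,
# which the lft/rgt numbering above cannot express.
EDGES = {
  "Automotive"     => ["Trailers"],
  "Trailers"       => ["Cargo Trailer"],
  "Marine/boating" => ["Misc."],
  "Misc."          => ["Cargo Trailer"]
}.freeze

# Breadth-first walk over the edges, visiting each node at most once.
def graph_descendants(edges, name)
  seen = []
  queue = edges.fetch(name, []).dup
  until queue.empty?
    node = queue.shift
    next if seen.include?(node)
    seen << node
    queue.concat(edges.fetch(node, []))
  end
  seen
end

puts nested_set_descendants(TREE_ROWS, "Automotive").inspect
puts graph_descendants(EDGES, "Marine/boating").inspect
```

Against a database, the walk would likely mean one lookup per level unless the database offers recursive queries, which matters less if the category graph is small and rarely updated.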

3) DAG will give me the most options/flexibility for this particular project

Don't overdesign in the name of flexibility. Design for your current needs as much as possible, and be ready to refactor when the time comes.

Scribbling with this to see how well it fits - thanks a ton for the direction and clear thought process.

You're welcome!

Best,

Another alternative to consider would be a tagging approach using something like acts-as-taggable-on (mbleigh/acts-as-taggable-on on GitHub), a tagging plugin for Rails applications that allows for custom tagging along dynamic contexts.

I've never used tagging in a project, so maybe I'm way off base here.

Eric

Another alternative to consider would be a tagging approach using something like acts-as-taggable-on (mbleigh/acts-as-taggable-on on GitHub), a tagging plugin for Rails applications that allows for custom tagging along dynamic contexts.

Thanks Eric - I considered that but went a different direction, setting up a DAG.

Marnen - I am wondering if I can borrow your opinion on another matter. I may have touched on this before; if so, pardon me. I want to store all of my products in a central table, 'products'. However, I'm hesitant to use STI for this because of the potential for a sparse table. My subclasses are similar but may eventually become less related.

IYO - what's the best fit - STI (worry about it later), polymorphic association, or class table inheritance?

Randy Clark wrote in post #967454:

Another alternative to consider would be a tagging approach using something like acts-as-taggable-on (mbleigh/acts-as-taggable-on on GitHub), a tagging plugin for Rails applications that allows for custom tagging along dynamic contexts.

Thanks Eric - I considered that but went a different direction, setting up a DAG.

Marnen - I am wondering if I can borrow your opinion on another matter. I may have touched on this before; if so, pardon me.

You did. Go read my advice that I already gave you on this point.

I want to store all of my products in a central table, 'products'. However, I'm hesitant to use STI for this because of the potential for a sparse table. My subclasses are similar but may eventually become less related.

IYO - what's the best fit - STI (worry about it later), polymorphic association, or class table inheritance?

Worry about it later. Remember YAGNI.

STI is smelly, though; you may want optional has_one relationships instead.
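The idea behind the optional has_one suggestion can be sketched in plain Ruby, with Structs standing in for ActiveRecord models. TrailerDetail and all of its fields are hypothetical. Shared attributes stay on the product, and type-specific attributes live in a separate record that a product may or may not have.

```ruby
# Shared columns live on Product; type-specific columns live in a separate,
# optional record instead of sparse columns on one wide table.
Product       = Struct.new(:name, :price, :trailer_detail, keyword_init: true)
TrailerDetail = Struct.new(:axle_count, :payload_kg, keyword_init: true)

# A product that has the optional detail record.
trailer = Product.new(
  name: "Cargo Trailer",
  price: 2500,
  trailer_detail: TrailerDetail.new(axle_count: 2, payload_kg: 1200)
)

# A product without one: nothing type-specific is stored for it at all.
boat_cover = Product.new(name: "Boat Cover", price: 150)

[trailer, boat_cover].each do |product|
  axles = product.trailer_detail&.axle_count
  puts "#{product.name}: #{axles ? "#{axles} axles" : 'no trailer details'}"
end
```

If the product types drift apart later, new detail records can be added without touching the products table itself.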

Best,