Advice on data design idea

I'm about to embark on a project and am seeking advice on my approach.
I have an automotive site that needs to have shared categories. I've
checked out different nested set, ancestral, acts_as_tree, acyclic
plugins but feel they don't exactly fit.

So I am about to roll my own simple solution. Any feedback as to
whether this looks good is appreciated.

Essentially I have:

Sections - Automotive, Marine, Air

Groups - Cars, Trucks, Bikes

Categories - Pickup, SUV, Utility Truck

The categories need to be able to be under multiple groups and/or
sections. So I am thinking of having a series of HABTM associations:

sections_groups
sections_categories
groups_categories

This will allow me to add new groups/categories/sections with highly
flexible associations.

@section.groups or @section.categories etc...

Does this seem right or am I way off? I'm also concerned whether it
will perform well.

Randy Clark wrote in post #966944:

I'm about to embark on a project and am seeking advice on my approach.
I have an automotive site that needs to have shared categories. I've
checked out different nested set, ancestral, acts_as_tree, acyclic
plugins but feel they don't exactly fit.

As I said in your other thread, programming is not about what you feel.

So I am about to roll my own simple solution. Any feedback as to
whether this looks good is appreciated.

Essentially I have:

Sections - Automotive, Marine, Air

Groups - Cars, Trucks, Bikes

Categories - Pickup, SUV, Utility Truck

The categories need to be able to be under multiple groups and/or
sections.

In what way do you mean this? Will you actually have multiple parents
for one category? Can you give an example of the way your hierarchy
will look.

So I am thinking of having a series of HABTM associations:

sections_groups
sections_categories
groups_categories

This will allow me to add new groups/categories/sections with highly
flexible associations.

@section.groups or @section.categories etc...

Does this seem right or am I way off? I'm also concerned whether it
will perform well.

I think you're way off. Sounds like you have a hierarchy which is either
a tree or an arbitrary graph (not sure which, pending your answer to my
question above). In either case, all you need is one model (call it
Category). Use awesome_nested_set if it's a tree. Done.

If it's a graph, things might get a bit harder, but
http://www.artfulsoftware.com/mysqlbook/sampler/mysqled1ch20.html may
give you an overview of what you could do.
http://www.dweebd.com/sql/modeling-bidirectional-graph-edges-in-rails/
is a Rails-specific implementation of one of those ideas (though it
implements bidirectional edges and you probably only need unidirectional
ones). There may be a Rails plugin that abstracts this.

Best,

Correct, I would like to support multiple parents for a given
category. For instance:

Automotive
  Trailers
    Cargo Trailer

Marine/boating
  Misc.
    Cargo Trailer

'Cargo Trailers' may be accessed through different hierarchies.

I have considered using a graph, but thought the implementation might be
a bit overkill or complex. Is there some way to do this via
awesome_nested_set or ancestry?

btw - thanks for the help.

Please quote when replying.

Randy Clark wrote in post #966962:

Correct, I would like to support multiple parents for a given
category. For instance:

Automotive
  Trailers
    Cargo Trailer

Marine/boating
  Misc.
    Cargo Trailer

'Cargo Trailers' may be accessed through different hierarchies.

Er, why? Why does a cargo trailer belong in a boating category?

I have the impression, here as in your earlier post, that your category
hierarchy may be in need of some normalization.

I have considered using a graph, but thought the implementation might be
a bit overkill or complex.

How can it be overkill? It's the exact data structure you're talking
about.

Will it be complex? Yes. If you could normalize your categories so
that each has only one parent, you'd have a tree, and that would be a
lot easier to implement. (Whether you can in fact do this for your data
is another question.)

Is there some way to do this via
awesome_nested_set

No. That only allows for each node to have one parent.

or ancestry?

I've never heard of that one. However, having looked up the docs, it
appears that Ancestry uses the materialized path pattern. That
generally only supports one parent per node; it's another way of doing
trees. Unless you can make your data into a tree, you need a directed
graph structure.
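To make the one-parent limitation concrete, here is a rough plain-Ruby sketch of the materialized path idea. It is not Ancestry's actual code; `PathNode` and its fields are made-up names, and the assumption is simply that each record stores one string of ancestor ids.

```ruby
# Hypothetical materialized-path record: each node stores exactly one string
# of ancestor ids (root first), so it can sit in only one place in the tree.
PathNode = Struct.new(:id, :name, :path) do
  # Ids of all ancestors, root first. A root node has no path at all.
  def ancestor_ids
    path.to_s.split("/").map { |segment| Integer(segment) }
  end

  # The single parent is the last id in the stored path (nil for a root).
  def parent_id
    ancestor_ids.last
  end
end

automotive = PathNode.new(1, "Automotive", nil)
trailers   = PathNode.new(2, "Trailers", "1")
cargo      = PathNode.new(3, "Cargo Trailer", "1/2")

cargo.ancestor_ids # ids of Automotive and Trailers
cargo.parent_id    # id of Trailers
```

Because there is only one path per record, listing Cargo Trailer under a second parent would mean creating a second record with its own path, which is exactly the duplication being avoided.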

btw - thanks for the help.

You're welcome!

Best,

Thanks Marnen -

Er, why? Why does a cargo trailer belong in a boating category?
I have the impression, here as in your earlier post, that your category
hierarchy may be in need of some normalization.

I may need to rethink the data structure, but as of now the site calls
for categories to be able to be found under multiple sections.
Customers sometimes search for items under different categories, kind
of like aliases. So in the case of the cargo trailer it's definitely
automotive, but some marinas may be used to finding tool/cargo
trailers within the marine category. Rather than have duplicate
categories, it would be best to share the same category.

How can it be overkill? It's the exact data structure you're talking
about.

Understood, but after looking into various DAG databases and methods I
think it would be better to find a way to normalize or restructure the
data to use something better supported, such as a one-parent-per-node
setup like awesome_nested_set or ancestry.

I am wondering, what about simply manually defining the categories for
my root sections? For instance:

automotive = [cat1, cat2, cat3, etc...]

Essentially create 'virtual' sections. This will keep my categories
simple yet allow me to 'group' a category into multiple virtual
sections when needed.
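A minimal sketch of what such 'virtual' sections might look like in plain Ruby. The constant `VIRTUAL_SECTIONS` and the helper `sections_for` are hypothetical names, and the category lists are only examples built from names used in this thread.

```ruby
# Hypothetical "virtual sections": a hand-maintained mapping from section
# name to the categories listed under it.
VIRTUAL_SECTIONS = {
  "Automotive"     => ["Pickup", "SUV", "Cargo Trailer"],
  "Marine/boating" => ["Cargo Trailer"]
}.freeze

# Returns the names of every section that lists the given category.
def sections_for(category)
  VIRTUAL_SECTIONS.select { |_section, categories| categories.include?(category) }.keys
end

sections_for("Cargo Trailer") # both sections
sections_for("SUV")           # only "Automotive"
```

Note that this mapping is itself a small, hand-maintained graph from sections to categories, just without any tooling around it.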

Can I get an amen or should I go stand back in the corner? :)

Randy Clark wrote in post #966977:

Thanks Marnen -

Er, why? Why does a cargo trailer belong in a boating category?
I have the impression, here as in your earlier post, that your category
hierarchy may be in need of some normalization.

I may need to rethink the data structure, but as of now the site calls
for categories to be able to be found under multiple sections.
Customers sometimes search for items under different categories, kind
of like aliases. So in the case of the cargo trailer it's definitely
automotive, but some marinas may be used to finding tool/cargo
trailers within the marine category. Rather than have duplicate
categories, it would be best to share the same category.

Really, or would it be better to have an item belong to multiple
categories?

[...]

I am wondering, what about simply manually defining the categories for
my root sections? For instance:

automotive = [cat1, cat2, cat3, etc...]

Essentially create 'virtual' sections. This will keep my categories
simple yet allow me to 'group' a category into multiple virtual
sections when needed.

Can I get an amen or should I go stand back in the corner? :)

Go stand back in the corner. :) If you're creating a DAG, make it an
actual DAG.
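For a sense of what "an actual DAG" could look like in miniature, here is a rough plain-Ruby sketch. It is not Rails code and not taken from any plugin; `CategoryGraph` and its methods are made-up names, and in a real app the edges would presumably live in a join table rather than in memory.

```ruby
require "set"

# Hypothetical in-memory category graph with one-way (parent -> child) edges.
class CategoryGraph
  def initialize
    @children = Hash.new { |hash, key| hash[key] = Set.new }
    @parents  = Hash.new { |hash, key| hash[key] = Set.new }
  end

  # Adds a parent -> child edge, refusing any edge that would create a cycle.
  def add_edge(parent, child)
    if parent == child || descendants(child).include?(parent)
      raise ArgumentError, "#{parent} -> #{child} would create a cycle"
    end
    @children[parent] << child
    @parents[child] << parent
    self
  end

  # Direct parents of a node; a node may have more than one.
  def parents(node)
    @parents.fetch(node, Set.new).to_a
  end

  # Every node reachable by following child edges from the given node.
  def descendants(node)
    walk(node, @children)
  end

  # Every node reachable by following parent edges from the given node.
  def ancestors(node)
    walk(node, @parents)
  end

  private

  # Iterative depth-first traversal that visits each node at most once.
  def walk(start, edges)
    seen = Set.new
    stack = edges.fetch(start, Set.new).to_a
    until stack.empty?
      node = stack.pop
      next unless seen.add?(node)
      stack.concat(edges.fetch(node, Set.new).to_a)
    end
    seen
  end
end

graph = CategoryGraph.new
graph.add_edge("Automotive", "Trailers")
graph.add_edge("Trailers", "Cargo Trailer")
graph.add_edge("Marine/boating", "Misc.")
graph.add_edge("Misc.", "Cargo Trailer")

graph.parents("Cargo Trailer")   # both "Trailers" and "Misc."
graph.ancestors("Cargo Trailer") # reaches "Automotive" and "Marine/boating"
```

The cycle check in `add_edge` is what keeps the graph acyclic; without it this would be an arbitrary directed graph.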

Best,

Really, or would it be better to have an item belong to multiple
categories?

This is an established use case; I have already decided that products
can optionally belong to multiple categories. I plan on doing that
through a HABTM relationship between the products and the categories.

Go stand back in the corner. :) If you're creating a DAG, make it an
actual DAG.

You're right - this keeps coming back to DAG. I am leaning towards
this because:

1) Sounds fun

2) My most heavily used data is not stored in the DAG (only the
categories) so I don't foresee any performance issues as the DAG will
only be updated on occasion.

3) DAG will give me the most options/flexibility for this particular
project

Scribbling with this to see how well it fits - thanks a ton for the
direction and clear thought process.

Randy Clark wrote in post #967020:

Really, or would it be better to have an item belong to multiple
categories?

This is an established use case; I have already decided that products
can optionally belong to multiple categories. I plan on doing that
through a HABTM relationship between the products and the categories.

Right.

Go stand back in the corner. :) If you're creating a DAG, make it an
actual DAG.

You're right - this keeps coming back to DAG. I am leaning towards
this because:

1) Sounds fun

:)

2) My most heavily used data is not stored in the DAG (only the
categories) so I don't foresee any performance issues as the DAG will
only be updated on occasion.

Sure. For tree structures, you'd use a nested set to retrieve the graph
efficiently. I'm not sure how to do this for general graphs.
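To illustrate why a nested set makes tree retrieval cheap, here is a small plain-Ruby sketch of the numbering scheme. It is not awesome_nested_set's implementation; `Node`, `number!`, `rows_for` and `descendants_of` are hypothetical names, and the tree is just an example using names from this thread.

```ruby
# Hypothetical nested-set numbering. Each node gets a left/right pair, and a
# node's descendants are exactly the nodes whose left value lies strictly
# between that node's left and right values.
Node = Struct.new(:name, :children, :lft, :rgt)

# Assigns lft/rgt values depth-first and returns the next unused counter.
def number!(node, counter = 1)
  node.lft = counter
  counter += 1
  node.children.each { |child| counter = number!(child, counter) }
  node.rgt = counter
  counter + 1
end

# Flattens the tree into a list, the way rows would sit in a table.
def rows_for(node)
  [node] + node.children.flat_map { |child| rows_for(child) }
end

# Equivalent to one range query: WHERE lft > :lft AND lft < :rgt
def descendants_of(rows, node)
  rows.select { |row| row.lft > node.lft && row.lft < node.rgt }
end

pickup     = Node.new("Pickup", [])
suv        = Node.new("SUV", [])
trucks     = Node.new("Trucks", [pickup, suv])
cars       = Node.new("Cars", [])
automotive = Node.new("Automotive", [cars, trucks])

number!(automotive)
rows = rows_for(automotive)

descendants_of(rows, trucks).map(&:name) # "Pickup" and "SUV"
```

The scheme depends on each node having a single left/right interval, which is why it fits trees and does not carry over directly to nodes with several parents.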

3) DAG will give me the most options/flexibility for this particular
project

Don't overdesign in the name of flexibility. Design for your current
needs as much as possible, and be ready to refactor when the time comes.

Scribbling with this to see how well it fits - thanks a ton for the
direction and clear thought process.

You're welcome!

Best,

Another alternative to consider would be a tagging approach using
something like acts-as-taggable-on.
(https://github.com/mbleigh/acts-as-taggable-on)

I've never used tagging in a project, so maybe I'm way off base here.

Eric

Another alternative to consider would be a tagging approach using
something like acts-as-taggable-on.
(https://github.com/mbleigh/acts-as-taggable-on)

Thanks Eric - I considered that but went a different direction,
setting up a DAG.

Marnen - I am wondering if I can borrow your opinion for another
matter. I may have touched on this before, if so pardon me. I want
to store all of my products in a central products table, 'products'.
However, I'm hesitant to use STI for this because of the potential
for a sparse table. My subclasses are similar but may eventually
become less related.

IYO - what's the best fit - STI (worry about it later), polymorphic
association, or class table inheritance?

Randy Clark wrote in post #967454:

Another alternative to consider would be a tagging approach using
something like acts-as-taggable-on.
(https://github.com/mbleigh/acts-as-taggable-on)

Thanks Eric - I considered that but went a different direction,
setting up a DAG.

Marnen - I am wondering if I can borrow your opinion for another
matter. I may have touched on this before, if so pardon me.

You did. Go read my advice that I already gave you on this point.

I want
to store all of my products in a central products table, 'products'.
However, I'm hesitant to use STI for this because of the potential
for a sparse table. My subclasses are similar but may eventually
become less related.

IYO - what's the best fit - STI (worry about it later), polymorphic
association, or class table inheritance?

Worry about it later. Remember YAGNI.

STI is smelly, though; you may want optional has_one relationships
instead.
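As a rough illustration of the "optional has_one" shape, here is a plain-Ruby stand-in rather than ActiveRecord code. `Product`, `TrailerDetail`, `BoatDetail` and all field names and values are invented for the example; the point is only that type-specific attributes sit on a separate, optional detail object instead of in sparse columns on the product.

```ruby
# Hypothetical detail records holding the type-specific attributes.
TrailerDetail = Struct.new(:axle_count, :payload_kg)
BoatDetail    = Struct.new(:hull_length_m)

# One central product type; each detail is optional and may be nil.
Product = Struct.new(:name, :trailer_detail, :boat_detail) do
  # Kinds of optional detail records that are actually present.
  def detail_kinds
    { trailer: trailer_detail, boat: boat_detail }.compact.keys
  end
end

# Made-up sample data, for illustration only.
cargo_trailer = Product.new("Cargo Trailer", TrailerDetail.new(2, 1200), nil)
plain_product = Product.new("Gift Card", nil, nil)

cargo_trailer.detail_kinds # only :trailer
plain_product.detail_kinds # empty
```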

Best,