I've read the rest of the thread with interest, but I just wanted to
go back to the start for a second.
> I've inherited an EAV database and there's really no option to remodel
> it.
Where have you inherited this data from? Is there an application
that's still populating it? or is it an export of data (from something
like an online survey/questionnaire system)?
It's not a static export, and yes there is an application still
populating it.
> Trying to return a meaningful, unified recordset is far too complex to
> be efficient.
Using what approach? The data was set up as EAV for a reason, and
would have probably had tools to query and manipulate it - do you not
have any access to the original tools?
Using SQL joins to build a recordset that represents the dataset. As
far as I can gauge the data was setup this way to allow an "easy" way
to construct a custom set of fields for a relatively non-technical
user/admin. It certainly wasn't designed to handle the reporting
function I'm being asked to create, or, I suspect the volume of data
that is being stored in it.
Yes, the right thing to do would be to replace large chunks of this
and design a better normalized data solution that satisfied all the
systems involved. Unfortunately, in this instance I don't have the
luxury of time or money to go that route. I wouldn't describe myself
as a experienced developer, but I have been in this business for a
long time, and I know the kind of hole I'm digging for myself by
choosing this method.
> I've decided to make use of the MySQL GROUP_CONCAT, and CONCAT_WS
> functions to return a fast query result, with the consolidated fields
> as one string.
> The final data is to be displayed and downloaded (CSV), but not all
> fields will be available for each "row", which makes serialization
> awkward.
So what lead you to decide to use Rails for this? Given Rails has *no*
practical EAV support (just a bit of munging with plugins), it strikes
me as a strange decision to try to use it to analyse your data. Is
there no "EAV data analysis" tool you can find to do the job, rather
than having to create huge amounts of DB views and/or model
manipulation.
It's more complex (isn't it always). A large part of the database is
well-formed, normalized and friendly. This little area that only
recently came to my attention is not (EAV). Rather than re-tool my
entire solution design, I'm sticking to the original plan, and shoe-
horn this section in (and it is a relatively small section in the
scheme of things).
I freely admit that Rails is not the best solution for this. I'd even
go as far as to say that it falls into the category of "right now I
have a hammer, so everything looks like a nail", by which I mean: I
don't do a lot of development, especially web-app development, and
spent some time at the beginning of the year learning RoR. I liked
it, even if I wasn't always using it in the manner for which it's most
suitable. It's been suggested that something like "Sinatra" would be
a much better fit for this, and quick peek seems to show that it
probably would. However, in the short time available to complete this
project, I can't deal with struggling to get through all the "gotchas"
of another framework.
Lame excuses, I know. Completely at odds with the dogma of RoR
development.
I'm not saying to *not* continue what you're doing - just curious that
it seems you've picked a route that needs more work than should be
reasonable, and don't seem to be looking at any other options. It's
always worth considering admitting that you might be making lots of
work for yourself and a new start may be more economical in the long
run.
FWIW Many (6-ish) years ago I got an EAV data export from SurveyMonkey
dumped on my desk, and after importing it into a SQL DB, I produced
Excel cross-tabulated reports with a combination of a big plain old
SQL query (with *lots* of joins! to get the data into a "relation"
equivalent, and used Excel's pivot-table functionality to do the
"heavy lifting" of getting information from those rows.
With data migration, I generally opt to get down to the raw data and
manipulate it all in code. I know it can be a little less efficient,
but when it's my code, at least I know I can trust it, or at least fix
it when it breaks. Having said that, the pivot table stuff in Excel
can be pretty useful.