I'm using Rails 4 and the testing framework with which it ships.
I'm using a relational database that needs to be rigorously tested for
internal consistency. In this case, it's a sports statistics database
with player stats that are updated nightly.
I need to run tests like the following when new data arrives each night:
--That the sum of player stats equals the team's.
--That the sum of team stats equals the league's.
--That the sum of wins and losses equals games played.
--etc.
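For concreteness, here is a pure-Ruby sketch of the kinds of checks I mean, with plain hashes standing in for ActiveRecord models and made-up stat names:

```ruby
# Plain-hash sketch of the nightly consistency checks; in the real app
# these would read from ActiveRecord models. Stat names are hypothetical.
def player_sums_match_team?(team)
  %i[points rebounds].all? do |stat|
    team[:players].sum { |p| p[stat] } == team[stat]
  end
end

def wins_losses_match_games?(team)
  team[:wins] + team[:losses] == team[:games_played]
end
```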
For now I'm copying my development database over to testing and running
these alongside my unit tests in the /test/models/ directory.
This is an awkward set-up: my database-testing code doesn't consist of
unit tests in the proper sense, and it doesn't rely on fixtures, which
is what the Rails documentation suggests this directory is for.
My question is: In Rails, what is the best practice for database testing
like that which I describe above?
I would start by, as far as practical, normalising the database so
that there are no such potential inconsistencies.
Next, if it is suggested that fully normalising it is not practical,
probe the logic behind that claim carefully to make sure it is really
true. Do not just assume that something that seems like a good idea
from a performance standpoint is necessarily worthwhile. Decades of
experience in the software industry have shown time and again that the
bottlenecks in an application are virtually never where they are
initially anticipated.
Next, if fully normalising it really is not practical, wrap all
updates in transactions, so that if an inconsistency ever arises it
must be a bug.
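To illustrate the all-or-nothing idea, here is a toy in-memory store; in Rails you would wrap the nightly import in `ActiveRecord::Base.transaction` instead, and the database would do the rollback for you:

```ruby
# Toy all-or-nothing update. A failed import restores the snapshot, so
# a crash partway through can never leave a partial (inconsistent) state.
class Store
  attr_reader :rows

  def initialize
    @rows = {}
  end

  # Snapshot the state, yield, and restore the snapshot on any error.
  def transaction
    snapshot = Marshal.load(Marshal.dump(@rows))
    yield self
  rescue StandardError
    @rows = snapshot
    raise
  end

  def upsert(key, value)
    @rows[key] = value
  end
end
```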
Next, have a comprehensive set of automated tests that attempt to get
inconsistent data into the database.
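Such tests would try to defeat a guard like the following hypothetical one, and assert that the bad record is rejected:

```ruby
# Hypothetical guard the import calls on every record: refuse any row
# whose wins + losses disagree with its games-played count.
def validate_team!(team)
  unless team[:wins] + team[:losses] == team[:games_played]
    raise ArgumentError, "inconsistent record for #{team[:name]}"
  end
  team
end
```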
Finally, have a task that runs on the server at appropriate times
(possibly via cron, or invoked automatically when data are updated),
performs the checks, and sounds a siren if a bug is detected.
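Such an audit might look like the following sketch (in Rails, a rake task run from cron; the league/team structure and stat names are hypothetical). "Sounding the siren" would then be raising, emailing, or paging whenever the returned list is non-empty:

```ruby
# Periodic audit over a whole league. Returns human-readable failure
# messages; an empty array means every check passed.
def nightly_audit(league)
  failures = []
  league[:teams].each do |team|
    player_points = team[:players].sum { |p| p[:points] }
    if player_points != team[:points]
      failures << "#{team[:name]}: player points != team points"
    end
    if team[:wins] + team[:losses] != team[:games_played]
      failures << "#{team[:name]}: wins + losses != games played"
    end
  end
  if league[:teams].sum { |t| t[:points] } != league[:points]
    failures << "league points != sum of team points"
  end
  failures
end
```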
It's not practical to change the database, as the data arrive from a
3rd-party provider with a particular structure.
Your note made me realize that I'm going about this the wrong way,
however. It didn't even occur to me to use transactions on insert; I'm
accustomed to an old system that doesn't support transactions, so it
didn't enter my thinking. What I'm trying to accomplish shouldn't
really be part of my testing code.
Long experience has taught me that if you're dealing with data with a crappy structure, do not try to clean it up outside the db. Use the tools that a powerful database like PG provides. Create a set of tables which mirrors *exactly* the structure of the data you get, no matter how awful it is. Add appropriate indexes. Test everything you can think of. Then move the verified data into a set of reasonably normalized tables.
Now that does assume a pretty bad data structure being handed to you, which is more often than not the case when dealing with legacy systems. But if the data as you're getting it is actually structured reasonably well, but you just want to verify consistency, feel free to ignore that advice and skip the "staging" tables. I just wanted to point out that you really don't have to be bound by the structure of what you get.
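A toy version of that staging-table pipeline, with in-memory arrays standing in for the real PG staging and normalized tables (names and the single check are hypothetical):

```ruby
# Keep raw rows exactly as received, and promote only rows that pass
# verification into the clean set; rejects stay behind for inspection.
class Importer
  attr_reader :staging, :clean

  def initialize
    @staging = []
    @clean = []
  end

  def ingest(rows)
    @staging.concat(rows)   # store the feed verbatim, however ugly
  end

  # Move verified rows to the clean set; leave rejects in staging so
  # they can be examined and taken up with the data provider.
  def promote!
    verified, rejects = @staging.partition do |r|
      r[:wins] + r[:losses] == r[:games_played]
    end
    @clean.concat(verified)
    @staging = rejects
  end
end
```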