Should db:test:prepare Also Call db:seed by Default?

I have a set of static records in my app that only need to be created once so I set up my test environment with:

RAILS_ENV=test rake db:test:prepare db:seed

Tests pass. Then later I add a new migration to my app. When I run tests again they start to fail because the static records that were created by db:seed are missing.

The problem is that Rails doesn’t just call db:migrate on the Test environment when there are new migrations. It instead calls db:test:prepare without calling db:seed. I think this happens here in load_schema_if_pending!.

Based on the number of upvotes on the answers to this Stack Overflow question, it looks like a lot of people are working around this either by calling Rails.application.load_seed directly from their test code or by configuring Rake to call db:seed. Maybe db:test:prepare should call db:seed when it finishes?

At least one person (me) sees seeds.rb as only for development data, not test data, so I would personally disagree with making this a default.

However, based on the Stack Overflow activity you cite, it seems like a number of people don’t take that perspective and find this to be a gotcha. Do you think that calling Rails.application.load_seed from test code is an adequate workaround, just one that needs to be better documented? Or do you think that a better mechanism is necessary?

(One thing that might be coming into play here: Applications I’ve worked on that have needed to load seed data at the beginning of their tests have tended to be applications that relied on particular rows being present in the database to function. For example, an ecommerce application that had different logic for each ShippingCarrier the company worked with, but that also stored ShippingCarriers as database rows. So if the UPS seed row was missing, a lot of the UPS-specific shipping code would break.

I don’t like this architectural style. IME, keeping the database’s data in sync with what the code expects is a source of flakiness, and keeping these things submerged in the database rather than hard-coded in the application makes it harder for new developers to onboard into a project. That said, Rails is a big tent, and I don’t think that this architectural style should be unsupported. I’m just going to be skeptical of adding defaults I perceive as encouraging it.)

Thanks for the reply. I can understand that viewpoint and agree it can be flaky depending upon how it is managed. But the Rails docs indicate that they expect people to use seeds for tests:

Rails has a built-in ‘seeds’ feature that speeds up the process. This is especially useful when reloading the database frequently in development and test environments.

[Migrations and Seed Data]

I’d still advocate for calling db:seed by default since it’s a bit difficult to figure out why calling rails test will sometimes purge test records. Once you know that the db:test:prepare task is the culprit then it isn’t too hard to track down a workaround.

If an app is written in a way such that tests don’t care about seed data, then changing the default shouldn’t have a functional impact. Though I could see it slowing tests down after each new migration if a huge amount of data was being generated, in which case people would have to make a change to exclude the Test env from the seeds.rb logic. It might be easier to understand and fix tests that are spending a lot of time generating seed data than to understand and fix tests that are purging seed data.
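For anyone who does need to exclude the Test env, here is a minimal sketch of what that guard in seeds.rb could look like. The helper names (seed_environment, seed_static_records, seed_demo_data, run_seeds) are made up for illustration, and the log array only stands in for real record creation. Inside an app, Rails.env gives the environment name; the RAILS_ENV fallback is only there so the sketch runs outside an app.

```ruby
# db/seeds.rb (sketch; helper names are hypothetical)

# Name of the current environment. Uses Rails.env inside an app and falls
# back to the RAILS_ENV variable so the sketch also runs outside one.
def seed_environment
  defined?(Rails) ? Rails.env.to_s : ENV.fetch("RAILS_ENV", "development")
end

# Static records the app needs in every environment.
# A real seeds file would create records here instead of logging.
def seed_static_records(log)
  log << "static records"
end

# Large demo data set that is only useful in development.
def seed_demo_data(log)
  log << "demo data"
end

# Always seed the static records; skip the expensive demo data in test.
def run_seeds(env: seed_environment, log: [])
  seed_static_records(log)
  seed_demo_data(log) unless env == "test"
  log
end
```

A real seeds.rb would end with a call to run_seeds, so the same file can be loaded in every environment without slowing the test suite down.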

I think we indeed have a muddy definition of seed that doesn’t actually fit the original strict view, which makes it difficult to rely on it for something like “run before tests”. In HEY, we ended up using seeds for populating basic demo accounts, rather than the strict idea of just populating records you need to bootstrap the system. It would suck if these were run automatically before every test suite, because it would make the tests a lot slower, and for no gain.

But even without reconciling the stretched meaning of the word seed with how the feature is actually used, I think we could get halfway there by having a clear path in test_helper.rb to set up whether seeds should be loaded or not. It can start commented out, but it should make it obvious how you get seeds loaded before every test run.
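As a rough sketch of what such an opt-in section in test_helper.rb might look like: Rails.application.load_seed is the existing call that runs db/seeds.rb, while the LOAD_SEEDS_BEFORE_TESTS constant is made up for illustration and is not part of Rails. This is one possible shape, not what the Rails template contains.

```ruby
# test/test_helper.rb (excerpt; the usual requires above it are omitted)

# Hypothetical switch: set to true to load db/seeds.rb before the tests run.
LOAD_SEEDS_BEFORE_TESTS = false

# The defined?(Rails) check only keeps this excerpt harmless outside an app.
Rails.application.load_seed if LOAD_SEEDS_BEFORE_TESTS && defined?(Rails)
```

With the switch set to true, the seeds load when test_helper.rb is required, before any test runs. Whether repeated runs duplicate records depends on how seeds.rb is written.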

My personal view is that seeds.rb is for development, and if you want some data loaded by default in your test suite then use fixtures.

I agree that fixtures would be another option. But I have a core set of static records that are necessary in both Dev and Test and I’d like to create them in one place rather than maintain both test/fixtures and db/seeds.rb.

Yes, a clear path in test_helper.rb would be a way to help others quickly figure this out in the future.

There’s no reason why we couldn’t provide a way of loading fixtures from the seeds file as well

I actually call ActiveRecord::FixtureSet.create_fixtures for some of the work in my seeds.rb file. 🙂 But that’s another thing that isn’t easy to know how to do when you’re first editing seeds.rb because the examples recommend using the ActiveRecord create method.

Just wanted to share a solution here for a case we had that was quite similar to the opening post.

In our case we had a lot of fixtures to load for our application’s access and roles system. This was going to make our test suite take 1 hr 10 min in CI instead of 10 min.

So we wanted to look at front-loading the database with fixtures and leveraging the fact that tests run in a transaction to maintain the state of the database between tests.

The end solution was to modify our test script to something like this:

# for test console
RAILS_ENV=test bundle exec rails db:reset
RAILS_ENV=test bundle exec rails db:fixtures:load
bundle exec rails c -e test

# for CI
RAILS_ENV=test bundle exec rails db:setup
RAILS_ENV=test bundle exec rails db:fixtures:load
bundle exec rails t

My two cents on the opening post: adding a comment to test_helper.rb is a great idea for newly generated projects, but for longer-running projects it may also be a good idea to add something to the Rails docs.

In all my years working on many different Rails apps, almost all of them used seeds.rb for development, but only 1 or 2 of them used it for tests.

Personally I prefer a clean setup for tests and therefore usually rely on FactoryBot. When a lot of “base” setup is needed, I use fixtures (I guess those could be replaced by seeds). But I don’t think it is a common pattern.

Big +1 to @Andrew_White’s suggestion of providing a way to load fixtures from seeds.rb. I’ve looked for this. I set up the first half with a db:fixtures:dump task but still want the part that loads them from seeds.

While I don’t seed tests, this would eliminate some duplication between my test fixtures and my seed data. If I had this, I think my seeds would use fewer Array literals and fewer Hash literals, but probably not fewer CSV files.

Does anyone want to step up to champion the “make it easy to load fixtures in seeds.rb” idea?

If so, here’s the process I’d suggest:

Open an issue on the Rails GitHub, linking this discussion, describing the problem, and proposing your desired API. In that issue, make sure to mention that you’re interested in doing the implementation work. Right now, the maintainers are busy enough that they’re not accepting feature requests unless someone volunteers to make the PR.

Here’s a recap of the issue, the workaround we use, and how we use fixtures in Dev.

A. The Use of seeds.rb Differs Across Apps

Some teams run db:seed in all envs to build a small set of records that are needed to bootstrap the app. The Rails Guides say seeds can be used for tests, and CI services like Semaphore suggest using db:setup, which invokes db:seed. Tests will pass until you add a new migration. Then tests will continue to pass on the CI but will fail locally because db:test:prepare will wipe out the seed data.

Some teams only run db:seed in the Dev env to build a very large set of demo data. Running db:seed in the Test env would unnecessarily degrade performance.

B. Possible Workarounds

  1. Adjust db:test:prepare to always call db:seed. This is what our team chose to do.
  2. Call Rails.application.load_seed from tests. It would be easy to add this as a suggestion in test_helper.rb.
  3. Use test fixtures. This can cause duplication between seeds.rb and test/fixtures.
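Workaround 1 can be sketched with Rake’s Task#enhance, which appends an action to an already defined task. The two define_task lines and the TASK_LOG constant are stand-ins so the snippet runs outside an app. In a real project only the enhance call would be needed, presumably in a file under lib/tasks (an assumption about layout, not necessarily how our team did it). Keep in mind that db:seed seeds whichever environment the task runs in, so check this against your own setup.

```ruby
require "rake"

# Records which tasks ran, in order (illustration only).
TASK_LOG = []

# Stand-ins for the tasks Rails normally defines.
Rake::Task.define_task("db:test:prepare") { TASK_LOG << "db:test:prepare" }
Rake::Task.define_task("db:seed") { TASK_LOG << "db:seed" }

# The workaround itself: after db:test:prepare has done its own work,
# invoke db:seed as an extra action.
Rake::Task["db:test:prepare"].enhance do
  Rake::Task["db:seed"].invoke
end
```

Because the block passed to enhance is appended to the task’s actions, the seed step runs after the schema has been loaded rather than before.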

C. Running Fixtures Outside of Tests

We like the fixture structure so we define our static records in db/fixtures instead of test/fixtures so it’s clear they aren’t just for tests. Then our seeds.rb includes:

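# Fixture set names are paths relative to fixtures_dir, without the .yml extension.
fixtures_dir = 'db/fixtures'
names = Dir["#{fixtures_dir}/**/*.yml"].map do |path|
  path.delete_prefix("#{fixtures_dir}/").delete_suffix('.yml')
end
ActiveRecord::FixtureSet.create_fixtures(fixtures_dir, names)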

One downside to fixtures is that they disable referential integrity checks. So if you’re using PostgreSQL you have to give the DB user greater permissions than necessary, at least when you’re first seeding the app.

I’d be happy to see #2. Having db:test:prepare always call db:seed isn’t going to fly; the use cases are too divergent. It’s totally fine for an individual team to do this, of course, and I’m happy to make that easy and documented.

I made a PR for #2: Add instructions on how to use `db/seeds` to `test_helper` template by ghiculescu · Pull Request #46703 · rails/rails · GitHub

The call to ActiveRecord::Tasks::DatabaseTasks.truncate_all is needed because otherwise the test DB gets new (duplicated) seed data every time you run tests again. Unless anyone has better suggestions on how to implement that?

Making this also work with parallel tests is actually a bit more complex. This is the simplest implementation I have got working.

module LoadFixturesFromSeeds
  def load_fixtures(*)
    ActiveRecord::Tasks::DatabaseTasks.truncate_all
    Rails.application.load_seed
  end
end

class ActiveSupport::TestCase
  parallelize(workers: :number_of_processors)
  ActiveRecord::TestFixtures.prepend LoadFixturesFromSeeds
end

But I think that’s a bit too much to live in a comment in the test_helper template.

@DHH do you think you would consider a new feature in Rails for this? It would be neat if you could do

class ActiveSupport::TestCase
  fixtures_from_seeds
end

And it would be equivalent to what’s above.

I fundamentally don’t think it’s a good pattern to have tests rely on your seed data, so I don’t want to sugar coat a path to encourage that. Of course, everyone can run their tests as they see fit, but Rails is not going to pave the road for you on approaches that aren’t deemed to be in the general interest.

So I don’t see us making changes either to the default test_helper.rb or providing something like fixtures_from_seeds. This smells like something that should live in an external gem.

For those following along who do want to try this pattern out, I extracted a working implementation out into a gem: GitHub - ghiculescu/seed-fixtures

Please give it a try and let me know what you think!

I can see this potentially being a benefit for integration / system tests of established systems, where you know what the “foundation” looks like and are mostly testing the edges of the system.