New feature to make introducing counter caches safer and easier

Counter caches are used to avoid costly SQL queries such as parent.children.any?, parent.children.none?, and parent.children.count, or to query on the counter column directly.

It is very easy to add a counter cache column to a new table: just add a column.

It is also easy to add a counter cache column to an existing small table: add the column, also lock the child table (so no new records are added), and backfill the column in a single transaction. This does not take long and is safe.

Adding a counter cache to a big table is problematic. It should be done in multiple steps.

One way:

  1. add a counter column to the parent table
  2. add triggers to the child table to update the parent table when records are created/deleted
  3. backfill the column, then add counter_cache: true to the belongs_to association in the child model
  4. delete the triggers

Or we can change step 2 to use a child model callback instead.

But in either case, this makes introducing counter caches more complex than it needs to be and many people (myself included some years ago) just don’t think about (aren’t aware of) this problem and do it the straightforward (incorrect) way.

So I propose a new option, something like counter_cache: { filled: false }, which stops the column from being used to optimize COUNT-based queries (like the examples at the beginning) and maybe raises an error when trying to access it via record.somethings_count.
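To make the intended behaviour concrete, here is a minimal plain-Ruby sketch. It is not Active Record code and not how Rails would implement this: ParentRecord, add_child and the filled: keyword are hypothetical names used only for illustration. The counter is always incremented on writes, but size only trusts it once filled is true.

```ruby
# Plain-Ruby model of the proposed semantics (hypothetical names throughout).
class ParentRecord
  attr_reader :children_count

  def initialize(existing_children:, filled:)
    @children = existing_children.dup
    @children_count = 0   # a freshly added column starts at its default
    @filled = filled      # stands in for counter_cache: { filled: ... }
  end

  # Writes always maintain the counter, filled or not.
  def add_child(child)
    @children << child
    @children_count += 1
  end

  # Stands in for the choice between reading the column and running COUNT(*).
  def size
    @filled ? @children_count : @children.length
  end
end

unfilled = ParentRecord.new(existing_children: %w[a b c], filled: false)
unfilled.add_child("d")
puts unfilled.size            # 4: counts the rows, ignores the stale counter
puts unfilled.children_count  # 1: the counter has only seen the new child

# The straightforward (incorrect) way: trusting the column before it is backfilled.
naive = ParentRecord.new(existing_children: %w[a b c], filled: true)
naive.add_child("d")
puts naive.size               # 1: wrong, three existing children are missed
```

The point of the flag is only to separate "keep the counter up to date" from "read the counter", so the backfill can catch up in between.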

With this change, the new process for adding a counter cache will be like the following:

  1. add a column
  2. add counter_cache: { filled: false }
  3. backfill
  4. remove filled: false, so now the app uses this column
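The four steps can be walked through with a small in-memory simulation. This is only a sketch under assumptions: CounterCacheRollout and its methods are made-up names, the "tables" are plain arrays and hashes, and everything runs sequentially, so it illustrates the ordering of the steps rather than proving anything about real database concurrency.

```ruby
# In-memory stand-in for a parent table with a new counter column (hypothetical).
class CounterCacheRollout
  def initialize(child_rows)
    @child_rows = child_rows.dup  # one parent id per existing child row
    @counts = Hash.new(0)         # step 1: new column, default 0
    @filled = false               # step 2: counter_cache: { filled: false }
  end

  # Child creation keeps incrementing the counter during the whole rollout.
  def create_child(parent_id)
    @child_rows << parent_id
    @counts[parent_id] += 1
  end

  # Step 3: backfill in batches; the block lets the caller interleave writes.
  def backfill(parent_ids, batch_size: 2)
    parent_ids.each_slice(batch_size) do |batch|
      batch.each { |id| @counts[id] = @child_rows.count(id) }
      yield batch if block_given?
    end
  end

  # Step 4: remove filled: false, so reads start using the column.
  def mark_filled!
    @filled = true
  end

  def size(parent_id)
    @filled ? @counts[parent_id] : @child_rows.count(parent_id)
  end
end

rollout = CounterCacheRollout.new([1, 1, 2, 3, 3, 3])
rollout.backfill([1, 2, 3], batch_size: 2) do |batch|
  next unless batch == [1, 2]
  rollout.create_child(1)  # parent 1 was already backfilled
  rollout.create_child(3)  # parent 3 has not been backfilled yet
end
rollout.mark_filled!
puts [1, 2, 3].map { |id| rollout.size(id) }.inspect  # [3, 1, 4]
```

In both interleaved cases the counter ends up matching the real number of rows: a write after the backfill increments a correct value, and a write before it is simply overwritten by the recount.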

Nice and simple: no triggers, no callbacks, and less chance of a race condition.

Wdyt? I would like to introduce this feature into Rails.

@byroot Interested in your opinion on this. Does Shopify use Rails’ counter caches, and how do you solve the problem mentioned above?

I need to run so I can’t give a proper long answer.

But long story short, I wouldn’t do it exactly like that, but yes, a way to have counters “active”, as in incremented / decremented, but not used (e.g. .size doesn’t consider them) would be useful to help the backfill.

Yes, so please give a longer answer when you have time :)

So the longer answer is that at Shopify we basically do what you propose, we just don’t have much help for it: you have to be careful not to use any API that may read the counter cache while it’s backfilling. In other words, it’s very brittle and easy to shoot yourself in the foot, so I would welcome such a feature.

So I encourage you to start a PR, we can iterate on it.

PR: Add the ability to ignore counter cache columns while they are backfilling (rails/rails Pull Request #51453, by fatkodima)