ActiveRecord::Persistence.increment! requires a row lock to ensure isolated updates

The method is here: https://github.com/rails/rails/blob/master/activerecord/lib/active_record/persistence.rb#L288.

The method takes the in-memory attribute value and increments it by the specified amount. A safer approach (from an isolation standpoint) would be to let the database determine the value. Instead of telling the database what value to persist in the database, the SQL can written (at least for postgres) so that the database will atomically increment a value:

UPDATE “posts” SET view_count = view_count + 1 WHERE id=123;

Currently, rails generates the following SQL:

UPDATE “posts” SET “view_count” = 3, “updated_at” = ‘2013-01-20 23:20:24.154852’ WHERE “posts”.“id” = 123

It would be great to see a method like this to perform atomic update operations for databases that support it. If there’s support for this, I’m happy to write the patch. Thanks!

Interestingly, ActiveRecord::CounterCache does this the appropriate way.

https://github.com/rails/rails/blob/master/activerecord/lib/active_record/counter_cache.rb#L72

Interesting. Hopefully we can get an explanation as to why the increment methods are not done this way, and if the core team would be open to a patch.

  • Alex

+1. The same update statement would work for MySQL as well.

Just a small thing: don’t forget to take the updated_at timestamp into account in your patch. Just mentioning in case.

Segunda-feira, 21 de Janeiro de 2013 0:56:37 UTC+1, ajsharp escreveu:

I may be wrong but that's my understanding: #increment happens at instance
level, so it takes into account the current value at that particular
instance, whereas .update_counters is just a straight sql query, so it can
operate using column + value directly. If you want to allow the database to
determine the value, use the latter.

I understand what you mena. However, how would one handle such an increment on concurrent threads/processes? You do have to handle the ambiguity somehow. Delegate the responsibility to the DB sounds reasonable, but some more inputs would be nice.

Segunda-feira, 21 de Janeiro de 2013 0:56:37 UTC+1, ajsharp escreveu:

But increment is inherently subject to concurrency issues, and the only way to safely avoid them is to use a row level lock when incrementing the value, which presents a whole host of complications of its own.

Cheers,

Alex Sharp

Yea, good point, the DB does not return updated value to the client, it
just returns "UPDATE 1". Curious if there's a way to alter that behavior at
the DB level.

Side note on the updated_at column; the .update_counters method relies on
.update_all, so it does not bump the updated_at timestamp. This is not
that unexpected since the column timestamping is handled via the .create
and .update methods for ActiveRecord::Base. While I agree it seems bad to
have public methods changing the record without changing the updated_at
(thereby breaking the cache key), doing so would be inconsistent with some
similar AR mechanisms already in place.

Yea, I've never *really* understood the AR distinction between things that
update the timestamp and things that don't. In my mind, it's a blurry line
that is defined somewhere along the lines of "has the model really been
updated, or did you just update a counter?".

Given a vacuum though, I'd prefer to bust the cache implicitly when I'm
bumping a counter.

By "bust the cache", do you just mean update the counter?

By “bust the cache”, do you just mean update the counter?

I was referring to updating the updated_at timestamp, as the default ActiveRecord #cache_key integrates that column if available to help ease your cache expiration. In this context I mean a cache of memcache/redis/etc, not an instance attribute cache.

It makes sense!

I have a debt entity in my application and payments this entity can happen three or more times in parallel (stupid brazilian banks, don’t ask me why they do it).

Since I have to keep a cache column of the paid value for the debt, I have 25 workers (sidekiq) that can call increment!(:paid_value, paid_value) and it should keep the total paid value.

I’m not sure if it will be a problem to happen something like that:

debt.paid_value

=> 100.0

debt.increment!(:paid_value, 100.0)

=> 300.0

We can have problems if other columns relies on that value.

Let suppose I have another column called paid_interest and with the paid_value, I get the total paid:

debt.paid_value

=> 100.0

debt.paid_interest

=> 10.0

debt.total_paid

=> 110.0

debt.increment!(:paid_value, 100.0)

=> 300.0

debt.total_paid

=> 310.0

The paid_interest in this case is updated in another process.

It’s a contrived example but if that’s the case, I will need to reload the entire record to get the right total paid.

At least some documentation telling that increment and increment! are subject to race conditions is needed.

I believe the suggested solution in this case is to use optimistic locking (via a lock_version column) and then handle collisions manually.

--Matt Jones

Yes, I’m thinking about that and seems there is no way to handle this in a smart way.

Developers can think “from what reason I’ve incremented by 100 and it incremented by 300” until he figure out that was incremented by other threads too.

Maybe some documentation on increment(!)/decrement(!) methods explaining that is not safe to use it concurrently will avoid to think they are.

Since I have to keep a cache column of the paid value for the debt, I have 25 workers (sidekiq) that can call `increment!(:paid_value, >> paid_value)` and it should keep the total paid value.

[SNIP]

Yes, I'm thinking about that and seems there is no way to handle this in a
smart way.

Developers can think "from what reason I've incremented by 100 and it
incremented by 300" until he figure out that was incremented by other
threads too.

Gabriel: I may be confused about your use case, but it sounds like
your concern is around ensuring that the database client that makes an
update can rely on the fact that his update was the only one made. Or,
at least, your application has an "audit" concern, where, you need to
be able to prove how values were mutated. There are a couple of ways
to handle this, optimistic locking is one, row-level locks are
another, or designing a double-entry transaction model is another (so
you don't rely on locks at all).

If this is indeed your use case, what I'm advocating here is much
simpler than that. I just want to be able to increment numeric columns
in active record without declaring them as counter cache columns.

Maybe some documentation on increment(!)/decrement(!) methods explaining
that is not safe to use it concurrently will avoid to think they are.

That'd certainly be helpful, but the same logic applies to any method
you use. Any sort of CRUD application is subject to concurrent access
and update problems. Are you simply suggesting documenting
#increment!/#decrement! because they *don't* behave in the way that
I'm advocating for in this thread?

If this is indeed your use case, what I’m advocating here is much
simpler than that. I just want to be able to increment numeric columns
in active record without declaring them as counter cache columns.

You should still be able to do that by calling increment_counter or update_counters class methods, they’re generic methods for updating “counter columns”, which mean any numeric column should work, they’re not bound to counter cache columns (even though the filename kinda implies that).

object.class.increment_counter(:attr_name, object.id)

object.class.update_counters(object.id, :attr_name => 1)

I believe that should work fine for this purpose.

Carlos,

I’m suggesting to document this because the increment and decrements are intended to increment and decrement instead of replace the value, like the update_attributes method does.

I expect to call increment in a object and the value get incremented by the given value instead of replacing the database value, like this:

x = Debt.find(1)

y = Debt.find(1)

x.value

#=> 100

y.value

#=> 100

x.increment!(:value, 100)

#=> 200

y.increment!(:value, 50)

#=> 150

x.reload

x.value

#=> 150

I’m not sure which behavior is better but I would expect to get 250 on x instance because I incremented by 100 in first time and after incremented by 50.

One solution:

x.increment!(:value, 100)

#=> 200

y.increment!(:value, 50)

#=> 250

But the y instance need to be reloaded before or after incrementing.

Both are weird behaviors, I would not expect my object to be reloaded nor replace the value instead of incrementing.

I can avoid this problem using row lock but how will the developer discover that row lock is needed to keep increment/decrement concurrently safe? That’s the point! :wink:

I’ve always been ok with this but when I saw another developer reporting the same problem, I figured that this deduction is not just me.

One line about and will be happy, like this one:

“This method is not safe to update multiple instances at same time. You will need to use some lock like pessimistic lock, optimistic lock or row lock to avoid this problem.”

Do you think something like this is acceptable?

This is true for all save/update methods, not only for increment/decrement, because they act on the current instance values, so after you load them from the database, you’re fated to work on these values you have. As you said, if you’re working with a possible concurrent update, you should use some lock and handle it. Adding documentation to talk briefly about locking should be fine I guess.

Yes, I figured out right now that is not about concurrency safe but it’s about multiple instances.

I just assumed that increment will increment instead of replacing the database values, which I expect in save/update.

Thanks! :slight_smile: