ActiveRecord::Persistence.increment! requires a row lock to ensure isolated updates

Alex_Sharp · January 20, 2013, 11:56pm

The method is here: rails/persistence.rb at main · rails/rails · GitHub.

The method takes the in-memory attribute value and increments it by the specified amount. A safer approach (from an isolation standpoint) would be to let the database determine the value. Instead of telling the database what value to persist in the database, the SQL can written (at least for postgres) so that the database will atomically increment a value:

UPDATE “posts” SET view_count = view_count + 1 WHERE id=123;

Currently, rails generates the following SQL:

UPDATE “posts” SET “view_count” = 3, “updated_at” = ‘2013-01-20 23:20:24.154852’ WHERE “posts”.“id” = 123

It would be great to see a method like this to perform atomic update operations for databases that support it. If there’s support for this, I’m happy to write the patch. Thanks!

Matt_Huggins · January 21, 2013, 2:01pm

Interestingly, ActiveRecord::CounterCache does this the appropriate way.

github.com

rails/rails/blob/main/activerecord/lib/active_record/counter_cache.rb#L72


      
            end
          
          
  unscoped.where(primary_key => object.id).update_all(updates) if updates.any?
          
          
  true
          end
          
          
# A generic "counter updater" implementation, intended primarily to be
          # used by #increment_counter and #decrement_counter, but which may also
          # be useful on its own. It simply does a direct SQL update for the record
          # with the given ID, altering the given hash of counters by the amount
          # given by the corresponding value:
          #
          # ==== Parameters
          #
          # * +id+ - The id of the object you wish to update a counter on or an array of ids.
          # * +counters+ - A Hash containing the names of the fields
          #   to update as keys and the amount to update the field by as values.
          # * <tt>:touch</tt> option - Touch timestamp columns when updating.
          #   If attribute names are passed, they are updated along with updated_at/on
          #   attributes.

Alex_Sharp · January 21, 2013, 7:01pm

Interesting. Hopefully we can get an explanation as to why the increment methods are not done this way, and if the core team would be open to a patch.

Alex

ChuckE · January 22, 2013, 2:54pm

+1. The same update statement would work for MySQL as well.

Just a small thing: don’t forget to take the updated_at timestamp into account in your patch. Just mentioning in case.

Segunda-feira, 21 de Janeiro de 2013 0:56:37 UTC+1, ajsharp escreveu:

carlosantoniodasilva · January 22, 2013, 3:00pm

I may be wrong but that's my understanding: #increment happens at instance level, so it takes into account the current value at that particular instance, whereas .update_counters is just a straight sql query, so it can operate using column + value directly. If you want to allow the database to determine the value, use the latter.

ChuckE · January 22, 2013, 4:15pm

I understand what you mena. However, how would one handle such an increment on concurrent threads/processes? You do have to handle the ambiguity somehow. Delegate the responsibility to the DB sounds reasonable, but some more inputs would be nice.

Segunda-feira, 21 de Janeiro de 2013 0:56:37 UTC+1, ajsharp escreveu:

Alex_Sharp · January 22, 2013, 6:52pm

But increment is inherently subject to concurrency issues, and the only way to safely avoid them is to use a row level lock when incrementing the value, which presents a whole host of complications of its own.

Cheers,

Alex Sharp

Alex_Sharp · January 22, 2013, 7:00pm

Yea, good point, the DB does not return updated value to the client, it just returns "UPDATE 1". Curious if there's a way to alter that behavior at the DB level.

Alex_Sharp · January 22, 2013, 9:05pm

Side note on the updated_at column; the .update_counters method relies on .update_all, so it does not bump the updated_at timestamp. This is not that unexpected since the column timestamping is handled via the .create and .update methods for ActiveRecord::Base. While I agree it seems bad to have public methods changing the record without changing the updated_at (thereby breaking the cache key), doing so would be inconsistent with some similar AR mechanisms already in place.

Yea, I've never *really* understood the AR distinction between things that update the timestamp and things that don't. In my mind, it's a blurry line that is defined somewhere along the lines of "has the model really been updated, or did you just update a counter?".

Given a vacuum though, I'd prefer to bust the cache implicitly when I'm bumping a counter.

By "bust the cache", do you just mean update the counter?

Brendon_Murphy · January 22, 2013, 9:19pm

By “bust the cache”, do you just mean update the counter?

I was referring to updating the updated_at timestamp, as the default ActiveRecord #cache_key integrates that column if available to help ease your cache expiration. In this context I mean a cache of memcache/redis/etc, not an instance attribute cache.

G_Sobrinho · January 23, 2013, 12:03pm

It makes sense!

I have a debt entity in my application and payments this entity can happen three or more times in parallel (stupid brazilian banks, don’t ask me why they do it).

Since I have to keep a cache column of the paid value for the debt, I have 25 workers (sidekiq) that can call increment!(:paid_value, paid_value) and it should keep the total paid value.

I’m not sure if it will be a problem to happen something like that:

debt.paid_value

=> 100.0

debt.increment!(:paid_value, 100.0)

=> 300.0

We can have problems if other columns relies on that value.

Let suppose I have another column called paid_interest and with the paid_value, I get the total paid:

debt.paid_value

=> 100.0

debt.paid_interest

=> 10.0

debt.total_paid

=> 110.0

debt.increment!(:paid_value, 100.0)

=> 300.0

debt.total_paid

=> 310.0

The paid_interest in this case is updated in another process.

It’s a contrived example but if that’s the case, I will need to reload the entire record to get the right total paid.

At least some documentation telling that increment and increment! are subject to race conditions is needed.

Matt_Jones · January 23, 2013, 6:21pm

I believe the suggested solution in this case is to use optimistic locking (via a lock_version column) and then handle collisions manually.

--Matt Jones

G_Sobrinho · January 24, 2013, 12:41am

Yes, I’m thinking about that and seems there is no way to handle this in a smart way.

Developers can think “from what reason I’ve incremented by 100 and it incremented by 300” until he figure out that was incremented by other threads too.

Maybe some documentation on increment(!)/decrement(!) methods explaining that is not safe to use it concurrently will avoid to think they are.

Alex_Sharp · January 24, 2013, 1:28am

Since I have to keep a cache column of the paid value for the debt, I have 25 workers (sidekiq) that can call `increment!(:paid_value, >> paid_value)` and it should keep the total paid value.

[SNIP]

Yes, I'm thinking about that and seems there is no way to handle this in a smart way.

Developers can think "from what reason I've incremented by 100 and it incremented by 300" until he figure out that was incremented by other threads too.

Gabriel: I may be confused about your use case, but it sounds like your concern is around ensuring that the database client that makes an update can rely on the fact that his update was the only one made. Or, at least, your application has an "audit" concern, where, you need to be able to prove how values were mutated. There are a couple of ways to handle this, optimistic locking is one, row-level locks are another, or designing a double-entry transaction model is another (so you don't rely on locks at all).

If this is indeed your use case, what I'm advocating here is much simpler than that. I just want to be able to increment numeric columns in active record without declaring them as counter cache columns.

Maybe some documentation on increment(!)/decrement(!) methods explaining that is not safe to use it concurrently will avoid to think they are.

That'd certainly be helpful, but the same logic applies to any method you use. Any sort of CRUD application is subject to concurrent access and update problems. Are you simply suggesting documenting #increment!/#decrement! because they *don't* behave in the way that I'm advocating for in this thread?

carlosantoniodasilva · January 24, 2013, 2:52am

If this is indeed your use case, what I’m advocating here is much simpler than that. I just want to be able to increment numeric columns in active record without declaring them as counter cache columns.

You should still be able to do that by calling increment_counter or update_counters class methods, they’re generic methods for updating “counter columns”, which mean any numeric column should work, they’re not bound to counter cache columns (even though the filename kinda implies that).

object.class.increment_counter(:attr_name, object.id)

object.class.update_counters(object.id, :attr_name => 1)

I believe that should work fine for this purpose.

G_Sobrinho · January 24, 2013, 12:02pm

Carlos,

I’m suggesting to document this because the increment and decrements are intended to increment and decrement instead of replace the value, like the update_attributes method does.

I expect to call increment in a object and the value get incremented by the given value instead of replacing the database value, like this:

x = Debt.find(1)

y = Debt.find(1)

x.value

#=> 100

y.value

#=> 100

x.increment!(:value, 100)

#=> 200

y.increment!(:value, 50)

#=> 150

x.reload

x.value

#=> 150

I’m not sure which behavior is better but I would expect to get 250 on x instance because I incremented by 100 in first time and after incremented by 50.

One solution:

x.increment!(:value, 100)

#=> 200

y.increment!(:value, 50)

#=> 250

But the y instance need to be reloaded before or after incrementing.

Both are weird behaviors, I would not expect my object to be reloaded nor replace the value instead of incrementing.

I can avoid this problem using row lock but how will the developer discover that row lock is needed to keep increment/decrement concurrently safe? That’s the point!

I’ve always been ok with this but when I saw another developer reporting the same problem, I figured that this deduction is not just me.

One line about and will be happy, like this one:

“This method is not safe to update multiple instances at same time. You will need to use some lock like pessimistic lock, optimistic lock or row lock to avoid this problem.”

Do you think something like this is acceptable?

carlosantoniodasilva · January 24, 2013, 12:10pm

This is true for all save/update methods, not only for increment/decrement, because they act on the current instance values, so after you load them from the database, you’re fated to work on these values you have. As you said, if you’re working with a possible concurrent update, you should use some lock and handle it. Adding documentation to talk briefly about locking should be fine I guess.

G_Sobrinho · January 24, 2013, 12:19pm

Yes, I figured out right now that is not about concurrency safe but it’s about multiple instances.

I just assumed that increment will increment instead of replacing the database values, which I expect in save/update.

Thanks!

Topic		Replies	Views
atomic INSERT or UPDATE with activerecord rubyonrails-talk	3	455	November 9, 2009
ActiveRecord Increment Method rubyonrails-core activerecord	0	297	April 2, 2019
Multiple instances adding to the same attribute. rubyonrails-talk	1	148	June 11, 2007
counter implementation rubyonrails-talk	9	121	September 10, 2008
counter_cache won't increment rubyonrails-talk	0	92	April 1, 2007

ActiveRecord::Persistence.increment! requires a row lock to ensure isolated updates

=> 100.0

=> 300.0

=> 100.0

=> 10.0

=> 110.0

=> 300.0

=> 310.0

Related topics

More Resources