Failed write transaction upgrades in SQLite3

Context: transactions in SQLite

The default SQLite transaction is a read transaction, and many connections are allowed to read concurrently. As soon as SQLite encounters a write operation (e.g. INSERT, UPDATE or DELETE) it automatically tries to upgrade the transaction to a write transaction, which requires an exclusive lock, because SQLite only allows a single writer at any time.

Whenever SQLite fails to upgrade a read transaction to a write transaction, it immediately fails with SQLITE_BUSY, which appears in Ruby as SQLite3::BusyException: database is locked. It will not use the busy timeout to wait for the blocking write transaction to finish. Why is that? Because the data you have just read might already be stale, and SQLite does not allow writes based on stale reads, as they could result in inconsistent data.
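
To make this concrete, here is a minimal sketch using the sqlite3 gem directly (the file name, table, and timeout values are arbitrary, and it assumes the default rollback-journal mode): one connection holds the write lock while a second connection reads inside a deferred transaction and then attempts a write.

require "sqlite3"

db1 = SQLite3::Database.new("demo.db")
db2 = SQLite3::Database.new("demo.db")
db1.execute("CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, name TEXT)")

# A generous timeout on both connections; it will not help below.
[db1, db2].each { |db| db.execute("PRAGMA busy_timeout = 5000") }

# Connection 1 takes the write lock and keeps its transaction open.
db1.transaction(:immediate)
db1.execute("INSERT INTO items (name) VALUES ('held by db1')")

# Connection 2 starts a deferred (read) transaction and reads...
db2.transaction
db2.execute("SELECT COUNT(*) FROM items")

begin
  # ...and the attempted upgrade to a write lock fails immediately,
  # without waiting for the busy_timeout.
  db2.execute("INSERT INTO items (name) VALUES ('from db2')")
rescue SQLite3::BusyException => e
  puts "Upgrade failed without waiting: #{e.message}"
ensure
  db2.rollback
  db1.rollback
end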

Why is this a problem?

Rails sometimes issues several read operations before performing the write operation requested by the application code. These reads fetch metadata from the database: the database version, table schemas and/or which indexes are available. When Rails opens a transaction - e.g. because of model callbacks - these read operations are the first statements executed within that transaction, meaning the transaction starts off as a read transaction and any subsequent write requires an upgrade. In practice the metadata is usually cached, so the errors don't occur very often, but even in our small business application, which has perhaps a handful of users every day, I see about 3-5 of these errors per day.
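
As an illustration (the model names here are made up), any callback that reads before the write produces the same read-then-write pattern inside a single transaction, even when all metadata is cached:

class Order < ActiveRecord::Base
  before_save :check_inventory

  private

  def check_inventory
    # This SELECT is the first statement inside the transaction that
    # Order.create! opens, so SQLite starts a read transaction here.
    raise "out of stock" if Inventory.where(sku: sku).sum(:quantity) < quantity
  end
end

Order.create!(sku: "X1", quantity: 2)
# BEGIN              <- deferred: starts as a read transaction
# SELECT SUM(...)    <- read lock acquired
# INSERT INTO orders <- upgrade to write lock; SQLite3::BusyException
#                       if another connection is writing at this moment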

Parallel testing

Our app has a couple of hundred tests, and I also occasionally run into this issue while running the tests in parallel. Other people seem to run into it as well: Regression with parallel tests connected to SQLite in Rails 6.1? · Issue #38104 · rails/rails · GitHub. Occasional test failures like these are very unpredictable and hard to diagnose, but I believe what these people are describing is caused by this issue.

How to reproduce

There is a somewhat related GitHub issue where I did some investigation. It also contains code to reproduce this problem: ActiveRecord + SQLite3 does not handle SQLite3::BusyException correctly within transactions · Issue #13908 · rails/rails · GitHub

How to avoid failed write lock upgrades

This is only an issue when a transaction starts with a read operation that is later followed by a write operation. When the first operation inside a transaction is a write operation, the issue does not occur: SQLite will simply wait for the write lock to be released before continuing (it does use the timeout setting here, so it can still time out and give you the same error). An example follows below.
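
For example (reusing the items table from the sketch above), a transaction whose first statement is a write waits for the lock up to the configured timeout instead of failing right away:

db = SQLite3::Database.new("demo.db")
db.execute("PRAGMA busy_timeout = 5000")  # wait up to 5 seconds for the lock

# Deferred transaction, but the FIRST statement is a write: SQLite waits
# for the write lock (honoring busy_timeout) instead of raising immediately.
db.transaction do
  db.execute("INSERT INTO items (name) VALUES ('write comes first')")
  db.execute("SELECT COUNT(*) FROM items")  # later reads are fine
end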

SQLite transactions are started in “deferred” mode by default. This is the behavior described above, where SQLite opens a read lock and only upgrades it to a write lock when it encounters a write operation. It is also possible to open a transaction in “immediate” mode, which means SQLite will try to obtain the write lock before executing any statements. For more information, see the SQLite documentation on transactions.

The sqlite3-ruby gem accepts a mode argument for the #transaction method. When I set this to immediate by default for all transactions, I no longer see any busy errors. There doesn’t appear to be any noticeable difference in latency for our application.
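
For reference, this is what the mode argument looks like when using the sqlite3 gem directly; the equivalent SQL is BEGIN IMMEDIATE TRANSACTION:

db = SQLite3::Database.new("demo.db")
db.execute("PRAGMA busy_timeout = 5000")

# BEGIN IMMEDIATE TRANSACTION: take the write lock up front (honoring
# busy_timeout), so the read-then-write upgrade can never fail mid-way.
db.transaction(:immediate) do
  count = db.get_first_value("SELECT COUNT(*) FROM items")
  db.execute("INSERT INTO items (name) VALUES (?)", ["item #{count + 1}"])
end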

Writing a fix

I was planning to write a fix for this so that everyone using Rails + SQLite could deal with this issue. However, I’m not sure what a good approach would be:

  • Allowing the application code to explicitly pass the mode argument to a transaction block is not an option. It might help for explicit transaction blocks, but it will not help when Rails checks the database metadata. It would be annoying to have to remember it, and having database-specific arguments in Rails seems like a bad idea. It would also make migrating to another database a bit awkward.
  • Making sure that Rails only issues metadata queries in a separate transaction might be a lot of work, and would change behavior for people using other databases as well. I think it would be too risky. And you would still run into the issue with explicit transaction blocks.
  • The easiest solution is to always open every transaction in immediate mode (as I did). While this solves the issue, read-only transactions might then block without good reason. There can be good reasons to do several read operations inside a single transaction, though I think this is not very common - I might be wrong! I think setting the transaction mode in the database configuration might be a good compromise (see the sketch after this list). It would mean no changes to any application code, just to configuration. There would still be a potential performance cost, but you would be aware of it because the documentation could point it out.

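To sketch the third option: the transaction_mode key below is made up (no such setting exists in Rails at the time of writing); it only illustrates what configuration-level control could look like, whether in database.yml or an establish_connection call:

# Hypothetical: no such option exists in Rails today. This only sketches
# how a configurable transaction mode could look if the adapter supported it.
ActiveRecord::Base.establish_connection(
  adapter:  "sqlite3",
  database: "db/development.sqlite3",
  transaction_mode: :immediate  # made-up key: open every transaction as BEGIN IMMEDIATE
)
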
Perhaps there are other ideas I did not come up with.

I do think it would be a good thing to deal with this issue in Rails one way or another, as it basically means that SQLite becomes unreliable for production use with Rails. That’s a shame, because SQLite is a solid database that is used in production more than any other database in the world.


I am very interested in your findings, as I maintain a data-related gem, The Brick. It spins up a Rails application from any database, offering ERD diagrams, the ability to expose an API, tools to search for orphaned data, etc. You don’t have to create any scaffolded files for this – it all happens in RAM. This makes the gem useful for quickly looking inside SQLite files to see what’s there, making data updates, creating migration files from a database, and so forth.

Based on your findings I’ll update The Brick so that SQLite transactions are always opened in immediate mode.

Automatically using IMMEDIATE mode for all SQLite transactions has now been added in v1.0.75 of The Brick. If you drop that gem into any project it will work (and also do lots of additional stuff … perhaps more than you really wanted…), so if you only want this feature, here is the relevant code, compatible with ActiveRecord 4.2 - 7.1:

# Overwrite SQLite's #begin_db_transaction so it opens in IMMEDIATE
# mode instead of the default DEFERRED mode.  More info here:
#   https://discuss.rubyonrails.org/t/failed-write-transaction-upgrades-in-sqlite3/81480/2
if ActiveRecord::Base.connection.adapter_name == 'SQLite'
  arca = ::ActiveRecord::ConnectionAdapters
  db_statements = arca::SQLite3::DatabaseStatements
  # Rails 7.1 and later (detected by the presence of #with_raw_connection)
  if arca::AbstractAdapter.private_instance_methods.include?(:with_raw_connection)
    db_statements.define_method(:begin_db_transaction) do
      log("begin immediate transaction", "TRANSACTION") do
        with_raw_connection(allow_retry: true, uses_transaction: false) do |conn|
          conn.transaction(:immediate)
        end
      end
    end
  else # Rails < 7.1
    db_statements.define_method(:begin_db_transaction) do
      log('begin immediate transaction', 'TRANSACTION') { @connection.transaction(:immediate) }
    end
  end
end

I think opening an IMMEDIATE transaction could itself result in another SQLITE_BUSY error when there is already another write transaction in progress, so you would have to implement retries on opening an immediate transaction (a minimal sketch follows below). You could also retry the entire transaction on failure (and keep using DEFERRED mode), but that could be a lot more expensive when the database runs into a lock.
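
A minimal retry sketch (the helper name, attempt count, and backoff values are all made up for illustration) could look like this:

MAX_ATTEMPTS = 10

# Hypothetical helper: try to open an IMMEDIATE transaction, retrying on
# SQLITE_BUSY with a simple linear backoff before giving up.
def begin_immediate_with_retry(db, attempts: MAX_ATTEMPTS)
  tries = 0
  begin
    db.transaction(:immediate)
  rescue SQLite3::BusyException
    tries += 1
    raise if tries >= attempts
    sleep(0.05 * tries)  # back off a little longer on each attempt
    retry
  end
end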

The more I look into this, the more I am convinced that SQLite isn’t a good database for production usage in Rails, at least not when you expect a high volume of write transactions. I have often seen expensive operations in model callbacks, meaning a transaction can stay open for a relatively long time while Rails is busy doing something other than executing database statements. This is practically a non-issue with databases that lock at a finer granularity than the entire database (e.g. row-level locking).

If you design your transactions carefully so they execute quickly, you can achieve a high write throughput with Rails and SQLite and avoid running a more complicated database setup for a long time. However, I expect most people would rather use PostgreSQL or MySQL and avoid having to think about the size of their transactions.

Until seeing your post I had only tested The Brick with SQLite in the simplest of configurations, using the Northwind sample data. And while I’ve seen many read-intensive sites use this database in production, I only consider Postgres / MSSQL / Oracle when setting up a back end, so I had somewhat written off SQLite.

I might further test The Brick with heavier load and multiple connections, and see if I can get it to fail. If so, I’ll probably add an auto-retry-the-transaction-ten-times kind of thing. I’m hoping for this gem to be a simple drop-in solution for anyone wanting to quickly build complex data solutions, striving to take away any and all guesswork and have something that just works – every single time.