[Proposal] SQLite3 full-text-search (and other virtual tables): Do not dump shadow/virtual tables

Hi, I’d like feedback on a patch, before I open a PR:

Using the builtin full-text-search module (FTS5) in SQLite messes up schema.rb, since neither the virtual table or the shadow tables can be properly handled by the schema dumper. Running CREATE VIRTUAL TABLE test USING fts5(content) in a migration, produces the following in schema.rb:

# Could not dump table "test" because of following StandardError
#   Unknown type '' for column 'content'

# Could not dump table "test_config" because of following StandardError
#   Unknown type '' for column 'k'

# Could not dump table "test_content" because of following StandardError
#   Unknown type '' for column 'c0'

  create_table "test_data", force: :cascade do |t|
    t.binary "block"
  end

  create_table "test_docsize", force: :cascade do |t|
    t.binary "sz"
  end

# Could not dump table "test_idx" because of following StandardError
#   Unknown type '' for column 'segid'

Would you be interested in a PR that ignores virtual/shadow tables during schema dump for SQLite? I’ve already got it working locally using:

# activerecord/lib/active_record/connection_adapters/sqlite3/schema_statements.rb
# L175
def data_source_sql(name = nil, type: nil)
  scope = quoted_scope(name, type: type)
  scope[:type] ||= "'table','view'"

  sql = +"SELECT name FROM pragma_table_list WHERE schema <> 'temp'"
  sql << " AND name NOT IN ('sqlite_sequence', 'sqlite_schema')"
  sql << " AND name = #{scope[:name]}" if scope[:name]
  sql << " AND type IN (#{scope[:type]})"
  sql
end

The only tests that do no pass, are two tests that specifically check that the old query was being logged.

Best regards, Zacharias

Hi Zacharias, thanks for bring this subject up.

What you’re suggesting is probably a small improvement over what currently exists, but to me it still leaves FTS support in a very weird state.

For casual readers, I’ll first give a bit of background explanation about SQLite3’s FTS5 tables. If you create a virtual fts table “foo_fts5” in a migration:

class CreateFts5Table < ActiveRecord::Migration[7.1]
  def change
    execute 'create virtual table foo_fts5 using fts5(content)'
  end
end

and run rails db:migrate, then from the Rails console you can see the related tables of type “shadow” that are created:

> db = SQLite3::Database.open("storage/development.sqlite3")

> db.execute("select * from pragma_table_list order by name")
=>
[["main", "ar_internal_metadata", "table", 4, 0, 0],
 ["main", "foo_fts5", "virtual", 3, 0, 0],
 ["main", "foo_fts5_config", "shadow", 2, 1, 0],
 ["main", "foo_fts5_content", "shadow", 2, 0, 0],
 ["main", "foo_fts5_data", "shadow", 2, 0, 0],
 ["main", "foo_fts5_docsize", "shadow", 2, 0, 0],
 ["main", "foo_fts5_idx", "shadow", 3, 1, 0],
 ["main", "posts", "table", 6, 0, 0],
 ["main", "schema_migrations", "table", 1, 0, 0],
 ["main", "sqlite_schema", "table", 5, 0, 0],
 ["main", "sqlite_sequence", "table", 2, 0, 0],
 ["temp", "sqlite_temp_schema", "table", 5, 0, 0]]

and then schema.rb is a bit of a mess, with error messages for some of those tables (including the virtual table itself) but keeping two of the shadow tables:

ActiveRecord::Schema[7.1].define(version: 2024_07_15_205331) do
# Could not dump table "foo_fts5" because of following StandardError
#   Unknown type '' for column 'content'

# Could not dump table "foo_fts5_config" because of following StandardError
#   Unknown type '' for column 'k'

# Could not dump table "foo_fts5_content" because of following StandardError
#   Unknown type '' for column 'c0'

  create_table "foo_fts5_data", force: :cascade do |t|
    t.binary "block"
  end

  create_table "foo_fts5_docsize", force: :cascade do |t|
    t.binary "sz"
  end

# Could not dump table "foo_fts5_idx" because of following StandardError
#   Unknown type '' for column 'segid'

  create_table "posts", force: :cascade do |t|
    t.string "title"
    t.text "body"
    t.boolean "published"
    t.datetime "created_at", null: false
    t.datetime "updated_at", null: false
  end

end

And of course this puts the app into a bad state if you do rails db:drop ; rails db:schema:load because then you’ll see that those shadow tables are no longer shadow tables:

> db.execute("select * from pragma_table_list order by name")
=>
[["main", "ar_internal_metadata", "table", 4, 0, 0],
 ["main", "foo_fts5_data", "table", 2, 0, 0],
 ["main", "foo_fts5_docsize", "table", 2, 0, 0],
 ["main", "posts", "table", 6, 0, 0],
 ["main", "schema_migrations", "table", 1, 0, 0],
 ["main", "sqlite_schema", "table", 5, 0, 0],
 ["main", "sqlite_sequence", "table", 2, 0, 0],
 ["temp", "sqlite_temp_schema", "table", 5, 0, 0]]

Zacharias, what you’re suggesting would (if I’m understanding correctly) omit all of the tables from schema.rb … but in this case, the schema is still unusable because that FTS table won’t be recreated if someone runs db:schema:load. What do you think about the particular use case I’m describing? Can you say more about your workflow and how you’re handling database recreation (I assume you’re always using the migrations?)?

Or, put another way, is this change worth making even if Rails still doesn’t have first-class FTS5 support?