RFC: new ActiveRecord hook after_load_schema

Background:

Years ago, while working on one of the acts-as-taggable gems, I discovered that certain method calls on ActiveRecord classes rely on a database connection. Specifically, any method that first checks which columns are defined, or reads any other column-related information, requires a database connection. So you could override or alias the columns method to add behavior that runs when columns is first evaluated. I’ve since learned that the actual hook method at this time is load_schema.

Proposal:

Add an after_load_schema hook to ActiveRecord::Base which can be defined per subclass.
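
To make the intended semantics concrete, here is a minimal plain-Ruby sketch of how such a hook could behave. It does not use ActiveRecord at all: FakeRecord, Tag, the hard-coded column list, and the after_load_schema registration method are all stand-ins I made up for illustration, not the API being proposed in any final form.

```ruby
# Plain-Ruby stand-in for a model class; none of this is ActiveRecord code.
class FakeRecord
  class << self
    # Each subclass keeps its own hook list, so hooks are defined per subclass.
    def after_load_schema_hooks
      @after_load_schema_hooks ||= []
    end

    # Hypothetical registration method, named after the proposal.
    def after_load_schema(&block)
      after_load_schema_hooks << block
    end

    def schema_loaded?
      @schema_loaded == true
    end

    # Stand-in for any column-related call that needs the schema.
    def columns
      load_schema
      @columns
    end

    private

    def load_schema
      return if schema_loaded?

      @columns = %w[id type name] # pretend this came from the database
      @schema_loaded = true
      # Run each hook once, with the class itself as self.
      after_load_schema_hooks.each { |hook| class_exec(&hook) }
    end
  end
end

class Tag < FakeRecord
  after_load_schema { (@events ||= []) << :schema_loaded }

  def self.events
    @events ||= []
  end
end

Tag.events   # nothing has touched the schema yet
Tag.columns  # first column access loads the schema and runs the hook
Tag.columns  # already loaded, so the hook does not run again
```

In this sketch a subclass does not inherit hooks registered on its parent; whether a real implementation should do so is an open design question.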

Example code I have right now in our ApplicationRecord:

  def self.inherited(base)
    return super if respond_to?(:after_load_schema_hooks)
    base.instance_eval <<~CLASS_METHODS, __FILE__, __LINE__ + 1
      class_attribute :after_load_schema_hooks, default: []

      def self.after_load_schema_hook(hook_context, &block)
        after_load_schema_hooks << [hook_context, block]
      end

      # Define hooks which run after the record class first connects to the database
      alias without_after_load_schema_hooks load_schema
      def self.load_schema
        return if schema_loaded?
        without_after_load_schema_hooks.tap do |result|
          next unless result
          after_load_schema_hooks.each do |(hook_context, block)|
            # Rails.logger.debug { "[AFTER LOAD_SCHEMA HOOKS] running" }
            hook_context.instance_exec(&block)
          end
        end
      end
    CLASS_METHODS
    super
  end

We use this, for example, in our lib/sti_preload.rb:

# per https://guides.rubyonrails.org/autoloading_and_reloading_constants.html#single-table-inheritance
# Usage:
# In an STI base class
#      AppConfig.when_not_eager_loading do
#        include StiPreload
#        self.deleted_sti_models = []
#      end
module StiPreload
  extend ActiveSupport::Concern

  included do
    cattr_accessor :preloaded, instance_accessor: false
    cattr_accessor :deleted_sti_models, instance_accessor: false, default: []
  end

  class_methods do
    def descendants
      preload_sti unless preloaded
      super
    end

    # Constantizes all types present in the database. There might be more on
    # disk, but that does not matter in practice as far as the STI API is
    # concerned.
    #
    # Assumes store_full_sti_class is true, the default.
    def preload_sti
      after_load_schema_hook(self) do
        begin
          polymorphic_name.constantize.preload_sti!
        rescue PG::UndefinedTable, ActiveRecord::StatementInvalid => e
          raise unless e.is_a?(PG::UndefinedTable) || e.cause.is_a?(PG::UndefinedTable)
          $stderr.puts "Skipping db-dependent code: #{polymorphic_name}"
          nil
        end
      end
      self.preloaded = true
    end

    def preload_sti!
      types_in_db = \
        base_class
          .unscoped
          .select(inheritance_column)
          .distinct
          .pluck(inheritance_column)
          .compact - deleted_sti_models

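      # `logger ||= ...` would create a nil local that shadows the class-level logger,
      # so read the class logger explicitly and fall back to Rails.logger.
      logger = self.logger || Rails.logger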
      types_in_db.each do |type|
        logger&.debug("Preloading STI type #{type}")
        begin
          type.constantize
        rescue NameError
          logger&.warn "StiPreload: type class not found: table_name=#{table_name} type_name=#{type}. (ActiveRecord::SubclassNotFound)"
          Rails.env.production? ? raise : nil
        end
      end
    end
  end
end

=begin
module AppConfig
  extend self

  def when_not_eager_loading(&block)
    return :not_eager_loading if is_eager_loaded_env?
    Rails.application.configure do
      config.after_initialize(&block)
    end
  end

  # NOTE(BF): tasks which require "config/environment" won't have required "config/boot"
  # which means that config.eager_load may be false when we want to treat it as true, as
  # for example, with StiPreload, which we don't want to run during the `rake release` task
  # or `rake assets:precompile`.
  def is_eager_loaded_env?
    return @is_eager_loaded_env if defined?(@is_eager_loaded_env)
    @is_eager_loaded_env = Rails.application.config.eager_load || %w[production staging].include?(Rails.env)
  end
end
=end

Known gems using this general pattern

I think it’s a good idea. I’ve seen a lot of cases of apps causing the schema to be loaded too early by accident because of this.

Until now, I’ve simply redefined load_schema! on the model, but an official hook would be better, and it would open the door to some “strict mode” that entirely prevents touching the DB during boot.
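
As a rough illustration of what such a “strict mode” might look like, here is a plain-Ruby sketch. Everything in it is hypothetical: SchemaGuard, its booting flag, the LoadedDuringBoot error, and FakeModel are invented names, and nothing here is an existing Rails API. It only shows the idea of wrapping the schema-loading method and raising while the app is still booting.

```ruby
# Hypothetical sketch; SchemaGuard and its methods are not Rails APIs.
module SchemaGuard
  class LoadedDuringBoot < StandardError; end

  class << self
    attr_accessor :booting
  end
  self.booting = true

  # Prepended in front of the class-level load_schema.
  def load_schema
    if SchemaGuard.booting
      raise LoadedDuringBoot, "#{name} loaded its schema during boot"
    end

    super
  end
end

# Stand-in for a model base class.
class FakeModel
  class << self
    prepend SchemaGuard

    def load_schema
      @schema_loaded = true # pretend we queried the database here
    end

    def schema_loaded?
      @schema_loaded == true
    end
  end
end

class Widget < FakeModel; end

begin
  Widget.load_schema # still booting, so the guard raises
rescue SchemaGuard::LoadedDuringBoot => e
  warn e.message
end

SchemaGuard.booting = false # e.g. flipped once initialization has finished
Widget.load_schema          # now allowed
```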

Do you feel like opening a PR? I’d be happy to review it.

Thanks @byroot for taking a look. I’ll see if it can be done via activesupport callbacks first, else consider if the interface can behave like activesupport callbacks.

Taking a shot at it: “Start thinking about how to define after_schema_load_callbacks” by bf4 (Pull Request #46205, rails/rails on GitHub)