ActiveStorage and Database Cloning

Some of the changes to ActiveStorage for Rails 6.1 have been focused on enabling finer-grained configuration around storage backends for specific Attachments/Blobs. This seems like a generally fantastic feature, and the changes around it seem well-considered.

I stumbled onto this change while implementing a feature in our own project that seems to be at odds with it, however. To keep representative data in our development and staging environments, and to verify that our backups are functional, we regularly restore production backups to those environments. Today, with different service configurations in development and staging, ActiveStorage resource loads simply fail. (In an ideal world, we would use a FallbackService to delegate reads to one or more read-only services.)

If the database records for Blobs also name the service they’re stored in, that simultaneously solves the issue of failing loads (assuming we have access to the configured production Service) and creates a new issue: updates in development become capable of writing back to the production Service. This is particularly serious when Blobs are replaced or destroyed, as it can lead to production data being destroyed. (We also can’t implement a FallbackService as described above, as the configured service is only ever used when creating new Blobs.)

As I suspect that loading production data in non-production environments is not an uncommon practice, this seems risky.

  • Is there a planned solution for this that we can document?
  • How do we communicate the error case where the Blob’s service_name cannot be accessed? (e.g. production is backed by the Disk service, or configured by ENV variables that only exist on the production servers)
  • Are there safety measures we can put in place to avoid destroying non-local data?

The plan is for Active Storage to use a three-level configuration for multiple services, like Active Record with multiple databases:


    primary:
      production:
        service: S3
        # ...
      development:
        service: Disk
        # ...
    avatars:
      production:
        service: S3
        public: true
        # ...
      development:
        service: Disk
        public: true
        # ...
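
To make the lookup concrete, here is a rough sketch in plain Ruby of how a three-level file like the one above could be resolved to one backend per service name for the current environment. This is not Active Storage code: the `resolve_services` helper and the `CONFIG` constant are hypothetical, and the elided `# ...` settings are left out.

```ruby
require "yaml"

# Hypothetical three-level configuration: service name -> environment -> settings.
CONFIG = YAML.safe_load(<<~YAML)
  primary:
    production:
      service: S3
    development:
      service: Disk
  avatars:
    production:
      service: S3
      public: true
    development:
      service: Disk
      public: true
YAML

# Hypothetical helper: picks the settings for `env` under every service name
# and raises if a service has no entry for that environment.
def resolve_services(config, env)
  config.each_with_object({}) do |(name, environments), resolved|
    resolved[name] = environments.fetch(env) do
      raise KeyError, "service #{name.inspect} has no #{env.inspect} configuration"
    end
  end
end

development = resolve_services(CONFIG, "development")
production  = resolve_services(CONFIG, "production")
```

Assuming a Blob record stores only the service name (for example `avatars`), the same name would resolve to the Disk backend in development and to S3 in production, so a restored production database would not point development writes at the production bucket.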

Glad to hear that the risky bits are accounted for. :slight_smile:

I suppose, then, that this takes me back to the original problem: has any thought been given to guidance or official support around duplicating a storage backend? The `Mirror` service exists, but write replication isn't a great solution: production would have to mirror writes to an S3 bucket for each developer or risk occasional missing attachments, existing production data wouldn't be automatically backfilled, and production can't mirror to Disk services on local development environments.

Is there a reasonable case to be made for shipping read replication in Rails core? Is there a better solution I'm not thinking of?
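
For the sake of discussion, here is a minimal sketch of what I mean by read replication. It is plain Ruby, not the Active Storage service API: `MemoryService` is a hypothetical stand-in for a real backend, and `ReadFallbackService` is a hypothetical wrapper that reads from the first service holding a key while sending every write and delete to the local primary only.

```ruby
# Hypothetical in-memory backend used only for illustration.
class MemoryService
  def initialize(files = {})
    @files = files.dup
  end

  def exist?(key)
    @files.key?(key)
  end

  def download(key)
    @files.fetch(key)
  end

  def upload(key, data)
    @files[key] = data
  end

  def delete(key)
    @files.delete(key)
  end
end

# Hypothetical wrapper: reads fall through to read-only fallbacks,
# writes and deletes never leave the primary service.
class ReadFallbackService
  def initialize(primary:, fallbacks:)
    @primary = primary
    @fallbacks = fallbacks
  end

  def exist?(key)
    readable_services.any? { |service| service.exist?(key) }
  end

  def download(key)
    source = readable_services.find { |service| service.exist?(key) }
    raise KeyError, "no configured service holds #{key.inspect}" unless source

    source.download(key)
  end

  def upload(key, data)
    @primary.upload(key, data)
  end

  def delete(key)
    @primary.delete(key)
  end

  private

  # Primary first, so local changes shadow the fallback copies.
  def readable_services
    [@primary, *@fallbacks]
  end
end
```

With something like this in development, a restored Blob would be read from the production copy until it is replaced locally, and destroying it would only remove the local copy.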