[ActiveStorage] Same file attached multiple times; single Blob or multiple Blobs?

I’m hoping to get some clarification on the expected behavior when the same file is uploaded multiple times using ActiveStorage. Currently, each time the same file is uploaded, a new ActiveStorage::Blob is created with a unique key but the same checksum. This surprised me, and I’m wondering whether it is a bug or intended behavior. The following test illustrates how I was expecting AS to behave:

activestorage/test/models/attachments_test.rb

```ruby
test "attached blobs with same checksum are shared" do
  @user.avatar.attach io: StringIO.new("STUFF"), filename: "town.jpg", content_type: "image/jpg"

  second_user = User.create!
  second_user.avatar.attach io: StringIO.new("STUFF"), filename: "town.jpg", content_type: "image/jpg"

  assert_equal @user.avatar.blob.checksum, second_user.avatar.blob.checksum
  assert_equal @user.avatar.blob, second_user.avatar.blob # currently fails
end
```

This is expected behaviour.

Blob models are intended to be immutable in spirit. One file, one blob. If you want to do transformations of a given Blob, the idea is that you’ll simply create a new one rather than attempt to mutate the existing one (though of course you can delete the old one later if you don’t need it). Because of this, you have 2 different Blobs in your test case.

The checksum is calculated from the content of the file (its data), so the image **town.jpg** will always produce the same checksum, since it is the same image (not the same Blob :))
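To make the key/checksum distinction concrete, here is a minimal plain-Ruby sketch. It assumes the checksum is a Base64-encoded MD5 digest of the content and that the key is a random token; `content_checksum` and `random_key` are hypothetical helpers written for illustration, not ActiveStorage methods.

```ruby
require "digest"
require "securerandom"
require "stringio"

# Hypothetical helper: Base64-encoded MD5 digest of the IO's content.
# (Assumption: this mirrors how the blob checksum is derived.)
def content_checksum(io)
  Digest::MD5.base64digest(io.read)
end

# Hypothetical helper: a random token that does not depend on the content.
def random_key
  SecureRandom.hex(14)
end

first  = { key: random_key, checksum: content_checksum(StringIO.new("STUFF")) }
second = { key: random_key, checksum: content_checksum(StringIO.new("STUFF")) }

# Same bytes give the same checksum.
puts first[:checksum] == second[:checksum]
# Keys are generated independently of the content.
puts first[:key] == second[:key]
```

Two uploads of identical bytes therefore end up as two records that differ in key but agree on checksum, which matches what the test above observes.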

I hope this helps

Thank you for your response, Dino, but I still feel like I’m missing something, because the last sentence of your Blob explanation confuses me.

> Because of this, you have 2 different Blobs in your test case

If this is the same file attached twice on two different User model instances I’d expect:

  • a single ActiveStorage::Blob to represent this single file
  • two ActiveStorage::Attachment entries to associate this single blob with the two different Users

If we create any variants, then another blob is created for the variant, but getting two blobs right from the start feels to me like a bug. Shouldn’t checksums be unique within the active_storage_blobs table in order to maintain “One file, one blob”? I apologize if I’m missing something incredibly obvious here, and I appreciate your patience with me :)

Thanks again!

-Dan

Hey Dan!

ActiveStorage doesn’t work that way currently. When you call attach with a new file, you create a new blob and a new attachment for the object.

The only use case that ActiveStorage covers is this one: https://github.com/rails/rails/blob/master/activestorage/db/migrate/20170806125915_create_active_storage_tables.rb#L22

But you should be able to extend ActiveStorage and layer your custom requirements on top of the engine :)
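As a rough illustration of what such a layer could look like, here is a plain-Ruby sketch of checksum-based deduplication. Everything in it (`SketchBlob`, `SketchAttachment`, `DedupingStore`) is a hypothetical in-memory stand-in, not ActiveStorage API; it only shows the "look up by checksum, reuse or create, then attach" idea.

```ruby
require "digest"
require "securerandom"

# Hypothetical stand-ins for blob and attachment rows (not ActiveStorage classes).
SketchBlob       = Struct.new(:key, :checksum, :data)
SketchAttachment = Struct.new(:record_name, :blob)

class DedupingStore
  attr_reader :blobs, :attachments

  def initialize
    @blobs = {}        # checksum => SketchBlob
    @attachments = []  # one entry per attach call
  end

  # Reuse the blob whose checksum matches the data, or create a new one,
  # then record a new attachment pointing at it.
  def attach(record_name, data)
    checksum = Digest::MD5.base64digest(data)
    blob = (@blobs[checksum] ||= SketchBlob.new(SecureRandom.hex(14), checksum, data))
    attachment = SketchAttachment.new(record_name, blob)
    @attachments << attachment
    attachment
  end
end

store = DedupingStore.new
first  = store.attach("user 1", "STUFF")
second = store.attach("user 2", "STUFF")

puts first.blob.equal?(second.blob) # do both attachments share one blob object?
puts store.blobs.size               # number of distinct blobs stored
puts store.attachments.size         # number of attachments recorded
```

In a real application the analogous step would presumably be to look for an existing ActiveStorage::Blob with the same checksum and pass that blob to attach, which accepts an existing blob as well as a new upload. Two caveats under that assumption: a matching checksum alone does not strictly prove the bytes are identical, and removing one record's attachment must not delete a blob that other records still use.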

I suppose someone from the Rails team could comment on whether this is something worth implementing in the engine out of the box, or maybe I’m missing something :)