Attaching pictures gets very slow

Hello folks, I want to attach about 15,000 pictures, but after about 800 pictures it becomes very, very slow. Does anyone know what the reason could be and how to solve it? The source and destination are on the same SSD.

path = Dir["/home/username/pictures/M/**/*.jpg"]

counter = 1
path.each do |row|
  User.first.pictures.attach(io: File.open(row), filename: "%05d" % counter + ".jpg")
  counter += 1
end

What I suspect may be happening here is that the first thousand or so go as fast as they possibly can, but as the number of open files and the amount of uncollected garbage grow, your computer struggles to keep track of all those pieces at once. Someone else here may be able to give you the tools to actually inspect this and confirm or refute my assertion. I am entirely self-taught, but I have a lot of “machine empathy” and intuition guiding this theory.
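One rough way to check this theory is to time the work in batches and record a few garbage-collector counters along the way. The sketch below is only an illustration: `measure_batches` and `open_file_count` are hypothetical helper names, the batch size is arbitrary, and the workload is a stand-in for the real attach call.

```ruby
require "benchmark"

# Hypothetical helper: counts File objects that are still open in this process.
def open_file_count
  ObjectSpace.each_object(File).count do |f|
    begin
      !f.closed?
    rescue IOError
      false # uninitialized streams cannot be queried; do not count them
    end
  end
end

# Hypothetical helper: runs the block for every item and records, per batch,
# the elapsed wall-clock time plus a few GC counters.
def measure_batches(items, batch_size: 100)
  stats = []
  items.each_slice(batch_size).with_index(1) do |batch, batch_no|
    seconds = Benchmark.realtime { batch.each { |item| yield item } }
    gc = GC.stat
    stats << {
      batch: batch_no,
      seconds: seconds,
      gc_runs: gc[:count],              # total GC runs so far
      live_slots: gc[:heap_live_slots], # object slots currently in use
      open_files: open_file_count
    }
  end
  stats
end

# Stand-in workload; in the real import the block would do the attach.
stats = measure_batches((1..1000).to_a, batch_size: 250) { |n| n.to_s * 10 }
stats.each { |s| puts s }
```

If `seconds` climbs from batch to batch while `open_files` or `live_slots` climb with it, that would support the theory. If the counters stay flat while the time still grows, the cause is probably somewhere else.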

You’re probably building up a huge amount of wasted memory or uncollected garbage because you are calling User.first inside your each. You only need that once (and ideally you would set it outside of the iterator). Further, you could use each_with_index to avoid incrementing the counter by hand, although I don’t imagine that is causing you any actual memory or garbage collection issues.
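As a small illustration of the each_with_index idea, the sketch below builds the numbered filenames without a hand-maintained counter. The `paths` array is a made-up stand-in for the result of the Dir lookup.

```ruby
# Hypothetical file list standing in for the result of Dir[...]
paths = ["a.jpg", "b.jpg", "c.jpg"]

filenames = paths.each_with_index.map do |_path, idx|
  format("%05d.jpg", idx + 1) # idx starts at 0, so add 1
end

puts filenames.inspect # ["00001.jpg", "00002.jpg", "00003.jpg"]
```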

Finally, you may be creating 15K tempfiles or open file references, and not releasing them until the outermost (implicit) block closes. Try passing a block to File.open inside your iterator, as that is guaranteed to close the file before moving on to the next one.
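The difference between the two forms of File.open can be seen in isolation with a throwaway file. This is a minimal sketch: the Tempfile stands in for one of the pictures, and nothing here touches Active Storage.

```ruby
require "tempfile"

# A throwaway file standing in for one of the pictures.
sample = Tempfile.new(["picture", ".jpg"])
sample.write("not really a jpeg")
sample.close

# Without a block: the handle stays open until it is closed explicitly
# or the garbage collector eventually finalizes it.
leaked = File.open(sample.path)
leaked_open = !leaked.closed? # still open at this point
leaked.close

# With a block: Ruby closes the handle when the block exits, even on error.
handle = nil
File.open(sample.path) do |file|
  handle = file
  file.read
end
closed_after_block = handle.closed?

sample.unlink
```

In the original loop every iteration behaves like the first case, so the handles are only released whenever the garbage collector gets around to them.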

What you’re aiming for with this many turns of the wheel is for each one to maybe be a bit slower than your fastest loop time, but for them to each take exactly the same amount of time, so it doesn’t get slower as you go. Perhaps this will work:

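paths = Dir["/home/username/pictures/M/**/*.jpg"]
user = User.first  # look the user up once, outside the loop
paths.each_with_index do |path, idx|
  # The block form of File.open closes the file as soon as the block exits
  File.open(path) do |file|
    user.pictures.attach(io: file, filename: format("%05d.jpg", idx + 1))
  end
end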

Hope this helps,

Walter

Hello Walter, many thanks for your help.

Unfortunately there was no significant improvement. The process is still getting slower every minute.

I’m thinking about not using Active Storage, at least not for the first import. It would also work if I copied the pictures into the project and created a separate table for them.

2000 pictures ~ 50 minutes

Are you operating inside a transaction by chance?