ActiveStorage efficient streaming possible?

I have an upload that transfers a CSV from a user's hard drive directly to S3. These CSV files can be many MB in size, and I need to parse them efficiently.

My current modus operandi is to open the file through the model with the “open” method and then process it in a block. So I make the call:

my_model.my_upload.open do |file|

Then I use SmarterCSV to read that file in chunks and process the CSV. Problem is, this blows up my memory usage and bogs down the whole machine. Is there a more reasonable way to deal with this? I was thinking I could just open the S3 file into an IO.stream, and then read each line using my CSV parser. Is that the ticket? Or is there a way to stay within the elegance of the attachment and ActiveStorage?

You can download a blob in chunks by passing a block to blob.download, e.g.:

blob.download do |chunk|
  # do stuff with your chunk
end

However, these chunks are not split on line boundaries, so a CSV row can be cut off at the end of one chunk and continue in the next. You will have to account for that.
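One way to deal with the line boundaries is to keep a small buffer, append each chunk to it, and hand out only the complete lines. Here is a minimal sketch of that idea. It assumes plain newline-terminated rows with no quoted fields that contain embedded newlines, and each_line_from_chunks is a hypothetical helper name of my own, not something ActiveStorage provides:

```ruby
# Hypothetical helper (not part of ActiveStorage): turns a stream of
# arbitrarily sized chunks into complete lines.
# `chunks` is anything that responds to #each and yields String chunks.
def each_line_from_chunks(chunks)
  return enum_for(:each_line_from_chunks, chunks) unless block_given?

  buffer = String.new # holds the trailing, not yet complete line
  chunks.each do |chunk|
    buffer << chunk
    # Hand out every complete line currently sitting in the buffer.
    while (newline_at = buffer.index("\n"))
      yield buffer.slice!(0..newline_at)
    end
  end
  # Whatever is left is a final line without a trailing newline.
  yield buffer unless buffer.empty?
end

# Simulated chunks that cut rows in the middle, as a download might.
sample_chunks = ["id,na", "me\n1,Al", "ice\n2,Bob"]
each_line_from_chunks(sample_chunks) { |line| puts line.chomp }
```

With a real attachment, blob.to_enum(:download) should give you an enumerator over the downloaded chunks, so each_line_from_chunks(blob.to_enum(:download)) { |line| ... } would be one way to feed it. Only one chunk plus the unfinished line is held in memory at a time. The chunks typically arrive as binary strings, so you may need line.force_encoding("UTF-8") before passing a line to your CSV parser.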


So it is not exactly a readline block; instead you get a chunk of binary data? For working with CSV that is not terribly handy, but I get it. Thanks.