Cannot send_file then delete it

My application accepts a form to create multiple large (hundreds of MB) temporary files and then zips them up to send off to a user.

I have the files constructed and the zipping working. The problem is that if I use send_file to send the zip off to the user, I cannot delete the file afterwards: send_file seems to hand the response off to another process, so my delete runs before the streaming starts. I tried putting my File.delete() in a method called by after_filter, but that caused the same issue.
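
Roughly what I have now (simplified; the paths and the build_zip helper are just illustrative):

def download
  zip_path = build_zip(params)   # writes the zip to a temp dir and returns its path
  send_file zip_path, :type => 'application/zip', :disposition => 'attachment'
  File.delete(zip_path)          # this (or the same call in an after_filter) runs too early
end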

I don't want to load everything into memory and use send_data, as these files could be 200+ MB. If send_data accepted a block that would be cool, but it isn't that flexible either.

Am I missing something?

Thanks, Chad Burt

My understanding is that send_file sets a header that gets passed back to your front-end web server and instructs it to send the file in question. So no matter where you put your File.delete(), it's going to run before the front-end web server gets a chance to send the file.

I would look at a periodic cleanup script to remove zip files older than, say, 30 minutes (or however long it takes your users to download them)...

I'm wondering if there are any solutions other than using cron or the like. That approach means accounting for download speeds and deciding whether to store the timestamps in the filename, the filesystem, or a database table. It's also another point of failure and another deployment step.

It seems like this should be a common problem. Is this just an oversight in HTTP?

-Chad

Don't think so... same thing would happen if you manually removed a file while it was being downloaded...

You don't need to store the timestamps in the filename... just use the file's last-accessed time... if that's older than, say, an hour, remove it.

You should be able to do the whole thing with Ruby...
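
Something like this, say (the directory and the one-hour cutoff are just examples, and it assumes the filesystem actually updates atime):

# Delete zips that haven't been accessed in the last hour.
Dir.glob('/path/to/zips/*.zip').each do |path|
  File.delete(path) if File.atime(path) < Time.now - 3600
end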

-philip

On a 'nix system, you should be able to delete the file at any time after the download has started. 'nix filesystem semantics keep an open file handle's access to the file contents even after the file is unlinked; the filesystem only reclaims the space once the last handle to it is closed.

If you can detect when the file is being downloaded (instead of when it's created), just delete it then.
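
For example, plain Ruby on a Linux box will happily do this (the path is just illustrative):

io = File.open('/path/to/zips/archive.zip', 'rb')
File.delete('/path/to/zips/archive.zip')  # only removes the directory entry
data = io.read                            # the open handle still sees the full contents
io.close                                  # the space is reclaimed once the handle closes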

YMMV on other platforms.

J.

This is a sane approach, because it would be crazy to try to hold these in memory for the duration of the download.

But cut yourself some slack on your cleanup. It looks like you're trying to over-engineer that part of the problem.

Just schedule a Unix cron job (or Wintendo scheduled task) that cleans up files older than, say, 2 days in the temporary zip construction directory. This cuts people some slack when they have trouble downloading it and need several attempts, possibly spread out over a day.

In other words, you can make things _better_ for your users by being _less_ brilliant on the server side. :-)

An example in cron on a Linux host might be:

0 * * * * find /path/to/zips -type f -mtime +2 | xargs -r rm -f

Ciao, Sheldon.

Ok, cron is seeming more reasonable now. I actually have to run a cron job anyway to stay in sync with another database. It runs a rake task, and I will just add a new task to that file. I just wanted to make sure I wasn't adding another step where I didn't need to.
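
Something like this in that rake file, probably (the task name, directory, and two-day cutoff are placeholders):

desc 'Remove temporary zip files more than two days old'
task :cleanup_zips do
  Dir.glob('/path/to/zips/*.zip').each do |path|
    File.delete(path) if File.mtime(path) < Time.now - (2 * 24 * 60 * 60)
  end
end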

Thanks, Chad