[scalability, performance, DoS] To process or not to process images at runtime

Hello,

My next project will be a kind of online photo viewer. All of these photos will need to have a watermark applied to them. The problem is that, depending on the picture, different watermarks need to be applied, exclusively (not all on the same image). The easiest solution would be to process these pictures at runtime using RMagick, apply the watermark(s), and serve them. The other approach would be to pre-process them (also using RMagick or the ImageMagick CLI tool) and create a separate copy on disk for each watermark. The obvious advantage is that the copies could be served directly by the web server (Apache) using X-Sendfile. However, it would be much harder to manage (need to fix a watermark error? Re-process and re-create the images on disk…) and would take much more disk space (which could probably be solved by using a distributed filesystem, though).
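To put rough numbers on the disk-space side of this trade-off, here is a back-of-the-envelope sketch. The photo count, average file size, and watermark count are made-up assumptions for illustration, and the function names are hypothetical. It also assumes the pre-processing approach keeps the originals around so copies can be regenerated.

```ruby
# Back-of-the-envelope disk usage for the two approaches.
# All input numbers below are illustrative assumptions, not measurements.

# Runtime processing: only the original photos are stored.
def runtime_bytes(photo_count, avg_photo_bytes)
  photo_count * avg_photo_bytes
end

# Pre-processing: the originals plus one full copy per watermark.
def preprocessed_bytes(photo_count, avg_photo_bytes, watermark_count)
  photo_count * avg_photo_bytes * (1 + watermark_count)
end

GIB = 1024.0**3

photos     = 100_000          # assumed number of photos
avg_bytes  = 2 * 1024 * 1024  # assumed average size: 2 MiB
watermarks = 5                # assumed number of distinct watermarks

puts format("runtime only:  %.1f GiB", runtime_bytes(photos, avg_bytes) / GIB)
puts format("pre-processed: %.1f GiB",
            preprocessed_bytes(photos, avg_bytes, watermarks) / GIB)
```

If each photo only ever gets one of the watermarks, the pre-processed figure drops to roughly twice the originals, so the real cost depends on how many photo/watermark combinations actually get requested.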

One thing that I thought could work is to have a dedicated server (or a cluster) to process the images. This server (or cluster) would receive a request from the web server(s), process the desired image, and return the stream of bytes to the web server, which would in turn pass it on to the client with the right MIME type (image/jpeg). The software to process the images could be written in C to better optimize things. This, I think, could solve the CPU/RAM problem of processing the images per request at runtime, but would result in a rather expensive backend. However, I’m not sure if passing the stream of bytes through Rails is optimal, and I think it could be a major bottleneck, since I’ve heard of memory leaks and other hairy bugs when you try to serve big files using Rails. Is that true?

I would rather process them at runtime, per HTTP request: it gives more flexibility (code can decide, per request, which watermark(s) to apply) and uses much less disk space (with the other approach I would have to keep several versions of each picture, one per watermark). However, this site will probably have lots of traffic. So, I’ve reached a dead end. Could someone share their experiences and thoughts and help me decide? :)

PS: I’ve put DoS (Denial of Service) in the subject line because I think that dynamic image processing combined with lots of requests could result in a DoS.
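One common way to soften the DoS risk of per-request processing is to cap how many renders may run at once and turn away the rest. The following is a minimal sketch of that idea in plain Ruby. `RenderLimiter`, `Busy`, and `with_slot` are hypothetical names, not part of Rails or RMagick, and a real deployment would more likely enforce the limit in front of the application.

```ruby
# Caps the number of image renders running at the same time.
# Hypothetical helper for illustration; not a Rails or RMagick API.
class RenderLimiter
  class Busy < StandardError; end

  def initialize(max_concurrent)
    # A thread-safe pool of tokens, one per allowed concurrent render.
    @slots = Queue.new
    max_concurrent.times { @slots << :slot }
  end

  # Runs the block if a slot is free, otherwise raises Busy immediately
  # instead of letting requests pile up behind expensive renders.
  def with_slot
    begin
      @slots.pop(true) # non-blocking pop; raises ThreadError when empty
    rescue ThreadError
      raise Busy, "too many renders in progress"
    end

    begin
      yield
    ensure
      @slots << :slot # always hand the slot back, even if the render fails
    end
  end
end
```

A controller could rescue `RenderLimiter::Busy` and answer with HTTP 503, so a burst of requests degrades into quick refusals rather than exhausting CPU and RAM.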

Thanks in advance,

Marcelo.

Hi,

It all depends on the number of visitors and images you'll be serving. You could try to measure it somehow, to find out whether your resources are enough for your app. But for me, if an image will be served more than once, I would save it to disk somewhere and serve it using Apache/send_file for better performance.

Especially when the images are big and processing time grows quickly.
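A minimal sketch of the "save it to disk once, then serve the file" idea, using only the Ruby standard library. `WatermarkDiskCache` and its methods are hypothetical names, the block stands in for the actual RMagick watermarking step, and the sketch assumes `photo_id` and `watermark` are trusted values (for example an integer ID and a known symbol), not raw user input.

```ruby
require "fileutils"
require "tmpdir"

# Renders each photo/watermark combination at most once and keeps it on disk.
# Hypothetical helper for illustration; not part of Rails or RMagick.
class WatermarkDiskCache
  def initialize(root)
    @root = root
  end

  # Assumes photo_id and watermark are trusted (never raw user input).
  def path_for(photo_id, watermark)
    File.join(@root, watermark.to_s, "#{photo_id}.jpg")
  end

  # Returns the path of the cached file. The block is called only on a
  # cache miss and must return the watermarked image as a string of bytes.
  def fetch(photo_id, watermark)
    path = path_for(photo_id, watermark)
    return path if File.exist?(path)

    FileUtils.mkdir_p(File.dirname(path))
    tmp = "#{path}.#{Process.pid}.tmp"
    File.binwrite(tmp, yield)
    File.rename(tmp, path) # rename, so readers never see a half-written file
    path
  end
end

# Usage: the second call finds the file on disk and skips the render.
Dir.mktmpdir do |dir|
  cache = WatermarkDiskCache.new(dir)
  2.times do
    path = cache.fetch(42, :studio) { "fake image bytes" } # stand-in render
    puts path
  end
end
```

The returned path is what would be handed to send_file (or to the web server via X-Sendfile), so Rails only pays the processing cost the first time a given combination is requested.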

Best, H.

Store just the raw images and cache watermarked images, expire the cache as necessary?

http://scottstuff.net/presentations/rails-caching/

"From the Sparklines code in Typo:

fragment_cache = read_fragment(fragmentname)

if(not fragment_cache)
  fragment_cache = Sparklines.plot(ary,params)
  write_fragment(fragmentname,fragment_cache)
end

send_data(fragment_cache,
          :disposition => 'inline',
          :type => 'image/png')" (slide 25)
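For the "expire the cache as necessary" part, one option (not taken from the slides) is to put a watermark version number into the fragment name, so that fixing a watermark simply produces new cache keys. The sketch below assumes that approach. `MemoryFragments` is a hypothetical in-memory stand-in for Rails' `read_fragment`/`write_fragment` so the example runs on its own, and `fragment_name` and `watermarked` are made-up helper names.

```ruby
# In-memory stand-in for Rails' fragment cache, for illustration only.
class MemoryFragments
  def initialize
    @store = {}
  end

  def read_fragment(name)
    @store[name]
  end

  def write_fragment(name, data)
    @store[name] = data
  end
end

# The version is part of the key: bump it when a watermark is corrected
# and the old entries are simply never read again.
def fragment_name(photo_id, watermark, watermark_version)
  "photos/#{photo_id}/#{watermark}-v#{watermark_version}"
end

# Returns cached bytes, calling the block to render only on a cache miss.
def watermarked(cache, photo_id, watermark, watermark_version)
  name = fragment_name(photo_id, watermark, watermark_version)
  cache.read_fragment(name) || cache.write_fragment(name, yield)
end
```

The trade-off is that stale entries are not deleted automatically; they stay in the cache until something sweeps them out.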

Wow, this cache thing could work. Can I cache the images for as long as I want?

Also, would caching these assets (the images) serve the data as fast as X-Sendfile?

Thanks,

Marcelo.

Wow, this cache thing could work. Can I cache the images for as long as I want?

I don't see why not.

Thanks, I will have a deeper look into Rails caching.

Cheers,

Marcelo.