Scaling challenge

Hi guys,

Our app has been getting traction recently, and we are struggling to cope with the traffic. I was hoping you guys could point me in the right direction.

For each request, the app does some heavy calculations. They are not done in Ruby; we use a compiled shared lib for this. Our current architecture is:

  • each user request triggers 2 background jobs (via Workling/Starling)

  • the first worker does some pre-processing (including polling the DB), calls the shared lib, and loops. It sends results along the way to the second worker.

  • the second worker does some post-processing and sends results back to the app.
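
To make the setup concrete, here is a rough sketch of the two-stage pipeline. It uses plain Ruby threads and queues as stand-ins for Workling/Starling, and `heavy_calculation`, `run_pipeline` and the `:done` marker are made-up names for illustration, so this is not our actual code:

```ruby
require 'thread' # Queue; needed explicitly on older Rubies, harmless on newer ones

# Hypothetical placeholder for the blocking call into the shared lib.
def heavy_calculation(input)
  input * 2
end

# Runs one request through both stages and returns the post-processed results.
def run_pipeline(inputs)
  intermediate = Queue.new # first worker -> second worker
  results      = Queue.new # second worker -> app

  first_worker = Thread.new do
    inputs.each do |input|                 # the real worker polls the DB here
      intermediate << heavy_calculation(input)
    end
    intermediate << :done                  # tell the second worker to stop
  end

  second_worker = Thread.new do
    while (value = intermediate.pop) != :done
      results << "result=#{value}"         # stand-in for post-processing
    end
  end

  [first_worker, second_worker].each { |t| t.join }
  Array.new(results.size) { results.pop }
end

p run_pipeline([1, 2, 3])
```

The first worker streams each intermediate value as soon as it is ready, which is roughly what "sends results along the way" means above.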

I think the key problem is that the call to the shared lib is blocking: when we get lots of calls at once, all the workers seem to be stuck in this lib and nothing else gets processed (there is a limited number of workers).
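
A small sketch of what I believe is happening, under some assumptions: `sleep` stands in for the blocking lib call (a real C call may hold up the interpreter differently from `sleep`), and `blocking_lib_call` and `drain_time` are hypothetical names. With a fixed pool, any jobs beyond the number of workers simply wait their turn:

```ruby
require 'thread'

# Hypothetical stand-in for the blocking shared lib call.
def blocking_lib_call(seconds)
  sleep(seconds)
end

# Runs job_count jobs on a fixed pool of worker_count workers and
# returns the wall-clock time until the last job has finished.
def drain_time(worker_count, job_count, seconds_per_job)
  jobs = Queue.new
  job_count.times { |i| jobs << i }
  worker_count.times { jobs << :stop }     # one stop marker per worker

  started = Time.now
  workers = Array.new(worker_count) do
    Thread.new do
      while jobs.pop != :stop
        blocking_lib_call(seconds_per_job) # the worker can do nothing else meanwhile
      end
    end
  end
  workers.each { |t| t.join }
  Time.now - started
end

elapsed = drain_time(2, 6, 0.05)
puts "6 jobs on 2 workers took about #{(elapsed * 1000).round} ms"
```

With 2 workers and 6 jobs the jobs run in three rounds, so the last caller waits roughly three times as long as the first.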

I have a few options in mind, but I have no idea which one would give the best results:

  • upgrade to Ruby 1.9.3 (from 1.8.7)

  • migrate from DL/Load to FFI (but I don’t believe that solves the blocking issue)

  • consider another Ruby implementation

  • consider a different architecture. For instance, I could run this shared lib on a separate instance with nothing else on it (just Ruby processes), but I am not sure how to do this given that speed will be key.
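
For the last option, this is the kind of thing I have in mind, as a minimal sketch only. It assumes a plain TCP service with a one-integer-per-line protocol; `heavy_calculation`, `start_calc_server` and `remote_calculation` are invented names, and here both ends run on localhost just to keep the example self-contained:

```ruby
require 'socket'

# Hypothetical placeholder for the shared lib call.
def heavy_calculation(input)
  input * input
end

# Starts a tiny line-based calculation service; returns [server, thread].
def start_calc_server(host = '127.0.0.1', port = 0)
  server = TCPServer.new(host, port)       # port 0 lets the OS pick a free port
  thread = Thread.new do
    loop do
      client = server.accept
      while (line = client.gets)           # one integer per line
        client.puts heavy_calculation(line.to_i)
      end
      client.close
    end
  end
  [server, thread]
end

# App-side helper: sends one input and waits for the answer.
def remote_calculation(host, port, input)
  socket = TCPSocket.new(host, port)
  socket.puts input
  socket.gets.to_i
ensure
  socket.close if socket
end

server, _thread = start_calc_server
port = server.addr[1]                      # the port the OS actually assigned
puts remote_calculation('127.0.0.1', port, 12)
```

Every call pays a network round trip, which is exactly my worry about speed with this approach.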

Thanks!

PJ