Parallelizing Tasks in the Background

I’m experimenting with speeding up some areas where a few SQL selects take a long time to process (though this applies to any background task, really). This is on Rails 4.0.0 with Ruby 2.0.0p247 and PostgreSQL 9.2.4.

In this case, I can divide the selects intelligently so that the workload is spread evenly across N processes on the DB backend.

I am currently trying Spawnling, and the following works for exactly one hit before I have to restart the web service:

spawns = []
ABC.each do   # criteria for selects to divide load
  spawns << Spawnling.new do
    # portion of big SQL select here via ActiveRecord, and store in memcache
  end
end
# wait for all N blocks of code to finish running
Spawnling.wait(spawns)
# assemble result from memcache and deliver
render …

It works for one hit; later hits are logged correctly but fail to deliver any content to the browser. There also seem to be a number of issues with Spawnling anyway, as it appears to be in disrepair.

How are folks parallelizing tasks these days for lots of CPUs per hit? What’s the Rails 4/Ruby 2 way to do it?

Thanks for any thoughts.

Phil

Hi Phil,

Sidekiq sounds like the gem you need. It does background processing using threads. For CPU-bound Ruby code you would need an implementation of Ruby that can run threads in parallel, like Rubinius or JRuby, because Ruby MRI has the GIL and only runs one thread's Ruby code at a time. MRI does release the GIL while a thread waits on I/O such as a database query, so threads can still overlap your selects there.
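To give a feel for the thread approach outside of Sidekiq, here is a minimal sketch in plain Ruby, assuming each slice of work spends most of its time waiting on the database (during which MRI lets other threads run). The names fetch_slice, split_ids and fetch_all are made up for illustration, and sleep stands in for the real query, so treat it as a shape to adapt rather than a finished solution.

```ruby
# Hypothetical stand-in for one slice of the big select. sleep blocks the
# thread much like waiting on the database would.
def fetch_slice(id_range)
  sleep 0.2
  id_range.map { |id| id * 2 }
end

# Split 1..total into `parts` contiguous ranges of near-equal size.
def split_ids(total, parts)
  size = (total.to_f / parts).ceil
  (1..total).each_slice(size).map { |ids| ids.first..ids.last }
end

# Run one thread per slice, wait for all of them, and assemble the results
# in slice order.
def fetch_all(total, parts)
  threads = split_ids(total, parts).map do |range|
    Thread.new { fetch_slice(range) }
  end
  # Thread#value joins the thread and returns the result of its block.
  threads.flat_map(&:value)
end

started = Time.now
rows = fetch_all(100, 4)
puts "#{rows.size} rows in #{(Time.now - started).round(2)}s"
```

In a Rails app each thread would also need its own database connection, for example by wrapping the query in ActiveRecord::Base.connection_pool.with_connection, and the pool size would have to be at least the number of threads.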

Here are links for Sidekiq:

  1. #366 Sidekiq - RailsCasts

  2. GitHub - mperham/sidekiq: Simple, efficient background processing for Ruby
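If the per-slice work turns out to be CPU-bound Ruby code and you want to stay on MRI, separate processes are another option. The following is only a sketch, assuming a Unix-like system where fork is available. The helper name parallel_map_in_processes is made up for illustration. Each child computes one slice and sends its result back to the parent over a pipe.

```ruby
# Run the given block once per slice, each in its own child process, and
# return the results in slice order. Results must be Marshal-able.
def parallel_map_in_processes(slices)
  children = slices.map do |slice|
    reader, writer = IO.pipe
    reader.binmode
    writer.binmode
    pid = fork do
      reader.close
      writer.write(Marshal.dump(yield(slice)))
      writer.close
      exit!(0) # skip at_exit handlers inherited from the parent
    end
    writer.close # the parent only reads
    [pid, reader]
  end

  children.map do |pid, reader|
    # Read before waiting so a child with a large result cannot block
    # forever on a full pipe.
    result = Marshal.load(reader.read)
    reader.close
    Process.wait(pid)
    result
  end
end

sums = parallel_map_in_processes([1..1000, 1001..2000]) do |range|
  range.reduce(:+)
end
puts sums.inspect
```

In a Rails app each child would also need to open its own database connection rather than reuse the parent's, and forking on every request has a real cost, so this fits better in a background job than in the request itself.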

Hope it helps!

Cheers,

Gabe