Efficient background process

I have a general question about a web service.

My app will (hopefully) parse RSS (XML) so that it returns newly updated feeds.

The parsing job is not the problem. The problem is how to let the server do it efficiently.

Obviously, one option is a cron job. But it cannot process everything in one run if the number of feeds is too large. Somehow the server needs to do the job little by little.

For instance, I have some experience with App Engine, which provides a Task Queue that finishes the job little by little without overloading the server.

Does Ruby on Rails have such functionality, so that my app can finish parsing without reaching the server limit?

soichi

If you want to queue jobs, why would you use cron at all? Rails 4 does have this, it's called Queueing, but you don't need Rails for it: you could use Resque or Sidekiq, which do the same queueing with an integrated API. And your app's limit is only defined by your server's limit.
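To make the queueing idea concrete, here is a minimal plain-Ruby sketch: jobs go into a queue and a small, fixed pool of workers drains it, so the server is never asked to parse every feed at once. It uses only the standard library and is not the Resque or Sidekiq API; those libraries apply the same pattern with a Redis-backed queue and separate worker processes. The names FeedJob, parse_feed, and run_workers are made up for illustration, and parse_feed is only a placeholder for the real parsing step.

```ruby
# Plain-Ruby sketch of a job queue; no gems required.
# FeedJob, parse_feed, and run_workers are hypothetical names for illustration.

FeedJob = Struct.new(:url)

# Placeholder for the real RSS parsing step.
def parse_feed(url)
  "parsed #{url}"
end

# Drain `jobs` with a fixed number of worker threads and return the results.
def run_workers(jobs, worker_count: 2)
  queue = Queue.new
  jobs.each { |job| queue << job }
  queue.close # once the queue is empty, pop returns nil and workers stop

  results = Queue.new
  workers = Array.new(worker_count) do
    Thread.new do
      while (job = queue.pop)
        results << parse_feed(job.url)
      end
    end
  end
  workers.each(&:join)

  # Move the collected results into a plain array.
  Array.new(results.size) { results.pop }
end

urls = %w[http://example.com/a.rss http://example.com/b.rss http://example.com/c.rss]
parsed = run_workers(urls.map { |u| FeedJob.new(u) })
puts parsed.size
```

The worker_count argument is what caps the load: adding more feeds makes the queue longer, not the server busier.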

Thanks! Resque and Sidekiq sound great! I hadn't even heard of them until now.

If you are talking about a system where polling for jobs happens, and the jobs are then processed independently, I’d suggest decoupling the two parts.

This is how I would go about it (using AWS):

  • 1 EC2 instance to do the polling.

  • An SQS queue into which the polling EC2 instance pushes jobs.

  • An autoscaling EC2 group which scales based on the number of jobs in the queue and processes the jobs in the queue.
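The split described in the list can be sketched in plain Ruby, with the standard library's Queue standing in for SQS. This is only an illustration under assumptions: poll, desired_workers, the next_check_at field, and the numbers (100 jobs per worker, at most 10 workers) are hypothetical, and on AWS the scaling decision would normally be configured as an Auto Scaling policy driven by the queue-depth metric rather than written in application code.

```ruby
# Stand-in sketch: Ruby's Queue plays the role of SQS.
# poll, desired_workers, and next_check_at are hypothetical names.

# Scaling rule: one worker per `jobs_per_worker` queued jobs,
# clamped between `min` and `max`. The numbers are illustrative only.
def desired_workers(queue_depth, jobs_per_worker: 100, min: 1, max: 10)
  needed = (queue_depth.to_f / jobs_per_worker).ceil
  needed.clamp(min, max)
end

# The poller only decides which feeds are due and enqueues them.
# It never parses anything itself.
def poll(feeds, queue, now:)
  feeds.each do |feed|
    queue << feed[:url] if feed[:next_check_at] <= now
  end
end

queue = Queue.new
feeds = [
  { url: "http://example.com/a.rss", next_check_at: 0 },
  { url: "http://example.com/b.rss", next_check_at: 50 },
  { url: "http://example.com/c.rss", next_check_at: 500 }
]
poll(feeds, queue, now: 100)

puts queue.size                  # feeds that were due at time 100
puts desired_workers(queue.size) # workers wanted for the current backlog
puts desired_workers(2_500)      # a large backlog is capped at `max`
```

Because the poller and the workers share nothing but the queue, either side can be restarted or scaled without touching the other.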

Emil