Scheduled tasks

Hi

What is the best way to go about creating a service that has access to
my rails environment, but that runs on its own?

My specific problem is creating a process that runs in the background
and updates certain models every 5 minutes. Is there a standard way of
creating something like this?

Thanks
Ryan

  • crontab and script/runner

  • backgroundrb’s scheduler

  • small ruby daemon with loop with sleep

You can choose what suits you best.
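As a rough illustration of the third option (a small Ruby daemon that loops and sleeps), here is a minimal sketch. The helper name `run_periodically` and its `max_runs` parameter are made up for this example; `max_runs` exists only so the sketch can terminate, and a real daemon would loop forever after loading the Rails environment.

```ruby
# Hypothetical helper: run a block repeatedly, sleeping between runs.
# max_runs is only here so the sketch can stop; pass nil to loop forever.
def run_periodically(interval_seconds, max_runs = nil)
  runs = 0
  loop do
    started = Time.now
    begin
      yield
    rescue StandardError => e
      # Log and carry on so one failed run does not kill the daemon.
      warn "periodic task failed: #{e.class}: #{e.message}"
    end
    runs += 1
    break if max_runs && runs >= max_runs
    # Subtract the time the task took so runs stay roughly interval_seconds apart.
    remaining = interval_seconds - (Time.now - started)
    sleep(remaining) if remaining > 0
  end
  runs
end

# In a real script you would load Rails first (config/environment) and then
# call something like: run_periodically(300) { SomeModel.refresh_all }
# (SomeModel.refresh_all is a placeholder, not a real API).
```

Rescuing inside the loop is a design choice: an exception in one run is logged and the next run still happens, which is usually what you want from an unattended process.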

Best regards

Peter De Berdt

I'll chime in with a small variant of Peter's suggestions. If you have a
ruby or other program running, you might want to run it from inittab
to keep it running no matter what.

I vote for crontab calling the shots and calling script/runner.

H

Same here, crontab would be my personal favorite too. It takes a few seconds to start up the rails process, but at least it releases the memory again afterwards. Backgroundrb, being a persistent process, leaked memory for me, which led to server crashes over time (earlier versions, haven’t tried the last one).

Thanks for the help.

What if I wanted it to be working constantly? For example, monitoring
a database table for jobs to process?

Depending on what you mean exactly, you could use triggers in the database (if the processing can happen there)
http://dev.mysql.com/doc/refman/5.0/en/triggers.html

If you’re talking about your rails app saving something to the db and you want to process it without locking a mongrel or the user, you could use backgroundrb, starling or some other asynchronous processor out there. There’s a chapter devoted to such setups in the excellent Advanced Rails Recipes book.
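To make the "monitor a table for jobs" idea concrete, here is a minimal polling sketch. It does not use backgroundrb or starling; it is just a plain loop. The `Job` struct stands in for a database-backed model with a status column, and `JobPoller`, `process_pending`, and `max_polls` are all hypothetical names chosen for this example.

```ruby
# Stand-in for a jobs table; in a Rails app this would be an ActiveRecord
# model with a status column (the name Job is an assumption).
Job = Struct.new(:id, :payload, :status)

class JobPoller
  def initialize(jobs, poll_interval = 5)
    @jobs = jobs
    @poll_interval = poll_interval
  end

  # Handle every pending job once; returns how many jobs were picked up.
  def process_pending
    pending = @jobs.select { |job| job.status == :pending }
    pending.each do |job|
      begin
        yield job
        job.status = :done
      rescue StandardError => e
        # Mark the job failed instead of letting it crash the poller.
        job.status = :failed
        warn "job #{job.id} failed: #{e.message}"
      end
    end
    pending.size
  end

  # Poll repeatedly. max_polls only exists so the sketch can terminate.
  def run(max_polls = nil, &handler)
    polls = 0
    loop do
      handled = process_pending(&handler)
      polls += 1
      break if max_polls && polls >= max_polls
      # Only sleep when the table was empty, so a backlog drains quickly.
      sleep(@poll_interval) if handled.zero?
    end
  end
end
```

Usage would look something like `JobPoller.new(jobs).run { |job| do_the_work(job) }`, where `do_the_work` is whatever processing you need. With several poller processes you would also need some way to claim a job so that two workers do not pick up the same row; that is left out here.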

Best regards

Peter De Berdt

Has anybody used rufus-scheduler? Opinions? I just downloaded it and
am trying to get it to work. I'm somewhat new to rails so the fix is
not obvious.

I need to run a task every 5 minutes that gathers data and adds it to my
database, so the periodic task needs access to ActiveRecord to save
objects.

If anybody is interested, here are the errors. It doesn't seem to be
finding an ActiveRecord object it's trying to access:

running incremental_scan:
trigger() caught exception
A copy of QueryController has been removed from the module tree but is
still active!
/usr/lib/ruby/gems/1.8/gems/activesupport-2.0.2/lib/active_support/dependencies.rb:237:in
`load_missing_constant'
/usr/lib/ruby/gems/1.8/gems/activesupport-2.0.2/lib/active_support/dependencies.rb:469:in
`const_missing'
/home/adminline/cvswatch/app/controllers/query_controller.rb:43:in
`incremental_scan'

The QueryController contains a method called incremental_scan.
Incremental_scan is configured to run every 5 minutes. However when the
scheduler calls it, it throws an exception when it tries to do a
.find(1) from an ActiveRecord object.

Yes, I am just in the process of using rufus-scheduler.

To use ActiveRecord models from your rails application you need to bring in the rails environment. Put your script that uses rufus-scheduler in your rails root application directory. Near the top of my little script:

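# This script lives in the Rails root, so its own directory is the app root.
rails_root = File.expand_path(File.dirname(__FILE__))
RAILS_HOME = rails_root

# Load the full Rails environment so ActiveRecord models are available.
require RAILS_HOME + '/config/environment'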

Then you can use schedules like:

scheduler.schedule_every("1m00s") do
  puts Time.now.to_s + "...popping up every 1 minute..."
  # PostgreSQL interval syntax: delete sessions not updated in the last 20 minutes
  Session.delete_all("updated_at < now()-'20 minutes'::interval")
end

Does this help?

Cheers,
Gary.

Ok, I got it to work using script/runner, but first I had to move my
method from my controller to a model and make it a class method.
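For anyone following along, the shape of that change is roughly as below. This is a sketch under assumptions: the class name `Reading` and the in-memory `records` array are placeholders standing in for a real ActiveRecord model, and the crontab path is made up. The "removed from the module tree" error earlier in the thread is most likely the scheduler holding on to a controller class that Rails had since reloaded; a class method on a model called from a fresh script/runner process avoids that.

```ruby
# Hypothetical model; in a Rails app this would be
# `class Reading < ActiveRecord::Base` in app/models/reading.rb.
class Reading
  @records = []

  class << self
    attr_reader :records

    # Class method: callable without a controller or an HTTP request.
    def incremental_scan(now = Time.now)
      record = { :scanned_at => now }
      records << record   # a real model would save a row here instead
      record
    end
  end
end

# Invocation with the old-style script/runner, e.g. from a crontab entry
# that fires every 5 minutes (the path is a placeholder):
#   */5 * * * * cd /path/to/app && script/runner -e production "Reading.incremental_scan"
```

Each cron run starts a fresh Rails process, which costs a few seconds of startup but gives the memory back when it exits.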

Thanks.

Steve

Hmm, I doubt that BackgrounDRb itself leaked memory. However, newer
versions have an option to make scheduled workers non-persistent: your
worker is started afresh each time its schedule comes due. Just define
your usual configuration file and, in the worker:

class FooWorker < BackgrounDRb::MetaWorker
  set_worker_name :foo_worker
  reload_on_schedule true
end

IMHO BackgrounDRb has the advantage that you can easily monitor the
status of your workers. You wouldn't need a separate crontab entry, and
since everything is under RAILS_ROOT you can manage it with Capistrano
or Vlad.

Everywhere I look, I see people saying negative things about
backgroundrb, about how it is unstable and is a memory leak or adds a
bunch of hassles. I assume this must have been true some time in the
past (perhaps when backgroundrb was actually using drb).

However, I have been using backgroundrb (with scheduling) [latest
version] and it seems stable, easy to use and very low memory. I think
Hemant must have made some radical revisions because nothing bad I hear
about it rings true anymore as far as I know. For one, it doesn't even
use drb.

I would really recommend backgroundrb as it is easy to control through
capistrano and monitor using monit. Also it has full access to the rails
environment, models, etc.

Just my two cents.

Well, I said I didn't use the latest version yet, and if I look in the changelog, it clearly says:

2008-02-28 hemant kumar <hemant@shire>

  * fixed some meory leaks.

My problems had to do with exporting records to csv (with lots of calculations being done on them before exporting). I tested the same report generator from a normal Rails process and it was stable in memory use; handing it over to BackgroundRB caused the server to run out of memory in a matter of days (after it had been running without a hitch for over a year).

That said, I believe backgroundrb is a good solution and the advantages the other posters claim are indeed true. And given that it's easy to integrate and if it fails for you, just as easy to take out again and replace with something else, I see no reason not to try it (I might well give it another shot in the future :-))

Best regards

Peter De Berdt

> Hmm, I doubt that BackgrounDRb itself leaked memory.

> Well, I said I didn't use the latest version yet, and if I look in the
> changelog, it clearly says:
>
> 2008-02-28 hemant kumar <hemant@shire>
>
>   * fixed some meory leaks.
>
> My problems had to do with exporting records to csv (with lots of
> calculations being done on them before exporting). I tested the same
> report generator from a normal Rails process and it was stable in
> memory use, handing it over to BackgroundRB caused the server to run
> out of memory in a matter of days (after it had been running without a
> hitch for over a year).

Gosh, I feel naked. I don't know if you were affected by those bugs.
Anyway, since a fix was posted about 20 days back (and had been in the
git repo for much longer), I think it's okay.

> That said, I believe backgroundrb is a good solution and the
> advantages the other posters claim are indeed true. And given that
> it's easy to integrate and if it fails for you, just as easy to take
> out again and replace with something else, I see no reason not to try
> it (I might well give it another shot in the future :-))

Yup, fully agree. Choose what works for you. Heck, I would dump bdrb in
a zip if it didn't work for me. I just wanted to draw your attention to
the facility for reloading workers on schedule, so that they are not
persistent and get loaded only at the scheduled time.

There were/are a couple of problems around bdrb.

1. Google search about BackgrounDRb refers to documentation/tutorials
which are very old.
2. There have been too many bleeding versions and hence the confusion.
3. I am a lousy and lazy programmer.

We are improving things bit by bit and hope it's as usable for everyone
as it is for you and me.

Gary Doades wrote:

> To use ActiveRecord models from your rails application you need to bring
> in the rails environment. Put your script that uses rufus-scheduler in
> your rails root application directory. Near the top of my little script:

Yes, I learned several things by reading this. Thanks!

Steve