Design Dilemma - Please Help

Hi, I'm new. :wink:

I'm creating a little Rails app that will crawl the web on a regular basis and then show the results.

The crawling will be scheduled, likely a cron job.

I can't wrap my head around where to put my crawler. It doesn't seem to fit.

An example:

Model - News Story

Controllers - Grabs a story from the DB, Sort the Stories, Search the Stories etc.

View - HTML News Story, RSS Story etc.

Then I have a news crawler that will go crawl some feeds for new stories, then insert them into the DB. Where do I put it, and how do I get cron to execute it?

Maybe put it in the NewsStoryController?

Do I make another file for cron to run, that contains something like this:

I'd place it in a class of its own, in a file under lib/, and use script/runner to invoke it.

Crawl would be a class method, so you would invoke it like this:

   script/runner 'NewsStoryController.crawl'

or, in production

   script/runner -e production 'NewsStoryController.crawl'
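To make the "class of its own under lib/" idea concrete, here is a rough sketch, not a definitive implementation. The file name, the NewsCrawler class name, the feed URL, and the regexp-based item parsing are all my own assumptions for illustration; a real RSS parser would be a better fit than the regexps. The block passed to crawl decides how each story is saved, so the crawler itself stays free of model details.

```ruby
# lib/news_crawler.rb (hypothetical file and class name)
require "net/http"
require "uri"

class NewsCrawler
  # Placeholder feed list (assumption); replace with the feeds you want.
  FEED_URLS = ["http://example.com/news.rss"].freeze

  # Default fetcher: downloads the raw feed body over HTTP.
  DEFAULT_FETCH = lambda { |url| Net::HTTP.get(URI.parse(url)) }

  # Pulls title/link pairs out of <item> elements with simple regexps.
  # This is a stand-in for a proper RSS parser and only handles
  # plain, un-nested RSS 2.0 items.
  def self.parse_items(xml)
    xml.scan(%r{<item>(.*?)</item>}m).map do |(body)|
      { :title => body[%r{<title>(.*?)</title>}m, 1].to_s.strip,
        :link  => body[%r{<link>(.*?)</link>}m, 1].to_s.strip }
    end
  end

  # Class method, so script/runner can call it directly.
  # Yields each parsed item to the caller's block (a block is required)
  # and returns how many items the block reported as saved (truthy).
  def self.crawl(feed_urls = FEED_URLS, fetch = DEFAULT_FETCH)
    saved = 0
    feed_urls.each do |url|
      parse_items(fetch.call(url)).each do |item|
        saved += 1 if yield(item)
      end
    end
    saved
  end
end

# Possible invocation (assumes a NewsStory model with title and link columns):
#   script/runner -e production 'NewsCrawler.crawl { |i| NewsStory.create(i) }'
#
# Possible crontab entry, hourly (the /path/to/app directory is a placeholder):
#   0 * * * * cd /path/to/app && script/runner -e production 'NewsCrawler.crawl { |i| NewsStory.create(i) }'
```

Passing the fetcher in as an argument is just a convenience so the class can be tried out with canned feed text instead of hitting the network.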

Awesome...thank you.

An alternative to the suggestion already offered would be to use backgroundrb [1] and to set up your crawler as a worker class in lib/workers.

I’m using that to good effect in an application (not yet launched) which makes heavy use of regular screen scraping.

James.

1: http://backgroundrb.rubyforge.org/