I have written a set of Ruby scripts that populate a database, and I am going to use Rails as a front end for users. Currently they are standalone, with their own lib directory and a few classes. These scripts are run every night using cron/bash.
I am planning on changing the database to conform to ActiveRecord so Rails can easily access it, but I am wondering if it is worth moving the scripts into Rails rather than keeping them standalone. If doing this is a good idea, I don't really know where to start. The key thing is that I need to be able to schedule them to run every night.
Great, just had a look at this, looks like just the ticket.
So I take it I need to run this from the Rails app directory? Where should I put the scripts, vendor maybe? There is at least one class (for screen scraping) that won't be needed from within Rails. Where should I put this? Currently everything is in a module called scrape, so I would put it in the Rails app under lib/scrape, or I could have a lib under vendor. What would be best practice?
I normally put them in the /lib directory.
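For what it's worth, here is a rough sketch of how a module kept under lib/ might be laid out so that one entry point runs every scraper. Only the module name (scrape) comes from this thread; the file path, the Base class, the registry and run_all are all made up for illustration, so treat it as one possible shape rather than the way to do it.

```ruby
# lib/scrape.rb (hypothetical path; only the module name comes from the thread)
module Scrape
  # Hypothetical base class: each site-specific scraper subclasses it
  # and implements #run.
  class Base
    # Register every subclass so run_all can find it later.
    def self.inherited(subclass)
      super
      Scrape.registry << subclass
    end

    def run
      raise NotImplementedError, "#{self.class} must implement #run"
    end
  end

  # List of scraper classes, filled in as subclasses are defined.
  def self.registry
    @registry ||= []
  end

  # Runs each registered scraper once, in order.
  # Returns a hash of class name => result, or the exception it raised,
  # so one failing scraper does not stop the rest.
  def self.run_all
    registry.each_with_object({}) do |klass, results|
      begin
        results[klass.name] = klass.new.run
      rescue StandardError => e
        results[klass.name] = e
      end
    end
  end
end
```

With something like this loaded by the app, a single nightly cron entry could call it through the Rails runner, e.g. `cd /path/to/app && bin/rails runner "Scrape.run_all"` (script/runner in older Rails versions). The path is a placeholder, and this assumes the app is set up to load files from lib/.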
Yes, that was what I thought (after a bit more googling). Would it be OK to create a scripts directory under lib to put the .rb scripts in? Also, I have a module, so I guess I should put this in lib/modulename.
Now a bit of context is probably needed. I will have quite a lot of scripts (possibly 40, each one a web scraper). They do very gentle scraping (i.e. an HTTP request every 15 seconds), so currently I run them concurrently using cron. Doing this all within Rails seems a great idea, but I am worried that the (resource) cost of doing this (initializing 40 Rails environments) will be quite high. I did some googling and could not find out what the cost is. I don't want to put undue stress on the server, so although I like the idea, it may be better to keep them as separate standalone scripts outside Rails. Like I said, I would rather run them inside Rails. What do you think?
Out of curiosity -- what OS are you using that gives you sub-minute
granularity for cron?
Depends on the server, but yes, starting 40 full rails environments
every 15 seconds seems like a bad idea. At this point, you should
probably be thinking in terms of background jobs. Or leaving them
as standalone scripts.
A lot depends on the details -- how close to exact that 15 seconds
needs to be, how to handle timeouts, what you're doing with the data
they're collecting, etc.
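
To make the "one process instead of 40" idea concrete, here is a minimal sketch that runs each scraper in its own thread inside a single Ruby process and pauses between requests. It is plain threads, not any particular background job library, and PoliteRunner, its parameters and the shape of the jobs hash are all invented for the example. It also assumes the 15 seconds is a pause between requests within one scraper, which is only a guess at how the scripts work.

```ruby
# Hypothetical helper: runs several gentle scrapers concurrently in ONE
# process, so any environment setup happens once rather than per scraper.
class PoliteRunner
  # interval: seconds to wait between consecutive requests of one scraper.
  # sleeper:  callable used for waiting; injectable so tests need not sleep.
  def initialize(interval: 15, sleeper: ->(seconds) { sleep(seconds) })
    @interval = interval
    @sleeper = sleeper
  end

  # jobs: hash of name => array of callables, one callable per request.
  # Returns a hash of name => array of results, or the exception raised.
  def run(jobs)
    threads = jobs.map do |name, steps|
      Thread.new do
        begin
          results = []
          steps.each_with_index do |step, index|
            # Wait before every request except the first one.
            @sleeper.call(@interval) if index > 0
            results << step.call
          end
          [name, results]
        rescue StandardError => e
          [name, e]
        end
      end
    end
    # Thread#value waits for the thread and returns the block's last value.
    threads.map(&:value).to_h
  end
end
```

Each callable would wrap one HTTP request in real use. Whether threads are enough, or a proper job queue is the better fit, depends on the details mentioned above (timing accuracy, timeouts, what happens to the data).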