Moving standalone scripts into Rails: architecture

This is going to be a bit of a long one ;) I have a bunch of standalone scripts that I want to move into Rails. They are batch scripts, run from cron, that update the database. Where I live the music listings websites are very incomplete, so I am using web scrapers to collect listings for a listings website. They are very gentle, fetching a page every 15 seconds. There is a .rb script for each venue.

Basically I have three main classes, Scrape, Venue and Event, the latter two having corresponding tables. I have used scaffold to create the tables in a Postgres database; now I need to work out where to put the methods. The Scrape class does not need access to the database, but the Venue and Event classes do. I was planning on putting the Scrape class in a module in lib; there will be other classes eventually, but for now the module will probably only have the one.

This is basically how it works: the Venue class has arrays for all the attributes needed for events (title, date, time, description…), and it uses the Scrape class to collect the raw data for events. The venue arrays are then looped through and loaded into the Event class, which interprets the raw data and writes the event record. It uses a lot of setters, so when, for example, the time is set it is interpreted into HH:MM format (the input can be 10am, 1030am, 1030, 10:30…). There is lots of other loosely formatted data for events, and this is all interpreted using setters. I am not saying this architecture is correct, but it works.
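To make the setter idea concrete, here is a minimal plain-Ruby sketch of a time setter that accepts the loose formats listed above and stores HH:MM. The `Event` class here is a hypothetical stand-in, not the actual class from the scripts, and `TIME_PATTERN` and `normalize_time` are names invented for illustration. It assumes times without am/pm are already in 24-hour form.

```ruby
# Hypothetical plain-Ruby stand-in for the Event class described above.
class Event
  attr_reader :time

  # Hours, optional ":" or "." separator, optional two-digit minutes,
  # optional am/pm suffix.
  TIME_PATTERN = /\A(\d{1,2})[:.]?(\d{2})?\s*(am|pm)?\z/i

  # Setter: interprets the raw text and stores "HH:MM", or nil if the
  # text cannot be understood.
  def time=(raw)
    @time = self.class.normalize_time(raw)
  end

  def self.normalize_time(raw)
    match = TIME_PATTERN.match(raw.to_s.strip)
    return nil unless match

    hour = match[1].to_i
    minute = match[2].to_i # nil.to_i is 0 when minutes are missing
    meridiem = match[3]&.downcase

    hour += 12 if meridiem == "pm" && hour < 12
    hour = 0 if meridiem == "am" && hour == 12
    return nil if hour > 23 || minute > 59

    format("%02d:%02d", hour, minute)
  end
end
```

Keeping the interpretation in a class method means it can be tested on its own and called from whichever setter ends up owning it.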

As an aside I also clean the data as there are lots of white-space characters.

The first step is to move stuff into Rails. At a later date I may change things, but for now I just want to get it working in a similar way to the currently functioning standalone scripts. I will be using rails runner from cron to schedule the scripts.

So do I put the venue and event methods into controllers, or is there somewhere else? The venue and event classes map to tables, which is why this seems the way to go. I want to put them into Rails so I can use Active Record, and also because it feels like the correct thing to do. I then need to work out how to change the setters in Active Record, or maybe not, but at the moment I am simply trying to work out where the methods go.

Why do you want to move it to Rails? Is your goal to present the scraped data on a web page? If you could tell us a bit more about your goal then we'll be able to give you more specific advice.

If you indeed want to present the results on a web page then I recommend that you convert your classes to Active Record models. You can do this without using Rails as Active Record can be used as a standalone library.

Once you're done it'll be much easier to generate a Rails app and build a UI around those models.

Yes, I’m building a website that lists what’s on at local venues. Others have suggested I move the scripts into Rails, and using Active Record certainly appeals to me. I’m very open to suggestions and advice, and if I can use Active Record and the models from my Ruby app and keep the scripts standalone, that may be a way to go.

I am using terms like "outside Rails" and "standalone", but this may be the wrong terminology. It just seems logical to have access to the Rails infrastructure and put stuff there so there is no duplication. Alternatively I could use the Rails infrastructure and keep the batch processing classes separate, with the scripts using both.

Maybe I should be asking a more general question. Let’s give it a go:

If you have a Rails app that you are primarily using to display data collected by web scraping scripts, and you want to leverage the Rails framework (Active Record and anything else that is useful) to batch import data, how should you do this architecturally? Are there any documentation, guides or how-tos that discuss this? I am turning data into information.

I already have classes I have written to do this, and they work as standalone .rb scripts. It seems logical to move the methods into Rails and extend the Active Record classes etc. This may involve restructuring the object model while reusing the methods.

Broadly speaking, the four things I am doing are:

  1. Scraping the data from websites (and loading it into arrays that represent the data on these websites).
  2. Processing this data (the arrays) from the formats the websites use into a standard format (e.g. turning the various ways times are written on the websites into HH:MM). Also splitting up data (often there’s a single string holding both the date and the time, and this needs splitting up into date and time).
  3. Cleaning the data: removing white-space, trailing/leading spaces etc.
  4. Writing the data to a table.
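Steps 2 and 3 can be sketched in plain Ruby roughly as follows. The module and method names are hypothetical, and the split assumes the date comes first and is separated from the time by a comma, which will not hold for every website.

```ruby
require "date"

# Hypothetical helpers for the cleaning and splitting steps; the names
# are illustrative only.
module ListingText
  # Collapse runs of whitespace (including non-breaking spaces, which \s
  # does not match) into single spaces and trim both ends.
  def self.clean(text)
    text.to_s.gsub(/[[:space:]]+/, " ").strip
  end

  # Split a combined string such as "Fri 14 March 2014, 8pm" into a Date
  # and the raw time text. Assumes "date, time" order; Date.parse raises
  # if the date part is missing or unreadable.
  def self.split_date_time(text)
    date_part, time_part = clean(text).split(/\s*,\s*/, 2)
    [Date.parse(date_part), time_part]
  end
end
```

The raw time text returned by `split_date_time` would then go through whatever time interpretation the Event setters already do.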

rails runner has been suggested as part of the solution.

Regards, Ben

Uh. That whole paragraph is such a mish-mash that it leads me to believe that you don't have a clear picture of the MVC pattern that Rails embodies.

For one thing, classes ("models") derived from ActiveRecord are just that, models, *not* controllers. Controllers do not "map to tables".

You might want to, as Colin suggested, run through a tutorial before going any further.

Then decide if it makes sense to combine your data import function with the Rails *data display* function.

I get the MVC pattern now, but I did not when I wrote the standalone scripts (I started by reacquainting myself with Ruby and writing the batch processing). Basically I was trying to get my head around how, if at all, it’s best to do data loading and manipulation in Rails. It seems very geared towards data input in forms. I’m sure there is a good way of doing it; I’m just wondering if there are some guides about how to do it. I’m also thinking that from an OO perspective there should be a VenueWebsite object, so rather than collecting and manipulating the events in the Venue object, that work should live in a separate class.
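One way the VenueWebsite idea could look, as a rough plain-Ruby sketch: Venue only holds venue data, and VenueWebsite knows how to read that venue's listings page. Everything here is an assumption for illustration, including the `fetcher:` argument, the `raw_events` method and the "title | date | time" page format; a real page would need real parsing.

```ruby
# Hypothetical split between venue data and the venue's website.
Venue = Struct.new(:name, :listings_url)

class VenueWebsite
  # fetcher is any object responding to call(url) and returning the page
  # text, so the real scraper can be swapped for a stub in tests.
  def initialize(venue, fetcher:)
    @venue = venue
    @fetcher = fetcher
  end

  # Returns one hash of raw, uninterpreted attributes per listing.
  # Assumes one listing per line in the form "title | date | time".
  def raw_events
    lines = @fetcher.call(@venue.listings_url).each_line.map(&:strip).reject(&:empty?)
    lines.map do |line|
      title, date, time = line.split("|").map(&:strip)
      { venue: @venue.name, title: title, date: date, time: time }
    end
  end
end

# Usage with a stubbed page instead of a live request:
page = "Jazz Night | 14 March 2014 | 8pm\nOpen Mic | 15 March 2014 | 1930\n"
venue = Venue.new("The Cellar", "http://example.com/listings")
site = VenueWebsite.new(venue, fetcher: ->(_url) { page })
events = site.raw_events
```

The raw hashes can then be handed to Event for interpretation, which keeps the scraping concerns out of both Venue and Event.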

Anyway, learning Ruby and Rails is enough for now. As the scripts work well, I’ve decided to park looking at moving them into Rails until I know it better.

Thanks for everyone’s help.

Since I don’t have much information about your scripts, I’ll just guess some things.

If the scripts are proven to work well (especially if you have unit tests for them), I would recommend keeping them as they are, or moving them to a gem and using them inside your Rails app.

A very simple, proof of concept example:

Your already existing class

    class ScrapeWebsite
      def initialize(url)
        @url = url
      end

      def call
        # Logic for parsing the page
        # returns a string with the content
      end
    end

Controller

    class EventsController < ApplicationController
      def index
        @events = Event.all
      end

      def create
        data = ScrapeWebsite.new(params[:url]).call
        Event.create(content: data)

        flash[:success] = 'Data retrieved.'
        # redirect_to :back was removed in Rails 5.1;
        # use redirect_back(fallback_location: ...) there.
        redirect_to :back
      end
    end

ActiveRecord model

    class Event < ActiveRecord::Base
      validates :content, presence: true
    end
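Since the scripts in this thread run from cron rather than from a web request, the same scraper class could also be driven from a small batch object instead of a controller action. The sketch below is only an illustration: `EventImporter` and its arguments are hypothetical names, and it assumes the scraper exposes `new(url).call` and the model exposes `create(attributes)` as in the example above.

```ruby
# Hypothetical batch entry point; not part of Rails.
class EventImporter
  # scraper_class: anything with new(url).call, e.g. ScrapeWebsite
  # store: anything with create(attributes), e.g. the Event model
  def initialize(urls, scraper_class:, store:)
    @urls = urls
    @scraper_class = scraper_class
    @store = store
  end

  # Scrapes each URL in turn and writes one record per page.
  # Returns the number of pages processed.
  def call(pause: 15)
    @urls.each_with_index do |url, index|
      sleep(pause) if index.positive? # stay gentle on the remote site
      @store.create(content: @scraper_class.new(url).call)
    end
    @urls.size
  end
end
```

Inside a Rails app this could be started from cron with something like `bin/rails runner 'EventImporter.new(urls, scraper_class: ScrapeWebsite, store: Event).call'`, where `urls` is whatever list of venue pages you maintain. Passing the scraper and the store in as arguments keeps the class testable without a database or network.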