rake tasks

I'm not really creating a rails app, but may integrate this with rails
down the road.

I've adapted rss2mysql.rb from:

Practical Ruby Gems
http://www.apress.com/book/view/9781590598115

Chapter 10, parsing feeds

but am tweaking everything. Currently I have a rakefile, but am a bit
unclear about what sort of tasks go in there. For instance, grabbing rss
data from the interwebs could be a task, so maybe it should go in the
rake file?

Also, should the rakefile have provisions to add/drop tables?

I started to put some of the db interactions into the rakefile, and then
thought that maybe everything in this file (script?) belongs in a
rakefile. Where do you draw the line?

thufir@ARRAKIS:~/rb$
thufir@ARRAKIS:~/rb$ cat rss2mysql.rb
require 'rubygems'
require 'active_record'
require 'feed_tools'
require 'yaml'

db = YAML.load_file("database.yml")
ActiveRecord::Base.establish_connection(
  :adapter => db["adapter"],
  :host => db["host"],
  :username => db["username"],
  :password => db["password"],
  :database => db["database"])
class Item < ActiveRecord::Base
end

unless Item.table_exists?
  ActiveRecord::Schema.define do
    create_table :items do |t|
      t.column :title, :string
      t.column :content, :string
      t.column :source, :string
      t.column :url, :string
      t.column :timestamp, :timestamp
      t.column :keyword_id, :integer
      t.column :guid, :string
      t.column :html, :string
    end
  end
end

puts "connected"

#feed = FeedTools::Feed.open('http://www.slashdot.org/index.rss')
feed = FeedTools::Feed.open('www.amazon.com/rss/tag/blu-ray/new')

feed.items.each do |feed_item|
  if not (Item.find_by_title(feed_item.title) \
    or Item.find_by_url(feed_item.link) \
    or Item.find_by_guid(feed_item.guid))
    puts "processing item '#{feed_item.title}' - new"

    # Item (singular) is the ActiveRecord class defined above
    Item.new do |newitem|
      newitem.title = feed_item.title.gsub(/<[^>]*>/, '')
      newitem.guid = feed_item.guid
      if feed_item.publisher.name
        newitem.source = feed_item.publisher.name
      end
      newitem.url = feed_item.link
      newitem.content = feed_item.description
      newitem.timestamp = feed_item.published
      newitem.save
    end
  else
    puts "processing item '#{feed_item.title}' - old"
  end
end

thufir@ARRAKIS:~/rb$

Also, what sort of IDE is popular for ruby? Eclipse?

thanks,

Thufir

Rake, like its namesake Make, is useful for organizing complex tasks
that typically have other complex tasks as preconditions. It's
typically used to organize "build cycle" stages which are passed
through on the way to presenting the completed project (document,
test, application, installation, ...).

The original make program was useful for two reasons. First, it
provided an automated build cycle for complex software builds (make
unix, make install unix) that also documented the build process
(assuming you understood make syntax). Second, the build cycle it
produced rebuilt only the pieces that were out of date. This was very
important when computers had not yet reached 1 MIPS and swap space was
implemented by write/read cycles to 1/2 inch tape.
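
Rake keeps that "only rebuild what is out of date" behaviour through
its file tasks. Here is a minimal sketch, run as a plain Ruby script
rather than from a Rakefile. The file names, the BUILDS counter and
the build helper are made up for illustration, and it assumes the rake
gem is installed:

```ruby
require 'rake'
require 'tmpdir'

dir = Dir.mktmpdir
SRC = File.join(dir, 'notes.txt')          # hypothetical source file
OUT = File.join(dir, 'notes.upcase.txt')   # hypothetical build product
BUILDS = []                                # records each real rebuild

File.write(SRC, "hello\n")

# A file task runs only if OUT is missing or older than its prerequisite.
file OUT => SRC do |t|
  File.write(t.name, File.read(t.prerequisites.first).upcase)
  BUILDS << Time.now
end

# Helper (hypothetical): lets one script ask for the target repeatedly,
# the way repeated `rake` runs from the shell would.
def build
  Rake::Task[OUT].reenable
  Rake::Task[OUT].invoke
end

build                          # OUT is missing, so the block runs
build                          # OUT is up to date, so nothing happens

later = Time.now + 60
File.utime(later, later, SRC)  # pretend the source was edited
build                          # SRC is now newer than OUT, so it rebuilds

puts "rebuilt #{BUILDS.size} times"
```

Three requests for the target lead to two actual rebuilds; the middle
one is skipped because the output is already newer than its source.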

If you have a rubot in your apartment, Rake would be a logical choice
for organizing its chores. I'd use cron to schedule things like:

rake eggs toast coffee oj breakfast
rake whites laundry
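
As a rough sketch of how the breakfast chores could be wired up, the
tasks below make breakfast depend on the others, so asking for
breakfast alone runs everything in order. The task bodies and the
COOKED list are placeholders invented for this example; it is written
as a plain Ruby script and assumes the rake gem is installed:

```ruby
require 'rake'

COOKED = []  # hypothetical trace of what ran, for illustration only

# Each chore is its own task.
[:eggs, :toast, :coffee, :oj].each do |chore|
  task chore do
    COOKED << chore
  end
end

# breakfast lists the chores as prerequisites, so Rake runs them first.
task :breakfast => [:eggs, :toast, :coffee, :oj] do
  COOKED << :breakfast
end

# Same effect as running `rake breakfast` from the shell.
Rake::Task[:breakfast].invoke
p COOKED
```

In a real Rakefile the last two lines would be dropped and the task
definitions alone would be enough.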

Emacs and Vim are two open-source IDEs that get my vote for all things
Ruby and Rails.

I broke that code into three rake tasks:

thufir@ARRAKIS:~/rb$
thufir@ARRAKIS:~/rb$ rake --tasks
(in /home/thufir/rb)
rake connect # connects to db
rake exists # creates table unless it already exists
rake populate # gets rss data
thufir@ARRAKIS:~/rb$

so that exists and populate depend upon the connect task. Is this
reasonably compatible with RoR?
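
For reference, the dependency layout described above might look
roughly like this. It is only a sketch: the task bodies are
placeholders (the real ones would hold the ActiveRecord and FeedTools
calls from rss2mysql.rb), and STEPS is a made-up trace used to show
the order in which things run:

```ruby
require 'rake'

STEPS = []  # hypothetical trace; real tasks would talk to the db

desc 'connects to db'
task :connect do
  # ActiveRecord::Base.establish_connection(...) would go here
  STEPS << :connect
end

desc 'creates table unless it already exists'
task :exists => :connect do
  # the ActiveRecord::Schema.define block would go here
  STEPS << :exists
end

desc 'gets rss data'
task :populate => :connect do
  # the feed.items.each loop would go here
  STEPS << :populate
end

# Same effect as running `rake exists populate` from the shell.
Rake::Task[:exists].invoke
Rake::Task[:populate].invoke
p STEPS
```

Rake invokes a given task at most once per run, so connect runs a
single time even though both other tasks list it as a prerequisite.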

thanks,

Thufir

Right, in this case I'm automating creating, destroying and populating
a db. Is that inappropriate somehow?

-Thufir