initialization of new model objects in Rails

Hey guys… having an issue that hopefully someone can help with.

I have an ActiveRecord model called WebPage with two fields, url and title. I want the title to be determined by parsing the HTML (I'm using Nokogiri), and this is where I'm having issues.

My code looks something like:

class WebPage < ActiveRecord::Base
  attr_accessible :url, :title, :doc

  def doc
    @doc ||= Nokogiri::HTML(open(@url))
  end

  def title
    title = @doc.css('title')
  end
end

What happens is that when I run this and try to create a new object (@page = WebPage.new(@url)), I get:

ActionView::Template::Error (undefined method `css' for nil:NilClass)

Now, if I set @doc in my controller, rename title to set_title, and call @page.title = @page.set_title, it works. But that is very ugly, and if I've learned anything from Rails, it's that if something looks ugly to start with, it's probably not the right way.

What am I doing wrong?

You're reading the @doc instance variable directly in title, so your doc() method never runs and @doc is never initialized; it is still nil when you call css on it. I believe you can fix this by removing the @ before doc in your title method, so that title calls doc instead. You still get the benefit of "memoizing" the value (you'll only look it up once, no matter how many times you ask for it).
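To make the difference concrete, here is a minimal plain-Ruby sketch (no Rails or Nokogiri needed). FakePage, expensive_parse, and the two title_via_* methods are hypothetical names made up for illustration; they only stand in for WebPage and the Nokogiri parse.

```ruby
# Hypothetical stand-in for the model; not the original WebPage class.
class FakePage
  attr_reader :parse_count

  def initialize
    @parse_count = 0
  end

  # Memoized reader: the right-hand side runs only while @doc is nil.
  def doc
    @doc ||= expensive_parse
  end

  # Reads the instance variable directly, so nothing ever initializes it.
  def title_via_ivar
    @doc.upcase
  end

  # Goes through the reader, which initializes @doc on first use.
  def title_via_method
    doc.upcase
  end

  private

  # Stand-in for the slow fetch-and-parse step; counts how often it runs.
  def expensive_parse
    @parse_count += 1
    "parsed document"
  end
end

page = FakePage.new

begin
  page.title_via_ivar
rescue NoMethodError => e
  puts "direct @doc access failed: #{e.class}"
end

puts page.title_via_method   # first call triggers the parse
puts page.title_via_method   # second call reuses the memoized value
puts page.parse_count        # the parse ran exactly once
```

The direct instance-variable read fails the same way the original title method did, while the version that calls doc works and only parses once.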

One other thing to consider here: doc.css always returns an array-like NodeSet rather than a single Node. To get one node you can use either doc.css('title').first or doc.at_css('title'), which do the same thing. And if you want the content of the title, you need to ask for it. Otherwise you get a Node, and if it happens to hand back its own string contents, that's pure coincidence.

title = doc.at_css('title').content

Walter

Wow… you have no idea how long I’ve been staring at this. That’s exactly what it was… changed “@doc” to “doc” and that did it.

I do have some code for parsing out the NodeSet… just removed it for simplicity in my code. However, my code isn’t as clean as what you have so I’ll play around with it some.

Thanks a bunch!