May AR objects contain "invalid" data when not saved yet?

Hi all

I have a question about good ActiveRecord style. I have a model Page
with two attributes, named "title" and "body". The title is never set by
the user directly, but is extracted from the body, which includes an
HTML structure like the following:

<h1>heading 1.1</h1>
<p>content</p>
<h2>heading 2.1</h2>
<p>content 2</p>
<h2>heading 2.2</h2>
<p>content 2</p>
...etc...

The page model should automatically extract the content from the H1
element(s) and store them in the attr_protected title attribute.

At the moment I do this the following way:

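...
  # Refresh the derived title before each validation run.
  def before_validation
    self.title = detect_title
  end
private
  # Join the text of every <h1> in the body into one sentence-style string.
  def detect_title
    Nokogiri::HTML(self.body).xpath('//h1').collect(&:content).to_sentence
  end
...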

This works great so far. But I noticed that this way the title in any
Page object is only correct after a call to valid? or save (or any
other method that triggers validation). So until then, it's possible
that I have invalid data in a Page object!

My question: is this OK? Or should I override the body=() method or
something to make the "magic" happen as soon as the body itself changes?

Or is it agreed that modified AR objects can be temporarily invalid
as long as they haven't been saved/validated?

Thanks a lot for your help :)
Josh

If the title is always determined by the contents of other fields then
arguably you should not store it at all as then you have redundant
data in the db. Just provide a method called title that calculates it
when required. If you want to store it for reasons of efficiency then
leave it till efficiency becomes an issue.
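
A minimal sketch of that idea, as a plain Ruby class so it runs on its
own. Assumptions: no ActiveRecord here, and a naive regex stands in for
the Nokogiri call from the original post (a real model would keep the
Nokogiri version); the plain ", " join replaces to_sentence.

```ruby
# Plain-Ruby stand-in for the Page model (no ActiveRecord, no Nokogiri).
class Page
  attr_accessor :body

  def initialize(body = nil)
    @body = body
  end

  # Derived on demand: there is no stored title that could go stale.
  def title
    h1_texts.join(", ")
  end

  private

  # Naive regex stand-in for Nokogiri::HTML(body).xpath('//h1').
  def h1_texts
    body.to_s.scan(%r{<h1[^>]*>(.*?)</h1>}im).flatten.map(&:strip)
  end
end

page = Page.new("<h1>heading 1.1</h1><p>content</p>")
puts page.title   # prints "heading 1.1"
page.body = "<h1>new heading</h1>"
puts page.title   # prints "new heading"
```

The trade-off is that the body is parsed on every call to title.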

Colin

While generally I agree with the idea of "don't optimize prematurely",
there's a line between that idea and "don't write clearly slow code".
I think overriding body= here is a good idea, since the
before_validation callback will fire every time the record is saved,
even if body hasn't changed. Normally that's OK, but instantiating a
Nokogiri parser is quite a bit heavier than the typical validation
action...
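
Roughly what I mean, as a plain Ruby sketch so it runs without Rails.
Assumptions: in a real ActiveRecord model the setter would store the
value with write_attribute(:body, html) instead of an instance variable,
and the regex below is only a stand-in for the Nokogiri call.

```ruby
# Plain-Ruby stand-in for the Page model (no ActiveRecord, no Nokogiri).
class Page
  attr_reader :body, :title

  # Setter override: the title is refreshed the moment the body changes,
  # so the body is parsed once per assignment, not once per save.
  def body=(html)
    @body = html            # ActiveRecord: write_attribute(:body, html)
    @title = detect_title
  end

  private

  # Naive regex stand-in for Nokogiri::HTML(body).xpath('//h1').
  def detect_title
    body.to_s.scan(%r{<h1[^>]*>(.*?)</h1>}im).flatten.map(&:strip).join(", ")
  end
end

page = Page.new
page.body = "<h1>heading 1.1</h1><p>content</p>"
puts page.title   # prints "heading 1.1", before any valid?/save call
```

This also answers the original concern: the object never holds a body
with an out-of-date title.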

--Matt

> While generally I agree with the idea of "don't optimize prematurely",
> there's a line between that idea and "don't write clearly slow code".

It is possible I skimmed the question rather too quickly and did not
notice the complexities of determining the title.

Colin