May AR objects contain "invalid" data when not saved yet?

Hi all

I have a question about good ActiveRecord style. I have a model Page with two attributes, named "title" and "body". The title is never set by the user directly, but is extracted from the body, which includes an HTML structure like the following:

<h1>heading 1.1</h1>
<p>content</p>
<h2>heading 2.1</h2>
<p>content 2</p>
<h2>heading 2.2</h2>
<p>content 2</p>
...etc...

The Page model should automatically extract the content of the H1 element(s) and store it in the attr_protected title attribute.

At the moment I do this the following way:

...
  def before_validation
    self.title = detect_title
  end
  private
  def detect_title
    Nokogiri::HTML(self.body).xpath('//h1').collect(&:content).to_sentence
  end
...

This works great so far. But I noticed that this way the title of a Page object is only correct after valid? or save (or a related method) has been called. Before that, it's possible that I have invalid data in a Page object!

My question: is this OK? Or should I override the body=() method or something to make the "magic" happen as soon as the body itself changes?

Or is it agreed that modified AR objects can be temporarily invalid as long as they haven't been saved/validated?

Thanks a lot for your help :)

Josh

If the title is always determined by the contents of other fields then arguably you should not store it at all as then you have redundant data in the db. Just provide a method called title that calculates it when required. If you want to store it for reasons of efficiency then leave it till efficiency becomes an issue.

Colin
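A minimal sketch of the "don't store it, compute it" idea, written as a plain Ruby class so it runs without Rails. The Page class here is a hypothetical stand-in for the real model; the regex scan replaces the Nokogiri call and join(", ") replaces to_sentence, both only to keep the example self-contained.

```ruby
# Hypothetical plain-Ruby stand-in for the Page model (no ActiveRecord).
class Page
  attr_accessor :body

  def initialize(body = "")
    @body = body
  end

  # Derived from body on every call, so it can never be out of date.
  # Assumption: a simple regex stands in for Nokogiri's //h1 lookup.
  def title
    headings = body.to_s.scan(%r{<h1[^>]*>(.*?)</h1>}mi).flatten
    headings.map(&:strip).join(", ")
  end
end

page = Page.new("<h1>heading 1.1</h1> <p>content</p>")
puts page.title   # heading 1.1

page.body = "<h1>new heading</h1> <p>other content</p>"
puts page.title   # new heading
```

The trade-off is that the body is parsed on every call to title, and there is no title column to query or sort by.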

While generally I agree with the idea of "don't optimize prematurely", there's a line between that idea and "don't write clearly slow code". I think overriding body= here is a good idea, since the before_validation callback will fire every time the record is saved, even if body hasn't changed. Normally that's OK, but instantiating a Nokogiri parser is quite a bit heavier than the typical validation action...

--Matt
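A rough sketch of the body= approach, again as a plain Ruby class rather than a real ActiveRecord model. Page is a hypothetical stand-in, and the regex and join(", ") replace Nokogiri and to_sentence only so the example runs on its own. In an actual model the writer would probably store the value with write_attribute(:body, value) or super instead of an instance variable; that part is an assumption about your Rails version.

```ruby
# Hypothetical plain-Ruby stand-in for the Page model (no ActiveRecord).
class Page
  attr_reader :body, :title

  def initialize(body = nil)
    self.body = body
  end

  # Custom writer: the title is refreshed the moment body is assigned,
  # so the object is never in a state where the two disagree.
  def body=(value)
    @body = value
    @title = detect_title
  end

  private

  # Assumption: a simple regex stands in for Nokogiri's //h1 lookup.
  def detect_title
    body.to_s.scan(%r{<h1[^>]*>(.*?)</h1>}mi).flatten.map(&:strip).join(", ")
  end
end

page = Page.new("<h1>heading 1.1</h1> <p>content</p>")
puts page.title   # heading 1.1

page.body = "<h1>first</h1> <p>x</p> <h1>second</h1>"
puts page.title   # first, second
```

With this shape the parsing cost is paid once per assignment to body, not on every validation or save.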

> While generally I agree with the idea of "don't optimize prematurely", there's a line between that idea and "don't write clearly slow code".

It is possible I skimmed the question rather too quickly and did not notice the complexities of determining the title.

Colin