ActiveRecord::Base.serialize and a model in a plugin

All right, I posted about this yesterday and I have to believe the problem is not with AR's serialize method but with my particular circumstance.

The two obvious issues are that

(1) I'm serializing a hash of 15 objects that contain seven strings apiece. The objects were Structs, but in the interest of pulling the definition out of a module and putting it in app/models, I've since turned it into a non-AR Class because I thought maybe the issue was that the Struct was defined in a library module.

models/idx_row.rb:

class IdxRow
  attr_accessor :image_url, :price, :beds, :baths, :sqft, :mls_id, :mls_url

  def initialize
    @image_url = ""
    @price = ""
    @beds = ""
    @baths = ""
    @sqft = ""
    @mls_id = ""
    @mls_url = ""
  end
end

(2) The model I have the serialized attribute on is the Comment class created by acts_as_commentable. The basic class definition is in the plugin, not in app/models. I looked through the code and it looked like the proper way to extend the class was through a module of my own, so I created lib/commentable_extensions.rb and put it there. I added a TEXT column called "feed", and wrote the following:

lib/commentable_extensions.rb:

module CommentableExtensions
  module Juixe
    module Acts #:nodoc:
      module Commentable #:nodoc:
        module ClassMethods
          def acts_as_commentable
            serialize :feed
          end
        end
      end
    end
  end
end

I've tried a few ways of loading the extension, currently an include in application.rb, and when I load a Comment into c and assign one of my hashes to c.feed, I get the same old SystemStackError when I attempt c.save:

SystemStackError: stack level too deep
        from c:/ruby/lib/ruby/1.8/yaml/rubytypes.rb:168:in `to_yaml'
        from c:/ruby/lib/ruby/1.8/yaml.rb:387:in `quick_emit'
        from c:/ruby/lib/ruby/1.8/yaml/rubytypes.rb:164:in `to_yaml'
        from c:/ruby/lib/ruby/1.8/yaml/rubytypes.rb:41:in `to_yaml'
        from c:/ruby/lib/ruby/1.8/yaml/rubytypes.rb:40:in `to_yaml'
        from c:/ruby/lib/ruby/1.8/yaml/rubytypes.rb:39:in `to_yaml'
        from c:/ruby/lib/ruby/1.8/yaml.rb:387:in `quick_emit'.....

Though it's possible that my object is simply too complex for #to_yaml's tiny brain, I'm reluctant to think that's the case with what seems to me such a simple data structure. At this point I'm wondering if the problem is the way I'm trying to declare the serialization of :feed. Is there some other place or way I'm supposed to do this?
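
On the question of where to declare the serialization: one thing that can be checked in plain Ruby is what the nested module above actually defines. The sketch below uses stand-in modules and a stand-in Comment class (none of this is the real plugin or ActiveRecord code, and the `serialize` here only records the call). It suggests that wrapping the nesting in CommentableExtensions creates a brand-new namespace rather than reopening the plugin's, and shows reopening the class directly as one possible alternative. Whether that works in a given Rails app depends on load order, which is an assumption here.

```ruby
# Stand-in for the plugin's namespace (hypothetical, not the real plugin code).
module Juixe
  module Acts
    module Commentable
      module ClassMethods
        def acts_as_commentable
          :original
        end
      end
    end
  end
end

# The same nesting wrapped in an outer module, as in lib/commentable_extensions.rb.
module CommentableExtensions
  module Juixe
    module Acts
      module Commentable
        module ClassMethods
          def acts_as_commentable
            :extension
          end
        end
      end
    end
  end
end

plugin    = Juixe::Acts::Commentable::ClassMethods
extension = CommentableExtensions::Juixe::Acts::Commentable::ClassMethods

# The two constants name different modules, so the plugin's is left untouched.
puts plugin.equal?(extension)

# Stand-in for an ActiveRecord model; this `serialize` only records the call.
class Comment
  def self.serialized_attributes
    @serialized_attributes ||= []
  end

  def self.serialize(attr)
    serialized_attributes << attr
  end
end

# Reopening the class itself runs the declaration on Comment directly.
Comment.class_eval do
  serialize :feed
end

puts Comment.serialized_attributes.inspect
```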

Thanks.

Much as I have trouble believing it, I'm becoming convinced that my problem really, truly is that AR's serialize method or the #to_yaml method behind it simply can't handle the hypercomplexity of a hash of fifteen structs of seven strings. My code shown above serializes tiny, simple objects like a champ (a single, small array or a single struct? no problem). But when given 6 kilobytes of text nested two levels deep, #to_yaml pops a gasket.

Either I'm missing something obvious, or this is kinda lame. I saw a closed ticket in Rails Trac from two years ago in which I think Jamis submitted a patch that extended AR Serialize to use Marshal and a binary column optionally in place of YAML because of this very sort of #to_yaml explosion whenever it was asked to do anything meaningful. Anyone know if this is present and undocumented, or something that fell by the wayside?

Okay, working through it myself. I can serialize a struct, but I can't seem to serialize my custom, non-AR classes at all. This seems to be the Insurmountable Problem. Am I crazy for wanting to serialize my own classes and not just ruby core classes?

That is, I cannot say:

r = IdxRow.new
r.price = "$300"
c.feed = r
c.save

...but I can say

x = Struct.new(:foo, :bar, :baz, :feh, :fez, :fex)
r = x[1,2,3,4,5,6]
c.feed = r
c.save

I keep narrowing it down. Now I've banished my nice Struct and the less-nice custom IdxRow class I created in its place, and instead I'm putting my data into a single hash containing 15 hashes of 7 strings apiece. No Procs in sight, nothing fancy, just a hash of hashes of strings. I figured this would work for sure.

Nope. AR Serialize exploded like it always does when given anything more complex than a one-liner.

Also, attempting Marshal.dump on this hash of hashes gives me an error message that helpfully tells me I can't dump a Proc. Which would be more helpful if I could see how a vanilla hash of vanilla hashes of vanilla strings is being seen as containing a Proc.
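
For reference, Marshal.dump walks everything reachable from the object it is given, instance variables included, so a Proc does not have to be a direct member of the hash to trigger that error. A minimal sketch of how that can happen, using two hypothetical classes (FakeParser and TaggedString are invented for illustration and are not from any library):

```ruby
# Hypothetical stand-in for a parser object that keeps a callback around.
class FakeParser
  def initialize
    @on_text = proc { |text| text.strip }
  end
end

# Hypothetical String subclass that remembers which parser produced it.
class TaggedString < String
  attr_accessor :parser
end

value = TaggedString.new("$300")
value.parser = FakeParser.new

row = { 'price' => value }

# Prints like an ordinary hash with one string value.
puts row.inspect

begin
  Marshal.dump(row)
rescue TypeError => e
  # The Proc is two hops away: row -> value.parser -> @on_text.
  puts "Marshal.dump failed: #{e.message}"
end
```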

hatless wrote:

Thanks for the help. Hmmm. I build my Hash of Hashes this way:

myhash = {}
i = 1
for row in [1,2,3]
  temphash = {}
  temphash['foo'] = "bar"
  myhash[i] = temphash
  i += 1
end

This code works. My actual code, which assigns string values to a bunch more elements of the inner hashes in the same way, makes #to_yaml and Marshal.dump() break.

The generated Hash and its members don't contain any object identifiers. There are no custom constructors. Like the above code, the real method results in an ordinary looking hash with integer keys and Hash values, and inner Hashes with String keys and values.

Here's my actual code that returns the hash. I've only changed the names of my keys so I don't give away whose HTML I'm scraping:

def fetch_idx_hash(obj)
  counter = 1 # 1..n instead of starting with zero, used as the hash index.
  myhash = {}
  page = fetch_page(obj)
  soup = BeautifulSoup.new page
  soup.html.body.center.find_all('table')[5].find_all('tr').each do |tr|
    working_row = {}
    if tr.find_all('td')[1].string.include? "$"
      working_row['price'] = tr.find_all('td')[1].string
    end
    unless tr.find_all('td')[0].a.nil?
      working_row['fruit'] = tr.find_all('td')[0].a.string
      working_row['vegetable'] = "http://www.foo.com" + tr.find_all('td')[0].a['href']
    end
    working_row['starch'] = tr.find_all('td')[3].string
    working_row['meat'] = tr.find_all('td')[4].string
    working_row['cake'] = tr.find_all('td')[5].string
    # This must be the last member populated. Rows w/o images get discarded.
    unless tr.img.nil?
      working_row['image_url'] = tr.img['src']
      myhash[counter] = working_row
      counter += 1
    end
  end
  return myhash
end

Calling to_yaml on the resulting hash blows up with a SystemStackError. Calling Marshal.dump() carps about a Proc. I'm wondering if after everything I've done to simplify the data structure, it's just too big. Incidentally, none of the member strings exceed 300 characters, and most are small. A typical myhash.to_s.size is about 7K.

zdennis wrote:

All right! Solved!

Turns out some of the methods in RubyfulSoup -- the RubyfulSoup#string method, in fact -- don't return Strings. They return NavigableStrings, as I discovered when I ran

  myhash[1].values.each {|v| puts "#{v}: #{v.class}"}
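
The same check can be generalized to the whole nested structure. This is only a sketch: LookalikeString is a hypothetical stand-in for a String subclass from a parsing library, and the list of "plain" types is an assumption about what should be allowed in the feed.

```ruby
# Hypothetical stand-in for a String-like class returned by a parsing library.
class LookalikeString < String; end

# Leaf classes considered safe to serialize (an assumption for this sketch).
PLAIN_TYPES = [String, Integer, Float, Symbol, NilClass, TrueClass, FalseClass]

# Walk nested Hashes and Arrays and return [path, class] for every leaf
# whose exact class is not one of PLAIN_TYPES. Hash keys are not checked.
def suspicious_leaves(obj, path = [])
  if obj.instance_of?(Hash)
    obj.flat_map { |key, value| suspicious_leaves(value, path + [key]) }
  elsif obj.instance_of?(Array)
    obj.each_with_index.flat_map { |value, i| suspicious_leaves(value, path + [i]) }
  elsif PLAIN_TYPES.include?(obj.class)
    []
  else
    [[path, obj.class]]
  end
end

myhash = {
  1 => { 'price' => "$300", 'fruit' => LookalikeString.new("apple") },
  2 => { 'price' => "$450", 'fruit' => "pear" }
}

suspicious_leaves(myhash).each do |path, klass|
  puts "#{path.inspect}: #{klass}"
end
```

Comparing `obj.class` exactly, rather than using `is_a?`, is the point of the exercise: a subclass of String passes `is_a?(String)` and prints like a String, which is what hid the problem here.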

All was good once I changed things like:

  working_row['fruit'] = tr.find_all('td')[0].a.string

to

  working_row['fruit'] = tr.find_all('td')[0].a.string.to_s

ARGH!
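
A sketch of the same fix applied to a whole row at once, instead of adding .to_s at every assignment. SoupString is a hypothetical stand-in for NavigableString, and the Proc stored on it is only an assumption about the kind of baggage such an object might carry. In current Ruby, String#to_s called on an instance of a String subclass returns a plain String.

```ruby
# Hypothetical stand-in for a String subclass that carries extra state.
class SoupString < String
  attr_accessor :origin
end

# Return a copy of the row in which every String-like value is a plain String.
def plain_strings(row)
  row.each_with_object({}) do |(key, value), clean|
    clean[key] = value.is_a?(String) ? value.to_s : value
  end
end

raw = { 'fruit' => SoupString.new("apple"), 'price' => "$300" }
raw['fruit'].origin = proc { :parser_state }

clean = plain_strings(raw)
clean.each { |key, value| puts "#{key}: #{value.class}" }

# The cleaned row survives a Marshal round trip; the raw one would not,
# because the Proc is reachable through the SoupString's instance variable.
restored = Marshal.load(Marshal.dump(clean))
puts restored == clean
```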

Now myhash serializes fine. I'm surprised it doesn't seem to deserialize automagically and I have to YAML.load() it, but that's for another thread (or not).
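
For anyone following along, the manual round trip looks roughly like this in plain Ruby. The feed contents are made up for illustration. When a `serialize` declaration is in effect on the model, ActiveRecord is expected to do the loading on read, so having to call YAML.load by hand may be a hint that the declaration never ran on Comment; that is a guess, not something confirmed in this thread.

```ruby
require 'yaml'

# Made-up data in the same shape as the scraped feed.
feed = {
  1 => { 'price' => "$300", 'image_url' => "http://www.foo.com/a.jpg" },
  2 => { 'price' => "$450", 'image_url' => "http://www.foo.com/b.jpg" }
}

stored = feed.to_yaml        # a String, roughly what lands in the TEXT column
loaded = YAML.load(stored)   # the step that had to be done by hand

puts stored.class
puts loaded == feed
```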

It didn't help that the culprit objects rendered like normal Strings. Thanks again for the help and thanks to anyone sitting through a 5-message thread of me talking myself through it.
