ActiveRecord::Base.serialize and a model in a plugin

All right, I posted about this yesterday and I have to believe the
problem is not with AR's serialize method but with my particular
circumstance.

The two obvious issues are that

(1) I'm serializing a hash of 15 objects that contain seven strings
apiece. The objects were Structs, but in the interest of pulling the
definition out of a module and putting it in app/models, I've since
turned it into a non-AR Class because I thought maybe the issue was
that the Struct was defined in a library module.

models/idx_row.rb:

class IdxRow
  attr_accessor :image_url, :price, :beds, :baths, :sqft, :mls_id, :mls_url

  def initialize
    @image_url = ""
    @price = ""
    @beds = ""
    @baths = ""
    @sqft = ""
    @mls_id = ""
    @mls_url = ""
  end
end

(2) The model I have the serialized attribute on is the Comment class
created by acts_as_commentable. The basic class definition is in the
plugin, not in app/models. I looked through the code and it looked like
the proper way to extend the class was through a module of my own, so I
created lib/commentable_extensions.rb and put it there. I added a TEXT
column called "feed", and wrote the following:

lib/commentable_extensions.rb:

module CommentableExtensions
  module Juixe
    module Acts #:nodoc:
      module Commentable #:nodoc:
        module ClassMethods
          def acts_as_commentable
            serialize :feed
          end
        end
      end
    end
  end
end
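
A side note on the module above, offered as a guess rather than a diagnosis: in plain Ruby, nesting "module Juixe" inside "module CommentableExtensions" defines new constants under CommentableExtensions instead of reopening the plugin's own Juixe::Acts::Commentable::ClassMethods. The sketch below uses stand-in modules only (no Rails, no plugin; the Host class and the returned symbols are made up for illustration). Whether this matters for the real plugin depends on how the extension is loaded, which isn't shown in this thread.

```ruby
# Stand-in for the plugin's namespace (hypothetical; only mirrors the names).
module Juixe
  module Acts
    module Commentable
      module ClassMethods
        def acts_as_commentable
          :plugin_version
        end
      end
    end
  end
end

# Wrapping the same names in another module creates separate constants:
# CommentableExtensions::Juixe is not the top-level Juixe.
module CommentableExtensions
  module Juixe
    module Acts
      module Commentable
        module ClassMethods
          def acts_as_commentable
            :extension_version
          end
        end
      end
    end
  end
end

# Hypothetical class that picks up the plugin's class methods.
class Host
  extend Juixe::Acts::Commentable::ClassMethods
end

PLUGIN_MODULE    = Juixe::Acts::Commentable::ClassMethods
EXTENSION_MODULE = CommentableExtensions::Juixe::Acts::Commentable::ClassMethods

puts PLUGIN_MODULE.equal?(EXTENSION_MODULE)
puts Host.acts_as_commentable.inspect
```

If the two modules turn out to be different objects, the wrapped definition never overrides the plugin's method unless something explicitly mixes it in.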

I've tried a few ways of loading the extension, currently an include in
application.rb, and when I load a Comment into c and assign one of my
hashes to c.feed, I get the same old SystemStackError when I attempt
c.save:

SystemStackError: stack level too deep
        from c:/ruby/lib/ruby/1.8/yaml/rubytypes.rb:168:in `to_yaml'
        from c:/ruby/lib/ruby/1.8/yaml.rb:387:in `quick_emit'
        from c:/ruby/lib/ruby/1.8/yaml/rubytypes.rb:164:in `to_yaml'
        from c:/ruby/lib/ruby/1.8/yaml/rubytypes.rb:41:in `to_yaml'
        from c:/ruby/lib/ruby/1.8/yaml/rubytypes.rb:40:in `to_yaml'
        from c:/ruby/lib/ruby/1.8/yaml/rubytypes.rb:39:in `to_yaml'
        from c:/ruby/lib/ruby/1.8/yaml.rb:387:in `quick_emit'.....

Though it's possible that my object is simply too complex for
#to_yaml's tiny brain, I'm reluctant to think that's the case with what
seems to me such a simple data structure. At this point I'm wondering
if the problem is the way I'm trying to declare the serialization of
:feed. Is there some other place or way I'm supposed to do this?

Thanks.

Much as I have trouble believing it, I'm becoming convinced that my
problem really, truly is that AR's serialize method or the #to_yaml
method behind it simply can't handle the hypercomplexity of a hash of
fifteen structs of seven strings. My code shown above serializes tiny,
simple objects like a champ (a single, small array or a single struct?
no problem). But when given 6 kilobytes of text nested two levels deep,
#to_yaml pops a gasket.

Either I'm missing something obvious, or this is kinda lame. I saw a
closed ticket in Rails Trac from two years ago in which I think Jamis
submitted a patch that extended AR Serialize to use Marshal and a
binary column optionally in place of YAML because of this very sort of
#to_yaml explosion whenever it was asked to do anything meaningful.
Anyone know if this is present and undocumented, or something that
fell by the wayside?

Okay, working through it myself. I can serialize a struct, but I can't
seem to serialize my custom, non-AR classes at all. This seems to be
the Insurmountable Problem. Am I crazy for wanting to serialize my own
classes and not just ruby core classes?

That is, I cannot say:

r = IdxRow.new
r.price = "$300"
c.feed = r
c.save

...but I can say

x = Struct.new(:foo, :bar, :baz, :feh, :fez, :fex)
r = x[1,2,3,4,5,6]
c.feed = r
c.save

I keep narrowing it down. Now I've banished my nice Struct and the
less-nice custom IdxRow class I created in its place, and instead I'm
now putting my data into a single hash containing 15 hashes of 7
strings apiece. No Procs in sight, nothing fancy, just a hash of hashes
of strings. I figured this would work for sure.

Nope. AR Serialize exploded like it always does when given anything
more complex than a one-liner.

Also, attempting Marshal.dump on this hash of hashes gives me an error
message that helpfully tells me I can't dump a Proc. Which would be
more helpful if I could see how a vanilla hash of vanilla hashes of
vanilla strings is being seen as containing a Proc.
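
One possibility worth ruling out at this point: Marshal.dump follows instance variables, so a value that prints like a String but is really a String subclass holding a reference to something with a Proc will fail exactly this way. The sketch below is only an illustration; TaggedString and Owner are hypothetical names, not anything from the scraping library.

```ruby
# Hypothetical String subclass that keeps a reference back to its owner,
# loosely modelled on what a parser's text node might do.
class TaggedString < String
  attr_accessor :owner
end

# Hypothetical owner object that happens to hold a Proc.
Owner = Struct.new(:callback)

# Build a hash whose single value looks like "$300" but carries an owner.
def sneaky_hash
  value = TaggedString.new('$300')
  value.owner = Owner.new(proc { |x| x })
  { 1 => { 'price' => value } }
end

plain  = { 1 => { 'price' => '$300' } }
sneaky = sneaky_hash

# Both hashes print identically, so inspection alone hides the difference.
puts plain.inspect
puts sneaky.inspect

Marshal.dump(plain) # plain Strings dump without complaint

begin
  Marshal.dump(sneaky)
rescue TypeError => e
  puts "Marshal.dump failed: #{e.message}"
end
```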

hatless wrote:

Thanks for the help. Hmmm. I build my Hash of Hashes this way:

myhash = {}
i = 1
for row in [1,2,3]
  temphash = {}
  temphash['foo'] = "bar"
  myhash[i] = temphash
  i += 1
end

This code works. My actual code, which assigns string values to a bunch
more elements of the inner hashes the same way, makes #to_yaml and
Marshal.dump() break.

The generated Hash and its members don't contain any object
identifiers. There are no custom constructors. Like the above code, the
real method results in an ordinary looking hash with integer keys and
Hash values, and inner Hashes with String keys and values.
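
As a sanity check on the size theory, here is a minimal sketch that builds a hash of the shape described (integer keys, inner hashes of seven plain Strings) and round-trips it through both YAML and Marshal. The build_rows helper and the key names are placeholders of mine, not the real code.

```ruby
require 'yaml'

# Build a hash of `rows` inner hashes, each holding seven plain Strings.
# The key names are placeholders, not the real ones.
def build_rows(rows = 15)
  keys = %w[image_url price beds baths sqft mls_id mls_url]
  (1..rows).each_with_object({}) do |i, outer|
    outer[i] = keys.each_with_object({}) { |k, inner| inner[k] = "#{k} #{i}" }
  end
end

data = build_rows

yaml_copy    = YAML.load(YAML.dump(data))
marshal_copy = Marshal.load(Marshal.dump(data))

puts yaml_copy == data
puts marshal_copy == data
```

If a structure like this survives both serializers, size and nesting depth alone are unlikely to be the culprit, and the class of the leaf values becomes the next thing to check.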

Here's my actual code that returns the hash. I've only changed the
names of my keys so I don't give away whose HTML I'm scraping:

def fetch_idx_hash(obj)
  counter = 1 # 1..n instead of starting with zero, used as the hash index.
  myhash = {}
  page = fetch_page(obj)
  soup = BeautifulSoup.new page
  soup.html.body.center.find_all('table')[5].find_all('tr').each do |tr|
    working_row = {}
    if tr.find_all('td')[1].string.include? "$"
      working_row['price'] = tr.find_all('td')[1].string
    end
    unless tr.find_all('td')[0].a.nil?
      working_row['fruit'] = tr.find_all('td')[0].a.string
      working_row['vegetable'] = "http://www.foo.com" +
                                 tr.find_all('td')[0].a['href']
    end
    working_row['starch'] = tr.find_all('td')[3].string
    working_row['meat'] = tr.find_all('td')[4].string
    working_row['cake'] = tr.find_all('td')[5].string
    # This must be the last member populated. Rows w/o images get
    # discarded.
    unless tr.img.nil?
      working_row['image_url'] = tr.img['src']
      myhash[counter] = working_row
      counter += 1
    end
  end
  return myhash
end

Calling to_yaml on the resulting hash blows up with a SystemStackError.
Calling Marshal.dump() carps about a Proc. I'm wondering if after
everything I've done to simplify the data structure, it's just too big.
Incidentally, none of the member strings exceed 300 characters, and
most are small. A typical myhash.to_s.size is about 7K.

zdennis wrote:

All right! Solved!

Turns out some of the methods in RubyfulSoup -- the RubyfulSoup#string
method, in fact -- don't return Strings. They return NavigableStrings,
as I discovered when I ran

  myhash[1].values.each {|v| puts "#{v}: #{v.class}"}
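
The same check can be run over the whole nested structure in one pass. This is a rough sketch: non_plain_leaves is a helper name I made up, and FakeNavigableString is a hypothetical stand-in for the parser's string-like class, not the real one.

```ruby
# Walk a nested Hash/Array and collect [path, class] for every leaf that is
# not exactly a String (subclasses of String are reported too).
def non_plain_leaves(value, path = [])
  case value
  when Hash
    value.flat_map { |k, v| non_plain_leaves(v, path + [k]) }
  when Array
    value.each_with_index.flat_map { |v, i| non_plain_leaves(v, path + [i]) }
  else
    value.instance_of?(String) ? [] : [[path, value.class]]
  end
end

# Hypothetical stand-in for a parser's string-like node class.
class FakeNavigableString < String; end

sample = {
  1 => { 'price' => '$300', 'fruit' => FakeNavigableString.new('apple') }
}

non_plain_leaves(sample).each do |path, klass|
  puts "#{path.inspect}: #{klass}"
end
```

Using instance_of? rather than is_a? is the point here: a subclass of String passes is_a?(String) and would slip through.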

All was good once I changed things like:

  working_row['fruit'] = tr.find_all('td')[0].a.string

to

  working_row['fruit'] = tr.find_all('td')[0].a.string.to_s

ARGH!

Now myhash serializes fine. I'm surprised it doesn't seem to
deserialize automagically and I have to YAML.load() it, but that's for
another thread (or not).
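
For anyone who would rather not sprinkle .to_s over every assignment, a one-pass cleanup before saving is another option. This is a minimal sketch under my own assumptions: deep_plain_strings is a made-up helper, and NodeText is a hypothetical stand-in for the parser's string class. It also shows the manual YAML.load step mentioned above.

```ruby
require 'yaml'

# Recursively copy a nested Hash/Array, turning every String-like leaf
# (including hash keys) into a plain String via to_s. Other leaves are
# returned untouched.
def deep_plain_strings(value)
  case value
  when Hash
    value.each_with_object({}) do |(k, v), h|
      h[deep_plain_strings(k)] = deep_plain_strings(v)
    end
  when Array  then value.map { |v| deep_plain_strings(v) }
  when String then value.to_s
  else value
  end
end

# Hypothetical stand-in for a parser's string-like node class.
class NodeText < String
  attr_accessor :parent
end

dirty = { 1 => { 'price' => NodeText.new('$300'), 'fruit' => NodeText.new('apple') } }
clean = deep_plain_strings(dirty)

clean[1].each { |k, v| puts "#{k}: #{v.class}" }

# Serialize, then load it back by hand.
restored = YAML.load(YAML.dump(clean))
puts restored == clean
```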

It didn't help that the culprit objects rendered like normal Strings.
Thanks again for the help and thanks to anyone sitting through a
5-message thread of me talking myself through it.
