Production mode bug with ruby/amazon

Hi all,

I've got an issue that only ever appears in production mode. My app is hooked up to Amazon's web service (ECS) through the Ruby/Amazon library.

In the dev environment, everything's peachy. But in production mode, after I've left the app up for a couple of hours (sometimes just minutes), I get an exception:

    Processing LibraryController#add_to_library (for 24.7.104.3 at 2007-09-11 07:36:11) [POST]
      Session ID: 1bc197aadd3678c166a758604c072753
      Parameters: {"commit"=>"Submit", "action"=>"add_to_library", "controller"=>"library", "query"=>{"media"=>"books", "asin"=>"quilt"}}
    Connection to amazon remade: + Connection reset by peer

    Timeout::Error (execution expired):
        /usr/lib/ruby/1.8/timeout.rb:54:in `write0'
        /usr/lib/ruby/1.8/net/protocol.rb:151:in `write'
        /usr/lib/ruby/1.8/net/protocol.rb:166:in `writing'
        /usr/lib/ruby/1.8/net/protocol.rb:150:in `write'
        /usr/lib/ruby/1.8/net/http.rb:1542:in `write_header'
        /usr/lib/ruby/1.8/net/http.rb:1500:in `exec'
        /usr/lib/ruby/1.8/net/http.rb:1044:in `request'
        /usr/lib/ruby/1.8/net/http.rb:771:in `get'
        /lib/amazon/search.rb:973:in `get_page'
        /lib/amazon/search.rb:1013:in `search'
        /lib/amazon/search.rb:734:in `keyword_search'
        /app/models/asset.rb:98:in `amazon_search'

The timeout error comes from a timeout block I've wrapped around the whole call in the hope of catching the error. Without it, the call just hangs, I lose the thread completely, and I have to restart the mongrel cluster. Here's what that looks like:

    retries = 3

    #Errno::ECONNRESET
    begin
      Timeout::timeout(2) { products = search_method.call(key, mode).products }
    rescue
      # Errno::ECONNRESET, Errno::EPIPE => failure
      if retries > 0
        @@req = Amazon::Search::Request.new(@@dev_token)
        retries -= 1
        logger.error "Connection to amazon remade: #{$!}"
        retry
      else
        logger.error "Connection to amazon destroyed: #{$!}"
        raise
      end
    end

Here, search_method is just an alias to one of the Amazon::Search::Request methods. I'm totally lost here. I have no idea why the whole thing only seems to retry once, or why I get the connection reset at all in production mode. I tried reducing my mongrel cluster to a single instance, thinking that some weird multithreading issue might be causing it, but that dies too.

I'd appreciate any help. Thanks.

I had all sorts of problems with this, and in the end wrapping the call to Amazon::Search::Request in a 10 second timeout was the best compromise I could manage:

    def amazon_wrapper_method
        Timeout::timeout(10) do
            begin
                request = Request.new(DEV_TOKEN,AMAZON_ASSOCIATES[country],country)
                ...
            end
        end
    rescue Timeout::Error
    end

I'd originally tried a 5 second timeout, but the REXML parsing kept running into it. The 10 second version at least cut down the errors, in particular Mongrel timing out the thread and the subsequent problems that caused.
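
On the "only seems to retry once" question: as far as I know, on Ruby 1.8 (which the stack trace shows is in use) Timeout::Error descends from Interrupt rather than StandardError, so a bare rescue never sees it. That would fit the log: the connection reset is caught and retried, then the timeout on the second attempt escapes. Below is a rough sketch of naming the exception classes explicitly. The helper with_timeout_and_retries and the RETRYABLE_ERRORS constant are made-up names for illustration, not part of Ruby/Amazon or amazon-ecs, and the block stands in for whatever Amazon call you make.

```ruby
require 'timeout'

# Hypothetical list of errors worth retrying; adjust to taste.
RETRYABLE_ERRORS = [Timeout::Error, Errno::ECONNRESET, Errno::EPIPE]

# Hypothetical helper: runs the block under a timeout and retries it
# up to `retries` extra times when one of RETRYABLE_ERRORS is raised.
# The block receives the current attempt number (starting at 1).
def with_timeout_and_retries(seconds, retries)
  attempts = 0
  begin
    attempts += 1
    Timeout.timeout(seconds) { yield attempts }
  rescue *RETRYABLE_ERRORS
    # Classes are listed explicitly instead of relying on a bare rescue.
    retry if attempts <= retries
    raise
  end
end
```

In the original snippet this would look something like products = with_timeout_and_retries(2, 3) { search_method.call(key, mode).products }, with the request object rebuilt inside the block if you want a fresh connection on each attempt.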

For what it's worth, I've now ditched Ruby/Amazon and made the move to ECS 4 (ECS 3 is being phased out early next year, so you'll need to do it pretty soon), and used the amazon-ecs gem (http://www.pluitsolutions.com/projects/amazon-ecs).

It's very lightweight, and partly because of that it's a cinch to extend -- I just put it into vendor, added a file of methods to extend it (e.g. so you can do item.asin, item.title etc) and job done.

Hope this helps

Chris

Andrew Chen wrote: