Hi all,
I've got an issue that only ever appears in production mode. My app is hooked up to Amazon's web service (ECS, via the Ruby/Amazon library).
In the dev environment, everything's peachy. But in production mode, after I've left the app up for a couple of hours (sometimes, minutes), I get an exception:
Processing LibraryController#add_to_library (for 24.7.104.3 at 2007-09-11 07:36:11) [POST]
  Session ID: 1bc197aadd3678c166a758604c072753
  Parameters: {"commit"=>"Submit", "action"=>"add_to_library", "controller"=>"library", "query"=>{"media"=>"books", "asin"=>"quilt"}}
Connection to amazon remade: + Connection reset by peer
Timeout::Error (execution expired):
    /usr/lib/ruby/1.8/timeout.rb:54:in `write0'
    /usr/lib/ruby/1.8/net/protocol.rb:151:in `write'
    /usr/lib/ruby/1.8/net/protocol.rb:166:in `writing'
    /usr/lib/ruby/1.8/net/protocol.rb:150:in `write'
    /usr/lib/ruby/1.8/net/http.rb:1542:in `write_header'
    /usr/lib/ruby/1.8/net/http.rb:1500:in `exec'
    /usr/lib/ruby/1.8/net/http.rb:1044:in `request'
    /usr/lib/ruby/1.8/net/http.rb:771:in `get'
    /lib/amazon/search.rb:973:in `get_page'
    /lib/amazon/search.rb:1013:in `search'
    /lib/amazon/search.rb:734:in `keyword_search'
    /app/models/asset.rb:98:in `amazon_search'
The timeout error comes from a Timeout block I've wrapped around the whole call in the hope of catching the error. Without it, the call just hangs, I lose the thread completely, and I have to restart the mongrel cluster. Here's what that looks like:
retries = 3
#Errno::ECONNRESET
begin
  Timeout::timeout(2) {
    products = search_method.call(key, mode).products
  }
rescue # Errno::ECONNRESET, Errno::EPIPE => failure
  if retries > 0
    @@req = Amazon::Search::Request.new(@@dev_token)
    retries -= 1
    logger.error "Connection to amazon remade: #{$!}"
    retry
  else
    logger.error "Connection to amazon destroyed: #{$!}"
    raise
  end
end
Here, search_method is just an alias to one of the Amazon::Search::Request methods. I'm totally lost, so any help is welcome. I have no idea why the whole thing only seems to retry once, or why I get the connection reset at all in production mode. I tried reducing my mongrel cluster to a single instance, thinking that some weird multithreading issue might be causing it, but that dies too.
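One possible explanation for the single retry, offered as a sketch rather than a confirmed diagnosis: a bare rescue only catches StandardError, and on Ruby 1.8 Timeout::Error descends from Interrupt instead, so the first failure (the connection reset) would be rescued and retried while the timeout on the second attempt would pass straight through. The version below names the exceptions explicitly. The with_retries helper and its reconnect argument are hypothetical names introduced for illustration; they are not part of Ruby/Amazon.

```ruby
require 'timeout'

# Hypothetical helper (not part of Ruby/Amazon): runs the given block under
# a timeout and retries it when one of the listed errors is raised.
# `reconnect` is any callable; it receives the exception before each retry.
def with_retries(retries, seconds, reconnect)
  begin
    Timeout.timeout(seconds) { yield }
  # Timeout::Error is listed explicitly because a bare `rescue` does not
  # catch it on Ruby 1.8.
  rescue Timeout::Error, Errno::ECONNRESET, Errno::EPIPE => failure
    raise if retries <= 0        # out of attempts: re-raise the last error
    retries -= 1
    reconnect.call(failure)      # e.g. rebuild the request object and log
    retry                        # re-run the begin block
  end
end
```

In the code above, the block would be `search_method.call(key, mode).products` and the reconnect callable would rebuild @@req and log the failure. Returning the value from the helper also avoids assigning products inside the timeout block, where it may be local to that block.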
I'd appreciate any help. Thanks.