High CPU load upon upgrade to 2.3.5

After recently updating a rather large app from Rails 2.0.2 to 2.3.5
(running on Ruby 1.8.6 with the latest Apache and Passenger), the CPU used
by the ruby process would jump to 100% for a good 3 seconds at a time, and
the server locked up within minutes. I was able to roll back the changes
easily enough, but I am now pulling out my hair trying to figure out what
the issue might be.

How can I figure out what it is in my large code base that has run
perfectly smoothly on 2.0.2 for years, but on 2.3.5 causes the CPU to
jump to 100% and renders the app unusable? (The app runs fine and all
my tests pass; it just crumbles under any load.)

Thanks...

Tim W wrote:


After recently updating a rather large app from 2.0.2 to 2.3.5
(running on ruby 1.8.6, latest apache and passenger) the CPU used by
the ruby process would jump to 100% for a good 3 secs and locked up
the server within minutes. I was able to roll back the changes easily
enough, but I am now pulling out my hair trying to figure what the
issue might be.
How can I figure out what it is in my large code base that has run
perfectly smoothly on 2.0.2 for years, but now running on 2.3.5 causes
the CPU to jump to 100% and render the app unusable? (The app runs
fine and all my tests pass, it just crumbles under any load)
Thanks...

Is there anything in the logs? Does it work the same way in production as
in development? Have you tried it with another server (like Mongrel or
WEBrick) to see if there is any interaction with Passenger? I am a
brute-force debugger: I would just add some logger calls in the likely
places to see what is going on. Do any of the gems you use need to be
updated for 2.3.5, and did any of them change?
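
As a rough sketch of the "add some logger calls" approach, a small timing wrapper can be put around suspect code. The helper name log_timing is made up for illustration; in a Rails app you would pass the app's own logger instead of creating one.

```ruby
require 'benchmark'
require 'logger'

# Hypothetical helper: runs the block, logs how long it took in
# milliseconds, and returns whatever the block returned.
def log_timing(logger, label)
  result = nil
  seconds = Benchmark.realtime { result = yield }
  logger.info("#{label} took #{(seconds * 1000).round}ms")
  result
end

# Example: wrap a suspect piece of work and log its duration.
logger = Logger.new($stdout)
total = log_timing(logger, 'summing numbers') { (1..100_000).inject(0) { |a, b| a + b } }
```

Wrapping a handful of likely hot spots this way and then replaying some load should show which one eats the time.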

Norm

Having been through similar issues, I’d say:

Use passenger-status (called frequently) to check the global wait queue.

Use New Relic to drill down to the slowest actions (and, from there, to where the time is spent).

Analyse the logs for slow requests (and what parameters were sent in/where the time was spent).
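
A minimal sketch of the log analysis, assuming the log uses the Rails 2.3 style "Completed in ...ms" line shown in the comment. That line shape is an assumption (2.0.x logs are formatted differently), and slow_requests is a hypothetical helper, not part of any tool mentioned here.

```ruby
# Assumed line shape (Rails 2.3 production.log):
#   Completed in 1234ms (View: 900, DB: 200) | 200 OK [http://example.com/orders]
COMPLETED = /^Completed in (\d+)ms.*?\[(.+)\]\s*$/

# Returns [milliseconds, url] pairs for requests at or above the threshold,
# slowest first. `lines` can be an Array of strings or an open File.
def slow_requests(lines, threshold_ms)
  hits = []
  lines.each do |line|
    next unless line =~ COMPLETED
    ms = $1.to_i
    hits << [ms, $2] if ms >= threshold_ms
  end
  hits.sort_by { |ms, _url| -ms }
end
```

Something like File.open('log/production.log') { |f| slow_requests(f, 1000) } would then list every request that took a second or more.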

I have a repo of some potentially useful utilities at http://github.com/andyjeffries/rails-analysis-tools

rails_log_analyser might be useful: it summarises requests from a Passenger log and then lets you output just the log lines for a single request, which can be difficult when lines from different requests overlap. passenger_mon_grapher may also be useful (in reference to my first point), but you’ll need to write your own sampler to parse the output of passenger-status and write it to a CSV file every few seconds.
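
Such a sampler could look roughly like the sketch below. The field names and the "Waiting on global queue" line are assumptions about what passenger-status prints on Passenger 2.x, so check them against your own output. The function names and CSV column order are made up for illustration and are not necessarily what passenger_mon_grapher expects.

```ruby
# Fields assumed to appear as "name = number" lines in passenger-status output.
FIELDS = %w[max count active inactive]

# Pulls the numeric fields and the global queue length out of the text
# printed by passenger-status. Fields that are not found are left out.
def parse_passenger_status(text)
  stats = {}
  text.each_line do |line|
    if line =~ /^\s*(\w+)\s*=\s*(\d+)\s*$/
      stats[$1] = $2.to_i if FIELDS.include?($1)
    elsif line =~ /^Waiting on global queue:\s*(\d+)/i
      stats['global_queue'] = $1.to_i
    end
  end
  stats
end

# One CSV line: epoch seconds, then max, count, active, inactive, global_queue.
def csv_row(time, stats)
  ([time.to_i] + (FIELDS + ['global_queue']).map { |k| stats[k] }).join(',')
end

# Appends one sample to `path` every `interval` seconds until interrupted.
def run_sampler(path, interval)
  File.open(path, 'a') do |out|
    loop do
      out.puts csv_row(Time.now, parse_passenger_status(`passenger-status`))
      out.flush
      sleep interval
    end
  end
end
```

Calling run_sampler('passenger_status.csv', 5) from a script samples every five seconds; passenger-status may need to be run as root, depending on how Passenger is set up.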

Cheers,

Andy

There was a change between 2.3.3 and 2.3.4 where escape_once (used
among other things by form helpers) became quite a bit more expensive,
in order to fix a security bug. If your app is OK on 2.3.3 but not on
2.3.4, then that's probably the root cause.
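
One way to check this, assuming the app pins its Rails version in config/environment.rb the way a stock Rails 2.x app does and that the 2.3.3 and 2.3.4 gems are both installed, is to change the pinned version and repeat the load test at each step:

```ruby
# config/environment.rb (stock Rails 2.x layout assumed)
# Try '2.3.3' first, then '2.3.4', load-testing after each change.
RAILS_GEM_VERSION = '2.3.3' unless defined? RAILS_GEM_VERSION
```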

Fred