mod_rails fedora core

i've found a couple of other posts with similar errors, but those turned out to be permissions problems, which doesn't seem to be the case here. any ideas?

first it does something like this :

[30140:Hooks.cpp:370] Processing HTTP request: /
[28940:ApplicationPoolClientServer.h:426] Client 0x2aaaaad68b30: received message: ['get', '/opt/app', 'true', 'nobody']
[28946:Hooks.cpp:370] Processing HTTP request: /
[28940:ApplicationPoolClientServer.h:426] Client 0x2aaaaad68b30: received message: ['get', '/opt/app', 'true', 'nobody']

then it starts doing this and the app is completely unresponsive :

[28940:Application.h:274] Application 0x2aaab5d47510: destroyed.
[28940:StandardApplicationPool.h:249] Cleaning idle app /opt/app (PID 22116)

any help would be great. i followed the screencast ( as if that can be screwed up ).

FC7 64-bit rails 2.0.2

thanks in advance

..

Hi.

These are just debugging messages, and are not errors. It seems that Apache is doing just fine. Have you already checked your Rails logs though?


thanks for the reply.

i thought that might be the case.

basically, i'm load testing with apachebench and at some point ab and the mod_rails server just stop communicating. no log data anywhere. ( no tcpdump activity on either the test machine or the rails server )

i am able to open a browser on a third machine and hit the app.

strangely, i think i can also run another ab test on the original ab testing server ( a second instance ) while the original ab test is hanging and the second test does the same thing. runs for a bit. hangs.

so it kind of does and kind of doesn't seem like an apache issue?


also, something i don't get is :

./ab -c 100 -n 500 http://192.168.1.53
./ab -c 500 -n 500 http://192.168.1.53
./ab -c 10 -n 10000 http://192.168.1.53

all complete, however :

./ab -c 100 -n 1000 http://192.168.1.53

hangs.

an apache restart on the server will allow the tests to complete. so it's either apache or mod_rails?

i've worked the client settings in apache, but still no luck.

there are absolutely no log messages, and even all tcpdump activity stops.
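
when nothing shows up in the logs or in tcpdump, one thing that might still be worth a look is the state of the TCP sockets on both machines while the test is hanging. a minimal sketch, assuming Linux net-tools netstat output where the state is the 6th column ( count_tcp_states is just an illustrative helper name, not part of any tool ) :

```shell
# Summarize TCP socket states from `netstat -ant`-style output on stdin.
# Assumption: the connection state is the 6th column (Linux net-tools layout).
count_tcp_states() {
    awk '$1 ~ /^tcp/ { states[$6]++ }
         END { for (s in states) print states[s], s }' | sort -rn
}

# Usage, run on both the ab machine and the rails server during the hang:
#   netstat -ant | count_tcp_states
```

a large pile of sockets in one state ( TIME_WAIT, SYN_RECV, CLOSE_WAIT ... ) on one side would at least hint at which end to look at. it doesn't prove anything by itself.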

so any help would be great.

thanks ..

ok, so was able to get all the testing to finish.

i added :

net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_max_orphans = 1000

to sysctl. though apachebench now completes at all levels of concurrency and connections, i still find this suspect, because the same test against the same machine using 12 mongrels instead of mod_rails works fine. as does nginx / mongrel ( which doesn't seem to depend on these OS settings ).
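
for anyone comparing boxes: each sysctl key maps to a file under /proc/sys, so the values actually in effect can be read back without changing anything. a small sketch ( sysctl_path and show_sysctl are made-up helper names; this only reads, it doesn't set anything ) :

```shell
# Map a sysctl key to its /proc/sys path (dots become slashes).
sysctl_path() {
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Print the current value of each key, or a note if it is not readable here.
show_sysctl() {
    for key in "$@"; do
        path=$(sysctl_path "$key")
        if [ -r "$path" ]; then
            printf '%s = %s\n' "$key" "$(cat "$path")"
        else
            printf '%s = (not readable)\n' "$key"
        fi
    done
}

show_sysctl net.ipv4.tcp_keepalive_time net.ipv4.tcp_max_orphans
```

running it before and after editing the settings shows whether they were really picked up.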

weird.

also, there is no logging of _any_ type anywhere on the system indicating any issues. everything just stops. and an apache reload allows the test to finish.

oh well

mod_rails is very slick stuff so far

..

Fdsa Fdsa wrote:

ok, so was able to get all the testing to finish.

i added :

net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_max_orphans = 1000

I'm having similar problems (see the Ruby-Forum thread "Mod_rails kills my Apache"), but I haven't solved it yet.

One thing you should check though is: Is Apache running mpm-worker or mpm-prefork?

According to Hongli Lai, mod_rails only supports mpm-prefork at the moment.
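
A quick way to check is to look at the "Server MPM" line printed by `httpd -V`. A minimal sketch, assuming the binary is called httpd as on Fedora (apache_mpm is just an illustrative helper name):

```shell
# Extract the MPM name from `httpd -V` output on stdin, lowercased.
# Assumption: the output contains a line like "Server MPM:     Prefork".
apache_mpm() {
    awk -F: '/^Server MPM/ { gsub(/[ \t]/, "", $2); print tolower($2) }'
}

# Usage (the binary may have a different name on other distributions):
#   httpd -V | apache_mpm
```

If this prints "worker" rather than "prefork", that would be the first thing to change.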

If you have found out anything else, I would very much like to know about it.

- Carsten


i read your posts with hongli and those are similar but different. since i simply couldn't get the reliability i needed with the proc tweaks etc., i just upgraded the box(es) to a quad core with more ram, and that seems to have solved at least the

./ab -c 100 -n 1000 http://192.168.1.53

problem...for now.

interestingly, i came across this post last week or so :

http://poocs.net/2006/3/27/the-adventures-of-scaling-stage-3

and towards the end there's this quote :

"Using tcpdump to monitor the traffic on the listener ports showed.. nothing. Not a single byte crossing the line. Using strace to check what the “stuck” listener is busy doing showed it sitting there in “Waiting..” state. Also doing nothing.

Now the stunning part: If you restart lighttpd or the dispatcher things start working again. In the end, this didn’t indicate either side as being responsible for the hang and we started looking elsewhere."

which is exactly the problem i'm having, but on a completely different architecture and two years later. bizarre.

they also started meddling around in proc, except their changes had little to no effect from the sounds of things.

i sorta knew about the mpm-prefork requirement, but since the default apache install seemed to work, i haven't yet concerned myself with it.

is there a release date for 1.1? i'm on 1.0.5, but haven't been able to reproduce the strange hanging, which i assume is because of the quad core

actually, three quad core machines load balanced.

i'll have more information available next month when we begin serious load testing. we're going live in august, moving from java to rails, and our site gets 125,000+ page views per day. the money is on passenger for now, but a last second move to mongrel may be inevitable

we'll see

..