Dying server processes on Windows under load...

Hey there :slight_smile:

I've been testing RoR on a Windows server backed by MS SQL for a couple
of days. Yesterday, I was trying some automated tests, with this setup:

- 10 Mongrels on ports 4000-4009 on the server (Windows 2000 AS + MS
SQL 2000)
- No other webserver/balancer/proxy/whathaveyou

- 10 concurrent timed wget scripts running on my laptop requesting 1000
pages from each of the Mongrels on the server. Each page fetched
an item with quite a few related tables from the db.
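
For anyone wanting to reproduce a test of this shape without wget, here is a minimal Ruby sketch. It is not the script that was actually used: the `target_urls`, `http_fetch`, and `run_load` helpers are hypothetical names, and the host and path in the usage line are placeholders. The fetcher is passed in so the loop can be tried without a live server.

```ruby
require 'net/http'
require 'uri'

# Build one URL per server port (4000-4009 in the setup above).
# Host and path are placeholders, not values from the original test.
def target_urls(host, ports, path)
  ports.map { |port| "http://#{host}:#{port}#{path}" }
end

# Default fetcher: a real HTTP GET that returns the status code as a string.
def http_fetch(url)
  Net::HTTP.get_response(URI.parse(url)).code
end

# Run one thread per URL, each issuing `requests_per_url` sequential requests.
# Returns a hash of url => { :ok => n, :failed => n }.
def run_load(urls, requests_per_url, fetcher = method(:http_fetch))
  threads = urls.map do |url|
    Thread.new do
      tally = { :ok => 0, :failed => 0 }
      requests_per_url.times do
        begin
          status = fetcher.call(url)
          key = status == '200' ? :ok : :failed
        rescue StandardError
          # A dead or hung server process shows up here as a connection error.
          key = :failed
        end
        tally[key] += 1
      end
      [url, tally]
    end
  end
  Hash[threads.map(&:value)]
end
```

A run against the ten ports might then look like `run_load(target_urls('myserver', 4000..4009, '/books/show/1'), 1000)`, where both the host name and the path are assumptions. The per-URL tallies make it easy to see which server process stopped answering first.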

Observing the task manager on the server, I witnessed each of the
ruby.exe processes die one after the other. By "die" I mean excessive
CPU time compared to the other not-yet-dying ruby.exe processes, and
what appeared to be a memory leak.

The point at which they died varied a lot, from failing on first
request to failing after hundreds of sound responses.

I tried, then, with 10 WEBrick servers instead of the 10 Mongrels, with
the exact same result.

Have you guys tried something similar? And is there any way for me to
find out where the processes fail? I mean, whether it's inside one of
my Rails scripts, inside the ado.rb mssql adapter, or inside Rails logic?

Thanks in advance for any input,
Daniel :slight_smile:

First thing I’d try, honestly, is to use ODBC instead of ADO. If you can’t find code for how to do this, let me know and I’ll rip something out of one of my projects. You’ll need to set up a DSN on your machine, but ODBC seems to perform better (and then it’s cross-platform if you ever want to move your app to Linux). I’ve seen the ADO stuff act flaky in our setups.

Oh… check your logs. And are you running in production mode or development mode?

Hi Brian, thanks for your replies :slight_smile:

Well, I found out what was up. I was barking up the wrong tree
completely! I tried ODBC in different flavors, tried MS SQL 2005, tried
porting the app to Linux to check if it had something to do with the
OS, then finally we got it right. It was a logical error on my part in
an ActiveRecord relation. We have a many-to-many relation with an extra
relation attached to it: Books have many Authors, but the Authors take
on different Roles. I had the relation to the roles owned by the
Author, but it should have been owned by the relation itself.

This had an unfortunate effect when stumbling across a fall-back Author
named "Mr. Unknown" - which is put in for god knows what reason :slight_smile: This
"guy" ate up over half a million roles, and then we start joining on
those... Oops. So, it wasn't dead processes, and it wasn't memory
leaking either, it was dumbness on my part. Thank god :wink:
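
To make the mistake concrete, here is a minimal plain-Ruby sketch, with Structs standing in for the ActiveRecord models and no database involved. Only the Book/Author/Role shape comes from the description above. The `Authorship` name for the join record, the helper methods, and the sample data (1000 books, far fewer than the real half a million roles) are all hypothetical.

```ruby
# Plain-Ruby stand-ins for the tables; not the real ActiveRecord models.
Book       = Struct.new(:title)
Author     = Struct.new(:name)
# Correct design: the role belongs to the book-author link itself.
Authorship = Struct.new(:book, :author, :role)

unknown = Author.new('Mr. Unknown')
books   = (1..1000).map { |i| Book.new("Book #{i}") }
# Every book falls back to the same placeholder author, with one role each.
authorships = books.map { |b| Authorship.new(b, unknown, 'author') }

# Wrong design: roles are looked up through the author alone, so each
# author drags in the roles from every book they are linked to.
def roles_via_author(book, authorships)
  authors = authorships.select { |a| a.book.equal?(book) }.map(&:author)
  authors.flat_map do |author|
    authorships.select { |a| a.author.equal?(author) }.map(&:role)
  end
end

# Right design: roles come from the link rows for this book only.
def roles_via_authorship(book, authorships)
  authorships.select { |a| a.book.equal?(book) }.map(&:role)
end

wrong = roles_via_author(books.first, authorships).size
right = roles_via_authorship(books.first, authorships).size
puts "via author: #{wrong} roles, via authorship: #{right} role"
```

With roles hanging off the author, fetching a single book pulls in every role the placeholder author has ever held, which is why each request looked like a runaway process. In ActiveRecord terms, the usual way to put attributes on the link itself is a join model, though the exact declarations depend on the Rails version in use.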