I presume most people here read today's article on Slashdot which had
some critique about Ruby and scaling to a large architecture. Though
the article didn't seem to go into specifics, I'm interested in
other people's feedback and perspective on this.
I'm currently learning Ruby. One of the first questions I had (and
Googled for) had to do with scalability, for large enterprise-class
applications. I found a couple of articles, but haven't yet tested
this in a lab setting. Then there is Parrot, which I've not used
yet.
Due to the growing popularity of Ruby and Rails, I would imagine this
would be of importance. Pardon if I've missed something (I have
searched, and am searching) - that being the case, URLs to articles
would be appreciated.
Also, how would any such approaches differ from how Java scales
with its VM?
I haven't read the Slashdot article yet -- I *seriously* doubt that
there's any new information in it, however. Here's *my* view of the
subject.
1. The Ruby 1.8 interpreter, aka MRI, does have some bottlenecks in it
that could be alleviated. Most notable in the context of Rails are the
garbage collector and method dispatch.
2. ActiveRecord also has some bottlenecks in it that could be
alleviated. I'm not at all familiar with them, but there are folks on
this list actively working on them and I trust that they will pipe up
and respond, assuming that their work is open source.
3. There are several alternative implementations of Ruby 1.8, pretty
much all of which have improved Rails performance as a goal. The
farthest along is JRuby, an implementation of Ruby on the Java Virtual
Machine. The next farthest along is IronRuby on the Microsoft Dynamic
Language Runtime, followed very closely by Rubinius. Parrot appears to
have fallen behind, although I would hope a pleasant surprise will
emerge from that camp.
4. Finally, Ruby 1.9 is due out at the end of this year. I haven't seen
any *Rails* benchmarks for it yet, but for other benchmarks, Ruby 1.9
kicks serious butt.
So ... watch this space ... I'll go read the Slashdot article.
OK ... I went to Slashdot and all I found was a pointer to a blog post
where the proprietor of CD Baby described how and why he moved from
PHP to Rails and back again.
I didn't see any critiques about scaling Rails to a large architecture...
it was an article by Derek of CDBaby, saying how he finally gave up trying
to port his existing PHP application to Rails, and went back to PHP.
Nothing about scaling at all - just that (a) Rails can't do anything PHP
can't do (which is true for any two languages, really), and (b) he had to
reinvent large parts of Rails to get it to do what he wanted, and (c)
migrating an existing application to a new framework is a pain, and doubly
so with Rails which likes to make a lot of assumptions about how data is
laid out. All true.
Back to scaling:
The honest truth is that very few Rails sites have grown to the point where
they have to worry about scaling in an enterprise-y sense. Some of the
larger sites have dealt with multiple front-end web servers. Some of the
very large sites have dealt with read-only database replication to multiple
back-end servers. Twitter, probably the largest, has faced its own unique
issues. But there aren't a lot of AOL/Google/eBay/Amazons in the Rails
world.
Part of it is that Rails sites tend to be database-y, not message-y.
There's not a lot of shared state and concurrency, which is always the fun
part of scaling. Part of it is that Rails is (a) relatively new and (b)
not good for porting existing sites; therefore, very little written in
Rails has had the time to go from zero to Amazon.
The Ruby 1.8 interpreter, as znmeb said, is fairly slow. JRuby is catching
up to it, and in some cases exceeding it, and you can run JRuby-on-Rails
today. And over the next year, there will be even more options.
Only you can know if the interpreter speed of the front-end web server is
going to be a bottleneck in your site.
Here are some helpful Google terms to research Rails scaling:
Rails is just a conventional architecture with a web front end and a database backend. As with all such systems, the biggest challenge is the database layer. As usual, the key is to make the app read-heavy and build multiple read-only slaves. Brutally cut out all unnecessary writes. For example, if you’re using the db to log impressions, clickstream, etc., that’s gotta go - either to a different db or using lazy logging (e.g. syslog, UDP fire and forget). [I posted earlier, haven’t heard back yet: Which is better: acts_as_readonlyable or mysql-proxy or other?]
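To make the "UDP fire and forget" idea concrete, here is a minimal sketch in plain Ruby. The class name FireAndForgetLogger, the host and port, and the "key=value" line format are all made up for illustration; this is not a Rails API or anyone's production setup, and a real deployment would more likely point at a syslog daemon.

```ruby
require "socket"

# Hypothetical helper (not part of Rails): sends each log line as a single UDP
# datagram and never waits for a reply, so the request is not held up by logging.
class FireAndForgetLogger
  def initialize(host, port)
    @host = host
    @port = port
    @socket = UDPSocket.new
  end

  # Returns true if the datagram was handed to the OS, false on a local
  # socket error. Delivery is NOT guaranteed: a lost packet is a lost log line.
  def log(event, fields = {})
    line = ([event] + fields.map { |key, value| "#{key}=#{value}" }).join(" ")
    @socket.send(line, 0, @host, @port)
    true
  rescue SystemCallError
    false
  end
end

# Example use; the address and field names are assumptions.
logger = FireAndForgetLogger.new("127.0.0.1", 5140)
logger.log("impression", page: "/albums/42", user: 7)
```

The trade-off is deliberate: impressions and clickstream data can usually tolerate a few dropped lines, which is what makes it safe to take them off the main database.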
In practice, the AJAX part of a Rails application (or any other web-app) is what can make or break scalability: If every click, drag, etc. leads to db updates and inserts (for example: resequencing a list by dragging and dropping an element → update for each element of the list), then clearly the app will not scale. On the other hand, the AJAX layer could actually help scalability by making such updates less frequently than a non-AJAX app.
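For the drag-and-drop case, one way to cut the writes is to send the new order once, on drop, and update only the rows whose position really changed. A rough sketch in plain Ruby follows; the function name changed_positions and the integer ids are hypothetical, and how you turn the result into UPDATE statements is up to your own model code.

```ruby
# Hypothetical helper: given the item ids in their old and new order, return
# only the {id => new_position} pairs for items whose position changed.
def changed_positions(old_order, new_order)
  old_index = {}
  old_order.each_with_index { |id, position| old_index[id] = position }

  changes = {}
  new_order.each_with_index do |id, position|
    changes[id] = position unless old_index[id] == position
  end
  changes
end

# Swapping two neighbours in a five-item list touches two rows, not five.
changed_positions([10, 20, 30, 40, 50], [10, 20, 40, 30, 50])
# => {40=>2, 30=>3}
```

Dragging an item a long way still shifts everything in between, so this only helps with small moves; the bigger win is usually issuing one request per drop instead of one per element.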
It isn't just the database interactions you have to watch out for with
Ajax. You also need to pay attention to
a. Network traffic, number and size of network interactions with the
server, and how these interplay with WAN bandwidth from the client to
the server. Now of course, *everybody* who uses *your* app has at least
10 Mbit/s download and 768 kbit/s upload speed.
b. Client-side processor and RAM resources. Now of course, *everybody*
who uses *your* app has a dual-core 2 GHz PC with 2 GB of RAM.
In short, test-driven development *must* include load and scalability
testing using *realistic* network bandwidth and full client user
interface testing.
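Before setting up a real load test, a back-of-the-envelope estimate can show how much the number and size of Ajax calls matter on a given link. The sketch below is only arithmetic: the function name, the request counts, the byte sizes, and the 80 ms round trip are all assumptions, and it pretends requests run one after another with no TCP slow start, packet loss, or server time.

```ruby
# Rough estimate only. Bandwidths are in kbit/s, sizes are total bytes across
# all requests, and requests are assumed to run serially.
def estimated_seconds(requests:, bytes_up:, bytes_down:, up_kbps:, down_kbps:, rtt_ms:)
  latency  = requests * rtt_ms / 1000.0          # one round trip per request
  upload   = bytes_up * 8 / (up_kbps * 1000.0)   # bytes -> bits, kbit/s -> bit/s
  download = bytes_down * 8 / (down_kbps * 1000.0)
  latency + upload + download
end

# 20 Ajax calls, 40 KB sent and 500 KB received in total, on a link with
# 768 kbit/s up and 10 Mbit/s down (all numbers are illustrative).
estimated_seconds(requests: 20, bytes_up: 40_000, bytes_down: 500_000,
                  up_kbps: 768, down_kbps: 10_000, rtt_ms: 80)
```

With those made-up numbers the estimate comes to about 2.4 seconds, and 1.6 of that is round-trip latency, so fewer, larger requests would help more than shaving bytes. It is no substitute for testing on a real throttled connection.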