Will RoR Scale?

Advice from smart and well-intentioned people notwithstanding, I firmly believe you should think about scale early and often. Google thought about scale early - their very name is scale.

My deep concern is that those who work on improving the Ruby on Rails framework may have this “worry about scale later” mentality. I’m concerned that many of the programming paradigms in Rails lead to unscalable practices like “transparent” joins that novice programmers won’t know to avoid (e.g. image.user.comments.categories - in views no less). It would be very helpful if RoR actually solved some of the tough scale issues by, for example:

  • Including support for Amazon’s SimpleDB service
  • Intelligently unwrapping joins in AR layer, or allowing/encouraging developers to do that
  • Providing a mysql proxy (better than mysql-proxy) which is scriptable in Ruby, that provides for db load-balancing, failover, multi-master, master-per-model, index-modulo multi-master, etc. acts_as_readonlyable is a great start, but you can only scale so far without multi-master.
  • Support for lazy insertion, insert ignore, insert later, insert many, batch insertion, etc. Perhaps “acts_as_batched” ?
  • Providing better examples and support for logging using syslog over udp (there actually may be plenty of support for this already, I just haven’t had time to dig into it yet). Educate/encourage through examples how to write certain log-type info to logs instead of writing to db.

m
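
To make the "insert many / batch insertion" idea from the list above a bit more concrete, here is a minimal sketch in plain Ruby. It is not an existing Rails or ActiveRecord API: the class name BatchedInserter, its methods, and the page_views table are all made up for illustration, and the statements array stands in for a real database connection. The only point is that rows are buffered and sent as one multi-row INSERT instead of one statement per row.

```ruby
# Hypothetical sketch: BatchedInserter is an invented name, not part of Rails.
class BatchedInserter
  attr_reader :statements

  def initialize(table, columns, batch_size: 100)
    @table = table
    @columns = columns
    @batch_size = batch_size
    @rows = []
    @statements = [] # stands in for a real database connection
  end

  # Queue one row (a hash keyed by column name); flush once the batch is full.
  def add(row)
    @rows << @columns.map { |column| row.fetch(column) }
    flush if @rows.size >= @batch_size
    self
  end

  # Emit a single multi-row INSERT for everything queued so far.
  def flush
    return if @rows.empty?
    values = @rows.map { |row| "(" + row.map { |v| quote(v) }.join(", ") + ")" }
    @statements << "INSERT INTO #{@table} (#{@columns.join(', ')}) VALUES #{values.join(', ')}"
    @rows.clear
  end

  private

  # Naive quoting for illustration only; real code should use the driver's escaping.
  def quote(value)
    case value
    when nil then "NULL"
    when Numeric then value.to_s
    else "'" + value.to_s.gsub("'", "''") + "'"
    end
  end
end

inserter = BatchedInserter.new("page_views", [:user_id, :path], batch_size: 2)
inserter.add(user_id: 1, path: "/home")
inserter.add(user_id: 2, path: "/about") # batch is full, so this triggers a flush
inserter.add(user_id: 3, path: "/contact")
inserter.flush                           # send whatever is left over
puts inserter.statements
```

The trade-off is the usual one for deferred writes: rows still sitting in the buffer are lost if the process dies before a flush, so this fits log-type data better than anything that must be durable immediately.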

  - Providing a mysql proxy (better than mysql-proxy) which is

You may find this useful...

http://blog.rapleaf.com/dev/?p=5

I guess the main difference between mysql-proxy and things like this and acts_as_readonlyable is that mysql-proxy acts at the mysql layer. You can connect to mysql-proxy from the command line just as if it were a mysql server. It allows you to say things like “if it’s an INSERT or UPDATE and the table is users, then use this db” - thus it hides those details from the RoR code, which may be kept clean, simple, and flexible if/when you need to add a slave, move a master, etc.
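
For anyone who wants to see that kind of rule spelled out, here is a rough sketch of the routing decision written in plain Ruby. To be clear, this is not mysql-proxy's actual scripting interface (that is Lua) and not acts_as_readonlyable; QueryRouter, its parameters, and the host names are hypothetical. It only shows the "writes to table X go to this master, reads get spread over the slaves" logic.

```ruby
# Hypothetical sketch of proxy-style query routing; all names are invented.
class QueryRouter
  WRITE_VERBS = %w[INSERT UPDATE DELETE REPLACE].freeze

  def initialize(default_master:, replicas:, table_masters: {})
    @default_master = default_master
    @replicas = replicas           # read-only slaves
    @table_masters = table_masters # "master-per-model": table name => master
    @next_replica = 0
  end

  # Return the name of the database that should receive this statement.
  def route(sql)
    verb = sql[/\A\s*(\w+)/, 1].to_s.upcase
    if WRITE_VERBS.include?(verb)
      @table_masters.fetch(target_table(sql), @default_master)
    else
      pick_replica
    end
  end

  private

  # Very rough table extraction for common write statements; nil if unknown.
  def target_table(sql)
    sql[/\A\s*(?:INSERT\s+(?:IGNORE\s+)?INTO|REPLACE\s+INTO|UPDATE|DELETE\s+FROM)\s+(\w+)/i, 1]
  end

  # Round-robin over the slaves; fall back to the master if there are none.
  def pick_replica
    return @default_master if @replicas.empty?
    replica = @replicas[@next_replica % @replicas.size]
    @next_replica += 1
    replica
  end
end

router = QueryRouter.new(
  default_master: "db-master",
  replicas: ["db-slave-1", "db-slave-2"],
  table_masters: { "users" => "db-users-master" }
)
puts router.route("INSERT INTO users (name) VALUES ('m')") # db-users-master
puts router.route("UPDATE images SET title = 'x'")          # db-master
puts router.route("SELECT * FROM users")                    # db-slave-1
```

A real proxy also has to deal with transactions and with reads that must see a write that just happened (replication lag), which this sketch ignores.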

My only real problem with mysql-proxy is lua - what’s up with that? ;^)

Just my $0.02,

m

Marc Byrd wrote:

Advice from smart and well-intentioned people notwithstanding, I firmly believe you should think about scale early and often. Google thought about scale early - their very name is scale.

My deep concern is that those who work on improving the Ruby on Rails framework may have this "worry about scale later" mentality. I'm concerned that many of the programming paradigms in Rails lead to unscalable practices like "transparent" joins that novice programmers won't know to avoid (e.g. image.user.comments.categories - in views no less).

I'm quite sure Google architects knew these pitfalls and worked out by themselves how to avoid them... As they simply *coded* their own framework (and probably several times along their path).

If you really believe you can throw a "scaling" framework into the hands of "novice programmers" and expect them to code something that scales anything like Google's underlying architecture, just wait a minute, I'll fetch a good seat and popcorn to watch the show. I'm sure I'll find it either quite instructive or good laughing material :)

Good luck,

Lionel

My experience taught me to think about it often, but code for it only when needed.

"Just build it" sounds cool to some and stupid to others, but it does work.

Until you have real people smashing around in your carefully crafted application, you will never know where and more importantly when to address scaling.

You can assume it is in location X, Y, or Z, but who knew users of the application would be clicking on A, B, C like hamsters on a treadmill.

When I started with Rails, it was only a few weeks after David and 37Signals released it to the world. At the time my application ran under Apache, FastCGI and MySQL. That was the entire technology chain using 1 web server and 1 database server.

Today, the very same application runs on 5 servers, using MySQL, Nginx, Mongrel, Amazon S3, a dedicated email platform and numerous back processing scripts on dedicated servers.

Some of my scaling challenges:

- Concurrent users were blocked by long-running actions, which are now offloaded to the background.
- Large objects used to be stored in the database; they now go to Amazon S3 or the filesystem, and only the metadata is stored in the database.
- FastCGI was unstable under heavy load; the application now runs on proxied Mongrels.
- Images and JavaScript loaded slowly; they are now served statically, outside the application, using a couple of lines of Nginx configuration.
- Large pages took a long time to render; they were retooled to avoid slow methods like url_for and are paginated more intelligently.
- MySQL Fulltext searches were sluggish; search now uses an external tool like Sphinx.
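
To illustrate the first item, offloading long-running actions to the background, here is a small plain-Ruby sketch. It is an assumption-laden toy, not how my application does it: BackgroundWorker and the job names are invented, and it uses an in-process thread where a production setup would more likely use separate worker processes or scripts on dedicated servers. The idea is just that the request enqueues the slow work and returns immediately.

```ruby
# Hypothetical sketch: an in-process worker thread fed by a queue.
class BackgroundWorker
  def initialize
    @queue = Queue.new
    @results = Queue.new
    @thread = Thread.new { work_loop }
  end

  # Called from the "request": hand off the slow work and return at once.
  def enqueue(name, &job)
    @queue << [name, job]
    name
  end

  # Wait for queued jobs to finish and return [name, result] pairs in order.
  def shutdown
    @queue << :stop
    @thread.join
    finished = []
    finished << @results.pop until @results.empty?
    finished
  end

  private

  def work_loop
    loop do
      item = @queue.pop
      break if item == :stop
      name, job = item
      begin
        @results << [name, job.call]
      rescue StandardError => e
        @results << [name, e] # keep the worker alive if one job fails
      end
    end
  end
end

worker = BackgroundWorker.new
worker.enqueue(:report) { sleep 0.05; (1..100).sum } # slow job
worker.enqueue(:thumbnail) { "thumb-42.png" }        # quick job
# The request would return to the user here instead of waiting.
p worker.shutdown
```

With a single worker thread the jobs run in the order they were queued, so one slow job still delays the ones behind it; it just no longer blocks the user who triggered it.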

After several years of real world use by customers, I now have enough experience to see some issues down the road.

Rails can and will scale if you think of it as only one part of the solution.

If you think every tool needed is in the framework itself, your application will not scale.