Need info on RoR for big e-commerce project...

Hey there :slight_smile:

I'm about to start development on a project to build a new online book store for a small retailer and publisher.

The system, feature-wise, is pretty much Amazon, and the data that's backing it is a catalog of more than 2½ million items.

To back the Amazon-like functionality (like "other users bought" etc.) we'll be tracking user behavior, offering customized newsletters, etc. etc. etc.

So, the data is immense, and data handling is quite significant, too. The system will be running on an IIS server with an MS SQL Server. That's just how it is, as this is the setup the business is using, and they're not interested in migrating.

We actually already decided on TYPO3 as our base system, after a longer research into different CMS'es, CMF's and eCommerce systems.

Then, by chance, I stumbled across Ruby on Rails. My brother told me about it once, how students at his design school were using it to build applications in no time. Considering the programming skills of designers at his school, and the way he talked about it, I got the impression that it was more a toy than a serious development tool, so I never really considered it for this project.

I then grabbed a book on the subject and read a few pages. I was intrigued. We spoke about it as a possible last-minute candidate, and decided that I should give it a few days of scrutiny to determine its viability for our project. I'm currently playing around with RoR tutorials, and I'm extremely excited. This thing is so intuitive and easy, it's amazing.

But then, I see the performance comparisons at shootout.alioth.debian.org, and compared to PHP (which would be the alternative), it's not exactly impressive. But, on the other hand, it's not like I'm going to be doing much of the stuff that these benchmarks do, like create mandelbrots etc. Basically, we need to get and put data, much like any other web site out there. Data processing in scripts is probably minimal (?), as most of it can be handled by stored procedures (or whatever it's called in MSSQL terminology).

But still, I need to know about scalability. Can anyone with sufficient insight help me out with this? Although this system should mimic Amazon's feature list, the userbase is significantly lower. In 2005, there was an average of 760 visitors per day, so the odds of having something like 100 hits at the same time are marginal. 2.5+ million items plus all the candy, yes, but not an extreme amount of users.
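To put the visitor figure in perspective, here is a rough back-of-envelope calculation in Ruby. Only the 760 visitors per day comes from the post; the 10 page views per visit and the 10x peak factor are assumptions picked purely for illustration.

```ruby
# Back-of-envelope request rate.
# ASSUMPTIONS: pages_per_visit and peak_factor are illustrative guesses,
# not figures from the site; only visitors_per_day (760) is from the post.
SECONDS_PER_DAY = 86_400.0

def average_requests_per_second(visitors_per_day, pages_per_visit)
  (visitors_per_day * pages_per_visit) / SECONDS_PER_DAY
end

def peak_requests_per_second(visitors_per_day, pages_per_visit, peak_factor)
  average_requests_per_second(visitors_per_day, pages_per_visit) * peak_factor
end

avg  = average_requests_per_second(760, 10)
peak = peak_requests_per_second(760, 10, 10)
puts format("average: %.3f req/s, assumed peak: %.2f req/s", avg, peak)
```

Under those assumptions the average works out to roughly 0.09 requests per second, and even the assumed peak stays below one request per second, which is a long way from 100 simultaneous hits.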

So is RoR fast enough?

Someone mentioned problems with MSSQL with a lot of concurrent users (amateur that I am, I forgot to bookmark it). Now, this guy may just have made crappy code, his network was congested, his hardware crappy, I don't know... But are there any known problems with the MSSQL "wrapper" in RoR? And how does it handle scaling in general?

So is RoR stable enough?

And how about eCommerce add-ons? A lot of other systems, like TYPO3 and Drupal have eCommerce-functionality available as add-ons. Does something like this exist for RoR?

Links to similar big sites, preferably eCommerce but any big site really, would be great! So that I can show the big guys with the money that RoR can do this, because subjectively speaking, it makes my belly tinkle inside, it's really really that sweet...

Thanks in advance for any help!

Daniel Buus, Denmark :slight_smile:

I would recommend building a sample app that pulls plenty of data from your MSSQL server and benchmarking it. From the tests I've done, RoR can run very quickly, especially if you implement action and fragment caching. I'm running on lighttpd, so I don't know about implementing Rails with IIS (are you sure you can't convince them to switch web servers?), but I've never seen any stability problems with a properly written app.
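As a starting point for that kind of benchmark, here is a minimal sketch using Ruby's standard `benchmark` library. The `mean_seconds` helper and the `fake_catalog_query` lambda are hypothetical stand-ins; in a real test the block would run an actual query against the MSSQL server.

```ruby
require "benchmark"

# Times the given block over several runs and returns the mean
# wall-clock time per run in seconds.
def mean_seconds(runs)
  total = 0.0
  runs.times { total += Benchmark.realtime { yield } }
  total / runs
end

# HYPOTHETICAL stand-in for a catalog query; replace the body with a real
# database call when benchmarking against your own server.
fake_catalog_query = lambda do
  (1..50_000).map { |i| i * 2 }.select { |n| (n % 3).zero? }.size
end

mean = mean_seconds(5) { fake_catalog_query.call }
puts format("mean per run: %.4f s", mean)
```

Averaging over several runs smooths out one-off effects such as a cold database cache, though it is no substitute for testing with realistic data volumes.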

As far as ecommerce, you may want to check out Substruct <http://dev.subimage.com/projects/substruct>.

Jason

Daniel Smedegaard Buus wrote:

A)

But still, I need to know about scalability. Can anyone with sufficient insight help me out with this? Although this system should mimic Amazon's feature list, the userbase is significantly lower. In 2005, there was an average of 760 visitors per day, so the odds of having something like 100 hits at the same time are marginal. 2.5+ million items plus all the candy, yes, but not an extreme amount of users.

So is RoR fast enough?

As most of your heavy lifting (dealing with the 2.5 million items, creating 'Other users looked at', etc) will be done on the db, Rails should be able to handle the rest of the requirements (fetching, displaying and putting data) as well as any web app framework. It's unlikely to be the fastest possible solution, but should easily handle your needs.

In addition, rails provides very easy-to-use caching functionality that can help reduce load on your db.
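The idea behind that caching can be sketched without any framework at all. The `TinyCache` class below is a hypothetical illustration of the fetch-or-compute pattern, not the Rails caching API: an expensive result is stored under a key and reused until it is explicitly expired.

```ruby
# HYPOTHETICAL, framework-free sketch of fetch-or-compute caching.
# This is not the Rails API; it only illustrates the principle.
class TinyCache
  def initialize
    @store = {}
  end

  # Returns the cached value for key, computing and storing it on a miss.
  def fetch(key)
    return @store[key] if @store.key?(key)
    @store[key] = yield
  end

  # Drops an entry, e.g. after the underlying record has changed.
  def expire(key)
    @store.delete(key)
  end
end

db_hits = 0
cache = TinyCache.new

render_related = lambda do |book_id|
  cache.fetch("related/#{book_id}") do
    db_hits += 1 # stands in for an expensive "other users bought" query
    "<ul><li>related titles for #{book_id}</li></ul>"
  end
end

3.times { render_related.call(42) }
puts db_hits # the expensive block ran only on the first call
```

The hard part in practice is deciding when to call `expire`, since a stale "other users bought" list is usually acceptable for a while but a stale price is not.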

Someone mentioned problems with MSSQL with a lot of concurrent users (amateur that I am, I forgot to bookmark it). Now, this guy may just have made crappy code, his network was congested, his hardware crappy, I don't know... But are there any known problems with the MSSQL "wrapper" in RoR? And how does it handle scaling in general?

I maintain the SQL Server adapter, so if you find that link again I'd be very keen to look at it. In general we've had few issues using SQL Server for long-running but low-volume internal applications. I would recommend you use ODBC and steer clear of the ADO mode though - the ADO dbi drivers are not nearly as mature as the ODBC ones. Using ODBC, I've never encountered any real stability issues.

So is RoR stable enough?

The one ingredient in your setup I'm unfamiliar with is IIS, so I can't really tell you how it will all fit together. I'd suggest setting up a test environment (iis, mongrel cluster, db, app that hits the database in some trivial way) and hammering it with a testing tool to see how many requests it can handle. If you could take a difficult request from your existing app and try to model it in a trivial way, it should provide you with all the information you'd need.
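A dedicated load-testing tool is the better choice for real numbers, but the basic idea of hammering an app can be sketched in a few lines of Ruby. The `hammer` helper below is hypothetical; it runs whatever block it is given from several threads and records how long each call took. The `sleep` in the usage line is a placeholder for a real HTTP request to the test environment.

```ruby
require "benchmark"

# HYPOTHETICAL helper: runs the given block `total` times from
# `concurrency` threads and returns each call's duration in seconds.
def hammer(total, concurrency, &request)
  jobs = Queue.new
  total.times { |i| jobs << i }
  timings = Queue.new

  workers = Array.new(concurrency) do
    Thread.new do
      loop do
        begin
          job = jobs.pop(true) # non-blocking pop; raises ThreadError when empty
        rescue ThreadError
          break
        end
        timings << Benchmark.realtime { request.call(job) }
      end
    end
  end
  workers.each(&:join)

  Array.new(timings.size) { timings.pop }
end

# Placeholder workload; swap the block body for a real HTTP GET against
# your test server when measuring an actual deployment.
durations = hammer(20, 4) { |_job| sleep 0.001 }
puts format("%d requests, slowest %.4f s", durations.size, durations.max)
```

Looking at the slowest durations as well as the average is worthwhile, because a handful of very slow requests is often the first sign of a saturated database or proxy.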

Tom

PS Information on setting up IIS and mongrel can be found here - I've never done it (we use Apache) but it looks fairly straightforward.

http://www.napcsweb.com/howto/rails/deployment/railsonIISWithMongrel.pdf

Have a look at http://shopify.com, an e-commerce hosting provider, written in Rails.

My gut feeling and the herd opinion too is that Rails is perfectly scalable beyond your needs. As long as you don't burn CPU frivolously, your individual web servers will never be CPU pegged - you will max out your database first in a properly architected system.

To get to Amazon levels, no web framework will really suit you; you will have to write your own architecture from scratch.

<snip>

So, the data is immense, and data handling is quite significant, too. The system will be running on an IIS server with an MS SQL Server. That's just how it is, as this is the setup the business is using, and they're not interested in migrating.

We actually already decided on TYPO3 as our base system, after a longer research into different CMS'es, CMF's and eCommerce systems.

Stop right there. You've got an immense amount of data in an existing backend system and the customer has already constrained you to use IIS and MS SQL server and you're already going with TYPO3.

Adding rails to the mix would be an integration nightmare. It doesn't run as well on windows, it doesn't really like legacy databases, it is limited behind IIS (but talk to Brian Hogan), and Ruby is just dog slow on windows. Throw in nastiness like trying to get IIS to properly bounce between TYPO3, your app, and whatever legacy stuff they have now and you're in for some serious trouble.

If you go with Rails in your case it'll take you much longer than if you just went with straight up Microsoft tools. Sorry to say it, but Rails is totally the wrong tool for the job.

Take a look at the monorail project for .NET if you really want to be fancy: http://www.castleproject.org/index.php/MonoRail

I'd read it that he'd decided on TYPO3, then stumbled across RoR, and was considering using RoR *instead* of TYPO3. If the plan is to integrate the two together (alongside backend systems and what-not), then everything Zed wrote is correct - don't do it, just go with the Microsoft flow.

If (as I'd originally assumed) the question was whether it's feasible to run *just* RoR on an IIS/SQL Server setup (with no other integration), I'd say it was worth looking into. We've successfully integrated legacy SQL Server DBs into windows hosted RoR apps with few major issues. On the whole, the process has been very pleasant (particularly now we have Mongrel - thanks Zed).

Tom

It seems to me like the foreign_key_associations plug-in (or something like it) should be folded into Active Record. Then Rails could just stipulate (we like to be opinionated) that a foreign key constraint must be set to unique if there is a has_one relationship through that foreign key that is to be picked up auto-magically, otherwise it's an auto_magic has_many.

This would take care of that point made in Agile_v1: "We have to give Active Record a little help when it comes to intertable relationships. This isn't really Active Record's fault--it isn't possible to deduce from the schema what kind of intertable relationships the developer intended" (p226).

O.k., there are a "few" issues (habtm, :through, etc.) but this seems clean and on the right track.

My 2 cents for the day.

--Russ

I can help with regards to IIS…

Rails will not work with IIS. You need to get something else behind it. There are ways to hack it, but you will find very quickly that it does not hold up well, if at all.

The solution is simple. In the upcoming Rails Deployment book, I outline a few possible solutions where you run a Rails app server behind IIS and use a combination of a reverse-proxy plugin for IIS along with a custom plugin I developed.

We use Mongrel for our smaller sites (only one instance) and it runs VERY well for us. Our larger instances use an old Windows port of Pen which proxies to several Mongrels. IIS forwards to Pen which forwards to Mongrel. Static files are still served by Mongrel which isn’t fast but is fast enough for our needs.

Need faster? Throw VMWare on your Windows machine, download one of the free official Ubuntu 6.06 server images, fire that up, install Rails, mongrel, etc and then use IIS to forward requests to that. I want to include a section on that in the book but I don’t know if we still have time to cram that in.

The best solution is to run your apps on a dedicated Linux machine and use IIS to pass requests to that machine, or even better, use Linux only.

If you have to use SQL Server, that makes things a little more complicated on the Linux side just because of the connection libraries you need to install. But we’ve had very few problems with SQL Server that we can’t get around, and those have been related to our databases themselves.

Daniel Smedegaard Buus wrote:

The system, feature-wise, is pretty much Amazon, and the data that's backing it is a catalog of more than 2½ million items.

[SNIP]

So, the data is immense, and data handling is quite significant, too.

Assuming you'll be able to push most of the heavy lifting data-wise to the DBMS, and therefore won't have to retrieve and process millions of rows in the application layer, your amount of data shouldn't be a big problem.

The system will be running on an IIS server with an MS SQL Server. That's just how it is, as this is the setup the business is using, and they're not interested in migrating.

Sucks to be them - and you, for that matter. If Windows as a deployment platform is an absolutely fixed requirement I doubt Ruby and Rails is the right tool for the job (I'd also doubt that Amazon could be run on Windows, but I digress). It's not impossible, though.

Running Rails on IIS is generally a big no-go. However, as others have pointed out, running Rails on Mongrel behind IIS might be feasible. You might be able to do this until you start running into performance issues and then you can start considering scaling to, say, a Linux based cluster.

SQL Server isn't going to pose terribly many problems for you. We're currently rewriting our legacy VBScript application in Rails and use SQL Server as the DBMS and it's working just fine. We're developing and deploying the application to Linux servers, however.

In 2005, there was an average of 760 visitors per day, so the odds of having something like 100 hits at the same time are marginal. 2,5+ million items plus all the candy, yes, but not an extreme amount of users.

So is RoR fast enough?

Sure, for those kind of numbers Rails will treat you just fine. Clever use of caching will cover growth for quite a while.

Daniel Buus, Denmark :slight_smile:

If you're in the Copenhagen area, you should check out http://copenhagenrb.dk/, subscribe to the mailing list and we can hook up at the next meetup (in roughly a month, I reckon). We'd love to see you there :slight_smile:

Jason Norris wrote:

I would recommend building a sample app that pulls plenty of data from your MSSQL server and benchmarking it. From the tests I've done, RoR can run very quickly, especially if you implement action and fragment caching. I'm running on lighttpd, so I don't know about implementing Rails with IIS (are you sure you can't convince them to switch web servers?), but I've never seen any stability problems with a properly written app.

Good to know. I think the web server is open to negotiation, but the DB is not. I used IIS and ASP once back in 2000 (I still get a claustrophobic feeling reminiscing about it), and since then I've used nothing but Apache and PHP on Linux or BSD servers. My personal development and test server is a Debian Sarge installation, and using it (did anyone say stability?) finally made my constant distro switching on my laptop settle with the Debian-based Kubuntu distro. Up until then I had only used Red Hat-based distros (started with Red Hat back in '02, then Mandrake, SuSE, Fedora, etc.), but never really got that "solid" feeling that Debian strains give you. But that's an entirely different discussion, sorry! :smiley:

As far as ecommerce, you may want to check out Substruct <http://dev.subimage.com/projects/substruct>.

I found that earlier, made me happy :slight_smile: Then I got a personal mail with lots of goodies, among which an invitation to check out this new RoR-based eCommerce site: www.gbposters.com. This made me even more happy :slight_smile:

Jason

Thanks, Jason, for your time :slight_smile:

Mathieu Chappuis wrote:

A)
> The system, feature-wise, is pretty much Amazon, and the data that's backing it is a catalog of more than 2½ million items.
> So, the data is immense, and data handling is quite significant, too.

+

What does that mean?

B)
> The system will be running on an IIS server with an MS SQL Server. That's just how it is, as this is the setup the business is using, and they're not interested in migrating.

IMHO, and with real experience with ROR+IIS in few words :

"Happy Russian Roulette!"

This is - in other words - what I'm basically reading out of the posts here. For my records - to use for argumentation about migrating the web server if need be - could you elaborate about your experiences?

Thanks :slight_smile:

Tom Ward wrote:

> But still, I need to know about scalability. Can anyone with sufficient insight help me out with this? Although this system should mimic Amazon's feature list, the userbase is significantly lower. In 2005, there was an average of 760 visitors per day, so the odds of having something like 100 hits at the same time are marginal. 2.5+ million items plus all the candy, yes, but not an extreme amount of users.
>
> So is RoR fast enough?

As most of your heavy lifting (dealing with the 2.5 million items, creating 'Other users looked at', etc) will be done on the db, Rails should be able to handle the rest of the requirements (fetching, displaying and putting data) as well as any web app framework. It's unlikely to be the fastest possible solution, but should easily handle your needs.

These are exactly my thoughts. If I had no one to answer to, I'd just go ahead and do this thing in RoR and deal with whatever consequences. It's just that creepy feeling that you get when you know that if it turns out otherwise, with terrible performance to show, every one will be looking at you because you were the one originally advocating the system. I basically just need to hear as many people say that performance is not an issue, so that I'm not basing my advice on my own gut feeling and conviction, but a lot of independent people's experiences and qualified advice.

In addition, rails provides very easy-to-use caching functionality that can help reduce load on your db.

I've just looked a bit at this, and it looks good! Also, I hear from another source that it might very well be needed, as performance differences with and without caching are immense.

> Someone mentioned problems with MSSQL with a lot of concurrent users (amateur that I am, I forgot to bookmark it). Now, this guy may just have made crappy code, his network was congested, his hardware crappy, I don't know... But are there any known problems with the MSSQL "wrapper" in RoR? And how does it handle scaling in general?

I maintain the SQL Server adapter, so if you find that link again I'd be very keen to look at it. In general we've had few issues using SQL Server for long-running but low-volume internal applications. I would recommend you use ODBC and steer clear of the ADO mode though - the ADO dbi drivers are not nearly as mature as the ODBC ones. Using ODBC, I've never encountered any real stability issues.

Cool, so you're the guy behind it! :slight_smile: I've used the ADO version, it seems. Never used MS SQL before, so I kinda just followed the first link on Google. Will try to find another one for the ODBC version now then. Didn't have any problems with the ADO driver while tutorialling, though, but that's not really production-grade behavior anyway :wink: But I should probably switch to ODBC when performance testing then.

I tried looking in my history, and I _think_ this might be the post I was thinking about: http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/211423

But this guy is also using ADO anyway. And I have nowhere near enough knowledge about Ruby or this guy's particular problem to know why his code fails.

> So is RoR stable enough?

The one ingredient in your setup I'm unfamiliar with is IIS, so I can't really tell you how it will all fit together. I'd suggest setting up a test environment (iis, mongrel cluster, db, app that hits the database in some trivial way) and hammering it with a testing tool to see how many requests it can handle. If you could take a difficult request from your existing app and try to model it in a trivial way, it should provide you with all the information you'd need.

Okay, I gotta know now. Why exactly would one keep IIS instead of just running the Webrick server? Does it scale badly, or...? Do I get any kind of extra features by integrating with IIS that I'd otherwise miss out on?

Tom

PS Information on setting up IIS and mongrel can be found here - I've never done it (we use Apache) but it looks fairly straightforward.

http://www.napcsweb.com/howto/rails/deployment/railsonIISWithMongrel.pdf

I'll check it out, although I get the feeling I'd be silly to go with IIS :smiley:

Thanks for your time and your help!

Daniel

Richard Conroy wrote:

> Hey there :slight_smile:
>
> I'm about to start development on a project to build a new online book store for a small retailer and publisher.
>
> The system, feature-wise, is pretty much Amazon, and the data that's backing it is a catalog of more than 2½ million items.

Have a look at http://shopify.com, an e-commerce hosting provider, written in Rails.

Thanks for the link, I just skimmed it, will check it out properly in a little while. Is it just me, though, or are RoR apps generally more nicely designed than "regular" web apps?

My gut feeling and the herd opinion too is that Rails is perfectly scalable beyond your needs. As long as you dont burn CPU frivolously, your individual web servers will never be CPU pegged - you will max out your database first in a properly architected system.

Thanks for saying that. I need as many opinions like this as humanly possible to put my mind at rest :slight_smile:

To get to Amazon levels, no web framework will suit your really, you will have to write your own architecture from scratch.

I've heard this before. And I assume that when we talk about "Amazon levels" it's in regards to user load, and not functionality, right?

Thank you :slight_smile:

Daniel

Zed A. Shaw wrote:

> <snip>
> So, the data is immense, and data handling is quite significant, too. The system will be running on an IIS server with an MS SQL Server. That's just how it is, as this is the setup the business is using, and they're not interested in migrating.
>
> We actually already decided on TYPO3 as our base system, after a longer research into different CMS'es, CMF's and eCommerce systems.

Stop right there. You've got an immense amount of data in an existing backend system and the customer has already constrained you to use IIS and MS SQL server and you're already going with TYPO3.

Adding rails to the mix would be an integration nightmare. It doesn't run as well on windows, it doesn't really like legacy databases, it is limited behind IIS (but talk to Brian Hogan), and Ruby is just dog slow on windows. Throw in nastiness like trying to get IIS to properly bounce between TYPO3, your app, and whatever legacy stuff they have now and you're in for some serious trouble.

Oh, it's not supposed to interoperate with TYPO3, it's supposed to be used instead of it.

A couple of the other points would be really nice to have some elaboration on, though, as they could either disqualify RoR as an option, or have us migrate the entire platform. So, could you possibly elaborate on "It doesn't run as well on Windows", "It doesn't really like legacy databases", and "Ruby is just dog slow on Windows"? Thank you :slight_smile:

If you go with Rails in your case it'll take you much longer than if you just went with straight up Microsoft tools. Sorry to say it, but Rails is totally the wrong tool for the job.

Take a look at the monorail project for .NET if you really want to be fancy: http://www.castleproject.org/index.php/MonoRail

I checked it out, but off-hand it's disqualified for being beta. There's no way I could ever sell that to the big guys upstairs. Also, personally, I'd prefer not to use anything Microsoft, as my experiences with their languages are not so good, to put it kindly (except for C# which is really really nice, so if some sweet framework à la RoR exists in C#, that would be interesting, too).

Thanks for your time, Zed, and I hope you'll elaborate on the points above :slight_smile:

Daniel

Tom Ward wrote:

> > <snip>
> > So, the data is immense, and data handling is quite significant, too. The system will be running on an IIS server with an MS SQL Server. That's just how it is, as this is the setup the business is using, and they're not interested in migrating.
> >
> > We actually already decided on TYPO3 as our base system, after a longer research into different CMS'es, CMF's and eCommerce systems.
>
> Stop right there. You've got an immense amount of data in an existing backend system and the customer has already constrained you to use IIS and MS SQL server and you're already going with TYPO3.

I'd read it that he'd decided on TYPO3, then stumbled across RoR, and was considering using RoR *instead* of TYPO3. If the plan is to integrate the two together (alongside backend systems and what-not), then everything Zed wrote is correct - don't do it, just go with the Microsoft flow.

You were quite correct in your understanding :slight_smile:

If (as I'd originally assumed) the question was whether it's feasible to run *just* RoR on an IIS/SQL Server setup (with no other integration), I'd say it was worth looking into. We've successfully integrated legacy SQL Server DBs into windows hosted RoR apps with few major issues. On the whole, the process has been very pleasant (particularly now we have Mongrel - thanks Zed).

Good to hear! Although, it would be fantastic if you would share one or two of those "few major issues", so that I know what to expect.

Tom

Thank you, Tom! :slight_smile:

Daniel

Brian Hogan wrote:

I can help with regards to IIS...

Rails will not work with IIS. You need to get something else behind it. There are ways to hack it, but you will find very quickly that it does not hold up well, if at all.

The solution is simple. In the upcoming Rails Deployment book, I outline a few possible solutions where you run a Rails app server behind IIS and use a combination of a reverse-proxy plugin for IIS along with a custom plugin I developed.

It definitely seems to be a no-brainer that I'll have to do some persuasion to ditch IIS if RoR turns out to be the way to go. I don't think it'll be the biggest problem though.

I will have a look at Mongrel, maybe for the test session (really need to get on that very soon! :slight_smile:)

Cheers, Daniel

I tried looking in my history, and I _think_ this might be the post I was thinking about: http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/211423

But this guy is also using ADO anyway. And I have nowhere near enough knowledge about Ruby or this guy's particular problem to know why his code fails.

He's opening a second statement against a connection while he already has one open. That's not supported by dbi drivers, and not used within rails.

Okay, I gotta know now. Why exactly would one keep IIS instead of just running the Webrick server? Does it scale badly, or...? Do I get any kind of extra features by integrating with IIS that I'd otherwise miss out on?

Webrick is really just a toy server. It's great for development, but nowhere near stable/secure enough for production.

In place of Webrick many people use Mongrel, which is extremely easy to set up and use and excels at serving rails applications (though not great at static content). It also comes with excellent documentation, and a very helpful userbase. However, Ruby on Rails is not thread safe, so when running rails, only a single concurrent request can be handled. This is a rails issue - Mongrel itself can handle concurrent requests.

To get around this, most deployments run many mongrel processes on each server, each on a different port. To distribute requests between the mongrel instances, some form of load-balancer or proxy is used (there seem to be loads to choose from, though I don't know exactly what your options are on windows). One common solution is to use Apache; it may be possible to use IIS but it doesn't seem exactly recommended :wink:
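The distribution step can be illustrated with a toy round-robin picker. This is only a sketch of the principle, not how Apache, Pen or any real proxy is implemented, and the class name and port numbers are assumptions chosen for the example.

```ruby
# HYPOTHETICAL sketch of round-robin distribution across backend ports.
# Real proxies also handle health checks, retries and concurrency; this
# version is deliberately minimal and not thread-safe.
class RoundRobin
  def initialize(ports)
    @ports = ports
    @next = 0
  end

  # Returns the port for the next request and advances the pointer,
  # wrapping around after the last port.
  def pick
    port = @ports[@next]
    @next = (@next + 1) % @ports.size
    port
  end
end

# Assumed ports for three mongrel processes on one machine.
balancer = RoundRobin.new([8000, 8001, 8002])
puts 5.times.map { balancer.pick }.inspect
# => [8000, 8001, 8002, 8000, 8001]
```

Because each process handles one request at a time, the number of processes sets an upper bound on how many requests the server can work on simultaneously.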

Tom

Jakob Skjerning wrote:

Daniel Smedegaard Buus wrote:
> The system, feature-wise, is pretty much Amazon, and the data that's backing it is a catalog of more than 2½ million items.

[SNIP]

> So, the data is immense, and data handling is quite significant, too.

Assuming you'll be able to push most of the heavy lifting data-wise to the DBMS, and therefore won't have to retrieve and process millions of rows in the application layer, your amount of data shouldn't be a big problem.

Exactly my opinion and thoughts. Good to hear more corroboration on that :slight_smile:

> The system will be running on an IIS server with an MS SQL Server. That's just how it is, as this is the setup the business is using, and they're not interested in migrating.

Sucks to be them - and you, for that matter. If Windows as a deployment platform is an absolutely fixed requirement I doubt Ruby and Rails is the right tool for the job (I'd also doubt that Amazon could be run on Windows, but I digress). It's not impossible, though.

Why exactly? (The former opinion about Windows as a deployment platform) Please elaborate.

Running Rails on IIS is generally a big no-go. However, as others have pointed out, running Rails on Mongrel behind IIS might be feasible. You might be able to do this until you start running into performance issues and then you can start considering scaling to, say, a Linux based cluster.

SQL Server isn't going to pose terribly many problems for you. We're currently rewriting our legacy VBScript application in Rails and use SQL Server as the DBMS and it's working just fine. We're developing and deploying the application to Linux servers, however.

Good to hear. So far, it's been almost exclusively positive feedback about the MS SQL integration. Very important! :slight_smile:

> In 2005, there was an average of 760 visitors per day, so the odds of having something like 100 hits at the same time are marginal. 2,5+ million items plus all the candy, yes, but not an extreme amount of users.
>
> So is RoR fast enough?

Sure, for those kind of numbers Rails will treat you just fine. Clever use of caching will cover growth for quite a while.

I'll be caching away :wink:

> Daniel Buus, Denmark :slight_smile:

If you're in the Copenhagen area, you should check out http://copenhagenrb.dk/, subscribe to the mailing list and we can hook up at the next meetup (in roughly a month, I reckon). We'd love to see you there :slight_smile:

Sounds great! I signed up earlier today. See you there, I guess :slight_smile:

Cheers, Daniel

We had a few problems when starting out ~18 months ago, mainly because the deployment options were very flaky (Apache + fastcgi), and the sql server adapter was unloved. In the time since, the community has put loads of work into both. We wouldn't have the same problems if we started out now.

Tom