Upcoming app > Research and Prep

Hey everyone.

I'm a one-person team and in the next month I need to produce and deploy
an application. I've been playing with Rails for a little over a year
now and recently deployed my first app as a private contract.

I'm now planning development for a startup company.

The app should expect high volumes (not like Twitter, but high for
sure; it's not a social media app). My target audience is very specific,
but in certain locales the app has been requested by many users who are
in fact waiting for the app to deploy.

Smaller volumes will actually be uploading files and adding objects to
the DB. Many people will be surfing the site. (Probably the same sort of
posters:readers RATIOS as craigslist, but nowhere near the volume.)

My current plan is below, and I need some help with it:

Start on shared hosting unlimited traffic/storage. Will move to
dedicated-virtual if shared hosting causes performance issues when
traffic picks up.
Behavior Driven Development via RSpec & Cucumber.
Build the app in Rails with ActiveRecord. (I read somewhere that
ActiveRecord can be swapped out for something faster. Should I use
something else?)
Use multiple MySQL databases: 1 master for writing, at *least* 2 slaves
for reading. (Any info on this would be good.)
Run multiple mongrel instances via mongrel_cluster. (How many is
sufficient? I think currently I have 4 available on my hosting, can
upgrade later if needed. So that would be 1 main instance and 3
extras.)
PayPal via Active Merchant for all payment processing. (May accept
money orders and checks, but that's not app related.)
Amazon S3 for file/image upload & hosting (via Paperclip). I may start
by hosting all images on the local server, as I have unlimited space,
but I feel that as the app grows my host may not appreciate the
traffic, and I'll move to Amazon S3.
Deploy via Git & Capistrano

I'm going to build all the base functionality for the application
myself: get it running, get people signed up, and get them using the
app. Hopefully the multiple DBs and Mongrel instances will supply
enough speed to start. At that point, if everything goes as planned (it
never does), I'll outsource to get added security, get queries
optimized, and work on overall performance via back-end processes and
so on. That will be over my head for sure, hence the outsourcing or
hiring of a private team. I know these things will be necessary at that
point, and the costs have been accounted for.

Any help with gathering resources where I can find more information
about the above topics or personal experiences would be greatly
appreciated.

Start on shared hosting unlimited traffic/storage. Will move to
dedicated-virtual if shared hosting causes performance issues when
traffic picks up.

Stay away from "shared hosting"; you'll have a bad time unless you're
running on at least a VPS or equivalent (EC2 instance, Heroku dyno,
etc.). Any host that's offering "unlimited bandwidth / unlimited
storage" is probably so oversold as to be useless.

Behavior Driven Development via RSpec & Cucumber.
Build the app in Rails with ActiveRecord. (I read somewhere that
ActiveRecord can be swapped out for something faster. Should I use
something else?)
Use multiple MySQL databases: 1 master for writing, at *least* 2 slaves
for reading. (Any info on this would be good.)

This is a fairly sizable bucket of hurt, especially to start with. I'm
also unsure how this squares with the "shared hosting" above. You're
better off starting with a single MySQL instance, and dealing with the
expansion when the load demands it.

Run multiple mongrel instances via mongrel_cluster. (How many is
sufficient? I think currently I have 4 available on my hosting, can
upgrade later if needed. So that would be 1 main instance and 3
extras.)

The "standard" deployment stack (if such a thing can be said to exist)
uses Passenger in preference to Mongrel. There's considerably less
Apache fiddling to get a Passenger setup running.

Finally, you might want to take a look at some of the "cloud"
offerings out there; Heroku is pretty popular, and is apparently very
easy to deploy to and scale. Amazon EC2 is another possibility; they
also offer pre-configured MySQL server instances on some fairly big
iron.

Hope this helps!

--Matt Jones

Wow, I thought Passenger was just another deploy option, not a whole
serving option.

So I can eliminate the need for multiple Mongrel instances by
deploying with Passenger? It will load balance and start/stop instances
based on load? How does it actually serve? Is the app served from a
Passenger instance?

A hearty +1 to the idea of using Heroku. For someone with a lot of
plain old development work on his/her plate, the ease of deployment
to Heroku is a real time-saver. And as you get more traffic, all you
have to do is buy the dynos :)

I wish I had known about this place before I paid for my current
hosting =S . Unfortunately I think I'll be staying with my current
host until I see problems or at least start getting customers.
As long as I keep scalability in mind through the whole process I
should be okay. If I prep for multiple DB instances and back-end
queuing, I should be able to move anywhere and scale accordingly...
right?

Life may be much simpler for you when you scale if you use a NoSQL option, at least for some things - Redis, for example, is a great replacement for large join tables (picture tags, user friends, etc.). Also, I’ve found that good row-level caching (on one of our projects, a 98.5% hit rate) is far superior to multiple replicas. In other words, you might try to find ways to lighten the load on MySQL, if not replace it altogether.
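To make the row-level caching idea concrete, here is a minimal sketch in plain Ruby. It is only an illustration under stated assumptions: a Hash stands in for the cache store (memcached, Redis, etc.), another Hash stands in for the MySQL table, and the names `rows`, `row_cache` and `db_queries` are made up for the example.

```ruby
# Stand-in for the MySQL table; hypothetical data for illustration.
rows = { 1 => { title: "first listing" }, 2 => { title: "second listing" } }
db_queries = 0

# Stand-in for the cache store: a Hash whose default block runs only on a
# miss, loads the row from the "database", and stores it for next time.
row_cache = Hash.new do |store, id|
  db_queries += 1
  store[id] = rows[id]
end

lookups = 10
lookups.times { row_cache[1] } # one miss, then nine hits

hit_rate = (lookups - db_queries).to_f / lookups

# After a write, expire the cached row so the next read reloads it.
rows[1] = { title: "first listing (edited)" }
row_cache.delete(1)
fresh_title = row_cache[1][:title]
```

The part that matters in a real app is the expiry step: every write path has to delete (or overwrite) the cached row, otherwise readers keep seeing stale data.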

Cheers,

Marc

What NoSQL option would you recommend?

At this point I really don’t see any reason to be thinking about multiple databases. For one thing, if you are using shared hosting, you won’t be able to effectively implement it anyway. After all, the purpose of multiple databases is to spread the load over several machines. Keep in mind, a single DB can still serve millions of pages a day.

What you should probably be thinking about more is effective use of caching. If you can, use page caching; it is orders of magnitude faster than the alternatives, because a cached page is served as a static file without touching the application at all.
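As a rough sketch of why page caching is so cheap: the first request renders the page and writes it to disk, and every later request just reads the file. Rails' `caches_page` works along these lines, writing the output under the public directory so the web server can serve it directly. The helper name `cached_page` and the paths below are hypothetical, and the code is plain Ruby rather than Rails.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: returns the cached HTML for +path+, calling the block
# to render it (and writing it to disk) only if no cached file exists yet.
def cached_page(cache_dir, path)
  file = File.join(cache_dir, "#{path}.html")
  unless File.exist?(file)
    FileUtils.mkdir_p(File.dirname(file))
    File.write(file, yield)
  end
  File.read(file)
end

renders = 0
last_body = nil

Dir.mktmpdir do |dir|
  3.times do
    last_body = cached_page(dir, "listings/index") do
      renders += 1 # the expensive work: queries, templates, etc.
      "<h1>Listings</h1>"
    end
  end
end
```

Here the render block runs once for three requests. The trade-off is that the cached file has to be deleted whenever the underlying data changes, so this suits pages that are the same for every visitor.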

For NoSQL integration, MongoMapper is a very mature interface to MongoDB. I have experimented a bit with this and really like it. That said, ActiveRecord+MySQL isn’t as slow as people make it out to be.

Hope this helps,

-Jer

One other thing. Passenger is now the “Best Practice” way to deploy Rails apps, but that is only because deployment is easier in some respects. If you have strong system administration skills, lighttpd/nginx plus Mongrel/Thin is just as easy to configure. I’m pretty sure you can use Passenger with nginx also, which may be a passable alternative to Apache.

Also, you were asking about the number of Mongrels; this really depends on how powerful your server is. If you have 8 cores then you probably want more mongrels than on a 4 core machine. You also need to take into account how much RAM you have available to you.

Well, if you like ActiveRecord and SimpleDB (aws), I recommend http://github.com/appoxy/simple_record

However, and I don’t mean this to inflame anyone, ActiveRecord may fundamentally oppose NoSQL - while ActiveRecord makes joins easy (for the programmer, not so much for the DB), those joins don’t scale well. [To other people’s points, a single MySQL instance can take you a very long way with minimal joins and good row-level caching.] The NoSQL way is to get (minimal) results from a single table at a time, then, if necessary, do other queries to get connected data. In other words, instead of sending joined queries to the DB layer, you unwrap them (efficiently) in your code.
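A small sketch of that "unwrap the join in your code" pattern, using plain Ruby arrays as stand-ins for two tables. The data and the method names (`friend_ids_for`, `users_by_ids`, `friends_of`) are hypothetical; in a real app each helper would be one simple single-table lookup against the data store.

```ruby
# Lookup 1: ids only, from the "friendships" table.
def friend_ids_for(friendships, user_id)
  friendships.select { |f| f[:user_id] == user_id }.map { |f| f[:friend_id] }
end

# Lookup 2: fetch the connected rows by id (the equivalent of WHERE id IN (...)).
def users_by_ids(users, ids)
  users.select { |u| ids.include?(u[:id]) }
end

# The "join" happens here, in application code, not in the database.
def friends_of(users, friendships, user_id)
  users_by_ids(users, friend_ids_for(friendships, user_id))
end

users = [
  { id: 1, name: "alice" },
  { id: 2, name: "bob" },
  { id: 3, name: "carol" }
]
friendships = [
  { user_id: 1, friend_id: 2 },
  { user_id: 1, friend_id: 3 },
  { user_id: 2, friend_id: 1 }
]

friend_names = friends_of(users, friendships, 1).map { |u| u[:name] }
```

Each lookup touches one table and returns a small result, which is also what makes the rows easy to cache individually.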

marc

Thanks,

I was actually just thinking this myself: that my focus should be on
caching more than on multiple DBs. I will be on shared hosting at
first but, as mentioned, I most likely plan to move to dedicated
if/when people sign up.

So even with Passenger taking care of some load balancing it's still
required (or better) to run multiple mongrels?

A Mongrel is a single Ruby process that handles one request at a time. So if each request takes on average 100ms to process, a single Mongrel can only process 10 requests a second. If you have 4 Mongrels you can then process 40 requests a second, assuming you don’t hit some other bottleneck.
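The same arithmetic as a tiny Ruby helper, in case you want to plug in your own numbers. The method name is made up for the example, and it gives an upper bound only: it assumes every worker is always busy and nothing else (database, disk, network) is the bottleneck.

```ruby
# Upper bound on throughput when each worker handles one request at a time.
def max_requests_per_second(workers, avg_request_ms)
  workers * (1000.0 / avg_request_ms)
end

one_mongrel   = max_requests_per_second(1, 100) # 10.0
four_mongrels = max_requests_per_second(4, 100) # 40.0
```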

I am not familiar enough with Passenger to say how it deals with load balancing.

Life may be much simpler for you when you scale if you use a nosql option,

What NoSQL option would you recommend?

Hi, this really depends on your overall requirements. Personally, I have mostly used MagLev in my own development.

Good luck,

-Conrad