Scaling/optimizing a slow Ruby on Rails application

Hi everyone,

My name is Michael Solovyov and I'm the co-founder & CTO of a Rails-based start-up. We've built a product for other businesses that helps their users adopt their products.

We've built the application, delivered it to a few customers, and now we've hit scaling and performance issues.

I'd love your feedback on how we should proceed as we try to resolve them:

1) The problems we've encountered:

The main user dashboard takes between 7-15 seconds to load, and the maximum throughput we can support is 100-150 requests per second.

2) Why we think they're happening:

We're using CouchDB with an ORM that models our data relationally. This causes the ORM to issue an extraordinary number of HTTP requests to CouchDB to load all our data; the more data and users in a project, the longer the delay. There's also potentially inefficient code, extra loops, etc., and we're not using caching to its fullest extent.

3) What we're using/what we've done:

Our architecture: Rails 3.2.12, Ruby 1.9.3p374, nginx + Unicorn (main server), Elasticsearch (separate server), CouchDB (separate server). All on AWS.

We've added some view-level caching where we can get away with stale data.
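The pattern is essentially `Rails.cache.fetch` with an expiry; here's a plain-Ruby sketch of the shape (the class and names are illustrative, not our real code):

```ruby
# Minimal read-through cache with a TTL, mimicking the shape of
# Rails.cache.fetch(key, expires_in: ttl) { expensive_work }.
class TtlCache
  Entry = Struct.new(:value, :expires_at)

  def initialize
    @store = {}
  end

  def fetch(key, expires_in:)
    entry = @store[key]
    return entry.value if entry && Time.now < entry.expires_at

    value = yield                                  # recompute on miss/expiry
    @store[key] = Entry.new(value, Time.now + expires_in)
    value
  end
end

cache = TtlCache.new
calls = 0
2.times { cache.fetch(:dashboard, expires_in: 60) { calls += 1; "html" } }
# The second fetch is served from the cache, so the expensive block ran once.
```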

We've tried higher-tier/SSD AWS servers, but hardware doesn't look to be our main bottleneck.

4) How we plan to proceed:

We've mostly finished a rewrite that migrates our data and data model to ActiveRecord, though we've hit a snag migrating the S3 attachment code built into the CouchDB ORM we're using. Once the migration to SQL/ActiveRecord is complete, we're going to do the following in an attempt to increase performance:

A. Use Percona MySQL Server.

B. Run tools such as bullet to analyze performance and find bad queries/lookups.

C. Update to the latest Rails and Ruby versions.

D. Look into Puma and parallelization of DB access calls.

E. Break the application apart into separate services.

5) What we don't know right now:

Whether switching to ActiveRecord, or any of the above ideas, will inherently give us any performance benefit.

Whether we will have other unexpected surprises or loss of functionality due to switching to ActiveRecord.

How long this will all take, and whether there are other areas we're missing that we need to focus on to get sub-1-second performance.

Do you all have any feedback/ideas/insights from completing projects like these?

In particular, I would be curious about:

a) How would you approach it?

b) Anything that's worked for you in the past on similar projects?

c) Anything to avoid doing?

d) Where to find good additional team members & advisors to help us complete the project?

e) How long should we expect it to take?

Here are our application rake stats: http://pastie.org/private/qrxkytk4uur9odndydv5dq

We've built the application, delivered to a few customers and now we've hit scaling and performance issues.

2) Why we think they're happening:

We're using CouchDB with an ORM that has our data modeled relationally. This causes the ORM to issue an extraordinary number of HTTP requests to CouchDB to load all our data; the more data and users in the project, the longer the delay. There's also potentially inefficient code, extra loops, etc. We're not using caching to its fullest extent.

First off, storing relational data in CouchDB doesn't sound right - it's not really what it is designed for. I used CouchDB on a project once, many moons ago, and things got a lot easier once we stopped trying to use it as a relational datastore.

It seems to me that your first step is to establish a performance baseline: write some automatable performance tests (I don't mean tests in the sense of RSpec - just something that exercises your website in a reasonably realistic way). Use tools such as stackprof to find out where the time is really going. Tools such as New Relic or rack-mini-profiler can also provide insight into what real users are seeing.
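A minimal baseline script along those lines could be as simple as the following (the URL and request count are placeholders - point it at a staging copy of the dashboard):

```ruby
require "net/http"
require "benchmark"

# Hypothetical target; substitute the real dashboard URL.
DASHBOARD_URL = URI("http://localhost:3000/dashboard")

# Nearest-rank percentile over a pre-sorted list of timings.
def percentile(sorted_samples, pct)
  sorted_samples[((sorted_samples.length - 1) * pct).round]
end

# Time a number of sequential requests and report median / 95th percentile,
# so later optimisations can be compared against a known baseline.
def measure(url, runs: 20)
  times = Array.new(runs) do
    Benchmark.realtime { Net::HTTP.get_response(url) }
  end.sort
  { p50: percentile(times, 0.50), p95: percentile(times, 0.95) }
end

# stats = measure(DASHBOARD_URL)
# puts "p50: #{stats[:p50]}s  p95: #{stats[:p95]}s"
```

Re-running the same script after each change gives you an apples-to-apples number instead of a gut feeling.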

3) What we're using/what we've done:

Our architecture: Rails 3.2.12, Ruby 1.9.3p374, nginx + Unicorn (main server), Elasticsearch (separate server), CouchDB (separate server). All on AWS.

In my experience Elasticsearch is blazing fast, scales well horizontally, and doesn't constrain you in quite the way CouchDB does (or at least did - CouchDB has almost certainly changed since I last used it). You can also use it as a straight-up document-oriented datastore - I remember reading about some early Elasticsearch adopters replacing CouchDB with it.

We've added some view-level caching where we can get away with stale data.

We've tried higher-tier/SSD AWS servers, but it doesn't look to be our main bottleneck.

4) How we plan to proceed:

We've mostly finished a rewrite to migrate our data and data model to ActiveRecord. We've hit a snag in migrating S3 attachment code built into the CouchDB ORM that we're using. Following the completion of the migration to SQL/ActiveRecord we are going to do the following in an attempt to increase performance:

A. Use Percona MySQL Server.

It may be a bit late, but I would seriously consider Postgres. Its richer set of data types (arrays, JSON, etc.) might ease the transition, and it has a pretty awesome feature set.
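For instance, the arbitrary key/value documents that are painful in a strict relational schema map naturally onto a Postgres json column; the round trip is just serialization (column and model names here would be up to you):

```ruby
require "json"

# With a Postgres json/jsonb column, a free-form per-customer settings hash
# can be persisted as-is rather than exploded into extra tables.
settings = { "theme" => "dark", "flags" => ["beta", "tours"] }

stored   = JSON.generate(settings)  # what would go into the json column
restored = JSON.parse(stored)       # what the model would hand back
```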

B. Run tools such as bullet to analyze performance and where we might have bad queries/lookups.

C. Update to latest Rails and Ruby versions.

Never hurts - also consider GC tuning parameters.
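A quick plain-Ruby way to see whether GC frequency is even worth tuning (the env vars named in the comment are the usual knobs; exact names vary by Ruby version):

```ruby
# Count how many GC runs a chunk of work triggers. If the number is high,
# tuning variables such as RUBY_GC_MALLOC_LIMIT (and, on Ruby 2.x,
# RUBY_GC_HEAP_GROWTH_FACTOR) can reduce collection frequency.
runs_before = GC.stat[:count]
100_000.times { "payload" * 10 }  # allocate lots of short-lived strings
runs_during = GC.stat[:count] - runs_before
puts "GC ran #{runs_during} times during the workload"
```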

D. Look into Puma and parallelization of DB access calls.

E. Break apart the application into separate services.

5) What we don't know right now:

Whether switching to ActiveRecord or any of the above ideas will inherently give us any performance benefits.

Whether we will have any other unexpected surprises or loss of functionality due to switching to ActiveRecord.

How long this all will take, and whether there are other areas we're missing that we need to focus on to get sub-1-second performance.

Do you all have any feedback/ideas/insights from completing projects like these?

In particular, I would be curious about:

a) How would you approach it?

b) Anything that's worked for you in the past on similar projects?

c) Anything to avoid doing?

d) Where to find good additional team members & advisors to help us complete the project?

e) How long should we expect it to take?

I think this is largely unanswerable by someone unfamiliar with your app. If you're forcing relational-style data into a document-oriented datastore like CouchDB, then storing it in a relational DB should make things easier/faster. On the other hand, if you do play to CouchDB's strengths in other ways, that might make the process more difficult (e.g. if some of your documents store arbitrary key/value pairs of data, that can be a pain in the relational world).

Where possible, break the problem down so that you can prove some of these points early (breaking up into services could be a strategy here, if it makes it easier for you to move part of your data across without having to rewrite everything). Writing a prototype that handles some of your hot paths could also provide some key early validation.

I wouldn't obsess too much about tools like bullet - thinking about your data access patterns before this conversion will probably make the biggest difference (although there's nothing wrong with using it to mop things up afterwards). And if you find yourself having to include 27 tables, you should probably rethink your design.
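The access-pattern issue bullet flags is the classic N+1 shape. A toy plain-Ruby sketch (the store and its query counter are stand-ins for the database, not any real API):

```ruby
# Toy illustration of the N+1 pattern bullet detects. Each query stands in
# for one database (or, today, CouchDB HTTP) round trip.
class CommentStore
  attr_reader :queries

  def initialize(comments_by_post)
    @comments_by_post = comments_by_post
    @queries = 0
  end

  # One lookup per post: N posts => N round trips (the N+1 shape).
  def find_comments_for(post_id)
    @queries += 1
    @comments_by_post.fetch(post_id, [])
  end

  # One batched lookup for all posts, as ActiveRecord's `includes` would issue.
  def find_comments_for_all(post_ids)
    @queries += 1
    post_ids.flat_map { |id| @comments_by_post.fetch(id, []) }
  end
end

store = CommentStore.new(1 => ["a"], 2 => ["b", "c"], 3 => [])
posts = [1, 2, 3]

posts.each { |id| store.find_comments_for(id) }  # 3 round trips
n_plus_one = store.queries

store.find_comments_for_all(posts)               # 1 more round trip
batched = store.queries - n_plus_one
```

Whether the per-item lookups are SQL queries or CouchDB HTTP calls, the fix is the same: batch them.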

Fred


3) What we're using/what we've done:

Our architecture: Rails 3.2.12, Ruby 1.9.3p374, nginx + Unicorn (main server), Elasticsearch (separate server), CouchDB (separate server). All on AWS.

One easy way you can improve is to move to a more recent version of Ruby. There's a big improvement between 1.9.3 and 2.0 (and it's already at 2.1.2), so you might want to look into that first, since it's easier than improving the code's performance.

Without knowing the internal workings of your application, it is impossible to give specific answers... but if you are experiencing 15-second delays with only a few users, I would take a hard look at the data storage layer. As F. Cheung mentioned, I don't think you have your data correctly architected with CouchDB. Consider Postgres as your DB; it is robust and has some great features. With only a few users, I doubt you will see much speed improvement from upgrading your Ruby or Rails versions. Your immediate concern should be the data architecture.