Potential concurrency issues with a global hash to reduce DB

Frank Poo wrote:

Hi folks:

I'm building something akin to Google Analytics and currently I'm doing
real time database updates. However this isn't really scaling anymore,
so I need to stop doing synchronous DB operations. In the short term,
I'm trying to reduce DB writes, so I'm thinking about a global hash (say
declared in environment.rb) that is accessible from my controllers and
models that I can write to in lieu of writing to the DB. [...]
1) Does this sound reasonable?

No! Databases should scale as much as you need them to -- that's what
they are designed for. If your DB is not scaling, there is something
wrong, either in your DB setup or in the way the app is using the DB.
Find the problem and fix it rather than ripping out the DB.

Best,

The query is just an UPDATE with a COALESCE keyword in it. All it does
is increment counters:

UPDATE `statistics_api` SET `count_request` = COALESCE(`count_request`,
?) + ? WHERE (`id` = ?)

This one took 8,553 ms according to New Relic. The weird part is that
there is an index on 'id' (as well as a few other columns) on this
table. I just can't figure out why the query's taking so long ...

Marnen Laibow-Koser wrote:

This query is lightning fast when my server is NOT under load. Just dead
slow when I'm seeing a lot of load. You're right, the first placeholder
is probably redundant..

Marnen Laibow-Koser wrote:

Marnen Laibow-Koser wrote:

Frank Poo wrote:

The query is just an UPDATE with a COALESCE keyword in it. All it does
is increment counters:

UPDATE `statistics_api` SET `count_request` = COALESCE(`count_request`,
?) + ? WHERE (`id` = ?)

This one took 8,553 ms according to New Relic. The weird part is that
there is an index on 'id' (as well as a few other columns) on this
table. I just can't figure out why the query's taking so long ...

8.5 seconds? For that?!? You might want to ask someone who knows more
about mySQL configuration than I do, but clearly your DB setup has
problems. If you run that query when your DB is not otherwise being hit
by your app, how long does it take?

Frank Poo wrote:
This query is lightning fast when my server is NOT under load. Just dead
slow when I'm seeing a lot of load. You're right, the first placeholder
is probably redundant..

That would likely indicate that the DB is pretty dern busy fulfilling reads (though someone should analyse that to prove it). When doing this kind of logging it often becomes necessary to have a second database which is optimized for writes to take the continual updates so that your read/write-infrequent data set is left alone. Those log writes should also be fired off as an asyncronous task so that the rest of the page/request is not slowed down waiting for the write to occur.

I agree that yanking the DB in favor of in-memory workload doesn't seem appropriate yet.

Frank Poo wrote:

Greg, what do you recommend to do async writes? I'm a newbie to async
tasks in Rails, but doesn't something like delayed_job ALSO us the
database to keep track of jobs?

It may be possible to use the database's own clustering features to do
this and still let your app treat it as if it were one simple DB. If
not, would the masochism plugin help?

Best,

Which MySQL engine do you use for this table? MyISAM is faster for
read operations, but locks the whole table for an update or an insert
and this might affect the performance a lot. InnoDB engine uses row-
level locking. Note that PostgreSQL does too.

Have you thought about using document DBs like Tokyo Cabinet, MongoDB
or others? They can handle huge loads.

We use InnoDB. I did some more research on this, and it turns out that
my app was literally overloading the db (non-clustered) with UPDATEs.
Since all these updates do is increment counters, I cached the
increments in memory and flush them every once in a while. This
dramatically raised my app's performance, without the need for a
complicated database setup.

Wojciech Piekutowski wrote: