Potential concurrency issues with a global hash to reduce DB writes

Frank Poo wrote:

Hi folks:

I'm building something akin to Google Analytics, and currently I'm doing real-time database updates. However, this isn't really scaling anymore, so I need to stop doing synchronous DB operations. In the short term, I'm trying to reduce DB writes, so I'm thinking about a global hash (say, declared in environment.rb) that is accessible from my controllers and models, and that I can write to in lieu of writing to the DB. [...] 1) Does this sound reasonable?
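
For concreteness, here is a minimal sketch of the kind of global hash Frank is describing (all names here are hypothetical). Note the mutex: a plain Ruby Hash is not safe under concurrent mutation, which is presumably the concurrency issue the subject line alludes to:

  require 'thread'

  # In environment.rb or an initializer (hypothetical names)
  $stat_counters      = Hash.new(0)
  $stat_counters_lock = Mutex.new

  # In a controller or model, instead of an immediate DB write:
  def record_request(api_id)
    $stat_counters_lock.synchronize { $stat_counters[api_id] += 1 }
  end

Bear in mind that such a hash is per-process (each Mongrel or Passenger worker holds its own copy) and its contents are lost on restart.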

Marnen Laibow-Koser wrote:

No! Databases should scale as much as you need them to -- that's what they are designed for. If your DB is not scaling, there is something wrong, either in your DB setup or in the way the app is using the DB. Find the problem and fix it rather than ripping out the DB.

Best,

Frank Poo wrote:

The query is just an UPDATE with a COALESCE keyword in it. All it does is increment counters:

UPDATE `statistics_api` SET `count_request` = COALESCE(`count_request`, ?) + ? WHERE (`id` = ?)

This one took 8,553 ms according to New Relic. The weird part is that there is an index on 'id' (as well as a few other columns) on this table. I just can't figure out why the query's taking so long ...

Marnen Laibow-Koser wrote:

8.5 seconds? For that?!? You might want to ask someone who knows more about MySQL configuration than I do, but clearly your DB setup has problems. If you run that query when your DB is not otherwise being hit by your app, how long does it take?
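
One way to answer that question from a Rails console, timing the raw statement against an otherwise idle database (the id value 1 is just an example):

  require 'benchmark'

  seconds = Benchmark.realtime do
    ActiveRecord::Base.connection.execute(
      "UPDATE `statistics_api` SET `count_request` = COALESCE(`count_request`, 0) + 1 WHERE (`id` = 1)")
  end
  puts "#{(seconds * 1000).round} ms"

Benchmark.realtime returns the elapsed wall-clock time in seconds.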

Frank Poo wrote:

This query is lightning fast when my server is NOT under load. Just dead slow when I'm seeing a lot of load. You're right, the first placeholder is probably redundant.

That would likely indicate that the DB is pretty dern busy fulfilling reads (though someone should analyse that to prove it). When doing this kind of logging, it often becomes necessary to have a second database, optimized for writes, that takes the continual updates, so that your read-heavy, write-infrequent data set is left alone. Those log writes should also be fired off as an asynchronous task so that the rest of the page/request is not slowed down waiting for the write to occur.

I agree that yanking the DB in favor of an in-memory workload doesn't seem appropriate yet.
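
To make "fired off as an asynchronous task" concrete, here is a bare-bones sketch using a background thread. It is only an illustration (a real app would use a proper job queue), and the connection is explicitly checked out because ActiveRecord connections must not be shared across threads:

  Thread.new do
    ActiveRecord::Base.connection_pool.with_connection do |conn|
      # The increment happens off the request thread; api_id is assumed to be in scope
      conn.execute("UPDATE `statistics_api` SET `count_request` = " +
                   "COALESCE(`count_request`, 0) + 1 WHERE (`id` = #{api_id.to_i})")
    end
  end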

Frank Poo wrote:

Greg, what do you recommend for doing async writes? I'm a newbie to async tasks in Rails, but doesn't something like delayed_job ALSO use the database to keep track of jobs?
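
As an aside: yes, delayed_job keeps its queue in a delayed_jobs table, so every enqueue is itself an INSERT. A Redis-backed queue such as Resque keeps job traffic off MySQL entirely. A hypothetical sketch (CountRequestJob and ApiStatistic are made-up names):

  # Worker side: Resque runs self.perform for jobs pulled from the queue
  class CountRequestJob
    @queue = :statistics

    def self.perform(api_id)
      ApiStatistic.increment_counter(:count_request, api_id)
    end
  end

  # Controller side: enqueueing is a fast Redis push, not a DB write
  Resque.enqueue(CountRequestJob, api_id)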

It may be possible to use the database's own clustering features to do this and still let your app treat it as if it were one simple DB. If not, would the masochism plugin help?

Best,

Wojciech Piekutowski wrote:

Which MySQL engine do you use for this table? MyISAM is faster for read operations, but it locks the whole table for an update or an insert, and this might affect performance a lot. The InnoDB engine uses row-level locking. Note that PostgreSQL does too.

Have you thought about using document DBs like Tokyo Cabinet, MongoDB or others? They can handle huge loads.
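
If in doubt, the engine is easy to check from a Rails console (SHOW TABLE STATUS is standard MySQL, and select_one returns the row as a hash):

  status = ActiveRecord::Base.connection.select_one(
    "SHOW TABLE STATUS LIKE 'statistics_api'")
  puts status['Engine']   # => "MyISAM" or "InnoDB"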

Frank Poo wrote:

We use InnoDB. I did some more research on this, and it turns out that my app was literally overloading the DB (non-clustered) with UPDATEs. Since all these updates do is increment counters, I now cache the increments in memory and flush them every once in a while. This dramatically raised my app's performance, without the need for a complicated database setup.
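
A sketch of the buffering approach Frank describes (all names hypothetical): increments accumulate in a mutex-guarded hash, and a background thread flushes them periodically, turning many UPDATEs into at most one per counter per interval:

  require 'thread'

  class CounterBuffer
    def initialize(flush_interval = 30)
      @counts = Hash.new(0)
      @lock   = Mutex.new
      # Background flusher
      Thread.new do
        loop do
          sleep flush_interval
          flush
        end
      end
    end

    # Called from controllers instead of issuing an UPDATE per request
    def increment(id, by = 1)
      @lock.synchronize { @counts[id] += by }
    end

    # Swap the hash under the lock, then write outside it
    def flush
      pending = @lock.synchronize do
        drained, @counts = @counts, Hash.new(0)
        drained
      end
      pending.each do |id, n|
        ActiveRecord::Base.connection_pool.with_connection do |conn|
          conn.execute("UPDATE `statistics_api` SET `count_request` = " +
                       "COALESCE(`count_request`, 0) + #{n.to_i} WHERE (`id` = #{id.to_i})")
        end
      end
    end
  end

The trade-off is that increments buffered since the last flush are lost if the process dies, which is usually an acceptable risk for analytics counters.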
