Using the DB for logging / tracking app statistics

Hi all,

I am building an administrative front-end to a Rails app and the requirements seem awfully write-heavy to me. The client wants to see things like:

* Total logins (per day)
* New accounts created (per day)
* Page views for signup form (per day)
* Last login / Last IP per user

etc...

All of this data should be available through the administrative application over the web - no grepping through log files or running log analyzers.

I have set up Google Analytics for the page view information and such, but I guess I'm going to have to write some logging for the rest.

The immediate solution for this is to create a 'logs' table in the database (MySQL) and have Rails write records for the relevant statistics to the DB. But I'm concerned about slowing the site down too much with all these calls to the DB.

Has anyone got a better idea?

Imho it’s better to think about scale early and often rather than assume there will be time for it later. The biggest scale problem in Rails apps is database writes. Read-only replica DBs can easily be added as needed, but scaling the write operation is much tougher. Better to tackle it up front.

Another concept to incorporate here is that of a Data Warehouse. Generally a data warehouse is a separate database. In this case I would at least try to make this a separate database.

What would scale even better is this: Use syslog over udp. In this case, the mongrels can simply “fire and forget” these messages into a single shared log, which may then be parsed to inject data into the (separate) data warehouse db, which may then be analyzed to thy heart’s content without bogging down the production system.
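
To make the "fire and forget" idea concrete, here is a minimal sketch in plain Ruby. The StatLogger class, the "railsapp" tag, the key=value message layout, and the local0/info priority are all my own assumptions for illustration, not anything from an existing library. It also assumes a syslog daemon is listening for UDP on the given host and port.

```ruby
require "socket"

# Hypothetical helper: sends one small syslog-style datagram per event
# and never waits for a reply, so a request is not held up by a DB write.
class StatLogger
  FACILITY_LOCAL0 = 16 # assumed facility; pick whatever your syslog config expects
  SEVERITY_INFO   = 6

  def initialize(host = "127.0.0.1", port = 514)
    @host = host
    @port = port
    @socket = UDPSocket.new
  end

  # Builds a line such as "<134>railsapp: event=login user_id=42".
  # The "<134>" prefix is the syslog priority: facility * 8 + severity.
  def build_message(event, fields = {})
    priority = FACILITY_LOCAL0 * 8 + SEVERITY_INFO
    pairs = fields.map { |key, value| "#{key}=#{value}" }.join(" ")
    "<#{priority}>railsapp: event=#{event} #{pairs}".strip
  end

  # Returns true if the datagram was handed to the OS, false otherwise.
  # Stats are best-effort here: a failed send must never break a request.
  def emit(event, fields = {})
    @socket.send(build_message(event, fields), 0, @host, @port)
    true
  rescue SystemCallError, SocketError
    false
  end
end
```

From a controller you would then call something like StatLogger.new.emit("login", user_id: 42, ip: "10.0.0.1"). Keep in mind that UDP gives no delivery guarantee, so the odd event can be lost; for per-day totals that is usually an acceptable trade.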

I speak from recent and relevant experience: We currently “log” all searches to the database, and since this is a write operation, it must go to the master, bogging down the entire site. The sad irony is that the search log table is so unwieldy that we can’t even use it (unless/until we give it its own read-only replica DB).

Good Luck!

Marc

And because these logs are separate from the public-facing app, "grepping" the log from your administrative control panel would not negatively affect the app's performance. Well, it wouldn't if you dedicated a mongrel just for administrative use; otherwise you'll have to put a mongrel out of service until these statistics are retrieved.

It's possible that information your client considers critical right now, and is willing to pay to have, won't be such a big deal once the site is humming right along. If you can push the whole thing out of the way of the main app, then you are free to back your data accumulation off to a cron job that executes hourly or less frequently.
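
As a rough sketch of what that cron job could do, the snippet below rolls raw log lines up into per-day counts per event. The line layout (an ISO date at the start of each line and an event=name pair somewhere after it) is an assumption on my part, as are the LINE_PATTERN and daily_counts names; adjust the pattern to whatever your syslog daemon actually writes.

```ruby
# Assumed (hypothetical) line layout:
#   "2008-03-05T14:02:11 web1 railsapp: event=login user_id=42"
# Capture 1 is the calendar date, capture 2 is the event name.
LINE_PATTERN = /\A(\d{4}-\d{2}-\d{2})\S*\s.*\bevent=(\w+)/

# Takes any enumerable of log lines and returns a hash that maps
# [date, event] pairs to the number of matching lines.
def daily_counts(lines)
  counts = Hash.new(0)
  lines.each do |line|
    match = LINE_PATTERN.match(line)
    next unless match # skip anything that is not a stat event
    counts[[match[1], match[2]]] += 1
  end
  counts
end
```

The cron script would feed it something like File.foreach on the shared log and then write one row per [date, event] pair into the separate stats database, so the admin pages only ever read a small summary table.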