Logging API Calls

Hello,

I am building an API and I was wondering if there are any best
practices out there for logging and monitoring the calls. Ideally I'd
like to track:

* What types of calls
* Which user account is being hit
* Which developer/application is making the call
* Status codes (200 ok, 500 internal server error, 401 unauthorized,
etc..)
* Time stamps

The basics really. I know I could dump this to a custom log file, but
is there a better way? I would like to provide each developer with
stats from her application and ideally in realtime.
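For what it's worth, the fields listed above map naturally onto a single table. A minimal sketch in Python with SQLite (the table and column names here are just placeholders, not a recommendation):

```python
import sqlite3

# Hypothetical schema covering the fields listed above.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE api_calls (
        id          INTEGER PRIMARY KEY,
        call_type   TEXT NOT NULL,      -- what type of call
        user_id     TEXT NOT NULL,      -- which user account is being hit
        api_key     TEXT NOT NULL,      -- which developer/application
        status_code INTEGER NOT NULL,   -- 200, 401, 500, ...
        called_at   TEXT NOT NULL       -- ISO-8601 timestamp
    )
""")
# Index on the timestamp so time-windowed queries stay cheap.
conn.execute("CREATE INDEX idx_called_at ON api_calls (called_at)")

conn.execute(
    "INSERT INTO api_calls (call_type, user_id, api_key, status_code, called_at) "
    "VALUES (?, ?, ?, ?, ?)",
    ("profile.read", "user42", "KEY-XYZ", 200, "2009-03-10T17:15:00"),
)
print(conn.execute("SELECT COUNT(*) FROM api_calls").fetchone()[0])  # → 1
```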

Issues:
If I do this via a log file I won't be able to provide realtime stats
If I log to a database, it will grow very quickly

Anyone have any experience or learnings in this area? What are the
performance issues? What are the storage issues? What are any
database issues?

Thanks,
Jason

Jason Amster wrote:

> Hello,

Hi!

> I am building an API and I was wondering if there are any best
> practices out there for logging and monitoring the calls. Ideally I'd
> like to track:
>
> [...]
>
> If I log to a database, it will grow very quickly

Of course any text file would also grow very quickly if you log every single call to an application that gets hit quite a lot. So I'm not exactly sure how this would be an argument for/against writing your log to a database table as opposed to writing to a normal text file.

You might find RRDtool [http://oss.oetiker.ch/rrdtool/] useful for at least some of your needs. You could write all calls to the db and then "move" this information out of there once it is older than x weeks. RRDtool is perfect for aggregating this information into more general stats such as API key XYZ has hit the application 40 times on this day, 380 times on this day, etc.
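A rough sketch of that "aggregate, then purge" idea in Python with SQLite (the table names, keys, and one-week cutoff are all hypothetical; RRDtool does this kind of rollup natively):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE api_calls (api_key TEXT, called_at TEXT);
    CREATE TABLE daily_stats (api_key TEXT, day TEXT, calls INTEGER);
""")
conn.executemany("INSERT INTO api_calls VALUES (?, ?)", [
    ("KEY-XYZ", "2009-03-10T09:00:00"),
    ("KEY-XYZ", "2009-03-10T17:15:00"),
    ("KEY-ABC", "2009-03-10T11:30:00"),
])

cutoff = "2009-03-17"  # e.g. everything older than one week

# Roll the detailed rows up into one count per key per day...
conn.execute("""
    INSERT INTO daily_stats
    SELECT api_key, date(called_at), COUNT(*)
    FROM api_calls
    WHERE called_at < ?
    GROUP BY api_key, date(called_at)
""", (cutoff,))

# ...then drop the detail so the table stays bounded.
conn.execute("DELETE FROM api_calls WHERE called_at < ?", (cutoff,))

print(conn.execute(
    "SELECT calls FROM daily_stats WHERE api_key = 'KEY-XYZ'"
).fetchone()[0])  # → 2
```

The trade-off Niels describes is exactly this DELETE: after it runs you can no longer answer per-call questions for old data, only per-day ones.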

At some point you will either have to condense specific data like "API key XYZ accessed user profile FOO at 5:15pm, the request was processed in 0.24 seconds" into aggregates à la "API key XYZ accessed 50 user profiles on the 10th of March, the calls were processed in 0.12s on average", OR keep a high level of detail and face an ever-growing mountain of logs.

Should you use MySQL, you might also be interested in the ARCHIVE storage engine. This engine compresses data as it is inserted. Of course this means it needs to be decompressed whenever you want to read it, so this is clearly a storage/CPU trade-off. It does not seem to suit your scenario, in which there presumably will be quite a lot of reads.

Cheers,
Niels

> If I log to a database, it will grow very quickly

Really? Why not? You may be right, but it's an option worth exploring
properly before rejecting it. If you want to show your users all logs
received in the last five minutes (for example) a query on a table
indexed by time stamp may be just the ticket...

If size is an issue you can always archive the older stuff
periodically.
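Graham's "last five minutes" query is cheap with an index on the timestamp column. A small sketch (schema and data hypothetical):

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE api_calls (api_key TEXT, called_at TEXT)")
conn.execute("CREATE INDEX idx_called_at ON api_calls (called_at)")

now = datetime(2009, 3, 10, 17, 20, 0)
conn.executemany("INSERT INTO api_calls VALUES (?, ?)", [
    ("KEY-XYZ", "2009-03-10T17:18:00"),  # 2 minutes ago -> included
    ("KEY-XYZ", "2009-03-10T16:00:00"),  # over an hour ago -> excluded
])

# ISO-8601 strings compare correctly as text, so the index can be used.
cutoff = (now - timedelta(minutes=5)).isoformat()
recent = conn.execute(
    "SELECT COUNT(*) FROM api_calls WHERE called_at >= ?", (cutoff,)
).fetchone()[0]
print(recent)  # → 1
```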

Graham

Thanks Niels,

With the log files, I can always rotate them and generate reports.
But this gives me the hassle of having to generate the reports daily,
weekly, monthly, or however granularly I want them. The benefit is
that I can compress and archive older log files. I suppose I can take
data older than x number of time units out of the DB and dump it to a
log file as well, which would keep my DB at a somewhat fixed size
based on a time window.
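That "dump old rows to a compressed file, keep the DB windowed" plan could be sketched like this in Python (file name, schema, and cutoff are placeholders):

```python
import csv, gzip, io, sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE api_calls (api_key TEXT, called_at TEXT)")
conn.executemany("INSERT INTO api_calls VALUES (?, ?)", [
    ("KEY-XYZ", "2009-02-01T12:00:00"),  # old -> archived
    ("KEY-XYZ", "2009-03-10T17:15:00"),  # recent -> kept
])

cutoff = "2009-03-01"
old_rows = conn.execute(
    "SELECT api_key, called_at FROM api_calls WHERE called_at < ?",
    (cutoff,),
).fetchall()

# Write the expired rows out to a compressed CSV archive...
buf = io.StringIO()
csv.writer(buf).writerows(old_rows)
with gzip.open("api_calls_archive.csv.gz", "at") as f:
    f.write(buf.getvalue())

# ...and delete them so the table stays at a roughly fixed size.
conn.execute("DELETE FROM api_calls WHERE called_at < ?", (cutoff,))
print(conn.execute("SELECT COUNT(*) FROM api_calls").fetchone()[0])  # → 1
```

Run from cron (or similar) on whatever schedule matches the retention window.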

I'll look into RRDtool. Seems like it might be a good solution.

Thanks for the suggestion!

Hey Graham,

I'd like to give each developer a dashboard with useful stats on how
their app is performing. I think parsing text files would be a bit
cumbersome for a realtime approach, though it wouldn't be so
problematic done in daily batches, which of course means no realtime.

I think the solution is to use a DB and then archive it after a
time.

Thanks for the suggestion!

Jason