Absolutely! We’re looking at OpenCensus (https://opencensus.io) integration, which seems to be leapfrogging OpenTracing in standardization and adoption.
So now there are two standards?
Is there clarity on where things are going? The point of a standard would be that we’d only need to support one, and not have an extra layer of abstraction.
Current Ruby integration, including early Rack and Rails support: https://github.com/census-instrumentation/opencensus-ruby
Datadog exporter: https://github.com/DataDog/opencensus-go-exporter-datadog
Stackdriver exporter: https://github.com/census-ecosystem/opencensus-ruby-exporter-stackdriver
Zipkin/Jaeger exporter: https://github.com/census-ecosystem/opencensus-ruby-exporter-zipkin
Or use a local collector/relay: https://github.com/census-instrumentation/opencensus-service
Compared to OpenTracing, is the ecosystem mature enough to warrant us going all-in on this? I definitely see the theoretical point of a unified stats and trace standard, especially seeing as StatsD has fragmented somewhat, but is it a horse we want to bet on? I’m fine with either, as long as there are working, scalable solutions today that cover a variety of languages without duct tape. For instance, it seems like the Datadog exporter only supports Go?
Production use is fantastic, but I’d particularly love to see a collector and built-in visualization for local app development and tests.
Me too, but that’s probably not going to be my initial focus; I’m mainly looking at the instrumentation side of things.
We have an existing ActiveSupport::Notifications API which works much like typical parent-span instrumentation, but it doesn’t propagate or report trace context.
For deeper Rails integration, we could adapt the AS::N design to more directly map to OpenCensus, or introduce ActiveSupport::Tracing if there’s too much mismatch or compatibility concern.
I’ve done a bunch of work on AS::N in the past (I’m the one who created the original ActiveSupport::Subscriber base class) and feel pretty confident that it can form the basis for this work. We’d probably want to instrument more places, e.g. each middleware invocation and maybe filters in controllers, but otherwise it’s a good starting point.
I do think we need to have a specific mapping from AS::N to the tracing backend, selecting which payload keys should be propagated and maybe formatting some of the values, so it’s probably not just a matter of copying everything verbatim. It sounds like you’re doing the verbatim thing at Basecamp though – how is that working out? Would you be in favor of that?
We’ve seen issues when tracing gets too granular or too much data is captured, so I’d like to be a bit conservative.
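To make the "specific mapping" idea concrete, here is a rough sketch in plain Ruby of an allowlist that picks which payload keys become span attributes and how each value is formatted. Everything in it is hypothetical: `SPAN_ATTRIBUTES`, `MAX_VALUE_LENGTH`, and `span_attributes_for` are names made up for illustration and are not part of Rails or OpenCensus, and the event names and keys are only meant to resemble typical AS::N payloads.

```ruby
# Hypothetical cap on attribute size, to stay conservative about captured data.
MAX_VALUE_LENGTH = 64

# Hypothetical allowlist: event name => payload keys worth propagating,
# each paired with a formatter. Keys not listed here are never exported.
SPAN_ATTRIBUTES = {
  "sql.active_record" => {
    name: ->(v) { v.to_s },
    sql:  ->(v) { v.to_s[0, MAX_VALUE_LENGTH] } # truncate long statements
  },
  "process_action.action_controller" => {
    controller: ->(v) { v.to_s },
    action:     ->(v) { v.to_s },
    status:     ->(v) { v.to_i }
  }
}.freeze

# Returns only the allowlisted, formatted attributes for an event.
# Unknown events yield an empty hash rather than the raw payload.
def span_attributes_for(event_name, payload)
  mapping = SPAN_ATTRIBUTES.fetch(event_name, {})
  mapping.each_with_object({}) do |(key, formatter), attrs|
    next unless payload.key?(key)
    attrs[key.to_s] = formatter.call(payload[key])
  end
end

payload = { name: "User Load", sql: "SELECT * FROM users WHERE id = 1", binds: [1] }
attrs = span_attributes_for("sql.active_record", payload)
# attrs contains "name" and "sql" only; :binds is dropped because it is not allowlisted.
```

The trade-off versus copying payloads verbatim is that new events export nothing until someone adds a mapping, which is the conservative default I would lean toward.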
That’d allow these libraries to plug in directly without needing to carefully instrument Rails on their own. Rails should be able to participate in distributed tracing out of the box, report stats out of the box, show traces and stats in development mode, and flip between production APM vendors without specialized integration.
Yup, that’s my goal as well. APM vendors should not compete on their quality of instrumentation, but on the quality of their product. One thing I want to emphasize though is that I think we need to push for standardization beyond Rails. AS::N could have been a great standard if it wasn’t tied to AS – it hasn’t seen widespread adoption because gem authors are unwilling to add a dependency on AS, I think. So we should think holistically about the entire ecosystem and what would make sense for Ruby as a whole.
At Basecamp, we have a home-grown StatsD setup, similar to Datadog, that hooks Active Support notifications (https://signalvnoise.com/posts/3091-pssst-your-rails-application-has-a-secret-to-tell-you). We also parse logs from Kafka to reconstruct some traces. We’d love to extract this and rely on Rails to natively export traces and stats.
I’d love to hear what you’re doing at Zendesk, where you’re headed, and whether this sketch aligns well. And anyone else who’s working in this area!
We’re currently all-in on Datadog, and I’ve helped improve their instrumentation. However, I keep running into ad-hoc instrumentation being brittle, which is why I’m interested in first-class support. I think the only sustainable path forward is that gems natively support some form of tracing, either through AS::N (which would need to be extracted) or directly with a standardized tracing gem.
How would you feel about extracting AS::N, actually? Then gems could adopt it for pub/sub and it would be a lot simpler to plug in a tracing subscriber.
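As a rough illustration of what "plug in a tracing subscriber" could look like with a standalone pub/sub layer, here is a minimal sketch in plain Ruby. It is not the AS::N API: `Notifier` and `TracingSubscriber` are invented names, the sketch is not thread-safe, and it assumes subscribers are registered before instrumentation starts.

```ruby
# Hypothetical stand-in for an extracted notifications gem.
module Notifier
  @subscribers = []

  class << self
    # Register any object responding to #start(name, payload)
    # and #finish(name, payload).
    def subscribe(subscriber)
      @subscribers << subscriber
      subscriber
    end

    def unsubscribe(subscriber)
      @subscribers.delete(subscriber)
    end

    # Wrap a block so every subscriber sees its start and finish,
    # even if the block raises.
    def instrument(name, payload = {})
      @subscribers.each { |s| s.start(name, payload) }
      yield payload
    ensure
      @subscribers.reverse_each { |s| s.finish(name, payload) }
    end
  end
end

# A subscriber that turns nested events into parent/child spans.
class TracingSubscriber
  Span = Struct.new(:name, :parent, :started_at, :finished_at)

  attr_reader :finished_spans

  def initialize
    @stack = []           # currently open spans, innermost last
    @finished_spans = []  # completed spans, in finish order
  end

  def start(name, _payload)
    now = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    @stack.push(Span.new(name, @stack.last&.name, now))
  end

  def finish(_name, _payload)
    span = @stack.pop
    span.finished_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    @finished_spans << span
  end
end

tracer = Notifier.subscribe(TracingSubscriber.new)
Notifier.instrument("request.rack") do
  Notifier.instrument("query.db") { :rows }
end
tracer.finished_spans.map { |s| [s.name, s.parent] }
# => [["query.db", "request.rack"], ["request.rack", nil]]
```

A gem would only need to call something like `instrument`; whether the events end up as spans, stats, or log lines would be the subscriber's business.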