Optimizing ActiveSupport with native code

I've had this idea kicking around in my head for a while, and had some time yesterday to start playing around with it: ActiveSupport is heavily used in both the Rails library code and in the application layer of a typical Rails stack. Certain parts of AS would be much more efficiently implemented in C rather than Ruby; that optimization could potentially have some noticeable, positive effects on a Rails app's performance. So, why not write a library that swaps out the appropriate ActiveSupport methods with native C implementations?

So I'm looking for feedback from the group on that idea. Some questions that come to mind:

* Is this a new idea? I did some googling around and didn't find anything, but I don't want to reinvent the wheel here.
* Is this a bad idea? Of course AS itself wouldn't want to restrict itself to a particular Ruby interpreter, but I don't see any harm in an add-on library that optimizes it for MRI. Am I missing anything?
* How widely applicable is it? The ActiveSupport::Inflector singleton provides some *very* low-hanging fruit - unscientific benchmarking suggests 10x speed improvements in #underscore, #camelize, etc. But native implementations are probably only useful for methods that perform non-trivial work that has no direct relation to the Ruby space - string manipulation and arithmetic being the obvious candidates. Is it worth trying to provide a comprehensive suite of ActiveSupport native implementations?
* How about using existing bindings to C libraries? So far I've focused on simply reimplementing individual AS methods using pure C, but, for instance, it might be worth reimplementing ActiveSupport's XML support using Nokogiri, etc.

If folks think this is a worthwhile idea, I'd love to get as many people as possible involved. But for now, I'd appreciate any feedback y'all have.

The tinkering I did yesterday is here. It's basically just a single C file with several native implementations of AS methods, and a single Ruby file which benchmarks the methods against their pure-Ruby equivalents and also checks that the output of the corresponding methods is the same across a set of inputs:
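For anyone reading without the link handy, a benchmark-and-verify harness of that kind might look roughly like the sketch below. This is not the code from the prototype. Both `underscore_regex` and `underscore_scan` are hypothetical names: the first is a regex-based underscore written in the spirit of the pure-Ruby inflector, and the second is a character-by-character version standing in for the method a C extension would provide.

```ruby
require 'benchmark'

# Hypothetical stand-in for the pure-Ruby inflector method (regex-based).
def underscore_regex(word)
  word.gsub('::', '/')
      .gsub(/([A-Z]+)([A-Z][a-z])/, '\1_\2')
      .gsub(/([a-z\d])([A-Z])/, '\1_\2')
      .tr('-', '_')
      .downcase
end

# Hypothetical stand-in for the native method: a single pass over the
# characters, which is roughly what a C implementation would do.
def underscore_scan(word)
  chars = word.gsub('::', '/').chars
  out = +''
  chars.each_with_index do |c, i|
    if i > 0 && c.match?(/[A-Z]/)
      prev = chars[i - 1]
      nxt  = chars[i + 1]
      # Word boundary: lower/digit before an upper, or the last upper
      # of a run that is followed by a lowercase letter.
      boundary = prev.match?(/[a-z\d]/) ||
                 (prev.match?(/[A-Z]/) && !nxt.nil? && nxt.match?(/[a-z]/))
      out << '_' if boundary
    end
    out << (c == '-' ? '_' : c.downcase)
  end
  out
end

INPUTS = %w[ActiveRecord HTMLParser Admin::UserController
            already_underscored dashed-Word].freeze

# Returns the inputs on which the two implementations disagree.
def mismatches(inputs)
  inputs.reject { |s| underscore_regex(s) == underscore_scan(s) }
end

# Times both implementations over the same inputs.
def run_benchmark(inputs, iterations = 10_000)
  Benchmark.bm(8) do |x|
    x.report('regex:') { iterations.times { inputs.each { |s| underscore_regex(s) } } }
    x.report('scan:')  { iterations.times { inputs.each { |s| underscore_scan(s) } } }
  end
end

if __FILE__ == $PROGRAM_NAME
  bad = mismatches(INPUTS)
  abort "outputs differ for: #{bad.inspect}" unless bad.empty?
  run_benchmark(INPUTS)
end
```

Checking equivalence before timing matters here: a faster method that disagrees with the original on even one input is not a drop-in replacement.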

Looking forward to thoughts, comments, criticism, stinging insults, etc. if you've got 'em.

Mat

This looks great, but have you considered the complexities of i18n on a native implementation?

That's a great point - I hadn't thought of that specifically, but even just looking at the Inflections module, it became clear that the approach has its limits. For instance, the #pluralize and #singularize methods (which are called internally by several other inflectors) basically iterate over a collection of regular expressions, which are user-definable, looking for one that matches the input string. That's pretty inefficient, but it's not immediately clear to me how to do it better while maintaining the flexibility needed for I18n (and customization generally), or how a native code implementation would improve the situation. So I guess the idea would be to focus on methods whose behavior is not locale-dependent first, to get the biggest bang for the coding buck.
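To make the shape of the problem concrete, here is a minimal sketch of that rule-iteration pattern. It is not ActiveSupport's actual code: `RuleInflector` and its three sample rules are made up for illustration, and the sketch assumes that rules added later are checked first so that user-defined rules can override the defaults.

```ruby
# Hypothetical miniature of a regex-rule-driven pluralizer.
class RuleInflector
  def initialize
    @plurals = []
  end

  # Register a [pattern, replacement] rule. Newer rules go to the front,
  # so they are tried before older, more general ones (an assumption of
  # this sketch).
  def plural(pattern, replacement)
    @plurals.unshift([pattern, replacement])
    self
  end

  # Linear scan: apply the first rule whose pattern matches the word.
  def pluralize(word)
    @plurals.each do |pattern, replacement|
      return word.sub(pattern, replacement) if word.match?(pattern)
    end
    word
  end
end

inflector = RuleInflector.new
inflector.plural(/$/, 's')                       # most general rule
inflector.plural(/(x|ch|ss|sh)$/i, '\1es')
inflector.plural(/([^aeiouy])y$/i, '\1ies')
inflector.plural(/^(ox)$/i, '\1en')              # a user-defined override

puts inflector.pluralize('post')   # posts
puts inflector.pluralize('box')    # boxes
puts inflector.pluralize('city')   # cities
puts inflector.pluralize('ox')     # oxen
```

Because the rule table is ordinary Ruby data that any application can extend at runtime, a native version would still have to walk Ruby regular expressions one at a time, which is why the gain from moving this loop into C is unclear.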

That said, it does occur to me that Inflector#ordinalize should be locale-aware - as I recall the implementation in the ActiveSupport::Inflector module is not, but I'm assuming it's overridden somewhere by the I18n module. I'd say there are various potential approaches to this problem - avoiding locale-dependent methods as mentioned above, but also potentially providing locale-specific C implementations, which could be selected among in the Ruby layer, with a fallback to the pure-Ruby implementation. Just thinking out loud here, but I don't think it has to be a deal-breaker.
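As a rough illustration of the "select in the Ruby layer, fall back to pure Ruby" idea, the sketch below keeps a registry of per-locale implementations. Everything in it is hypothetical: `OrdinalDispatch`, `register`, and the toy `:fr` rule are invented for the example, and in a real extension the registered callables would be C-backed methods rather than Ruby blocks.

```ruby
# Hypothetical dispatcher: locale-specific implementations with a
# pure-Ruby fallback.
module OrdinalDispatch
  IMPLEMENTATIONS = {}

  # Register an implementation for a locale. In a native extension this
  # would point at a C-backed method; here it is just a block.
  def self.register(locale, &impl)
    IMPLEMENTATIONS[locale.to_sym] = impl
  end

  # Pure-Ruby fallback using English rules.
  def self.ordinalize_ruby(number)
    n = number.to_i.abs
    suffix =
      if (11..13).cover?(n % 100)
        'th'
      else
        { 1 => 'st', 2 => 'nd', 3 => 'rd' }.fetch(n % 10, 'th')
      end
    "#{number}#{suffix}"
  end

  # Use the locale's registered implementation if there is one,
  # otherwise fall back to the pure-Ruby version.
  def self.ordinalize(number, locale: :en)
    impl = IMPLEMENTATIONS.fetch(locale.to_sym) { method(:ordinalize_ruby) }
    impl.call(number)
  end
end

# Toy French rule, standing in for a locale-specific native implementation.
OrdinalDispatch.register(:fr) { |n| n.to_i == 1 ? "#{n}er" : "#{n}e" }

puts OrdinalDispatch.ordinalize(23)               # 23rd
puts OrdinalDispatch.ordinalize(1, locale: :fr)   # 1er
puts OrdinalDispatch.ordinalize(2, locale: :de)   # 2nd (no :de entry, falls back)
```

The fallback keeps behavior unchanged for any locale that has no native implementation, so coverage can grow one locale at a time.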

Thanks for the feedback!

Mat

Steve Ross wrote:

Well, lots. Look at Mat's prototype implementation and the ramifications will become immediately apparent.

--steve