Ruby Summer of Code Project

(Sending this to talk and core, I apologize if you get two copies of this email).

Hi,

As part of my Ruby SOC proposals, I'm thinking of working on the CI idea posted at http://wiki.rubyonrails.org/rubysoc/2010/ideas

Ideally, I'd like to do a two-step proposal: the first step would be writing a custom CI tool tailored specifically to Rails, where we can see more relevant information. The second would be providing a way for users to push their results upstream so we get more data about different environments.

I'd like to know what you guys think of this, and what features you think this system needs in order to make it worth the time and money.

Thanks!

PS: I understand it's kinda late to send this (especially with the list's moderation options), but most of my time was consumed by another proposal related to ActiveRecord and DataObjects integration.

Hey Federico,

Have you tried the performance test suite built in to Rails apps?

You write performance tests in the style of integration tests, then run `rake test:benchmark` or `rake test:profile`.
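
A generated performance test looks roughly like this (this is the Rails 2.3 form; on Rails 3 the superclass is `ActionDispatch::PerformanceTest` and the helper require changes to `rails/performance_test_help`):

```ruby
# Roughly what the performance test generator produces
# in test/performance/browsing_test.rb (Rails 2.3 style).
require 'test_helper'
require 'performance_test_help'

class BrowsingTest < ActionController::PerformanceTest
  # Each test method is benchmarked/profiled like an integration test request.
  def test_homepage
    get '/'
  end
end
```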

The benchmark task benches each performance test and saves the results in a CSV file with info about your environment, the git revision, etc.

So you can do simple CI with a git commit hook or a cron job, then check the CSV for regressions and improvements.
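
The regression check itself can stay tiny. A toy sketch, assuming the benchmarker appends one row per run to CSVs under tmp/performance with a `measurement` column (adjust the path and column name to whatever your benchmark task actually writes):

```ruby
#!/usr/bin/env ruby
# Toy regression check: compares the last two runs recorded in each
# benchmark CSV. The tmp/performance path and the "measurement" column
# name are assumptions - adjust to whatever your benchmark task writes.
require 'csv'

THRESHOLD = 1.10 # flag anything more than 10% slower than the previous run

Dir.glob('tmp/performance/*.csv').each do |file|
  samples = CSV.read(file, :headers => true).map { |row| row['measurement'].to_f }
  next if samples.size < 2

  previous, latest = samples[-2], samples[-1]
  next if previous.zero?

  if latest > previous * THRESHOLD
    puts "possible regression in #{File.basename(file)}: #{previous} -> #{latest}"
  end
end
```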

I think improving the tools here would make a great project, as would sharing results from open-source apps tested on different platforms and hardware.

Good luck! jeremy

Hi Jeremy,

Yes, good points. I've already talked with Federico offline. I believe his original interest was in improving the rails test suite itself, and allowing collaboration and submission of results by users.

I think it is mostly orthogonal to the problem I'm tackling - which is to have a performant, stable, easily reproducible "production" CI environment which runs the main build script (http://github.com/rails/rails/blob/master/ci/ci_build.rb) against the major interpreters.
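
For illustration only (this isn't the actual railsci code), looping the build script over interpreters with RVM could look roughly like the sketch below; the interpreter list and the `bash -lc` / `rvm use` invocation are assumptions:

```ruby
#!/usr/bin/env ruby
# Hypothetical driver, not the real railsci setup: run the main build
# script once per interpreter through RVM and report which ones passed.
INTERPRETERS = %w[1.8.7 ree 1.9.1 jruby]

results = INTERPRETERS.map do |ruby|
  ok = system("bash -lc 'rvm use #{ruby} && ruby ci/ci_build.rb'")
  [ruby, ok]
end

results.each { |ruby, ok| puts "#{ruby}: #{ok ? 'PASS' : 'FAIL'}" }
exit(results.all? { |_, ok| ok } ? 0 : 1)
```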

I don't want to make the "production" CI environment any more complicated than necessary - and adding the variable of user-submitted input from random environments would definitely complicate things. It's hard enough to get most people to care about the failing CI at all - we need to keep it simple, stable, and reproducible. That's why I'm taking the approach of a Rails CI AMI which anyone can run on EC2, and the generation of that AMI will also be completely automated. That hopefully eliminates any possibility of "non-reproducible" build failures, which are the bane of any CI system. I've said "reproducible" four times, so you get my point :)

However, as long as any improvements/innovations in this area (rake tasks, additional/alternative test/benchmark scripts to ci_build.rb, etc.) conform to basic conventions - stdout/stderr, generated csv/html flat file artifacts, and proper return codes - there's no reason we can't integrate them into the main CI environment (or any other CI environment) in the future. Even if they aren't integrated, the information gathered can drive improvements to the main test suite or build script. That said, I'd prefer to avoid calling this work "CI" or lumping it into the "CI" effort, for the reasons mentioned above, and also because it would be more "user driven" than "continuous". In fact, I've already updated the wiki to say "Crowdsourced Rails environment compatibility metrics" instead of continuous integration.
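
To make "basic conventions" concrete, here's a purely hypothetical wrapper for an alternative benchmark script - the artifact directory and file glob are made up; the point is only stdout/stderr output, flat-file artifacts, and a meaningful exit code:

```ruby
#!/usr/bin/env ruby
# Hypothetical wrapper following the conventions above: print to
# stdout/stderr, leave flat-file artifacts behind, and exit with a
# meaningful return code so any CI environment can drive it.
require 'fileutils'

artifact_dir = ENV['CI_ARTIFACT_DIR'] || 'ci/artifacts' # assumed location
FileUtils.mkdir_p(artifact_dir)

puts 'running benchmarks...'
ok = system('rake test:benchmark') # or any alternative script

# Copy whatever CSV/HTML the run produced so a CI server can archive it.
Dir.glob('tmp/performance/*.{csv,html}').each do |artifact|
  FileUtils.cp(artifact, artifact_dir)
end

warn 'benchmark run failed' unless ok
exit(ok ? 0 : 1)
```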

Thanks, -- Chad

Continuous performance integration is crowdsourced metrics with crowd size 1.

All this stuff is reusable.

jeremy

As Chad mentioned, we've been exchanging emails outside the list and it's been helpful for grounding my proposal. Some of my ideas may fall outside the current scope of Chad's work on a CI platform that builds a pristine copy of Rails every time, but I think we can definitely collaborate on some of this to make sure it's usable both for the rails-core guys and for the outsider who just wants to see what's wrong with X revision on his machine.

My idea is a bit more biased toward what Jeremy initially noted, but I'll keep working on it today/tomorrow and send a copy to the ML looking for input once it's done.

Thanks for the comments, guys.

Hey Federico,

I was talking with Jeremy Kemper about benchmarks and he suggested trying to benchmark an existing open source app. For example, port Redmine or Gemcutter to Rails 3 (Jeremy already ported Redmine, http://github.com/jeremy/redmine/commits/rails3, but it's from November) and benchmark one of them (perhaps Gemcutter is simpler) on Rails 2.3 and Rails 3.0, then build a comparison. What do you think?

You can find me on IRC, I'm spastorino; if you want, we can exchange more ideas and thoughts about this.

Best, Santiago.

Santiago,

Now, that’s a really good idea.

Anuj @andhapp

Below you'll find my proposal; please contact me if you have any questions or suggestions:

My proposal is to improve the current developer tools available in Rails to:

1. Deliver more information to the user about his current Rails dev setup.
2. Collect these results to create reports on the different versions and environments of Rails, and upload them to a central server containing all this information. Think of "isitruby1.9" but for the Rails project (not for individual Rails apps).

In the past, several people have started projects to keep CC.rb instances of Rails running and keep developers notified of changes to the project, but as far as I know there is no "canonical" Rails CI providing this information.

Chad Woolley has been working on an EC2-backed tool to provide this service to the community (http://github.com/thewoolleyman/railsci), but his current scope stops at providing a tool we can install on a pristine server to see what the current build status of Rails is. Although this is immensely helpful by itself, I think we can gather more information than just the test results to see "how's Rails today".

My idea is to implement two tools that provide this information:

1. Improve the current Rails tools to get information about current builds on developers' machines. This would include test results and performance information (running times, memory usage, GC info, etc.). Right now there are a couple of rake tasks that provide this information to the user when run against his own applications (rake test:profile, rake test:benchmark), but as far as I know there are no tools to do this specifically on the Rails code base. We'd have to write them, basing our work on the existing tools.
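
As a sketch of what such a task might look like (the task name, output file and recorded fields are all assumptions, not an existing Rails task), something along these lines would time the suite and record the environment next to the result:

```ruby
# Hypothetical lib/tasks/metrics.rake sketch; task name, output file
# and recorded fields are assumptions, not an existing Rails task.
require 'benchmark'
require 'csv'
require 'fileutils'

namespace :metrics do
  desc 'Time the test suite and append the result plus environment info to a CSV'
  task :collect do
    elapsed  = Benchmark.realtime { system('rake test') }
    revision = `git rev-parse HEAD`.strip

    FileUtils.mkdir_p('tmp')
    CSV.open('tmp/rails_metrics.csv', 'a') do |csv|
      csv << [Time.now.utc, revision, RUBY_VERSION, RUBY_PLATFORM, elapsed]
    end
  end
end
```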

A simple patch to correct typos in the documentation or a refactoring of a couple of methods might not need these tools, but certain things like testing different database connectors on different Ruby implementations (REE vs. JRuby vs. MRI) could greatly benefit from having access to this information. On one hand, developers can find the slow points in the system and try to fix them; on the other, they'll be able to say "X implementation is probably better suited to do Y or Z tasks". When a user reports that Ruby 1.8.7 on OS X is running too slow, we could just check the results on our machines and see whether it's a global issue or a local problem.

2. Once we have this information on the developers' machines, we could push it to a centralized repository. We could then see how Rails behaves on different platforms and under different environments. This is especially useful when making changes to the low-level stuff (the first example I can think of is playing with the send_file method), since these kinds of changes are really dependent on the platform/web server/OS used.
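
A first cut of that upload step could be nothing more than an HTTP POST; the endpoint, parameter names and file path below are entirely hypothetical:

```ruby
#!/usr/bin/env ruby
# Hypothetical uploader: POST the locally collected metrics to a central
# server. The URL, parameter names and file path are all made up.
require 'net/http'
require 'uri'

uri  = URI.parse('http://rails-metrics.example.org/results') # hypothetical endpoint
data = File.read('tmp/rails_metrics.csv')

response = Net::HTTP.post_form(uri,
                               'ruby'     => RUBY_VERSION,
                               'platform' => RUBY_PLATFORM,
                               'results'  => data)
puts "upload finished with status #{response.code}"
```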

With this online tool we could see how a change affected Rails performance in general, and as an added bonus we would also have access to the actual test results. This is also good since we can trivially see how a code change behaves in different environments:

Say that Arel decides to move the database connector code to a DataObjects backend. I write this code, test it on my local machine and it works just fine, but I don't have access to a Windows system to try these tests on. I could just send a link to the ML asking someone to please run a rake task or two, and it would automatically test everything, compile the results and upload them to the website. There I would be able to see how everything went, and if there was a problem I'd have instant access to that information.

I currently don't know how far Chad plans to take his project, but from what I understand his idea is to set up several EC2 images with different OSes to try the code on all the supported platforms, helped by RVM. If it works like this, we'll be able to integrate these results into a "master" copy of the results (stuff we got from "official" servers running clean versions of Rails) so we have a baseline to compare against. If he's not planning to support all the OS combinations, we would still be able to get results on all the platforms (provided there are enough users on those systems willing to run these tests).

I'd love to hear any feedback from you and your ideas on the project!

This all sounds great.

One clarification - I don't have multiple OSes in my current scope for CI, just multiple interpreters. All my CI setup scripts are based around Ubuntu AMIs. However, we could definitely run your rake tasks in the Ubuntu environment on multiple interpreters - especially if they are smart enough to use RVM automatically to install the desired interpreter.

The AMIs that I am building should be useful for your effort too, since they will have RVM, Chef, and all the dependencies the Rails build needs. Also, even though I'm only focusing on Ubuntu for now to keep things simple, my scripts are pure Bash with the goal of just getting to the point of having RVM and Chef installed, at which point the Chef scripts can take over. Theoretically, they should be transferable to any platform/distro which supports Bash, RVM, and Chef by just changing the package manager calls and directory locations.

This all sounds like a great idea. Like I said, I'm real busy, but let me know if you can't find a sponsor.

-- Chad