Ruby Summer of Code Project

(Sending this to talk and core, I apologize if you get two copies of
this email).

Hi,

As part of my Ruby SOC proposals I'm thinking of working on the CI
idea posted at http://wiki.rubyonrails.org/rubysoc/2010/ideas

Ideally, I'd like to do a two-step proposal: the first step would be
writing a custom CI tool tailored specifically for Rails where we can
see more relevant information. The second would be providing a way
for users to push their results upstream so we get more data about
different environments.

I'd like to know what you guys think of this, and what features you
think this system needs in order to make it worth the time and money
spent.

Thanks!

PS: I understand it's kinda late to send this (especially with the
list's moderation options) but most of my time was consumed by
another proposal related to ActiveRecord and DataObjects integration.

Hey Federico,

Have you tried the performance test suite built in to Rails apps?

You write performance tests in the style of integration tests then
`rake test:benchmark` or `rake test:profile`.
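
For example, a minimal performance test looks roughly like this (the
superclass and the require line differ slightly between Rails 2.3 and 3.0):

```ruby
# test/performance/browsing_test.rb -- a Rails 3-style performance test;
# Rails 2.3 uses ActionController::PerformanceTest and
# require 'performance_test_help' instead.
require 'test_helper'
require 'rails/performance_test_help'

class BrowsingTest < ActionDispatch::PerformanceTest
  def test_homepage
    get '/'
  end
end
```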

The benchmark task benches each performance test and saves the results
in a CSV file with info about your environment, the git revision, etc.

So you can do simple CI with a git commit hook or a cron job, then
check the CSV for regressions and improvements.
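
Checking the CSV can be as simple as comparing the last two rows of one
of the generated files. This is just a rough sketch; the file name and
column layout will vary with your test and Rails version:

```ruby
# Rough sketch: flag a possible regression between the last two runs of a
# single metric. Path and column order are examples only.
require 'csv'

rows = CSV.read("tmp/performance/BrowsingTest#test_homepage_wall_time.csv")
header, *runs = rows
latest, previous = runs[-1], runs[-2]

delta = latest[0].to_f - previous[0].to_f
puts "wall time change since previous run: %.4fs" % delta
warn "possible regression" if delta > 0.05   # arbitrary threshold
```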

I think improving the tools here would make a great project, as would
sharing results from open-source apps tested on different platforms
and hardware.

Good luck!
jeremy

Hi Jeremy,

Yes, good points. I've already talked with Federico offline. I
believe his original interest was in improving the Rails test suite
itself, and allowing collaboration and submission of results by users.

I think it is mostly orthogonal to the problem I'm tackling - which is
to have a performant, stable, easily reproducible "production" CI
environment which runs the main build script
(http://github.com/rails/rails/blob/master/ci/ci_build.rb) against the
major interpreters.

I don't want to make the "production" CI environment any more
complicated than necessary - and adding the variable of user-submitted
input from random environments would definitely complicate things.
It's hard enough to get most people to care about the failing CI at
all - we need to keep it simple, stable, and reproducible. That's why
I'm taking the approach of a Rails CI AMI which anyone can run on EC2,
and the generation of that AMI will also be completely automated.
That hopefully eliminates any possibility of "non-reproducible" build
failures, which are the bane of any CI system. I've said
"reproducible" four times, so you get my point :slight_smile:

However, as long as any improvements/innovations in this area (rake
tasks, additional/alternative test/benchmark scripts to ci_build.rb,
etc) conform to basic conventions - stdout/stderr, generated csv/html
flat file artifacts, and proper return codes - there's no reason we
can't integrate it into the main CI environment (or any other CI
environment) in the future. Even if it isn't integrated, the
information gathered can drive improvements to the main test suite or
build script. However, I'd prefer to avoid calling this work "CI" or
lumping it into the "CI" effort, for the reasons mentioned above, and
also because it would be more "user driven" than "continuous". In
fact, I've already updated the wiki to say "Crowdsourced Rails
environment compatibility metrics" instead of continuous integration.
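
To make those conventions concrete, here's a rough sketch (the script
structure, paths, and artifacts directory are all just assumptions) of a
wrapper that would fit them:

```ruby
#!/usr/bin/env ruby
# Hypothetical sketch of a script following the conventions above:
# log to stdout/stderr, leave a flat-file CSV artifact behind, and exit
# with a proper return code. Names and paths are placeholders.
require 'csv'
require 'fileutils'

puts "running benchmarks..."                      # progress on stdout
ok = system("rake test:benchmark")                # reuse the existing task

FileUtils.mkdir_p("artifacts")
CSV.open("artifacts/results.csv", "w") do |csv|   # flat-file artifact
  csv << ["ruby", "platform", "passed"]
  csv << [RUBY_VERSION, RUBY_PLATFORM, ok]
end

warn "benchmark run failed" unless ok             # errors on stderr
exit(ok ? 0 : 1)                                  # proper return code for CI
```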

Thanks,
-- Chad

Continuous performance integration is crowdsourced metrics with crowd size 1.

All this stuff is reusable.

jeremy

As Chad mentioned, we've been exchanging emails outside the list and
it's been helpful for grounding my proposal. Some of my ideas may fall
outside the current scope of Chad's work on a CI platform that builds a
pristine copy of Rails every time, but I think we can definitely
collaborate on some of this to make sure it's all usable both for the
rails-core guys and for the outsider who just wants to see what's
wrong with revision X on his machine.

My idea is a bit more biased toward what Jeremy initially noted, but
I'll keep working on it today/tomorrow and send a copy to the ML
looking for input once it's done.

Thanks for the comments, guys.

Hey Federico,

I was talking with Jeremy Kemper about benchmarks and he pointed me
toward benchmarking an existing open source app. For example, port
Redmine or Gemcutter to Rails 3 (Jeremy already ported Redmine,
http://github.com/jeremy/redmine/commits/rails3, but it's from
November), benchmark one of them (perhaps Gemcutter is simpler) with
Rails 2.3 and Rails 3.0, and build a comparison. What do you think?

You can find me on IRC, I'm spastorino, if you want to exchange more
ideas and thoughts about this.

Best,
Santiago.

Santiago,

Now, that’s a really good idea.

Anuj
@andhapp

Below you'll find my proposal; please contact me if you have any
questions or suggestions:

My proposal is to improve the current developer tools available in Rails to:

1. Deliver more information to the user about his current Rails dev setup.
2. Collect these results to create reports on the different versions and
environments of Rails, and upload them to a central server containing all
this information. Think of "isitruby1.9" but for the Rails project (not
for individual Rails apps).

In the past several people have started projects to keep CC.rb
instances of Rails running to keep developers notified of changes to
the project, but as far as I know there is no "canonical" Rails CI
providing this information.

Chad Woolley has been working on an EC2-backed tool to provide this
service to the community (http://github.com/thewoolleyman/railsci),
but his current scope goes up to providing a tool we can install on a
pristine server to see what the current build status of Rails is.
Although this is immensely helpful by itself, I think we can gather
more information than just the test results to see "How's Rails today".

My idea is to implement two tools that provide this information:

1. Improve the current Rails tools to get information about how the
current version builds on developers' machines. This would include test
results and performance information (running times, memory usage,
GC info, etc.). Right now there are a couple of rake tasks that provide
this information to the user when he runs them on his own applications
(rake test:profile, rake test:benchmark), but as far as I know there
are no tools to do this specifically on the Rails code base. We'd have
to write them, basing our work on the existing tools.
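
To make point 1 a bit more concrete, here is a purely hypothetical
sketch of such a rake task; the task name, the benchmarked target, and
the CSV location are all placeholders:

```ruby
# Hypothetical sketch only: a rake task that times part of the Rails test
# suite and appends the timing plus environment info to a CSV file.
require 'benchmark'

namespace :rails do
  desc "Benchmark a slice of the Rails test suite (hypothetical)"
  task :benchmark do
    elapsed = Benchmark.realtime do
      system("ruby -Itest activerecord/test/cases/base_test.rb") # placeholder
    end

    File.open("tmp/rails_benchmark.csv", "a") do |f|
      f.puts [Time.now.utc, RUBY_VERSION, RUBY_PLATFORM, "%.2f" % elapsed].join(",")
    end
  end
end
```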

A simple patch to correct typos in the documentation or a refactoring
of a couple of methods might not need these tools, but certain things,
like testing different database connectors on different Ruby
implementations (REE vs. JRuby vs. MRI), could greatly benefit from
having access to this information. On the one hand developers can find
the slow points in the system and try to fix them, and on the other
they'll be able to say "X implementation is probably better suited to
do Y or Z tasks". When a user files a bug report about Ruby 1.8.7 on
OS X running too slowly, we could just check the results on our
machines and see whether it's a global issue or a local problem.

2. Once we have this information on developer machines, we can push it
to a centralized repository. We could then see how Rails behaves on
different platforms and under different environments. This is
especially useful when making changes to low-level stuff (the first
example I can think of is playing with the send_file method), since
those kinds of changes are really dependent on the platform/web
server/OS used.
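
A minimal sketch of what pushing those results upstream could look
like; the endpoint URL and field names are entirely made up for
illustration:

```ruby
# Hypothetical sketch: upload a locally generated benchmark CSV to a central
# collection service. Host, path, and parameter names are placeholders.
require 'net/http'
require 'uri'

results = File.read("tmp/rails_benchmark.csv")
uri = URI.parse("http://rails-metrics.example.org/results")   # made-up endpoint

response = Net::HTTP.post_form(uri, "ruby"     => RUBY_VERSION,
                                    "platform" => RUBY_PLATFORM,
                                    "csv"      => results)
puts "upload status: #{response.code}"
```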

With this online tool we could see how a change affected Rails
performance in general, and as an added bonus we would also have access
to the actual test results. This is also good since we can trivially
see how a code change behaves in different environments:

    Say that Arel decides to move the database connector code to a
    DataObjects backend. I write this code, test it on my local machine
    and it works just fine, but I don't have access to a Windows system
    to try these tests on. I could just send a link to the ML asking
    someone to please run a rake task or two, and it would automatically
    test everything, compile the results, and upload them to the website.
    There I'd be able to see how everything went, and if there was a
    problem I'd have instant access to that information.

I currently don't know how far Chad plans to take his project, but from
what I understand his idea is to set up several EC2 images with
different OSes to try the code under all the supported platforms, with
the help of RVM. If it works like this, we'll be able to integrate
these results into a "master" copy of the results (stuff we got from
"official" servers running clean versions of Rails) so we have a
baseline to compare against. If he's not planning to support all the
OS combinations, we would still be able to get results on all the
platforms (provided there are enough users on those systems willing to
run these tests).

I'd love to hear any feedback from you and your ideas on the project!

This all sounds great.

One clarification - I don't have multiple OS's in my current scope for
CI, just multiple interpreters. All my CI setup scripts are based
around Ubuntu AMIs. However, we could definitely run your rake tasks
in the Ubuntu environment on multiple interpreters - especially if
they are smart enough to run RVM automatically to install the desired
interpreter.
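
For instance, a crude way to drive that could look like the following;
the interpreter names are only examples, and it assumes RVM is
available in a login shell:

```ruby
# Rough sketch: run the existing CI build script under several interpreters
# installed via RVM. Interpreter names and the shell invocation are examples.
interpreters = %w(1.8.7 1.9.2 ree jruby)

results = interpreters.map do |ruby|
  ok = system("bash -lc 'rvm use #{ruby} && ruby ci/ci_build.rb'")
  [ruby, ok]
end

results.each { |ruby, ok| puts "#{ruby}: #{ok ? 'passed' : 'FAILED'}" }
```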

The AMIs that I am building should be useful for your effort too,
since they will have RVM, Chef, and all the dependencies the rails
build needs. Also, even though I'm only focusing on Ubuntu for now to
keep things simple, my scripts are pure Bash with the goal of just
getting to the point of having RVM and Chef installed, at which point
the Chef scripts can take over. Theoretically, they should be
transferable to any platform/distro which supports Bash, RVM, and
Chef by just changing the package manager calls and directory
locations.


This all sounds like a great idea. Like I said, I'm real busy, but
let me know if you can't find a sponsor.

-- Chad