Rails 1.2.0 RC2 has 4125 duplicate lines

I have been evaluating the excellent and super fast Similarity Analyser by Red Hill Consulting to generate duplicated-lines reports from the Rails 1.2.0 RC2 source code (excluding the tests).

As of the 10th of January 2007, Rails 1.2.0 RC2 has 4125 duplicate lines in 793 blocks in 231 files:

    * actionmailer has 584 duplicate lines in 107 blocks in 20 files
    * actionpack has 718 duplicate lines in 154 blocks in 58 files
    * actionwebservice has 241 duplicate lines in 51 blocks in 21 files
    * activerecord has 1529 duplicate lines in 301 blocks in 45 files
    * activesupport has 418 duplicate lines in 78 blocks in 44 files
    * railties has 635 duplicate lines in 102 blocks in 43 files

Detailed reports on 21 croissants' Blog: Rails 1.2.1 duplicate lines Simian Report

The debate is now open about what to do with these reports!

Jean-Michel

Whatever happens, I wouldn't expect it to happen before the release of 1.2. Submit a bug report with the full report and target it to 2.0. Also, a post to rails-core about this would probably be a good idea, though you might want to wait until 1.2 is out.

- Rob

Looking at the bottom lines of the reports, it appears that Simian regards Rails as being about 90% DRY. Not at all bad.

Now if I were in the core team, I would not appreciate having a load of statistics dumped into Trac and called a defect. Output from tools like Simian requires intelligent interpretation.

I suggest that Jean-Michel should prioritise the reported duplications by the expected saving if the duplicated code were to be refactored out - roughly (N-1)*(M-1) where N is the number of duplicated lines and M is the number of occurrences. The N-1 is because a call to the extracted method will still be required at each place where the code is duplicated, and the M-1 is because M occurrences will be reduced to 1.

Then, in priority order, he should look at each candidate for refactoring and try to come up with an intention-revealing name for the extracted method.
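The ordering suggested above could be sketched roughly as follows. This is only an illustration: the `Duplication` struct, its field names and the sample figures are made up here and are not taken from the actual Simian report.

```ruby
# Hypothetical record for one reported duplication:
#   lines       = N, the number of duplicated lines in the block
#   occurrences = M, the number of places the block appears
Duplication = Struct.new(:name, :lines, :occurrences) do
  # Rough expected saving, as suggested in the post: (N - 1) * (M - 1).
  def expected_saving
    (lines - 1) * (occurrences - 1)
  end
end

# Invented sample data, for illustration only.
candidates = [
  Duplication.new("block A", 4, 2),
  Duplication.new("block B", 12, 3),
  Duplication.new("block C", 6, 5)
]

# Largest expected saving first.
prioritised = candidates.sort_by { |d| -d.expected_saving }

prioritised.each do |d|
  puts "#{d.name}: N=#{d.lines}, M=#{d.occurrences}, saving ~#{d.expected_saving} lines"
end
```

With these invented figures a 4-line block duplicated twice saves about 3 lines, so it lands at the bottom of the list, which is roughly the kind of candidate one would look at last.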

The core team might look kindly on patches which:

- reduce code size

- increase understandability of the code

- don't break any tests.

Just my tuppence worth :-)

   Justin Forder

But if you actually go to the "duplicated" source lines, you find many of the 4-line duplicates to be

    1: end
    2:
    3: def something
    4:   @something

in different classes or subclasses. My response to "what to do with these reports" is to ignore them. Unless the threshold is raised to >= 6 (instead of 4), or the tool is sophisticated enough to discount lines that are blank or contain only "end" (or "begin" or "rescue" or "else", etc.), the report is too bloated to be very helpful. It reminds me of debates about whether counting LOC (lines of code) in C should include lines having only "{" or "}".
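To make the idea concrete, here is a minimal sketch of that kind of filter, applied as a post-processing step to blocks already reported as duplicates. It is not Simian's own interface; the constant and method names are invented for illustration, and the keyword list covers only the four words named above.

```ruby
# Matches a line that is blank or holds only a lone structural keyword.
TRIVIAL_LINE = /\A\s*(end|begin|rescue|else)?\s*\z/

# Number of lines in a block that are neither blank nor a lone keyword.
def significant_line_count(block_lines)
  block_lines.count { |line| line !~ TRIVIAL_LINE }
end

# Keep a reported duplicate only if it has enough significant lines.
# The default threshold of 6 follows the suggestion in the post.
def worth_reporting?(block_lines, threshold = 6)
  significant_line_count(block_lines) >= threshold
end

# The 4-line "duplicate" quoted in the post.
noise = ["end", "", "def something", "  @something"]

puts significant_line_count(noise)
puts worth_reporting?(noise)
```

Under this rule the 4-line example above counts as only 2 significant lines, so it would be dropped from the report.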

-Rob

Rob Biedenharn http://agileconsultingllc.com Rob@AgileConsultingLLC.com