Hi all,
The microsecond handling in ActiveRecord::ConnectionAdapters::Column#fast_string_to_time and ActiveRecord::ConnectionAdapters::Column#microseconds fail for some values.
In slightly more than 1% of all possible 6-digit cases, writing a timestamp to a database column and then reading it back in results in a different value being returned to the program.
So, for instance, saving the timestamp
2010-01-12 12:34:56.125014
and then loading it again from the database yields
2010-01-12 12:34:56.125013
The problem occurs when the value read is converted from string form to a Ruby timestamp, so it is largely database independent (the exception being drivers that override the methods, or databases that don't support timestamps at all).
The underlying problem is the use of to_i to convert from floats to ints inside the affected methods. As you know, to_i simply truncates the result and in some cases this causes rounding errors introduced by inherent inaccuracies in the multiplication operations and decimal representation to bubble up and affect the least significant digit.
Here's a simple test that illustrates the problem:
converted = ActiveRecord::ConnectionAdapters::Column.send('fast_string_to_time', "2010-01-12 12:34:56.125014") assert_equal 125014, converted.usec
This test case (and a similar one for #microseconds) fail on plain vanilla Rails 2.3.5.
I guess the best solution would be to change the ISO_DATETIME regex used to extract the microsecond-part from timestamps to not include the decimal point at all and then avoid the to_f and subsequent floating point multiplication completely inside the failing methods. However, these regexes are defined as constants on ActiveRecord::ConnectionAdapters::Column::Format and therefore publicly available, so the impact of changing these is difficult to ascertain.
A simpler solution is to use round() instead of to_i to convert from the intermediate floating point result to int. This works (I have verified that the precision is sufficient for all possible 6-digit cases) but is about 15% slower than the current method. A small price to pay for correctness, in my opinion.
I have attached a tiny patch (against 2.3.5) that switches the code to using round() and a test case that verifies that the method works for a few problematic cases that fail without the patch.
I have also created a Lighthouse ticket #3693:
Could some of you please take a look and see if the patch is acceptable and maybe carry it into the code base?
Cheers Jacob
fix_microsecond_conversion.diff (2.93 KB)