ActiveRecord::Calculations#sum returning a string..

Is there any reason that the value passed to type_cast_using_column during an ActiveRecord::Calculation#sum defaults to the string '0' when there are no rows?

https://github.com/rails/rails/blob/master/activerecord/lib/active_record/relation/calculations.rb#L307

Theoretically, '0' will get type cast to the column type, but if the type cannot be inferred from the table information, then you get '0' back, which seems odd to me.

Surely it should be 0 rather than '0' passed to type_cast_using_column, no?

Tekin Suleyman

Just bumped into this. It’s on line 323 now:

https://github.com/rails/rails/blob/master/activerecord/lib/active_record/relation/calculations.rb#L323

I tried to go a way back like beginning of 2010 and it was still there. Not sure why a sum() would return a String, true. I currently converting it to Float (my case) with …sum(:column_name).to_f

I remember a long time ago I had a patch for average that would trust what the DB returned, in my case integers.

https://github.com/rails/rails/commit/95d5d9b6c48c08f1fba0c77ecbc97b62b2603824

It would be nice if calculations could always rely on the raw_connection to return data types converted to ruby primitives correctly. Not sure if that ‘0’ should be a 0 or not, maybe be helpful to see what fails if changed?

  • Ken

That ‘0’ likely just predates the raw_connection returning rich data types. I believe sql server and oracle have done so for ages, but it’s relatively new for the other adapters.

In preparation for 4.0 perhaps we could do an audit of the various database drivers and make sure they return the correct types. After that we could remove the custom typecasting we do?

That '0' likely just predates the raw_connection returning rich data types. I believe sql server and oracle have done so for ages, but it's relatively new for the other adapters.

Yea, our Ruby ODBC does not return proper types, but our TinyTDS which has been out for a year or more now does. Just like MySQL2 which it was modeled after.

In preparation for 4.0 perhaps we could do an audit of the various database drivers and make sure they return the correct types. After that we could remove the custom typecasting we do?

I would be for that!

- Ken

It depends on the level of API people want to use. For example, PG always returns strings. That means we need to know the types for which we're querying. Say someone does this against pg:

  connection.execute 'select 1.2 + 20'

we can't know to cast that to a float. Even doing regexp tests will be wrong since someone may have tried to select a string.

We can do automatic casting, but it's going to break down at some point depending on the level of abstraction and the database used.

Actually, you can if you use the mysql, pg, postgres, and postgres-pr drivers (and most other drivers that return strings for everything). When you get the results of the query, you can query the metadata of the results to get the types of the columns. Sequel uses this information to typecast such results correctly, and there is no reason ActiveRecord couldn't do the same:

  irb(main):001:0> DB['SELECT 1'].single_value   => 1   irb(main):002:0> DB["SELECT '1'"].single_value   => "1"   irb(main):003:0> DB["SELECT 1.2"].single_value   => #<BigDecimal:20e8334b0,'0.12E1',18(18)>   irb(main):004:0> DB["SELECT 1.2::real"].single_value   => 1.2

Jeremy

It depends on the level of API people want to use. For example, PG

always returns strings. That means we need to know the types for which

we’re querying. Say someone does this against pg:

connection.execute ‘select 1.2 + 20’

we can’t know to cast that to a float. Even doing regexp tests will be

wrong since someone may have tried to select a string.

Most database protocols include type information along with the values themselves, the drivers should be able to do introspection and ‘magically’ cast them completely removing our casting from the picture.

We can do automatic casting, but it’s going to break down at some point

depending on the level of abstraction and the database used.

For a 4.x release though I think we can expect the database drivers to step up and handle that conversion natively. Unless there are any supported databases whose network protocol doesn’t include enough information for the driver to do it itself.

I'm not sure if the pg "over the wire" protocol contains that information, but I can tell you that libpg doesn't provide functions for casting to C types (that I can find).

See PQgetvalue:

  PostgreSQL: Documentation: 6.4: Query Execution Functions

PQftype returns a type id integer that represents the database type. The ruby postgres drivers (pg, postgres, postgres-pr) expose this information via the type or ftype method on the PGresult class. The common database types have fixed type id integers you can statically map to ruby classes (16 -> boolean, 700 and 701 -> float, etc.).

You may be correct that libpq lacks a function that casts to C types. But it's fairly simple to cast the C string directly to the desired ruby type using a mapping based on the type id integer.

Jeremy

I monkey patched the PG adapter to type cast calculated values returned from views. I think this solves the problem in this thread.

https://gist.github.com/1624535

If this looks like a solution to the original poster I can expand it to map more datatypes and submit a pull request.

Cheers,

Anthony Richardson

Yes, please. I will merge it. Only two suggestions:

1) Assign the OIDs to constants 2) Break out the giant case / when to a method

Thanks!