Unsigned integers for MySQL patch

Hi all,

I posted this patch the other day, now it's time to drum up some support for it :smiley: http://rails.lighthouseapp.com/projects/8994/tickets/300-unsigned-integers-for-mysql

To summarise, it does the following:

* changes the primary key data type for the MySQL adapter to be "int(11) unsigned"

* changes the references/belongs_to 'sexy' migration methods to generate unsigned foreign key fields by default when using MySQL (other adapters remain unchanged)

* adds an :unsigned option (defaults to false) to table creation and modification migrations

* updates the SchemaDumper to understand unsigned integer fields

If this sounds good to you, please take a look!

Hi,

Why do you need unsigned integers and what happens if the values go outside the range of signed integers, will rails fall back to BigDecimal?

One of the main advantages would be that it allows you to optimise your database structure, for example, if you want to store a value up to 4294967295 in an integer field you can either do it with a signed bigint (which needs 8 bytes of storage) or in an unsigned integer (which only needs 4 bytes). While a 4 byte saving may not sound huge, it can quickly add up and can also have a knock-on effect for indices on those fields.

Of course this applies to other integer types in MySQL too: for example a 2 byte signed smallint vs a 1 byte unsigned tinyint to store values up to 255, etc.

Rails will use a Bignum class for values above 2147483647, it'll continue to use Fixnum for smaller unsigned attributes.

As a practical example: I originally started work on this because I wanted to store IPv4 addresses in my database, and to do that I really needed an unsigned 32 bit integer.

Hope this helps...

Yes, I agree that for IPv4 it is indeed useful. But I do have a few more questions.

1. If unsigned was made default and a project that was built before this started using the newer migration code that would mean that part of their database uses unsigned and part uses signed integers.

2. Does mysql support changing the type of a column (including a primary key column) from signed to unsigned? Would rails' change_column be able to do it?

3. I think mysql does not support FK's between different integer types, so wouldn't making unsigned the default cause problems for all new FK fields (and their constraints) that refer to existing tables (with signed int primary keys)?

Apart from the possibility of storing IPv4 addresses the usefulness of this seems quite limited, most applications probably do not have tables that touch the limit of 32bit signed integers and even if they did a two-fold increase in the number of keys would not be a good long-term solution.

So basically I'm concerned about the fact that making this the default seems to not give great benefit yet unless I'm wrong about the above things could cause issues if done so for an existing project. So perhaps the default should be configurable and at least for older applications should not be changed?

Thank you,

1. That would be true only for the automatically created primary key fields. For all other integer fields the default for unsigned is false - so nothing changes.

2. Yes and yes. For example, if you move to a version of Rails with unsigned support, you could write a migration that does things like this:

change_column :suppliers, :category_id, :integer, :unsigned => true

or

change_table :products do |t|   t.change :id, :integer, :unsigned => true end

3. That's correct, you can't have a signed foreign key referencing an unsigned primary key (or vice-versa). In this situation you'll have two choices:

- keep your signed primary key, and pass :unsigned => false when creating your new foreign key - update your existing keys to make use of unsigned integers

For example, if we already have a suppliers table with a signed primary key, and are now creating a products table that will reference it, we can either create a signed foreign key:

create_table :products do |t|   # various columns go here   t.references :supplier, :unsigned => false end

Or we can change the existing key to be unsigned, like this:

change_table :suppliers do |t|   t.change :id, :integer, :unsigned => true end

create_table :products do |t|

  # No need to be signed!   t.references :supplier

end

So to summarise:

- signed integers are still the default - *except* for auto-generated primary keys, and foreign keys created using the 'sexy' references/belongs_to syntax - unsigned foreign keys can be created by passing :unsigned => false to references/belongs_to - existing columns (including primary keys) can be changed to/from unsigned using change_column

While IPv4 addresses were my inspiration for the patch (as well as seeing it come up as a requirement on various Rails forums from time to time) there are probably other situations where having the option to go unsigned will be helpful. Hopefully we'll hear from some people who have some other, more interesting, examples :slight_smile:

Thanks for the feedback.

Hello,

I am personally in favor of the unsigned option. I think it is a
better default for primary keys, since those won't ever be -something.

Just my 2 cents.

Regards, Jan De Poorter http://workswithruby.com

I'm still not sure why you think this should be made the default, sure there are benefits in some specific cases but in the general case it would just create unnecessary work for those who have an existing project already using signed integers.

Basically everyone on an existing database will have to do a lot of work to convert their integer types in order for their usual workflow for creating migrations to not have to change. (In case one already has FK constraints on signed types how easy would it be in mysql to convert both columns to an unsigned type while also keeping the FK constraint?)

Or they will have to keep track of which tables have signed and which have unsigned primary keys and when to make an FK column signed, which is not fun because until you add the FK constraint it will not be clear that the types differ so you may even get partially ran migrations because of this (and since mysql can not run migrations inside a transaction this makes it even more difficult to recover).

How would you justify this?

Note that I don't use mysql so I shouldn't really care much about this, but if I did I would not like to have to do either of the above for no immediate benefit. I however have nothing against supporting :unsigned option or making it possible to configure the default type of primary key to be an unsigned int.

Thank you,

FYI, you can currently store IPv4 in the database using a signed field.

MIN_INT4_UNSIGNED = -2147483648

I have a method like this that converts an ip string to an integer,
then adds the offset above to make it signed.

   def self.ip_to_int4(an_ip)      an_ip.split('.').inject(0){|sum, i| (sum << 8) + i.to_i } +
Integer::MIN_INT4_UNSIGNED    end

To convert it back you subtract the offset when you load it from the db.

   def self.int4_to_ip(int4)      int4 -= Integer::MIN_INT4_UNSIGNED      r =      4.times{        r << int4 % 256        int4 >>= 8      }      r.reverse.join('.')    end

I still support the unsigned integer idea though. Perhaps a setting
in environment.rb could explicitly turn it on, then projects could
decide whether or not they wanted to use it. This would be handy for
both existing apps (not to break anything) and new ones (which can
rake db:migrate:reset without penalty).

- Steve

Does that screw up foreign keys ? mysql is rather nit picky about the foreign key and the id having the same type (eg you can't have a foreign key constraint connecting a bigint column on one table to an int column on another: they both have to be bigints or both ints)

Fred

Sigh. it's been a long day. the rest of your mail trickled through. I do share tarmo's concerns (eg when I create a new table with a user_id I have to remember that my users table is old and has a signed primary key)

Fred

Hehe, I'll try and keep this reply a bit shorter :slight_smile:

@Steven: won't that prevent you from using the IPs in some kind of range query? e.g. SELECT * FROM some_ips WHERE ip >= start_ip AND ip <= end_ip (maybe not a great example, but you get the idea)

My main reason for making primary keys unsigned by default is that this is the convention for MySQL databases and in my opinion Rails should follow this convention. A less important reason is that it saves me extra work when creating new migrations as I get an unsigned key out of the box!

I appreciate this would require developers to give some thought when making the transition from pre-patch to post-patch versions of Rails, however a similar scenario occurred when this change was committed: http://github.com/rails/rails/commit/a37546517dad9f6d9a7de6e1dba4d960909d71e8 (for example what were once created as 4 byte integers could become 1, 2, 4 or 8 byte integers after this change).

I may be wrong but won't this old/new migration issue only arise in situations where you're adding a new foreign key that references an existing primary key, and even then only if you choose to use 'sexy' syntax instead of add_column? And shouldn't you really do a quick double check on a column type before adding an FK constraint, just in case it's not the simple numeric key you were expecting?

Anyway, I think there are currently three options:

1. Change the default to unsigned for primary keys Requires developers choose between updating their existing keys or specifying :unsigned => false when creating new foreign keys that reference old signed primary keys.

2. Add a :primary_key_unsigned => true|false setting to create_table (default to false) Requires developers manually specify when they want an unsigned key in each new migration.

3. Add some kind of config setting (I'm not exactly sure where it'd go) that determines the default for primary keys when using MySQL Developers would then be able to choose the global default, this could be combined with option 2 to allow the default to be overridden in individual migrations. Seems like a few too many settings to me though...

Not if you convert your ip to the compressed signed integer format
first.

I think your option 3 below is the best choice.

Just a quick bump to say that I posted an updated patch the other day - this one does not touch primary keys or references/belongs_to.

http://rails.lighthouseapp.com/projects/8994/tickets/300-unsigned-integers-for-mysql#ticket-300-2

I'll revisit the primary key options at a later date (I've a couple of other related changes I'd like to investigate) - didn't seem worth holding up this patch while I do it though...

Thanks.