Hello. I'm new to Ruby & Rails, though a veteran at engineering large-scale distributed systems.
I have a new project which requires a REST API and a simple web UI, and after reading (superficially) about RoR on and off over recent years, I thought it was time I took it for a spin. It is a dream 'ground-up' project with no legacy requirements.
However, I've hit a speed-bump and I'm unsure if it is a limitation of Rails or just my lack of in-depth understanding of the code base.
When using the generator to generate a new model class, Rails chooses an auto-increment int id as the primary key by default(!). This is obviously pretty poor form for numerous reasons, such as:
1. Completely at odds with scalability and distributed implementation of a DB since it introduces an unnecessary need for centralization
2. Depending on the DB engine, you might run out of primary keys as soon as you hit 2^32 rows
3. A security vulnerability waiting to happen - unless you pay close attention, it would be easy to expose ids to the public in a multi-user environment for which guess-ability of some resource ids is bad practice
By 1, I'm talking about indefinitely scalable distributed implementations (since the term 'scalability' is used to mean a wide variety of things, from vertically scaling a web app's performance by adding memory to a server, to horizontally scaling with limits, where adding resources such as servers eventually becomes a case of diminishing returns). An easy way to check whether your architecture is fully distributed, with performance of operations independent of data size etc., is to do a quick thought experiment where you imagine having a ridiculous amount of data, users etc. For example, would the performance for a user be significantly affected if your database was so large it needed to be spread over a trillion servers? If the answer is yes, then your architecture is not indefinitely scalable, as there is some centralization introducing a dependency between performance and data set size, or user-base size or whatever.
So, if you had a trillion DB servers, auto_increment could never work, because determining the next id would require querying them all to figure out what the largest existing id is (or, alternatively, keeping the 'next id' stored in a central place - which will be a performance bottleneck when a trillion servers have to hit it up for every insert). (For the purists, notice I said "significantly" above. For example, consider the design of the DNS system and imagine if records had no TTL - living on indefinitely. The load on the root servers would be vanishingly small and it would hardly matter if they were out of service for short periods.)
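To make the contrast concrete, here is a rough sketch in plain Ruby (no Rails involved). The class names CentralCounter and Node are made up purely for illustration: the first stands in for an auto_increment column that every insert has to go through, the second for a server that mints its own ids with no coordination. Threads stand in for servers, so treat it as a toy model rather than a benchmark.

```ruby
require 'securerandom'

# Hypothetical stand-in for auto_increment: a single shared counter that
# every writer must lock and update for every insert.
class CentralCounter
  def initialize
    @next_id = 0
    @lock = Mutex.new
  end

  def next_id
    @lock.synchronize { @next_id += 1 }
  end
end

# Hypothetical node that generates ids locally, with no shared state.
class Node
  def next_id
    SecureRandom.uuid # random (version 4) UUID, 36 characters
  end
end

# Four "servers" all contending for the one central counter.
counter = CentralCounter.new
central_ids = Array.new(4) {
  Thread.new { Array.new(100) { counter.next_id } }
}.flat_map(&:value)

# Four "servers" each generating ids independently.
nodes = Array.new(4) { Node.new }
uuid_ids = nodes.map { |node|
  Thread.new { Array.new(100) { node.next_id } }
}.flat_map(&:value)
```

The central counter only hands out correct ids because every writer serializes on the same lock, which is exactly the bottleneck described above. The nodes never talk to each other, and uniqueness rests on the randomness of the UUIDs instead.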
Obviously, nobody has a trillion servers, but engineering systems to be highly scalable isn't hard and is good practice anyway. If your client's service becomes the next Facebook, you won't have to touch anything - just spool up more and more cloud servers and sit back, rather than watch their business fail as users leave a sinking ship of slow or failed page-loads.
Now, I've surfed around the web for information about how to use custom ids or other primary key columns in Rails, but have only found confusion (ignoring people who ask why and/or say not to do it). Examples given seem to differ (perhaps due to changes before Rails 3?) and I can't get any of the ideas to work.
For example, supposing I wish to use UUIDs for primary keys. I've tried variations on:
    class CreateItems < ActiveRecord::Migration
      def self.up
        create_table :items, {:id => false} do |t|
          t.string :id, :null => false, :limit => 36, :primary => true
          t.timestamps
        end
      end

      def self.down
        drop_table :items
      end
    end
However, the :primary option doesn't seem to work (perhaps it is invalid) and the generated table doesn't have a primary key. I can use add_index to add a :unique index, but it isn't primary. Obviously, I'll need some hooks to generate the UUIDs - I've delved into that part.
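To illustrate the kind of hook I mean, here is a minimal sketch in plain Ruby. The Item class and its assign_id method are hypothetical stand-ins, not ActiveRecord code. I am assuming that in a real model the same logic would hang off something like a before_create callback, but I haven't confirmed how that interacts with a non-default primary key.

```ruby
require 'securerandom'

# Hypothetical plain-Ruby stand-in for a model; this is NOT ActiveRecord.
class Item
  attr_reader :id

  def initialize(id = nil)
    @id = id
  end

  # Plays the role a before-create hook would play: fill in a UUID only
  # when the caller has not already supplied an id.
  def assign_id
    @id ||= SecureRandom.uuid
    self
  end
end

generated = Item.new.assign_id            # id is a fresh 36-character UUID
supplied  = Item.new("SKU-0001").assign_id # caller-supplied id is kept as-is
```

The 36-character string that SecureRandom.uuid returns matches the :limit => 36 in the migration above.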
So, can Rails really handle this in a clean way and have scaffolding work etc.? How? Can someone kindly clue me in on what I need in the migration, model class and anywhere else? I'd prefer to avoid DB-specific SQL execution (while I'm testing this on MySQL, that obviously isn't a distributed scalable technology, so I'll be using a distributed store ultimately). I'd also like some tables to have natural (domain-specific) primary key values, a related though perhaps separate issue (and less critical).
I've achieved similar on another project using Grails by writing a JPA implementation. I'm really hoping Rails can do this without having the source hacked.
Any help or pointers are greatly appreciated.
Cheers, -David.