How well does Rails handle brownfield data in a greenfield application?

Let me give an example: Say I have an e-commerce storefront whose data is provided by a third-party vendor, rather than just a handful of items that I'm selling myself (which seems to be how most Rails e-commerce sites are set up, Shopify for example). My vendor has a list of about 28,000 product SKUs, which they provide to me as a huge tab-delimited flat file containing product, manufacturer and category information (i.e. not relational in any way, shape or form, just a glorified spreadsheet). I will need to parse it several times to extract the fields into different tables. The file has predefined IDs that serve as the primary key for each entity, and these need to be used in order to reference items properly (e.g. Product #12543 is made by Manufacturer #17835 and belongs to Category #34324, based on the columns "ProductID", "ManufacturerID" and "CategoryID"). I can't deviate from this structure.

Would Rails have any potential issues with having its ID key predetermined? In the database it can still be kept as an autoincrementing integer; the ID just gets assigned beforehand. I'm going to have to write a task of some kind to parse the file into chunks and then load the product data into the correct models using those chunks. If I were dealing with a database directly this might cause some issues, but I'm not sure about Rails itself.
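To make the parsing step concrete, here is a minimal plain-Ruby sketch of splitting such a file into product, manufacturer and category groups keyed by the vendor's IDs. Only the "ProductID", "ManufacturerID" and "CategoryID" column names come from the description above; the name columns, the sample rows and the `split_vendor_rows` helper are assumptions made up for illustration, and it does the split in a single pass rather than several.

```ruby
# Hypothetical sample of the vendor file. The *Name columns and the rows
# themselves are invented; only the three ID column names are from the post.
sample = <<~TSV
  ProductID\tProductName\tManufacturerID\tManufacturerName\tCategoryID\tCategoryName
  12543\tWidget\t17835\tAcme\t34324\tHardware
  12544\tGadget\t17835\tAcme\t34325\tTools
TSV

# Split the flat file into three hashes, each keyed by the vendor's own ID.
def split_vendor_rows(text)
  lines = text.lines.map(&:chomp).reject(&:empty?)
  header = lines.shift.split("\t")
  products = {}
  manufacturers = {}
  categories = {}

  lines.each do |line|
    # Pair each header name with the value in the same column.
    row = header.zip(line.split("\t", -1)).to_h

    # Manufacturers and categories repeat across rows; keep the first seen.
    manufacturers[row["ManufacturerID"]] ||= { name: row["ManufacturerName"] }
    categories[row["CategoryID"]] ||= { name: row["CategoryName"] }

    products[row["ProductID"]] = {
      name: row["ProductName"],
      manufacturer_vendor_id: row["ManufacturerID"],
      category_vendor_id: row["CategoryID"]
    }
  end

  { products: products, manufacturers: manufacturers, categories: categories }
end

result = split_vendor_rows(sample)
```

Each group could then be handed to whatever loading task writes the corresponding table; how that task assigns database IDs is the question discussed in the replies.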

I want to make sure I won't run into any pitfalls before I make an attempt at this.

I would recommend leaving the Rails record IDs intact and simply creating another field that represents the SKU. This could be a non-autoincrementing field in your table, but that really depends on how the vendor introduces new products to the system. In short, your record ID should be different from your product ID (i.e. SKU), because this gives you a bit more flexibility if the SKU changes in the future. For example, if the SKU 12543 changed to one of the following:

CAT12543

or

ZIP12543

or

BIF12543

You get the idea.

Good luck,

-Conrad

The SKU would be a separate field in and of itself. What I'm saying is that the numeric ID (what Rails would set as an autoincrement) is pre-assigned from the data instead of being assigned automatically by Rails.

Be very careful doing this. When you change assumed Rails conventions and behavior, it tends to bite you in unexpected ways. While you can override the ID column, I'd suggest not doing it.

I'd seriously just let Rails add an extra ID column and put the big table in other columns. The cost is small and you are not potentially breaking future Rails stuff.
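As a rough illustration of that layout, the plain-Ruby sketch below keeps a locally assigned `id` (standing in for the Rails autoincrement column) next to a separate `vendor_id`, and resolves a product's manufacturer through a lookup from vendor ID to local ID. It is not Rails code: the `assign_local_ids` helper, the field names and the sample values other than the IDs quoted in the question are all hypothetical.

```ruby
# Give each vendor record a local autoincrementing id (a stand-in for the
# Rails `id` column) while keeping the vendor's ID in its own field.
def assign_local_ids(vendor_records)
  vendor_records.each_with_index.map do |(vendor_id, attrs), index|
    attrs.merge(id: index + 1, vendor_id: vendor_id)
  end
end

# Hypothetical manufacturers keyed by the vendor's ManufacturerID.
manufacturers = assign_local_ids({
  17835 => { name: "Acme" },
  17900 => { name: "Globex" }
})

# Lookup table: vendor ID -> local id.
local_id_for = manufacturers.to_h { |m| [m[:vendor_id], m[:id]] }

# Hypothetical product keyed by the vendor's ProductID.
vendor_products = {
  12543 => { name: "Widget", manufacturer_vendor_id: 17900 }
}

# Translate the vendor's manufacturer reference into the local foreign key.
products = assign_local_ids(vendor_products).map do |product|
  product.merge(manufacturer_id: local_id_for.fetch(product[:manufacturer_vendor_id]))
end
```

The extra cost is one lookup per reference during import; in exchange, the local `id` never has to change if the vendor renumbers or reformats its IDs.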

Brendon Whateley wrote:

Be very careful doing this. When you change assumed Rails conventions and behavior, it tends to bite you in unexpected ways. While you can override the ID column, I'd suggest not doing it.

I'd seriously just let Rails add an extra ID column and put the big table in other columns. The cost is small and you are not potentially breaking future Rails stuff.

+1. Mapping legacy IDs onto default Rails IDs will bite.

Again, I would leave the Rails ID intact so as not to break the conventions, and create other fields as needed for your application. The ID used by Rails is a record identifier and shouldn't be used for other purposes such as an actual product ID, order ID, and so on.

Good luck,

-Conrad