How would you handle gracefully validating a generated unique code until it's guaranteed to be valid?

Hi,

Imagine having a code field with a validation rule of this:

  validates :code,
    presence: true,
    length: { maximum: 19 },
    format: { with: /\A[A-Z0-9\-]+\z/i, message: "must be a-z|A-Z|0-9|-" },
    uniqueness: true

Basically codes like JX26-BYWY-AFRE are valid.

Now imagine populating this code with:

before_validation(on: :create) { self.code = Generate.random_code }

So far so good. The workflow is: when you create a new item, its code property gets generated and then validated. It will likely be valid because your app code controls how it’s generated.
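
The thread never shows Generate.random_code, so here is a minimal sketch of what such a generator might look like, assuming 12 characters drawn from A-Z and 0-9 and grouped in fours like the JX26-BYWY-AFRE example. The module body, the constants and the use of SecureRandom are all assumptions for illustration, not the original poster's code.

```ruby
require "securerandom"

# Hypothetical stand-in for the thread's Generate module (not the OP's code).
module Generate
  ALPHABET = [*("A".."Z"), *("0".."9")].freeze # 36 symbols (assumed)
  GROUPS = 3
  GROUP_SIZE = 4

  # Returns a code shaped like "JX26-BYWY-AFRE".
  def self.random_code
    chars = Array.new(GROUPS * GROUP_SIZE) do
      # random_number(n) returns an integer in 0...n from a secure source
      ALPHABET[SecureRandom.random_number(ALPHABET.size)]
    end
    chars.each_slice(GROUP_SIZE).map(&:join).join("-")
  end
end

puts Generate.random_code
```

A code built this way is 14 characters long, so it passes the length and format rules above; only the uniqueness rule can still fail.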

There is a very small chance that the generated code will collide with an existing one, in which case the uniqueness validation on the code will fail, as expected.

I very much want to avoid the form submission failing and displaying a flash message, because the end user can’t edit this field when the resource is created. It also happens during an important workflow where making a user re-submit the form wouldn’t be an ideal user experience.

What I’m trying to do is come up with a workflow like:

  • Create item
  • Have its code automatically generated and validated
  • If validation fails because the code is not unique generate a new code to replace the bad one
  • Loop until the code is unique and eventually let the record get saved

I’d like to avoid re-running .save in my controller in a loop or rescuing from errors.

I was thinking maybe on after_validation I could check to see if item.errors includes a specific uniqueness error on the code and if so then generate a new code and this could be an until loop until the code is valid.

Is there a more efficient way?

I do have a uniqueness constraint at the DB level with Postgres so that’s available to catch but the app level Rails validation will prevent this from happening I think.

I suppose alternatively I could not validate the code at all on create and since the code is generated by app code I can trust it’s valid and then if a unique constraint error is thrown re-generate a new code and try to save it again in a loop until it works? This sounds like I’d have to rescue from errors in my controller tho.

I feel like this is a pretty common problem so I was wondering if there’s a “Rails way” I haven’t come across.

Do you need these to look random? Could they be based off an auto-incrementing integer, e.g. the primary key?

Here’s one option that would guarantee uniqueness without needing a loop:

def code
  id
    .to_s(36) # convert the integer to a base-36 string (using letters and numbers)
    .upcase # convert to upcase
    .rjust(12, "0") # pad the left with 0s until it's 12 characters long
    .insert(8, "-") # insert dash after the 8th character
    .insert(4, "-") # insert dash after the 4th character
end

You could convert it back like this:

def self.find_by_code(code)
  find(
    code.delete("-").to_i(36)
  )
end
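
To see the round trip without a database, here is a standalone sketch of the same two conversions. The function names encode_id and decode_code are made up for this example; they take the id as an argument instead of reading it from a model.

```ruby
# Standalone versions of the two methods above (names are hypothetical).
def encode_id(id)
  id.to_s(36)          # integer -> base-36 string
    .upcase
    .rjust(12, "0")    # left-pad to 12 characters
    .insert(8, "-")    # dash after the 8th character
    .insert(4, "-")    # dash after the 4th character
end

def decode_code(code)
  code.delete("-").to_i(36) # to_i(36) accepts upper- and lowercase digits
end

code = encode_id(12_345)
puts code               # prints 0000-0000-09IX
puts decode_code(code)  # prints 12345
```

Consecutive ids produce consecutive codes with this scheme, so it guarantees uniqueness but not unpredictability.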

Ah, your solution is almost like a mini version of a hashId, that’s a good idea.

In my case the code isn’t something you can choose when creating but on update you’re allowed to edit the code as long as it hasn’t been used yet. I was aiming to have a code get generated that follows the same format as other codes in the system.

For example, there are discount codes, affiliate codes and team license codes that all have the same format. In this case the team license code is what gets created automatically. I could make a case that it’s not necessary to have an option to edit it, since it’s only ever shared through a URL, in which case generating it off an auto-incrementing id will work.

For example /team_license_checkout?code=JX26-BYWY-AFRE, in this case I’d like to keep the code pretty close to impossible to guess. I know hashIds can be padded, padding it with 32 characters should do the trick.

Are these codes meant to be secret from other users? Or are they being added by admins, and the need to avoid duplicates is a courtesy feature? I ask because one of the things you want to avoid in many systems like this is any means by which someone can validate a code’s existence. It’s why you don’t see error messages like “sorry, that e-mail address is already registered” – because that would be a way to identify other users of that system.

Walter

Are these codes meant to be secret from other users?

A team license is created during a checkout workflow by the end user who makes a purchase.

For example, Walter buys 10 copies of X product. This creates a team_license model row associated with your user account. It has a code field that allows you (the team owner) to send links out to your team mates to access X product for free without any billing details.

In this case it’s important that it’s unique because anyone can visit the team checkout URL with the code in the URL to redeem one of your licenses. Each team license has its own individual usage limit and other attributes.

If you as the team owner wanted to change your code to be something else you could but only if it hasn’t been redeemed yet by anyone. I can technically change this behavior because it’s really not needed, but yeah the uniqueness validation here would make for a friendly error message.

I didn’t want a uniqueness violation during the checkout process to affect the user, since that could kill conversion rates on an important transaction. Plus the user at this point in time wouldn’t even know what the code is; how to share your team license with team mates is explained after the checkout process.

Can you get away with delayed generation? Add a unique-or-null DB constraint and create the record with a null code, then kick off a background job that loops: generate a random code, save, and retry whenever the DB constraint error is raised.

Then on your purchase landing page you probably render a “generating…” placeholder that can listen for an ActionCable broadcast of the final key once it’s settled in the db.

Should also support the “user can custom edit it later” workflow.

Is this a real issue? Assuming the number of possible codes is high enough that a collision is unlikely but technically possible, you only have to do one lookup on a presumably indexed table. It seems unlikely this would take long enough for the user to notice; even if you did get a collision and had to retry a few times, we are still talking about an amount of time nobody would notice. In other words, I wonder if this is a case of premature optimization.
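
To put a rough number on "unlikely", here is a back-of-the-envelope sketch. It assumes codes are 12 characters drawn uniformly from 36 symbols (A-Z, 0-9), which matches the shape of JX26-BYWY-AFRE but is not confirmed anywhere in the thread; the constant and method names are made up.

```ruby
# Assumption: 12 uniformly random characters from a 36-symbol alphabet.
CODE_SPACE = 36**12

# Chance that ONE freshly generated code matches any of `existing` stored codes.
def collision_chance(existing)
  existing.to_f / CODE_SPACE
end

[1_000, 1_000_000, 1_000_000_000].each do |n|
  puts format("%13d existing codes -> %.2e", n, collision_chance(n))
end
```

Under that assumption, even with a million codes already stored, the chance that a given new code collides is on the order of 10^-13, so a retry would almost never run.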


Another option is to assume success but prepare for failure. This is similar to the approach of create_or_find_by only you don’t actually care about the find part. Something like this:

class License < ApplicationRecord
  validates :code,
    presence: true,
    length: { maximum: 19 },
    format: { with: /\A[A-Z0-9\-]+\z/i, message: "must be a-z|A-Z|0-9|-" }

  before_validation :generate_code, on: :create

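  def create_ensuring_unique_code!
    save! # instance-level call; create! only exists on the class
  rescue ActiveRecord::RecordNotUnique
    generate_code # pick a fresh code, then re-run the method body
    retry
  end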

  private

  def generate_code = self.code = Generate.random_code
end

Note that we don’t implement the uniqueness validation in the application at all, only in the database. This sort of violates your goal of avoiding multiple saves or rescuing from exceptions, but it pushes that down to the model level so the controller can remain pretty clean. The nice thing here is that you lean into the fact that collisions are unlikely:

  1. We don’t bother doing a lookup first under the assumption that it will not collide.
  2. If that assumption is correct (very likely) the record is inserted nice and efficiently.
  3. If the assumption is incorrect the database efficiently tells us and we just keep trying until it does work (unlikely to fail more than once).

Your controller would call create_ensuring_unique_code!. We still set the first code via a before_validation callback, so if anyone ever inadvertently calls the regular create!, save!, etc., it should work most of the time.
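
The rescue-and-retry flow can be tried out without Rails. The sketch below is a simulation only: FakeTable stands in for a table with a unique index, DuplicateCode stands in for ActiveRecord::RecordNotUnique, and max_attempts is an extra safety cap that the model code above does not have. All of those names are invented for the example.

```ruby
require "set"

# Stand-in for ActiveRecord::RecordNotUnique (hypothetical).
class DuplicateCode < StandardError; end

# Stand-in for a table with a unique index on code (hypothetical).
class FakeTable
  def initialize
    @codes = Set.new
  end

  def insert!(code)
    # Set#add? returns nil when the value is already present
    raise DuplicateCode, code unless @codes.add?(code)
    code
  end
end

# Insert optimistically; regenerate only when the "database" rejects the code.
def insert_with_unique_code(table, generator, max_attempts: 5)
  attempts = 0
  begin
    attempts += 1
    table.insert!(generator.call)
  rescue DuplicateCode
    retry if attempts < max_attempts
    raise # give up and surface the error after too many collisions
  end
end

table = FakeTable.new
table.insert!("AAAA")
queue = %w[AAAA AAAA BBBB] # forces two collisions before a free code
puts insert_with_unique_code(table, -> { queue.shift }) # prints BBBB
```

One thing to keep in mind with Postgres: a unique violation inside an already open transaction aborts that transaction, so a retry like this is easiest to reason about when the save is not wrapped in an outer transaction.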

Don’t fear ugly, obvious code:

before_validation(on: :create) do
  attempts = 0
  loop do
    self.code = Generate.random_code
    # stop as soon as no existing row has this code
    break unless self.class.exists?(code: code)
    attempts += 1
    if attempts >= 10
      raise "Unable to generate unique code after #{attempts} attempts - something else may be wrong"
    end
  end
end

This is still not bulletproof, as another process could generate the same code between your check and your insert and you’d get a constraint violation in the database (you did use a unique constraint, right?), so another option is to have the code that’s calling .save do this loop.

(Side note: this is a good example of why putting complex logic in some other class makes things simpler - having to fit this into AR callbacks is awkward)
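
As a rough illustration of that side note, the loop could live in a small plain-Ruby object that knows nothing about callbacks. UniqueCodePicker and its keyword arguments are invented for this sketch; in a Rails app the taken check might be something like ->(code) { License.exists?(code: code) }, which is an assumption about the model name.

```ruby
# Hypothetical plain-Ruby object: keeps generating until the `taken` check
# says the code is unused, with a cap so a broken generator cannot loop forever.
class UniqueCodePicker
  class GaveUp < StandardError; end

  def initialize(generator:, taken:, max_attempts: 10)
    @generator = generator
    @taken = taken
    @max_attempts = max_attempts
  end

  def call
    @max_attempts.times do
      code = @generator.call
      return code unless @taken.call(code)
    end
    raise GaveUp, "no unused code after #{@max_attempts} attempts"
  end
end

existing = %w[AAAA-AAAA-AAAA]
queue = %w[AAAA-AAAA-AAAA BBBB-BBBB-BBBB]
picker = UniqueCodePicker.new(
  generator: -> { queue.shift },
  taken: ->(code) { existing.include?(code) }
)
puts picker.call # prints BBBB-BBBB-BBBB
```

Because the generator and the check are injected, the loop can be tested with plain arrays as shown. The race between the check and the insert is still there, so the database constraint remains the final safeguard.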

Ok I haven’t tried in production :see_no_evil:, but if we’re really looking to know whether a number already exists that also considers race conditions, maybe we can have a new table License::ReservedNumber and do something like this.

Note: here is a gist with some testing done

class License < ActiveRecord::Base
  validates :number, presence: true, uniqueness: true

  before_validation ->(license) { license[:number] = ReservedNumber.reserve }, on: :create

  def number=(value)
    raise NotImplementedError, 'number cannot be explicitly set'
  end
end

class License::ReservedNumber < ActiveRecord::Base
  validates :number, presence: true, uniqueness: true

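  def self.reserve(generator: Generate)
    record = new(number: nil)
    # The first save fails the presence validation, which triggers the first
    # generation; after that we regenerate until the record saves.
    until record.save
      record.number = generator.random_code
    end
    record.number
  end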
end

with this schema

ActiveRecord::Schema.define do
  create_table :license_reserved_numbers, id: false, force: true do |t|
    t.string :number, primary_key: true
  end

  create_table :licenses, force: true do |t|
    t.string :number, null: false
    t.index [:number], unique: true
  end

  add_foreign_key :licenses, :license_reserved_numbers, column: :number, primary_key: :number
end

Note: depending on the code length and the generation frequency, you might need to clear unused reserved numbers every now and then. The foreign key would prevent the deletion of a used number in a license.