Proposal for public Arel API

Pitch

As was discussed in "What has happened to Arel", Arel is currently not officially documented, as it’s still considered private API in Active Record (though there have been notable steps in that direction). I believe the solution is to design a new interface that would become public API. I would like to propose my vision for what a public Arel API could look like.

Background

Unlike many Rubyists, I started learning Ruby through Sinatra & Sequel. I quickly switched to Rails as it was much more beginner friendly, but later worked again with Sinatra, Cuba and Roda for several years, before coming back to Rails a few years ago.

During the off-Rails period, I grew very fond of the Sequel ORM as I was learning about its inner workings. I’ve written about Sequel over the years, from introductions and how-tos, through deep dives, all the way to Active Record integration.

Proposal

I believe Sequel’s API for building SQL expressions is exactly what Arel needs. Having studied both the Arel and Sequel ASTs, I didn’t see any obstacles to extending Arel this way; the Sequel API seems like a superset of the Arel API. I expect it to be fully backwards compatible too, since the new Arel API would build on top of the existing one.

Bringing this interface into Arel would be done in multiple stages, so that it can be reviewed and tweaked incrementally.

First stage

The arel-helpers gem aliases ActiveRecord::Core.arel_table to .[], and many people do this manually in their projects (myself included). Aaron even proposed to upstream it, but it was rejected, partly because some people already override .[] to mean something else. However, I think the core team agrees .arel_table is not a convenient enough entrypoint into Arel.

I propose the Arel module to be the entrypoint. There is already a precedent with Arel.sql, and unlike ActiveRecord it’s very short, making it ripe for having convenience methods. It would allow you to build bare or qualified identifiers.

# shorthand
Arel[:column]                  # "column"
Arel[:table][:column]          # "table"."column"
Arel[:schema][:table][:column] # "schema"."table"."column"

# explicit
Arel.identifier(:column)       # "column"
Arel.qualify(:table, :column)  # "table"."column"
Arel[:column].qualify(:table)  # "table"."column"

# all columns
Arel[:table].*                 # "table".*

While the regular Active Record interface automatically produces qualified identifiers (e.g. Movie.where(name: "Matrix") produces WHERE "movies"."name" =), I think Arel expressions should be interpreted verbatim (e.g. Movie.where(Arel[:name].eq("Matrix")) should produce WHERE "name" =). My attempts to implement auto-qualification increased complexity, and I think we should allow unqualified identifiers.
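To make the identifier semantics above a bit more concrete, here is a rough standalone sketch in plain Ruby. It does not touch Arel at all: the `SketchArel` module and its `Identifier` and `AllColumns` classes are hypothetical names made up for illustration, and the double-quote quoting is an assumption (a real implementation would presumably delegate quoting to the connection adapter).

```ruby
# Hypothetical stand-in for the proposed entrypoint; this is not real Arel code.
module SketchArel
  # An identifier made of one or more parts, e.g. schema, table, column.
  class Identifier
    attr_reader :parts

    def initialize(*parts)
      @parts = parts.map(&:to_s)
    end

    # SketchArel[:table][:column] appends a more specific part.
    def [](name)
      Identifier.new(*parts, name)
    end

    # SketchArel[:column].qualify(:table) prepends a qualifier.
    def qualify(qualifier)
      Identifier.new(qualifier, *parts)
    end

    # SketchArel[:table].* stands for all columns of the identifier.
    def *()
      AllColumns.new(self)
    end

    # Each part is quoted separately; embedded double quotes are doubled.
    # (Assumed ANSI-style quoting, purely for illustration.)
    def to_sql
      parts.map { |part| %("#{part.gsub('"', '""')}") }.join(".")
    end
  end

  # Wraps an identifier to render it with a trailing ".*".
  class AllColumns
    def initialize(identifier)
      @identifier = identifier
    end

    def to_sql
      "#{@identifier.to_sql}.*"
    end
  end

  def self.[](name)
    Identifier.new(name)
  end

  def self.identifier(name)
    Identifier.new(name)
  end

  def self.qualify(qualifier, name)
    Identifier.new(qualifier, name)
  end
end

puts SketchArel[:column].to_sql
puts SketchArel[:schema][:table][:column].to_sql
puts SketchArel[:column].qualify(:table).to_sql
puts SketchArel[:table].*().to_sql # explicit parentheses keep the parse unambiguous
```

Note that nothing here qualifies a bare identifier automatically: `SketchArel[:column]` stays unqualified unless the caller asks for a qualifier, which mirrors the verbatim interpretation argued for above.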

Other stages

If the identifier stage goes well, the next steps would be:

  1. boolean operators (&, | and ~)

    Arel[:name].eq("Foo").or(Arel[:description].matches("Foo"))
    # becomes
    Arel[:name].eq("Foo") | Arel[:description].matches("Foo")
    
  2. inequality operators (<, <=, >, >=)

    Arel[:rating].gt(5)
    # becomes
    Arel[:rating] > 5
    
  3. equality operators (=~, !~)

    Arel[:name].eq("Foo")
    # becomes
    Arel[:name] =~ "Foo"
    
  4. function builder

    Arel::Nodes::NamedFunction.new("lower", [Arel[:name]])
    # becomes
    Arel.function(:lower, :name)
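As a rough idea of how the boolean, inequality, and equality operators from the list above could sit on top of existing named methods, here is a standalone sketch in plain Ruby. Everything in it is hypothetical: `SketchOps` and its node classes are invented for illustration and are much simpler than real Arel nodes and visitors. The point is only that each operator can be a plain alias of a method that already exists, which is why I expect the change to be backwards compatible.

```ruby
# Hypothetical nodes for illustration only; real Arel nodes and visitors differ.
module SketchOps
  # Base class carrying the boolean combinators and their operator aliases.
  class Node
    def or(other)
      Grouped.new("OR", self, other)
    end

    def and(other)
      Grouped.new("AND", self, other)
    end

    def not
      Negation.new(self)
    end

    alias_method :|, :or
    alias_method :&, :and
    alias_method :~, :not
  end

  # Renders "left OPERATOR right".
  class Infix < Node
    def initialize(operator, left, right)
      @operator, @left, @right = operator, left, right
    end

    def to_sql
      "#{@left.to_sql} #{@operator} #{@right.to_sql}"
    end
  end

  # Same as Infix, but wrapped in parentheses (used for AND / OR).
  class Grouped < Infix
    def to_sql
      "(#{super})"
    end
  end

  class Negation < Node
    def initialize(expression)
      @expression = expression
    end

    def to_sql
      "NOT (#{@expression.to_sql})"
    end
  end

  # Naive literal rendering; real code would use bind parameters or adapter quoting.
  class Literal < Node
    def initialize(value)
      @value = value
    end

    def to_sql
      return @value.to_s unless @value.is_a?(String)

      "'" + @value.gsub("'", "''") + "'"
    end
  end

  class Attribute < Node
    COMPARISONS = { eq: "=", not_eq: "!=", gt: ">", gteq: ">=",
                    lt: "<", lteq: "<=", matches: "LIKE" }.freeze

    def initialize(name)
      @name = name.to_s
    end

    def to_sql
      %("#{@name}")
    end

    # Named predicate methods, mirroring the existing Arel method names.
    COMPARISONS.each do |method_name, operator|
      define_method(method_name) { |value| Infix.new(operator, self, Literal.new(value)) }
    end

    # The proposed operators are plain aliases of the named methods.
    alias_method :=~, :eq
    alias_method :!~, :not_eq
    alias_method :>, :gt
    alias_method :>=, :gteq
    alias_method :<, :lt
    alias_method :<=, :lteq
  end

  def self.[](name)
    Attribute.new(name)
  end
end

puts (SketchOps[:name].eq("Foo") | SketchOps[:description].matches("Foo")).to_sql
puts (SketchOps[:rating] > 5).to_sql
puts (SketchOps[:name] =~ "Foo").to_sql
puts (~(SketchOps[:rating] > 5)).to_sql
```

One thing to keep in mind with operators is Ruby's precedence: `|` and `&` bind tighter than `>` and `=~`, so mixed expressions like `(a > 5) | (b =~ "Foo")` need parentheses, whereas the method-call form never does.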
    

Feedback

What are the core team’s thoughts on this direction? I found it to work really well in Sequel for building complex SQL expressions; I don’t remember ever needing to write raw SQL strings for standard SQL.

One downside of the Arel.[] API is that, without the model context, it wouldn’t be possible to do things like ActiveRecord::Enum value type casting.

Honestly, for me the ActiveRecord::Core.[] that Aaron proposed would be ideal. Scopes could then call model[:column]. It wouldn’t break apps as long as Active Record itself calls .arel_table and not .[]. People who have ApplicationRecord.[] overridden to do something else would have to decide whether they want to keep it or have it be the Arel entrypoint; they could even call super for symbol arguments if they never handled symbols in the first place.
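The "call super for symbol arguments" idea could look roughly like the sketch below. It is plain Ruby with hypothetical stand-in classes inside a made-up `SketchModels` module; none of them are Active Record, and the `find` method is only a placeholder for whatever an app's existing `.[]` override did.

```ruby
# Hypothetical stand-ins; none of these classes are Active Record.
module SketchModels
  Attribute = Struct.new(:table, :column)

  # Plays the role of a framework base class providing the proposed .[].
  class Base
    def self.[](column)
      Attribute.new(table_name, column)
    end
  end

  # Plays the role of an ApplicationRecord that already used .[] as a finder.
  class ApplicationRecord < Base
    def self.[](arg)
      # Symbols were never handled by the old override, so defer to the new entrypoint.
      return super if arg.is_a?(Symbol)

      find(arg)
    end

    def self.find(id)
      "#{name}##{id}" # placeholder for a real record lookup
    end
  end

  class Movie < ApplicationRecord
    def self.table_name
      "movies"
    end
  end
end

p SketchModels::Movie[:title] # symbol argument: falls through to Base.[]
p SketchModels::Movie[42]     # anything else: keeps the app's old behaviour
```

This only works cleanly when the existing override never accepted symbols; apps that did would still have to pick one meaning.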

Another approach is to first significantly improve the ergonomics of Arel expression building in the ways I proposed (boolean/inequality/equality operators, functions, etc.), and then circle back to the .arel_table alternative when there is hopefully more motivation.