Feature proposal: Relation.update_all but with varying values

Hello hello, I am working on a variation of Relation.update_all that can set varying values per row affected. As I understand it there is no available high-level interface for such a query, the high-level workarounds being to do multiple update queries or use Relation.upsert_all. The interface is similar to Relation.upsert_all but the internals are more like Relation.update_all. current implementation

The update query employs the pattern of joining with a handwritten VALUES table incorporating both conditions and update values in each row. This is supported on all the engines – though certainly not without several annoyances on each one.

More importantly perhaps, checking EXPLAIN on MySQL and PostgreSQL suggests they do run the construct as one would expect, using the correct index for the join condition.

Example construct in PostgreSQL, with a join on the primary key:

UPDATE "books" SET "name" = "data"."column2"
  FROM (VALUES (1, 'Reword'), (2, 'Peopleware')) AS "data" ("column1", "column2")
  WHERE "books".id = "data"."column1";

It would be generated from this code:

# [[conditions, updates], ...] syntax
Book.uniform_update_all([
  [{ id: 1 }, { name: "Reword" }],
  [{ id: 2 }, { name: "Peopleware" }]
])

# shorthand syntax when conditions is the primary key
Book.uniform_update_all({
  1 => { name: "Reword" },
  2 => { name: "Peopleware" },
})

More advanced example where the conditions are not the primary key:

# one-shot migration to combine two columns into one
Book.uniform_update_all [
  [{ written: false, published: false }, { status: :proposed }],
  [{ written: true,  published: false }, { status: :written }],
  [{ written: true,  published: true },  { status: :published }]
]

It is “uniform” in the sense that all entries must use the same keys for conditions and values, just like Relation.insert_all.

Considerably more advanced example from the PostgreSQL docs linked before, which this solution cannot assemble for several reasons (non-trivial SET and JOIN):

UPDATE employees SET salary = salary * v.increase
  FROM (VALUES(1, 200000, 1.2), (2, 400000, 1.4)) AS v (depno, target, increase)
  WHERE employees.depno = v.depno AND employees.sales >= v.target;

Some implementation details: we build the full Arel tree with a custom arel node for the VALUES table. Then we rely on Arel::Crud#compile_update just like Relation.update_all does to assemble the update query.

This approach looks like the most appropriate, as it will inherit where constraints just like update_all, but it will invariably break if the arel visitor later decides it must perform a subselect, as it hides the VALUES table from the SET clause.

This looks like a very general problem, one that other constructs would have as well, so I don’t think it means the implementation is wrong. It does mean that it will support only where and join basically, and none of these: order, limit, offset, group, having.

I am looking for feedback on the idea and implementation etc. Will send a pull request on github once I’ve cleaned the code up a bit more.

1 Like