When optimizing queries like:
select * from foo where id in (select foo_id from bar where amount != 0 and date > '2020-01-01')
It is sometimes (at least in our case with Postgres) useful to rewrite this as:
with mybars as materialized ( select foo_id from bar where amount != 0 and date > '2020-01-01' ) select * from foo where id in (select foo_id from mybars)
materialized here is essential since it forces postgres to do the “inner query” first.
Let’s say you have an
foo_id. The first query can be rewritten to a “normal” join by postgres query planner and postgres might choose to use
index_2 even if this is not the best. The second query forces the “inner query” to be executed first.
I implemented this as follows and it seems to work:
module Arel module Nodes class AsMaterialized < Binary; end end module Visitors class ToSql def visit_Arel_Nodes_AsMaterialized(o, collector) collector = visit o.left, collector collector << " AS MATERIALIZED " visit o.right, collector end # EDIT: Needed to add the new Node here as well (perhaps only here?) def collect_ctes(children, collector) children.each_with_index do |child, i| collector << ", " unless i == 0 case child when Arel::Nodes::As, Arel::Nodes::AsMaterialized name = child.left.name relation = child.right when Arel::Nodes::TableAlias name = child.name relation = child.relation end collector << quote_table_name(name) if child.is_a?(Arel::Nodes::AsMaterialized) collector << " AS MATERIALIZED " else collector << " AS " end visit relation, collector end collector end end end end
Then it can be used like:
bar_table = Bar.arel_table cte_table = Arel::Table.new(:mybars) cte_definition = bar_table .project(:foo_id) .where(bar_table[:amount].not_eq(0)) .where(bar_table[:date].gt('2020-01-01')) puts Arel::Nodes::AsMaterialized.new(cte_table, cte_definition).to_sql
mybars AS MATERIALIZED (select....etc)
Would this be welcome as a pr?