Hi,
When optimizing queries like:
select *
from foo
where id in (select foo_id from bar where amount != 0 and date > '2020-01-01')
It is sometimes (at least in our case with Postgres) useful to rewrite this as:
with mybars as materialized (
select foo_id from bar where amount != 0 and date > '2020-01-01'
)
select *
from foo
where id in (select foo_id from mybars)
The keyword materialized
here is essential since it forces postgres to do the “inner query” first.
Some background:
Let’s say you have an index_1
on amount
and date
and index_2
on foo_id
. The first query can be rewritten to a “normal” join by postgres query planner and postgres might choose to use index_2
even if this is not the best. The second query forces the “inner query” to be executed first.
I implemented this as follows and it seems to work:
module Arel
module Nodes
class AsMaterialized < Binary; end
end
module Visitors
class ToSql
def visit_Arel_Nodes_AsMaterialized(o, collector)
collector = visit o.left, collector
collector << " AS MATERIALIZED "
visit o.right, collector
end
# EDIT: Needed to add the new Node here as well (perhaps only here?)
def collect_ctes(children, collector)
children.each_with_index do |child, i|
collector << ", " unless i == 0
case child
when Arel::Nodes::As, Arel::Nodes::AsMaterialized
name = child.left.name
relation = child.right
when Arel::Nodes::TableAlias
name = child.name
relation = child.relation
end
collector << quote_table_name(name)
if child.is_a?(Arel::Nodes::AsMaterialized)
collector << " AS MATERIALIZED "
else
collector << " AS "
end
visit relation, collector
end
collector
end
end
end
end
Then it can be used like:
bar_table = Bar.arel_table
cte_table = Arel::Table.new(:mybars)
cte_definition = bar_table
.project(:foo_id)
.where(bar_table[:amount].not_eq(0))
.where(bar_table[:date].gt('2020-01-01'))
puts Arel::Nodes::AsMaterialized.new(cte_table, cte_definition).to_sql
This produces mybars AS MATERIALIZED (select....etc)
Would this be welcome as a pr?