Problem Statement We have a scenario where there is a network delay in the database communication because of inter region / datacentre communication. While load_async looks promising to improve app performance, it does add some more load on no of database connections and no of Ruby threads. There is a feature in Postgres 14 of pipelines PostgreSQL: Documentation: 14: 34.5. Pipeline Mode which will help in sending queries to database even before waiting for the response on same connection and can bring performance benefits. If the support for Pipelining is added in ActiveRecord then it will benefit in reducing the latency where network delays are biggest concern.
ruby-pg change Change in Ruby-pg gem has been done as part of the PR Add support for Pipeline mode of PostgreSQL-14 · Issue #401 · ged/ruby-pg · GitHub
Proposed interface
Similar to ActiveRecord::Base.connected_to(role: :reading)
, there can be an interface like
ActiveRecord::Base.connected_with_pipeline(role: :reading) do
# all code in this block will be connected and executed in pipeline connection
end
For an entity with children being loaded as separate DB calls instead of joins (e.g. the case of preload
), the interface could be something like
User.connected_with_pipeline(role: :reading).joins(:addresses).where("addresses.country = ?", "Poland").preload(:addresses)
Challenges / Views
Wanted to understand the impact areas which we need to be aware of before making/start doing this change. Also wanted the community views around this new behaviour to be added and any relevant suggestions on the same
Few of the challenges in implementation that I could think of are :
- In pipeline mode, the results are sent back in the order in which the
send_query
was invoked, therefore some state would be needed to maintain the order in case of block calls, so as to initialise correct entities (Refer PostgreSQL: Documentation: 14: 34.5. Pipeline Mode). - For dependent queries, having a mechanism of waiting if a subsequent query uses a field of a previous query (Similar behaviour in load_async).
- Error response handling for different queries, as if one of the query fails in the pipeline other queries are not processed
- Compatibility with other modes of loading an entity (
load_async, eager_load, includes, preload
)
Do suggest any relevant things to watch out for, or in general. about the change which we are thinking to add.
Thanks