Ruby 3.2 + Rails 6 & 7 leading to gradual memory leak compared to stable memory in Ruby 2.6 + Rails 6

A few months back, we upgraded one of our services from Ruby 2.6.6 with Rails 6.0.5.1 to Ruby 3.2.2 with Rails 7 and have since noticed a slow but steady increase in memory usage, without any major change to the underlying application code.

We also tried Ruby 3.2.2 with Rails 6, and the result is the same as with Ruby 3.2.2 and Rails 7.

The service is a simple background scanning service that runs only Sidekiq with some cron jobs; it serves no web traffic.

CPU performance did improve.

Even the base memory usage improved, but memory keeps increasing slowly every few hours until it reaches the ceiling.

With Ruby 2, the service stabilized at a certain memory level; with Ruby 3 it keeps climbing. We then switched to jemalloc, which slowed the climb, but it still happens after a few days.

  • Enabling or disabling YJIT in Ruby 3 doesn’t have any impact.
  • We tried both the Alpine and the base Ruby image; that doesn’t have any impact either.

There is no memory bloat: we don’t see a sharp surge in Sidekiq memory, just a slow, gradual increase.
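One way to narrow down a slow climb like this is to log the process RSS next to a few `GC.stat` counters at a fixed interval. If the live slots and old objects keep growing, something in Ruby is retaining objects; if they stay flat while RSS grows, the allocator or fragmentation is the more likely suspect. Below is a minimal sketch of such a snapshot. The helper names `memory_snapshot` and `rss_kb` are made up for illustration, and the RSS part assumes a Linux `/proc` filesystem (it returns `nil` elsewhere).

```ruby
# Hypothetical helpers, not part of Sidekiq or Rails.

# Resident set size in kB, read from /proc (Linux only); nil if unavailable.
def rss_kb
  status = '/proc/self/status'
  return nil unless File.readable?(status)

  line = File.foreach(status).find { |l| l.start_with?('VmRSS:') }
  line && line.split[1].to_i
end

# One sample combining OS-level and Ruby-level memory figures.
def memory_snapshot
  stat = GC.stat
  {
    rss_kb: rss_kb,
    heap_live_slots: stat[:heap_live_slots], # object slots currently in use
    old_objects: stat[:old_objects],         # objects promoted to the old generation
    major_gc_count: stat[:major_gc_count]
  }
end

puts memory_snapshot.inspect
```

Calling `memory_snapshot` from a periodic job and sending the values to the existing metrics pipeline would give a per-process time series to compare between Ruby versions.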

Slow increase with Ruby 3

Stable memory consumption with Ruby 2

Old Gemfile

source 'https://rubygems.org'
git_source(:github) { |repo| "https://github.com/#{repo}.git" }

ruby '2.6.6'

gem 'dotenv-rails'
# Bundle edge Rails instead: gem 'rails', github: 'rails/rails'
gem 'rails', '~> 6.0.5', '>= 6.0.5.1'
# Use postgresql as the database for Active Record
gem 'pg', '>= 0.18', '< 2.0'
# Use Puma as the app server
gem 'puma', '>= 5.6.4'
# Use SCSS for stylesheets
gem 'sass-rails', '>= 6'
# Transpile app-like JavaScript. Read more: https://github.com/rails/webpacker
gem 'webpacker', '~> 5.2'
# Turbolinks makes navigating your web application faster. Read more: https://github.com/turbolinks/turbolinks
gem 'turbolinks', '~> 5'
# Build JSON APIs with ease. Read more: https://github.com/rails/jbuilder
gem 'jbuilder', '~> 2.10'
# Use Redis adapter to run Action Cable in production
# gem 'redis', '~> 4.0'
# Use Active Model has_secure_password
# gem 'bcrypt', '~> 3.1.7'

# Use Active Storage variant
# gem 'image_processing', '~> 1.2'

gem 'redis', '~> 4.2.5'

gem 'hiredis', '~> 0.6.3'
# Reduces boot times through caching; required in config/boot.rb
gem 'bootsnap', '>= 1.7.3', require: false


gem 'sidekiq', '~> 6.2.1'

# Auto throttle jobs
gem 'sidekiq-throttled', '~> 0.13.0'

gem 'redis-namespace', '~> 1.8.0'
# For interacting with Github API
gem 'octokit', '~> 4.21.0'

gem 'faraday-http-cache', '~> 2.2.0'

# For docker api
gem 'docker-api', '~> 2.0.0'

# Collect metrics
gem 'dogstatsd-ruby', '~> 4.8.2'

# Asana
gem 'asana', '~> 0.10.2'

gem 'httparty', '~> 0.18.1'

gem 'shodanz', '~> 2.0.4'

gem 'redlock', '~> 1.2.1'

gem 'slack-notifier', '~> 2.3.2'

gem 'pagerduty', '3.0.0'

gem 'gitlab', '~> 4.17'

gem 'git', '>= 1.11.0'

gem 'slack-ruby-client', '~> 0.17.0'

gem 'graphql-client', '~> 0.17.0'

gem 'prawn', '~> 2.4'

gem 'prawn-table', '~> 0.2.2'

gem 'newrelic_rpm', '~> 8.9'

gem 'sidekiq-cron', '~> 1.7.0'
group :development, :test do
  # Call 'byebug' anywhere in the code to stop execution and get a debugger console
  gem 'byebug', platforms: [:mri, :mingw, :x64_mingw]
end

group :development do
  # Access an interactive console on exception pages or by calling 'console' anywhere in the code.
  gem 'web-console', '>= 3.3.0'
  gem 'listen', '>= 3.0.5', '< 3.4'
  # Spring speeds up development by keeping your application running in the background. Read more: https://github.com/rails/spring
  gem 'spring'
  gem 'spring-watcher-listen', '~> 2.0.0'
  gem 'rubocop'
end

group :test do
  # Adds support for Capybara system testing and selenium driver
  gem 'capybara', '>= 2.15'
  gem 'selenium-webdriver'
  # Easy installation and use of web drivers to run system tests with browsers
  gem 'webdrivers'
end

# Windows does not include zoneinfo files, so bundle the tzinfo-data gem
gem 'tzinfo-data', platforms: [:mingw, :mswin, :x64_mingw, :jruby]

New Gemfile

source 'https://rubygems.org'
git_source(:github) { |repo| "https://github.com/#{repo}.git" }

ruby '3.2.2'

gem 'dotenv-rails'
# Bundle edge Rails instead: gem 'rails', github: 'rails/rails'
gem 'rails', '~> 7.0', '>= 7.0.4.3'
# Use postgresql as the database for Active Record
gem 'pg', '>= 0.18', '< 2.0'
# Use Puma as the app server
gem 'puma', '>= 5.6.4'
# Use SCSS for stylesheets
gem 'sass-rails', '>= 6'
# Transpile app-like JavaScript. Read more: https://github.com/rails/webpacker
gem 'webpacker', '~> 5.2'
# Turbolinks makes navigating your web application faster. Read more: https://github.com/turbolinks/turbolinks
gem 'turbolinks', '~> 5'
# Build JSON APIs with ease. Read more: https://github.com/rails/jbuilder
gem 'jbuilder', '~> 2.10'
# Use Redis adapter to run Action Cable in production
# gem 'redis', '~> 4.0'
# Use Active Model has_secure_password
# gem 'bcrypt', '~> 3.1.7'

# Use Active Storage variant
# gem 'image_processing', '~> 1.2'

gem 'redis', '~> 4.2.5'

gem 'hiredis', '~> 0.6.3'
# Reduces boot times through caching; required in config/boot.rb
gem 'bootsnap', '>= 1.7.3', require: false


gem 'sidekiq', '~> 6.4.2'

# Auto throttle jobs
gem 'sidekiq-throttled', '~> 0.15.0'

# For interacting with Github API
gem 'octokit', '~> 4.21.0'

gem 'faraday-http-cache', '~> 2.2.0'

# For docker api
gem 'docker-api', '~> 2.0.0'

# Collect metrics
gem 'dogstatsd-ruby', '~> 4.8.2'

# Asana
gem 'asana', '~> 0.10.2'

gem 'httparty', '~> 0.18.1'

gem 'redlock', '~> 1.2.1'

gem 'slack-notifier', '~> 2.4'

gem 'pagerduty', '3.0.0'

gem 'gitlab', '~> 4.17'

gem 'git', '>= 1.11.0'

gem 'slack-ruby-client', '~> 0.17.0'

gem 'graphql-client', '~> 0.18.0'

gem 'graphql', '~> 2.0', '>= 2.0.22'

gem 'prawn', '~> 2.4'

gem 'prawn-table', '~> 0.2.2'

gem 'newrelic_rpm', '~> 9.2', '>= 9.2.2'

gem 'sidekiq-cron', '~> 1.7.0'
group :development, :test do
  # Call 'byebug' anywhere in the code to stop execution and get a debugger console
  gem 'byebug', platforms: [:mri, :mingw, :x64_mingw]
end

group :development do
  # Access an interactive console on exception pages or by calling 'console' anywhere in the code.
  gem 'web-console', '>= 3.3.0'
  gem 'listen', '>= 3.0.5', '< 3.4'
  # Spring speeds up development by keeping your application running in the background. Read more: https://github.com/rails/spring
  gem 'spring'
  gem 'spring-watcher-listen', '~> 2.0.0'
  gem 'rubocop'
end

group :test do
  # Adds support for Capybara system testing and selenium driver
  gem 'capybara', '>= 2.15'
  gem 'selenium-webdriver'
  # Easy installation and use of web drivers to run system tests with browsers
  gem 'webdrivers'
end

# Windows does not include zoneinfo files, so bundle the tzinfo-data gem
gem 'tzinfo-data', platforms: [:mingw, :mswin, :x64_mingw, :jruby]

gem "matrix", "~> 0.4.2"

# For Ruby 3.2 YAML behaviour issues - https://stackoverflow.com/a/71192990
gem 'psych', '< 4'

Apart from the Ruby upgrade, what could be the reason behind this behaviour, and has anyone else observed something similar?

I think you will probably need to inspect the heap to see which class (or classes) of objects is slowly growing.

You can try Shopify/heap-profiler, a Ruby heap profiler on GitHub.
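Before reaching for a full heap dump, a rough first pass is possible with core Ruby alone: count live objects per class, run the suspect code a few times, and count again. The sketch below does that with `ObjectSpace.each_object`. The method names `live_counts_by_class` and `growth_by_class` are hypothetical, and the numbers include some noise from the bookkeeping itself, so treat it as a pointer rather than an exact measurement.

```ruby
# Count live objects per class after forcing a full GC.
def live_counts_by_class
  GC.start
  counts = Hash.new(0)
  ObjectSpace.each_object(Object) { |obj| counts[obj.class] += 1 }
  counts
end

# Run the block and report the classes whose live instance count grew,
# largest growth first, as [class, delta] pairs.
def growth_by_class(top: 10)
  before = live_counts_by_class
  yield
  after = live_counts_by_class

  after.map { |klass, n| [klass, n - before[klass]] }
       .select { |_, delta| delta.positive? }
       .sort_by { |_, delta| -delta }
       .first(top)
end

# Usage: wrap a few invocations of the code under suspicion.
held = []
growth_by_class { 1_000.times { held << Object.new } }.each do |klass, delta|
  puts "#{klass}: +#{delta}"
end
```

If one class keeps showing up with a growing delta across repeated runs, a dedicated profiler can then show where those objects are allocated and what retains them.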

Are you able to also reproduce the behavior with Ruby 3.0.x? Maybe it would be easier to find the cause if you know for sure which version of Ruby causes this problem as well.

Thank you, let me try this.

I did try SamSaffron/memory_profiler in the past, and it pointed to github/graphql-client (a Ruby library for declaring, composing and executing GraphQL queries), which in turn points to rmosolgo/graphql-ruby (the Ruby implementation of GraphQL).

I tried upgrading this gem, following "[IMPORTANT] Memory Leak issue with graphql 2.0.17" (Issue #4370 in rmosolgo/graphql-ruby), in hopes of seeing an improvement, but it didn’t help.

Let me try this profiler gem tomorrow morning and update the findings.

Are you able to also reproduce the behavior with Ruby 3.0.x?

Will try intermediate versions with same Gemfile to find a pattern, thanks.

Just try disabling VWA (Variable Width Allocation).

Downgraded from Ruby 3.2.2 to Ruby 3.1.4 and eventually to Ruby 3.0.6.

Same behaviour of gradual memory increase, although YJIT in 3.2 helped slow down the climb a bit.

@zhuzhu Thanks for the suggestion. It can only be set at build time; I don’t see an option to disable it with the prebuilt official Ruby Docker images.

So I did manage to find the issue.

There was a GithubGraphql client which loaded the schema on every initialization and held on to about 18 MB of memory per invocation. I moved the schema loading to a Singleton, and now it is loaded once per Sidekiq process.

---
 lib/github_graphql.rb | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/lib/github_graphql.rb b/lib/github_graphql.rb
index eac193f..81fd230 100644
--- a/lib/github_graphql.rb
+++ b/lib/github_graphql.rb
@@ -6,6 +6,14 @@ class GithubGraphql
   SCHEMA_LOCATION = '/tmp/github_schema.json'
   MAX_ITEMS = 100
 
+  class SchemaLoader
+    include Singleton
+    attr_reader :schema
+    def initialize
+      @schema = GraphQL::Client.load_schema(SCHEMA_LOCATION)
+    end
+  end
+
   def initialize(opts = {})
     @client_opts = opts
     init_client
@@ -40,7 +48,7 @@ def load_schema
       # Don't store schema if there were errors, else one bad token will be persisted and poison rest of the attempts
       File.delete(SCHEMA_LOCATION) if schema['errors'].present?
     end
-    GraphQL::Client.load_schema(SCHEMA_LOCATION)
+    SchemaLoader.instance.schema
   end
 end
 

Heap Profiler dump - Before Singleton (invoked ~10 times)

Total retained: 150.97 MB (886785 objects)

retained memory by gem
-----------------------------------
 133.49 MB  graphql-1.13.19
  13.96 MB  3.2.2/lib
   3.49 MB  graphql-client-0.16.0

Heap Profiler dump - After Singleton (invoked ~10 times)

Total retained: 18.22 MB (157025 objects)

retained memory by gem
-----------------------------------
  13.30 MB  graphql-1.13.19
   3.04 MB  graphql-client-0.16.0
   1.86 MB  3.2.2/lib
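The effect of the Singleton change in the diff above can be reproduced in isolation. The sketch below uses a made-up `FakeSchema` module as a stand-in for `GraphQL::Client.load_schema` (it only counts how often it is called and returns a large array) and a hypothetical `Client` class in place of `GithubGraphql`. It is an illustration of the pattern, not the real application code.

```ruby
require 'singleton'

# Hypothetical stand-in for an expensive schema load.
module FakeSchema
  @loads = 0

  class << self
    attr_reader :loads

    def load(_path)
      @loads += 1
      Array.new(10_000) { |i| "type_#{i}" } # pretend this is a big parsed schema
    end
  end
end

# Loads the schema once per process and hands out the same object afterwards.
class SchemaLoader
  include Singleton
  attr_reader :schema

  def initialize
    @schema = FakeSchema.load('/tmp/github_schema.json')
  end
end

# Hypothetical client: every instance shares the single loaded schema.
class Client
  attr_reader :schema

  def initialize
    @schema = SchemaLoader.instance.schema
  end
end

clients = Array.new(10) { Client.new }
puts "schema loads: #{FakeSchema.loads}"
puts "distinct schema objects: #{clients.map { |c| c.schema.object_id }.uniq.size}"
```

One trade-off of this design is that the schema stays cached for the lifetime of the process, so a refreshed schema file is only picked up after the Sidekiq process restarts.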

However, the interesting question is: why didn’t it cause the same issue in the Ruby 2.x series?

And why is Ruby 3.0.6 more efficient than 3.1.4 and 3.2.2 in terms of memory utilisation?

Another interesting, and more shocking, finding was the Sidekiq performance with Ruby 3.0.6.

The Sidekiq enqueued set was choked and seemed to struggle to drain on Ruby 3.1.x and Ruby 3.2.x.

CPU performance was more jittery but performed better for Ruby 2.x and Ruby 3.0.6

Happy to provide more insights, but in a nutshell,

Ruby 3.0.6 is more stable than Ruby 3.1.4 and Ruby 3.2.2

Been curious since you first had posted. Great to see the detailed breakdown of everything, and understand the nature of the issue.

Thank you for this!

cc - @rafaelfranca

Pinging again, in case this got lost in the pile of notifications.

@Jatin_Dhankhar Hi! Any update on this? I have the same problem with Rails 6.1.7.4, Puma 5.6 and Ruby 3… As a side note, Puma keeps increasing its memory usage without ever releasing it.

Hi @cfgv,

Unfortunately no. We have been running it on Ruby 3.0.6 and will attempt to upgrade to the upcoming 3.3.0 release to see if there are improvements.

The app in question is backend only with Sidekiq and doesn’t have any web server.

Thank you for your answer. We are seeing memory bloat, the server keeps crashing, and we cannot find the reason :frowning:

Unfortunately no. We have been running it on Ruby 3.0.6 and will attempt to upgrade to the upcoming 3.3.0 release to see if there are improvements.

I tried upgrading from Ruby 3.0.6 to Ruby 3.3.0. Unfortunately, the same issue persists.

Sidekiq latency and the enqueued set size shoot up.

CPU utilization: notice how, on Ruby 3.3, the min, max and avg were much closer to each other.

Memory consumption remained consistent.

Something is causing Sidekiq to throttle and perform slowly in Ruby 3.2+ versions