Optimizing Active Record Memory Usage in Large Rails Background Jobs
When engineering teams scale Ruby on Rails applications, background job processors like Sidekiq, Resque, or Solid Queue frequently become the primary consumers of infrastructure resources. Processing large datasets in these background workers often leads to severe memory bloat, resulting in Out-of-Memory (OOM) crashes and escalating cloud hosting costs.
Key takeaways for engineering leaders and developers:
- The Problem: Active Record is designed for developer ergonomics, not strictly for memory efficiency. Loading thousands of records simultaneously instantiates large, memory-heavy Ruby objects.
- The Mechanism: Each Active Record model tracks its attributes, original database state, and association cache. When allocated in bulk, these objects overwhelm the Ruby Garbage Collector (GC), leading to memory fragmentation and permanent bloat.
- Immediate Mitigation: Replacing standard iterators with batch processing methods like find_each, and utilizing pluck to retrieve scalar values without instantiating full model objects.
- Long-Term Strategy: Upgrading to modern Ruby versions (Ruby 3.2 or 3.3) to leverage Variable Width Allocation and improved garbage collection, which structurally reduces the memory overhead of the entire application.
The Architecture of Active Record Memory
To understand why a background job consumes so much RAM, we must first look at what happens when Active Record translates a database row into a Ruby object.
When you execute a query, the PostgreSQL or MySQL adapter returns an array of raw strings and integers. Active Record takes this raw data and allocates an ActiveRecord::Base instance for every single row. These instances are inherently heavy. They maintain a hash of the current attributes, a hash of the original attributes (for tracking changes), an association cache, and numerous internal state flags.
If we ask the database for 50,000 records at once, the Ruby virtual machine must request external memory from the operating system using the C malloc function. It creates 50,000 complex objects, holding them all in the ObjectSpace simultaneously.
This leads directly to memory fragmentation. When the job finishes and the objects are eventually garbage collected, the underlying operating system often cannot easily reclaim the fragmented memory space. As a result, the worker process retains a large memory footprint indefinitely, forcing your cloud provider to terminate the container with an OOM error.
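The scale of the problem is easy to observe in plain Ruby, with no Rails required. In this rough sketch, a lightweight Struct stands in for an Active Record model (real model instances carry far more state, so the true cost is higher):

```ruby
# A lightweight stand-in for an Active Record model; real models also
# carry original-attribute hashes, association caches, and state flags.
Record = Struct.new(:id, :email, :attributes)

GC.start # start from a clean baseline
before = GC.stat(:heap_live_slots)

# Simulate loading 50,000 rows at once: every row becomes a small object
# graph (struct + string + hash), all live in memory simultaneously.
records = Array.new(50_000) do |i|
  Record.new(i, "user#{i}@example.com", { "id" => i })
end

after = GC.stat(:heap_live_slots)
puts "Heap grew by roughly #{after - before} live slots"
```

Each "row" here allocates at least three heap objects, so the live heap grows by well over 100,000 slots before the garbage collector can reclaim anything.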
<section id="the-danger-of-unbounded-queries">
<h1>The Danger of Unbounded Queries</h1>
The most common anti-pattern in Rails background jobs is the unbounded query. Developers often write code that works perfectly in development with a small dataset, but fails in production due to resource exhaustion.
Consider a daily job that emails inactive users:
```ruby
class InactiveUserMailerJob < ApplicationJob
  def perform
    users = User.where(active: false, last_login_at: ...30.days.ago)

    users.each do |user|
      UserMailer.reengagement_email(user).deliver_later
    end
  end
end
```
<p>
In this example, calling <code>.each</code> on the Active Record relation forces the framework to load the entire result set into memory at once. If you have 100,000 inactive users, you have instantiated 100,000 heavy Ruby objects.
</p>
</section>
<section id="practical-strategies-for-memory-reduction">
<h1>Practical Strategies for Memory Reduction</h1>
<p>
We can mitigate this bloat by changing how we instruct Active Record to fetch and instantiate data.
</p>
<section id="batch-processing-with-find_each">
<h2>1. Batch Processing with <code>find_each</code></h2>
<p>
The most practical, durable solution for iterating over large tables is batch processing. Active Record provides <code>find_each</code> and <code>find_in_batches</code> to handle this automatically.
</p>
```ruby
class InactiveUserMailerJob < ApplicationJob
  def perform
    users = User.where(active: false, last_login_at: ...30.days.ago)

    # Loads records in batches of 1,000 by default
    users.find_each do |user|
      UserMailer.reengagement_email(user).deliver_later
    end
  end
end
```
Under the hood, find_each orders the records by their primary key and fetches 1,000 rows at a time using a keyset condition (WHERE id > last_seen_id ... LIMIT 1000), avoiding the increasingly expensive OFFSET scans of naive pagination. The garbage collector can easily clean up the previous 1,000 objects before the next batch is loaded, keeping the job's memory footprint flat and predictable.
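The batching strategy itself can be sketched in a few lines of plain Ruby. Here an array of hashes stands in for the users table, and find_each_sketch is a hypothetical helper (not Rails API) that mimics the keyset loop:

```ruby
# Sketch of find_each's strategy: order by primary key, then repeatedly
# fetch the equivalent of WHERE id > last_id ORDER BY id LIMIT batch_size.
def find_each_sketch(table, batch_size: 1000)
  last_id = 0
  loop do
    # Stand-in for the SQL query Active Record issues for each batch
    batch = table
            .select { |row| row[:id] > last_id }
            .sort_by { |row| row[:id] }
            .first(batch_size)
    break if batch.empty?

    batch.each { |row| yield row }
    last_id = batch.last[:id] # the next batch starts after this key
  end
end

fake_table = (1..2_500).map { |i| { id: i } }
seen = []
find_each_sketch(fake_table, batch_size: 1000) { |row| seen << row[:id] }
puts seen.size # => 2500
```

Only one batch of rows is ever held at a time; the 2,500 fake rows are visited in three queries of 1,000, 1,000, and 500.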
</section>
<section id="bypassing-instantiation-with-pluck">
<h2>2. Bypassing Instantiation with <code>pluck</code></h2>
<p>
If you do not strictly need the full Active Record model, you should avoid creating it entirely. When a background job only needs to trigger an API call or enqueue another job with specific IDs, <code>pluck</code> is the correct tool.
</p>
```ruby
# Highly efficient: returns an array of integers directly from the adapter
user_ids = User.where(active: false).pluck(:id)
```
<p>
By using <code>pluck</code>, we skip the Active Record instantiation pipeline completely. The memory required for an array of integers is trivially small compared to an array of model instances.
</p>
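<p>
The size difference can be measured with the <code>objspace</code> standard library. This is a plain-Ruby approximation: a Struct stands in for a model instance (real Active Record objects are heavier still), so the real-world gap is larger than what this prints:
</p>
```ruby
require 'objspace'

# Stand-in for a full model instance; real AR objects carry more state.
FakeUser = Struct.new(:id, :email, :created_at)

users = Array.new(10_000) { |i| FakeUser.new(i, "user#{i}@example.com", Time.now) }
ids   = users.map(&:id) # what pluck(:id) would give you

model_bytes = users.sum { |u| ObjectSpace.memsize_of(u) + ObjectSpace.memsize_of(u.email) }
id_bytes    = ObjectSpace.memsize_of(ids)

puts "models: ~#{model_bytes} bytes, ids: ~#{id_bytes} bytes"
```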
</section>
<section id="discarding-state-with-select">
<h2>3. Discarding State with <code>select</code></h2>
<p>
At times, you must pass a model instance to a service object or mailer, but you know you will only use a few specific columns. By default, <code>SELECT *</code> retrieves every text column, JSONB payload, and integer in the table.
</p>
<p>You can explicitly limit the memory payload by specifying the columns you need:</p>
```ruby
users = User.select(:id, :email, :first_name).where(active: false)

users.find_each do |user|
  UserMailer.reengagement_email(user).deliver_later
end
```
By omitting heavy columns, such as a bio text field or a preferences JSON payload, the resulting Ruby objects are significantly smaller. Be aware that reading a column you did not select raises ActiveModel::MissingAttributeError, so this technique only works when you know exactly which attributes downstream code touches.
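A plain-Ruby sketch makes the savings concrete. Here two Structs stand in for the same row loaded with and without its heavy columns (the field names are illustrative, not from a real schema):

```ruby
require 'objspace'

FullRow = Struct.new(:id, :email, :first_name, :bio, :preferences)
SlimRow = Struct.new(:id, :email, :first_name)

bio  = "long biography text " * 500 # a heavy text column, ~10 KB
full = FullRow.new(1, "ada@example.com", "Ada", bio, { "theme" => "dark" })
slim = SlimRow.new(1, "ada@example.com", "Ada")

full_bytes = ObjectSpace.memsize_of(full) + ObjectSpace.memsize_of(full.bio)
slim_bytes = ObjectSpace.memsize_of(slim)

puts "full row: ~#{full_bytes} bytes, slim row: ~#{slim_bytes} bytes"
```

Multiplied across a batch of thousands of rows, skipping a single large text column keeps kilobytes per record out of the heap.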
</section>
<section id="releasing-memory-with-garbage-collection-tuning">
<h2>4. Releasing Memory with Garbage Collection Tuning</h2>
<p>
While it is generally recommended to let Ruby manage its own garbage collection, long-running jobs that process millions of records can sometimes outpace the GC's heuristics.
</p>
<p>
If you have exhausted all batching and selective loading strategies and still face memory bloat, you can manually hint the garbage collector between large batches. This is a last resort, but it is effective in highly specific, memory-constrained environments:
</p>
```ruby
User.find_in_batches(batch_size: 2000) do |batch|
  process_batch(batch)

  # Clear the batch from local scope
  batch = nil

  # Manually trigger a major GC cycle
  GC.start
end
```
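The effect of dropping references and forcing a collection can be observed in plain Ruby, without Rails. In this sketch, process_big_batch is a hypothetical stand-in for one batch iteration:

```ruby
GC.start # start from a clean baseline
base = GC.stat(:heap_live_slots)

def process_big_batch
  batch = Array.new(100_000) { |i| "record-#{i}" } # a large in-flight batch
  GC.stat(:heap_live_slots) # peak live slots while the batch exists
end

during = process_big_batch # the batch goes out of scope on return
GC.start                   # manually trigger a major GC cycle
after = GC.stat(:heap_live_slots)

puts "peak: #{during - base} extra slots; after GC.start: #{after - base}"
```

The peak reflects the 100,000 strings held at once; after the explicit major GC, the live-slot count falls back toward the baseline.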
</section>
</section>
<section id="the-impact-of-ruby-version-upgrades">
<h1>The Impact of Ruby Version Upgrades</h1>
<p>
While application-level optimization is necessary, the most comprehensive solution to memory bloat is often upgrading the underlying language and framework.
</p>
<p>
Recent versions of Ruby have introduced significant improvements to memory management. Ruby 3.1 introduced Variable Width Allocation (VWA), stabilized and extended in Ruby 3.2, which allows the VM to store small strings directly inside their heap slot, bypassing the need for external <code>malloc</code> calls entirely. This significantly reduces memory fragmentation across the entire application.
</p>
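<p>
You can get a rough feel for the embedded-versus-external distinction with <code>ObjectSpace.memsize_of</code>. Exact sizes vary by Ruby version, but a string too large to embed always reports its slot plus a separate buffer:
</p>
```ruby
require 'objspace'

short = "abc"       # small enough to be embedded inside its heap slot
long  = "a" * 1_000 # too large to embed: needs a separate malloc'd buffer

puts ObjectSpace.memsize_of(short) # the slot only
puts ObjectSpace.memsize_of(long)  # the slot plus the external buffer
```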
<p>
Furthermore, running your background jobs on older, unsupported versions of Ruby or Rails forces your infrastructure to work harder. By prioritizing a version upgrade, engineering teams often see immediate, structural reductions in memory consumption — directly lowering the required instance sizes on AWS, Heroku, or Render.
</p>
<p>
Ultimately, writing memory-efficient background jobs requires a deliberate approach to data retrieval. By avoiding unbounded queries, leveraging batch processing, and keeping your infrastructure upgraded, you can ensure your background workers remain stable and cost-effective as your application scales.
</p>
</section>