Resolving Flaky Tests Caused by Hardcoded Directories in Rails CI


In the 1880s, the US Census Bureau faced a crisis. The 1880 census had taken eight years to complete, and the 1890 census was projected to take even longer. A solution came from Herman Hollerith, whose tabulating machine automated the count. The machine didn’t do anything the census takers couldn’t do themselves; it just let them do it faster and more accurately.

This is what we’re looking for when we parallelize our tests: we want to do the same thing, but faster. When we discuss Rails test suite speed optimization, the conversation inevitably turns to parallelization. Distributing tests across multiple processes, though, can expose assumptions in the code, leading to unpredictable test suite failures and unreliable Continuous Integration (CI) builds.

Before we get into that, though, let’s establish why this matters. Race conditions in a parallel Minitest suite directly reduce engineering velocity. A flaky test suite forces developers to re-run builds again and again, which erodes trust in the continuous integration process and inflates cloud computing costs. Fixing the underlying flaws is a prerequisite for any meaningful reduction in CI infrastructure costs.

Why Do Tests Fail in Parallel?

When we configure our Rails test_helper.rb to run tests concurrently using parallelize(workers: :number_of_processors), Rails automatically handles database isolation. It provisions a separate test database for each worker process, suffixing the configured database name with the worker number (for example, test-database-0, test-database-1, and so on). This prevents one worker from accidentally deleting a user record that another worker is currently asserting against.

While the database is neatly isolated, the file system is not. All worker processes share the exact same local file system. If multiple workers attempt to read, write, or delete files in the same hardcoded directory simultaneously, they can corrupt each other’s state.

This lack of file system isolation is often the primary cause of flaky tests when transitioning to parallel CI pipelines.

Identifying the Hardcoded Directory Anti-Pattern

Let’s look at a common example of this issue. Tests that interact with temporary files, such as CSV exports, PDF generators, or image processing queues, are a frequent source of this problem. In a serial test suite, writing these files to a static, hardcoded directory is generally safe because only one test executes at a time.

Consider the following test pattern, which is quite common in older applications:

test "generates a monthly financial report" do
  export_dir = Rails.root.join("tmp", "reports")
  FileUtils.mkdir_p(export_dir)
  
  export_path = export_dir.join("financials.csv")
  ReportGenerator.run(export_path)
  
  assert File.exist?(export_path)
  assert_match "Revenue", File.read(export_path)
end

When this test runs in parallel, one worker might begin generating the report. A fraction of a second later, another worker runs a different test that writes to the same path and overwrites the exact same financials.csv file.

This race condition will typically cause intermittent test failures that look something like this:

$ bin/rails test
...snip...
F

Failure:
FinancialReportTest#test_generates_a_monthly_financial_report [/path/to/app/test/models/financial_report_test.rb:10]:
Expected /Revenue/ to match "".

rails test test/models/financial_report_test.rb:4
...snip...

The file may appear empty or contain unexpected data, because the second worker process has begun writing to it while the first process is attempting to read it. Because the failure depends on the exact timing of the operating system’s process scheduler, the test will often pass locally but fail intermittently on your CI server.
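
The interleaving described above can be replayed deterministically in a single process. The following is a minimal sketch, assuming the generator opens its output file in "w" mode (an assumption about ReportGenerator, whose implementation is not shown here); the method name and CSV contents are made up for illustration.

```ruby
require "tmpdir"

# Replays the problematic interleaving step by step and returns what
# "worker A" reads. The CSV contents are made up for illustration.
def simulate_overwrite_race(shared_path)
  # Worker A finishes writing its report.
  File.write(shared_path, "Revenue,100\n")

  # Worker B opens the same path. Mode "w" truncates the file right away,
  # before B has written anything.
  worker_b_file = File.open(shared_path, "w")

  # Worker A runs its assertions at exactly this moment.
  seen_by_worker_a = File.read(shared_path)

  # Worker B eventually writes its own report and closes the file.
  worker_b_file.write("Revenue,250\n")
  worker_b_file.close

  seen_by_worker_a
end

Dir.mktmpdir do |dir|
  contents = simulate_overwrite_race(File.join(dir, "financials.csv"))
  puts contents.inspect # => ""
end
```

The empty string that worker A reads is the same value that appears in the failure message above. In a real suite the window is much narrower, which is why the failure only shows up occasionally.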

The Solution: Environment-Aware File Paths

To resolve these Minitest race conditions, we must ensure that each worker process writes to a completely isolated directory. There are two major approaches to resolving this issue.

The first is utilizing the TEST_ENV_NUMBER environment variable, which requires minimal changes to existing code. The second is using Ruby’s built-in Dir.mktmpdir, which provides stronger guarantees but may require more refactoring. The second option is preferred for new development, though the first option can be very pragmatic when dealing with a large legacy test suite.

Approach 1: Utilizing TEST_ENV_NUMBER

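TEST_ENV_NUMBER is the environment variable that the parallel_tests gem assigns to each worker process, and many CI setups already rely on it. Rails’s built-in parallelize does not appear to export it on its own, so when using the built-in runner we can set it ourselves from the parallelize_setup hook, which receives the worker number. Either way, we can use this variable to dynamically namespace our temporary directories.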

test "generates a monthly financial report" do
  worker_id = ENV.fetch("TEST_ENV_NUMBER", "0")
  export_dir = Rails.root.join("tmp", "reports-#{worker_id}")
  FileUtils.mkdir_p(export_dir)
  
  export_path = export_dir.join("financials.csv")
  ReportGenerator.run(export_path)
  
  assert File.exist?(export_path)
  assert_match "Revenue", File.read(export_path)
end

By appending the worker ID to the directory name (e.g., tmp/reports-1, tmp/reports-2), we guarantee that concurrent processes do not interact with the same files.
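
If many tests need this lookup, or if TEST_ENV_NUMBER turns out not to be set in your worker processes (the variable is a convention popularized by the parallel_tests gem), it may help to centralize the logic. The following is a minimal sketch: worker_scratch_dir is a hypothetical helper name, and the commented parallelize_setup wiring is an assumption about how a suite on Rails’s built-in runner might export the variable.

```ruby
require "fileutils"
require "pathname"

# Hypothetical helper: returns a per-worker scratch directory such as
# tmp/reports-0 or tmp/reports-2, creating it if needed.
def worker_scratch_dir(base_dir, name, env: ENV)
  worker_id = env["TEST_ENV_NUMBER"].to_s
  # Treat an unset or empty variable as worker "0".
  worker_id = "0" if worker_id.empty?

  dir = Pathname(base_dir).join("#{name}-#{worker_id}")
  FileUtils.mkdir_p(dir)
  dir
end

# Possible wiring in test_helper.rb (an assumption, only needed when
# nothing else sets the variable for each worker):
#
#   class ActiveSupport::TestCase
#     parallelize_setup do |worker|
#       ENV["TEST_ENV_NUMBER"] = worker.to_s
#     end
#   end
#
# A test would then call:
#
#   export_dir = worker_scratch_dir(Rails.root.join("tmp"), "reports")
```

Normalizing an empty value matters because a worker whose variable is blank would otherwise write to a directory named reports- with nothing after the hyphen, which works but is easy to misread.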

Approach 2: Using Dir.mktmpdir

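While the TEST_ENV_NUMBER approach works well, a more idiomatic and robust solution is to use Ruby’s standard library. The Dir.mktmpdir method creates a uniquely named temporary directory under the system temp location (Dir.tmpdir), accessible only to the current user. Because a fresh name is generated on every call and the call retries if that name is already taken, two processes never end up sharing a directory.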

When passed a block, Dir.mktmpdir also automatically cleans up the directory and its contents once the block finishes executing. This removes the burden of manual file path management and cleanup.

require "tmpdir"

test "generates a monthly financial report" do
  Dir.mktmpdir do |export_dir|
    export_path = File.join(export_dir, "financials.csv")
    ReportGenerator.run(export_path)
    
    assert File.exist?(export_path)
    assert_match "Revenue", File.read(export_path)
  end
  # The temporary directory is automatically deleted here
end
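
When a test class has many tests that need a scratch directory, wrapping every body in a block can get repetitive. One possible alternative, sketched below, uses the non-block form of Dir.mktmpdir together with explicit cleanup. IsolatedTmpdir and its method names are hypothetical, and the commented wiring assumes the symbol form of setup and teardown in ActiveSupport::TestCase.

```ruby
require "fileutils"
require "tmpdir"

# Hypothetical mixin: gives each test its own scratch directory without
# wrapping the test body in a block.
module IsolatedTmpdir
  attr_reader :export_dir

  def create_export_dir
    # Non-block form: returns the new path and leaves cleanup to the caller.
    @export_dir = Dir.mktmpdir("reports-")
  end

  def remove_export_dir
    FileUtils.remove_entry(@export_dir) if @export_dir && File.directory?(@export_dir)
    @export_dir = nil
  end
end

# Possible wiring in a Rails test class:
#
#   class ReportTest < ActiveSupport::TestCase
#     include IsolatedTmpdir
#     setup :create_export_dir
#     teardown :remove_export_dir
#
#     test "generates a monthly financial report" do
#       export_path = File.join(export_dir, "financials.csv")
#       # ...
#     end
#   end
```

The trade-off is that cleanup is no longer automatic: if the teardown is skipped, the directory stays behind in the system temp location.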

The Risk of Pessimistic Deletion

A related symptom of hardcoded directories is the use of an aggressive teardown block. Developers will often attempt to maintain a clean state by deleting the hardcoded directory after every test.

class ReportTest < ActiveSupport::TestCase
  teardown do
    # This deletes the directory for all parallel workers
    FileUtils.rm_rf(Rails.root.join("tmp", "reports"))
  end
  
  # ... tests ...
end

This works fine in a serial test suite. In a parallel suite, however, if one worker executes this teardown block while another worker is in the middle of writing a file to that directory, the second worker’s test will fail with an Errno::ENOENT (No such file or directory) error.
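
The sequence can be reproduced outside of a test suite. The sketch below assumes a Unix-like system where Kernel#fork is available; the forked process stands in for a second worker, and waiting for it forces the unlucky ordering that normally happens only occasionally. The method name and file contents are made up for illustration.

```ruby
require "fileutils"
require "tmpdir"

def simulate_shared_teardown(shared_dir)
  FileUtils.mkdir_p(shared_dir)
  export_path = File.join(shared_dir, "financials.csv")

  # Worker A has created the directory and is about to write its report.
  # A forked process stands in for worker B, whose teardown runs first.
  teardown_pid = fork { FileUtils.rm_rf(shared_dir) }
  Process.wait(teardown_pid)

  # Worker A's write now targets a directory that no longer exists.
  File.write(export_path, "Revenue,100\n")
  :written
rescue Errno::ENOENT => error
  error.class
end

Dir.mktmpdir do |base|
  puts simulate_shared_teardown(File.join(base, "reports")) # => Errno::ENOENT
end
```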

By migrating to isolated worker directories — either via TEST_ENV_NUMBER or Dir.mktmpdir — we eliminate the need for global deletion, further stabilizing the test suite.

A Strategy for Finding Hardcoded Paths

Auditing a large test suite for hardcoded paths can seem daunting. A good starting point is to search your test directory for common file system operations, with commands like these:

$ grep -r "FileUtils" "test/"
$ grep -r "File.open" "test/"
$ grep -r "mkdir" "test/"

This will often reveal tests that are manipulating the file system directly. From there, you can examine each case and apply one of the two solutions we’ve discussed.
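
The same audit can be scripted in Ruby when a single report with line numbers is more convenient. This is a rough sketch rather than a complete detector: the pattern list is an assumption about what is worth flagging, and find_file_system_calls is a hypothetical name.

```ruby
# Assumed list of calls worth reviewing; extend it to suit the codebase.
FILE_SYSTEM_CALLS = /FileUtils\.|File\.(?:open|write)|mkdir|Rails\.root\.join\(["']tmp["']/

# Returns one "path:line: source" string per matching line under test_dir.
def find_file_system_calls(test_dir)
  Dir.glob(File.join(test_dir, "**", "*.rb")).sort.flat_map do |path|
    File.foreach(path).with_index(1).filter_map do |line, line_number|
      "#{path}:#{line_number}: #{line.strip}" if line.match?(FILE_SYSTEM_CALLS)
    end
  end
end

# Run from the application root.
puts find_file_system_calls("test") if File.directory?("test")
```

Every hit still needs a human decision: a match only means the test touches the file system, not that its path is shared between workers.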

Long-Term CI Stability

The goal of a test suite is not merely to accumulate thousands of tests, but to have tests that provide consistent, deterministic feedback. Resolving hardcoded directories requires a deliberate, methodical audit of your codebase. The return on investment, though, is substantial. By eliminating file system race conditions, we unlock the full potential of parallelization, reducing CI execution times, lowering infrastructure costs, and restoring developer trust in the build pipeline. This, in turn, allows us to ship more reliable software, faster.
