The go-to resource for upgrading Ruby, Rails, and your dependencies.

Reducing P95 Response Times in Rails via Database Table Partitioning


In 1888, the United States Census Bureau faced a critical data management crisis. The 1880 census had taken eight years to tabulate by hand. With a rapidly growing population, administrators projected the 1890 census would require over a decade to process. The data would be obsolete before the government could publish it. The solution arrived via Herman Hollerith’s electromechanical tabulating machine. Instead of processing a single monolithic stack of records sequentially, operators used the machine to physically sort punch cards into separate, specialized bins based on specific criteria before counting them. This physical partitioning reduced the tabulation time from years to months.

Today, we face a parallel crisis of scale in software engineering. A Ruby on Rails application might launch with a single, large database table for storing event logs, webhooks, or user metrics. Over time, though, that table often grows to hundreds of millions of rows.

When we query these large tables, the database engine struggles to keep the relevant indexes in memory. This degrades query execution times, leading to unacceptable p95 response times. The data processing becomes the primary bottleneck constraining our Rails application’s scalability.

The Economics of Query Latency

Before we get into the mechanics of database partitioning, we should understand the economic and operational context of the problem.

When an Active Record query executes against a table containing hundreds of gigabytes of data, PostgreSQL attempts to use an index to locate the relevant rows. If the active portion of that index exceeds the available RAM on the database server, the engine falls back to reading from disk. Disk operations are orders of magnitude slower than memory access.

This latency manifests directly in the application layer. When the database is slow, the Ruby web server worker remains occupied waiting for the response. A single slow query can cause a cascade of queued requests, spiking our p95 response times and severely degrading performance.

A common reaction to this performance degradation is infrastructure fine-tuning. We might scale up our AWS footprint, provisioning larger database instances with more RAM or adding read replicas. Vertical scaling provides immediate relief, but it creates a permanent increase in cloud infrastructure costs without resolving the fundamental architectural issue. A thorough performance analysis will frequently identify these monolithic tables as the root cause of the bottleneck.

Conceptualizing Table Partitioning

Database partitioning applies Hollerith’s principle to relational data. Instead of storing all records in one large table, PostgreSQL distributes the data across multiple smaller, specialized tables — known as child tables — based on a specific rule. You can think of this as physically sorting our data into those specialized bins based on specific criteria.

With declarative partitioning, we define a parent table and specify a partition key. For time-series data like audit logs or metrics, the partition key is typically a timestamp column. The database engine then creates child tables representing specific ranges, such as one table per month.

When our Ruby on Rails application executes a query, it interacts exclusively with the parent table. The PostgreSQL query planner evaluates the WHERE clause, identifies which child tables contain the relevant data ranges, and entirely ignores the rest. This mechanism, known as partition pruning, reduces the amount of data the engine needs to scan. The smaller indexes for the individual child tables fit comfortably in memory, ensuring rapid query execution.
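We can observe partition pruning directly by asking the planner for its execution plan. A hedged sketch against a monthly-partitioned audit_logs table: with a WHERE clause on the partition key, the plan should list only the matching child table, not its siblings.

```sql
-- Ask the planner which partitions it will scan. Pruning means the
-- plan mentions only the child table covering this range
-- (e.g. audit_logs_2024_01), never the other months.
EXPLAIN
SELECT *
FROM audit_logs
WHERE created_at >= '2024-01-01' AND created_at < '2024-02-01';
```

If the plan still shows every child table, the WHERE clause likely wraps the partition key in a function or casts it in a way the planner cannot prune on.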

Implementing Partitioning in Rails

To implement declarative partitioning in a Rails application, we leverage native PostgreSQL features through raw SQL in our migrations. Third-party gems provide DSLs for this, but relying on raw SQL guarantees explicit control over the database schema and reduces external dependencies, so that is the approach we take here.

For example, if we had an AuditLog model that tracks user activity, we might partition this table by the created_at timestamp.

First, we create the parent table. Note that we must specify the partition strategy during creation.

class CreatePartitionedAuditLogs < ActiveRecord::Migration[7.1]
  def up
    execute <<-SQL
      CREATE TABLE audit_logs (
        id bigserial NOT NULL,
        user_id bigint NOT NULL,
        action character varying NOT NULL,
        created_at timestamp(6) without time zone NOT NULL,
        updated_at timestamp(6) without time zone NOT NULL
      ) PARTITION BY RANGE (created_at);
    SQL
    
    # We must add a primary key that includes the partition key
    execute <<-SQL
      ALTER TABLE audit_logs ADD PRIMARY KEY (id, created_at);
    SQL
  end

  def down
    drop_table :audit_logs
  end
end

The parent table itself stores no data; it acts as a routing mechanism. Next, we must create the specific child partitions to hold the records.

class CreateAuditLogPartitions < ActiveRecord::Migration[7.1]
  def up
    execute <<-SQL
      CREATE TABLE audit_logs_2024_01 PARTITION OF audit_logs
      FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
    SQL

    execute <<-SQL
      CREATE TABLE audit_logs_2024_02 PARTITION OF audit_logs
      FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
    SQL
  end

  def down
    execute "DROP TABLE audit_logs_2024_01"
    execute "DROP TABLE audit_logs_2024_02"
  end
end

From the perspective of Active Record, the AuditLog model requires no special configuration. Our application logic remains unchanged.

class AuditLog < ApplicationRecord
  # Active Record will seamlessly query the partitioned table
  # PostgreSQL handles the routing internally
end

When we execute AuditLog.where(created_at: Date.new(2024, 1, 15).all_day), PostgreSQL prunes the audit_logs_2024_02 partition from the execution plan entirely, scanning only the relevant January index. (Note that all_day is defined on Date and Time, not on String, so the value must be a date object.)
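One operational detail the migrations above leave open: a monthly partition must exist before the first row for that month arrives, so teams typically automate partition creation from a scheduled job. A minimal sketch of a helper that builds the DDL for a given month; the audit_logs table name matches the migrations above, while the helper itself is hypothetical:

```ruby
require "date"

# Builds the CREATE TABLE ... PARTITION OF statement for one month of
# the audit_logs table. Intended to be run ahead of time from a
# scheduled job (cron, a rake task, etc.) so the partition exists
# before any row for that month is inserted.
module AuditLogPartitioner
  def self.ddl_for(date)
    month_start = Date.new(date.year, date.month, 1)
    next_month  = month_start.next_month
    name        = "audit_logs_#{month_start.strftime('%Y_%m')}"

    <<~SQL
      CREATE TABLE IF NOT EXISTS #{name} PARTITION OF audit_logs
      FOR VALUES FROM ('#{month_start}') TO ('#{next_month}');
    SQL
  end
end

puts AuditLogPartitioner.ddl_for(Date.new(2024, 3, 10))
```

Inside Rails, the generated statement could be passed to ActiveRecord::Base.connection.execute; IF NOT EXISTS makes the job safe to re-run.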

Evaluating Alternatives and Trade-offs

Partitioning is a structural optimization that alters the underlying storage mechanism. It is not without consequence. Before adopting this approach, we should evaluate the trade-offs and alternative solutions.

There are three major approaches to handling large datasets in a database.

The first approach is table partitioning, as we’ve discussed. A primary constraint of PostgreSQL partitioning is that any unique index, including the primary key, must include the partition key. This changes how we define uniqueness guarantees across the dataset. Additionally, foreign key constraints referencing partitioned tables require PostgreSQL 12 or later and can introduce complexity when archiving old data.
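To make the uniqueness constraint concrete, here is a hedged sketch using a hypothetical request_id column on audit_logs: a single-column unique index that would be routine on a conventional table is rejected on a partitioned one.

```sql
-- Rejected by PostgreSQL: the unique index omits the partition key.
-- CREATE UNIQUE INDEX idx_audit_logs_request ON audit_logs (request_id);

-- Accepted: the index includes created_at. Note this enforces
-- uniqueness of the (request_id, created_at) pair, a weaker guarantee
-- than global uniqueness on request_id alone.
CREATE UNIQUE INDEX idx_audit_logs_request
  ON audit_logs (request_id, created_at);
```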

The second approach is application-level sharding — routing queries to entirely separate databases based on a tenant ID or region. This provides significant horizontal scaling. However, sharding introduces immense complexity into the application layer, requiring specialized connection management and making cross-shard queries exceedingly difficult.

The third option is to implement an aggressive data archiving strategy. If historical data is rarely accessed, we can move it to cold storage and delete it from the primary database. This keeps the primary tables small but sacrifices immediate read access to older records.

Generally speaking, partitioning is a great middle ground. It keeps our application logic straightforward while ensuring the database engine can manage the data efficiently.
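Partitioning also softens the archiving trade-off mentioned above: retiring a month of data becomes a metadata operation rather than a bulk DELETE. A hedged sketch, reusing the January partition created earlier:

```sql
-- Detach the old partition. The table and its rows survive, but the
-- parent stops routing queries to it, so it can be dumped to cold
-- storage at leisure.
ALTER TABLE audit_logs DETACH PARTITION audit_logs_2024_01;

-- Once archived, dropping it reclaims space instantly, with none of
-- the bloat a mass DELETE would leave behind.
DROP TABLE audit_logs_2024_01;
```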

Conclusion

As applications scale, data volume eventually outgrows the capabilities of standard indexing. By implementing PostgreSQL table partitioning, we align our database architecture with actual query patterns. This structural change ensures indexes remain efficient, reducing p95 response times and stabilizing performance.

Rather than perpetually increasing cloud infrastructure costs to mask the symptoms of scale, partitioning resolves the underlying bottlenecks directly at the data layer. By planning for the long-term maintainability of our data architecture, we ensure our applications remain responsive and sustainable for years to come.

Sponsored by Durable Programming

Need help maintaining or upgrading your Ruby on Rails application? Durable Programming specializes in keeping Rails apps secure, performant, and up-to-date.

Hire Durable Programming