Tuning Your AWS Cloud Infrastructure After a Rails 7 Upgrade

Mar 15, 2026

In the mid-19th century, the transition from sail to steam power revolutionized global shipping. A shipping company, though, could not drop a steam engine into a wooden clipper ship and expect optimal results. The new engines required entirely new infrastructure – specifically, coaling stations across the globe, different hull designs to handle the vibration, and engineers who understood thermodynamics rather than wind patterns. The new propulsion system fundamentally changed the requirements of the vessel and its supporting network.

Similarly, we often treat the deployment of a major framework upgrade as the final step in a modernization project. You merge the pull request, verify the CI/CD pipeline is green, and watch the new Rails 7 application boot in production. However, deploying the code is only the beginning. Moving to Rails 7 and modern Ruby 3.x introduces fundamental shifts in memory management, concurrency, and connection handling. If you leave your existing infrastructure untouched, you risk not only escalating cloud costs but also seeing suboptimal p95 response times.

Before we get into that, though, we must understand why infrastructure fine-tuning is necessary. Modern Ruby introduces the YJIT compiler, which significantly improves execution speed but alters the memory footprint. Furthermore, Rails 7 defaults to modern frontend paradigms like Hotwire, which shift how your application interacts with caches and WebSockets. We need to align your AWS configuration and Ruby VM optimization with these new architectural realities.

Calibrating Memory Limits for YJIT on ECS and EC2

The introduction of the YJIT compiler in Ruby 3.2 is one of the most substantial performance improvements in recent Ruby history. YJIT works by compiling Ruby code into machine code at runtime, which drastically reduces execution time. This performance, however, requires a trade-off in memory consumption. The compiler needs memory to store the generated machine code.¹

One may wonder: if YJIT requires more memory, won’t it immediately crash our containers on boot? The answer is straightforward. Strictly speaking, YJIT does not immediately allocate all of its maximum allowed memory on boot. By default, Ruby allocates a 256 MB heap for YJIT; it then allocates memory pages from this heap dynamically as it compiles methods that are actively called. This means you will observe a gradual increase in memory consumption during the first few hours of a deployment rather than an immediate spike.

Tip: We recommend using a monitoring tool like Datadog or AWS CloudWatch Container Insights to track this memory growth. Set up alerts for when your containers approach their memory limits, so you can proactively adjust your task definitions.

When running your application on Amazon Elastic Container Service (ECS) or directly on Amazon EC2, you should monitor your memory utilization closely after the upgrade. If you previously ran your containers with strict memory limits, the additional overhead from YJIT might trigger Out Of Memory (OOM) kills. This can lead to cascading failures as ECS attempts to restart crashing tasks, overwhelming your remaining healthy tasks. One must assume that any task running near its memory limit before the upgrade will crash after enabling YJIT.

We recommend reviewing your ECS task definitions. You may need to increase the memory reservation for your web and worker containers by 10% to 20% to accommodate the compiler. Conversely, because individual requests complete faster, your application can handle higher throughput per container. This increased efficiency often allows you to reduce the total number of running tasks, ultimately lowering your overall AWS expenditure.

Optimizing Puma Concurrency for AWS Instance Types

Rails 7 ships with a refined Puma configuration that is designed to maximize concurrency. To realize these benefits, you should tune Puma’s worker and thread counts to match the hardware profile of your AWS instances.²

A common pitfall is deploying an application with a hardcoded number of Puma workers to instances with varying vCPU counts. If you use AWS Graviton (ARM64) instances – which we highly recommend for Ruby workloads due to their excellent price-to-performance ratio – the performance characteristics differ from standard x86 architectures.

You should configure Puma to utilize the available hardware dynamically. A standard practice is to match the number of Puma workers to the vCPU count of your ECS task or EC2 instance. This approach ensures that each worker process gets dedicated CPU time, minimizing context switching between processes. For I/O-bound workloads, which are common in web applications, we then use a small number of threads per worker – typically 3 to 5 – to handle concurrent requests while the application waits for network responses.

# config/puma.rb
require 'etc'

# Dynamically set workers based on available vCPUs.
# Falls back to WEB_CONCURRENCY if explicitly set, otherwise uses detected CPU count.
cpu_count = Etc.nprocessors
workers_count = ENV.fetch("WEB_CONCURRENCY", cpu_count).to_i
workers workers_count

# Configure a reasonable thread pool.
# A higher thread count increases concurrency but also memory usage.
max_threads_count = ENV.fetch("RAILS_MAX_THREADS") { 5 }
min_threads_count = ENV.fetch("RAILS_MIN_THREADS") { max_threads_count }
threads min_threads_count, max_threads_count

This configuration ensures your application utilizes the full compute capacity of your AWS infrastructure without causing excessive context switching.

Managing Connection Pooling with Amazon RDS

Rails 7 handles database connections more aggressively, particularly with the introduction of asynchronous queries. As you optimize Puma to handle more concurrent requests, you inherently multiply the number of active database connections.³

Note: Before making changes to your production database configuration, it’s wise to ensure the latest ‘known good’ version of your infrastructure-as-code and application configuration are committed to source control.

Amazon Relational Database Service (RDS), especially when running PostgreSQL, can struggle if the number of active connections exceeds the optimal threshold for your specific instance class. Exhausting the database connection limit will immediately degrade your p95 response times and cause request queuing.

To mitigate this, you should ensure your Active Record connection pool is properly sized relative to your Puma thread count. The traditional rule of thumb is to set the pool size equal to or slightly larger than your maximum Puma threads. However, if your application uses Rails 7’s asynchronous query feature (load_async), you must account for additional connections consumed by background threads executing those async queries. Each async operation requires its own database connection from the pool. In such cases, set the pool size to at least your Puma thread count plus the maximum number of concurrent async queries your application performs (typically the GlobalExecutor concurrency, defaulting to the thread pool size).

# config/database.yml
production:
  <<: *default
  # For standard setups: pool size should match or exceed max threads
  # For async queries: pool size should be threads + async concurrency
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>

Of course, if your architecture scales horizontally to a degree that connection counts still overwhelm the RDS instance, we strongly recommend implementing a connection pooler. There are two major approaches to connection pooling on AWS.

The first is Amazon RDS Proxy, which sits between your application and the database, multiplexing connections and protecting the database from sudden spikes in traffic. This is my preferred method for its seamless integration with RDS and IAM authentication.

The second is deploying PgBouncer. However, it’s critical to understand that a sidecar deployment within each ECS task does NOT solve horizontal scaling issues – as you scale out tasks, you still end up with one PgBouncer per task, resulting in a linear increase in connections to RDS. For horizontal scaling scenarios, you must deploy centralized PgBouncer on dedicated EC2 instances or as a standalone service that all application tasks share. This centralized approach actually pools connections across tasks, reducing the total connection count to RDS.

PgBouncer (centralized deployment) can be more cost-effective for extremely high-throughput workloads and offers more configuration flexibility than RDS Proxy. However, RDS Proxy is simpler to manage as a fully-managed AWS service with built-in high availability and IAM integration.

Scaling Amazon ElastiCache for Hotwire and Action Cable

Rails 7 encourages a fundamental shift in frontend architecture, moving away from heavy JavaScript frameworks in favor of Hotwire and Turbo. For example, Single Page Applications (SPAs) are client-heavy; they download large JavaScript bundles upfront and use JSON APIs to fetch data. Hotwire, on the other hand, is server-heavy; it relies heavily on server-rendered HTML fragments delivered via WebSockets and aggressive caching.

This architectural shift places a new burden on your Redis infrastructure – one that is often underestimated. Action Cable, the engine behind Turbo Streams, requires a robust Pub/Sub mechanism to broadcast updates to connected clients. If you run a single, under-provisioned Amazon ElastiCache for Redis cluster to handle caching, Sidekiq background jobs, and Action Cable, you will likely encounter bottlenecks. A common symptom of this is increased latency in background job processing when many users are interacting with the application via WebSockets.⁴

We suggest separating your Redis workloads. You can monitor metrics like EngineCPUUtilization and SwapUsage in CloudWatch to determine if a single cluster is becoming overloaded. Provision one ElastiCache cluster dedicated to background job processing, and a separate cluster optimized for caching and WebSocket broadcasting. For the caching cluster, review your maxmemory-policy. You typically want an eviction policy like allkeys-lru to ensure the cache does not fill up and reject new keys when serving high volumes of Turbo Stream fragments.

Refining Asset Delivery with Amazon CloudFront

The default asset pipeline in Rails 7 replaces Webpacker with Propshaft, importmap-rails, or jsbundling-rails. This transition changes how your application compiles, hashes, and serves static assets.

While the fundamental mechanism of serving assets remains similar, you should verify your Amazon CloudFront distribution correctly interfaces with the new asset structures. Rails 7 generates assets with robust cache-control headers, allowing you to cache them aggressively at the edge.

You should ensure your CloudFront Cache Behaviors are configured to forward the correct headers and respect the Cache-Control directives set by Rails. You also may notice that importmaps generate many small HTTP requests for individual JavaScript files rather than serving a single large bundle. This approach is designed to improve cacheability and reduce the amount of data browsers need to download when you make small JavaScript changes. The implication here, though, is that a properly configured CloudFront distribution utilizing HTTP/2 or HTTP/3 is critical. It allows the browser to multiplex these requests efficiently without the latency penalty of establishing multiple independent TCP connections.

Securing the Investment

Upgrading a legacy application to Rails 7 requires significant engineering effort. To secure the return on that investment, you must align your cloud infrastructure with the new framework capabilities. Just as a 19th-century shipping company needed coaling stations and new hull designs to truly benefit from steam power, your modern Rails application needs the correct AWS infrastructure to thrive.

By taking these steps, you ensure your application is not only fast and stable but also cost-effective:

Calibrate memory for YJIT to prevent OOM errors and optimize container density.
Tune Puma concurrency to match your AWS instance types.
Manage RDS connections with proper pooling to avoid database bottlenecks.
Scale ElastiCache by separating workloads for caching and background jobs.
Optimize CloudFront to handle modern asset delivery patterns.

Ruby Core Team. 2022. “Ruby 3.2.0 Released.” Ruby Association. https://www.ruby-lang.org/en/news/2022/12/25/ruby-3-2-0-released/ ↩
Puma Development Team. 2023. “Puma: A Ruby Web Server.” puma.io. https://puma.io/ ↩
Amazon Web Services. 2023. “Amazon RDS Proxy.” AWS Documentation. https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/rds-proxy.html ↩
Amazon Web Services. 2023. “Amazon ElastiCache for Redis.” AWS Documentation. https://docs.aws.amazon.com/AmazonElastiCache/latest/redir-ug/Welcome.html ↩

Tuning Your AWS Cloud Infrastructure After a Rails 7 Upgrade

Calibrating Memory Limits for YJIT on ECS and EC2

Optimizing Puma Concurrency for AWS Instance Types

Managing Connection Pooling with Amazon RDS

Scaling Amazon ElastiCache for Hotwire and Action Cable

Refining Asset Delivery with Amazon CloudFront

Securing the Investment

Footnotes

You May Also Like

Transitioning a Monolithic Rails App to Docker and AWS ECS/Fargate

Preparing for Ruby 3.4: New Features and Syntax Changes to Expect

How to Benchmark Cloud Provider Performance for Ruby on Rails Apps