Automating Database Backups And Rollbacks During Cloud Platform Transitions
In late May of 1886, the rail network of the southern United States faced a massive interoperability problem. The southern states ran on a 5-foot broad gauge, while the North used the 4-foot-8.5-inch standard gauge. Trains could not cross between the two networks without stopping to unload and reload all cargo, which was slow and expensive. What would they do?
A solution came in the form of a massive, meticulously coordinated effort. Over two days, tens of thousands of workers shifted one rail of every track inward by about three inches across thousands of miles, bringing the South to a gauge compatible with the northern network. They prepared extensively, staging contingencies and supplies so that if any segment failed, critical traffic could keep moving safely.
Similarly, migrating a Ruby on Rails application’s database across cloud platforms requires the same kind of careful preparation and execution. When moving from one provider to another — from Heroku to AWS, for instance — we need a way to ensure our data makes the journey intact, with a reliable, automated plan to retreat if the migration encounters unexpected issues. It’s not just about moving data; it’s about ensuring that we can safely roll back to a known good state.
Before we get into the mechanics of a tool like pg_dump, though, let’s take a step back and talk about the nature of the data we are moving. The strategies we choose depend entirely on our tolerance for downtime and data loss, and understanding those constraints is the first step in a successful migration.
Exploring Alternatives
When it comes to moving a PostgreSQL database between cloud providers, we have a few well-established methods.
The first is logical replication. This approach involves setting up a publisher on the source database and a subscriber on the destination. It replicates changes continuously, which allows for near-zero downtime migrations. It’s powerful but adds significant operational complexity.
The second option is using a standard backup and restore process via pg_dump and pg_restore. This requires taking the application offline, capturing a complete snapshot of the data, and restoring it on the new platform. It’s a conceptually simpler process that results in a single, portable backup file.
A third option is physical replication. However, since this relies on byte-for-byte disk copying and is rarely supported when moving between different managed cloud providers, we won’t be discussing it at length here.
Generally speaking, logical replication is excellent for very large databases where downtime must be strictly minimized. For small to medium applications, though, the pg_dump and pg_restore approach often makes more sense. It is less complex to configure, easier to verify, and provides a clean, portable artifact that is ideal for migration and rollback scenarios. For these reasons, it’s the method I personally prefer, and it’s the one we will focus on.
The Concept of Point-in-Time Recovery (PITR)
Strictly speaking, a pg_dump file is a snapshot of a specific moment. This is a different approach from Point-in-Time Recovery, or PITR, which is a feature many managed cloud database providers offer. PITR combines a base backup with a continuous archive of transaction logs, known as Write-Ahead Logs (WAL) in PostgreSQL.
This allows administrators to restore a database to any exact second before a catastrophic event. It’s an incredibly powerful tool for operational recovery — for instance, recovering from an accidental data deletion that happened five minutes ago.
Of course, relying solely on a cloud provider’s internal PITR during a platform transition presents a risk. If we are moving from Provider A to Provider B, we need a portable backup that we control, one that is independent of either provider’s proprietary recovery mechanisms. A pg_dump file is exactly that: a self-contained, portable artifact that we can use to restore our database anywhere. We cannot roll back a failed migration on Provider B using Provider B’s PITR if the very data we loaded was corrupted from the start. For a migration, our pg_dump file is our source of truth.
Practical Example: Backup with pg_dump
Let’s walk through how we might script the creation of a reliable, portable backup using pg_dump.
For example, if you had a Rails application connected to a PostgreSQL database, you would want to capture a custom-format dump. This format is compressed and allows for flexible restoration. Here’s a command you might include in a migration script:
$ pg_dump --format=custom --no-owner --no-acl --file=production_migration.dump postgres://user:pass@old-cloud-host:5432/app_db
This command connects to the old cloud host and creates a file named production_migration.dump. We exclude owners and access control lists (ACLs) because the database user roles on the new cloud platform are often designed to be different, and preserving old roles can cause restoration errors. It’s a small detail, but one that prevents a common source of migration friction.
You may also notice that we are passing the connection string directly. In practice, you would keep credentials in environment variables rather than typing them into your shell. This command gives us a single file, production_migration.dump, that contains everything we need.
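To make that repeatable, you might wrap the command in a small Ruby helper that builds the pg_dump invocation from a connection string held in the environment. Here is a minimal sketch; the SOURCE_DATABASE_URL variable, the script path, and the file-naming scheme are assumptions, not part of any standard tooling:

```ruby
# bin/migration_backup.rb (hypothetical helper script)
require "shellwords"

# Build the pg_dump invocation from a connection string kept in the
# environment, so credentials never land in scripts or shell history.
def backup_command(source_url, timestamp: Time.now.utc)
  # Timestamped file names keep repeated rehearsal runs from
  # overwriting each other.
  file = "production_migration_#{timestamp.strftime('%Y%m%d%H%M%S')}.dump"
  [
    "pg_dump",
    "--format=custom", # compressed custom format, flexible on restore
    "--no-owner",      # skip ownership; roles differ across platforms
    "--no-acl",        # skip access control lists for the same reason
    "--file=#{file}",
    Shellwords.escape(source_url)
  ].join(" ")
end
```

You might then invoke it from a Rake task or deploy script with something like `system(backup_command(ENV.fetch("SOURCE_DATABASE_URL")))`, so the script fails fast if the variable is unset.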
Enforcing a Read-Only State
Before executing the final migration backup, we need to ensure our application is not writing new data while pg_dump is running. A common technique is to put the application into a maintenance mode that enforces a read-only state for the database.
One may wonder: how can we do this effectively in Rails? One approach is to create a middleware that intercepts every non-GET request and returns a 503 Service Unavailable status.
Here’s a simple example of what that middleware might look like. You could place this in app/middleware/read_only_mode.rb:
class ReadOnlyMode
  def initialize(app)
    @app = app
  end

  def call(env)
    request = Rack::Request.new(env)

    if request.get?
      # Reads pass through untouched.
      @app.call(env)
    else
      # Reject anything that could write during the migration window.
      body = { error: "Application is in read-only mode for maintenance." }.to_json
      [503, { "Content-Type" => "application/json" }, [body]]
    end
  end
end
You would then enable this middleware in config/application.rb during your migration window, typically controlled by an environment variable. This ensures that no new data can be written, giving you a clean and consistent state for your backup.
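As a sketch, that toggle in config/application.rb might look like the following; the MAINTENANCE_READ_ONLY variable name and the OurRailsApp module are assumptions, so adapt them to your application:

```ruby
# config/application.rb (excerpt)
# The middleware file must be required explicitly, since autoloading
# is not yet available when middleware is configured at boot.
require_relative "../app/middleware/read_only_mode"

module OurRailsApp
  class Application < Rails::Application
    # Insert the middleware only when the maintenance flag is set,
    # so ordinary deploys are unaffected.
    config.middleware.use ReadOnlyMode if ENV["MAINTENANCE_READ_ONLY"] == "1"
  end
end
```

Setting the variable and restarting the application then flips the whole app into read-only mode; unsetting it restores normal operation.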
Scripting the Rollback
A migration plan is incomplete without an automated and tested rollback procedure. If the new cloud environment fails to perform as expected — perhaps due to unexpected latency or configuration errors — we must be able to revert to the original state quickly and reliably.
A good rollback script needs to stop writes to the new environment and redirect the application back to the old one. If we kept the old database untouched and in read-only mode during the transition, the rollback can be as simple as pointing the application back at the original database.
For a Heroku application, this might look like this:
#!/bin/bash
# rollback-to-old-db.sh
set -euo pipefail

# OLD_DATABASE_URL holds the original connection string, so no
# credentials are hard-coded in this script.
: "${OLD_DATABASE_URL:?OLD_DATABASE_URL must be set}"

echo "Rolling back to original database..."
# Heroku restarts the app automatically when the config var changes.
heroku config:set DATABASE_URL="$OLD_DATABASE_URL" -a our-rails-app
echo "Rollback complete. Application is pointing to the old database."
When run, the output would look something like this:
$ ./rollback-to-old-db.sh
Rolling back to original database...
Setting DATABASE_URL and restarting our-rails-app... done
Rollback complete. Application is pointing to the old database.
Tip: It’s wise to keep the old database running in a read-only state for a period after the migration is deemed successful. This provides a safety net in case subtle issues are discovered hours or even days later.
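One way to enforce that safety net at the database level, rather than through application middleware, is to change the old database's default transaction mode. Here is a sketch, run from a console connected as a sufficiently privileged user; app_db is the hypothetical database name, and note that only new sessions pick up the setting:

```ruby
# Make every new session on the OLD database default to read-only.
# Existing connections keep their current mode until they reconnect.
ActiveRecord::Base.connection.execute(
  "ALTER DATABASE app_db SET default_transaction_read_only = on;"
)
```

This keeps the old database queryable for comparison and emergency rollback while guaranteeing it cannot drift away from the snapshot you migrated.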
If, however, the rollback requires restoring the data to a fresh instance (perhaps the old one was decommissioned too quickly), we would use pg_restore:
$ pg_restore --clean --if-exists --no-owner --no-acl --dbname=postgres://user:pass@fallback-host:5432/app_db production_migration.dump
The --clean flag drops existing database objects before recreating them, ensuring we return exactly to the state captured in the backup. This is the key to a consistent recovery.
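After any restore, it is worth a quick sanity check before flipping traffic. One lightweight approach, sketched below, is to compare per-table row counts gathered from each database; how you collect the counts (ActiveRecord, psql, etc.) is up to you, and the helper name is hypothetical:

```ruby
# Compare per-table row counts from the source and restored databases
# and return the tables whose counts differ.
def count_mismatches(source_counts, restored_counts)
  (source_counts.keys | restored_counts.keys).each_with_object({}) do |table, diff|
    old_count = source_counts[table]
    new_count = restored_counts[table]
    diff[table] = { source: old_count, restored: new_count } if old_count != new_count
  end
end

# Example input: counts gathered from each side, e.g. with
#   SELECT relname, n_live_tup FROM pg_stat_user_tables;
# (n_live_tup is an estimate; use SELECT COUNT(*) for exact checks.)
source   = { "users" => 1_204, "orders" => 98_311 }
restored = { "users" => 1_204, "orders" => 98_309 }

mismatches = count_mismatches(source, restored)
# A non-empty result means the restore needs investigation before cutover.
```

Because the application was read-only during the dump, the counts should match exactly; any discrepancy is a signal to stop and investigate.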
Read-only Verification
Before executing the migration backup, we need to verify that writes are actually blocked. We can confirm this from the Rails console by switching the session to read-only mode and attempting a write:
$ rails console
irb(main):001> ActiveRecord::Base.connection.execute("SET default_transaction_read_only = 'on';")
irb(main):002> User.create!(email: "test@example.com")
The expected output confirms the read-only state:
ActiveRecord::StatementInvalid: PG::ReadOnlySqlTransaction: ERROR: cannot execute INSERT in a read-only transaction
As we can see, attempting to write data raises a ReadOnlySqlTransaction error. This confirms that our enforcement is active.
Of course, this raises the question of exactly how we implement this system-wide during the migration window. We would typically alter the database user’s default transaction mode, or, as we saw earlier, use a middleware layer to reject writes. The key is to verify the read-only state before starting the backup.
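A sketch of that role-level approach might look like this; app_user is a hypothetical role name, and the setting only affects new sessions, so existing connections must reconnect to pick it up:

```ruby
# Run as a superuser (or the role's owner) against the source database.
# New sessions opened by app_user will default to read-only transactions.
ActiveRecord::Base.connection.execute(
  "ALTER ROLE app_user SET default_transaction_read_only = on;"
)

# After the migration succeeds (or after a rollback), restore writes with:
#   ALTER ROLE app_user RESET default_transaction_read_only;
```

Pairing this database-side guard with the middleware gives defense in depth: even a stray background job that bypasses the web tier cannot write during the backup.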
Conclusion: Confidence Through Automation
Migrating a database between cloud platforms, much like standardizing a rail network, is an exercise in managing complexity. The goal is not just to move data, but to do so with confidence, knowing that you have a reliable and tested plan for both the migration and its potential rollback.
By scripting the process, we turn a high-stakes manual effort into a repeatable, predictable operation. A portable pg_dump backup serves as our source of truth, a read-only state ensures data consistency, and an automated rollback script provides a critical safety net. These elements, when combined, allow us to perform complex platform transitions with the precision and control required for modern infrastructure.