
Migrating 50TB at 100k RPS Without Downtime — A Practical System Design Breakdown#

Note: This blog is a summary of the YouTube video "System Design: Migrating 50TB at 100k RPS (The Hard Way)" by FAANG Senior Engineer.


The Problem#

Imagine this:

  • A 50TB Oracle database

  • A monolithic application

  • Handling 100,000 writes per second (100k RPS)

  • Requirement:

    • ✅ Zero downtime
    • ✅ Zero data loss
    • ✅ Migration to microservices

At this scale, the physics of distributed systems start to matter. Small mistakes become catastrophic.


Why Naive Approaches Fail#

1️⃣ "Just Take a Database Dump"#

Let's do the math.

Even with a dedicated 10 Gbps connection, transferring 50TB would take around 11–12 hours. At 100k writes per second:

100,000 × 60 × 60 × 12 ≈ 4.3 billion writes

That's over 4 billion lost writes during downtime.

Offline migration is simply not an option.
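These numbers can be sanity-checked in a few lines, assuming the link runs at the full 10 Gbps with no protocol overhead:

```python
TB = 1e12            # bytes
GBPS = 1e9           # bits per second

size_bits = 50 * TB * 8                  # 50 TB expressed in bits
seconds = size_bits / (10 * GBPS)        # transfer time on a 10 Gbps link
hours = seconds / 3600

lost_writes = 100_000 * seconds          # writes arriving during the transfer

print(f"{hours:.1f} hours, {lost_writes / 1e9:.1f} billion lost writes")
```

In practice the link never runs at line rate, so the real window (and the real write loss) is larger.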


2️⃣ "Just Do Dual Writes"#

Another common idea: modify the application to write to both the old and the new database.

This is a trap.

Problems:

  • Distributed transaction complexity (Two Generals Problem)
  • Network blips create inconsistencies
  • 100ms outage at 100k RPS → 10,000 failed writes
  • Latency increases
  • Monolith becomes even more brittle

At this scale, application-level dual writes are dangerous.
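The failure mode is easy to demonstrate. In this sketch (the `FlakyStore` class is purely illustrative), the first write commits, the second hits a blip, and the two stores silently diverge:

```python
class FlakyStore:
    """Hypothetical store whose next write can fail mid-flight (a network blip)."""
    def __init__(self):
        self.rows = {}
        self.fail_next = False

    def write(self, key, value):
        if self.fail_next:
            self.fail_next = False
            raise ConnectionError("network blip")
        self.rows[key] = value

old_db, new_db = FlakyStore(), FlakyStore()

def dual_write(key, value):
    old_db.write(key, value)   # commits to the old database...
    new_db.write(key, value)   # ...but this second hop can still fail

new_db.fail_next = True
try:
    dual_write("user:1", {"balance": 50})
except ConnectionError:
    pass  # without a distributed transaction, there is nothing safe to do here

# The stores now disagree, and the application has no record of the divergence.
print(old_db.rows.get("user:1"), new_db.rows.get("user:1"))
```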


First Principles: A Database Is a Log#

The key insight comes from Jay Kreps' paper on The Log.

A database table is not the source of truth. The transaction log is.

  • Oracle → Redo Log
  • Postgres → WAL (Write-Ahead Log)

Every write is appended to a log:

  • Insert X
  • Update Y
  • Delete Z

It's a stream. And that stream becomes the foundation of the migration strategy.
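The "table is a view over the log" idea can be sketched in a few lines: the table is nothing more than a fold over the event stream, and replaying the same log always reconstructs the same state (the event tuples here are illustrative, not Oracle's redo format):

```python
# An append-only log of write events; the table is derived from it.
log = [
    ("insert", "X", {"qty": 1}),
    ("insert", "Y", {"qty": 2}),
    ("update", "Y", {"qty": 5}),
    ("delete", "X", None),
]

def replay(events):
    table = {}
    for op, key, row in events:
        if op == "delete":
            table.pop(key, None)
        else:                     # insert and update both set the latest row state
            table[key] = row
    return table

print(replay(log))                # the "table" is just a fold over the stream
```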


The Real Strategy#

The solution combines three major ideas:

  1. Change Data Capture (CDC)
  2. Kafka as a buffer
  3. Strangler Fig Pattern

Step 1: Build a Change Data Capture Pipeline#

Instead of writing to two databases at the app layer:

  • Use a CDC tool like Debezium
  • It tails the Oracle redo log (like a replica)
  • Converts changes into events (JSON/Avro)
  • Publishes them to Apache Kafka
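A Debezium connector is typically registered by POSTing a JSON config to the Kafka Connect REST API. A sketch of what that might look like for Oracle (hostname, credentials, and table list are placeholders; see the Debezium docs for the full option set):

```python
import json

# Placeholder values throughout; the property names follow Debezium's
# Oracle connector configuration.
connector = {
    "name": "orders-cdc",
    "config": {
        "connector.class": "io.debezium.connector.oracle.OracleConnector",
        "database.hostname": "oracle.internal",
        "database.port": "1521",
        "database.user": "dbz",
        "database.password": "secret",
        "database.dbname": "ORCLCDB",
        "topic.prefix": "legacy",
        "table.include.list": "APP.ORDERS,APP.USERS",
        "snapshot.mode": "initial",  # snapshot existing rows, then stream the redo log
    },
}

# POSTing this to Kafka Connect would start the pipeline, e.g.:
# requests.post("http://connect:8083/connectors", json=connector)
print(json.dumps(connector["name"]))
```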

Why Kafka?#

Because at 100k RPS:

  • You need a durable buffer
  • You need backpressure handling
  • You need fault tolerance

If the new microservice crashes, Kafka absorbs the load. No data loss. No pressure on Oracle.


Step 2: Handle the Existing 50TB (Snapshot + Replay)#

CDC only captures new writes. But what about the existing 50TB?

Debezium can:

  • Take an initial snapshot
  • Read tables chunk-by-chunk
  • While simultaneously tracking log position

Flow:

  1. Snapshot starts at T0
  2. Snapshot may take days
  3. Meanwhile, all new writes go to Kafka
  4. When snapshot completes, the new DB replays Kafka events starting from T0

Result: a near real-time replica, possibly only milliseconds behind.
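Conceptually, the replay step skips anything at or before the log position recorded at T0 (Debezium coordinates this internally; the sketch below just shows the idea with made-up offsets):

```python
# Hypothetical log offsets: the snapshot recorded the position it started at (T0).
SNAPSHOT_POSITION = 1000

def replay_after_snapshot(events, snapshot_pos, table):
    """Apply only events newer than the snapshot's log position."""
    for pos, key, row in events:
        if pos <= snapshot_pos:
            continue              # already reflected in the snapshotted rows
        table[key] = row
    return table

table = {"u1": {"balance": 10}}           # state copied by the snapshot
stream = [
    (999,  "u1", {"balance": 10}),        # pre-snapshot event: skip
    (1001, "u1", {"balance": 25}),        # post-snapshot: apply
    (1002, "u2", {"balance": 5}),
]
print(replay_after_snapshot(stream, SNAPSHOT_POSITION, table))
```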


Step 3: Scaling Concurrency Correctly#

At 100k RPS, one consumer won't work. You must:

  • Partition Kafka topics by user ID (or logical key)

This ensures parallel processing with order preserved per user.
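The ordering guarantee comes from a stable key-to-partition mapping: every event for the same user hashes to the same partition, so partitions can be consumed in parallel while each user's events stay in order. A minimal sketch (Kafka's default partitioner hashes the message key similarly; `md5` here just keeps the demo deterministic):

```python
import hashlib

NUM_PARTITIONS = 12

def partition_for(user_id: str) -> int:
    """Stable key -> partition mapping."""
    digest = hashlib.md5(user_id.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS

# Every event for the same user lands on the same partition.
events = ["u42", "u7", "u42", "u42", "u7"]
print([partition_for(u) for u in events])
```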


Critical Requirement: Idempotency#

Kafka may deliver messages twice. Your new service must handle that safely.

Safe:

Set balance = 50

Dangerous:

Subtract 10

Prefer full state over deltas whenever possible.
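The difference is easy to see when the same message arrives twice. Applying a full-state event is idempotent; applying a delta is not:

```python
def apply_full_state(table, event):
    """Safe: the event carries the complete new row, so replays converge."""
    table[event["key"]] = event["state"]

def apply_delta(table, event):
    """Dangerous: a re-delivered message applies the change twice."""
    table[event["key"]]["balance"] += event["delta"]

safe  = {"u1": {"balance": 60}}
risky = {"u1": {"balance": 60}}
full  = {"key": "u1", "state": {"balance": 50}}
delta = {"key": "u1", "delta": -10}

for _ in range(2):                  # Kafka re-delivers the same message
    apply_full_state(safe, full)
    apply_delta(risky, delta)

print(safe["u1"], risky["u1"])      # safe converges to 50; risky drifts to 40
```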


Step 4: Move Traffic — The Strangler Fig Pattern#

Use a proxy (Envoy/Nginx):

  1. Initially → All traffic to monolith
  2. Introduce new microservice
  3. Enable shadow traffic (real request → monolith, copy → microservice async)

This allows performance testing at real scale with zero user impact.
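In production this mirroring lives in the proxy layer (Envoy's request mirroring, NGINX's mirror module), but the essential property fits in a few lines: the shadow copy is fire-and-forget, so its response (and any failure) can never reach the user. The backend functions here are stand-ins:

```python
import concurrent.futures

# Hypothetical backends; real routing would live in Envoy/NGINX config.
def monolith(request):
    return {"status": 200, "body": f"handled {request['path']}"}

def microservice(request):
    return {"status": 200, "body": f"shadow {request['path']}"}

_pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)

def route(request):
    # Fire-and-forget copy to the new service; its response is discarded,
    # so shadow failures can never affect the user.
    _pool.submit(microservice, dict(request))
    # The monolith's response is the only one returned to the caller.
    return monolith(request)

print(route({"path": "/orders/1"}))
```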


Step 5: Incremental Cutover#

Route traffic gradually:

  • 1% → 10% → 50% → 100%
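One common way to implement the ramp is sticky percentage routing: hash each user into a stable bucket in [0, 100), and raise the cutoff as confidence grows. Raising the percentage only adds users to the new service; nobody flip-flops between backends. A sketch (bucketing scheme is an assumption, not from the source):

```python
import hashlib

def bucket(user_id: str) -> int:
    """Map a user to a stable bucket in [0, 100)."""
    digest = hashlib.md5(user_id.encode()).digest()
    return int.from_bytes(digest[:4], "big") % 100

def backend_for(user_id: str, rollout_pct: int) -> str:
    # Sticky routing: a user who is on the microservice at 10% stays
    # on it at 50% and 100%.
    return "microservice" if bucket(user_id) < rollout_pct else "monolith"

print(backend_for("u42", 10), backend_for("u42", 100))
```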

The Ugly Part: Reverse CDC#

If a legacy monolith job reads user data that's already migrated, Oracle has stale data. Solution: stream changes back from microservice to Oracle temporarily. Remove it once fully cut over.
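The subtle danger of bidirectional replication is an infinite loop: a change streams back into Oracle, shows up in the redo log, streams forward again, and so on. A common mitigation (illustrated below with hypothetical event shapes, not from the source) is to tag each reverse-replicated write with its origin so the forward path can drop the echo:

```python
def forward(event, new_db_writes):
    """Oracle -> microservice CDC: skip changes the reverse path wrote."""
    if event.get("origin") == "reverse-cdc":
        return                        # echo of our own write; drop it
    new_db_writes.append(event)

def reverse(event, oracle_writes):
    """Microservice -> Oracle CDC: tag writes so forward() can spot them."""
    oracle_writes.append({**event, "origin": "reverse-cdc"})

new_db, oracle = [], []
change = {"key": "u1", "state": {"balance": 25}}

reverse(change, oracle)       # microservice change streamed back to Oracle
forward(oracle[0], new_db)    # that write resurfaces in the redo log...
print(len(new_db))            # ...and is dropped: no infinite loop
```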


Final Takeaway#

Migrating 50TB at 100k RPS is like performing a brain transplant while the patient is running.

The strategy:

  • CDC from transaction logs
  • Kafka as a durable buffer
  • Snapshot + replay
  • Idempotent consumers
  • Strangler Fig routing
  • Incremental cutover
  • Temporary reverse CDC

Source#

"System Design: Migrating 50TB at 100k RPS (The Hard Way)" — By FAANG Senior Engineer
