Designing a Multi-Region Active-Active Architecture

Build a globally distributed system with an active-active multi-region setup, covering traffic routing, data replication, conflict resolution, and latency.

srikanthtelkalapally888@gmail.com

Active-active multi-region means that multiple regions handle live traffic at the same time. Users are served from a nearby region, which lowers latency, and the loss of any one region does not take the system down.

Active-Active vs Active-Passive

Active-Passive:
  Primary (US-East): Handles all traffic
  Secondary (EU):    Standby, takes over on failure
  Failover time: 1-5 minutes

Active-Active:
  US-East: Handles US traffic
  EU-West: Handles EU traffic
  AP-South: Handles APAC traffic
  Failover: Fast (other regions are already serving; affected users move once DNS re-routes)

Traffic Routing

User (Paris) → GeoDNS → EU-West region
User (NYC)   → GeoDNS → US-East region

Failure in EU-West:
  → GeoDNS shifts traffic to US-East (about 30s convergence, depending on health-check interval and DNS TTL)
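
The routing decision above can be sketched in a few lines. This is a minimal illustration, not how any particular GeoDNS product works: the preference table, the region names, and the resolve_region function are all hypothetical, and the set of healthy regions is assumed to come from an external health checker.

```python
from typing import Dict, List, Set

# Hypothetical preference table: nearest region first, then fallbacks.
REGION_PREFERENCE: Dict[str, List[str]] = {
    "EU": ["eu-west", "us-east", "ap-south"],
    "US": ["us-east", "eu-west", "ap-south"],
    "APAC": ["ap-south", "us-east", "eu-west"],
}


def resolve_region(user_geo: str, healthy: Set[str]) -> str:
    """Return the first healthy region in the user's preference order."""
    for region in REGION_PREFERENCE[user_geo]:
        if region in healthy:
            return region
    raise RuntimeError("no healthy region available")


all_up = {"eu-west", "us-east", "ap-south"}
paris_normal = resolve_region("EU", all_up)                 # nearest region
paris_failover = resolve_region("EU", all_up - {"eu-west"})  # EU-West is down
```

With every region healthy, the Paris user resolves to eu-west. Once eu-west drops out of the healthy set, the same lookup falls through to us-east, which mirrors the failure case described above.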

Data Replication Challenges

The core challenge: different regions can accept writes to the same data at the same time.

US writes: user:123 balance → $100
EU writes: user:123 balance → $90 (simultaneously)
→ CONFLICT!

Conflict Resolution Strategies

Last Write Wins (LWW)

Attach a timestamp to every write; when two writes conflict, the one with the latest timestamp wins.

US: balance=$100, ts=100
EU: balance=$90, ts=105
→ EU wins: $90

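Problem: The US write is silently discarded. With wall-clock timestamps, clock skew between regions can also make the wrong write win.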
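
A minimal sketch of an LWW merge, using the values from the example above. The Versioned type and the lww_merge function are illustrative names, and the tie-break on region name is an assumption added here so that the result does not depend on the order in which replicas see the two writes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    value: int   # the balance being written
    ts: int      # timestamp attached by the writing region
    region: str  # used only to break timestamp ties deterministically


def lww_merge(a: Versioned, b: Versioned) -> Versioned:
    """Keep the write with the higher timestamp; break ties by region name."""
    return max(a, b, key=lambda w: (w.ts, w.region))


us = Versioned(value=100, ts=100, region="us-east")
eu = Versioned(value=90, ts=105, region="eu-west")

winner = lww_merge(us, eu)  # the EU write wins; the US write is dropped
```

Because the merge is a pure maximum over (ts, region), every replica converges on the same winner no matter which write it receives first. The cost is the one noted above: the losing write is gone.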

CRDTs (Conflict-free Replicated Data Types)

Data structures that merge automatically without conflicts.

G-Counter (grow-only): Each region has own counter
Total = SUM of all region counters
US: 50, EU: 30, AP: 20 → Total: 100
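
The G-Counter described above can be sketched as follows. The class and method names are illustrative. The merge rule (take the per-region maximum) is the standard one for this data type and is what makes merging safe to repeat or reorder.

```python
from typing import Dict


class GCounter:
    """Grow-only counter: each region increments only its own slot."""

    def __init__(self, region: str) -> None:
        self.region = region
        self.counts: Dict[str, int] = {}

    def increment(self, amount: int = 1) -> None:
        if amount < 0:
            raise ValueError("a G-Counter can only grow")
        self.counts[self.region] = self.counts.get(self.region, 0) + amount

    def merge(self, other: "GCounter") -> None:
        # Per-region maximum: applying the same merge twice changes nothing.
        for region, count in other.counts.items():
            self.counts[region] = max(self.counts.get(region, 0), count)

    def value(self) -> int:
        return sum(self.counts.values())


us, eu, ap = GCounter("us"), GCounter("eu"), GCounter("ap")
us.increment(50)
eu.increment(30)
ap.increment(20)

us.merge(eu)
us.merge(ap)
us.merge(eu)  # a repeated merge is harmless
total = us.value()
```

After the merges, the US replica holds all three regional counts and reports their sum, matching the US: 50, EU: 30, AP: 20 example. Note that a grow-only counter cannot represent decrements, so it does not by itself solve the account-balance conflict shown earlier.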

Region Affinity

Always route a user's writes to their home region, so that no two regions write the same user's data:

User registered in EU → All their writes go to EU
Other regions read replicated copies of that data (with lag)
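
A minimal sketch of the routing rule above. The HOME_REGION lookup table and the route_request function are hypothetical; in a real system the home region would usually be stored with the user record or encoded in the user ID.

```python
from typing import Dict

# Hypothetical directory mapping each user to the region that owns their writes.
HOME_REGION: Dict[str, str] = {
    "user:123": "eu-west",
    "user:456": "us-east",
}


def route_request(user_id: str, local_region: str, is_write: bool) -> str:
    """Send writes to the user's home region; serve reads from the local replica."""
    if is_write:
        return HOME_REGION[user_id]
    return local_region


# An EU-registered user travelling in the US:
write_target = route_request("user:123", local_region="us-east", is_write=True)
read_target = route_request("user:123", local_region="us-east", is_write=False)
```

The write crosses the ocean to eu-west, while the read is served from the us-east replica and may be slightly stale. That is the trade-off of this strategy: write conflicts disappear because each record has a single writer, but writes made from outside the home region pay cross-region latency.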

Global Database Options

CockroachDB:            Distributed SQL; serializable transactions prevent conflicts rather than merging them
Spanner (GCP):          External consistency, globally distributed
DynamoDB Global Tables: Multi-region with last-write-wins
Cassandra:              Multi-region, tunable consistency

Latency Gains

Without multi-region: EU user → US-East = ~100ms
With multi-region:    EU user → EU-West = ~5ms
Roughly 20x lower network latency (illustrative figures)

Conclusion

Active-active reduces latency and removes any single region as a single point of failure. Region affinity and CRDTs reduce the complexity of conflict handling. Use managed global databases where possible.