Architecting multi-region database disaster recovery for MySQL

Enterprises expect the database infrastructure behind their applications to be extremely reliable. Despite best intentions and careful engineering, database errors happen, whether from machine crashes or network partitions. Good planning helps you stay ahead of problems and recover more quickly when issues do occur.

This post shows one approach to deploying a database architecture that implements high availability and disaster recovery for MySQL on Compute Engine, using regional disks and load balancers.

Any database architecture must provide ways to tolerate errors and recover from them quickly without losing data. These requirements are expressed as RTO (recovery time objective) and RPO (recovery point objective). RTO sets how long a service may be unavailable, and RPO sets how much recent data, measured back in time from the failure, may be lost. Together they let you set targets and then measure against them.

After a database error, a database must recover as fast as possible, with an RTO as small as possible, ideally in seconds. There must be as little data loss as possible; ideally, none at all. The desired RPO is the last consistent database state.
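
To make the two objectives concrete, here is a minimal Python sketch that compares one incident against RTO and RPO targets. The function names, the timestamps, and the target values (5 seconds and 30 seconds) are assumptions chosen for illustration; they are not taken from a real deployment.

```python
from datetime import datetime, timedelta


def measure_recovery(last_recoverable_state, failure_time, service_restored):
    """Return (data_loss_window, downtime) for a single incident."""
    # Data written after the last recoverable state is lost (compare to RPO).
    data_loss_window = failure_time - last_recoverable_state
    # Time during which the service was unavailable (compare to RTO).
    downtime = service_restored - failure_time
    return data_loss_window, downtime


def meets_objectives(data_loss_window, downtime, rpo_target, rto_target):
    """True when the incident stayed within both objectives."""
    return data_loss_window <= rpo_target and downtime <= rto_target


# Hypothetical targets and incident timestamps, for illustration only.
RPO_TARGET = timedelta(seconds=5)
RTO_TARGET = timedelta(seconds=30)

loss, downtime = measure_recovery(
    last_recoverable_state=datetime(2024, 1, 1, 11, 59, 58),
    failure_time=datetime(2024, 1, 1, 12, 0, 0),
    service_restored=datetime(2024, 1, 1, 12, 0, 25),
)
print(loss, downtime, meets_objectives(loss, downtime, RPO_TARGET, RTO_TARGET))
```

In this made-up incident, two seconds of writes are lost and the service is down for 25 seconds, so both targets are met. An RPO target of zero would require that no committed write is lost at all.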

From a database architecture and deployment viewpoint, this can be accomplished with two distinct concepts: high availability and disaster recovery. Use both at the same time in order to achieve an architecture that’s prepared for the widest range of errors or incidents.

Creating a resilient database architecture

A high-availability database architecture has database instances in two or more zones. If a server in a zone fails, or the zone becomes inaccessible, the instances in the other zones are available to continue processing. The figure below shows two instances, one in zone zn1 and one in zone zn2. The load balancer in front directs traffic to a healthy database instance that is available for read and write queries.
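
The routing behavior described above can be modeled in a few lines of Python. This is only a sketch of the idea, not how a managed load balancer is implemented: the `Instance` class, the instance names, and the `is_healthy` callback are hypothetical, and a real health check would probe the MySQL endpoint rather than consult a set.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Instance:
    name: str
    zone: str


def pick_healthy(
    instances: Sequence[Instance],
    is_healthy: Callable[[Instance], bool],
) -> Optional[Instance]:
    """Return the first instance that passes its health check, else None."""
    for instance in instances:
        if is_healthy(instance):
            return instance
    return None


# Hypothetical instances in the two zones named in the text.
instances = [Instance("mysql-a", "zn1"), Instance("mysql-b", "zn2")]

# Simulate an outage of zone zn1: only zones outside this set are healthy.
failed_zones = {"zn1"}
target = pick_healthy(instances, lambda i: i.zone not in failed_zones)
print(target)
```

With zn1 marked as failed, traffic goes to the instance in zn2. If every zone fails, the function returns `None`, which is the case the disaster recovery region is meant to cover.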

A disaster recovery architecture adds a second high-availability database setup in a second region. If one of the regions fails or becomes inaccessible, the other region takes over. The figure below shows two regions, primary and DR. Data is replicated from the primary to the DR region so that the DR region can take over from the latest consistent database state. The load balancer in front of the regions directs traffic to the region currently in charge of read and write traffic.
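
As a rough illustration of the region-level decision, the following Python sketch picks the region that should serve read and write traffic. The `RegionState` class, the `replication_lag_seconds` field, and the `max_lag_seconds` threshold are assumptions made for this example; the lag check stands in for whatever RPO policy you choose and is not prescribed by the architecture itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RegionState:
    name: str
    healthy: bool
    replication_lag_seconds: float  # how far this region trails the primary


def choose_serving_region(
    primary: RegionState,
    dr: RegionState,
    max_lag_seconds: float,
) -> Optional[str]:
    """Return the name of the region that should take read/write traffic."""
    if primary.healthy:
        return primary.name
    # Fail over only if the DR copy is healthy and recent enough (assumed policy).
    if dr.healthy and dr.replication_lag_seconds <= max_lag_seconds:
        return dr.name
    # No region qualifies automatically; an operator has to decide.
    return None


# Hypothetical states: the primary region is down, DR trails by one second.
primary = RegionState("primary", healthy=False, replication_lag_seconds=0.0)
dr = RegionState("dr", healthy=True, replication_lag_seconds=1.0)
print(choose_serving_region(primary, dr, max_lag_seconds=5.0))
```

Returning `None` instead of failing over blindly keeps the trade-off visible: a lower threshold protects the RPO, while a higher one favors the RTO.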
