As we learned in the introduction, synchronous replication uses eager replication, in which the nodes keep all replicas synchronized by updating them in a single transaction. In other words, when a transaction commits, all nodes have the same value. This is achieved through writeset replication over group communication.
The Galera Cluster replication architecture software entities are:
The entities above are depicted in the figure below and explained in more detail in the chapters below:
Replication API
The wsrep API is a generic replication plugin interface for databases. The API defines a set of application callbacks and replication plugin calls.
The wsrep API is used in a replication model where an application, such as a database server, has a state. In practice, the state refers to the contents of the database. When clients use the database, they modify its contents and the database state changes. These state changes are represented as a series of atomic changes (transactions). In a database cluster, all nodes always have the same state, which they synchronize with each other by replicating and applying state changes in the same serial order.
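The replication model described above can be illustrated with a minimal sketch. It is not Galera or wsrep code: the `Node` class, the key-value state, and the sample transactions are all assumptions made for illustration. It only shows that nodes applying the same atomic changes in the same serial order end up with the same state, while a different order may not.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy database node whose state is a plain key-value dict (hypothetical)."""
    name: str
    state: dict = field(default_factory=dict)

    def apply(self, transaction: dict) -> None:
        # A transaction is modelled as an atomic batch of key updates.
        self.state.update(transaction)


# Hypothetical ordered stream of state changes (transactions).
transactions = [
    {"x": 1},
    {"y": 2},
    {"x": 3},  # overwrites the earlier value of x
]

nodes = [Node("node1"), Node("node2"), Node("node3")]

# Every node applies the same changes in the same serial order.
for node in nodes:
    for txn in transactions:
        node.apply(txn)

# Applying the same changes in a different order can yield a different state.
reordered = Node("reordered")
for txn in reversed(transactions):
    reordered.apply(txn)

states_match = all(node.state == nodes[0].state for node in nodes)
```

Here `states_match` is true for the three nodes that share one order, whereas `reordered` ends with a different value for `x`, which is why the serial order matters.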
From a more technical perspective, the state change process is as follows:
At the receiving end, the state changes are applied by high-priority transaction(s).
To keep the state identical on all nodes, the wsrep API uses a Global Transaction ID (GTID), which is used to both:
The GTID consists of:
By using the GTID, you can:
In a human-readable format, the GTID might look like this:
45eec521-2f34-11e0-0800-2a36050b826b:94530586304
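As a hedged illustration, the human-readable form above can be split into its two parts. This sketch assumes that the part before the colon is a UUID identifying the state and that the part after it is an integer sequence number. The names `GTID`, `parse_gtid`, and `is_later` are hypothetical helpers, not part of the wsrep API.

```python
import uuid
from typing import NamedTuple


class GTID(NamedTuple):
    state_uuid: uuid.UUID  # assumed: identifies the state and its history
    seqno: int             # assumed: position of the change in the sequence


def parse_gtid(text: str) -> GTID:
    """Split 'uuid:seqno' at the last colon and validate both parts."""
    uuid_part, sep, seqno_part = text.rpartition(":")
    if not sep:
        raise ValueError(f"not a GTID: {text!r}")
    return GTID(uuid.UUID(uuid_part), int(seqno_part))


def is_later(a: GTID, b: GTID) -> bool:
    """Return True if a comes after b; only meaningful for the same UUID."""
    if a.state_uuid != b.state_uuid:
        raise ValueError("GTIDs with different UUIDs are not comparable")
    return a.seqno > b.seqno


gtid = parse_gtid("45eec521-2f34-11e0-0800-2a36050b826b:94530586304")
earlier = parse_gtid("45eec521-2f34-11e0-0800-2a36050b826b:94530586303")
```

Comparing the numeric parts of two identifiers that share a UUID, as `is_later(gtid, earlier)` does, is one way to establish which state change came first.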
The Galera Replication Plugin implements the wsrep API and operates as the wsrep provider. From a more technical perspective, it consists of:
The group communication framework provides a plugin architecture for various group communication systems.
Galera Cluster is built on top of a proprietary group communication system layer which implements virtual synchrony quality of service (QoS). Virtual synchrony unifies the data delivery and cluster membership services, which provides a clear formalism for message delivery semantics.
Virtual synchrony guarantees consistency but not temporal synchrony, and temporal synchrony is required for smooth multi-master operation. To provide it, Galera implements its own runtime-configurable temporal flow control, which keeps nodes synchronized to within a fraction of a second.
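The general idea behind flow control can be sketched with a simple queue-based scheme. This is a generic backpressure illustration, not Galera's actual algorithm: the `FlowControlledNode` class and the `pause_at` and `resume_at` thresholds are assumptions. A node that falls behind asks its peers to pause, and it releases them once it has caught up.

```python
from collections import deque


class FlowControlledNode:
    """Toy node that pauses replication when its receive queue grows too long."""

    def __init__(self, pause_at: int, resume_at: int) -> None:
        self.queue = deque()        # received but not yet applied writesets
        self.pause_at = pause_at    # hypothetical upper threshold
        self.resume_at = resume_at  # hypothetical lower threshold
        self.paused = False         # True while peers are asked to stop sending

    def receive(self, writeset: str) -> None:
        self.queue.append(writeset)
        if len(self.queue) >= self.pause_at:
            self.paused = True

    def apply_one(self) -> None:
        if self.queue:
            self.queue.popleft()
        # Resume only once the backlog has drained to the lower threshold.
        if self.paused and len(self.queue) <= self.resume_at:
            self.paused = False


node = FlowControlledNode(pause_at=3, resume_at=1)
history = []

for ws in ("ws1", "ws2", "ws3"):
    node.receive(ws)
    history.append(node.paused)

for _ in range(2):
    node.apply_one()
    history.append(node.paused)
```

Using two thresholds rather than one keeps the node from toggling between paused and running on every writeset.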
The group communication framework also provides total ordering of messages from multiple sources, which is used to build Global Transaction IDs in a multi-master cluster.
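To illustrate how total ordering can yield cluster-wide identifiers, the sketch below uses a single counter as a stand-in for the ordering service. This is a deliberate simplification and an assumption: a real group communication system establishes the order with a distributed protocol, not a central `Sequencer` object. The class, the `deliver` function, and the sample messages are hypothetical.

```python
import itertools
import random

# Illustrative cluster identifier; reuses the UUID from the GTID example above.
CLUSTER_UUID = "45eec521-2f34-11e0-0800-2a36050b826b"


class Sequencer:
    """Toy stand-in for the ordering service: one global counter."""

    def __init__(self) -> None:
        self._next_seqno = itertools.count(1)

    def order(self, source: str, payload: str) -> tuple:
        # Each message gets the next sequence number, whatever its source.
        return (next(self._next_seqno), source, payload)


def deliver(received: list) -> list:
    """Deliver messages in sequence-number order, regardless of arrival order."""
    return [
        f"{CLUSTER_UUID}:{seqno} {source} {payload}"
        for seqno, source, payload in sorted(received, key=lambda m: m[0])
    ]


sequencer = Sequencer()
ordered = [
    sequencer.order("node1", "UPDATE a"),
    sequencer.order("node2", "UPDATE b"),
    sequencer.order("node1", "UPDATE c"),
]

# Each replica may receive the messages in a different network order...
rng = random.Random(0)
arrivals = [rng.sample(ordered, k=len(ordered)) for _ in range(3)]

# ...but all of them deliver in the same total order.
deliveries = [deliver(arrival) for arrival in arrivals]
```

Because every replica sorts by the same sequence numbers, the three delivery lists are identical even though the messages came from two different sources.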
At the transport level, Galera Cluster is a symmetric undirected graph in which all database nodes are connected to each other over TCP connections. By default, TCP is used for both message replication and the cluster membership service, but UDP multicast can also be used for replication in a LAN.
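A small sketch shows what this fully connected topology implies for the number of connections. The node names and the `mesh_links` helper are hypothetical. The sketch enumerates one undirected link per pair of nodes, which gives n(n-1)/2 links for n nodes.

```python
from itertools import combinations


def mesh_links(nodes: list) -> list:
    """Return one undirected link for every pair of distinct nodes."""
    return list(combinations(nodes, 2))


three_nodes = mesh_links(["node1", "node2", "node3"])
five_nodes = mesh_links(["node1", "node2", "node3", "node4", "node5"])

# Each node in a full mesh of n nodes is connected to the other n - 1 nodes.
peers_of_node1 = [link for link in five_nodes if "node1" in link]
```

The link count grows quadratically with cluster size, so a five-node cluster already needs ten connections.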