This is a guest post by Markus Opolka (martialblog on GitHub).
Galera Cluster is a cluster solution for MariaDB. It provides a simple way to configure an active-active MySQL Cluster using MySQL-wsrep. Further details can be found on the MariaDB documentation site.
In this article, we will go through:
- Setup of the environment and prerequisites
- Installation and configuration of MariaDB 10.3 Galera Cluster using rsync with TLS
- Starting and testing the cluster
- What happens with openssl rehash
Before you consider a Galera Cluster solution, you should carefully read the Known Limitations.
wsrep provides the cluster functionality, all available configuration options start with wsrep_ (Full list of wsrep options).
The terms Server and Node will be used interchangeably. Galera specifies two types of nodes:
- Donor, with the current state of the data
- Joiner, that requires data from a Donor
Data can be transfered in two ways:
- State Snapshot Transfer (SST) provisions the nodes by transferring a full data copy from a Donor to a Joiner
- Incremental State Transfer (IST) transfers only missing data from a Donor to a Joiner
We’ll be using Vagrant to simulate a Galera Cluster with three nodes. As a general rule, you want an odd number of nodes in any cluster to ensure a quorum of active nodes and avoid a split-brain scenario.
You can use the following
Vagrantfile to recreate the environment:
Since we don’t want to setup a DNS server for name resolution, we simply add all nodes to each
In order to use TLS for the state transfer, we will setup a simple Root CA and certificates for each server. To ensure the private keys can be read by MariaDB, we will create a group that owns these files.
# On all servers $ mkdir /ca $ groupadd tls-cert
The Root CA can be created outside the VMs:
Each server will have its own private key and certificate:
Each private key and certificate as well as the Root CA certificate need to be available on the servers. Once the files are copied, they will be owned by the tls-cert group.
The servers should be in the following state:
/etc/hostscontains all other servers to ensure name resolution.
/cacontains the respective private key, certificate and Root CA certificate, owned by the tls-cert group.
From version 10.1 onwards, MariaDB contains the wsrep API, so we can just install it from the Debian Buster main repository:
Another option is to add the official MariaDB Apt repository and install from there.
Now we can setup the MariaDB Service.
First, we add the mysql user to the tls-cert group, so that it can read the TLS files:
galera requires you to rehash the certificates. See below what happens when you don’t do this.
But why? Rehash creates a sort of hash table implemented using symbolic links, which speeds up file lookup. Doesn’t apply here with just two files, but in general.
Now the wsrep options can be added to the MariaDB configuration:
Starting the Cluster
After all the configuration files are in place, we can start the Galera Cluster. The command
galera_new_cluster is used to bootstrap a new Galera Cluster when all nodes are down. Simply using
systemctl start mariadb will fail in that scenario.
Once the first node is running, we simply need to start the MariaDB Service on the other nodes:
Check the MariaDB error log to see what’s going on:
Validating the Cluster
Creating a Database and Content
To test the replication, we can simply add a database and some content:
If the option wsrep_replicate_myisam is enabled, MyISAM tables can also be replicated:
Removing Nodes from the Cluster
Nodes can simply be removed by stopping the MariaDB Service:
When the node is shutdown successfully, it will store its current state position in
/var/lib/mysql/grastate.dat. Should changes happen on the still active nodes, a state transfer will occur when the node rejoins the cluster.
To rejoin a node, just restart the MariaDB Service:
What happens without openssl rehash?
Strange things can happen if you don’t run openssl rehash. At first the cluster seems to operate just fine, all nodes can join and replicate data. However, things start to get messy when wsrep_replicate_myisam is enabled and the MyISAM tables need to be replicated on a Joiner node.
When you first start the cluster and another node joins, everything looks fine:
[Note] WSREP: SSL handshake successful, remote endpoint ssl://10.0.0.22:44694 local endpoint ssl://10.0.0.11:4567 cipher: TLS_AES_256_GCM_SHA384 compression: none [Note] WSREP: (8b251666, 'ssl://0.0.0.0:4567') connection established to 95f163bf ssl://10.0.0.22:4567 [Note] WSREP: (8b251666, 'ssl://0.0.0.0:4567') turning message relay requesting on, nonlive peers: [Note] WSREP: declaring 95f163bf at ssl://10.0.0.22:4567 stable [Note] WSREP: New cluster view: global state: a775dd51-7fb2-11ea-88cc-07369f4708d8:153, view# 2: Primary, number of nodes: 2, my index: 0, protocol version 3
Creating new InnoDB and MyISAM tables works just as expected, see above.
Things start to go bad, when you take one node offline:
And then add a new entry into a MyISAM table on the still active node:
Trying to rejoin the other node results in an error:
root@server2:~# systemctl start mariadb.service Job for mariadb.service failed because a fatal signal was delivered to the control process. root@server2:~# cat /var/log/mysql/error.log [...] [Note] WSREP: declaring 8b251666 at ssl://10.0.0.11:4567 stable [Warning] WSREP: Gap in state sequence. Need state transfer. LOG3: SSL_accept: 14094412: error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate [Warning] WSREP: 0.0 (server1): State transfer to 1.0 (server2) failed: -5 (Input/output error) [ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():737: Will never receive state. Need to abort.
At the time of writing, the only way I could resolve this was to:
- Shutdown the still active Cluster node (server1)
- Disable the MariaDB TLS configuration (all nodes)
- Bootstrap the Cluster (server1)
- Join all other nodes so that they are replicated
- Shutdown the Cluster again
- Run openssl rehash (all nodes)
- Re-enable the MariaDB TLS configuration (all nodes)
- Bootstrap the Cluster (server1)
- Join all other nodes