Configuring a TLS-secured MariaDB Galera Cluster

This is a guest post by Markus Opolka (martialblog on GitHub).


Galera Cluster is a cluster solution for MariaDB. It provides a simple way to configure an active-active MySQL Cluster using MySQL-wsrep. Further details can be found on the MariaDB documentation site.

In this article, we will go through:

Before you consider a Galera Cluster solution, you should carefully read the Known Limitations.

Terminology

wsrep provides the cluster functionality, all available configuration options start with wsrep_ (Full list of wsrep options).

The terms Server and Node will be used interchangeably. Galera specifies two types of nodes:

  • Donor, with the current state of the data
  • Joiner, that requires data from a Donor

Data can be transfered in two ways:

  • State Snapshot Transfer (SST) provisions the nodes by transferring a full data copy from a Donor to a Joiner
  • Incremental State Transfer (IST) transfers only missing data from a Donor to a Joiner

Environment

We’ll be using Vagrant to simulate a Galera Cluster with three nodes. As a general rule, you want an odd number of nodes in any cluster to ensure a quorum of active nodes and avoid a split-brain scenario.

You can use the following Vagrantfile to recreate the environment:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
$ vagrant --version
Vagrant 2.2.6

$ cat Vagrantfile
Vagrant.configure("2") do |config|
  config.vm.define "server1" do |server1|
    server1.vm.box = "debian/buster64"
    server1.vm.hostname = "server1.example.com"
    server1.vm.network :private_network, ip: "10.0.0.1"
  end
  config.vm.define "server2" do |server2|
    server2.vm.box = "debian/buster64"
    server2.vm.hostname = "server2.example.com"
    server2.vm.network :private_network, ip: "10.0.0.2"
  end
  config.vm.define "server3" do |server3|
    server3.vm.box = "debian/buster64"
    server3.vm.hostname = "server3.example.com"
    server3.vm.network :private_network, ip: "10.0.0.3"
  end
end

Prerequisites

Since we don’t want to setup a DNS server for name resolution, we simply add all nodes to each /etc/hosts file.

1
2
3
4
5
6
7
# On all servers

# /etc/hosts
127.0.0.1	localhost
10.0.0.11	server1.example.com     server1
10.0.0.22	server2.example.com     server2
10.0.0.33	server3.example.com     server3

In order to use TLS for the state transfer, we will setup a simple Root CA and certificates for each server. To ensure the private keys can be read by MariaDB, we will create a group that owns these files.

1
2
3
# On all servers
$ mkdir /ca
$ groupadd tls-cert

The Root CA can be created outside the VMs:

1
2
3
4
5
6
# Once on Host (outside the VM)

# Generate an example Root CA:
$ openssl genrsa -aes256 -out ca.key 2048
$ openssl req -new -x509 -days 7 -key ca.key -sha256 -extensions v3_ca -out ca.crt
Common Name (e.g. server FQDN or YOUR name) []:RootCA

Each server will have its own private key and certificate:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
# For each server on Host (outside the VM), change the hostnames accordingly

# Generate the domain key:
$ openssl genrsa -out server1.example.com.key 2048

# Generate the certificate signing request
$ openssl req -sha256 -new -key server1.example.com.key -out server1.example.com.csr
Common Name (e.g. server FQDN or YOUR name) []:server1.example.com

# Sign the request and generate a certificate
$ openssl x509 -sha256 -req -in server1.example.com.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out server1.example.com.crt -days 7

# Verify the certificate
$ openssl verify -CAfile ca.crt server1.example.com.crt
server1.example.com.crt: OK

Each private key and certificate as well as the Root CA certificate need to be available on the servers. Once the files are copied, they will be owned by the tls-cert group.

1
2
3
4
5
6
7
8
# On all servers
$ cd /ca
$ chgrp tls-cert $(hostname -f).crt $(hostname -f).key ca.crt

$ ls -l /ca
-rw------- root tls-cert  server1.example.com.key
-rw-r--r-- root tls-cert  server1.example.com.crt
-rw-r--r-- root tls-cert  ca.crt

The servers should be in the following state:

  • /etc/hosts contains all other servers to ensure name resolution.
  • /ca contains the respective private key, certificate and Root CA certificate, owned by the tls-cert group.

Installation

From version 10.1 onwards, MariaDB contains the wsrep API, so we can just install it from the Debian Buster main repository:

1
2
3
4
5
6
7
# On each server
$ apt update
$ apt install -y mariadb-server mariadb-client stunnel4

$ apt info mariadb-server
Source: mariadb-10.3
APT-Sources: http://deb.debian.org/debian buster/main amd64 Packages

Another option is to add the official MariaDB Apt repository and install from there.

Configuration

Now we can setup the MariaDB Service.

First, we add the mysql user to the tls-cert group, so that it can read the TLS files:

1
2
3
4
# On each server
$ usermod -aG tls-cert mysql
$ groups mysql
mysql : mysql tls-cert

galera requires you to rehash the certificates. See below what happens when you don’t do this.

But why? Rehash creates a sort of hash table implemented using symbolic links, which speeds up file lookup. Doesn’t apply here with just two files, but in general.

1
2
3
4
5
6
7
8
# On each server
$ openssl rehash /ca
$ ls -l /ca
-rw-r--r-- root tls-cert ca.crt
-rw-r--r-- root tls-cert server1.example.com.crt
-rw------- root tls-cert server1.example.com.key
lrwxrwxrwx root root     5d4c0321.0 -> ca.crt
lrwxrwxrwx root root     feea7b6c.0 -> server1.example.com.crt

Now the wsrep options can be added to the MariaDB configuration:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
# On each server
$ systemctl stop mariadb.service

$ cat /etc/mysql/mariadb.conf.d/50-server.cnf

# Path to TLS Key and Certificate for SST
[sst]
tkey = /ca/server1.example.com.key
tcert = /ca/server1.example.com.crt

[server]

[mysqld]

# MariaDB Defaults
user                    = mysql
pid-file                = /run/mysqld/mysqld.pid
socket                  = /run/mysqld/mysqld.sock
basedir                 = /usr
datadir                 = /var/lib/mysql
tmpdir                  = /tmp
lc-messages-dir         = /usr/share/mysql
query_cache_size        = 16M
expire_logs_days        = 10
character-set-server  = utf8mb4
collation-server      = utf8mb4_general_ci
log_error = /var/log/mysql/error.log

bind-address            = 127.0.0.1

# Galera Cluster Options
wsrep_on = ON
wsrep_provider = /usr/lib/galera/libgalera_smm.so

# Name of the Cluster, this nodes address and list of other nodes
wsrep_cluster_name = my-test-cluster
# Change to server's IP
wsrep_node_address = 10.0.0.11
wsrep_cluster_address = gcomm://server1.example.com,server2.example.com,server3.example.com

# SST Method and TLS Configuration
wsrep_sst_method = rsync
# Change to server's hostname
wsrep_provider_options="socket.ssl_cert=/ca/server1.example.com.crt;socket.ssl_key=/ca/server1.example.com.key;socket.ssl_ca=/ca/ca.crt"

# Experimental Feature to Replicate MyISAM Tables
wsrep_replicate_myisam = ON

# Recommended Configuration
binlog_forma = ROW
default_storage_engine = innodb
innodb_autoinc_lock_mode = 2
innodb_doublewrite = 1

Starting the Cluster

After all the configuration files are in place, we can start the Galera Cluster. The command galera_new_cluster is used to bootstrap a new Galera Cluster when all nodes are down. Simply using systemctl start mariadb will fail in that scenario.

1
2
# On server1.example.com
$ new_galera_cluster

Once the first node is running, we simply need to start the MariaDB Service on the other nodes:

1
2
# On other servers
$ systemctl start mariadb.service

Check the MariaDB error log to see what’s going on:

1
$ tail -f /var/log/mysql/error.log

Testing

Validating the Cluster

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
$ mysql -e "SHOW STATUS LIKE 'wsrep_cluster_size'";
+--------------------+-------+
| Variable_name      | Value |
+--------------------+-------+
| wsrep_cluster_size | 3     |
+--------------------+-------+

$ mysql -e "SHOW STATUS LIKE 'wsrep_cluster_status'";
+----------------------+---------+
| Variable_name        | Value   |
+----------------------+---------+
| wsrep_cluster_status | Primary |
+----------------------+---------+

$ mysql -e "SHOW STATUS LIKE 'wsrep_%'";

Creating a Database and Content

To test the replication, we can simply add a database and some content:

1
2
3
4
5
6
7
CREATE DATABASE IF NOT EXISTS foobar;
USE foobar;

CREATE TABLE IF NOT EXISTS inno ( id INT UNSIGNED  NOT NULL AUTO_INCREMENT, name VARCHAR(30) NOT NULL DEFAULT '', PRIMARY KEY  (id) );

INSERT INTO inno (name) VALUES ('Hello');
INSERT INTO inno (name) VALUES ('World');

If the option wsrep_replicate_myisam is enabled, MyISAM tables can also be replicated:

1
2
3
4
5
6
7
CREATE DATABASE IF NOT EXISTS foobar;
USE foobar;

CREATE TABLE IF NOT EXISTS isam ( id INT UNSIGNED  NOT NULL AUTO_INCREMENT, name VARCHAR(30) NOT NULL DEFAULT '', PRIMARY KEY  (id) ) ENGINE = MYISAM;

INSERT INTO isam (name) VALUES ('Hello');
INSERT INTO isam (name) VALUES ('World');

Removing Nodes from the Cluster

Nodes can simply be removed by stopping the MariaDB Service:

1
systemctl stop mariadb.service

When the node is shutdown successfully, it will store its current state position in /var/lib/mysql/grastate.dat. Should changes happen on the still active nodes, a state transfer will occur when the node rejoins the cluster.

1
2
3
4
5
6
$ cat /var/lib/mysql/grastate.dat
# GALERA saved state
version: 2.1
uuid:    a775dd51-7fb2-11ea-88cc-07369f4708d8
seqno:   151
safe_to_bootstrap: 0

To rejoin a node, just restart the MariaDB Service:

1
systemctl start mariadb.service

What happens without openssl rehash?

Strange things can happen if you don’t run openssl rehash. At first the cluster seems to operate just fine, all nodes can join and replicate data. However, things start to get messy when wsrep_replicate_myisam is enabled and the MyISAM tables need to be replicated on a Joiner node.

When you first start the cluster and another node joins, everything looks fine:

1
2
3
4
5
[Note] WSREP: SSL handshake successful, remote endpoint ssl://10.0.0.22:44694 local endpoint ssl://10.0.0.11:4567 cipher: TLS_AES_256_GCM_SHA384 compression: none
[Note] WSREP: (8b251666, 'ssl://0.0.0.0:4567') connection established to 95f163bf ssl://10.0.0.22:4567
[Note] WSREP: (8b251666, 'ssl://0.0.0.0:4567') turning message relay requesting on, nonlive peers:
[Note] WSREP: declaring 95f163bf at ssl://10.0.0.22:4567 stable
[Note] WSREP: New cluster view: global state: a775dd51-7fb2-11ea-88cc-07369f4708d8:153, view# 2: Primary, number of nodes: 2, my index: 0, protocol version 3

Creating new InnoDB and MyISAM tables works just as expected, see above.

Things start to go bad, when you take one node offline:

1
root@server2:~# systemctl stop mariadb.service

And then add a new entry into a MyISAM table on the still active node:

1
2
3
4
5
root@server1:~# mysql

MariaDB [(none)]> USE testing;
MariaDB [testing]> INSERT INTO isam (name) VALUES ('new_value_while_node_offile');
Query OK, 1 row affected (0.013 sec)

Trying to rejoin the other node results in an error:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
root@server2:~# systemctl start mariadb.service
Job for mariadb.service failed because a fatal signal was delivered to the control process.

root@server2:~# cat /var/log/mysql/error.log
[...]
[Note] WSREP: declaring 8b251666 at ssl://10.0.0.11:4567 stable
[Warning] WSREP: Gap in state sequence. Need state transfer.
LOG3[0]: SSL_accept: 14094412: error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate
[Warning] WSREP: 0.0 (server1): State transfer to 1.0 (server2) failed: -5 (Input/output error)
[ERROR] WSREP: gcs/src/gcs_group.cpp:gcs_group_handle_join_msg():737: Will never receive state. Need to abort.

At the time of writing, the only way I could resolve this was to:

  • Shutdown the still active Cluster node (server1)
  • Disable the MariaDB TLS configuration (all nodes)
  • Bootstrap the Cluster (server1)
  • Join all other nodes so that they are replicated
  • Shutdown the Cluster again
  • Run openssl rehash (all nodes)
  • Re-enable the MariaDB TLS configuration (all nodes)
  • Bootstrap the Cluster (server1)
  • Join all other nodes