This patch introduces a new cluster status, "reboot", which is set by the
leader node so that the other nodes start mysql without the
"--wsrep-new-cluster" option. Before this change, the following split-brain
scenario could occur:

1. All pods go down one by one with some offset.
2. The first and second nodes have the highest seqno.
3. The script on the first node detects that there are no active backends
   and starts a timeout loop.
4. The script on the second node detects the same and starts its own
   timeout loop, roughly 20 seconds behind the first node.
5. The timeout loop finishes on the first node; it sees that it has the
   highest seqno and the lowest hostname, and wins the right to start the
   cluster. Mysql is started with the "--wsrep-new-cluster" parameter, and
   seqno is set to "-1" for this node after mysql startup.
6. A periodic job syncs values from the grastate file to the configmap.
7. The timeout loop finishes on the second node. It looks for the node with
   the highest seqno and the lowest hostname; since seqno is already "-1"
   for the first node, the second node decides that it should lead the
   cluster startup and also executes mysql with the "--wsrep-new-cluster"
   option, which leads to split brain.

Change-Id: Ic63fd916289cb05411544cb33d5fdeed1352b380
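The election logic described above can be sketched roughly as follows. This is an illustrative sketch only, not the chart's actual recovery script; the names NODES, SELF, and decide_leader are hypothetical, as are the seqno values:

```shell
# Hypothetical sketch of the recovery election; the real chart script differs.
# Each NODES entry is "hostname:seqno" as synced from grastate.dat to the
# configmap by the periodic job.
NODES="mariadb-0:15 mariadb-1:15 mariadb-2:12"
SELF="mariadb-1"

# Highest seqno wins; ties are broken by the lowest hostname.
decide_leader() {
  echo "$1" | tr ' ' '\n' | sort -t: -k2,2nr -k1,1 | head -n1 | cut -d: -f1
}

LEADER=$(decide_leader "$NODES")

if [ "$SELF" = "$LEADER" ]; then
  # Leader: publish the "reboot" status, then bootstrap the cluster.
  CLUSTER_STATUS="reboot"
  ACTION="mysqld --wsrep-new-cluster"
else
  # Follower: the "reboot" status tells us the leader is already
  # bootstrapping, so we must NOT pass --wsrep-new-cluster. Before this
  # patch, a follower seeing the leader's seqno reset to -1 could wrongly
  # elect itself and bootstrap a second cluster (split brain).
  ACTION="mysqld"
fi
echo "$SELF would run: $ACTION"
```

With both mariadb-0 and mariadb-1 at seqno 15, mariadb-0 wins on hostname, and mariadb-1 starts as a plain joiner.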
openstack-helm/mariadb
By default, this chart creates a 3-member mariadb galera cluster.
This chart leverages StatefulSets, with persistent storage.
It creates a job that acts as a temporary standalone galera cluster.
This host is bootstrapped with authentication, and the WSREP
bindings are then exposed publicly. The cluster members, being
StatefulSets, are provisioned one at a time: the first host must be marked
Ready before the next host is provisioned. This is
determined by readinessProbes, which validate that MySQL is
up and responsive.
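A readinessProbe of this kind might look like the following sketch. This is an assumption for illustration, not the chart's actual probe; the credentials file path and timings are hypothetical:

```yaml
# Hypothetical readinessProbe sketch; the chart's real probe differs.
readinessProbe:
  exec:
    command:
      - /bin/sh
      - -c
      - mysql --defaults-file=/etc/mysql/admin_user.cnf -e 'SELECT 1'
  initialDelaySeconds: 30
  periodSeconds: 10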
The configuration leverages xtrabackup-v2 for synchronization. This may later be augmented to leverage rsync, which has some benefits.
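In my.cnf terms, this SST choice amounts to something like the fragment below. The credential values are placeholders, not the chart's defaults:

```ini
# Illustrative fragment; actual values come from the chart's templates.
[mysqld]
wsrep_sst_method=xtrabackup-v2
wsrep_sst_auth=sst_user:sst_password  # placeholder credentials
```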
Once the seed job completes, the job is terminated. It completes only when galera reports that it is Synced and all cluster members are reporting in, i.e. when the member count seen by the job matches the replica count in the helm values configuration. While the job is running, the cluster members use the seed job as their gcomm endpoint; once it is no longer active, any future StatefulSet pods provisioned use the existing cluster members as gcomm endpoints. This ensures you can restart members and scale the cluster.
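In terms of galera configuration, the two phases correspond roughly to different wsrep_cluster_address values. The service and pod names below are illustrative assumptions, not necessarily what the chart renders:

```ini
# While the seed job is running, members join via the seed (name assumed):
wsrep_cluster_address=gcomm://mariadb-seed

# After the seed job terminates, members list each other (names assumed):
wsrep_cluster_address=gcomm://mariadb-0.mariadb,mariadb-1.mariadb,mariadb-2.mariadb
```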
The StatefulSets all leverage PVCs to provide stateful storage to
/var/lib/mysql.
You must ensure that the control nodes that should receive mariadb
instances are labeled with openstack-control-plane=enabled
(or whatever label you have configured in values.yaml):
kubectl label nodes openstack-control-plane=enabled --all