diff --git a/README.md b/README.md
index 2a9b2d1..9cf5c36 100644
--- a/README.md
+++ b/README.md
@@ -58,6 +58,13 @@ for mysql can be retrieved using the following command:
 Root user DB access is only usable from within one of the deployed units (access
 to root is restricted to localhost only).
 
+## Cold boot
+
+When machines hosting the percona-cluster units are started in order for the
+application to assume a clustered and healthy state particular steps are
+required to be taken. This is documented in the [OpenStack Charms Deployment
+Guide][cdg-percona-startup].
+
 ## Limitations
 
 Note that Percona XtraDB Cluster is not a 'scale-out' MySQL solution; reads
@@ -202,101 +209,6 @@ Upstream documentation is also available:
 * [Percona XtraDB Cluster In-Place Upgrading Guide: From 5.5 to 5.6][upstream-upgrading-55-to-56]
 * [Galera replication - how to recover a PXC cluster][upstream-recovering]
 
-## Cold Boot
-
-In the event of an unexpected power outage and cold boot, the cluster will be
-unable to reestablish itself without manual intervention.
-
-The cluster will be in scenario 3 or 6 from the upstream [Percona Cluster
-documentation](https://www.percona.com/blog/2014/09/01/galera-replication-how-to-recover-a-pxc-cluster/)
-Please read the upstream documentation as it provides context to the steps
-outlined here. In either scenario, it is necessary to choose a unit to become
-the bootstrap node.
-
-### Determine the node with the highest sequence number
-
-This information can be found in the
-`/var/lib/percona-xtradb-cluster/grastate.dat` file. The charm will also display
-this information in the juju status.
-
-Example `juju status` after a cold boot of `percona-cluster`
-
-    Unit                Workload  Agent  Machine  Public address  Ports     Message
-    keystone/0*         active    idle   0        10.5.0.32       5000/tcp  Unit is ready
-    percona-cluster/0   blocked   idle   1        10.5.0.20       3306/tcp  MySQL is down. Sequence Number: 355. Safe To Bootstrap: 0
-      hacluster/0       active    idle            10.5.0.20                 Unit is ready and clustered
-    percona-cluster/1   blocked   idle   2        10.5.0.17       3306/tcp  MySQL is down. Sequence Number: 355. Safe To Bootstrap: 0
-      hacluster/1       active    idle            10.5.0.17                 Unit is ready and clustered
-    percona-cluster/2*  blocked   idle   3        10.5.0.27       3306/tcp  MySQL is down. Sequence Number: 355. Safe To Bootstrap: 0
-      hacluster/2*      active    idle            10.5.0.27                 Unit is ready and clustered
-
-*Note*: An application leader is denoted by any asterisk in the Unit column.
-
-In the above example all the sequence numbers match. This means we can
-bootstrap from any unit we choose.
-
-In the next example the percona-cluster/2 node has the highest sequence number
-so we must choose that node to avoid data loss.
-
-    Unit                Workload  Agent  Machine  Public address  Ports     Message
-    keystone/0*         active    idle   0        10.5.0.32       5000/tcp  Unit is ready
-    percona-cluster/0*  blocked   idle   1        10.5.0.20       3306/tcp  MySQL is down. Sequence Number: 1318. Safe To Bootstrap: 0
-      hacluster/0*      active    idle            10.5.0.20                 Unit is ready and clustered
-    percona-cluster/1   blocked   idle   2        10.5.0.17       3306/tcp  MySQL is down. Sequence Number: 1318. Safe To Bootstrap: 0
-      hacluster/1       active    idle            10.5.0.17                 Unit is ready and clustered
-    percona-cluster/2   blocked   idle   3        10.5.0.27       3306/tcp  MySQL is down. Sequence Number: 1325. Safe To Bootstrap: 0
-      hacluster/2       active    idle            10.5.0.27                 Unit is ready and clustered
-
-### Bootstrap the node with the highest sequence number
-
-Run the `bootstrap-pxc` action on the node with the highest sequence number. In
-this example, it is unit percona-cluster/2, which happens to be a non-leader.
-
-    juju run-action --wait percona-cluster/2 bootstrap-pxc
-
-### Notify the cluster of the new bootstrap UUID
-
-In the vast majority of cases, once the `bootstrap-pxc` action has been run and
-the model has settled the output to the `juju status` command will now look
-like this:
-
-    Unit                Workload  Agent  Machine  Public address  Ports     Message
-    keystone/0*         active    idle   0        10.5.0.32       5000/tcp  Unit is ready
-    percona-cluster/0*  waiting   idle   1        10.5.0.20       3306/tcp  Unit waiting for cluster bootstrap
-      hacluster/0*      active    idle            10.5.0.20                 Unit is ready and clustered
-    percona-cluster/1   waiting   idle   2        10.5.0.17       3306/tcp  Unit waiting for cluster bootstrap
-      hacluster/1       active    idle            10.5.0.17                 Unit is ready and clustered
-    percona-cluster/2   waiting   idle   3        10.5.0.27       3306/tcp  Unit waiting for cluster bootstrap
-      hacluster/2       active    idle            10.5.0.27                 Unit is ready and clustered
-
-If you observe the above output ("Unit waiting for cluster bootstrap") then the
-`notify-bootstrapped` action needs to be run on a unit. There are two
-possibilities:
-
-1. If the `bootstrap-pxc` action was run on a leader then run
-   `notify-bootstrapped` on a non-leader.
-2. If the `bootstrap-pxc` action was run on a non-leader then run
-   `notify-bootstrapped` on the leader.
-
-In the current example, the first action was run on a non-leader so we'll run
-the second action on the leader, percona-cluster/0:
-
-    juju run-action percona-cluster/0 notify-bootstrapped --wait
-
-After the model settles, the output should show all nodes in active and ready
-state:
-
-    Unit                Workload  Agent  Machine  Public address  Ports     Message
-    keystone/0*         active    idle   0        10.5.0.32       5000/tcp  Unit is ready
-    percona-cluster/0*  active    idle   1        10.5.0.20       3306/tcp  Unit is ready
-      hacluster/0*      active    idle            10.5.0.20                 Unit is ready and clustered
-    percona-cluster/1   active    idle   2        10.5.0.17       3306/tcp  Unit is ready
-      hacluster/1       active    idle            10.5.0.17                 Unit is ready and clustered
-    percona-cluster/2   active    idle   3        10.5.0.27       3306/tcp  Unit is ready
-      hacluster/2       active    idle            10.5.0.27                 Unit is ready and clustered
-
-The percona-cluster application is now back to a clustered and healthy state.
-
 # Bugs
 
 Please report bugs on [Launchpad][lp-bugs-charm-percona-cluster].
@@ -317,6 +229,7 @@ For general charm questions refer to the [OpenStack Charm Guide][cg].
 [upstream-recovering]: https://www.percona.com/blog/2014/09/01/galera-replication-how-to-recover-a-pxc-cluster/
 [juju-docs-actions]: https://jaas.ai/docs/actions
 [cdg-percona-migration-to-mysql8]: https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/latest/app-series-upgrade-specific-procedures.html#percona-cluster-charm-series-upgrade-to-focal
+[cdg-percona-startup]: https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/latest/app-managing-power-events.html#id22
 [mysql-router-charm]: https://jaas.ai/mysql-router
 [mysql-innodb-cluster-charm]: https://jaas.ai/mysql-innodb-cluster
 [cdg-procedures]: https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/latest/app-series-upgrade-openstack.html#procedures