fuel-docs/pages/frequently-asked-questions/0030-add-remove-nodes-without-downtime

95 lines
2.9 KiB
Plaintext

.. This is not ready for inclusion into docs
.. _add-remove-nodes-without-downtime:
Add or Remove Controller and Compute Nodes Without Downtime
===========================================================
This document will help you become familiar with the process around lifecycle
management of controller and compute nodes within an OpenStack environment
deployed by Fuel. There are some specific details to note, so reading this
document is highly recommended.
1. The addition of compute nodes works seamlessly - just specify its
IPs in `site.pp` file (if needed) and run puppet agent
2. Deletion of computes is the same as deletion from OpenStack: migrate
machines, shutdown node, delete from database (Puppet class needed)
3. Addition of controllers works as soon as puppet deploys. Which
testing is needed:
- Test that altering the `site.pp` (adding new controller) and
running puppet consequently on all controllers does not break anything
4. Deletion of controllers should work as soon as quorum for Galera and
pacemaker is OK. What to test:
- Altering of quorum in case of node deletion
- If we need to alter quorum - we need to write classes for this
Node deletion
^^^^^^^^^^^^^
Disclaimer: BTW, this is not included into folsom as cinder service
disabling is not supported: it is going to be there in grizzly. Even
though, cinder deletion and migration of volumes is not going to be there.
Artem Andreev says, they are working on cinder volume migration for Dell M2
and it will be ready by the end of the May.
Deletion of the node assumes the following sequence of actions:
1. Backup databases (MySQL, AMQP(???), pacemaker) up
2. Disable services hosted on this node:
a. shutdown nova-compute and disable it (for computes)
b. shutdown nova-api and other services residing on controllers
c. move all the stuff from node being deleted:
i. migrate VMS
ii. migrate cinder volumes (HOW?????)
iii. migrate singleton pacemaker services (if any)
iv. avoid starting of pacemaker services on the node in case of
multistate primitive
d. Disable all the services for the node (by use of database or by use
of client)
3. For controllers:
a. Delete node from galera cluster (HOW???)
b. Delete node from pacemaker (crm node delete ???)
4. For swift storages:
a. remove node from the ring (for storage)
b. rebuild ring
c. rebalance
d. rsync new rings to nodes
5. For swift proxies:
a. shutdown the node
b. update haproxy backends by site.pp edit and puppet run
6. Cleanup the database
Necessity of special resources considerations
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Looks like we need to create special puppet resource called "node" with
different providers "controller/storage/proxy/etc." for it and specify the
:ensure property for them. Running the resource with corresponding property
generates corresponding resources and does all the steps needed.