Added actions that enable clean removal of nova-compute unit from model

List of added actions:
* disable
* enable
* remove-from-cloud
* register-to-cloud

More detailed explanation of the process added to the README.md

Closes-Bug: #1691998
Change-Id: I45d1def2ca0b1289f6fcce06c5f8949ef2a4a69e
func-test-pr: https://github.com/openstack-charmers/zaza-openstack-tests/pull/470
Martin Kalcok 2020-11-23 16:14:55 +01:00
parent afdf2329d1
commit ceb8a68868
15 changed files with 688 additions and 4 deletions

@ -1,5 +1,4 @@
- project:
templates:
- python35-charm-jobs
- openstack-python3-ussuri-jobs
- openstack-python3-charm-jobs
- openstack-cover-jobs

@ -265,17 +265,47 @@ In addition this charm declares two extra-bindings:
Note that the nova-cloud-controller application must have bindings to the same
network spaces used for both 'internal' and 'migration' extra bindings.
## Cloud downscaling
Removing a nova-compute unit from an OpenStack cloud is not a trivial
operation and it needs to be done in steps to ensure that no VMs are
accidentally destroyed:
1. Ensure that there are no VMs running on the `nova-compute`
unit that's about to be removed. Running the juju action `disable` will ensure
that `nova-scheduler` won't start any new VMs on this unit. Then either
destroy or migrate any VMs that are running on this unit.
2. Run the juju action `remove-from-cloud`. This will stop the nova-compute
service on this unit and unregister the unit from the
nova-cloud-controller application, thereby effectively removing it from the
OpenStack cloud.
3. Run the `juju remove-unit` command to remove this unit from
the model.
### Undoing unit removal
If the third step (`juju remove-unit`) was not executed, the whole process
can be reverted by running the juju actions `register-to-cloud` and `enable`.
This will start the `nova-compute` service again and allow
`nova-scheduler` to run new VMs on this unit.
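
The downscaling and undo sequences described above can be sketched as a small Python helper that only builds the Juju CLI invocations and does not run them. The unit name `nova-compute/0`, the helper names `removal_commands` and `undo_commands`, and the Juju 2.x `juju run-action ... --wait` syntax are assumptions made for illustration. Migrating or destroying the remaining VMs between the first and second command is left to the operator.

```python
def removal_commands(unit):
    """Return the Juju commands for cleanly removing `unit`, in order."""
    return [
        # Step 1: stop nova-scheduler from placing new VMs on the unit.
        ['juju', 'run-action', unit, 'disable', '--wait'],
        # (Migrate or destroy the remaining VMs before continuing.)
        # Step 2: stop nova-compute and unregister it from the cloud.
        ['juju', 'run-action', unit, 'remove-from-cloud', '--wait'],
        # Step 3: remove the unit from the Juju model.
        ['juju', 'remove-unit', unit],
    ]


def undo_commands(unit):
    """Return the commands that revert steps 1 and 2 (not step 3)."""
    return [
        ['juju', 'run-action', unit, 'register-to-cloud', '--wait'],
        ['juju', 'run-action', unit, 'enable', '--wait'],
    ]


if __name__ == '__main__':
    for cmd in removal_commands('nova-compute/0'):
        print(' '.join(cmd))
```

Keeping the commands as data makes the ordering explicit; a wrapper could pass each list to `subprocess.run` once the VMs have been dealt with.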
## Actions
This section lists Juju [actions][juju-docs-actions] supported by the charm.
Actions allow specific operations to be performed on a per-unit basis. To
display action descriptions run `juju actions ceph-mon`. If the charm is not
display action descriptions run `juju actions nova-compute`. If the charm is not
deployed then see file `actions.yaml`.
* `disable`
* `enable`
* `hugepagereport`
* `openstack-upgrade`
* `pause`
* `register-to-cloud`
* `remove-from-cloud`
* `resume`
* `hugepagereport`
* `security-checklist`
# Bugs

@ -1,3 +1,15 @@
disable:
  description: Disable nova-compute unit, preventing nova-scheduler from running new VMs on this unit.
enable:
  description: Enable nova-compute unit, allowing nova-scheduler to run new VMs on this unit.
remove-from-cloud:
  description: |
    Stop and unregister nova-compute from nova-cloud-controller. For more info see
    README.md, section 'Cloud downscaling'.
register-to-cloud:
  description: |
    Start and register nova-compute service with nova-cloud-controller. For more info see
    README.md, section 'Cloud downscaling'.
openstack-upgrade:
  description: Perform openstack upgrades. Config option action-managed-upgrade must be set to True.
pause:

164
actions/cloud.py Executable file

@ -0,0 +1,164 @@
#!/usr/bin/env python3
#
# Copyright 2020 Canonical Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os
import sys
from enum import Enum

sys.path.append('lib')
sys.path.append('hooks')

import nova_compute_hooks
from nova_compute import cloud_utils
from charmhelpers.core.host import (
    service_pause,
    service_resume,
)
from charmhelpers.core.hookenv import (
    DEBUG,
    function_set,
    function_fail,
    INFO,
    log,
    status_get,
    status_set,
    WORKLOAD_STATES,
)

UNIT_REMOVED_MSG = 'Unit was removed from the cloud'


class ServiceState(Enum):
    """State of the nova-compute service in the cloud controller"""
    enabled = 0
    disabled = 1


def _set_service(state):
    """
    Set state of the nova-compute service in the nova-cloud-controller.

    Available states:
        - ServiceState.enabled: nova-scheduler can use this unit to run new VMs
        - ServiceState.disabled: nova-scheduler won't schedule new VMs on this
                                 unit
    :type state: ServiceState
    """
    nova = cloud_utils.nova_client()
    hostname = cloud_utils.service_hostname()
    if state == ServiceState.disabled:
        log('Disabling nova-compute service on host {}'.format(hostname))
        nova.services.disable(hostname, 'nova-compute')
    elif state == ServiceState.enabled:
        log('Enabling nova-compute service on host {}'.format(hostname))
        nova.services.enable(hostname, 'nova-compute')
    else:
        raise RuntimeError('Unknown service state')


def disable():
    """Disable nova-scheduler from starting new VMs on this unit"""
    _set_service(ServiceState.disabled)


def enable():
    """Enable nova-scheduler to start new VMs on this unit"""
    _set_service(ServiceState.enabled)


def remove_from_cloud():
    """
    Implementation of 'remove-from-cloud' action.

    This action is preparation for clean removal of a nova-compute unit from
    the juju model. If this action succeeds, the user can run the
    `juju remove-unit` command.

    Steps performed by this action:
        - Checks that this nova-compute unit can be removed from the cloud
            - If not, action fails
        - Stops nova-compute system service
        - Unregisters nova-compute service from the nova cloud controller
    """
    nova = cloud_utils.nova_client()

    if cloud_utils.running_vms(nova) > 0:
        raise RuntimeError("This unit can not be removed from the "
                           "cloud because it's still running VMs. Please "
                           "remove these VMs or migrate them to another "
                           "nova-compute unit")
    nova_service_id = cloud_utils.nova_service_id(nova)

    log("Stopping nova-compute service", DEBUG)
    service_pause('nova-compute')
    log("Deleting nova service '{}'".format(nova_service_id), DEBUG)
    nova.services.delete(nova_service_id)

    status_set(WORKLOAD_STATES.BLOCKED, UNIT_REMOVED_MSG)
    function_set({'message': UNIT_REMOVED_MSG})


def register_to_cloud():
    """
    Implementation of `register-to-cloud` action.

    This action reverts the `remove-from-cloud` action. It starts the
    nova-compute system service, which will trigger its re-registration in
    the cloud.
    """
    log("Starting nova-compute service", DEBUG)
    service_resume('nova-compute')
    current_status = status_get()
    # Only reset the workload status if it was set by 'remove-from-cloud'.
    if current_status[0] == WORKLOAD_STATES.BLOCKED.value and \
            current_status[1] == UNIT_REMOVED_MSG:
        status_set(WORKLOAD_STATES.ACTIVE, 'Unit is ready')

    nova_compute_hooks.update_status()
    function_set({
        'command': 'openstack compute service list',
        'message': "Nova compute service started. It should get registered "
                   "with the cloud controller in a short time. Use the "
                   "'openstack' command to verify that it's registered."
    })


ACTIONS = {
    'disable': disable,
    'enable': enable,
    'remove-from-cloud': remove_from_cloud,
    'register-to-cloud': register_to_cloud,
}


def main(args):
    action_name = os.path.basename(args.pop(0))
    try:
        action = ACTIONS[action_name]
    except KeyError:
        s = "Action {} undefined".format(action_name)
        function_fail(s)
        return
    else:
        try:
            log("Running action '{}'.".format(action_name), INFO)
            action()
        except Exception as exc:
            function_fail("Action {} failed: {}".format(action_name, str(exc)))


if __name__ == '__main__':
    main(sys.argv)
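
The `actions/disable`, `actions/enable`, `actions/remove-from-cloud` and `actions/register-to-cloud` entries in this commit are symlinks to `cloud.py`, so `main` selects the handler from the basename of the invoked path. A minimal standalone sketch of that dispatch pattern follows; the placeholder handlers `_disable` and `_enable`, the function name `dispatch`, and the returned strings (standing in for the real actions and for `function_fail`) are hypothetical.

```python
import os


def _disable():
    # Placeholder for the real 'disable' action.
    return 'disabled'


def _enable():
    # Placeholder for the real 'enable' action.
    return 'enabled'


ACTIONS = {'disable': _disable, 'enable': _enable}


def dispatch(argv):
    """Select a handler by the basename of the invoked script path."""
    action_name = os.path.basename(argv[0])
    try:
        action = ACTIONS[action_name]
    except KeyError:
        # The charm reports this through function_fail(); here the
        # message is simply returned.
        return 'Action {} undefined'.format(action_name)
    return action()
```

With this layout, adding an action means adding one dictionary entry and one symlink rather than a new script.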

1
actions/disable Symbolic link

@ -0,0 +1 @@
cloud.py

1
actions/enable Symbolic link

@ -0,0 +1 @@
cloud.py

1
actions/register-to-cloud Symbolic link

@ -0,0 +1 @@
cloud.py

1
actions/remove-from-cloud Symbolic link

@ -0,0 +1 @@
cloud.py

4
bindep.txt Normal file

@ -0,0 +1,4 @@
libxml2-dev [platform:dpkg test]
libxslt1-dev [platform:dpkg test]
build-essential [platform:dpkg test]
zlib1g-dev [platform:dpkg test]

@ -134,6 +134,9 @@ BASE_PACKAGES = [
    'xfsprogs',
    'nfs-common',
    'open-iscsi',
    'python3-novaclient',  # lib required by juju actions
    'python3-neutronclient',  # lib required by juju actions
    'python3-keystoneauth1',  # lib required by juju actions
]
PY3_PACKAGES = [

@ -0,0 +1,128 @@
# Copyright 2020 Canonical Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import configparser
import socket

from keystoneauth1 import loading, session
from novaclient import client as nova_client_

from charmhelpers.core.hookenv import (
    log,
    DEBUG,
    WARNING
)


def _nova_cfg():
    """
    Parse nova config and return it in form of ConfigParser instance

    :return: Parsed nova config
    :rtype: configparser.ConfigParser
    """
    nova_cfg = configparser.ConfigParser()
    nova_cfg.read('/etc/nova/nova.conf')
    return nova_cfg


def _os_credentials():
    """
    Returns OpenStack credentials that OpenStack clients can use to
    authenticate with keystone.

    :return: OpenStack credentials
    :rtype: dict
    """
    nova_cfg = _nova_cfg()
    auth_section = 'keystone_authtoken'
    auth_details = [
        'username',
        'password',
        'auth_url',
        'project_name',
        'project_domain_name',
        'user_domain_name',
    ]
    return {attr: nova_cfg.get(auth_section, attr) for attr in auth_details}


def service_hostname():
    """Returns hostname used to identify this host within OpenStack."""
    nova_cfg = _nova_cfg()
    # This follows the same logic as nova: if 'host' is not defined in the
    # config, use the system's hostname.
    return nova_cfg['DEFAULT'].get('host', socket.gethostname())


def nova_client():
    """
    Creates and authenticates new nova client.

    :return: Authenticated nova client
    :rtype: novaclient.v2.client.Client
    """
    log('Initiating nova client', DEBUG)
    loader = loading.get_plugin_loader('password')
    credentials = _os_credentials()
    log('Authenticating with Keystone '
        'at "{}"'.format(credentials['auth_url']), DEBUG)
    auth = loader.load_from_options(**credentials)
    session_ = session.Session(auth=auth)
    return nova_client_.Client('2', session=session_)


def nova_service_id(nc_client):
    """
    Returns ID of nova-compute service running on this unit.

    :param nc_client: Authenticated nova client
    :type nc_client: novaclient.v2.client.Client
    :return: nova-compute ID
    :rtype: str
    """
    hostname = service_hostname()
    service = nc_client.services.list(host=hostname, binary='nova-compute')
    if len(service) == 0:
        # Fill in the hostname; the placeholder was previously left unformatted.
        raise RuntimeError('Host "{}" is not registered in nova service '
                           'list'.format(hostname))
    elif len(service) > 1:
        log('Host "{}" has more than 1 nova-compute service registered. '
            'Selecting one ID randomly.'.format(hostname), WARNING)
    return service[0].id


def running_vms(nc_client):
    """
    Returns number of VMs managed by the nova-compute service on this unit.

    :param nc_client: Authenticated nova client
    :type nc_client: novaclient.v2.client.Client
    :return: Number of running VMs
    :rtype: int
    """
    # NOTE(martin-kalcok): Hypervisor list always uses host's fqdn for
    # 'hypervisor_hostname', even if config variable 'host' is set in
    # the nova.conf
    hostname = socket.getfqdn()
    # NOTE(martin-kalcok): After the support for trusty (and by extension
    # mitaka) is dropped, `hypervisors.list()` can be changed to
    # `hypervisors.search(hostname, detailed=True)` to improve performance.
    for server in nc_client.hypervisors.list():
        if server.hypervisor_hostname == hostname:
            log("VMs running on hypervisor '{}':"
                " {}".format(hostname, server.running_vms), DEBUG)
            return server.running_vms
    else:
        # Reached only when no hypervisor matched this host's FQDN.
        raise RuntimeError("Nova compute node '{}' not found in the list of "
                           "hypervisors. Is the unit already removed from the"
                           " cloud?".format(hostname))

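
The way `_os_credentials` and `service_hostname` read `nova.conf` can be illustrated with an in-memory config instead of `/etc/nova/nova.conf`. This is a sketch under stated assumptions: the sample values, `SAMPLE_NOVA_CONF`, and the helper names `parse`, `credentials` and `hostname` are made up for the example; only the section and option names come from the code above.

```python
import configparser
import socket

# Hypothetical nova.conf content, used in place of the real file.
SAMPLE_NOVA_CONF = """
[DEFAULT]
host = compute-1.example

[keystone_authtoken]
username = nova
password = secret
auth_url = http://keystone.example:5000/v3
project_name = services
project_domain_name = service_domain
user_domain_name = service_domain
"""

AUTH_KEYS = [
    'username',
    'password',
    'auth_url',
    'project_name',
    'project_domain_name',
    'user_domain_name',
]


def parse(text):
    """Parse config text into a ConfigParser instance."""
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    return cfg


def credentials(cfg):
    """Collect the keystone options needed by a password auth plugin."""
    return {key: cfg.get('keystone_authtoken', key) for key in AUTH_KEYS}


def hostname(cfg):
    """Use 'host' from [DEFAULT] if set, else the system hostname."""
    return cfg['DEFAULT'].get('host', socket.gethostname())
```

Note that `cfg['DEFAULT']` is always available in `ConfigParser`, even when the file has no `[DEFAULT]` section, which is why the fallback lookup does not need a guard.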
@ -39,12 +39,14 @@ tests:
- ceph:
- zaza.openstack.charm_tests.nova.tests.CirrosGuestCreateTest
- zaza.openstack.charm_tests.nova.tests.LTSGuestCreateTest
- zaza.openstack.charm_tests.nova.tests.CloudActions
- zaza.openstack.charm_tests.nova.tests.NovaCompute
- zaza.openstack.charm_tests.nova.tests.SecurityTests
- zaza.openstack.charm_tests.ceph.tests.CheckPoolTypes
- zaza.openstack.charm_tests.ceph.tests.BlueStoreCompressionCharmOperation
- zaza.openstack.charm_tests.nova.tests.CirrosGuestCreateTest
- zaza.openstack.charm_tests.nova.tests.LTSGuestCreateTest
- zaza.openstack.charm_tests.nova.tests.CloudActions
- zaza.openstack.charm_tests.nova.tests.NovaCompute
- zaza.openstack.charm_tests.nova.tests.SecurityTests

@ -0,0 +1,233 @@
# Copyright 2020 Canonical Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import sys
from unittest import TestCase
from unittest.mock import MagicMock, patch

sys.modules['nova_compute_hooks'] = MagicMock()
import cloud
del sys.modules['nova_compute_hooks']


class _ActionTestCase(TestCase):

    NAME = ''

    def __init__(self, methodName='runTest'):
        super(_ActionTestCase, self).__init__(methodName)
        self._func_args = {}
        self.hostname = 'nova.compute.0'
        self.nova_service_id = '0'

    def setUp(self, to_mock=None):
        """
        Mock commonly used objects from cloud.py module. Additional objects
        can be passed in for mocking in the form of a dict with format
        {module.object: ['method1', 'method2']}

        Example usage:
        ```python
            class MyTestCase(unittest.TestCase):
                def setUp(self, to_mock=None):
                    additional_mocks = {
                        actions.os: ['remove', 'mkdir'],
                        actions.shutil: ['rmtree'],
                    }
                    super(MyTestCase, self).setUp(to_mock=additional_mocks)
        ```

        :param to_mock: Additional objects to mock
        :return: None
        """
        to_mock = to_mock or {}
        default_mock = {
            cloud: {'function_set',
                    'function_fail',
                    'status_get',
                    'status_set',
                    },
            cloud.cloud_utils: {'service_hostname',
                                'nova_client',
                                'nova_service_id',
                                'running_vms',
                                }
        }
        for key, value in to_mock.items():
            if key in default_mock:
                default_mock[key].update(value)
            else:
                default_mock[key] = value
        self.patch_all(default_mock)

        cloud.cloud_utils.service_hostname.return_value = self.hostname
        cloud.cloud_utils.nova_service_id.return_value = self.nova_service_id
        cloud.cloud_utils.running_vms.return_value = 0
        cloud.cloud_utils.nova_client.return_value = MagicMock()

    def patch_all(self, to_patch):
        for object_, methods in to_patch.items():
            for method in methods:
                mock_ = patch.object(object_, method, MagicMock())
                mock_.start()
                self.addCleanup(mock_.stop)

    def assert_function_fail_msg(self, msg):
        """Shortcut for asserting error with default structure"""
        cloud.function_fail.assert_called_with("Action {} failed: "
                                               "{}".format(self.NAME, msg))

    def call_action(self):
        """Shortcut to calling action based on the current TestCase"""
        cloud.main([self.NAME])


class TestGenericAction(_ActionTestCase):

    def test_unknown_action(self):
        """Test expected fail when running undefined action."""
        bad_action = 'foo'
        expected_error = 'Action {} undefined'.format(bad_action)
        cloud.main([bad_action])
        cloud.function_fail.assert_called_with(expected_error)

    def test_unknown_nova_compute_state(self):
        """Test expected error when setting nova-compute state
        to unknown value"""
        bad_state = 'foo'
        self.assertRaises(RuntimeError, cloud._set_service, bad_state)


class TestDisableAction(_ActionTestCase):
    NAME = 'disable'

    def test_successful_disable(self):
        """Test that expected steps are performed when disabling nova-compute
        service"""
        client = MagicMock()
        nova_services = MagicMock()
        client.services = nova_services

        cloud.cloud_utils.nova_client.return_value = client

        self.call_action()

        nova_services.disable.assert_called_with(self.hostname, 'nova-compute')
        cloud.function_fail.assert_not_called()


class TestEnableAction(_ActionTestCase):
    NAME = 'enable'

    def test_successful_enable(self):
        """Test that expected steps are performed when enabling nova-compute
        service"""
        client = MagicMock()
        nova_services = MagicMock()
        client.services = nova_services

        cloud.cloud_utils.nova_client.return_value = client

        self.call_action()

        nova_services.enable.assert_called_with(self.hostname, 'nova-compute')
        cloud.function_fail.assert_not_called()


class TestRemoveFromCloudAction(_ActionTestCase):
    NAME = 'remove-from-cloud'

    def __init__(self, methodName='runTest'):
        super(TestRemoveFromCloudAction, self).__init__(methodName=methodName)
        self.nova_client = MagicMock()

    def setUp(self, to_mock=None):
        additional_mocks = {
            cloud: {'service_pause'}
        }
        super(TestRemoveFromCloudAction, self).setUp(to_mock=additional_mocks)
        cloud.cloud_utils.nova_client.return_value = self.nova_client

    def test_nova_is_running_vms(self):
        """Action fails if there are VMs present on the unit"""
        cloud.cloud_utils.running_vms.return_value = 1
        error_msg = "This unit can not be removed from the cloud because " \
                    "it's still running VMs. Please remove these VMs or " \
                    "migrate them to another nova-compute unit"
        self.call_action()
        self.assert_function_fail_msg(error_msg)

    def test_remove_from_cloud(self):
        """Test that expected steps are executed when running action
        remove-from-cloud"""
        nova_services = MagicMock()
        self.nova_client.services = nova_services

        self.call_action()

        # stopping services
        cloud.service_pause.assert_called_with('nova-compute')

        # unregistering services
        nova_services.delete.assert_called_with(self.nova_service_id)

        # setting unit state
        cloud.status_set.assert_called_with(
            cloud.WORKLOAD_STATES.BLOCKED,
            cloud.UNIT_REMOVED_MSG
        )
        cloud.function_set.assert_called_with(
            {'message': cloud.UNIT_REMOVED_MSG}
        )
        cloud.function_fail.assert_not_called()


class TestRegisterToCloud(_ActionTestCase):
    NAME = 'register-to-cloud'

    def setUp(self, to_mock=None):
        additional_mocks = {
            cloud: {'service_resume'}
        }
        super(TestRegisterToCloud, self).setUp(to_mock=additional_mocks)

    def test_dont_reset_unit_status(self):
        """Test that action won't reset unit state if the current state was
        not set explicitly by 'remove-from-cloud' action"""
        cloud.status_get.return_value = (cloud.WORKLOAD_STATES.BLOCKED.value,
                                         'Unrelated reason for blocked status')
        self.call_action()

        cloud.status_set.assert_not_called()
        cloud.function_fail.assert_not_called()

    def test_reset_unit_status(self):
        """Test that action will reset unit state if the current state was
        set explicitly by 'remove-from-cloud' action"""
        cloud.status_get.return_value = (cloud.WORKLOAD_STATES.BLOCKED.value,
                                         cloud.UNIT_REMOVED_MSG)
        self.call_action()

        cloud.status_set.assert_called_with(cloud.WORKLOAD_STATES.ACTIVE,
                                            'Unit is ready')
        cloud.function_fail.assert_not_called()

    def test_action_starts_services(self):
        """Test that expected steps are executed when running action
        register-to-cloud"""
        self.call_action()

        cloud.service_resume.assert_called_with('nova-compute')
        cloud.function_fail.assert_not_called()
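
The `patch_all` helper used by these test cases starts one `patch.object` per attribute and registers the matching `stop` with `addCleanup`, so every patch is undone after each test even when the test fails. A self-contained sketch of that pattern is shown below; `fake_module`, `PatchAllExample` and `run_example` are hypothetical names, and a `SimpleNamespace` stands in for a real module such as `cloud`.

```python
import types
import unittest
from unittest.mock import MagicMock, patch

# Hypothetical stand-in for a module whose functions get mocked.
fake_module = types.SimpleNamespace(
    function_fail=lambda msg: 'real fail',
    status_set=lambda state, msg: 'real status',
)


class PatchAllExample(unittest.TestCase):
    def setUp(self):
        for name in ('function_fail', 'status_set'):
            patcher = patch.object(fake_module, name, MagicMock())
            patcher.start()
            # Restore the original attribute after each test.
            self.addCleanup(patcher.stop)

    def test_attribute_is_mocked(self):
        fake_module.function_fail('boom')
        fake_module.function_fail.assert_called_with('boom')
        fake_module.status_set.assert_not_called()


def run_example():
    """Run the example test case and return the unittest result."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(PatchAllExample)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Using `addCleanup` instead of a `tearDown` method keeps each patch and its undo step next to each other, which matters when subclasses extend `setUp` with additional mocks.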

@ -0,0 +1,105 @@
# Copyright 2020 Canonical Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from unittest import TestCase
from unittest.mock import MagicMock, patch

import nova_compute.cloud_utils as cloud_utils


class NovaServiceMock():

    def __init__(self, id, host, binary):
        self.id = id
        self.host = host
        self.binary = binary


class TestCloudUtils(TestCase):

    def __init__(self, methodName='runTest'):
        super(TestCloudUtils, self).__init__(methodName=methodName)

        self.nova_client = MagicMock()
        nova_services = MagicMock()
        nova_services.list.return_value = []
        self.nova_client.services = nova_services

        self.neutron_client = MagicMock()

        self.unit_hostname = 'nova-compute-0'

    def setUp(self):
        to_patch = [
            'loading',
            'log',
            'nova_client_',
            '_nova_cfg',
            'service_hostname',
        ]
        for object_ in to_patch:
            mock_ = patch.object(cloud_utils, object_, MagicMock())
            mock_.start()
            self.addCleanup(mock_.stop)

        cloud_utils._nova_cfg.return_value = MagicMock()
        cloud_utils.nova_client.return_value = self.nova_client
        cloud_utils.service_hostname.return_value = self.unit_hostname

    def test_os_credentials_content(self):
        """Test that function '_os_credentials' returns credentials
        in expected format"""
        credentials = cloud_utils._os_credentials()
        expected_keys = [
            'username',
            'password',
            'auth_url',
            'project_name',
            'project_domain_name',
            'user_domain_name',
        ]

        for key in expected_keys:
            self.assertIn(key, credentials.keys())

    def test_nova_service_not_present(self):
        """Test that function 'nova_service_id' raises expected exception if
        current unit is not registered in 'nova-cloud-controller'"""
        nova_client = MagicMock()
        nova_services = MagicMock()
        nova_services.list.return_value = []
        nova_client.services = nova_services
        cloud_utils.nova_client.return_value = nova_client

        self.assertRaises(RuntimeError, cloud_utils.nova_service_id,
                          nova_client)

    def test_nova_service_id_multiple_services(self):
        """Test that function 'nova_service_id' will log warning and return
        first ID in the event that multiple nova-compute services are present
        on the same host"""
        first_id = 0
        second_id = 1
        warning_msg = 'Host "{}" has more than 1 nova-compute service ' \
                      'registered. Selecting one ID ' \
                      'randomly.'.format(self.unit_hostname)

        self.nova_client.services.list.return_value = [
            NovaServiceMock(first_id, self.unit_hostname, 'nova-compute'),
            NovaServiceMock(second_id, self.unit_hostname, 'nova-compute'),
        ]

        service_id = cloud_utils.nova_service_id(self.nova_client)

        self.assertEqual(service_id, first_id)
        cloud_utils.log.assert_called_with(warning_msg, cloud_utils.WARNING)