Add Secure Clustering

* Add a connection-string based workflow to MicroStack;
  * microstack add-compute command can be run at the Control node in
    order to generate a connection string (an ASCII blob for the user);
  * the connection string contains:
    * an address of the control node;
    * a sha256 fingerprint of the TLS certificate used by the clustering
      service at the control node (which is used during verification
      similar to the Certificate Pinning approach);
    * an application credential id;
    * an application credential secret (short expiration time, reader
      role on the service project, restricted to listing the service
      catalog);
  * a MicroStack admin is expected to have ssh access to all nodes that
    will participate in a cluster. Establishing prior trust is left to
    the admin, which is reasonable since they provision the nodes;
  * a MicroStack admin is expected to securely copy a connection string
    to a compute node via ssh. Since it is short-lived and does not
    carry service secrets, it cannot be replayed once it has expired;
  * if the compute role is specified during microstack.init, a
    connection string is requested. It is used to send a request to the
    clustering service and to validate the certificate fingerprint. The
    credential ID and secret are POSTed to the clustering service for
    verification, and it responds with the necessary config data for
    the compute node upon successful authorization.
* Set up TLS termination for the clustering service;
  * run the Flask app as a uWSGI daemon behind nginx;
  * configure nginx to use a TLS certificate;
  * generate a self-signed TLS certificate.
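
The connection string described above can be pictured as a small
serialized mapping. The sketch below is only an illustration and not the
code from this change: it uses json and base64 from the standard library
in place of the msgpack serialization used by add-compute, and the
function names and field layout are hypothetical.

```python
import base64
import json

REQUIRED_FIELDS = {'hostname', 'fingerprint', 'id', 'secret'}


def encode_connection_string(hostname, fingerprint_hex, cred_id, cred_secret):
    """Pack the four fields into a single ASCII blob."""
    data = {
        'hostname': hostname,
        'fingerprint': fingerprint_hex,
        'id': cred_id,
        'secret': cred_secret,
    }
    raw = json.dumps(data, sort_keys=True).encode('utf-8')
    return base64.urlsafe_b64encode(raw).decode('ascii')


def decode_connection_string(blob):
    """Unpack a blob and make sure all expected fields are present."""
    raw = base64.urlsafe_b64decode(blob.encode('ascii'))
    data = json.loads(raw.decode('utf-8'))
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f'connection string lacks fields: {sorted(missing)}')
    return data
```

A compute node would decode the blob first and only then contact the
address it contains, so a truncated copy-paste fails early.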

This setup does not require PKI to be present for its own purposes of
joining compute nodes to the cluster. However, this does not mean that
PKI will not be used for TLS termination of the OpenStack endpoints.
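
Fingerprint-based verification can be sketched as follows. This is a
minimal illustration under stated assumptions: the certificate is
available as DER-encoded bytes, the expected fingerprint is a hex string,
and the helper names are hypothetical. The client in this change
delegates the actual check to urllib3.

```python
import hashlib
import hmac


def sha256_fingerprint(der_cert: bytes) -> str:
    """Return the SHA-256 digest of a DER-encoded certificate as hex."""
    return hashlib.sha256(der_cert).hexdigest()


def fingerprint_matches(der_cert: bytes, expected_hex: str) -> bool:
    """Compare the actual fingerprint against the pinned one."""
    actual = sha256_fingerprint(der_cert)
    # Accept both 'AB:CD:...' and 'abcd...' notations.
    normalized = expected_hex.replace(':', '').lower()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(actual, normalized)
```

Because the check depends only on the certificate bytes, it works with a
self-signed certificate and needs neither a CA nor DNS.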

Control node init workflow (non-interactive):

sudo microstack init --auto --control
microstack add-compute
<the connection string to be used at the compute node>

Compute node init workflow (non-interactive):

sudo microstack init --auto --compute --join <connection-string>
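
The flag combinations used in the two workflows above could be validated
along these lines. This is a hypothetical argparse sketch, not the parser
shipped with the snap: only the flag names come from this message, and
the rule that --auto with --compute needs --join is an assumption about
the non-interactive case.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog='microstack init')
    parser.add_argument('--auto', action='store_true',
                        help='run without interactive prompts')
    role = parser.add_mutually_exclusive_group()
    role.add_argument('--control', action='store_true')
    role.add_argument('--compute', action='store_true')
    parser.add_argument('--join', metavar='CONNECTION_STRING',
                        help='connection string printed by add-compute')
    return parser


def parse_init_args(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    # Without prompts there is no other way to obtain the string.
    if args.auto and args.compute and not args.join:
        parser.error('--auto --compute requires --join <connection-string>')
    if args.join and not args.compute:
        parser.error('--join is only meaningful together with --compute')
    return args
```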

Change-Id: I9596fe1e6e5c1a325cc71fd3bf0c78b660b9a83e
Dmitrii Shcherbakov 2020-10-12 22:12:39 +00:00
parent 81cbaa4433
commit 0ba5358865
25 changed files with 1080 additions and 188 deletions

View File

@@ -77,7 +77,7 @@ each other. Plus, you need a root password and other niceties. Run the
 init script to set all of that up:
 ```
-microstack.init --auto
+microstack init --auto --control
 ```
 (Note that you may leave --auto out at present. The init script will
@@ -114,7 +114,7 @@ sudo systemctl restart snap.microstack.*
 Create a test instance in your cloud.
-`microstack.launch cirros --name test`
+`microstack launch cirros --name test`
 This will launch a machine using the built-in cirros image. Once the
 machine is setup, verify that you can ping it, then tear it down.

View File

@@ -10,7 +10,18 @@ from init import credentials
 def _get_default_config():
     snap_common = os.getenv('SNAP_COMMON')
     return {
-        'config.clustered': False,
+        'config.is-clustered': False,
+        'config.cluster.tls-cert-path':
+        f'{snap_common}/etc/cluster/tls/cert.pem',
+        'config.cluster.tls-key-path':
+        f'{snap_common}/etc/cluster/tls/key.pem',
+        'config.cluster.fingerprint': 'null',
+        'config.cluster.hostname': 'null',
+        'config.cluster.credential-id': 'null',
+        'config.cluster.credential-secret': 'null',
         'config.post-setup': True,
         'config.keystone.region-name': 'microstack',
         'config.credentials.key-pair': '/home/{USER}/snap/{SNAP_NAME}'
@@ -70,16 +81,16 @@ def _setup_secrets():
     else:
         existing_cred_keys = []
     shell.config_set(**{
-        k: credentials.generate_password() for k in [
-            'config.credentials.mysql-root-password',
-            'config.credentials.rabbitmq-password',
-            'config.credentials.keystone-password',
-            'config.credentials.nova-password',
-            'config.credentials.cinder-password',
-            'config.credentials.neutron-password',
-            'config.credentials.placement-password',
-            'config.credentials.glance-password',
-            'config.credentials.ovn-metadata-proxy-shared-secret',
+        f'config.credentials.{k}': credentials.generate_password() for k in [
+            'mysql-root-password',
+            'rabbitmq-password',
+            'keystone-password',
+            'nova-password',
+            'cinder-password',
+            'neutron-password',
+            'placement-password',
+            'glance-password',
+            'ovn-metadata-proxy-shared-secret',
         ] if k not in existing_cred_keys
     })
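
The effect of the comprehension above, generating a secret only for
credentials that are not set yet, can be tried in isolation. In this
sketch secrets.token_urlsafe stands in for credentials.generate_password,
whose implementation is not shown here, and existing_keys is assumed to
hold the short key names. Both are assumptions made for illustration.

```python
import secrets

CREDENTIAL_NAMES = [
    'mysql-root-password',
    'rabbitmq-password',
    'keystone-password',
]


def missing_credentials(existing_keys):
    """Build config entries only for credentials that are not set yet."""
    return {
        f'config.credentials.{name}': secrets.token_urlsafe(24)
        for name in CREDENTIAL_NAMES
        if name not in existing_keys
    }
```

Re-running init then leaves previously generated passwords untouched.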

View File

@@ -19,6 +19,8 @@ setup:
     - "{snap_common}/etc/nova/uwsgi/snap"
     - "{snap_common}/etc/horizon/uwsgi/snap"
     - "{snap_common}/etc/placement/uwsgi/snap"
+    - "{snap_common}/etc/cluster/tls"
+    - "{snap_common}/etc/cluster/uwsgi/snap"
     - "{snap_common}/etc/rabbitmq"
     - "{snap_common}/fernet-keys"
     - "{snap_common}/lib"
@@ -31,6 +33,7 @@ setup:
     - "{snap_common}/etc/iscsi"
     - "{snap_common}/etc/target"
   templates:
+    cluster-nginx.conf.j2: "{snap_common}/etc/nginx/snap/sites-enabled/cluster.conf"
     keystone-nginx.conf.j2: "{snap_common}/etc/nginx/snap/sites-enabled/keystone.conf"
     keystone-snap.conf.j2: "{snap_common}/etc/keystone/keystone.conf.d/keystone-snap.conf"
     neutron-snap.conf.j2: "{snap_common}/etc/neutron/neutron.conf.d/neutron-snap.conf"
@@ -82,6 +85,9 @@ setup:
     "{snap_common}/etc/microstack.rc": 0644
     "{snap_common}/etc/microstack.json": 0644
   snap-config-keys:
+    is_clustered: 'config.is-clustered'
+    cluster_tls_cert_path: 'config.cluster.tls-cert-path'
+    cluster_tls_key_path: 'config.cluster.tls-key-path'
     region_name: 'config.keystone.region-name'
     keystone_password: 'config.credentials.keystone-password'
     nova_password: 'config.credentials.nova-password'
@@ -132,6 +138,12 @@ entry_points:
       - "{snap_common}/etc/keystone/keystone.conf.d"
     templates:
       keystone-api.ini.j2: "{snap_common}/etc/keystone/uwsgi/snap/keystone-api.ini"
+  cluster-uwsgi:
+    type: uwsgi
+    uwsgi-dir: "{snap_common}/etc/cluster/uwsgi/snap"
+    uwsgi-dir-override: "{snap_common}/etc/cluster/uwsgi"
+    templates:
+      cluster-api.ini.j2: "{snap_common}/etc/cluster/uwsgi/snap/cluster-api.ini"
   nginx:
     type: nginx
     config-file: "{snap_common}/etc/nginx/snap/nginx.conf"

View File

@ -0,0 +1,11 @@
[uwsgi]
module = cluster.daemon:app
uwsgi-socket = {{ snap_common }}/run/cluster-api.sock
buffer-size = 65535
master = true
enable-threads = true
processes = 2
thunder-lock = true
lazy-apps = true
home = {{ snap }}/usr
pyargv = {{ pyargv }}

View File

@ -0,0 +1,20 @@
server {
listen 10002 ssl;
error_log syslog:server=unix:/dev/log;
access_log syslog:server=unix:/dev/log;
{% if is_clustered %}
ssl_session_timeout 1d;
ssl_session_tickets off;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384;
ssl_certificate {{ cluster_tls_cert_path }};
ssl_certificate_key {{ cluster_tls_key_path }};
{% endif %}
location / {
include {{ snap }}/usr/conf/uwsgi_params;
uwsgi_param SCRIPT_NAME '';
uwsgi_pass unix://{{ snap_common }}/run/cluster-api.sock;
}
}

View File

@ -1,10 +0,0 @@
server {
listen 8011;
error_log syslog:server=unix:/dev/log;
access_log syslog:server=unix:/dev/log;
location / {
include {{ snap }}/usr/conf/uwsgi_params;
uwsgi_param SCRIPT_NAME '';
uwsgi_pass unix://{{ snap_common }}/run/keystone-api.sock;
}
}

View File

@@ -23,7 +23,7 @@ command[check_swap]={{ snap }}/usr/lib/nagios/plugins/check_swap -n ok -w 5 -c 1
 command[check_zombie_procs]={{ snap }}/usr/lib/nagios/plugins/check_procs -w 5 -c 10 -s Z
 command[check_total_procs]={{ snap }}/usr/lib/nagios/plugins/check_procs -w 220 -c 300
 command[check_rabbitmq_server]=python3 {{ snap }}/usr/lib/nagios/plugins/check_systemd.py snap.microstack.rabbitmq-server
-command[check_cluster_server]=python3 {{ snap }}/usr/lib/nagios/plugins/check_systemd.py snap.microstack.cluster-server
+command[check_cluster_server]=python3 {{ snap }}/usr/lib/nagios/plugins/check_systemd.py snap.microstack.cluster-uwsgi
 #command[check_external_bridge]=python3 {{ snap }}/usr/lib/nagios/plugins/check_systemd.py snap.icrostack.external-bridge
 command[check_glance_api]=python3 {{ snap }}/usr/lib/nagios/plugins/check_systemd.py snap.microstack.glance-api
 command[check_horizon_uwsgi]=python3 {{ snap }}/usr/lib/nagios/plugins/check_systemd.py snap.microstack.horizon-uwsgi

View File

@@ -14,7 +14,7 @@ mkdir -p ${OVN_LOGDIR}
 mkdir -p ${OVN_RUNDIR}
 mkdir -p ${OVN_SYSCONFDIR}/ovn
-if [ `basename $1` = 'ovn-ctl' -a `snapctl get config.clustered` == 'true' ]
+if [ `basename $1` = 'ovn-ctl' -a `snapctl get config.is-clustered` == 'true' ]
 then
     # TODO: replace this with a secure alternative once TLS is supported.
     # Create an SB TCP socket to be used by remote ovn-controller and neutron-ovn-metadata

View File

@@ -53,6 +53,8 @@ snapctl stop --disable $SNAP_INSTANCE_NAME.filebeat
 snapctl stop --disable $SNAP_INSTANCE_NAME.nrpe
 snapctl stop --disable $SNAP_INSTANCE_NAME.telegraf
+snapctl stop --disable $SNAP_INSTANCE_NAME.cluster-uwsgi
 mkdir -p $SNAP_DATA/lib/libvirt/images
 mkdir -p ${SNAP_COMMON}/log/libvirt/qemu
 # NOTE(dmitriis): there is currently no way to make sure this directory gets

View File

@@ -67,6 +67,21 @@ apps:
       - network
       # TODO: - microstack-support
+  # A proxy command to avoid calling <namespace>.<command>.
+  # TODO: potentially remove the individual commands completely in favor of this.
+  microstack:
+    command: bin/microstack
+    plugs:
+      - network
+      - mount-observe
+      - network-bind
+      - firewall-control
+      - network-control
+      - ssh-keys
+      - system-observe
+      - hardware-observe
+      # TODO: - microstack-support
   # OpenStack Service Configuration
   init:
     command: bin/microstack_init
@@ -81,6 +96,11 @@ apps:
       - hardware-observe # rabbitmq ?
       # TODO: - microstack-support
+  add-compute:
+    command: bin/microstack_add_compute
+    plugs:
+      - network
   # Keystone
   keystone-uwsgi:
     command: bin/snap-openstack launch keystone-uwsgi
@@ -600,16 +620,13 @@ apps:
       # TODO: - microstack-support
   # Cluster
-  cluster-server:
-    command: bin/flask run -p 10002 --host 0.0.0.0 # TODO: run as a uwsgi app
+  cluster-uwsgi:
+    command: bin/snap-openstack launch cluster-uwsgi
     daemon: simple
-    environment:
-      LC_ALL: C.UTF-8 # Makes flask happy
-      LANG: C.UTF-8 # Makes flask happy
-      FLASK_APP: $SNAP/lib/python3.8/site-packages/cluster/daemon.py
     plugs:
       - network
       - network-bind
+      # TODO: - microstack-support
   telegraf:
     command: bin/telegraf
@@ -686,11 +703,11 @@ parts:
       - uwsgi
       - git+https://opendev.org/x/snap.openstack#egg=snap.openstack
       - http://tarballs.openstack.org/nova/nova-stable-ussuri.tar.gz
-      - neutron
+      - https://tarballs.opendev.org/openstack/neutron/neutron-stable-ussuri.tar.gz
       - https://tarballs.opendev.org/openstack/glance/glance-stable-ussuri.tar.gz
       - https://tarballs.opendev.org/openstack/cinder/cinder-stable-ussuri.tar.gz
       - https://tarballs.opendev.org/openstack/placement/placement-stable-ussuri.tar.gz
-      - horizon
+      - https://tarballs.opendev.org/openstack/horizon/horizon-stable-ussuri.tar.gz
       - python-cinderclient
       - python-openstackclient
       - python-swiftclient
@@ -1453,6 +1470,25 @@ parts:
       rm $SNAPCRAFT_PART_INSTALL/bin/python3
       rm $SNAPCRAFT_PART_INSTALL/bin/python
+  microstack:
+    plugin: python
+    source: tools/microstack
+    stage-packages:
+      # note(dmitriis) in order to avoid conflicts about lib64/ld-linux-x86-64.so.2
+      # with other parts.
+      - libc6
+    build-environment: *python-build-environment
+    override-build: |
+      snapcraftctl build
+      `find $SNAPCRAFT_PART_INSTALL -name '__pycache__' | xargs rm -r`
+      `find $SNAPCRAFT_PART_INSTALL -name 'RECORD' | xargs rm`
+      rm $SNAPCRAFT_PART_INSTALL/pyvenv.cfg
+      rm $SNAPCRAFT_PART_INSTALL/bin/activate
+      rm $SNAPCRAFT_PART_INSTALL/bin/activate.csh
+      rm $SNAPCRAFT_PART_INSTALL/bin/activate.fish
+      rm $SNAPCRAFT_PART_INSTALL/bin/python3
+      rm $SNAPCRAFT_PART_INSTALL/bin/python
   # Clustering client and server
   cluster:
     plugin: python

View File

@@ -118,7 +118,7 @@ class Host():
     def init(self, args=['--auto']):
         print(f"Initializing the snap with {args}")
-        check(*self.prefix, 'sudo', 'microstack.init', *args)
+        check(*self.prefix, 'sudo', 'microstack', 'init', *args)
     def multipass(self):
         self.machine = petname.generate()

View File

@@ -38,16 +38,21 @@ class TestBasics(Framework):
         host.install()
         host.init([
             '--auto',
+            '--control',
             '--setup-loop-based-cinder-lvm-backend',
-            '--loop-device-file-size=32'
+            '--loop-device-file-size=24'
         ])
         prefix = host.prefix
         endpoints = check_output(
             *prefix, '/snap/bin/microstack.openstack', 'endpoint', 'list')
-        # Endpoints should be listening on 10.20.20.1
-        self.assertTrue("10.20.20.1" in endpoints)
+        control_ip = check_output(
+            *prefix, 'sudo', 'snap', 'get',
+            'microstack', 'config.network.control-ip'
+        )
+        # Endpoints should contain the control IP.
+        self.assertTrue(control_ip in endpoints)
         # Endpoints should not contain localhost
         self.assertFalse("localhost" in endpoints)

View File

@ -0,0 +1,122 @@
#!/usr/bin/env python3
import uuid
import secrets
import argparse
from datetime import datetime
from dateutil.relativedelta import relativedelta
from oslo_serialization import (
base64,
msgpackutils
)
from cluster.shell import config_get
from keystoneauth1.identity import v3
from keystoneauth1 import session
from keystoneclient.v3 import client
VALIDITY_PERIOD = relativedelta(minutes=20)
def _create_credential():
project_name = 'service'
domain_name = 'default'
# TODO: add support for TLS-terminated Keystone once this is supported.
auth = v3.password.Password(
auth_url="http://localhost:5000/v3",
username='nova',
password=config_get('config.credentials.nova-password'),
user_domain_name=domain_name,
project_domain_name=domain_name,
project_name=project_name
)
sess = session.Session(auth=auth)
keystone_client = client.Client(session=sess)
# Only allow this credential to list the Keystone catalog. After it
# expires, Keystone will return Unauthorized for requests made with tokens
# issued from that credential.
access_rules = [{
'method': 'GET',
'path': '/v3/auth/catalog',
'service': 'identity'
}]
# TODO: make the expiration time customizable since this may be used by
# automation or during live demonstrations where the lag between issuance
# and usage may be more than the expiration time.
expires_at = datetime.now() + VALIDITY_PERIOD
# Role objects themselves are not tied to a specific domain by default
# - this does not affect role assignments themselves which are scoped.
reader_role = keystone_client.roles.find(name='reader', domain_id=None)
return keystone_client.application_credentials.create(
name=f'cluster-join-{uuid.uuid4().hex}',
expires_at=expires_at,
access_rules=access_rules,
# Do not allow this app credential to create new app credentials.
unrestricted=False,
roles=[reader_role.id],
# Make the secret shorter than the default but secure enough.
secret=secrets.token_urlsafe(32)[:32]
)
def add_compute():
"""Generates connection string for adding a compute node to the cluster.
Steps:
* Make sure we are running in the clustered mode and this is a control
node which is an initial node in the cluster;
* Generate an application credential via Keystone scoped to the service
project with restricted capabilities (reader role and only able to list
the service catalog) and a short expiration time enough for a user to
copy the connection string to the compute node;
* Get an FQDN that will be used by the client to establish a connection to
the clustering service;
* Serialize the above data into a base64-encoded string.
"""
role = config_get('config.cluster.role')
if role != 'control':
raise Exception('Running add-compute is only supported on a'
' control node.')
app_cred = _create_credential()
data = {
# TODO: we do not use hostname verification, however, using
# an FQDN might be useful here since the host may be behind NAT
# with a split-horizon DNS implemented where a hostname would point
# us to a different IP.
'hostname': config_get('config.network.control-ip'),
# Store bytes since the representation will be shorter than with hex.
'fingerprint': bytes.fromhex(config_get('config.cluster.fingerprint')),
'id': app_cred.id,
'secret': app_cred.secret,
}
connection_string = base64.encode_as_text(msgpackutils.dumps(data))
# Print the connection string and an expiration notice to the user.
print('Use the following connection string to add a new compute node'
f' to the cluster (valid for {VALIDITY_PERIOD.minutes} minutes from'
f' this moment):\n{connection_string}')
def main():
parser = argparse.ArgumentParser(
description='add-compute',
usage='''add-compute
This command does not have subcommands - just run it to get a connection string
to be used when joining a node to the cluster.
''')
parser.parse_args()
add_compute()
if __name__ == '__main__':
main()
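
The secret length and validity window used by add-compute can be
examined separately. This sketch makes two assumptions that differ from
the script above: it uses timezone-aware UTC timestamps and a plain
timedelta instead of datetime.now() with relativedelta, and the helper
names are hypothetical.

```python
import secrets
from datetime import datetime, timedelta, timezone

VALIDITY = timedelta(minutes=20)


def make_join_secret(length=32):
    # token_urlsafe(32) encodes 32 random bytes as 43 URL-safe characters;
    # each retained character carries about 6 bits of randomness.
    return secrets.token_urlsafe(32)[:length]


def expiry_from(issued_at):
    """Return the moment after which the credential must be rejected."""
    return issued_at + VALIDITY


def is_expired(expires_at, now):
    return now >= expires_at
```

With the default length, the truncated secret still carries roughly 192
bits of randomness while staying short enough to copy by hand.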

View File

@@ -1,37 +1,100 @@
 #!/usr/bin/env python3
+import urllib3
 import json
-import requests
 from cluster import shell
-from cluster.shell import check_output
+CLUSTER_SERVICE_PORT = 10002
+class UnauthorizedRequestError(Exception):
+    pass
 def join():
     """Join an existing cluster as a compute node."""
-    config = json.loads(check_output('snapctl', 'get', 'config'))
-    password = config['cluster']['password']
-    control_ip = config['network']['control-ip']
-    my_ip = config['network']['compute-ip']
-    if not password:
-        raise Exception("No cluster password specified!")
-    resp = requests.post(
-        'http://{}:10002/join'.format(control_ip),
-        json={'password': password, 'ip_address': my_ip})
-    if resp.status_code != 200:
-        # TODO better error and formatting.
-        raise Exception('Failed to get info from control node: {}'.format(
-            resp.json))
-    resp = resp.json()
-    credentials = resp['config']['credentials']
+    cluster_config = shell.config_get('config.cluster')
+    control_hostname = cluster_config['hostname']
+    fingerprint = cluster_config['fingerprint']
+    credential_id = cluster_config['credential-id']
+    credential_secret = cluster_config['credential-secret']
+    request_body = json.dumps({
+        'credential-id': credential_id,
+        'credential-secret': credential_secret
+    })
+    # Create a connection pool and override the TLS certificate
+    # verification method to use the certificate fingerprint instead
+    # of hostname validation + validation via CA cert and expiration time.
+    # This avoids relying on any kind of PKI and DNS assumptions in the
+    # installation environment.
+    # If the fingerprint does not match, MaxRetryError will be raised
+    # with SSLError as a cause even with the rest of the checks disabled.
+    conn_pool = urllib3.HTTPSConnectionPool(
+        control_hostname, CLUSTER_SERVICE_PORT,
+        assert_fingerprint=fingerprint, assert_hostname=False,
+        cert_reqs='CERT_NONE',
+    )
+    try:
+        resp = conn_pool.urlopen(
+            'POST', '/join', retries=0, preload_content=True,
+            headers={
+                'API-VERSION': '1.0.0',
+                'Content-Type': 'application/json',
+            }, body=request_body)
+    except urllib3.exceptions.MaxRetryError as e:
+        if isinstance(e.reason, urllib3.exceptions.SSLError):
+            raise Exception(
+                'The actual clustering service certificate fingerprint'
+                ' did not match the expected one, please make sure that: '
+                '(1) a correct token was specified during initialization;'
+                ' (2) MITM attacks are not performed against HTTPS requests'
+                ' (including transparent proxies).'
+            ) from e.reason
+        raise Exception('Could not retrieve a response from the clustering'
+                        ' service.') from e
+    if resp.status == 401:
+        response_data = resp.data.decode('utf-8')
+        # TODO: this should be more bulletproof in case a proxy server
+        # returns this response - it will not have the expected format.
+        print('An authorization failure has occurred while joining the'
+              ' cluster: please make sure the connection string'
+              ' was entered as returned by the "add-compute" command'
+              ' and that it was used before its expiration time.')
+        if response_data:
+            message = json.loads(response_data)['message']
+            raise UnauthorizedRequestError(message)
+        raise UnauthorizedRequestError()
+    if resp.status != 200:
+        raise Exception('Unexpected response status received from the'
+                        f' clustering service: {resp.status}')
+    try:
+        response_data = resp.data.decode('utf-8')
+    except UnicodeDecodeError:
+        raise Exception('The response from the clustering service contains'
+                        ' bytes invalid for UTF-8')
+    if not response_data:
+        raise Exception('The response from the clustering service is empty'
+                        ' which is unexpected: please check its status'
+                        ' and file an issue if the problem persists')
+    # Load the response assuming it has the correct format. API versioning
+    # should rule out inconsistencies, otherwise we will get an error here.
+    response_dict = json.loads(response_data)
+    credentials = response_dict['config']['credentials']
     control_creds = {f'config.credentials.{k}': v
                      for k, v in credentials.items()}
     shell.config_set(**control_creds)
+    # TODO: use the hostname from the connection string instead to
+    # resolve an IP address (requires a valid DNS setup).
+    control_ip = response_dict['config']['network']['control-ip']
+    shell.config_set(**{'config.network.control-ip': control_ip})
 if __name__ == '__main__':
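
The client sends an API-VERSION header with every join request, and the
clustering service compares its major component with its own version.
The sketch below illustrates that comparison with plain tuples instead
of the semantic_version package used by the daemon. The function names
are hypothetical, and the status codes mirror the exception classes
defined in the daemon (400, 410, 501).

```python
SERVER_VERSION = (1, 0, 0)


def parse_version(text):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    parts = text.split('.')
    if len(parts) != 3 or not all(p.isascii() and p.isdigit() for p in parts):
        raise ValueError(f'invalid version: {text!r}')
    return tuple(int(p) for p in parts)


def status_for(header_value):
    """Map the API version header of a request to an HTTP status code."""
    if header_value is None:
        return 400  # header missing
    try:
        major = parse_version(header_value)[0]
    except ValueError:
        return 400  # header present but malformed
    if major > SERVER_VERSION[0]:
        return 501  # newer than what the service implements
    if major < SERVER_VERSION[0]:
        return 410  # older version that has been dropped
    return 200
```

Only the major component decides the outcome, so a client sending 1.2.0
to a 1.0.0 service is still accepted in this sketch.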

View File

@ -1,51 +1,287 @@
import logging
import json import json
from flask import Flask, request import semantic_version
import keystoneclient.exceptions as kc_exceptions
from flask import Flask, request, jsonify
from werkzeug.exceptions import BadRequest
from cluster.shell import check_output from cluster.shell import check_output
from keystoneauth1.identity import v3
from keystoneauth1 import session
from keystoneclient.v3 import client as v3client
logger = logging.getLogger(__name__)
app = Flask(__name__) app = Flask(__name__)
API_VERSION = semantic_version.Version('1.0.0')
class Unauthorized(Exception): class Unauthorized(Exception):
pass pass
def join_info(password, ip_address): class APIException(Exception):
our_password = check_output('snapctl', 'get', 'config.cluster.password') status_code = None
message = ''
if password.strip() != our_password.strip(): def to_dict(self):
raise Unauthorized() return {'message': self.message}
# Load config
class APIVersionMissing(APIException):
status_code = 400
message = 'An API version was not specified in the request.'
class APIVersionInvalid(APIException):
status_code = 400
message = 'Invalid API version was specified in the request.'
class APIVersionDropped(APIException):
status_code = 410
message = 'The requested join API version is no longer supported.'
class APIVersionNotImplemented(APIException):
status_code = 501
message = 'The requested join API version is not yet implemented.'
class InvalidJSONInRequest(APIException):
status_code = 400
message = 'The request includes invalid JSON.'
class IncorrectContentType(APIException):
status_code = 400
message = ('The request does not have a Content-Type header set to '
'application/json.')
class MissingAuthDataInRequest(APIException):
status_code = 400
message = 'The request does not have the required authentication data.'
class InvalidAuthDataFormatInRequest(APIException):
status_code = 400
message = 'The authentication data in the request has invalid format.'
class InvalidAuthDataInRequest(APIException):
status_code = 400
message = 'The authentication data in the request is invalid.'
class AuthorizationFailed(APIException):
status_code = 401
message = ('Failed to pass authorization using the data provided in the'
' request')
class UnexpectedError(APIException):
status_code = 500
message = ('The clustering server has encountered an unexpected'
' error while handling the request.')
def _handle_api_version_exception(error):
response = jsonify(error.to_dict())
response.status_code = error.status_code
return response
@app.errorhandler(APIVersionMissing)
def handle_api_version_missing(error):
return _handle_api_version_exception(error)
@app.errorhandler(APIVersionInvalid)
def handle_api_version_invalid(error):
return _handle_api_version_exception(error)
@app.errorhandler(APIVersionDropped)
def handle_api_version_dropped(error):
return _handle_api_version_exception(error)
@app.errorhandler(APIVersionNotImplemented)
def handle_api_version_not_implemented(error):
return _handle_api_version_exception(error)
@app.errorhandler(IncorrectContentType)
def handle_incorrect_content_type(error):
return _handle_api_version_exception(error)
@app.errorhandler(InvalidJSONInRequest)
def handle_invalid_json_in_request(error):
return _handle_api_version_exception(error)
@app.errorhandler(MissingAuthDataInRequest)
def handle_missing_auth_data_in_request(error):
return _handle_api_version_exception(error)
@app.errorhandler(InvalidAuthDataInRequest)
def handle_invalid_auth_data_in_request(error):
return _handle_api_version_exception(error)
@app.errorhandler(InvalidAuthDataFormatInRequest)
def handle_invalid_auth_data_format_in_request(error):
return _handle_api_version_exception(error)
@app.errorhandler(AuthorizationFailed)
def handle_authorization_failed(error):
return _handle_api_version_exception(error)
@app.errorhandler(UnexpectedError)
def handle_unexpected_error(error):
return _handle_api_version_exception(error)
def join_info():
"""Generate the configuration information to return to a client."""
# TODO: be selective about what we return. For now, we just get everything. # TODO: be selective about what we return. For now, we just get everything.
config = json.loads(check_output('snapctl', 'get', 'config')) config = json.loads(check_output('snapctl', 'get', 'config'))
info = {'config': config} info = {'config': config}
return info return info
@app.route('/join', methods=['POST'])
def join():
"""Authorize a client node and return relevant config."""
# Retrieve an API version from the request - it is a mandatory
# header for this API.
request_version = request.headers.get('API-Version')
if request_version is None:
logger.debug('The client has not specified the API-version header.')
raise APIVersionMissing()
else:
try:
api_version = semantic_version.Version(request_version)
except ValueError:
logger.debug('The client has specified an invalid API version'
f': {request_version}')
raise APIVersionInvalid()
# Compare the API version used by the clustering service with the
# one specified in the request and return an appropriate response.
if api_version.major > API_VERSION.major:
logger.debug('The client requested a version that is not'
f' supported yet: {api_version}.')
raise APIVersionNotImplemented()
elif api_version.major < API_VERSION.major:
logger.debug('The client request version is no longer supported'
f': {api_version}.')
raise APIVersionDropped()
else:
# Flask raises a BadRequest if the JSON content is invalid and
# returns None if the Content-Type header is missing or not set
# to application/json.
try:
req_json = request.json
except BadRequest:
logger.debug('The client has POSTed an invalid JSON'
' in the request.')
raise InvalidJSONInRequest()
if req_json is None:
logger.debug('The client has not specified the application/json'
' content type in the request.')
raise IncorrectContentType()
# So far we don't have any minor versions with backwards-compatible
# changes so just assume that all data will be present or error out.
credential_id = req_json.get('credential-id')
credential_secret = req_json.get('credential-secret')
if not credential_id or not credential_secret:
logger.debug('The client has not specified the required'
' authentication data in the request.')
raise MissingAuthDataInRequest()
# TODO: handle https here when TLS termination support is added.
keystone_base_url = 'http://localhost:5000/v3'
# In an unlikely event of failing to construct an auth object
# treat it as if invalid data got passed in terms of responding
# to the client.
try:
auth = v3.ApplicationCredential(
auth_url=keystone_base_url,
application_credential_id=credential_id,
application_credential_secret=credential_secret
)
except Exception:
logger.exception('An exception has occurred while trying to build'
' an auth object for an application credential'
' passed from the clustering client.')
raise InvalidAuthDataInRequest()
try:
# Use the auth object with the app credential to create a session
# which the Keystone client will use.
sess = session.Session(auth=auth)
except Exception:
logger.exception('An exception has occurred while trying to build'
' a Session object with auth data'
' passed from the clustering client.')
raise UnexpectedError()
try:
keystone_client = v3client.Client(session=sess)
except Exception:
logger.exception('An exception has occurred while trying to build'
' a Keystone Client object with auth data'
' passed from the clustering client.')
raise UnexpectedError()
try:
# The add-compute command creates application credentials that
# allow access to /v3/auth/catalog with an expiration time.
# Authorization failures occur after an app credential expires
# in which case an error is returned to the client.
keystone_client.get(f'{keystone_base_url}/auth/catalog')
except (kc_exceptions.AuthorizationFailure,
kc_exceptions.Unauthorized):
logger.exception('Failed to get a Keystone token'
' with the application credentials'
' passed from the clustering client.')
raise AuthorizationFailed()
except ValueError:
logger.exception('Insufficient amount of parameters were'
' used in the request to Keystone.')
raise UnexpectedError()
except kc_exceptions.ConnectionError:
logger.exception('Failed to connect to Keystone')
raise UnexpectedError()
except kc_exceptions.SSLError:
logger.exception('A TLS-related error has occurred while'
' connecting to Keystone')
raise UnexpectedError()
# We were able to authenticate against Keystone using the
# application credential and verify that it has not expired
# so the information for a compute node to join the cluster can
# now be returned.
return json.dumps(join_info())
@app.route('/') @app.route('/')
def home(): def home():
status = { status = {
'status': 'running', 'status': 'running',
'info': 'Microstack clustering daemon.' 'info': 'MicroStack clustering daemon.'
} }
return json.dumps(status) return json.dumps(status)
@app.route('/join', methods=['POST'])
def join():
req = request.json # TODO: better error messages on failed parse.
password = req.get('password')
ip_address = req.get('ip_address')
if not password:
return 'No password specified', 500
try:
return json.dumps(join_info(password, ip_address))
except Unauthorized:
return (json.dumps({'error': 'Incorrect password.'}), 500)
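
Reviewer note: the body checks performed by the new join handler can be sketched in isolation as below. This is a minimal sketch, not the handler itself: `JoinRequestError` and `parse_join_body` are hypothetical names used only for illustration, and the real handler raises its own exception classes and relies on Flask for JSON parsing.

```python
import json
from typing import Tuple


class JoinRequestError(ValueError):
    """Hypothetical error type standing in for the handler's exceptions."""


def parse_join_body(raw_body: bytes) -> Tuple[str, str]:
    """Return (credential_id, credential_secret) or raise JoinRequestError."""
    try:
        payload = json.loads(raw_body)
    except ValueError as exc:
        # json.JSONDecodeError and UnicodeDecodeError both derive from
        # ValueError, so malformed or undecodable bodies end up here.
        raise JoinRequestError('invalid JSON in the request') from exc
    if not isinstance(payload, dict):
        raise JoinRequestError('a JSON object is required')
    credential_id = payload.get('credential-id')
    credential_secret = payload.get('credential-secret')
    # Both fields must be present and non-empty before Keystone is contacted.
    if not credential_id or not credential_secret:
        raise JoinRequestError('missing authentication data')
    return credential_id, credential_secret
```

Keeping these checks ahead of any Keystone call means a malformed request is rejected without creating a session.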


@ -8,6 +8,7 @@ setup(
entry_points={ entry_points={
'console_scripts': [ 'console_scripts': [
'microstack_join = cluster.client:join', 'microstack_join = cluster.client:join',
'microstack_add_compute = cluster.add_compute:main',
], ],
} }
) )


@ -0,0 +1,75 @@
#!/usr/bin/env python3
from pathlib import Path
from datetime import datetime
from dateutil.relativedelta import relativedelta
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography import x509
from cryptography.x509.oid import NameOID
from init import shell
def generate_selfsigned():
"""Generate a self-signed certificate with associated keys.
The certificate will have a fake CNAME and subjAltName since
the expectation is that this certificate will only be used by
clients that know its fingerprint and will not use a validation
via a CA certificate and hostname. This approach is similar to
Certificate Pinning, however, here a certificate is not embedded
into the application but is generated on initialization at one
node and its fingerprint is copied in a token to another node
via a secure channel.
https://owasp.org/www-community/controls/Certificate_and_Public_Key_Pinning
"""
cert_path, key_path = (
Path(shell.config_get('config.cluster.tls-cert-path')),
Path(shell.config_get('config.cluster.tls-key-path')),
)
# Do not generate a new certificate and key if there is already an existing
# pair. TODO: improve this check and allow renewal.
if cert_path.exists() and key_path.exists():
return
dummy_cn = 'microstack.run'
key = rsa.generate_private_key(
public_exponent=65537,
key_size=2048,
backend=default_backend(),
)
common_name = x509.Name([
x509.NameAttribute(NameOID.COMMON_NAME, dummy_cn)
])
san = x509.SubjectAlternativeName([x509.DNSName(dummy_cn)])
basic_constraints = x509.BasicConstraints(ca=True, path_length=0)
now = datetime.utcnow()
cert = (
x509.CertificateBuilder()
.subject_name(common_name)
.issuer_name(common_name)
.public_key(key.public_key())
.serial_number(x509.random_serial_number())
.not_valid_before(now)
.not_valid_after(now + relativedelta(years=10))
.add_extension(basic_constraints, False)
.add_extension(san, False)
.sign(key, hashes.SHA256(), default_backend())
)
cert_fprint = cert.fingerprint(hashes.SHA256()).hex()
shell.config_set(**{'config.cluster.fingerprint': cert_fprint})
serialized_cert = cert.public_bytes(encoding=serialization.Encoding.PEM)
serialized_key = key.private_bytes(
encoding=serialization.Encoding.PEM,
format=serialization.PrivateFormat.PKCS8,
encryption_algorithm=serialization.NoEncryption(),
)
cert_path.write_bytes(serialized_cert)
key_path.write_bytes(serialized_key)
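
Reviewer note: the fingerprint stored above is a SHA-256 digest over the DER encoding of the certificate, so a joining node can recompute it from a PEM certificate with the standard library alone. The sketch below assumes that workflow; `pem_fingerprint` and `fingerprint_matches` are hypothetical helper names and are not part of this change.

```python
import hashlib
import hmac
import ssl


def pem_fingerprint(pem_cert: str) -> str:
    """Return the hex SHA-256 digest of the DER form of a PEM certificate."""
    der = ssl.PEM_cert_to_DER_cert(pem_cert)
    return hashlib.sha256(der).hexdigest()


def fingerprint_matches(pem_cert: str, expected_hex: str) -> bool:
    """Compare a certificate against a pinned fingerprint."""
    # compare_digest avoids leaking the position of the first mismatch
    # through timing differences.
    return hmac.compare_digest(pem_fingerprint(pem_cert),
                               expected_hex.lower())
```

A client would typically obtain the PEM text from the server it connects to (for example via `ssl.get_server_certificate`) and refuse to continue when `fingerprint_matches` returns `False`.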


@ -31,15 +31,18 @@ limitations under the License.
import argparse import argparse
import logging import logging
import secrets
import string
import sys import sys
import socket import socket
from functools import wraps from functools import wraps
from init.config import log from init.config import log
from init.shell import default_network, check, check_output from init.shell import (
default_network,
check,
check_output,
config_set,
)
from init import questions from init import questions
@ -69,12 +72,18 @@ def parse_init_args():
parser = argparse.ArgumentParser() parser = argparse.ArgumentParser()
parser.add_argument('--auto', '-a', action='store_true', parser.add_argument('--auto', '-a', action='store_true',
help='Run non interactively.') help='Run non interactively.')
parser.add_argument('--cluster-password') parser.add_argument('--join', '-j',
dest='connection_string',
help='Pass a connection string generated by the'
' add-compute command at the control node'
' (required for compute nodes, unused for control'
' nodes).')
parser.add_argument('--compute', action='store_true') parser.add_argument('--compute', action='store_true')
parser.add_argument('--control', action='store_true') parser.add_argument('--control', action='store_true')
parser.add_argument('--debug', action='store_true') parser.add_argument('--debug', action='store_true')
parser.add_argument( parser.add_argument(
'--setup-loop-based-cinder-lvm-backend', '--setup-loop-based-cinder-lvm-backend',
default=False,
action='store_true', action='store_true',
help='(experimental) set up a loop device-backed' help='(experimental) set up a loop device-backed'
' LVM backend for Cinder.' ' LVM backend for Cinder.'
@ -94,41 +103,42 @@ def process_init_args(args):
values in our snap config, based on those args. values in our snap config, based on those args.
""" """
auto = args.auto or args.control or args.compute if args.auto and not (args.control or args.compute):
raise ValueError('A role (--compute or --control) must be specified '
'when using --auto.')
if args.compute or args.control: if args.compute or args.control:
check('snapctl', 'set', 'config.clustered=true') config_set(**{'config.is-clustered': 'true'})
if args.compute: if args.compute:
check('snapctl', 'set', 'config.cluster.role=compute') config_set(**{'config.cluster.role': 'compute'})
if args.control: if args.control:
# If both compute and control are passed for some reason, we # If both compute and control are passed for some reason, we
# wind up with the role of 'control', which is best, as a # wind up with the role of 'control', which is best, as a
# control node also serves as a compute node in our hyper # control node also serves as a compute node in our hyper
# converged architecture. # converged architecture.
check('snapctl', 'set', 'config.cluster.role=control') config_set(**{'config.cluster.role': 'control'})
if args.cluster_password: if args.connection_string:
check('snapctl', 'set', 'config.cluster.password={}'.format( config_set(**{
args.cluster_password)) 'config.cluster.connection-string.raw': args.connection_string})
if auto and not args.cluster_password: if args.auto and not args.control and not args.connection_string:
alphabet = string.ascii_letters + string.digits raise ValueError('The connection string parameter must be specified'
password = ''.join(secrets.choice(alphabet) for i in range(10)) ' for compute nodes.')
check('snapctl', 'set', 'config.cluster.password={}'.format(
password))
if args.debug: if args.debug:
log.setLevel(logging.DEBUG) log.setLevel(logging.DEBUG)
check('snapctl', 'set', config_set(**{
f'config.cinder.setup-loop-based-cinder-lvm-backend=' 'config.cinder.setup-loop-based-cinder-lvm-backend':
f'{str(args.setup_loop_based_cinder_lvm_backend).lower()}') f'{str(args.setup_loop_based_cinder_lvm_backend).lower()}',
check('snapctl', 'set', 'config.cinder.loop-device-file-size':
f'config.cinder.loop-device-file-size={args.loop_device_file_size}G') f'{args.loop_device_file_size}G',
})
return auto return args.auto
@requires_sudo @requires_sudo
@ -136,8 +146,13 @@ def init() -> None:
args = parse_init_args() args = parse_init_args()
auto = process_init_args(args) auto = process_init_args(args)
# Do not ask about this if a CLI argument asking for it has been
# provided already.
cinder_lvm_question = questions.CinderVolumeLVMSetup()
if args.setup_loop_based_cinder_lvm_backend:
cinder_lvm_question.interactive = False
question_list = [ question_list = [
questions.Clustering(),
questions.DnsServers(), questions.DnsServers(),
questions.DnsDomain(), questions.DnsDomain(),
questions.NetworkSettings(), questions.NetworkSettings(),
@ -157,11 +172,31 @@ def init() -> None:
questions.GlanceSetup(), questions.GlanceSetup(),
questions.SecurityRules(), questions.SecurityRules(),
questions.CinderSetup(), questions.CinderSetup(),
questions.CinderVolumeLVMSetup(), cinder_lvm_question,
questions.PostSetup(), questions.PostSetup(),
questions.ExtraServicesQuestion(), questions.ExtraServicesQuestion(),
] ]
clustering_question = questions.Clustering()
# If the connection string is specified we definitely
# want to set up clustering and we don't need to ask.
if args.connection_string:
if args.auto:
clustering_question.interactive = False
if args.control:
raise ValueError('Joining additional control nodes is'
' not supported.')
elif args.compute:
clustering_question.role_interactive = False
config_set(**{'config.is-clustered': 'true'})
clustering_question.connection_string_interactive = False
clustering_question.yes(answer=True)
else:
if args.control or args.compute:
clustering_question.role_interactive = False
# The same code-path as for other questions will be executed.
question_list.insert(0, clustering_question)
for question in question_list: for question in question_list:
if auto: if auto:
# Force all questions to be non-interactive if we passed --auto. # Force all questions to be non-interactive if we passed --auto.
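
Reviewer note: the argument rules introduced in this hunk can be exercised on their own. The following is a minimal sketch that copies only the relevant flags and the two checks from `process_init_args`; `parse_and_check` is a hypothetical wrapper and none of the snap configuration calls are reproduced.

```python
import argparse


def parse_and_check(argv):
    """Parse a reduced set of init flags and apply the --auto rules."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--auto', '-a', action='store_true')
    parser.add_argument('--join', '-j', dest='connection_string')
    parser.add_argument('--compute', action='store_true')
    parser.add_argument('--control', action='store_true')
    args = parser.parse_args(argv)

    # Non-interactive runs cannot ask for a role, so one must be given.
    if args.auto and not (args.control or args.compute):
        raise ValueError('A role (--compute or --control) must be specified'
                         ' when using --auto.')
    # A compute node cannot join without a connection string.
    if args.auto and not args.control and not args.connection_string:
        raise ValueError('The connection string parameter must be specified'
                         ' for compute nodes.')
    return args
```

For example, `parse_and_check(['--auto', '--compute'])` raises `ValueError`, while adding `--join <string>` lets the same invocation pass.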


@ -31,6 +31,7 @@ from init import shell
from init.shell import (check, call, check_output, sql, nc_wait, log_wait, from init.shell import (check, call, check_output, sql, nc_wait, log_wait,
start, restart, download, disable, enable) start, restart, download, disable, enable)
from init.config import Env, log from init.config import Env, log
from init import cluster_tls
from init.questions.question import Question from init.questions.question import Question
from init.questions import clustering, network, uninstall # noqa F401 from init.questions import clustering, network, uninstall # noqa F401
@ -46,66 +47,68 @@ class ConfigError(Exception):
class Clustering(Question): class Clustering(Question):
"""Possibly setup clustering.""" """Possibly configure clustering."""
_type = 'boolean' _type = 'boolean'
_question = 'Do you want to setup clustering?' _question = 'Would you like to configure clustering?'
config_key = 'config.clustered' config_key = 'config.is-clustered'
interactive = True interactive = True
# Overrides to be used when options are explicitly specified via
# command-line arguments.
connection_string_interactive = True
role_interactive = True
def yes(self, answer: bool): def yes(self, answer: bool):
log.info('Configuring clustering ...') log.info('Configuring clustering ...')
role_question = clustering.Role()
if not (self.interactive and self.role_interactive):
role_question.interactive = False
role_question.ask()
questions = [ questions = [
clustering.Role(), # Skipped for the compute role and is automatically taken
clustering.Password(), # from the connection string.
clustering.ControlIp(), clustering.ControlIp(),
clustering.ComputeIp(), # Automagically skipped role='control' # Skipped for the control role since it is identical to the
# control node IP.
clustering.ComputeIp(),
] ]
for question in questions: for question in questions:
if not self.interactive: if not self.interactive:
question.interactive = False question.interactive = False
question.ask() question.ask()
role = check_output('snapctl', 'get', 'config.cluster.role') connection_string_question = clustering.ConnectionString()
control_ip = check_output('snapctl', 'get', if not (self.interactive and self.connection_string_interactive):
'config.network.control-ip') connection_string_question.interactive = False
password = check_output('snapctl', 'get', 'config.cluster.password') connection_string_question.ask()
log.debug('Role: {}, IP: {}, Password: {}'.format( role = shell.config_get('config.cluster.role')
role, control_ip, password))
# TODO: raise an exception if any of the above are None (can
# happen if we're automatig and mess up our params.)
if role == 'compute': if role == 'compute':
log.info('I am a compute node.') log.info('Setting up as a compute node.')
# Gets config info and sets local env vals. # Gets config info and sets local env vals.
check_output('microstack_join') check_output('microstack_join')
shell.config_set(**{
# Set default question answers. 'config.services.control-plane': 'false',
check('snapctl', 'set', 'config.services.control-plane=false') 'config.services.hypervisor': 'true',
check('snapctl', 'set', 'config.services.hypervisor=true') })
if role == 'control': if role == 'control':
log.info('I am a control node.') log.info('Setting up as a control node.')
check('snapctl', 'set', 'config.services.control-plane=true') shell.config_set(**{
# We want to run a hypervisor on our control plane nodes 'config.services.control-plane': 'true',
# -- this is essentially a hyper converged cloud. 'config.services.hypervisor': 'true',
check('snapctl', 'set', 'config.services.hypervisor=true') })
# Generate a self-signed certificate for the clustering service.
# TODO: if this is run after init has already been called, cluster_tls.generate_selfsigned()
# need to restart services.
# Write templates # Write templates
check('snap-openstack', 'setup') check('snap-openstack', 'setup')
def no(self, answer: bool): def no(self, answer: bool):
# Turn off cluster server disable('cluster-uwsgi')
# TODO: it would be more secure to reverse this -- only enable
# to service if we are doing clustering.
disable('cluster-server')
class ConfigQuestion(Question): class ConfigQuestion(Question):
@ -554,6 +557,10 @@ class NovaControlPlane(Question):
) )
check('openstack', 'role', 'add', '--project', check('openstack', 'role', 'add', '--project',
'service', '--user', 'nova', 'admin') 'service', '--user', 'nova', 'admin')
# Assign the reader role to the nova user so that read-only
# application credentials can be created.
check('openstack', 'role', 'add', '--project',
'service', '--user', 'nova', 'reader')
# Use snapctl to start nova services. We need to call them # Use snapctl to start nova services. We need to call them
# out manually, because systemd doesn't know about them yet. # out manually, because systemd doesn't know about them yet.
@ -680,8 +687,8 @@ class CinderVolumeLVMSetup(Question):
_type = 'boolean' _type = 'boolean'
config_key = 'config.cinder.setup-loop-based-cinder-lvm-backend' config_key = 'config.cinder.setup-loop-based-cinder-lvm-backend'
_question = ('(experimental) Do you want to setup a loop device-backed LVM' _question = ('(experimental) Would you like to setup a loop device-backed'
' volume backend for Cinder?') ' LVM volume backend for Cinder?')
interactive = True interactive = True
def yes(self, answer: bool) -> None: def yes(self, answer: bool) -> None:
@ -780,6 +787,7 @@ class NeutronControlPlane(Question):
'neutron-ovn-metadata-agent' 'neutron-ovn-metadata-agent'
]: ]:
enable(service) enable(service)
restart(service)
# Disable the other services. # Disable the other services.
for service in [ for service in [
@ -904,7 +912,6 @@ class PostSetup(Question):
config_key = 'config.post-setup' config_key = 'config.post-setup'
def yes(self, answer: str) -> None: def yes(self, answer: str) -> None:
log.info('restarting libvirt and virtlogd ...') log.info('restarting libvirt and virtlogd ...')
# This fixes an issue w/ logging not getting set. # This fixes an issue w/ logging not getting set.
# TODO: fix issue. # TODO: fix issue.
@ -912,7 +919,13 @@ class PostSetup(Question):
restart('virtlogd') restart('virtlogd')
restart('nova-compute') restart('nova-compute')
restart('horizon-uwsgi') role = shell.config_get('config.cluster.role')
if role == 'control':
# TODO: since snap-openstack launch is used, this depends on the
# database readiness and hence the clustering service is enabled
# and started here. There needs to be a better way to do this.
enable('cluster-uwsgi')
restart('horizon-uwsgi')
check('snapctl', 'set', 'initialized=true') check('snapctl', 'set', 'initialized=true')
log.info('Complete. Marked microstack as initialized!') log.info('Complete. Marked microstack as initialized!')
@ -936,7 +949,7 @@ class SimpleServiceQuestion(Question):
class ExtraServicesQuestion(Question): class ExtraServicesQuestion(Question):
_type = 'boolean' _type = 'boolean'
_question = 'Do you want to setup extra services?' _question = 'Would you like to setup extra services?'
config_key = 'config.services.extra.enabled' config_key = 'config.services.extra.enabled'
interactive = True interactive = True
@ -958,7 +971,7 @@ class ExtraServicesQuestion(Question):
class Filebeat(SimpleServiceQuestion): class Filebeat(SimpleServiceQuestion):
_type = 'boolean' _type = 'boolean'
_question = 'Do you want to enable Filebeat?' _question = 'Would you like to enable Filebeat?'
config_key = 'config.services.extra.filebeat' config_key = 'config.services.extra.filebeat'
interactive = True interactive = True
@ -971,7 +984,7 @@ class Filebeat(SimpleServiceQuestion):
class Telegraf(SimpleServiceQuestion): class Telegraf(SimpleServiceQuestion):
_type = 'boolean' _type = 'boolean'
_question = 'Do you want to enable Telegraf?' _question = 'Would you like to enable Telegraf?'
config_key = 'config.services.extra.telegraf' config_key = 'config.services.extra.telegraf'
interactive = True interactive = True
@ -984,7 +997,7 @@ class Telegraf(SimpleServiceQuestion):
class Nrpe(SimpleServiceQuestion): class Nrpe(SimpleServiceQuestion):
_type = 'boolean' _type = 'boolean'
_question = 'Do you want to enable NRPE?' _question = 'Would you like to enable NRPE?'
config_key = 'config.services.extra.nrpe' config_key = 'config.services.extra.nrpe'
interactive = True interactive = True


@ -1,13 +1,32 @@
from getpass import getpass import logging
import msgpack
import re
import netaddr
from cryptography.hazmat.primitives import hashes
from typing import Tuple
from init.questions.question import Question, InvalidAnswer from init.questions.question import Question, InvalidAnswer
from init.shell import check, check_output, fetch_ip_address from init.shell import (
fetch_ip_address,
config_get,
config_set,
)
from oslo_serialization import (
base64,
msgpackutils
)
logger = logging.getLogger(__name__)
class Role(Question): class Role(Question):
_type = 'string' _type = 'string'
config_key = 'config.cluster.role' config_key = 'config.cluster.role'
_question = "What is this machines' role? (control/compute)" _question = ('Which role would you like to use for this node:'
' "control" or "compute"?')
_valid_roles = ('control', 'compute') _valid_roles = ('control', 'compute')
interactive = True interactive = True
@ -20,33 +39,181 @@ class Role(Question):
if role in self._valid_roles: if role in self._valid_roles:
return role return role
print('Role must be either "control" or "compute"') print('The role must be either "control" or "compute".')
raise InvalidAnswer('Too many failed attempts.') raise InvalidAnswer('Too many failed attempts.')
class Password(Question): class ConnectionString(Question):
_type = 'string' # TODO: type password support _type = 'string'
config_key = 'config.cluster.password' config_key = 'config.cluster.connection-string.raw'
_question = 'Please enter a cluster password > ' _question = ('Please enter a connection string returned by the'
' add-compute command > ')
interactive = True interactive = True
def _input_func(self, prompt): def _validate(self, answer: str) -> Tuple[str, bool]:
if not self.interactive: try:
conn_str_bytes = base64.decode_as_bytes(
answer.encode('ascii'))
except (TypeError, UnicodeEncodeError):
print('The connection string is not valid base64 or contains'
' non-ASCII characters; please make sure you entered'
' it as returned by the add-compute command.')
return answer, False
try:
conn_info = msgpackutils.loads(conn_str_bytes)
except msgpack.exceptions.ExtraData:
print('The connection string contains extra data;'
' please make sure you entered'
' it as returned by the add-compute command.')
return answer, False
except msgpack.exceptions.FormatError:
print('The connection string format is invalid;'
' please make sure you entered'
' it as returned by the add-compute command.')
return answer, False
except ValueError:
# Checked after the msgpack-specific handlers because those
# exception classes may themselves derive from ValueError.
print('The connection string could not be decoded;'
' please make sure you entered'
' it as returned by the add-compute command.')
return answer, False
except Exception:
print('An unexpected error has occurred while trying'
' to decode the connection string. Please'
' make sure you entered it as returned by'
' the add-compute command and raise an'
' issue if the error persists.')
return answer, False
# Perform token field validation as well so that the rest of
# the code-base can assume valid input.
# The input can be either an IPv4 or IPv6 address or a hostname.
hostname = conn_info.get('hostname')
try:
self._validate_address(hostname)
is_valid_address = True
except ValueError:
logger.debug('The hostname specified in the connection string is'
' not an IPv4 or IPv6 address - treating it as'
' a hostname.')
is_valid_address = False
if not is_valid_address:
try:
self._validate_hostname(hostname)
except ValueError as e:
print(f'The hostname {hostname} provided in the connection'
f' string is invalid: {str(e)}')
return answer, False
fingerprint = conn_info.get('fingerprint')
try:
self._validate_fingerprint(fingerprint)
except ValueError as e:
print('The clustering service TLS certificate fingerprint provided'
f' in the connection string is invalid: {str(e)}')
return answer, False
credential_id = conn_info.get('id')
try:
self._validate_credential_id(credential_id)
except ValueError as e:
print('The credential id provided in the connection string is'
f' invalid: {str(e)}')
return answer, False
credential_secret = conn_info.get('secret')
try:
self._validate_credential_secret(credential_secret)
except ValueError as e:
print('The credential secret provided in the connection string is'
f' invalid: {str(e)}')
return answer, False
self._conn_info = conn_info
return answer, True
def _validate_hostname(self, hostname):
if hostname is None:
raise ValueError('A hostname has not been provided.')
if len(hostname) == 0:
raise ValueError('An empty hostname is invalid.')
# Remove the trailing dot as it does not count to the following
# length limit check.
if hostname.endswith('.'):
name = hostname[:-1]
else:
name = hostname
# See https://tools.ietf.org/html/rfc1035#section-3.1
# 255 - octet limit, 253 - visible hostname limit (without
# a trailing dot). The limit is also documented in hostname(7).
if len(name) > 253:
raise ValueError('The specified hostname is too long.')
allowed = re.compile('(?!-)[A-Z0-9-]{1,63}(?<!-)$', re.IGNORECASE)
if not re.search('[a-zA-Z-]', name.split(".")[-1]):
raise ValueError(f'{hostname} contains no non-numeric characters'
' in the top-level domain part of the hostname.')
if any((not allowed.match(x)) for x in name.split('.')):
raise ValueError(f'{hostname} is an invalid hostname.')
def _validate_address(self, address):
if address is None:
raise ValueError('An address has not been provided.')
if not (netaddr.valid_ipv4(address, netaddr.core.INET_PTON) or
netaddr.valid_ipv6(address, netaddr.core.INET_PTON)):
raise ValueError(f'{address} is not a valid IPv4 or IPv6 address.')
def _validate_fingerprint(self, fingerprint):
if fingerprint is None:
raise ValueError('A fingerprint has not been provided.')
# We expect a byte sequence equal to the SHA256 hash of the cert.
actual_len = len(fingerprint)
expected_len = hashes.SHA256.digest_size
if not actual_len == expected_len:
raise ValueError('The provided fingerprint has an invalid '
f'length: {actual_len}, expected: {expected_len}')
def _validate_credential_id(self, credential_id):
if credential_id is None:
raise ValueError('A credential id has not been provided.')
# We expect a UUID (rfc4122) without dashes.
UUID_LEN = 32
actual_len = len(credential_id)
if actual_len != UUID_LEN:
raise ValueError('The credential length is not equal to the length'
' of a UUID without dashes:'
f' actual: {actual_len}, expected: {UUID_LEN}')
def _validate_credential_secret(self, credential_secret):
if credential_secret is None:
raise ValueError('A credential secret has not been provided.')
# The artificial secret length controlled by the MicroStack code-base.
# https://docs.python.org/3/library/secrets.html#how-many-bytes-should-tokens-use
SECRET_LEN = 32
actual_len = len(credential_secret)
if actual_len != SECRET_LEN:
raise ValueError('The credential secret has an unexpected length:'
f' actual: {actual_len}, expected: {SECRET_LEN}')
def after(self, answer: str) -> None:
# Store the individual parts of the connection string in the snap
# config for easy access and avoidance of extra parsing.
prefix = 'config.cluster'
config_set(**{
f'{prefix}.hostname': self._conn_info['hostname'],
f'{prefix}.fingerprint': self._conn_info['fingerprint'].hex(),
f'{prefix}.credential-id': self._conn_info['id'],
f'{prefix}.credential-secret': self._conn_info['secret'],
})
def ask(self):
# Skip this question for a control node since we are not connecting
# to ourselves.
role = config_get(Role.config_key)
if role == 'control':
return return
return super().ask()
# Get rid of 'default=' string the parent class has added to prompt.
prompt = self._question
for _ in range(0, 3):
password0 = getpass(prompt)
password1 = getpass('Please re-enter password > ')
if password0 == password1:
return password0
print("Passwords don't match!")
raise InvalidAnswer('Too many failed attempts.')
class ControlIp(Question): class ControlIp(Question):
@ -56,12 +223,18 @@ class ControlIp(Question):
interactive = True interactive = True
def _load(self): def _load(self):
if check_output( if config_get(Role.config_key) == 'control':
'snapctl', 'get', 'config.cluster.role') == 'control':
return fetch_ip_address() or super()._load() return fetch_ip_address() or super()._load()
return super()._load() return super()._load()
def ask(self):
# Skip this question for a compute node since the control IP
# address is taken from the connection string instead.
role = config_get(Role.config_key)
if role == 'compute':
return
return super().ask()
class ComputeIp(Question): class ComputeIp(Question):
_type = 'string' _type = 'string'
@ -70,18 +243,18 @@ class ComputeIp(Question):
interactive = True interactive = True
def _load(self): def _load(self):
if check_output( role = config_get(Role.config_key)
'snapctl', 'get', 'config.cluster.role') == 'compute': if role == 'compute':
return fetch_ip_address() or super()._load() return fetch_ip_address() or super()._load()
return super()._load() return super()._load()
def ask(self): def ask(self):
# If we are a control node, skip this question. # If we are a control node, skip this question.
role = check_output('snapctl', 'get', Role.config_key) role = config_get(Role.config_key)
if role == 'control': if role == 'control':
ip = check_output('snapctl', 'get', ControlIp.config_key) ip = config_get(ControlIp.config_key)
check('snapctl', 'set', '{}={}'.format(self.config_key, ip)) config_set(**{self.config_key: ip})
return return
return super().ask() return super().ask()
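
Reviewer note: the connection string validated above is a base64-wrapped serialized mapping with the keys `hostname`, `fingerprint`, `id` and `secret`. The round trip can be sketched as follows. This sketch deliberately swaps msgpack for JSON so that it runs with the standard library only, which also means the fingerprint has to travel as hex text rather than raw bytes; the function names are hypothetical and the real encoder lives in the add-compute command.

```python
import base64
import json


def encode_connection_string(hostname: str, fingerprint: bytes,
                             cred_id: str, secret: str) -> str:
    """Pack the join parameters into a single ASCII blob."""
    payload = {
        'hostname': hostname,
        # JSON cannot carry raw bytes, so hex-encode the digest.
        'fingerprint': fingerprint.hex(),
        'id': cred_id,
        'secret': secret,
    }
    raw = json.dumps(payload, sort_keys=True).encode('utf-8')
    return base64.b64encode(raw).decode('ascii')


def decode_connection_string(conn_str: str) -> dict:
    """Reverse encode_connection_string and restore the digest bytes."""
    raw = base64.b64decode(conn_str.encode('ascii'), validate=True)
    info = json.loads(raw)
    info['fingerprint'] = bytes.fromhex(info['fingerprint'])
    return info
```

The decoded mapping has the same shape that `_validate` expects, so the per-field checks (address or hostname, 32-byte fingerprint, 32-character id and secret) apply unchanged.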


@ -1,6 +1,10 @@
from init.config import Env, log from init.config import Env, log
from init.questions.question import Question from init.questions.question import Question
from init.shell import check, check_output from init.shell import (
check,
config_get,
config_set,
)
_env = Env().get_env() _env = Env().get_env()
@ -13,18 +17,21 @@ class ExtGateway(Question):
config_key = 'config.network.ext-gateway' config_key = 'config.network.ext-gateway'
def yes(self, answer): def yes(self, answer):
clustered = check_output('snapctl', 'get', 'config.clustered') clustered = config_get('config.is-clustered')
if clustered.lower() != 'true': if not clustered:
check('snapctl', 'set', 'config.network.control-ip={}'.format( ip_dict = {
answer)) 'config.network.control-ip': answer,
check('snapctl', 'set', 'config.network.compute-ip={}'.format( 'config.network.compute-ip': answer,
answer)) }
_env['control_ip'] = _env['compute_ip'] = answer config_set(**ip_dict)
_env.update(ip_dict)
else: else:
_env['control_ip'] = check_output('snapctl', 'get', ip_dict = config_get(*['config.network.control-ip',
'config.network.control-ip') 'config.network.compute-ip'])
_env['compute_ip'] = check_output('snapctl', 'get', _env.update({
'config.network.compute-ip') 'control_ip': ip_dict['config.network.control-ip'],
'compute_ip': ip_dict['config.network.compute-ip'],
})
class ExtCidr(Question): class ExtCidr(Question):


@ -176,8 +176,13 @@ def config_get(*keys):
"""Get snap config keys via snapctl. """Get snap config keys via snapctl.
:param keys list[str]: Keys to retrieve from the snap configuration. :param keys list[str]: Keys to retrieve from the snap configuration.
:return: The parsed JSON document representation.
:rtype: str or int or float or bool or dict or list
""" """
return json.loads(check_output('snapctl', 'get', '-t', *keys)) if keys:
return json.loads(check_output('snapctl', 'get', '-t', *keys))
else:
return None
def config_set(**kwargs): def config_set(**kwargs):
@ -185,7 +190,8 @@ def config_set(**kwargs):
:param kwargs dict[str, str]: Values to set in the snap configuration. :param kwargs dict[str, str]: Values to set in the snap configuration.
""" """
check_output('snapctl', 'set', *[f'{k}={v}' for k, v in kwargs.items()]) if kwargs:
check('snapctl', 'set', *[f'{k}={v}' for k, v in kwargs.items()])
def download(url: str, output: str) -> None: def download(url: str, output: str) -> None:
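
Reviewer note: `config_set` takes keyword arguments whose names contain dots and hyphens, which is why every call site unpacks a dict. The sketch below reproduces only the argument construction, without invoking `snapctl`; `snapctl_set_args` is a hypothetical helper for illustration.

```python
def snapctl_set_args(**kwargs):
    """Build the argv that a config_set call would hand to snapctl."""
    return ['snapctl', 'set'] + [f'{k}={v}' for k, v in kwargs.items()]


# Keys such as 'config.cluster.role' are not valid Python identifiers,
# so they can only be passed through dict unpacking.
role_args = snapctl_set_args(**{'config.cluster.role': 'compute'})

# Values are rendered with str(), so a Python bool becomes 'True' and
# not the lowercase 'true' used elsewhere in this change.
bool_args = snapctl_set_args(**{'config.is-clustered': True})
```

Because the value is formatted as plain text, callers that want a JSON boolean in the snap configuration are probably safer passing the string `'true'` explicitly.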


@ -0,0 +1,13 @@
# Copyright 2020 Canonical Ltd
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


@ -0,0 +1,48 @@
#!/usr/bin/env python3
import argparse
import sys
import launch.main
import cluster.add_compute
import init.main
def main():
'''Implements a proxy command called "microstack"'''
parser = argparse.ArgumentParser(
description='microstack',
usage='''microstack <command> [<args>]
Available commands:
init initialize a MicroStack node
add-compute generate a connection string for a node to join the cluster
launch launch a virtual machine
''')
parser.add_argument('command',
help='A subcommand to run:\n'
' {init, launch, add-compute}')
args = parser.parse_args(sys.argv[1:2])
COMMANDS = {
'init': init.main.init,
'add-compute': cluster.add_compute.main,
'launch': launch.main.main
}
cmd = COMMANDS.get(args.command, None)
if cmd is None:
parser.print_help()
raise Exception('Unrecognized command')
# TODO: Implement this properly via subparsers and get rid of
# extra modules.
sys.argv[0] = sys.argv[1]
# Get rid of the command name in the args and call the actual command.
del sys.argv[1]
cmd()
if __name__ == '__main__':
main()
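
Reviewer note: the TODO above mentions moving to subparsers. One possible shape is sketched below, under the assumption that each command keeps parsing its own arguments; `build_parser`, `dispatch` and the handler signature are hypothetical and not part of this change.

```python
import argparse


def build_parser(handlers):
    """Create a parser with one subcommand per entry in handlers."""
    parser = argparse.ArgumentParser(prog='microstack')
    subparsers = parser.add_subparsers(dest='command', required=True)
    for name, (handler, help_text) in handlers.items():
        sub = subparsers.add_parser(name, help=help_text)
        # Remember which callable implements this subcommand.
        sub.set_defaults(handler=handler)
    return parser


def dispatch(argv, handlers):
    """Select a subcommand and pass the unparsed arguments on to it."""
    parser = build_parser(handlers)
    # parse_known_args leaves options the subparser does not know
    # about in 'remaining' instead of failing on them.
    args, remaining = parser.parse_known_args(argv)
    return args.handler(remaining)
```

With this layout an unknown command produces the standard argparse usage error, and `sys.argv` no longer needs to be rewritten before calling the command.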

tools/microstack/setup.py Normal file

@ -0,0 +1,13 @@
from setuptools import setup, find_packages
setup(
name="microstack",
description="The MicroStack command",
packages=find_packages(exclude=("tests",)),
version="0.0.1",
entry_points={
'console_scripts': [
'microstack = microstack.main:main',
],
}
)