8 Commits

Author SHA1 Message Date
Corey Bryant
064aae8458 Add TLS OpenStack API endpoints
This patch provides TLS endpoints secured by a self-signed
certificate. Another patch will provide support for trusted CA-signed
certificates.

A new config.tls.generate-self-signed option is added that defaults to true.
When true, a self-signed certificate will be generated and OpenStack
API endpoints will be configured to use TLS with that self-signed
certificate. The following config options are added:

snap get microstack config.tls.generate-self-signed
snap get microstack config.tls.cacert-path
snap get microstack config.tls.cert-path
snap get microstack config.tls.key-path

Users can provide their own self-signed certificate by setting
generate-self-signed to false and storing their own certificates/key
at the paths specified by cacert-path, cert-path, and key-path.
'snap set' can also be used to change the cert/key file paths.
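
For example (the option names are those listed above; the paths are only
illustrative):

sudo snap set microstack config.tls.generate-self-signed=false
sudo snap set microstack config.tls.cacert-path=/var/snap/microstack/common/etc/ssl/certs/my-cacert.pem
sudo snap set microstack config.tls.cert-path=/var/snap/microstack/common/etc/ssl/certs/my-cert.pem
sudo snap set microstack config.tls.key-path=/var/snap/microstack/common/etc/ssl/private/my-key.pem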

If using clustering, the certificates/key will be copied from the
control node to the compute nodes. The config for cacert-path,
cert-path, and key-path will be set to the same values as on the
control node.

Other notable changes:
* The existing generate_selfsigned() function is modified so that the
  subject alternative name consists of the hostname and, optionally, an
  IP. The controller hostname and IP are used when generating the
  certificate for self-signed TLS endpoints. The hostname is now used
  instead of 'microstack.run' when generating the clustering
  certificate.
* This change also aligns logging for nginx and corresponding sites
  and moves all nginx sites to {snap_common}/etc/nginx/sites-enabled.

Change-Id: Iceea3127822404a3275fcf8a221cbedc4b52c217
2021-05-26 16:39:33 -04:00
Dmitrii Shcherbakov
a904cb6804 Rework the test framework & the clustering test
* Remove the dead code;
* Rework the test types;
* Restore the instance connectivity check;
* Rework the clustering test to support the new node addition workflow;
* Check whether a machine where MicroStack is installed has hardware
  virtualization capabilities for different architectures. If not, use
  software emulation;
  * the host model is used with KVM since the default QEMU CPU models on
    x86_64 are subject to vulnerabilities without certain CPU-specific
    features. The trade-off is that live migration across hosts with
    different CPUs becomes less reliable.
* Add a default-source-ip init argument to allow controlling the source
  IP of the installation host that will be used locally as the control IP
  or compute IP (see the example after this list).
  * used in the clustering test so that the local host IP on the
    multipass network is used as a control IP instead of the IP
    through which the default gateway is available;
  * the IP through which the default gateway is accessible is
    used as a fallback for default-source-ip;
* Given that upstream CI allocates few resources per machine, use LXD to
  set up a dummy compute node;
  * Set RLIMIT_MEMLOCK to 'unlimited' in the LXD container profile
    (see the discussion in LP: #1906280);
  * set remember_owner to 0 in qemu.conf for libvirt to avoid the use
    of XATTRs (the root user is used anyway, so there is no need to
    remember a file owner); otherwise libvirt errors out in an
    unprivileged LXD container.
* Use numeric versions of OpenStack packages in the python-packages
  section of the openstack-projects part, since the resolver change in
  recent versions of pip disallows constraints on dependencies of
  packages that come from a URL or a path.
  https://github.com/pypa/pip/issues/8210
  * The newest released version of pip is always used during builds,
    since snapcraft uses venv to set up virtual environments and invokes
    ensurepip in a way that upgrades the pip version shipped with the
    distro's Python:
    https://github.com/python/cpython/blob/3.8/Lib/venv/__init__.py#L282-L289
            cmd = [context.env_exe, '-Im', 'ensurepip', '--upgrade', '--default-pip']
  * Environment variables are ignored when pip is installed in the venv
    (the -I option, https://docs.python.org/3/using/cmdline.html#id2), so
    there is no way to fall back to the old pip resolver.
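
An illustrative invocation for the control node (the flag spelling is
assumed from the argument name above; the address is an example):

sudo microstack init --auto --control --default-source-ip 10.75.45.1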

Minor clustering client and add-compute changes:

* use stderr for diagnostic messages;
* use stdout to output the connection string so that it can be easily
  picked up by CLI tools without parsing.
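
With that split, the connection string can be captured directly, for example:

CONNECTION_STRING=$(microstack add-compute)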

Change-Id: I5cb3872c5d142c34da2c8b073652c67021d9ef55
2021-01-15 15:58:03 +03:00
Billy Olsen
d7f3c1229f Use UTC for expiration date of tokens
Keystone assumes UTC for expires_at dates when generating auth
tokens, so set expires_at to the UTC timezone before making
the request.
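
A minimal sketch of the idea (illustrative helper, not the actual patch):

from datetime import datetime, timezone

def expires_at_utc(dt: datetime) -> str:
    # Keystone treats expires_at as UTC, so normalize the timestamp first.
    if dt.tzinfo is None:
        dt = dt.astimezone()  # naive datetimes are assumed to be local time
    return dt.astimezone(timezone.utc).strftime('%Y-%m-%dT%H:%M:%S.%fZ')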

Change-Id: I55cb6ccf7a8cf79057d5699372ecd27bf936643f
Closes-Bug: #1903208
2020-11-05 20:42:03 -07:00
Dmitrii Shcherbakov
0ba5358865 Add Secure Clustering
* Add a connection-string based workflow to MicroStack;
  * microstack add-compute command can be run at the Control node in
    order to generate a connection string (an ASCII blob for the user);
  * the connection string contains:
    * an address of the control node;
    * a sha256 fingerprint of the TLS certificate used by the clustering
      service at the control node (used during verification, similarly to
      certificate pinning; see the sketch after this list);
    * an application credential id;
    * an application credential secret (short expiration time, reader
      role on the service project, restricted to listing the service
      catalog);
  * a MicroStack admin is expected to have ssh access to all nodes that
    will participate in a cluster; establishing this prior trust is up to
    the admin, which is reasonable since they provision the nodes;
  * a MicroStack admin is expected to securely copy a connection string
    to a compute node via ssh. Since it is short-lived and does not
    carry service secrets, there is no risk of a replay at a later time;
  * If the compute role is specified during microstack.init, a
    connection string is requested and used to perform a request to the
    clustering service and validate the certificate fingerprint. The
    credential ID and secret are POSTed for verification to the
    clustering service which responds with the necessary config data
    for the compute node upon successful authorization.
* Set up TLS termination for the clustering service;
  * run the flask app as a UWSGI daemon behind nginx;
  * configure nginx to use a TLS certificate;
  * generate a self-signed TLS certificate.
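
The fingerprint check mentioned above could look roughly like this (a
sketch, not the actual clustering client code):

import hashlib
import ssl

def fingerprint_matches(host: str, port: int, expected_sha256: str) -> bool:
    # Fetch the clustering service certificate and compare its sha256
    # fingerprint with the one carried in the connection string.
    pem = ssl.get_server_certificate((host, port))
    der = ssl.PEM_cert_to_DER_cert(pem)
    actual = hashlib.sha256(der).hexdigest()
    return actual == expected_sha256.replace(':', '').lower()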

This setup does not require a PKI for its own purpose of joining compute
nodes to the cluster. A PKI may still be used for TLS termination of the
OpenStack endpoints.

Control node init workflow (non-interactive):

sudo microstack init --auto --control
microstack add-compute
<the connection string to be used at the compute node>

Compute node init workflow (non-interactive):

sudo microstack init --auto --compute --join <connection-string>

Change-Id: I9596fe1e6e5c1a325cc71fd3bf0c78b660b9a83e
2020-10-15 01:37:33 +03:00
Dmitrii Shcherbakov
32ad5af7f4 Generate random passwords instead of hard-coding
* The prototype stage hard-coding of passwords is replaced by random
  generation of passwords for:
  * all API services;
  * RabbitMQ;
  * MySQL;
  * OpenStack admin user;
  * OpenStack service users;
* Passwords are not replaced upon successive microstack.init calls to
  preserve idempotency.
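
The generate-if-unset pattern boils down to roughly this (a sketch; the
real code stores values in the snap configuration, not a dict):

import secrets

def ensure_password(config: dict, key: str) -> str:
    # Generate a password only if one is not already present, so repeated
    # microstack.init runs keep the existing value.
    if not config.get(key):
        config[key] = secrets.token_urlsafe(32)
    return config[key]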

Change-Id: Ic3d6108a81d09bdd09e986f80b3040b030605178
2020-10-08 11:25:25 +03:00
Dmitrii Shcherbakov
71ad68d36a Fix Clustering after a rebase to Ussuri + OVN
The previous work handled OVN configuration incorrectly in the
multi-node case.

This change addresses that, along with other minor fixes related to
the clustering setup.

Change-Id: Ibf04af95271d1746f59192d11831d6129ba5b8d0
2020-10-05 02:37:02 +03:00
Dmitrii Shcherbakov
780a4c4ead Use focal/core20/Ussuri/OVN & enable confinement
Major changes:

* Plumbing necessary for strict confinement with
  the microstack-support interface
  https://github.com/snapcore/snapd/pull/8926
  * Until the interface is merged, devmode will be used and kernel
    modules will be loaded via an auxiliary service.
* upgraded the base to Focal (20.04) and the OpenStack components to Ussuri;
  * reworked the old patches;
  * added the Placement service since it is now separate;
  * addressed various build issues due to changes in snapcraft and
    built dependencies:
    * e.g. libvirt requires the build directory to be separate from the
      source directory, and LP: #1882255;
    * LP: #1882535 and https://github.com/pypa/pip/issues/8414
    * LP: #1882839
    * LP: #1885294
    * https://storyboard.openstack.org/#!/story/2007806
    * LP: #1864589
    * LP: #1777121
    * LP: #1881590
* ML2/OVS replaced with ML2/OVN;
  * dnsmasq is not used anymore;
  * neutron l3 and DHCP agents are not used anymore;
  * Linux network namespaces are only used for
    neutron-ovn-metadata-agent.
  * ML2 DNS support is done via native OVN mechanisms;
  * OVN-related database services (southbound and northbound dbs);
  * OVN-related control plane services (ovn-controller, ovn-northd);
* core20 base support (bionic hosts are supported);
* the removal procedure now relies on the "remove" hook since
  `snap remove` cannot be used from the confined environment anymore;
* prerequisites to enabling AppArmor confinement for QEMU processes
  created by the confined libvirtd.
* Added the Spice html5 console proxy service to enable clients to
  retrieve and use it via
  `microstack.openstack console url show --spice <servername>`.
* Added missing Cinder templates and DB migrations for the Cinder DB.
* Added experimental support for a loop device-based LVM backend for
  Cinder. Due to LP: #1892895 this is not recommended for production use,
  except for tempest testing with a workaround applied;
  * includes iscsid and iscsi-tcp kernel module loading;
  * includes LIO and loading of relevant kernel modules;
  * An LVM PV is created on top of a loop device with a backing file
    present in $SNAP_COMMON/cinder-lvm.img;
  * A VG is created on top of the PV;
  * LVs are created by Cinder and exported via LIO over iSCSI to iscsid,
    which hot-plugs new SCSI devices. Those SCSI devices are then
    propagated by Nova to libvirt and QEMU during volume attachment
    (roughly as sketched after this list);
* Added post-deployment testing via rally and tempest (via the
  microstack-test snap). A set of tests included in RefStack 2018.02
  is executed (except for object storage tests, due to the lack of
  object storage support).
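
The loop device LVM setup is roughly equivalent to the following (an
illustrative sketch; the size, loop device handling, and VG name are
examples, not the exact values used by the snap):

truncate -s 20G $SNAP_COMMON/cinder-lvm.img
loopdev=$(sudo losetup --find --show $SNAP_COMMON/cinder-lvm.img)
sudo pvcreate "$loopdev"
sudo vgcreate cinder-volumes "$loopdev"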

Change-Id: Ic70770095860a57d5e0a55a8a9451f9db6be7448
2020-09-25 13:20:12 +00:00
Pete Vander Giessen
5404a261aa Clustering prototype
This enables basic clustering functionality. We add:

tools/cluster/cluster/daemon.py: A server that handles validation of
cluster passwords.

tools/cluster/cluster/client.py: A client for this server.

Important Note: This prototype does not support TLS, and the
functionality in the client and server is basic. Before we roll
clustering out to production, we need to have those two chat over TLS,
and be much more careful about verifying credentials.

Also included ...

Various fixes and changes to the init script and config templates to
support cluster configuration, and allow for the fact that we may have
endpoint references for two network IPs.

Updates to snapcraft.yaml, adding the new tooling.

A more formalized config infrastructure. It's still a TODO to move the
specification out of the implicit definition in the install hook, and
into a nice, explicit, well-documented YAML file.

Added nesting to the Question classes in the init script, as well as
strings pointing at config keys, rather than having the config be
implicitly indicated by the Question subclass' name. (This allows us
to put together a config spec that doesn't require the person reading
the spec to understand what Questions are, and how they are
implemented.)
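
A hypothetical shape of such a question (class and key names are
illustrative, not the actual ones in the init script):

class Question:
    """Base class for init-time questions."""
    config_key = None  # explicit snap config key the answer is stored under

class DnsServers(Question):
    # The key is spelled out rather than derived from the class name, so a
    # config spec can be read without knowing how Questions are implemented.
    config_key = 'config.network.dns-servers'  # illustrative key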

Renamed and unified the "unit" and "lint" tox environments, to allow
for the multiple Python tools that we want to lint and test.

Added hooks in the init script to make it possible to do automated
testing, and added an automated test for a cluster. Run with "tox -e
cluster".

Added cirros image to snap, to work around sporadic issues downloading
it from download.cirros.net.

Removed ping logic from the snap, to work around failures in the gate. Need to
add it back in once we fix them.

Change-Id: I44ccd16168a7ed41486464df8c9e22a14d71ccfd
2019-11-04 13:03:41 +00:00