This patch configures TLS for OVN using the local CA certificate on the
controller. The compute nodes request certificates signed by that CA and
use them to secure local controller connections to the OVN SB database
via TLS. The client certificates are validated against the control
node's CA.
Local connections on the control node continue to use the local unix
socket, which can be considered secure since traffic does not leave
the node.
Change-Id: Iacf5d5637c3a093bd80879c2ebb58efb16b52e66
Treat the control node as a CA for certificates at compute nodes.
Upon joining a cluster, the compute node requests a certificate by
generating a CSR and asking the control node to sign it.
New config options are added for the locations of the compute node's
private key and certificate.
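For illustration, the CSR generation on the compute side might look
roughly like this (a sketch using the python cryptography library;
names are not taken from the actual implementation):
    from cryptography import x509
    from cryptography.hazmat.primitives import hashes, serialization
    from cryptography.hazmat.primitives.asymmetric import rsa
    from cryptography.x509.oid import NameOID

    # Generate the compute node's private key and a CSR for the control node to sign.
    key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    csr = (
        x509.CertificateSigningRequestBuilder()
        .subject_name(x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "compute-node")]))
        .sign(key, hashes.SHA256())
    )
    csr_pem = csr.public_bytes(serialization.Encoding.PEM)  # sent to the control node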
Change-Id: I8e8b1a86cf7df752b6cb34cfdf65a87a72934ec5
To apply the dynamically generated virt-type config, the affected
templates need to be rendered.
Also improves the KVM presence check, since CPU virtualization features
may be visible to a container while the KVM API (a character device)
is not available there.
When emulation is used, cpu-mode is now set to "host-model" instead of
"host-passthrough", as is done in the default config.
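A rough sketch of such a check (illustrative only, not the code added by
this patch):
    import os
    import stat

    def kvm_api_available(path="/dev/kvm"):
        # CPU flags alone are not enough inside a container; the KVM
        # character device must also be present and accessible.
        try:
            mode = os.stat(path).st_mode
        except OSError:
            return False
        return stat.S_ISCHR(mode) and os.access(path, os.R_OK | os.W_OK)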
Closes-Bug: #1942761
Change-Id: I689543232a94f4df16445c6e3057c5a329d3f6ae
Fix the builds for MicroStack on aarch64/arm64. To resolve the build
issues, the SNAP_ARCH and SNAP_ARCH_TRIPLET variables need to be used
in the appropriate places to reference the underlying platform rather
than assuming the x86_64 platform.
Arm64/Aarch64 support requires EFI support to be enabled, which
involves adding the EFI packages for arm64. These are also included for
x86_64 to ensure consistency in EFI support between the
architectures. The /usr/share/{OVMF,AAVMF} paths need to be bind-mounted
to the appropriate locations to avoid having to custom-build the
packages within the snap.
The setup sequence also needs to load the right image into glance. The
platform is used to determine which cirros image to import. A cirros
image for the aarch64 platform was not included in this patch, in order
to keep the snap size down; the setup sequence has fallback code to
download the image if it is unavailable in the filesystem.
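A minimal sketch of the platform-based selection (file names are
illustrative):
    import platform

    # Map the machine architecture to the cirros image to import into glance.
    CIRROS_IMAGES = {
        "x86_64": "cirros-x86_64-disk.img",
        "aarch64": "cirros-aarch64-disk.img",
    }
    image_file = CIRROS_IMAGES.get(platform.machine())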
Finally, the snapcraft architectures for building microstack are limited
to the x86_64 and arm64 platforms. There's no need to build on s390x or
other architectures at this point.
Closes-Bug: #1821872
Change-Id: I26625621fb9895027139ecb895e882e60f2e6502
This patch provides TLS endpoints secured by a self-signed
certificate. Another patch will provide support for trusted CA-signed
certificates.
A new config.tls.generate-self-signed option is added that defaults to
true. When true, a self-signed certificate will be generated and
OpenStack API endpoints will be configured to use TLS with that
self-signed certificate. The following config options are added:
snap get microstack config.tls.generate-self-signed
snap get microstack config.tls.cacert-path
snap get microstack config.tls.cert-path
snap get microstack config.tls.key-path
Users can provide their own self-signed certificate by setting
generate-self-signed to false and storing their certificate and key
at the paths specified by cacert-path, cert-path, and key-path.
'snap set' can also be used to change the cert/key file names.
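For example (paths are placeholders):
snap set microstack config.tls.generate-self-signed=false
snap set microstack config.tls.cacert-path=/path/to/cacert.pem
snap set microstack config.tls.cert-path=/path/to/cert.pem
snap set microstack config.tls.key-path=/path/to/key.pem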
If using clustering, the certificates/key will be copied from the
control node to the compute nodes. The config for cacert-path,
cert-path, and key-path will be set to the same values as on the
control node.
Other notable changes:
* The existing generate_selfsigned() function is modified to change
the subject alternative name to be made up of the hostname and
optionally an IP. The controller hostname and IP are used when
generating the certificate for self-signed TLS endpoints. The
hostname is now used instead of 'microstack.run' when generating
the clustering certificate.
* This change also aligns logging for nginx and corresponding sites
and moves all nginx sites to {snap_common}/etc/nginx/sites-enabled.
Change-Id: Iceea3127822404a3275fcf8a221cbedc4b52c217
The shell commands to enable or disable a service should pass
the --enable or --disable option following the verb and service
name.
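For example, assuming the services are managed via snapctl here (the
service name is a placeholder):
snapctl stop microstack.<service> --disable
snapctl start microstack.<service> --enable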
Closes-Bug: #1900075
Change-Id: I97d868bbd005bc5bc9c71d6ddd6f2b7746dbf18b
Glance-registry has been deprecated since Queens and was removed
from the upstream source in Train.
Change-Id: Ia993bfce039cd46ced3442c9064e4af8547fa54f
Add missing dependencies to tools/init/test-requirements.txt
for running unit tests. Libraries are placed in
test-requirements.txt rather than requirements.txt because of the library
versions included within the snap; if the versions in requirements.txt
differ from what is installed in the snap from Ubuntu core then the snap
fails to build.
Closes-Bug: #1908610
Change-Id: I83d623db3a8d3cd8f328b42da4aff5b71f2f0520
* Remove the dead code;
* Rework the test types;
* Restore the instance connectivity check;
* Rework the clustering test to support the new node addition workflow;
* Check whether a machine where MicroStack is installed has hardware
virtualization capabilities for different architectures. If not, use
software emulation;
* the host model is used with KVM since the default QEMU CPU models on
x86_64 are subject to vulnerabilities without certain CPU-specific
features. This conflicts with being able to use live migration
reliably across hosts with different CPUs.
* Add a default-source-ip init argument to allow controlling the source
IP of the installation host that will be used as a control ip or
compute ip locally.
* used in the clustering test so that the local host IP on the
multipass network is used as a control IP instead of the IP
through which the default gateway is available;
* the IP through which the default gateway is accessible is
used as a fallback for default-source-ip (see the sketch after
this list);
* Given that upstream CI allocates a low amount of resources per machine,
use LXD to set up a dummy compute node;
* Set RLIMIT_MEMLOCK to 'unlimited' in the LXD container profile
(see the discussion in LP: #1906280);
* set remember_owner to 0 in qemu.conf for libvirt to avoid the
use of XATTRs (the root user is used anyway, so there is no
need to remember a file owner); otherwise libvirt errors out
in an unprivileged LXD container.
* Use numeric versions of OpenStack packages in the python-packages
section of the openstack-projects part, since the resolver change in
recent versions of pip disallows constraining dependencies of
packages that come from a URL or a path.
https://github.com/pypa/pip/issues/8210
* The newest released version of pip is always used during builds
since snapcraft uses venv to set up virtual environments and the
ensurepip package is invoked such that a pip version shipped with
the distro version of python is upgraded:
https://github.com/python/cpython/blob/3.8/Lib/venv/__init__.py#L282-L289
cmd = [context.env_exe, '-Im', 'ensurepip', '--upgrade',
'--default-pip']
* Environment variables are ignored when pip is installed in the venv:
https://docs.python.org/3/using/cmdline.html#id2 (the -I option),
so there is no way to use the old pip resolver.
Minor clustering client and add-compute changes:
* use stderr for diagnostic messages;
* use stdout to output the connection string so that it can be easily
picked up by CLI tools without parsing.
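A sketch of the default-source-ip fallback referenced above
(illustrative; the actual init code may differ):
    import socket

    def default_source_ip():
        # Ask the kernel which local address would be used to reach an
        # arbitrary outside host; no packets are sent for a UDP connect.
        s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        try:
            s.connect(("203.0.113.1", 53))
            return s.getsockname()[0]
        finally:
            s.close()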
Change-Id: I5cb3872c5d142c34da2c8b073652c67021d9ef55
Some services were disabled in the install hook and then started during
the init phase without being enabled. Thus, after a machine restart they
were not brought back up by systemd.
Change-Id: I27f7d7fa6b8df104567b91b5bc998ebe98b478a2
* A reliable DNS setup cannot be assumed in MicroStack installations, so
relying on the host cache behavior of MySQL is not an option. MySQL resolves
an IP address to a host name and then resolves that host name back to an IP
address (https://dev.mysql.com/doc/refman/8.0/en/host-cache.html);
* IP addresses are not guaranteed to be static in a MicroStack
deployment, although static addressing is preferable. Likewise, for
services like cinder-volume on secondary nodes to access the database,
they need to be allowed to do so at the MySQL ACL level.
Change-Id: Ib87ab0a71fa83dad8e8ddb40f34907ab24999423
* Add a connection-string based workflow to MicroStack;
* microstack add-compute command can be run at the Control node in
order to generate a connection string (an ASCII blob for the user);
* the connection string contains:
* an address of the control node;
* a sha256 fingerprint of the TLS certificate used by the clustering
service at the control node (which is used during verification,
similar to certificate pinning; see the sketch after this list);
* an application credential id;
* an application credential secret (short expiration time, reader
role on the service project, restricted to listing the service
catalog);
* a MicroStack admin is expected to have ssh access to all nodes that
will participate in a cluster; establishing that prior trust is up to
the admin, which is normal since they provision the nodes;
* a MicroStack admin is expected to securely copy a connection string
to a compute node via ssh. Since it is short-lived and does not
carry service secrets, there is no risk of a replay at a later time;
* If the compute role is specified during microstack.init, a
connection string is requested and used to perform a request to the
clustering service and validate the certificate fingerprint. The
credential ID and secret are POSTed for verification to the
clustering service which responds with the necessary config data
for the compute node upon successful authorization.
* Set up TLS termination for the clustering service;
* run the flask app as a UWSGI daemon behind nginx;
* configure nginx to use a TLS certificate;
* generate a self-signed TLS certificate.
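The fingerprint check on the compute side might look roughly like this
(a sketch, not the actual client code):
    import hashlib
    import ssl

    def fingerprint_matches(host, port, expected_sha256):
        # Fetch the clustering service certificate without CA verification
        # and pin it by comparing its sha256 fingerprint with the one from
        # the connection string.
        pem = ssl.get_server_certificate((host, port))
        der = ssl.PEM_cert_to_DER_cert(pem)
        return hashlib.sha256(der).hexdigest() == expected_sha256.lower()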
This setup does not require PKI to be present for its own purposes of
joining compute nodes to the cluster. However, this does not mean that
PKI will not be used for TLS termination of the OpenStack endpoints.
Control node init workflow (non-interactive):
sudo microstack init --auto --control
microstack add-compute
<the connection string to be used at the compute node>
Compute node init workflow (non-interactive):
sudo microstack init --auto --compute --join <connection-string>
Change-Id: I9596fe1e6e5c1a325cc71fd3bf0c78b660b9a83e
* The prototype stage hard-coding of passwords is replaced by random
generation of passwords for:
* all API services;
* RabbitMQ;
* MySQL;
* OpenStack admin user;
* OpenStack service users;
* Passwords are not replaced upon successive microstack.init calls to
preserve idempotency.
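The generation itself can be as simple as (illustrative sketch; the
actual helper may differ):
    import secrets

    def generate_password(nbytes=32):
        # Cryptographically secure, URL-safe random string built from
        # nbytes of randomness.
        return secrets.token_urlsafe(nbytes)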
Change-Id: Ic3d6108a81d09bdd09e986f80b3040b030605178
The previous work handled OVN configuration incorrectly in the
multi-node case.
This change addresses that, in addition to other minor fixes related to
the clustering setup.
Change-Id: Ibf04af95271d1746f59192d11831d6129ba5b8d0
Major changes:
* Plumbing necessary for strict confinement with
the microstack-support interface
https://github.com/snapcore/snapd/pull/8926
* Until the interface is merged, devmode will be used and kernel
modules will be loaded via an auxiliary service.
* upgraded OpenStack components to Focal (20.04) and OpenStack Ussuri;
* reworked the old patches;
* added the Placement service since it is now separate;
* addressed various build issues due to changes in snapcraft and
build dependencies:
* e.g. libvirt requires the build directory to be separate from the
source directory, and LP: #1882255;
* LP: #1882535 and https://github.com/pypa/pip/issues/8414
* LP: #1882839
* LP: #1885294
* https://storyboard.openstack.org/#!/story/2007806
* LP: #1864589
* LP: #1777121
* LP: #1881590
* ML2/OVS replaced with ML2/OVN;
* dnsmasq is not used anymore;
* neutron l3 and DHCP agents are not used anymore;
* Linux network namespaces are only used for
neutron-ovn-metadata-agent.
* ML2 DNS support is done via native OVN mechanisms;
* OVN-related database services (southbound and northbound dbs);
* OVN-related control plane services (ovn-controller, ovn-northd);
* core20 base support (bionic hosts are supported);
* the removal procedure now relies on the "remove" hook since `snap
remove` cannot be used from the confined environment anymore;
* prerequisites to enabling AppArmor confinement for QEMU processes
created by the confined libvirtd.
* Added the Spice html5 console proxy service to enable clients to
retrieve and use it via
`microstack.openstack console url show --spice <servername>`.
* Added missing Cinder templates and DB migrations for the Cinder DB.
* Added experimental support for a loop device-based LVM backend for
Cinder. Due to LP: #1892895 this is not recommended for production
use, except for tempest testing with an applied workaround (see the
sketch after this list);
* includes iscsid and iscsi-tcp kernel module loading;
* includes LIO and loading of relevant kernel modules;
* An LVM PV is created on top of a loop device with a backing file
present in $SNAP_COMMON/cinder-lvm.img;
* A VG is created on top of the PV;
* LVs are created by Cinder and exported via LIO over iscsi to iscsid
which hot-plugs new SCSI devices. Those SCSI devices are then
propagated by Nova to libvirt and QEMU during volume attachment;
* Added post-deployment testing via rally and tempest (via the
microstack-test snap). A set of tests included in Refstack 2018.02
is executed (except for object storage tests, due to the lack of object
storage support).
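As referenced above, a rough sketch of the loop-backed LVM setup (the
size and the VG name are assumptions, not the actual implementation):
    import os
    from subprocess import check_call, check_output

    # Sparse backing file under $SNAP_COMMON, turned into a loop device,
    # then a PV and a VG for Cinder to carve LVs from.
    backing_file = os.path.join(
        os.environ.get("SNAP_COMMON", "/var/snap/microstack/common"),
        "cinder-lvm.img")
    check_call(["truncate", "-s", "20G", backing_file])
    loop_dev = check_output(["losetup", "--show", "-f", backing_file]).decode().strip()
    check_call(["pvcreate", loop_dev])
    check_call(["vgcreate", "cinder-volumes", loop_dev])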
Change-Id: Ic70770095860a57d5e0a55a8a9451f9db6be7448
(Not complete strict confinement, but these don't break anything
devmode related, and get us closer to having strict confinement
working.)
Added more needed interfaces to snapcraft.yaml.
Created a wrapper around dnsmasq so that we can run as the snap_daemon
user. Added snap_daemon user to snapcraft.yaml.
Added a utility script for connecting interfaces that don't auto
connect (tools/connect.sh). Not useful for production, but saves a lot
of time when testing.
libvirt no longer uses unix sock group "sudo" (can't run setgid in
strict confinement).
Got rid of "find_missing_plugins" in init script. By the time we
release strict confinement to production, all those plugins will auto
connect.
Change-Id: I8324ac7bd0332c41cac17703eb15d7301e7babf3
Make MicroStack strictly confined, albeit in devmode for now.
Addresses unpredictable breakages with apt package upgrades in eoan
and focal, and sets the stage for a better isolated, less fragile snap
going forward.
We now use layouts to handle libvirt and qemu setting paths at compile
time. This is cleaner than the organize hack.
Moved away from calls to systemctl in init, as a strictly confined
snap cannot call systemctl on a non snappy system.
Disabled the call to sysctl that set ipv4_forwarding, as we don't have
access to sysctl in a strictly confined snap. This may break some users,
and we need to figure out a way to address the breakage.
Got rid of questions.shell.shell routine, moving rabbitmq setup into a
bash script instead (it's just cleaner).
Moved keypair creation into launch script, as it's difficult to do
sensible things with keypair creation in the init script, which is
running using sudo, and therefore doesn't have access to
/home/<someuser>/snap
Added (but commented out) code that will check to verify that plugs
are connected before running microstack.init or ovs-vsctl. This code
may go away entirely, as we plan on auto connecting all of our
interfaces, and don't technically need to guard against not having
them connected.
Added temporary local upper-constraints file, to fix an issue where
upstream upper-constraints was breaking pip install by setting a
neutron version. This needs a better long term fix, but works for now.
Closes-bug: 1860660
Change-Id: Iaf1f1482609f05285ed9061317b32e90bffd2da0
This reverts commit ce5e82e3191acb40b1ab801cde25333037d89bcb.
MicroStack cannot currently install due to a missing ovs-related
library. This is possibly due to recent changes in snapcraft, or
possibly due to the workarounds for those changes. Regardless, it
appears that backing out the DPDK changes gets us back to a state
where we can install.
Partial-Bug: 1862911
Change-Id: I060c1a0095470639f9158cb9e9ebe8281a65a678
- Snapped binary packages of Filebeat, NRPE and Telegraf (disabled by default)
- Added a workaround for a Telegraf segfault after ELF patching by snapcraft
- Implemented IPMI input tuning for Telegraf
- Allowed NRPE to run as root:root (from a custom PPA)
- Implemented Filebeat, NRPE and Telegraf control scripts and config on top of snap-overlay
- Added support for checking MicroStack systemd services with NRPE
- Added a few generic and MicroStack-specific NRPE checks
- Added the possibility to override the default config paths for the daemons
- Added support for in-band IPMI input to Telegraf
- Aligned LMA wrapper and service naming with MicroStack conventions
- Increased the build timeout in the .zuul conf by 30 min
Change-Id: I68dbdb11248cf0c1e22e9333af3cf0f88954f557
Running microstack.remove will remove the br-ex virtual bridge device,
then uninstall MicroStack.
We do this because we can't use ovs-ctl to remove the bridge as part
of a remove hook, as the Open vSwitch daemons are not running at that
point. The microstack.remove command gives operators a way to cleanly
uninstall the snap, without needing to reboot to get rid of br-ex.
Added a test exercising this code to test_basic.py.
Rearranged entry points a bit (moved some things into main.py) to make
code sharing easier, and to prevent a proliferation of entry point
scripts in our root dir.
Change-Id: I9ff25864cd96ada3a9b3da8992c2b33955eff0b4
Closes-Bug: #1852147
Addresses requests to make it easier to avoid conflicts between the
Horizon dashboard and http services that might already be running on
the machine.
This is configurable via snap config. Exposing it via arguments to .init
and testing post-init configuration is left for a separate PR.
Eventually, these may move to non-standard ports by default. This PR
sets the stage for that, but further discussion is needed before we
decide whether to implement it.
(This commit also contains a sneaky fix for the username display at the
end of the launch script.)
Closes-Bug: 1814829
Change-Id: If728d6ec8024bca4d3e809637fbdcc03ed4e6934
Now happens in a template, just like all the other values, which fixes
an issue where it doesn't get overridden during an upgrade.
Change-Id: Ied84ddc0282c77de6797f90efc8923ae66a9d59e
Previously, the snap set up a bridge using the default 10.20.20.0/24
network upon install. If there was a good reason not to use this
network (e.g., it already exists and is being used for another
purpose), MicroStack and the host machine could wind up in a broken
state.
This PR delays setting up the bridge until after we have given an
operator a chance to override the default settings.
This has been manually tested. To test, do the following:
1) Checkout the code, and run tox -e build
2) Run tools/make-a-microstack.sh
3) snapctl set config.network.ext-cidr and config.network.ext-gateway (see the example after this list)
4) Run microstack_init
5) Exit the snap shell and run microstack.launch
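For step 3, the override might look like this (values are illustrative):
snapctl set config.network.ext-cidr=10.30.30.0/24 config.network.ext-gateway=10.30.30.1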
Change-Id: I9e268495f313b29d9781d80a2468fc0a1a450aa0
Closes-Bug: https://bugs.launchpad.net/microstack/+bug/1851521
Remove some of the redundancy in tox.ini.
Fixed some lint issues that weren't caught before due to gaps in the
linting coverage.
I think that there's more work to be done here, but this does make
things better than they were before.
Change-Id: I82487dbb9366f3de16b25615bd081b6315671655
Added a question which allows off-host access to the horizon
dashboard. Activated it by default, as that's probably what people are
going to actually want.
Change-Id: I0d5bccb3b2eb2b409072d8ae5f8b923942386119
Moved to pure Python where C library conflicts arose when using
command-line tools.
Fixed erroneous assumptions about the presence and reliability of a
$HOME variable while running init.
Added tests specific to eoan, disco and xenial. They are not yet part
of the gate.
Change-Id: I2fc74fcc2ae9876442bb87a3446aef48d0428f2f
This enables basic clustering functionality. We add:
tools/cluster/cluster/daemon.py: A server that handles validation of
cluster passwords.
tools/cluster/cluster/client.py: A client for this server.
Important Note: This prototype does not support TLS, and the
functionality in the client and server is basic. Before we roll
clustering out to production, we need to have those two chat over TLS,
and be much more careful about verifying credentials.
Also included ...
Various fixes and changes to the init script and config templates to
support cluster configuration, and allow for the fact that we may have
endpoint references for two network ips.
Updates to snapcraft.yaml, adding the new tooling.
A more formalized config infrastructure. It's still a TODO to move the
specification out of the implicit definition in the install hook, and
into a nice, explicit, well documented yaml file.
Added nesting to the Question classes in the init script, as well as
strings pointing at config keys, rather than having the config be
implicitly indicated by the Question subclass' name. (This allows us
to put together a config spec that doesn't require the person reading
the spec to understand what Questions are, and how they are
implemented.)
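Roughly, a question might now be declared like this (a hypothetical
shape only; attribute and key names are not the actual ones):
    # Hypothetical shape only; not the actual classes from the init script.
    class Question:
        """Base class: each question maps to a snap config key."""
        _type = 'auto'       # e.g. 'boolean', 'string'
        config_key = None    # explicit config key instead of deriving from the name

    class DashboardAccess(Question):
        """Allow off-host access to the Horizon dashboard?"""
        _type = 'boolean'
        config_key = 'config.network.dashboard-access'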
Renamed and unified the "unit" and "lint" tox environments, to allow
for the multiple Python tools that we want to lint and test.
Added hooks in the init script to make it possible to do automated
testing, and added an automated test for a cluster. Run with "tox -e
cluster".
Added cirros image to snap, to work around sporadic issues downloading
it from download.cirros.net.
Removed ping logic from snap, to workaround failures in gate. Need to
add it back in once we fix them.
Change-Id: I44ccd16168a7ed41486464df8c9e22a14d71ccfd
Moved security rules and keypair creation into init first.
The launch script now takes the image name as a positional argument, and
the instance name as a named argument. This makes it work more like
launch in other Canonical tools.
Written in Python, for ease of maintenance.
--retry and --wait args allow it to behave like tests expect it to,
while humans will get a much more intuitive (and much less noisy)
experience.
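Typical usage might then look like this (the --name flag is an
assumption; --retry and --wait are as described above):
microstack.launch cirros --name test-instance --retry --wait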
Also increased the time we wait for a ping on the host, to allow for
slower pure-qemu emulation, and to bring it in line with what
Tempest does in similar situations.
Change-Id: I11dcc098012468e9c88dcc7af78cde6920f31ecd
Ported basic-test.sh to test_basic.py, and folded in
test_horizonlogin.py.
Made a testing framework for shared components.
Added test_control.py
Got rid of default .stestr.conf, as we're going to have multiple tests
running, and one conf is confusing.
Manually ordering functional tests for now, as stestr noms too much
output, and runs things in parallel, which doesn't work for our
functional tests.
Skipping compute node test for now, as it won't work until we can
connect to a control node with databases and such.
Moved very-basic-test.sh to tools/make-a-microstack.sh. It's really
more of a tool for manual testing than an automated test.
Added test-requirements and updated gitignore.
Moved auto-detection of kvm extensions to init, rather than test, as
it makes more sense there.
Change-Id: Iba7f7fe07cbb066790f802cf2a7c87c68994062c
This lays the groundwork for interactive init, as well as being able
to specify control and compute nodes.
Added preliminary config lists for control and compute nodes. Added
appropriate default snapctl config settings in install script.
Also changed "binary" questions to "boolean" questions, as that's
better wording, and it means that my docstrings are not a confusing
mix of "boolean" and "binary" when I forget which term I used.
Snuck in a fix for the "basic" testing environment -- it was missing
the Python requirements, and was therefore failing!
Change-Id: I7f95ab68f924fa4d4280703c372b807cc7c77758
Move logging output for most services to systemd.
Add a hook in snap.openstack to tell OpenStack services to wait
until we set database.ready in the snap config before starting. This
prevents spamming systemd with error messages before we run
microstack.init (See matching PR against snap.openstack, coming soon.)
Incidentally fix an issue with the way that shell.py was handling
CalledProcessError and parsing output.
Order patches part after uca-sources, to work around an issue we
discovered with apt update while those two parts are running in
parallel. (python-apt segfaults, and no fun is had by anyone.)
Remaining gaps in our logging: systemd still displays some errors
during init, which might be fixable with further ordering of snapctl
start invocations. We're also relying on MySQL and RabbitMQ log output
to know when those services are started, so we haven't moved their
output to systemd just yet.
Dropped in a fix to work w/ updated version of snap.openstack.
Change-Id: I130ed730c14ab35b8b677b9c9f573fa6fe1e8f13
Move openstack-projects part from python2 to python3.
Add cloud archive.
Update qemu and libvirt versions to those from the cloud archive (they
work with python3, while the distro package versions don't).
Switch from rocky to stein.
Fetch libvirt and qemu sources via "apt source". This gets rid of
sub-version hard-coding in snapcraft.
Update hard-coded references in tests from rocky to stein.
Change-Id: Idb38717998a13feaaf0782e880e540f28bc452a8