289 Commits

Author SHA1 Message Date
Kevin Carter
e9726bf252 Add task to regather facts after run
This role will introduce quite a bit of state change within the host
it's deployed on. After the run we should force regather facts to ensure
we have the most up-to-date information before running any other
playbooks/roles on the host.

Change-Id: I05d71964f96a8e025aa0f89f37f8dcb2a705a2e5
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-05-15 16:30:08 +00:00
Kevin Carter
25478e9b4e Enable quota system and set qgroups
This change implements the machinectl quota system and qgroups when
they're enabled and available. This change is being implemented to
resolve an issue where machinectl based containers using a loopback file
system spam DMESG with the following:

* BTRFS error (device loop0): could not find root $INT

While various upstream sources say this error is benign[0], it raises
an inconsistency flag within the host system and is speculatively the
cause of our inconsistent read-only/Full-FS issues we've seen in the
integrated gate. Once the qgroups are properly set up the system will
remove the inconsistency flag and the message spam will stop.

* BTRFS info (device loop0): qgroup scan completed (inconsistency flag cleared)

To resolve this issue the quota system is being enabled by default
within the "lxc_host" role. This change essentially acknowledges
the built-in quota system and when enabled provides for the ability
to set / define specific quota (qgroup) options as necessary. While
many deployers may never use these options or this tooling, the role
will now properly set everything up should it ever be needed.

[0] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1651435
Closes-Bug: #1753790
Depends-On: I34a41ac8a9fe4419254284c83f4600efee274c04
Change-Id: Ica79472568799098ebf83c6cefc585f117975f37
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-05-14 22:33:44 -05:00
OpenStack Proposal Bot
454583cf6e Updated from OpenStack Ansible Tests
Change-Id: I38a3a01c8c91c8c696e5fb48e4adf104fea847cc
2018-05-09 19:38:49 +00:00
OpenStack Proposal Bot
0f8afec802 Updated from OpenStack Ansible Tests
Change-Id: If25d4932eee6cce1edc208ab4baa71db28f9330e
2018-04-30 05:12:05 +00:00
Logan V
e58699c1bd Remove veth wiring check for machine-id
machine_id is not registered until further down in the file, so
this will fail with "The error was: |changed expects a dictionary"

We don't see the failure in our gates because the two preceding
conditions: not ((default_configuration_container | changed) or
           (bind_configuration_container | changed))

are always true, so the machine_id test is never used.

In an existing environment where the container is being updated
from an old configuration to the new networkd installation, it is
very possible that default_configuration_container and
bind_configuration_container are not changed, so the machine_id
var is checked for changed state. At that point ansible fails
because the var is undefined.

Change-Id: I0b95c6c5d0f52344d476e52219c1ce31edcf65da
18.0.0.0b1
2018-04-01 23:13:51 -05:00
Jesse Pretorius
22b9be1248 Remove tests-repo-clone.sh
Now that run_tests.sh handles the tests repo clone, we can
remove the use of the older tests-repo-clone.sh script.

Change-Id: Iead678057f3888fe7aaddce6685865f4fcdfed53
2018-03-28 10:11:34 +01:00
OpenStack Proposal Bot
8ff5feb72e Updated from OpenStack Ansible Tests
Change-Id: I42e962aae5fa5da426dd79ebe4266c64a27a72a7
2018-03-27 15:47:53 +00:00
Zuul
61fcf1af80 Merge "Add container journal linking" 2018-03-24 14:58:59 +00:00
Kevin Carter
72a16fd9e5 Add container journal linking
The container and host can link journals giving operators the ability to
log stream and check on the health of a system without needing to login
(attach) to the container. This change implements journal linking for
LXC containers following the reference systemd specification.

Reference implementation:
https://www.freedesktop.org/software/systemd/man/systemd-nspawn.html#--link-journal=

Change-Id: Id68cf39a77b5dd9c13c010829b47cd7a414378bc
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-03-24 01:45:25 +00:00
Kevin Carter
846b4f9ed2 Allow deployers to define the container type
The variable `lxc_user_defined_container` has been added which allows a
deployer to define the container variable file in use for a given
container type.

Depends-On: https://review.openstack.org/554383
Change-Id: Ia1373bfa916b4add49a8444d2e4553f898650328
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-03-24 01:18:07 +00:00
Zuul
883bc78164 Merge "Collect physical host facts if missing" 2018-03-19 15:26:16 +00:00
Zuul
8c8a22affa Merge "Use hostnamectl to set the container hostname" 2018-03-19 13:13:40 +00:00
Zuul
52ac75ca14 Merge "Add lxc.haltsignal to container configs" 2018-03-19 09:17:59 +00:00
Logan V
59f326b63e Collect physical host facts if missing
Allow the role to collect facts for the physical host if missing,
since the role has a hard dependency on checking the physical host's
kernel version.

In the OSA container create playbook[1], facts are collected only
if the physical host itself is included in the playbook scope. When
a '--limit containername' parameter is used, no physical host facts
are collected and the role fails with:

The conditional check 'hostvars[physical_host]['ansible_kernel'] |
version_compare('3.18.0-0-generic', '<')' failed. The error was:
Version comparison: 'dict object' has no attribute 'ansible_kernel'

Change-Id: Id84aefed6c0129909cb6153258863564c7cc914a
2018-03-18 22:40:41 -05:00
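The failure mode described in this commit can be sketched in Python. This is only an illustration, not the role's Ansible code: the function names are hypothetical, and the numeric parsing is a simplification of what `version_compare` does. It shows why a missing `ansible_kernel` fact has to be handled before the kernel comparison can run.

```python
import re


def kernel_tuple(version):
    """Extract the leading numeric components, e.g. '4.15.0-20-generic' -> (4, 15, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised kernel version: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def kernel_older_than(hostvars, physical_host, minimum="3.18.0"):
    """Compare the physical host's kernel against a minimum version.

    Raises LookupError when the fact is absent, which is the situation the
    commit addresses by gathering the physical host's facts first.
    """
    facts = hostvars.get(physical_host, {})
    if "ansible_kernel" not in facts:
        raise LookupError(f"no ansible_kernel fact gathered for {physical_host}")
    return kernel_tuple(facts["ansible_kernel"]) < kernel_tuple(minimum)


# Hypothetical hostvars: one host with facts, one limited out of the play.
hostvars = {"host1": {"ansible_kernel": "3.13.0-24-generic"}, "host2": {}}
print(kernel_older_than(hostvars, "host1"))
```

Calling `kernel_older_than(hostvars, "host2")` raises `LookupError`, mirroring the case where `--limit` excludes the physical host from fact gathering.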
Zuul
2520c83523 Merge "Correct cgroup access on older kernels" 2018-03-18 12:17:37 +00:00
Kevin Carter
a2fc120d06 Use hostnamectl to set the container hostname
This change sets the hostname of containers using the hostnamectl
command which has several enhancements over the legacy method. By using
hostnamectl the command will validate the hostname for correctness,
ensuring the container hostnames conform to the RFC.

The old methods have been removed and the command has been made part of
the handlers and will be run after the activation of dbus.

Change-Id: I158a5deb0685d2dcd436d7dd92caecb9966a025e
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-03-18 01:34:43 +00:00
Kevin Carter
514a894cce Remove generic default interfaces
With the implementation of networkd the ENI scripts and config files for
the default interfaces shipped with the lxc container images we use are
no longer useful. These old files can cause conflicts in networking
should the old scripts and networkd get confused especially when it
comes to an interface that is setup for DHCP. This change simply defines
the default interfaces for both suse and ubuntu and ensures they're
deleted.

The interface flush handler has been set to `failed_when: false` because on
initial container create the eth0 device may not exist until
systemd-networkd is restarted for the first time.

Change-Id: I70abb5ec4226a81a065e495e19f5e7e0c569e1b0
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-03-17 12:55:49 -05:00
Kevin Carter
a372183af9 Add lxc.haltsignal to container configs
This change adds the lxc.haltsignal option to the container config which
ensures containers are gracefully stopped but quickly.

Presently, when a container is restarted it can hang for 20 - 30 seconds
which is due to the fact that the default stop signal is SIGPWR.
While the hang when stopping a container is not 100% reproducible in all
environments it can be seen when simply executing `lxc-stop`. If the
user were to stream the container journal while stopping the container
it would be seen that the container hangs when trying to shut down some
systemd services. If the `lxc-stop` command is executed a second time the
container is stopped more forcibly with SIGRTMIN+3. This change is using
an example stop signal from the lxc documentation [0] which is
implementing a Real-time signal, SIGRTMIN+n. More on the signal used can
be found here [1].

[0] http://manpages.ubuntu.com/manpages/xenial/en/man5/lxc.container.conf.5.html
[1] http://manpages.ubuntu.com/manpages/xenial/en/man7/signal.7.html

Change-Id: I01e82eabf17d2ac5a89c13ef56616fd1fe0607dd
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-03-17 17:16:15 +00:00
Paul Belanger
f2b3780c8d Fix 'properties' is undefined errors
When following the example playbook for the role, it is possible for
a "'properties' is undefined" error to be raised. Because default() filters don't work
with undefined dictionary keys, ensure a default dictionary for
properties exists.

Change-Id: Iee2e992efe8ee801506e5de622bd90ac3915a33c
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
2018-03-16 22:00:36 -04:00
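The idea of guaranteeing a default dictionary before looking up a key in it can be shown with a short Python analogue. This is a sketch under stated assumptions, not the role's Jinja2 expression: the function name and the `example_key` property are hypothetical.

```python
def container_property(container_vars, key, fallback=None):
    """Look up a key under 'properties', tolerating a missing or empty mapping."""
    # Ensure a dictionary exists first, so the key lookup can never hit
    # an undefined parent object.
    properties = container_vars.get("properties") or {}
    return properties.get(key, fallback)


# A container definition without any 'properties' entry still resolves cleanly.
print(container_property({}, "example_key", fallback=False))
print(container_property({"properties": {"example_key": True}}, "example_key"))
```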
Kevin Carter
774aef5472 Correct cgroup access on older kernels
This change adds an auto mount entry into config which ensures containers
have access to the cgroups, even if they're read only. Without this
change containers see a notable slowdown and repeating message regarding
a failure when resetting the device list. This option has no effect and
is not needed on newer kernels (4.15+) as cgroup namespaces and device
access is inherent to the creation of a container namespace.

> Example Error: http://paste.openstack.org/show/702764

While this change is introducing new config into the container it is not
forcing a container restart. This approach has been taken to ensure
we're correcting the issue on greenfield deployments but not impacting
running ones.

Change-Id: I31b1b5a044687f52b1c54049ba03c65ecda34b51
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-03-16 00:21:26 -05:00
Jesse Pretorius
f434c30054 tox.ini: Expose USER environment variable to execution environment
In order to allow the use of the environment variable which informs
Ansible which user executed the playbook, we pass the USER env var
into the environment that tox builds.

Change-Id: I5ba653fceec3db1073ab639d835f6a250b11a4e6
Implements: blueprint python-build-install-simplification
Signed-off-by: Jesse Pretorius <jesse.pretorius@rackspace.co.uk>
2018-03-15 17:56:19 +00:00
OpenStack Proposal Bot
beeb8573f6 Updated from global requirements
Change-Id: I1da560616a0883a51f0252ff31641a535d147819
2018-03-15 07:30:04 +00:00
OpenStack Proposal Bot
865c4b657e Updated from global requirements
Change-Id: Ibf282e513d1e3025750261bcb763f5b395d63a6b
2018-03-13 07:11:38 +00:00
OpenStack Proposal Bot
7f33e14ed7 Updated from global requirements
Change-Id: Iada9a6f9947c8983b3b4d84cd98b794504a7d10c
2018-03-11 13:49:56 +00:00
ZhongShengping
6eb5907be5 Follow the new PTI for document build
For compliance with the Project Testing Interface as described in:
https://governance.openstack.org/tc/reference/project-testing-interface.html

For more detailed information, please refer to:
http://lists.openstack.org/pipermail/openstack-dev/2017-December/125710.html

Change-Id: I0e329f1414787aaedc59835600a3c972bb817f9c
2018-03-09 12:01:25 +08:00
Major Hayden
c36980380e Flush entire interface rather than just routes
This patch changes the flush routes handler to flush the entire
interface config from the interface. This is needed because
systemd-networkd does not restore the route of non-DHCP interfaces
when flushing routes and restarting systemd-networkd.

Change-Id: I17748b0dd2307fd9bee705140c67883140090298
Signed-off-by: Major Hayden <major@mhtx.net>
2018-03-07 04:02:25 +00:00
Zuul
f89140f478 Merge "Always create containers with fixed MAC addresses" 2018-03-07 00:07:39 +00:00
Zuul
73d5347233 Merge "tests: Use domain names for external network testing" 2018-03-06 18:11:47 +00:00
Markos Chandras
49309c4a92 Always create containers with fixed MAC addresses
Patch I0d83fd4895d4c5beaf5a84a239c1a1ed71521dee dropped the ARP=yes
option for networkd because it's not supported by old systemd releases.
This however brings back a problem where the default sysctl
arp_notify option in the kernel may not be correctly set for our use case.
Containers are created with random MAC addresses so we need to ensure
that ARP entries are populated correctly when a container is restarted.
Instead of having to implement some sort of a new workaround on the host,
it's probably better to create all containers with fixed MAC addresses from
now on.

Change-Id: I8ad390fc3ce27756f26c57c92aaa3adc8e506a17
2018-03-06 17:00:36 +00:00
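One way to picture a fixed MAC address is as a value that stays the same every time it is computed for a given container interface. The Python sketch below derives one from a hash of the names. This derivation is an assumption chosen for illustration and is not taken from the role; the function name, the hashing scheme, and the `00:16:3e` prefix are all hypothetical choices here.

```python
import hashlib


def fixed_mac(container_name, interface, prefix="00:16:3e"):
    """Derive a stable MAC address from the container and interface names."""
    digest = hashlib.sha256(f"{container_name}/{interface}".encode()).digest()
    # Use the first three digest bytes as the device-specific half of the MAC.
    suffix = ":".join(f"{byte:02x}" for byte in digest[:3])
    return f"{prefix}:{suffix}"


# The same inputs always yield the same address, so a restart does not change it.
print(fixed_mac("container1", "eth0"))
```

Because the address no longer changes between restarts, neighbours on the network do not end up holding an ARP entry for a MAC that has disappeared.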
Markos Chandras
dd9b378642 tests: Use domain names for external network testing
We should use domain names for the external network testing task in
order to verify not only that the default gateway works properly but also
that our DNS is able to resolve hostnames.

Change-Id: I3aebcf1dff8268e4dbaebae8fb598ee27e3f481d
Depends-On: I316c3851f40f08d272b7bb5f7165e010e3a95c3a
Depends-On: Ied7632037f737c3f32c34dac70531065c54496c9
Depends-On: I14f8373897da28dea2ea03500c2be46c5b40d51c
Depends-On: I0d83fd4895d4c5beaf5a84a239c1a1ed71521dee
2018-03-06 10:29:05 +00:00
Markos Chandras
c210b45ba7 templates: networkd: Drop Link=ARP from networkd configuration
The ARP option has been added in systemd-232. As such, current stable
distributions may not support it so drop the option and let the kernel
decide what to do with ARP. Fixes the following warning:

[/etc/systemd/network/eth0.network:14] Unknown lvalue 'ARP' in section 'Link'

Link: https://github.com/systemd/systemd/pull/3854
Link: 99d2baa2ca
Depends-On: I14f8373897da28dea2ea03500c2be46c5b40d51c
Change-Id: I0d83fd4895d4c5beaf5a84a239c1a1ed71521dee
2018-03-06 09:49:19 +00:00
Markos Chandras
5896c16b9f templates: networkd: UseDNS requires systemd-resolved
The UseDNS option requires the systemd-resolved service so set this
option based on the lxc_container_enable_resolved variable.

Change-Id: I5b7c3f01534f5ccbaf76aced673aefc6ec7fcf6e
2018-03-06 09:49:12 +00:00
Kevin Carter
aee117fc09 Set a route metric when static routes are used
When using a static route we need to set a route metric to ensure the
priority of the routes being passed in. This change ensures we maintain
our expected interface and functionality should any static routes be
passed into the container.

Before the implementation of networkd, ENI would amend the main table
with the defined routes in the order they were written. However
systemd-networkd inserts the defined routes at the top of the default
table which can cause confusion and conflict. This change simply adds
a route metric to all defined routes and increments the metric integer
based on the list index which explicitly ensures all defined routes
are prioritized in the order in which they were written.

Change-Id: I13768580fbd926033fde4a74cbbf90b9eda24658
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-03-02 23:32:24 -06:00
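The metric rule described in this commit (one metric per route, incremented by list index) can be sketched in Python. This is an illustration only and not the role's template: the `routes` structure, the `base_metric` parameter, and the function name are assumptions.

```python
def assign_route_metrics(routes, base_metric=1):
    """Return copies of the routes with a metric derived from list position."""
    return [
        {**route, "metric": base_metric + index}
        for index, route in enumerate(routes)
    ]


# Hypothetical static routes, written in the order they should be preferred.
routes = [
    {"destination": "10.1.0.0/24", "gateway": "10.0.0.1"},
    {"destination": "10.2.0.0/24", "gateway": "10.0.0.2"},
]
for route in assign_route_metrics(routes):
    print(route["destination"], route["metric"])
```

Lower metrics are preferred, so the first route written keeps the highest priority regardless of where networkd inserts it in the table.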
Zuul
dde887a2e1 Merge "Update the outdated links" 2018-03-03 02:52:14 +00:00
Markos Chandras
be6b7da72e zuul: Add missing CentOS and openSUSE jobs
openSUSE and CentOS have been voting jobs for a while so we should
start testing all the scenarios on them. The only job that hasn't
been added is the ZFS one since there is no such package on openSUSE
or CentOS.

Change-Id: Icde5ed7a4e6be8ac19412f15b84febf2096ba404
2018-03-02 15:40:55 +00:00
Markos Chandras
e3d35d306c tests: Use ansible_pkg_mgr fact to determine the BTRFS package
The ansible_distribution variable is causing some troubles since it can
contain spaces etc. As such, we can simply use the ansible_pkg_mgr
fact to figure out the name of the package we want to install.

Change-Id: Ic92eb1f9030df2883b049b9868e031ff4f0d42f2
2018-03-02 15:40:26 +00:00
Kevin Carter
815ece7454 Unify container network interfaces with networkd
Unify container network interfaces using Systemd Networkd for ubuntu,
centos, and openSUSE. This change allows the role to use a single way to
configure container networks.

Care has been taken to ensure we're able to cleanly upgrade to the new
capabilities within existing environments without breaking any feature
compatibility or causing any container restarts.

It's also worth noting that all of the pre/post networking up/down
script options have been converted to systemd "oneshot" services. This
retains the ability to run adhoc scripts post network availability
while also opening up this capability, which used to be ubuntu only,
to all of our supported operating systems.

> Our usage of `lxc-attach` was removed in favor of `nsenter` to fix an
  issue where multiple `lxc-attach` commands issued to a single physical
  host could result in a hang.

> Scripts that were being generated inline have been placed into
  template files. This solves a long standing memory consumption issue
  when creating lots of containers. The old shell tasks will now be
  executed from a generated script. While this should also help with
  debugging, the main driver is to ensure better system stability.

> A lot of cleanup has been done throughout the task files and
  templates. In the process of updating the role to use unified
  networking a lot of duplicate tasks, scripts, and processes have
  been consolidated.

> Handlers have been added for network connection wait conditions and
  to various service restarts.

> The OSA plugins have been added to this role as a dependency. We
  rely on the connection plugins throughout the stack however we were
  doing a lot of workarounds to cater to the possibility of a deployer
  running this role without them. This change simply adds the plugins
  as a known dependency which allows for a more streamlined setup.

Change-Id: I5d3ddcfa11d575648a69a04f2fb30236c2c89da3
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-03-01 10:55:14 +00:00
lidong
9a464a66ad Update the outdated links
Change-Id: I9e5b4c11bede6a97cf7effd39f67fbbee3503193
2018-02-27 13:45:28 +08:00
Zuul
1264f62e48 Merge "Update minimum Ansible version" 2018-02-16 17:56:41 +00:00
Major Hayden
1d2840a6a8 Update minimum Ansible version
Change-Id: If6927aebeaa101909d9418a32041b7cc0138b5da
2018-02-15 10:22:59 -06:00
Zuul
65c62aab21 Merge "Change include: to include_tasks:" 2018-02-15 15:02:32 +00:00
Major Hayden
0aa5ae5290 Change include: to include_tasks:
This removes the warnings in Ansible 2.4+.

Change-Id: Icd4757495ae42df95ced7e6cd6cdc8c6a04eb669
2018-02-15 07:54:54 -06:00
Zuul
f167955e66 Merge "Update reno for stable/queens" 2018-02-15 12:49:22 +00:00
Zuul
f9f0dd1c74 Merge "Generate a unique machine-id" 2018-02-15 12:49:21 +00:00
OpenStack Proposal Bot
ab635bf27b Updated from OpenStack Ansible Tests
Change-Id: I9639c8040d139411d819cc00db3ed3439c0de21b
2018-02-14 20:04:05 +00:00
Kevin Carter
01fc5fa643 Generate a unique machine-id
The systemd machine-id needs to be unique on all network attached
devices. This change ensures that when a container comes online, a
unique machine-id is generated if one was not already present. When
the machine-id is created for the first time the container will restart
so the new ID can take effect.

More information on the machine-id can be found here:
https://www.freedesktop.org/software/systemd/man/machine-id.html

Change-Id: Ib25aeeecf1e6001e6c6b1a7d6b6d50eca7ab45fa
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-02-14 16:11:03 +00:00
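The "generate only if not already present" behaviour can be sketched in Python. This is a minimal illustration under stated assumptions, not the role's task: the function name and the returned `created` flag are hypothetical, and a random UUID rendered as 32 lowercase hexadecimal characters is used as the ID format.

```python
import tempfile
import uuid
from pathlib import Path


def ensure_machine_id(path):
    """Write a new machine-id if the file is missing or empty.

    Returns (machine_id, created); 'created' marks the case where a
    restart would be needed for the new ID to take effect.
    """
    target = Path(path)
    if target.exists():
        existing = target.read_text().strip()
        if existing:
            return existing, False
    machine_id = uuid.uuid4().hex
    target.write_text(machine_id + "\n")
    return machine_id, True


# Demonstrate against a temporary file rather than a real /etc/machine-id.
with tempfile.TemporaryDirectory() as workdir:
    demo_path = Path(workdir) / "machine-id"
    first_id, created = ensure_machine_id(demo_path)
    second_id, created_again = ensure_machine_id(demo_path)
    print(created, created_again, first_id == second_id)
```

The second call leaves the existing ID untouched, which matches the commit's intent that only a first-time creation triggers a container restart.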
d69d576fa9 Update reno for stable/queens
Change-Id: Ife16e3eed7b04d8a7123e85440cd71ae962f235f
2018-02-14 15:45:42 +00:00
Zuul
87353ebc31 Merge "Zuul: Remove project name" 17.0.0.0rc1 2018-02-05 20:31:50 +00:00
James E. Blair
3cae698656 Zuul: Remove project name
Zuul no longer requires the project-name for in-repo configuration.
Omitting it makes forking or renaming projects easier.

Change-Id: Ie0a2f156de3c136439ac4dc5e28b16ed5509288c
2018-02-05 11:17:13 -08:00
OpenStack Proposal Bot
cfe49479b7 Updated from global requirements
Change-Id: Id1ac6dcf7749b80541b6825f7312c974df80929b
17.0.0.0b3
2018-01-24 01:16:00 +00:00