The Ansible yum module installs all packages available in the repo
if you use an asterisk. We will instead use yum -y update name*.
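A minimal Ansible sketch of that pattern; the task name and package glob are purely illustrative:
  - name: update installed packages matching a glob   # illustrative task name
    shell: yum -y update openstack-*                  # unlike the yum module, this only updates packages that are already installed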
Change-Id: I8e71367ae91faa06313711c6a954c61af705fd8f
Resolves: rhbz#1549845
Some container yaml files do not get the
service_config_settings from the base file.
This patch makes the following docker yaml files get
the service_config_settings:
docker/services/neutron-l3.yaml
docker/services/neutron-metadata.yaml
docker/services/neutron-ovs-agent.yaml
Related-Bug: #1757066
Change-Id: Ifc8def10da0b10decd12efaab4452ff46f3c685b
Use the host_prep_tasks interface to handle undercloud teardown before we
run the undercloud install.
The reason for not using upgrade_tasks is that the existing tasks were
created for the overcloud upgrade first, and there is too much logic in
them right now to easily re-use the bits for the undercloud. In the
future, we'll probably use upgrade_tasks for both the undercloud and
overcloud, but right now this is not possible and a simple way to move
forward was to implement these tasks that work fine for the undercloud
containerization case.
Workflow will be:
- Services will be stopped and disabled (except mariadb)
- Neutron DB will be renamed, then mariadb stopped & disabled
- Remove cron jobs
- All packages will be upgraded with yum update.
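A rough host_prep_tasks sketch of that workflow; the service and cron names are illustrative, and the mariadb/neutron DB handling is omitted:
  host_prep_tasks:
    - name: stop and disable a non-containerized service   # e.g. openstack-nova-api, illustrative
      service:
        name: openstack-nova-api
        state: stopped
        enabled: no
    - name: remove old cron jobs                            # illustrative crontab owner
      file:
        path: /var/spool/cron/keystone
        state: absent
    - name: upgrade all packages
      shell: yum -y update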
Change-Id: I36be7f398dcd91e332687c6222b3ccbb9cd74ad2
Nova compute and cinder volume use oslo.concurrency
processutils.execute to run privileged commands.
Containers inherit the file descriptor limit from the docker daemon
(currently 1048576), which is too high and leads to performance
issues. This patch sets the nofile limit to 1024 for nova compute
and 131072 for cinder volume, which is reasonable: before
containers, nova compute used the host default, i.e. 1024, and cinder
volume used a systemctl override ([1]), i.e. 131072. Also updated neutron
l3, dhcp and ovs agent to use parameters for ulimit configuration.
[1] https://review.rdoproject.org/r/#/c/1360/.
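A hedged sketch of the parameter-driven ulimit wiring in a docker service template; the parameter name is an assumption, not the exact one added:
  parameters:
    DockerNovaComputeUlimit:              # assumed parameter name
      default: ['nofile=1024']
      description: ulimit for the nova compute container
      type: comma_delimited_list

  # later, in the container definition under docker_config:
        nova_compute:
          image: {get_param: DockerNovaComputeImage}
          ulimit: {get_param: DockerNovaComputeUlimit}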
Closes-Bug: #1762455
Related-Bug: #1760471
Related-Bug: #1757556
Change-Id: I4d4b36de32f8a8e311efd87ea1c4095c5568dec4
This init container runs docker-puppet manually and is responsible for
provisioning the mysql users and passwords. Currently this doesn't get
run every time, since the configuration stays the same even if the users
or passwords change (which are fetched from hieradata). Allowing this to
run every time will allow us to change database passwords.
Closes-Bug: #1762991
Change-Id: I1f07272499b419079466cf9f395fb04a082099bd
As part of the minor update workflow and the upgrade workflow, this changes
the pacemaker haproxy bundle resource to add the mount needed for public
TLS to work.
This also handles reloading the container to fetch any new certificates
and, if needed, restarting the pacemaker resource (for upgrades), since
we would need pacemaker to re-create the resource.
Change-Id: I850f4de17e7f7e3b46deb27119227ef76658dcb5
Closes-Bug: #1759797
The ovn-cms-options config option was mistakenly added as ovn-cms-opts.
As a result ovn_cms_options is never set in the SBDB and the OVN
mechanism driver is unable to schedule routers as expected.
Change-Id: Iaa89a1dbec732c3aa743fa3f5cf1f4931e2ab9ef
Added nfs as an option where CinderBackupBackend was previously hardcoded
to either ceph or swift. Also added some parameters for this
driver: CinderBackupNfsShare and CinderBackupNfsMountOptions.
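For example, an environment file could select the new backend roughly like this (the share value is a placeholder):
  parameter_defaults:
    CinderBackupBackend: nfs
    CinderBackupNfsShare: 192.168.24.1:/export/cinder_backup   # placeholder share
    CinderBackupNfsMountOptions: ''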
Depends-On: Ic0adb294aa2e60243f8adaf167bdd75e42c8e20e
Change-Id: I29a488374726676a28fb82f2f950db891fcf9627
Closes-Bug: #1744174
InternalTLSVncCAFile currently defaults to /etc/ipa/vnc.crt.
Certmonger attempts to save the CA cert to this path as cert_t, however
/etc/ipa is etc_t.
Moving to /etc/pki/CA/certs, which is cert_t, resolves the issue and is
arguably a more suitable location.
Change-Id: Ib275fc43dd772851511598a4932c19fcda706479
Neutron agents use oslo-rootwrap-daemon to run
privileged commands. Containers inherit the file descriptor
limit from the docker daemon (currently 1048576), which is too
high and leads to performance issues. This patch sets the
nofile limit for neutron agent containers to 1024, which is
reasonable as before containers they were using the host default,
i.e. 1024.
Depends-On: I0cfcf4e3e3e13578ec42e12f459732992fb3a760
Change-Id: Iec722cdfd7642ff3149f50d940d8079b9e1b7147
Related-Bug: #1760471
Zaqar was using mongodb by default but we haven't supported mongodb
since pike. This change switches Zaqar to use redis by default.
Change-Id: If6ed9fddf4a4fcff3bb9105b04df777ec8a8990e
Closes-Bug: #1761239
Name was defined as ceph_client instead of ceph_external.
Closes-Bug: 1761531
Change-Id: I5fd84bbdbb175d81e247664929f728fa1c5b4bdb
Signed-off-by: Tim Rozet <trozet@redhat.com>
The Neutron UID is not static and may be different between the host and
neutron container. Since we generate certificates and keys on the host
for neutron and then mount them in a container, it is highly likely the
container Neutron UID will not match the one used on the host to
generate the files and reading these files will fail in the container.
This patch modifies the permissions after the files are mounted in the
container to be owned by the correct Neutron UID.
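Conceptually the fix amounts to a one-off root step run after the mount, something like the sketch below; the container name, image parameter and key path are illustrative only:
  neutron_server_tls_fix_perms:                 # hypothetical container name
    image: {get_param: DockerNeutronApiImage}   # assumed image parameter
    user: root
    command: ['chown', '-R', 'neutron:', '/etc/pki/tls/private/neutron.key']   # illustrative path
    volumes:
      - /etc/pki/tls/private/neutron.key:/etc/pki/tls/private/neutron.key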
Closes-Bug: 1759049
Depends-On: I83b14b91d1ee600bd9d5863acba34303921368ce
Change-Id: Ibad3f1af4b44459e96a6dc9937e5fcef3e6335f4
Signed-off-by: Tim Rozet <trozet@redhat.com>
This reverts commit bd48087520.
After further inspection it seems that the panko dbsync shouldn't be
needed, as it would only upgrade a newly created empty db.
And this is assuming we find a way to:
- configure the panko database connection properly
- create the db
We don't have access to this information[1] as the
new hieradata hasn't been rendered at this stage.
So all of that just to upgrade a newly created (presumably empty) database
seems like too much trouble.
The db will be created in the last step of the FFU.
[1] https://github.com/openstack/tripleo-heat-templates/blob/stable/ocata/puppet/services/panko-base.yaml#L39..L75
Change-Id: Ie68849a7033c199c339d28cdb10c3dba9419904b
Closes-Bug: #1760135
This is necessary for certain setups (such as enabling multiple LDAP
domains). So, instead of adding checks every time to see whether
we need to refresh or not, let's just always refresh, thus simplifying
the already convoluted logic here.
Change-Id: Ie1a0b9740ed18663451a3907ec3e3575adb4e778
Closes-Bug: #1748219
Co-Authored-By: Raildo Mascena <rmascena@redhat.com>
During major upgrade, ensure that the haproxy bundle exposes
the HAProxy stats socket by ensuring there is a bind mount of
/var/lib/haproxy from the host.
Also create /var/lib/haproxy on the host with host_prep_tasks,
and make sure that permissions will be set by Kolla init
at the next container restart.
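A minimal sketch of the host-side part; the SELinux type is an assumption and the Kolla permission entry is omitted:
  host_prep_tasks:
    - name: create persistent directory for the haproxy stats socket
      file:
        path: /var/lib/haproxy
        state: directory
        setype: svirt_sandbox_file_t   # assumed label so the container can use the bind mount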
Depends-On: Ib833ebe16fcc1356c9e0fc23a7eebe9c4b970c55
Change-Id: I0923375fef9f392d3692afb50b21fee7b57c3ca0
This patch adds the possibility to pass non-standard ports of the monitoring
RabbitMQ instance to the sensu-client container health check.
Change-Id: Icc01ce23b3fc538811b4dfc4fbaba18dc7165f89
Add an ansible task to run mysql_upgrade whenever a container
image upgrade causes a major upgrade of mariadb (e.g. 5.5 -> 10.1)
. If the overcloud was containerized prior to the major upgrade, the
mysql upgrade job is run in an ephemeral container (which is where the
latest version of mysql comes from) and uses credentials from the Kolla
configuration.
. Otherwise the upgrade job is run from the host (once the mysql
rpm has been updated) and uses credentials from the host.
We log the output of the script in the journal. Also, the mysql server
needs to be started temporarily, so use a temporary log file for it
when run from the ephemeral container.
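Roughly, the containerized path boils down to a shell task along these lines; the image variable, credentials handling and timeout are illustrative assumptions:
  - name: run mysql_upgrade from an ephemeral container using the new image
    shell: |
      docker run --rm -v /var/lib/mysql:/var/lib/mysql {{ mysql_image }} \
        /bin/bash -ecx "mysqld_safe --skip-networking --log-error=/tmp/mysql_upgrade.log & \
          timeout 60 sh -c 'until mysqladmin ping --silent; do sleep 1; done'; \
          mysql_upgrade; \
          mysqladmin shutdown"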
Change-Id: Id330d634ee214923407ea893fdf7a189dc477e5c
The directory /var/lib/vhost_sockets will be used to create vhost sockets
and should have hugetlbfs as its group name, which is common
between qemu and openvswitch so they can share the vhost sockets. The
correct selinux context also needs to be applied to the vhost_sockets directory.
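For illustration, the host-side task could look like this (the SELinux type shown is an assumption):
  - name: create the vhost socket directory shared by qemu and openvswitch
    file:
      path: /var/lib/vhost_sockets
      state: directory
      group: hugetlbfs
      setype: virt_cache_t   # assumed; use whatever context the policy expects for vhost sockets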
Closes-Bug: #1751711
Change-Id: Ib917cf86bd9a4ce57af243ab43337ea6c88bf76c
I54b5b59ef49de8d66232312bc449559a7f16eaad configures the HAProxy
service to expose the stats socket with a bind mount, however the
main service container doesn't use that bind mount. Fix that.
Change-Id: I316ab408e82cda70bed8b203b3755936392201da
HA containerized services currently log under
/var/log/pacemaker/bundles/{service-replica}.
Move the logging of those HA services into /var/log/containers,
like all the paunch-managed containers. Also leave a readme.txt
in the previous location to notify the change (taken from
Ic8048b25a33006a3fb5ba9bf8f20afd2de2501ee)
Only the main service log is being moved, e.g. for mysql:
. mysqld.log now ends up in /var/log/containers/mysqld.log
. pacemaker logs stay under /var/log/pacemaker/bundles/{service-replica}
Note: some HA services don't need to be changed during upgrade:
. cinder-{backup|volume} log under /var/log/containers/cinder
. manila-share log under /var/log/containers/manila
. haproxy only logs to the journal
Change-Id: Icb311984104eac16cd391d75613517f62ccf6696
Co-Authored-By: Jiri Stransky <jistr@redhat.com>
Partial-Bug: #1731969
During a major upgrade of a non-HA overcloud, paunch stops the
containerized mysql service, updates the container image and restarts
the containerized mysql.
After a major update of mysql (e.g. 5.5 to 10.0), run mysql_upgrade
to ensure that the on-disk database is upgraded to match the mysql
server version (e.g. update all MyISAM tables).
The mysql_upgrade cannot be performed during upgrade_steps because
paunch only runs during the deploy_steps. So run it in the
post_upgrade_steps, once we know paunch has updated mysql.
Change-Id: I6b6a531fd716ad9abcbf29886c0b1f2c64f04c9d
The upgrade task doesn't check for the service's existence, which makes
the upgrade fail during FFU.
We're using the set_fact idiom as it persists between steps.
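The idea, roughly, with hypothetical service and fact names:
  - name: check whether the service exists               # hypothetical service name
    command: systemctl is-enabled openstack-example-api
    failed_when: false
    register: example_enabled_result
  - name: persist the result as a fact for later upgrade steps
    set_fact:
      example_service_enabled: "{{ example_enabled_result.rc == 0 }}"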
Closes-Bug: #1757985
Change-Id: I1d3ccd7d3fb641d187f214c20f1d6a4d6113304a
The missing attributes docker_config_scripts and update_tasks are added to the
neutron-ovs-dpdk-agent docker service.
Closes-Bug: #1757947
Change-Id: I7301eb7a2b094236c7caad38996a4c3983f22603
If openvswitch is not started (meaning the socket file doesn't exist)
and the docker container launches first, docker may create a folder for
the db.sock file, which would prevent ovs from starting up later. We
should mount the directory instead, since ovs could be started after the
docker containers.
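i.e. mount the socket's parent directory in the container's volumes list rather than db.sock itself (path shown for illustration):
  volumes:
    - /var/run/openvswitch:/var/run/openvswitch   # directory mount; the socket appears once ovs starts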
Change-Id: I0aaed5c73c1c1485ad61202f3fca53348ef5a669
Closes-Bug: #1757111
Currently, the idiomatic "download image and retag to pcmklatest"
happens at step 2 during upgrade. This doesn't work if the stack
is already containerized before the upgrade, because pacemaker
is still running at step 2.
Reshuffle the steps at which the various upgrade tasks are run,
while keeping the ordering guarantees of the upgrade flow:
. Deletion of non-containerized resources happens at step 1,
to allow calling pcs while pacemaker is running.
. Pacemaker is stopped at step 2.
. Docker images for containerized resources are upgraded at
step 3, after the cluster is guaranteed to be stopped.
. Pacemaker is restarted at step 4 as before, once we know
that all resources have been upgraded, yum packages updated
and any potential docker restart has been executed.
Also change the way we detect containerized resources, so that
the predicate still remains valid past step 2 when pacemaker
has been stopped and has deleted its containerized resources.
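Schematically, the retag task ends up gated on the later step, along these lines (image variables are illustrative):
  - name: retag the new image as pcmklatest once the cluster is stopped
    shell: docker tag "{{ new_haproxy_image }}" "{{ haproxy_image_base }}:pcmklatest"   # illustrative variables
    when: step|int == 3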
Change-Id: I85e11dd93c7fd2c42e71b467f46b0044d4516524
The OVS socket directory needs to be mounted into the collectd
container. Otherwise, the collectd-ovs-stats or
collectd-ovs-events plugins won't be able to read stats from
ovs.
Change-Id: I2520a6d25470f144589d957737ae8ac39d3215a2
The docker nova-compute.yaml file does not have the
service_config_settings from the base file, so the
collectd and fluentd parameters defined there are not picked up
when the service runs containerized.
The patch gets the service_config_settings from the base file and
uses them when the service is containerized.
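The pattern is essentially to pass the base template's output through in the docker template's role_data (the base resource name is assumed here):
  outputs:
    role_data:
      value:
        service_config_settings:
          get_attr: [NovaComputeBase, role_data, service_config_settings]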
Closes-Bug: #1757066
Change-Id: I0b3567edd7530d1ae0a69d3ee25bd1442f14fb58
The ODL etc directory was being overridden with an empty mount directory
before kolla_start copied the puppet-generated config files. The
puppet-generated config files only include modified configuration files
and not all of the default config files. Therefore ODL was missing
config files when it started so the container was constantly crashing.
This patch removes the unwanted mount erasing the /opt/opendaylight/etc
directory and moves the upgrade file to be created in puppet-generated,
which will be copied at kolla start time for upgrade. The
puppet-generated dir is read-only, so the REST call to disable upgrade
flag in ODL will only disable it for the running instance. Therefore we
have to use ansible to write the file again to disable it in case ODL is
rebooted.
Closes-Bug: 1755916
Change-Id: Ie380cc41ca50a294a2647d673f339d02111bf6b3
Signed-off-by: Tim Rozet <trozet@redhat.com>
Refer from the docker readme to the puppet readme, as the basics are documented
there. Add docs for pre-upgrade rolling tasks, and move and reword some
related content.
Change-Id: Ie4f84481029be1e133f4da7dc6e0a22fefc4f4ad
Co-Authored-By: Martin André <m.andre@redhat.com>
Co-Authored-By: Dan Prince <dprince@redhat.com>
Co-Authored-By: Emilien Macchi <emilien@redhat.com>
Partially-Implements: bp tripleo-ui-undercloud-container
Change-Id: I1109d19e586958ac4225107108ff90187da30edd
We need to set a fact instead of registering values, and we shouldn't
update packages if we don't run any DB migrations.
Change-Id: I2e508b06064f66ae640ae0c8694dbe290ef42846
Check whether the pacemaker resource is defined, not whether it's running.
Ensure we try disabling pacemaker resources during FFU.
Change-Id: I9be9118490a28ee9c24d9c8c89a8daee75e5b817
Currently we are calling /usr/bin/gnocchi-upgrade
--sacks-number=SACK_NUM from each node where gnocchi-api is part of the
role. gnocchi-upgrade seems to be racy and we sometimes end up with the
following error:
2018-03-14 12:39:39,683 [1] ERROR oslo_db.sqlalchemy.exc_filters: DBAPIError exception wrapped from (pymysql.err.InternalError) (1050, u"Table 'archive_policy' already exists") [SQL: u'\nCREATE TABLE archive_policy (\n\tname VARCHAR(255) NOT NULL, \n\tback_window INTEGER NOT NULL, \n\tdefinition TEXT NOT NULL, \n\taggregation_methods TEXT NOT NULL, \n\tPRIMARY KEY (name)\n)ENGINE=InnoDB CHARSET=utf8\n\n']
Traceback (most recent call last):
File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1182, in _execute_context
context)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 470, in do_execute
cursor.execute(statement, parameters)
File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 166, in execute
result = self._query(query)
File "/usr/lib/python2.7/site-packages/pymysql/cursors.py", line 322, in _query
conn.query(q)
File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 856, in query
self._affected_rows = self._read_query_result(unbuffered=unbuffered)
File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1057, in _read_query_result
result.read()
File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1340, in read
first_packet = self.connection._read_packet()
File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 1014, in _read_packet
packet.check_error()
File "/usr/lib/python2.7/site-packages/pymysql/connections.py", line 393, in check_error
err.raise_mysql_exception(self._data)
File "/usr/lib/python2.7/site-packages/pymysql/err.py", line 107, in raise_mysql_exception
raise errorclass(errno, errval)
InternalError: (1050, u"Table 'archive_policy' already exists")
Let's run it only from the bootstrap node by wrapping it with the
bootstrap_host_exec magic.
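So the upgrade command ends up wrapped roughly like this; the step number and image parameter are illustrative, and SACK_NUM mirrors the placeholder above:
  docker_config:
    step_4:
      gnocchi_db_sync:
        image: {get_param: DockerGnocchiApiImage}   # assumed parameter name
        command: "/usr/bin/bootstrap_host_exec gnocchi_api su gnocchi -s /bin/bash -c '/usr/bin/gnocchi-upgrade --sacks-number=SACK_NUM'"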
Change-Id: I106512eeffff3425608a543f9bc5e6a9508d15e5
Closes-Bug: #1755564
We need to register a fact instead of rerunning checks, and we can't
hijack the glance-api service for the glance-registry removal. For the
removal of glance-registry we reintroduce the disabled service
to the Controller role.
Change-Id: I38ab5a91b541e7e070f188ee73ef4c7dd7f65eaa
This adds the relevant templates to enable novajoin in a containerized
undercloud environment. Note that this is not meant for the overcloud
(yet), since there are several limitations that need to be addressed
first; this is meant for the containerized undercloud.
Depends-On: Iea461f66b8f4e3b01a0498e566a2c3684144df80
Depends-On: Ia733b436d5ebd0710253c070ec47a655036e0751
Depends-On: I554125fd6b48e620370f9e3a6061bbdc1d55b0ae
Change-Id: I3aad8a90816e6fc443f20579f6ac7ad4f35eafcb
We should try to disable pacemaker resources and shut down services
in step 1, and make sure we check running services only once.
Change-Id: I5676132be477695838c59a0d59c62e09e335a8f0