Relocating the helm charts required some modifications to
the spec file and relocation of the makefile.
Story: 2006166
Task: 35687
Depends-On: I5c34bf66a3631e86e22684412e01c02980e9ae30
Change-Id: If27d138708c580df168797a3878e349fde2c6d19
Signed-off-by: Scott Little <scott.little@windriver.com>
Upgrading from kubernetes 1.13.5 to 1.15.0 meant the config
needed to be updated to handle settings that were deprecated or
dropped in 1.14 and 1.15.
1) Removed "ConfigMapAndSecretChangeDetectionStrategy = Watch",
which was reported by https://github.com/kubernetes/kubernetes/issues/74412.
This was a workaround for a golang deficiency and is fixed by the
newer version of golang.
2) Enforced the kubernetes 1.15.3 version
3) Updated v1alpha3 to v1beta2, since alpha3 was dropped in 1.14.
The changed fields for beta1 and beta2 are described in these docs:
https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta1
https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2
4) cgroup validation checking now includes the pids subfolder.
5) Updated ceph-config-helper to a v1.15 kubernetes compatible version.
This means that the stx-openstack version check needed to be increased.
Change-Id: Ibe3d5960c5dee1d217d01fbb56c785581dd1b42c
Story: 2005860
Task: 35841
Depends-On: https://review.opendev.org/#/c/671150
Signed-off-by: Al Bailey <Al.Bailey@windriver.com>
To reduce cpu usage on platform cores (especially on AIO), reduce the
frequency of the rabbitmq readiness and liveness probes from every 10s
to 30s. These probes both run the command "rabbitmqctl status" which
seems to have significant cpu impact.
For reference, the platform rabbitmq process status check runs every
20s.
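As a rough illustration only (the value paths are assumptions, not
taken from this change), relaxing the probe periods via a rabbitmq
helm override could look something like:

  # Hypothetical sketch: raise readiness/liveness probe periods
  # from 10s to 30s. The 'pod.probes' paths are assumed and may not
  # match the actual chart values.
  rabbitmq_overrides = {
      'pod': {
          'probes': {
              'readiness': {'periodSeconds': 30},
              'liveness': {'periodSeconds': 30},
          }
      }
  }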
Partial-Bug: 1837426
Depends-On: https://review.opendev.org/#/c/677041
Change-Id: Ie8eea35b9ed268f4156d1cdc884a6d5004e87018
Signed-off-by: Gerry Kopec <gerry.kopec@windriver.com>
Affinity weigher is required to support soft-anti-affinity and
soft-affinity server group policies in nova. Set to a relatively high
multiplier of 20 to ensure that this criterion dominates the host
selection.
Adjust other weigher multipliers accordingly:
io_ops: remove the override to let it use the default value of -1.
The old -5 setting was related to a discontinued stx-nova patch in
the previous stx release.
cpu & build_failure: disable, similar to ram, disk & pci.
Also enable shuffle_best_same_weighed_hosts to randomize host selection
where weights are equal across multiple hosts.
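A minimal sketch of the resulting [filter_scheduler] settings,
expressed as a conf override dict (the option names are standard nova
options; the override layout is an assumption for illustration):

  nova_scheduler_overrides = {
      'conf': {
          'nova': {
              'filter_scheduler': {
                  # high multiplier so (anti-)affinity dominates host selection
                  'soft_affinity_weight_multiplier': 20.0,
                  'soft_anti_affinity_weight_multiplier': 20.0,
                  # io_ops override removed -> nova default of -1.0 applies
                  # disable these weighers, as already done for ram/disk/pci
                  'cpu_weight_multiplier': 0.0,
                  'build_failure_weight_multiplier': 0.0,
                  # randomize selection among equally weighted hosts
                  'shuffle_best_same_weighed_hosts': True,
              }
          }
      }
  }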
Change-Id: I28f92a7c703d1b78d5cab93418359ce164e61066
Closes-Bug: 1834255
Signed-off-by: Gerry Kopec <gerry.kopec@windriver.com>
radosgw is now an optional platform service which is provisioned via a
system service parameter. To align with this optionality, the ceph-rgw
chart which is used to enable the containerized swift endpoints also
becomes optional.
Changes include:
- Update the stx-openstack application disabled_charts setting in the
application metadata.yaml to include the ceph-rgw chart. This sets the
initial chart state to disabled.
- Optimize ceph.pp puppet manifests to provide two runtime classes: one
for setting up the platform radosgw configuration which will set the
haproxy configuration and the other for updating the keystone
information in the ceph configuration based on whether the ceph-rgw
chart is enabled.
- Update the sm.pp manifest to dynamically provision/deprovision the
radosgw based on whether it is enabled in the service parameters.
- Rename the SWIFT service parameters to RADOSGW as this is the platform
service being enabled.
- Restructure ceph.py/ceph.pp to generate and use hieradata such that
_revert_cephrgw_config() and _update_cephrgw_config() can be combined
into a single function for runtime updates.
Change-Id: Id8d5c6b1159881d44810fc3622990456f1e54e75
Depends-On: If284f622ceac48c4ffd74e7022fdd390971d0fd8
Partial-Bug: #1833738
Signed-off-by: Robert Church <robert.church@windriver.com>
Extend the helm_charts API to support an enable attribute. This
attribute is set on application upload and stored in the existing
system_overrides element of the helm_overrides table.
Changes include:
- Add application metadata support for disabling charts on application
upload.
- Add the system helm-chart-attribute-modify command to allow enabling
and disabling charts from the command-line. This removes the current
implementation of adding a faux label via the system host-label-assign
command to enable and disable charts.
- Add a --long option to helm-override-list to enable easy viewing of
what charts are enabled for a given application
- Enhance the ArmadaManifestOperator to make this a base class for
application specific operator classes. Introduce classes for the
stx-openstack and platform-integ-apps manifests with specific
knowledge of the charts and chart groups within each class.
- Use stevedore to load the application specific manifest operators
(see the sketch after this list). This will allow future packaging of
manifest operators with new application tarballs.
- Move the helm chart definition from the common/constants.py to
helm/common.py. This limits helm/armada specific data leakage outside
of the helm directory, which we may carve out of sysinv in the future.
- Clean up the code related to the faux labels: LABEL_IRONIC,
LABEL_BARBICAN, and LABEL_TELEMETRY
- Rework the manifest update code in the plugins to include checks for
whether the chart for a given application has been disabled.
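A minimal sketch of the stevedore-based loading mentioned above (the
entry-point namespace and helper names are assumptions, not the actual
sysinv values):

  from stevedore import extension

  def load_manifest_operators():
      """Discover application-specific ArmadaManifestOperator classes
      registered under a hypothetical entry-point namespace."""
      mgr = extension.ExtensionManager(
          namespace='systemconfig.armada.manifest_ops',  # assumed namespace
          invoke_on_load=True)
      return {ext.name: ext.obj for ext in mgr}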
Change-Id: If284f622ceac48c4ffd74e7022fdd390971d0fd8
Closes-Bug: #1833746
Depends-On: I418f0fe4978946a44e512c3025817fb27216c078
Signed-off-by: Robert Church <robert.church@windriver.com>
Enable service cleaner cronjob in nova helm chart. This will run hourly
and delete any nova services that are no longer up (e.g. conductor,
scheduler & consoleauth). These will be left over after controller
lock/unlock or application-update.
Change-Id: I001bf79b497eb1924b4252612c5ead6e992e8196
Closes-Bug: 1835565
Signed-off-by: Gerry Kopec <gerry.kopec@windriver.com>
The nova-api-proxy is currently running in a single pod on one of
the controllers. To improve recovery time when a controller
fails, the nova-api-proxy pod will now be run with replicas
set to two and anti-affinity configured so there is a pod on
each controller.
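Sketched as a helm override (the value paths follow the usual
openstack-helm pod conventions but are assumptions here):

  # Hypothetical override: two nova-api-proxy replicas with required
  # pod anti-affinity so one pod lands on each controller.
  api_proxy_overrides = {
      'pod': {
          'replicas': {'proxy': 2},
          'affinity': {
              'anti': {
                  'type': {
                      'default': 'requiredDuringSchedulingIgnoredDuringExecution'
                  }
              }
          },
      }
  }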
Closes-bug: 1833730
Change-Id: Iacd17251b86050e337d9a0f832b9dfa6e9864fce
Signed-off-by: Bart Wensley <barton.wensley@windriver.com>
In stx.1.0, stx-neutron hard-coded the rpc_response_max_timeout value
to 60 seconds. With the migration to containers and upstream neutron,
the default is now 600 seconds. To align with the previous starlingx
behavior, rpc_response_max_timeout is set to 60 seconds by the system
through a neutron helm override.
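The override is conceptually equivalent to the following conf override
sketch (the surrounding plugin wiring is omitted):

  neutron_overrides = {
      'conf': {
          'neutron': {
              'DEFAULT': {
                  # restore the stx.1.0 behavior instead of the 600s default
                  'rpc_response_max_timeout': 60,
              }
          }
      }
  }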
Change-Id: Ibf0f591ac9cb05dac09add37b3c31f6f5b66446d
Closes-Bug: #1836413
Signed-off-by: marvin <weifei.yu@intel.com>
The issue occurs when doing a swact followed by a lock/unlock of the
standby controller. After the swact and lock/unlock, the pod created
for the job no longer exists, so there is no pod for ceph_rgw.
Armada assumes by default that at least one pod exists and will wait
for that pod to be up and ready. For a chart without a pod, we need
to explicitly declare the resources to be waited for in the armada
schema. This declaration overrides the default wait list in armada.
Also change the timeout value to 300s, which should be enough for the
3 jobs to be ready.
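Expressed as a Python structure mirroring the armada chart schema
(field names assumed from the wait semantics described above, not
copied from the manifest), the explicit wait declaration looks
roughly like:

  # Hypothetical sketch: only job resources are waited on, overriding
  # armada's default pod wait for this chart.
  ceph_rgw_wait = {
      'wait': {
          'timeout': 300,
          'resources': [
              {'type': 'job'},
          ],
      }
  }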
Closes-Bug: 1833609
Change-Id: I5339406cf914cd54f45b3de5df7ff213e8845bfc
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
The Panko API is not actually working because the default api
configuration does not work with Stein Panko. This commit updates the
Armada manifest to override it.
Change-Id: Ic75a92bd95ab6a7300f9ac7c2562d3ff07a2ef6a
Closes-Bug: 1836636
Signed-off-by: Angie Wang <angie.wang@windriver.com>
This update adds support for an application version_check plugin,
which is called during the system application-upload command as
a validation step. This verifies that the application being loaded
is supported by the current application plugin code.
Change-Id: I9b854ff5d74065812cde90a6531e1be21fc73adb
Closes-Bug: 1833425
Signed-off-by: Don Penney <don.penney@windriver.com>
If the wait label is not set, armada will try to wait for all jobs
and pods in the same namespace, which may cause unintended
consequences. For the nginx chart, only a GlobalNetworkPolicy is set,
so set the wait resources list to empty.
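For contrast with the ceph-rgw case above, the nginx chart's wait
section is simply emptied (again a sketch of the schema, not the
literal manifest):

  nginx_ports_control_wait = {
      'wait': {
          # nothing to wait for; the chart only creates a GlobalNetworkPolicy
          'resources': [],
      }
  }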
Closes-Bug: 1836303
Change-Id: Ic08816b09518d09af2dad6b78a806feeeb3e9ac5
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
long_rpc_timeout: timeout for long running live migration tasks
Setting to 400s. This is reduced from the nova default of 1800s to
give a more timely response and is based on an old custom patch to
stx-nova where we allowed pre_live_migration to take 300s (if block
migration) + 6s * number of vifs.
libvirt/cpu_mode: set to host-model so guest can closely match host
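As a conf override sketch (both options are standard nova settings;
the override layout is an assumption):

  nova_conf_overrides = {
      'conf': {
          'nova': {
              'DEFAULT': {
                  # reduced from the 1800s nova default for a more timely response
                  'long_rpc_timeout': 400,
              },
              'libvirt': {
                  # let the guest CPU closely match the host CPU
                  'cpu_mode': 'host-model',
              },
          }
      }
  }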
Story: 2003909
Task: 30059
Change-Id: Iab26aa7143697e21678eecfb3f3e161a6e0a786c
Signed-off-by: Gerry Kopec <Gerry.Kopec@windriver.com>
Several cases might trigger an stx-openstack re-apply, for example a
lock and unlock of the standby controller. During the re-apply
process, the nginx-ports-control helm chart has to be removed first,
otherwise the re-apply process will be blocked because a previously
applied GNP (GlobalNetworkPolicy) from the nginx-ports-control chart
still exists.
Closes-Bug: 1834070
Change-Id: I10805f052914a5157edc9b53699a94a2c7fd7953
Signed-off-by: yhu6 <yong.hu@intel.com>
Magnum is no longer packaged on bare metal.
The sysinv and upgrades code related to magnum has been removed.
The helm configuration for magnum remains, although it is not currently
supported in containers either. The magnum-ui is not installed in
platform or containerized horizon so the code to enable it is removed.
Some upgrade code remains because that utility is in the process of
being re-written.
Story: 2004764
Task: 34333
Change-Id: I56873b4e04aac2e7d0cd57909beea00ecc2c1b9a
Signed-off-by: Al Bailey <Al.Bailey@windriver.com>
The dependent patch added the nova service token capability to the
nova helm chart, but it is disabled by default. Enable it via an
override.
Story: 2003909
Task: 34734
Depends-On: https://review.opendev.org/#/c/667583
Change-Id: I82260b8f0fe196308844990e9523e85ed065bafd
Signed-off-by: Gerry Kopec <gerry.kopec@windriver.com>
Currently sysinv generates overrides only for the default and backup
Ceph pools. Adding a new Ceph storage backend does not make it
available to the stx-openstack application.
Iterate over all Ceph storage backends and create corresponding
Ceph pool overrides, which are then used by the Cinder Helm
chart to set up the Cinder configuration and pool access.
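A hedged sketch of the idea (the dbapi accessor, pool names and
override layout are hypothetical, not the actual sysinv code):

  def generate_cinder_ceph_overrides(dbapi):
      """Build one Cinder rbd backend entry per configured Ceph backend."""
      backends = {}
      for backend in dbapi.get_ceph_backends():      # hypothetical accessor
          name = backend.name                        # e.g. 'ceph-store'
          backends[name] = {
              'volume_driver': 'cinder.volume.drivers.rbd.RBDDriver',
              'rbd_pool': 'cinder-volumes-%s' % name,  # pool name illustrative
              'rbd_user': 'cinder',
          }
      return {'conf': {'backends': backends}}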
Change-Id: I2ca84406238e6c7462709822b303e25176fb9c8a
Depends-On: I29c7d3ed118f4a6726f2ea887a165f256bc32fd5
Story: 2003909
Task: 30351
Signed-off-by: Daniel Badea <daniel.badea@windriver.com>
This change is required to enable the FM panels in the
containerized horizon. The fmclient dependency used by the
back-end already includes the reference to "v1", so "v1"
should not be part of the endpoint.
Story: 2004008
Task: 34183
Change-Id: Ibe92d942b2847ef6d9271595fcc9ac2faad2fb13
Co-authored-by: Sun Austin <austin.sun@intel.com>
Signed-off-by: Mario Alfredo Carrillo Arevalo <mario.alfredo.c.arevalo@intel.com>
This chart is added as a part of the "stx-openstack" application,
in the same chart group as the openstack-ingress chart, so that
when "nginx-ingress-controller" starts working, http and https
ports are allowed for nginx, which accepts http/https requests
and forwards them to internal services accordingly.
In LP#1827246, the http request to open the console of a VM
instance is sent to nginx on port 80 first, and then nginx forwards
the request to "nova-novncproxy" at port 6080 internally.
Closes-Bug: 1827246
Change-Id: I183f7edc92f1a9e0bdedad0afe35e3d03e20e7d5
Signed-off-by: yhu6 <yong.hu@intel.com>
This change allows the placement helm chart to be deployed with the
armada system and removes the placement deployment from within nova.
The tests below pass on both AIO and multi-node setups:
1) Openstack application apply and reapply
2) VM creation and deletion
3) Active controller switch, followed by VM creation
Story: 2005750
Task: 33418
Depends-On: https://review.opendev.org/662371/
Change-Id: I32dc127dcbc0319e3a20703ed66c9e8119fabcba
Signed-off-by: zhipengl <zhipengs.liu@intel.com>
This reverts commit 248d3b3921024bc4cb6ad5234b34a9c2edf11599.
Change-Id: I06931b7f5ae047ec93eaa0ee553e35577ceb433b
Signed-off-by: Stefan Dinescu <stefan.dinescu@windriver.com>
To properly enable Cinder volume backup, the following configuration
changes are required:
- For Cinder, enable 'CephBackupDriver' as the Cinder backup_driver and
'cinder' as the rbd_user for each Cinder backend
- For libvirt, enable Ceph and use 'cinder-volume-rbd-keyring' for the
Ceph client user secret. This will create a libvirt secret that will
be used with the 'cinder' user.
- For nova, enable the rbd_secret_uuid shared with libvirt and set the
'rbd_user' to cinder.
- Update the chart group initialization sequence, so that
'openstack-cinder' is initialized prior to 'openstack-compute-kit'.
This is done because 'cinder-volume-rbd-keyring' is created by Cinder
and is required by libvirt to successfully initialize.
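Taken together, the overrides sketch out roughly as follows (the
driver path, backend name and value paths are assumptions for
illustration; the real overrides are generated by the sysinv helm
plugins):

  overrides = {
      'cinder': {
          'conf': {
              'cinder': {
                  'DEFAULT': {
                      'backup_driver':
                          'cinder.backup.drivers.ceph.CephBackupDriver',
                  },
              },
              'backends': {
                  'ceph-store': {'rbd_user': 'cinder'},  # backend name illustrative
              },
          }
      },
      'nova': {
          'conf': {
              'nova': {
                  'libvirt': {
                      'rbd_user': 'cinder',
                      # uuid of the libvirt secret created from
                      # 'cinder-volume-rbd-keyring'
                      'rbd_secret_uuid': '<shared-libvirt-secret-uuid>',
                  }
              }
          }
      },
  }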
With these configuration changes:
- Cinder volumes were created
- Cinder volumes were backed up
- Instances were booted by volume (from Cinder)
- Instances were booted by image (from Ceph ephemeral disks)
Change-Id: I29c7d3ed118f4a6726f2ea887a165f256bc32fd5
Depends-On: https://review.opendev.org/#/c/664619/
Story: 2004520
Task: 28266
Signed-off-by: Robert Church <robert.church@windriver.com>
Add a helm chart for configuring and starting openstack
client pods. The pod is configured with admin credentials
and launched on a controller node.
Change-Id: I4dea49301fd778db9a9ddf900a752831bd455fda
Signed-off-by: Stefan Dinescu <stefan.dinescu@windriver.com>
Story: 2005312
Task: 30557
Add the ironic chart to the stx-openstack manifest. Ironic services
are enabled when the label openstack-ironic=enabled is set. The
nova-compute-ironic service within the nova chart refers to this
label as well. Nova-compute-ironic is configured to use the ironic
driver for creating/scheduling instances on ironic nodes through the
nova service/CLI.
Ironic chart group enablement will be added by ironic meta overrides.
Story: 2004760
Task: 28869
Depends-On: https://review.opendev.org/#/c/653914/
Change-Id: I5728586c69689e32afc948009c2b8c9e2bff84e0
Signed-off-by: Mingyuan Qi <mingyuan.qi@intel.com>
http://fm.openstack.svc.cluster.local:80 cannot be accessed because
the ingress service is missing from the helm chart.
Closes-Bug: 1832155
Change-Id: I61ea514d3092e1e3fedcd8ca8001a178d65282a3
Signed-off-by: Sun Austin <austin.sun@intel.com>
A new command "system application-update" is introduced in this
commit to support updating an applied application to a new version
with a new versioned app tarfile.
The application update leverages the existing application upload
workflow to first validate/upload the new app tarfile, then invokes
Armada apply or rollback to deploy the charts for the new application
version. If the version has been applied before, Armada rollback will
be performed; otherwise, Armada apply will be performed.
After the apply/rollback to the new version is done, the files for
the old application version are cleaned up, along with the releases
that are not in the new application version. Once the update
completes successfully, the status is set to "applied" so that the
user can continue applying the app with user overrides.
If there is any failure during the update, application recovery is
triggered to restore the app to the old version. If application
recovery fails, the application status is set to "apply-failed" so
that the user can re-apply the app.
In order to use Armada rollback, a new sysinv table "kube_app_releases"
is created to record deployed helm release versions. After each app
apply, if any helm release version changed, the corresponding release
needs to be updated in the sysinv db as well.
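Conceptually, the bookkeeping after each apply looks like the
following sketch (function and field names are hypothetical, not the
actual sysinv code):

  def record_release_versions(dbapi, app, deployed_releases):
      """Persist the helm release version deployed for each chart so a
      later update can decide between Armada apply and rollback."""
      for release_name, version in deployed_releases.items():
          existing = dbapi.kube_app_chart_release_get(app.id, release_name)
          if existing is None:
              dbapi.kube_app_chart_release_create(app.id, release_name, version)
          elif existing.version != version:
              dbapi.kube_app_chart_release_update(app.id, release_name,
                                                  {'version': version})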
The application overrides were changed to be tied to a specific
application in commit https://review.opendev.org/#/c/660498/.
Therefore, the user overrides are preserved when updating.
Note: On AIO-SX, Armada apply is always used even if the version was
applied before. This is because of an issue with leveraging rollback
on AIO-SX (replicas is 1): Armada/helm rollback --wait does not wait
for pods to be ready before it returns.
Related helm issues:
https://github.com/helm/helm/issues/4210
https://github.com/helm/helm/issues/2006
Tests conducted (AIO-SX, DX, Standard):
- functional tests (both stx-openstack and simple custom app)
- upload stx-openstack-1.0-13-centos-stable-latest tarfile
which uses latest docker images
- apply stx-openstack
- update to stx-openstack-1.0-13-centos-stable-versioned
which uses versioned docker images
- update back to stx-openstack-1.0-13-centos-stable-latest
- update to a version that has less/more charts compared to
the old version
- remove stx-openstack
- delete stx-openstack
- failure tests
- application-update rejected
(app not found, update to the same version,
operation not permitted, etc.)
- application-update failures that trigger recovery
- upload failure
ie. invalid tarfile, manifest file validation failed ...
- apply/rollback failure
ie. download images failure, Armada apply/rollback fails
Change-Id: I4e094427e673639e2bdafd8c476b897b7b4327a3
Story: 2005350
Task: 33568
Signed-off-by: Angie Wang <angie.wang@windriver.com>
The version of the existing OVS docker image is 2.8.1. StarlingX
builds its own OVS docker image with the latest version, 2.11.0.
This patch overrides the old docker image.
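As an illustrative image-tag override (the tag keys follow the usual
openstack-helm images.tags convention and the image reference is a
placeholder; both are assumptions, not taken from this change):

  neutron_image_overrides = {
      'images': {
          'tags': {
              # point the OVS containers at the StarlingX-built 2.11.0 image
              'openvswitch_db_server': 'docker.io/starlingx/stx-ovs:2.11.0',
              'openvswitch_vswitchd': 'docker.io/starlingx/stx-ovs:2.11.0',
          }
      }
  }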
Change-Id: Iec56dc89cdb7a02f9b1beed459eab230c06707ec
Story: #2004649
Task: #30281
Depends-On: https://review.opendev.org/#/c/662195
Co-Authored-By: Cheng Li<cheng1.li@intel.com>
Signed-off-by: Chenjie Xu <chenjie.xu@intel.com>
In order to get swift working on containerized openstack, changes
were needed on both the platform and application sides.
On the platform side, settings in the ceph.conf file were replaced.
A runtime manifest was added to update ceph.conf after a successful
application apply:
1. Keystone auth url was updated with keystone openstack url
2. 'rgw_keystone_admin_domain' and 'rgw_keystone_project' settings
were updated with 'service'.
On the application side, the following changes have been implemented:
1. Ceph-rgw chart from openstack-helm-infra repo was included
in stx-openstack
2. A chart schema for ceph-rgw was added
3. An override file was generated
Signed-off-by: Elena Taivan <elena.taivan@windriver.com>
Story: 2003909
Task: 30606
Change-Id: I01f7cf412264394f4f9bfb31f3c5a5ebd73f49dc
1) Add '---' in the Deployment yaml to support multiple resource types.
2) Change the Deployment apiVersion to 'apps/v1'.
3) Set the serviceaccount name to 'fm', the same as the db-init and
other jobs.
4) Add ks_service and ks_user job dependencies.
Change-Id: I3b15da621dd5a5cc1f20e9e963abbeba54827592
Closes-Bug: 1831163
Signed-off-by: Sun Austin <austin.sun@intel.com>
Override the nginx "worker-processes" setting in the mariadb ingress
controller. The default value is changed from auto to 4 to reduce
memory consumption by nginx worker processes. Four workers give two
per platform CPU (in AIO), to avoid blocking all users in case some
of the workers are blocked.
The static override is done in the Armada manifest.
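The static override sketches out roughly as follows (the value path
is an assumption based on the description above, not the literal
manifest content):

  mariadb_ingress_overrides = {
      'conf': {
          'ingress': {
              # 4 workers instead of 'auto' to cap nginx memory use
              # while keeping 2 workers per platform CPU on AIO
              'worker-processes': '4',
          }
      }
  }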
Closes-Bug: #1823803
Depends-On: https://review.opendev.org/#/c/659464/
Change-Id: If0e6d2b2ac45dedbd9e67b4f866702d9de1db15c
Signed-off-by: Yi Wang <yi.c.wang@intel.com>
Murano is no longer installed and running on bare metal.
- Removed the system parameters related to murano.
- Removed the upgrade code for murano databases.
- Removed the murano certificate installation code from CLI
- Removed the murano puppet code
- Removed murano keystone user special handling
- Removed armada/helm code to support enabling murano in horizon
- Cleaned up comments in the code referencing murano.
Story: 2004764
Task: 30667
Change-Id: I4d9f82414043a8cad22220556181b5454572d42d
Signed-off-by: Al Bailey <Al.Bailey@windriver.com>