This file is needed in order for people cloning the repo
to be able to initialize it for gerrit by the
"git review -s" command
Change-Id: If0c791896250519def25149dae3e077e689a054d
Story: 2006166
Task: 36530
Signed-off-by: Al Bailey <Al.Bailey@windriver.com>
The feature gate for sctp support in apiserver was added in
kubernetes 1.12 but is disabled by default. This commit enables it.
Information about SCTP is here:
https://kubernetes.io/docs/concepts/services-networking/service/#sctp
The centos version of netcat can be used to validate the feature.
A Dockerfile for building a centos netcat is provided.
Tested by:
kubectl run --generator=run-pod/v1 --image netcat:v1.0.0 \
listen-sctp -it --rm -- --sctp -l -p 9000
(get IP of the listener pod)
kubectl run --generator=run-pod/v1 --image netcat:v1.0.0 \
test-sctp -it --rm -- --sctp <listener pod IP> 9000
Change-Id: I9642e485cb9c30f6b1272c00ec1046b9c98211ac
Story: 2006472
Task: 36403
Signed-off-by: Al Bailey <Al.Bailey@windriver.com>
This commit adds support for pulling images from the alternative
authenticated registries configured at Ansible bootstrap, so that
k8s pods can be brought up at puppet time.
At bootstrap time, barbican secrets are created to store the
credentials for accessing the registries, and the alternative
registry info is stored in service parameters. At puppet time, the
barbican secret is retrieved to get the credentials needed to
pre-pull the k8s images that kubeadm requires to bring up static
pods (e.g. kube-controller-manager, kube-apiserver, kube-scheduler).
The images for dynamic pods (kube-multus, kube-sriov-cni, calico..) and
tiller do not need to be pre-pulled; imagePullSecrets is added to their
pod specs to pass the credentials to kubelet. This is done in Ansible
bootstrap: https://review.opendev.org/#/c/679136/
This commit also pulls the Armada image before creating the Armada
container if the image is not available in the docker cache.
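The pull-if-missing check can be sketched as below. This is a minimal
illustration that shells out to the docker CLI, which is an assumption;
the commit does not say how the cache is queried. The helper name
ensure_image and the injectable run parameter are hypothetical, and
registry authentication is left out.

```python
import subprocess


def ensure_image(image, run=subprocess.run):
    """Pull `image` only if it is missing from the local Docker cache.

    Hypothetical helper. `run` is injectable so the logic can be
    exercised without a Docker daemon. Returns True if a pull was done.
    """
    # `docker image inspect` exits non-zero when the image is not cached.
    inspect = run(
        ["docker", "image", "inspect", image],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    if inspect.returncode == 0:
        return False
    # Image not cached: pull it and raise if the pull fails.
    run(["docker", "pull", image], check=True)
    return True
```

Calling ensure_image() with the Armada image reference before creating
the container would skip the pull whenever the image is already cached.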
Tests (AIO-SX, AIO-DX, Standard):
- All system types installed successfully
- Verified all k8s/gcr/docker images are downloaded from
authenticated registry on controller-1 and worker nodes
- Verified images from authenticated registries are used
by k8s static/dynamic pods on controller-1 and worker nodes
- Swact to controller-1, lock/unlock controller-0. Verified
that tiller image is downloaded from authenticated registry
and tiller pod is created on controller-1
- Swact to controller-1, apply application. Verified that
Armada image is downloaded from authenticated registry and
Armada container is created.
Change-Id: Iaabef0f5d8a6a4640dcfde93a8c0449948f4a59f
Depends-On: https://review.opendev.org/679335
Story: 2006274
Task: 36379
Signed-off-by: Angie Wang <angie.wang@windriver.com>
Upgrading from kubernetes 1.13.5 to 1.15.0 meant the config
needed to be updated to handle whatever was deprecated or dropped
in 1.14 and 1.15.
1) Removed "ConfigMapAndSecretChangeDetectionStrategy = Watch"
reported by https://github.com/kubernetes/kubernetes/issues/74412
because this was a golang deficiency, and is fixed by the newer
version of golang.
2) Enforced the kubernetes 1.15.3 version
3) Updated v1alpha3 to v1beta2, since alpha3 was dropped in 1.14
changed fields for beta1 and beta2 are mentioned in these docs:
https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta1
https://godoc.org/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2
4) cgroup validation checking now includes the pids subfolder.
5) Updated ceph-config-helper to a version compatible with kubernetes v1.15.
This means that the stx-openstack version check needed to be increased.
Change-Id: Ibe3d5960c5dee1d217d01fbb56c785581dd1b42c
Story: 2005860
Task: 35841
Depends-On: https://review.opendev.org/#/c/671150
Signed-off-by: Al Bailey <Al.Bailey@windriver.com>
Kubernetes only supports a single huge page size per worker
node. Prior to kubernetes 1.15, the huge page feature could
be disabled via a feature gate. In kubernetes 1.15, the
feature gate has been removed so huge page support is always
on in k8s.
This update removes the conditional disabling of the hugepage
feature and enforces the provisioning of a single page size
per worker.
When the vswitch type is set to ovs-dpdk or avs, the application
huge page size follows the vswitch huge page size.
This update also changes the auto-provisioning of VM huge
pages to 1G as there is no auto-provisioning in a virtual
environment.
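The page size rule for ovs-dpdk and avs can be sketched as below. This
is a minimal illustration, not the sysinv implementation; the function
name and its parameters are hypothetical, and sizes are expressed in
MiB purely for the example.

```python
def app_hugepage_size_mib(vswitch_type, vswitch_page_mib, requested_page_mib):
    """Pick the single application huge page size for a worker.

    Hypothetical helper: with ovs-dpdk or avs the application pages
    follow the vswitch page size; otherwise the requested size is used.
    """
    if vswitch_type in ("ovs-dpdk", "avs"):
        return vswitch_page_mib
    return requested_page_mib
```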
Story: 2006295
Task: 36006
Change-Id: I84d4959b420584fdcdf8a8664a6f4855c08ec989
Signed-off-by: Tao Liu <tao.liu@windriver.com>
Rebasing Armada to use the latest docker image tag
8a1638098f88d92bf799ef4934abe569789b885e-ubuntu_bionic.
Change-Id: Ic48a2e053d0de7dacfd6a07d817947e11dc8d596
Story: 2006347
Task: 36105
Signed-off-by: Robert Church <robert.church@windriver.com>
The K8s service host in the Multus kubeconfig file is currently
not wrapped in [brackets] when an IPv6 cluster service endpoint
has been configured.
This causes issues for Multus when it attempts to reach (curl)
the address.
This fix ensures the IPv6 address is properly formatted for use
by Multus.
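The formatting rule can be sketched as below. This is a minimal
illustration, not the code changed by this commit; the helper name
format_service_host is hypothetical.

```python
import ipaddress


def format_service_host(host: str) -> str:
    """Return `host` in a form that is safe to embed in a URL.

    Hypothetical helper: IPv6 literals are wrapped in brackets, while
    IPv4 addresses and hostnames are returned unchanged.
    """
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not a bare IP literal (e.g. a DNS name or an already
        # bracketed address), so leave it alone.
        return host
    return f"[{host}]" if addr.version == 6 else host


# Example: build an API server URL from a service host and port.
print(f"https://{format_service_host('fd04::1')}:443")
```

Without the brackets, the colons of the IPv6 address are ambiguous with
the port separator in the URL.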
Closes-Bug: 1836972
Change-Id: I803edfb86a70d232d6015a7bb130da0756a56458
Signed-off-by: Steven Webster <steven.webster@windriver.com>
Restart the Docker process after changing proxy settings through service
parameters. There is currently a potential issue where Docker is
started before the proxy setting changes made through service parameters
are applied. This means that on lock/unlock, Docker restarts with the old
proxy settings. This commit fixes that issue.
Closes-Bug: 1838651
Change-Id: I57e527998fdf50c4be38c32ea8d1ee95bc46d3ff
Signed-off-by: Jerry Sun <jerry.sun@windriver.com>
The existing "platform" filesystem is now resizable and added to
the ControllerFS API. The "glance" filesystem is merged into
"platform" and therefore removed from the ControllerFS API. The
"--force" flag is removed from the controllerfs-modify API as
it was only used for glance fs resizing.
The folder /opt/cgcs is removed and the "helm_charts" and "keystone"
folders now reside under /opt/platform.
ls /opt/platform/
armada config helm nfv puppet sysinv
ls /opt/cgcs/
helm_charts keystone
Resources related to drbd-cgcs and /opt/cgcs are removed from puppet
or updated to use drbd-platform and /opt/platform.
SM is no longer monitoring resources related to drbd-cgcs.
Tested in AIO-SX, AIO-DX and Standard hardware labs.
Partial-Bug: 1830142
Change-Id: I0a80c95a057e9d6d2acec5f33cc4da31cd20955e
Signed-off-by: Kristine Bujold <kristine.bujold@windriver.com>
radosgw is now an optional platform service which is provisioned via a
system service parameter. To align with this optionality, the ceph-rgw
chart which is used to enable the containerized swift endpoints also
becomes optional.
Changes include:
- Update the stx-openstack application disabled_charts setting in the
application metadata.yaml to include the ceph-rgw chart. This sets the
initial chart state to disabled.
- Optimize ceph.pp puppet manifests to provide two runtime classes: one
for setting up the platform radosgw configuration, which will set the
haproxy configuration, and the other for updating the keystone
information in the ceph configuration based on whether the ceph-rgw
chart is enabled.
- Update the sm.pp manifest to dynamically provision/deprovision the
radosgw based on whether it is enabled in the service parameters.
- Rename the SWIFT service parameters to RADOSGW as this is the platform
service being enabled.
- Restructure ceph.py/ceph.pp to generate and use hieradata such that
_revert_cephrgw_config() and _update_cephrgw_config() can be combined
into a single function for runtime updates.
Change-Id: Id8d5c6b1159881d44810fc3622990456f1e54e75
Depends-On: If284f622ceac48c4ffd74e7022fdd390971d0fd8
Partial-Bug: #1833738
Signed-off-by: Robert Church <robert.church@windriver.com>
Use the experimental-cluster-signing-duration parameter to set the
kubelet certificate to expire after 1 month. Kubelet certificate
rotation is enabled by default.
Closes-Bug: 1834685
Change-Id: Ie5b91a86c1a1b536e51719dad99be0cc89d65722
Signed-off-by: David Sullivan <david.sullivan@windriver.com>
On compute nodes with openstack-compute label, the
kvm_timer_advance_setup.service should be enabled.
The puppet service runs before kubelet.
Change-Id: I84d6c6234d4bd1c8c0c52f5735d7520377b2fe80
Partial-Bug: 1823751
Depends-On: https://review.opendev.org/#/c/672124
Signed-off-by: Al Bailey <Al.Bailey@windriver.com>
This commit modifies the kubeadm template to support OpenID connect
params.
Depends-On: https://review.opendev.org/671259
Story: 2006235
Task: 35836
Change-Id: I38f736aa68f9c0031ed697cdf17cd28ed08cadf6
Signed-off-by: Jerry Sun <jerry.sun@windriver.com>
This commit is to support platform restore for AIO-DX and Standard
no-storage configuration using restore_platform playbook:
- For AIO-DX, the restored ceph crushmap is loaded through puppet
when controller-0 is unlocked for the first time. OSDs are
created on controller nodes during controller unlock.
- For Standard no-storage configuration, the restored ceph crushmap
is loaded through sysinv when ceph quorum is formed. OSDs are
created on controller nodes by applying ceph osd runtime manifests.
- The .restore_in_progress flag file is removed as part of first
unlock of controller-0.
Change-Id: I65bfc67cf90e894d125eb6c860139b26d17b562e
Story: 2004761
Task: 35965
Signed-off-by: Wei Zhou <wei.zhou@windriver.com>
Restart collectd after configuring cpu to ensure collectd loads
the updated configuration.
Closes-Bug: 1837424
Change-Id: I10e0f431dfd01637f38319d506559aa3927f11ff
Signed-off-by: Bin Qian <bin.qian@windriver.com>
This reverts commit a5c236dc522c050b036e638955c03074a2963996.
It was thought that setting the TCP timeouts for the cluster
network was enough to address the issues with the helm commands
hanging after a controller swact. This is not the case. In
particular, swacting away from the controller with the
tiller-deploy pod seems to cause the TCP connection from that pod to
the kube-apiserver to hang. Putting the tiller-deploy pod back on
the host network "fixes" the issue.
Change-Id: I8f37530e1f615afcffcf6cb1d629518436c99cb9
Related-Bug: 1817941
Partial-Bug: 1837055
Signed-off-by: Bart Wensley <barton.wensley@windriver.com>
This commit adds floating IP support when an ironic network is
created and an interface is assigned to that network. Ironic nodes
use the ironic floating IP to access openstack services. It is an
HA feature for ironic when 2 controllers are deployed.
Story: 2004760
Task: 34740
Depends-On: https://review.opendev.org/669781
Change-Id: I55681abfee700dcf7036503d1490accc413b84c4
Signed-off-by: Mingyuan Qi <mingyuan.qi@intel.com>
We need the ability to update the Kubernetes ApiServer RootCA at
ansible-bootstrap-time. This includes the ability to specify the
apiServerCertSANs so that the user can add DNS:<FQDN> and/or IP
records to the auto-generated apiServerCertificate.
This adds support for storing the apiServerCertSANs in the sysinv
database and modifies the puppet manifest to support user-supplied SAN
records.
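The distinction between DNS and IP SAN records can be sketched as
below. This is a minimal illustration, not the sysinv or puppet code;
the helper name split_cert_sans is hypothetical, and treating every
non-IP entry as a DNS name is an assumption made for the example.

```python
import ipaddress


def split_cert_sans(sans):
    """Split user-supplied SAN entries into (dns_names, ip_addresses).

    Hypothetical helper: anything that parses as an IP literal is
    treated as an IP record, everything else as a DNS name.
    """
    dns_names, ip_addresses = [], []
    for entry in sans:
        try:
            ipaddress.ip_address(entry)
        except ValueError:
            dns_names.append(entry)
        else:
            ip_addresses.append(entry)
    return dns_names, ip_addresses
```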
Partial-Bug: 1837079
Change-Id: I4d23828b31ced55d55b1c6932d0cfd6b59727288
Signed-off-by: David Sullivan <david.sullivan@windriver.com>
The barbican-api process currently writes directly
to its logfile. As such, the logrotate config file
needs a copytruncate directive to ensure the process
doesn't end up writing to the rotated file instead.
Change-Id: I60c8a08ce612fd7f82e05f69b168919b12ab0017
Partial-Bug: 1836632
Signed-off-by: Don Penney <don.penney@windriver.com>
This commit is to support platform restore for AIO-SX using
restore_platform playbook:
1. During AIO-SX restore, the restored ceph crushmap is loaded through
puppet.
2. Bypass vim when unlocking controller-0 for the first time.
3. When unlocking controller-0 for the first time, app_reapply is
skipped for stx-openstack application.
4. After controller-0 is unlocked, ceph backend task is set to None.
Change-Id: I36d27b162334e5a2f0371793243f2301b5fec1eb
Story: 2004761
Task: 33645
Signed-off-by: Wei Zhou <wei.zhou@windriver.com>
Add a new filesystem called "kubelet" to all hosts with a default
size of 10G. This new fs will be managed by the host_fs API.
Also made the scratch filesystem resizable on all hosts.
Tested with install of hardware Standard and AIO-DX labs. Also
tested install of a vbox AIO-SX lab.
Partial-Bug: 1830142
Depends-On: https://review.opendev.org/671120
Change-Id: I968f84b8ba7a069ec3d7027d4eb4a7355a06d9d3
Signed-off-by: Kristine Bujold <kristine.bujold@windriver.com>
This update reimplements the affine-tasks init script and service to
dynamically reaffine tasks and k8s-infra cgroup cpuset on AIO nodes.
This accommodates CPU intensive phases of work. Tasks are initially
allowed to float across all cores. Once the system is at steady-state,
this will ensure that K8S pods are constrained to platform cores and
do not run on cores with VMs/containers.
This will speed up the first stx-application apply, as well as pod
recovery after lock/unlock, reboot, and controller swact.
This script waits forever for sufficient platform readiness criteria
(e.g., system critical pods are recovered, critical openstack pods
are running, nova-compute pod is running) before reaffining back
to platform cores.
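The wait-forever behaviour described above can be sketched as below.
This is a minimal illustration, not the affine-tasks script itself; the
function name, the poll interval and the shape of the readiness checks
are all hypothetical.

```python
import time


def wait_until_ready(checks, poll_seconds=10.0, sleep=time.sleep):
    """Block until every readiness check returns True.

    Hypothetical helper. There is deliberately no timeout, mirroring
    the "waits forever" behaviour. `sleep` is injectable for testing.
    Returns the number of polls that were needed.
    """
    polls = 1
    while not all(check() for check in checks):
        sleep(poll_seconds)
        polls += 1
    return polls
```

Each entry in checks would be a callable for one criterion, and the
reaffine step would run only after wait_until_ready() returns.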
This corrects the pod affinity problem seen on AIO introduced by the fix
for bug 1826592, commit e513baad44181f667085886007632d0ebf79eeb0,
i.e., that fix allowed the AIO to not time out, but left pods floating.
Change-Id: Ic257378eac451904a200a0f2e79f7bc4f8373009
Partial-Bug: 1832781
Signed-off-by: Jim Gauld <james.gauld@windriver.com>
This reverts commit 4802f1d96a1217124e39a057fd7a05e22177b81c.
The change made is no longer necessary due to commit 9a4b6b6a.
The playbookconfig code was moved to the ansible-playbooks repo
and will be removed there.
Conflicts:
playbookconfig/centos/build_srpm.data
playbookconfig/playbookconfig/playbooks/bootstrap/roles/bringup-essential-services/tasks/bringup_helm.yml
puppet-manifests/centos/build_srpm.data
puppet-manifests/src/modules/platform/manifests/helm.pp
Change-Id: I20a38c1ad882bebb6e1208f43d6582bc399e9e87
Related-Bug: 1817941
Signed-off-by: Bart Wensley <barton.wensley@windriver.com>
Barbican returns "503 Service Unavailable" during the bootstrap
phase of StarlingX. This happens because the Keystone auth token
lacks domain details for Barbican. The fix is to explicitly specify
project_domain_name and user_domain_name in the Barbican config.
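The resulting settings can be sketched as below. This is a minimal
illustration that renders an INI fragment; placing the options in the
[keystone_authtoken] section and using "Default" for both domains are
assumptions, as the commit message states neither.

```python
import configparser
import io

# Assumption: the options live in [keystone_authtoken] and both
# domains are named "Default"; neither is stated in this commit.
config = configparser.ConfigParser()
config["keystone_authtoken"] = {
    "project_domain_name": "Default",
    "user_domain_name": "Default",
}

# Render the fragment as it would appear in an INI-style config file.
buffer = io.StringIO()
config.write(buffer)
rendered = buffer.getvalue()
print(rendered)
```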
Change-Id: I4bf6b275c1eb271b62a2e7a1bc72c049f193afc4
Closes-bug: 1834670
Signed-off-by: Alex Kozyrev <alex.kozyrev@windriver.com>