..
   This work is licensed under a Creative Commons Attribution 3.0 Unported
   License. http://creativecommons.org/licenses/by/3.0/legalcode

===================
Kubernetes Upgrades
===================

Storyboard:
https://storyboard.openstack.org/#!/story/2006781

This story will provide a mechanism to upgrade the kubernetes components on
a running StarlingX system. This is required to allow bug fixes to be
delivered for kubernetes components and to allow upgrades between kubernetes
minor versions.


Problem description
===================

The kubernetes components used by StarlingX are delivered in two ways:

* RPMs (e.g. kubeadm, kubelet)
* docker images (e.g. kube-apiserver, kube-controller-manager)

StarlingX must provide a mechanism to allow bug fixes to be delivered for
both these components. In addition, the kubernetes version skew policy [1]_
mandates that kubernetes minor releases cannot be skipped. Since these minor
releases occur approximately every three months and the StarlingX release
cycle is approximately six months long, StarlingX must provide a mechanism to
support kubernetes minor version upgrades on a running system.

Kubernetes versions are in the format major.minor.patch (e.g. 1.16.2). Bug
fixes are delivered with a new patch version and releases are delivered with
a new minor version. In either case, since the StarlingX kubernetes
components are managed using kubeadm, a specific upgrade procedure must be
followed [2]_. The kubeadm tool provides commands that help perform these
upgrades, but the upgrade also involves other actions such as installing
new RPMs, pulling images and restarting processes. These steps must all
be performed in a specific sequence.

In order to provide a robust and simple kubernetes upgrade experience for
users of StarlingX, the entire process must be automated as much as possible
and controls must be in place to ensure the steps are followed in the right
order.

Use Cases
---------

* End User wants to upgrade to a new patch version of kubernetes on a running
  StarlingX system with minimal impact to running applications.
* End User wants to upgrade to a new minor version of kubernetes on a running
  StarlingX system with minimal impact to running applications.
* End User wants to downgrade to a previous patch version of kubernetes on a
  running system, because they experienced an issue with the new patch
  version. Note: downgrading between minor versions is not supported.


Proposed change
===============

StarlingX will only support specific kubernetes versions and upgrade paths.
For each supported kubernetes version we will track (via metadata in the
sysinv component):

* version (e.g. v1.16.1)
* upgrade_from (e.g. v1.15.3)

  Specifies which kubernetes versions can upgrade to this version.
* downgrade_to (e.g. none)

  Specifies which kubernetes versions this version can downgrade to.
* applied_patches (e.g. PATCH.10, KUBE_PATCH.1)

  These patches must be applied before the upgrade starts.
* available_patches (e.g. KUBE_PATCH.2)

  These patches must be available (but not applied) before the upgrade
  starts.
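
The metadata above can be sketched as a simple record with an upgrade-path
check. This is illustrative only: the class and method names (``KubeVersion``,
``can_upgrade_from``) are assumptions, not the actual sysinv implementation.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class KubeVersion:
    """Metadata tracked for one supported kubernetes version."""
    version: str
    upgrade_from: List[str] = field(default_factory=list)
    downgrade_to: List[str] = field(default_factory=list)
    applied_patches: List[str] = field(default_factory=list)
    available_patches: List[str] = field(default_factory=list)

    def can_upgrade_from(self, current: str) -> bool:
        # Only explicitly listed versions are valid upgrade sources.
        return current in self.upgrade_from

v1_17_1 = KubeVersion(
    version="v1.17.1",
    upgrade_from=["v1.16.4"],
    applied_patches=["PATCH.Y"],
    available_patches=["PATCH.Z"],
)

print(v1_17_1.can_upgrade_from("v1.16.4"))  # True
print(v1_17_1.can_upgrade_from("v1.15.3"))  # False
```

The explicit upgrade_from list is what prevents skipping a minor version, per
the version skew policy.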

The existing patching mechanism will be used to deliver metadata and software
to enable upgrades for specific kubernetes versions. The patches for a new
kubernetes version will be structured as follows (using an upgrade from
v1.16.4 to v1.17.1 as an example):

* PATCH.X

  * software patches for sysinv component
  * contains metadata for new kubernetes version - for example:

    * version=v1.17.1
    * upgrade_from=v1.16.4
    * applied_patches=PATCH.Y
    * available_patches=PATCH.Z

* PATCH.Y

  * patches kubeadm RPM to version 1.17.1
  * patch metadata:

    * requires: PATCH.X
    * pre-apply: running kube-apiservers are >= 1.16.4
    * pre-remove: running kube-apiservers are <= 1.16.4

* PATCH.Z

  * patches kubelet and kubectl RPMs to version 1.17.1
  * patch metadata:

    * requires: PATCH.Y
    * pre-apply: running kube-apiservers are >= 1.17.1
    * pre-remove: running kube-apiservers are <= 1.17.1

The following is a summary of the steps the user will take when performing
a kubernetes upgrade (using an upgrade from v1.16.4 to v1.17.1 as an
example). For each step, a summary of the actions the system will perform
is given.

1. **Upload/apply/install metadata patch**

   This is PATCH.X in the example above. The existing "sw-patch" CLIs will be
   used.

2. **List available kubernetes versions**

   ::

     # system kube-version-list
     +---------+--------+-----------+
     | Version | Target | State     |
     +---------+--------+-----------+
     | v1.16.4 | True   | active    |
     | v1.17.1 | False  | available |
     +---------+--------+-----------+

   This list comes from metadata in the sysinv component (updated by PATCH.X).
   The fields are:

   * Target: denotes the version currently selected for installation
   * States:

     * active: version is running everywhere
     * partial: version is running somewhere
     * available: version that can be upgraded to

   The state must be calculated at runtime by querying the kubernetes
   component versions running on each node.
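
The runtime state calculation can be sketched as follows, assuming the
per-node component versions have already been retrieved (the helper name is
illustrative, not the actual sysinv code):

```python
def version_state(version, node_versions):
    """Roll per-node kubernetes versions up into a state for 'version'."""
    matches = [v == version for v in node_versions]
    if all(matches):
        return "active"     # version is running everywhere
    if any(matches):
        return "partial"    # version is running somewhere
    return "available"      # version can be upgraded to

nodes = ["v1.16.4", "v1.16.4", "v1.16.4"]
print(version_state("v1.16.4", nodes))                   # active
print(version_state("v1.17.1", nodes))                   # available
print(version_state("v1.17.1", ["v1.17.1", "v1.16.4"]))  # partial
```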

3. **Upload/apply/install kubeadm patch**

   This is PATCH.Y in the example above. The existing "sw-patch" CLIs will be
   used. The patch pre-apply script verifies that all kube-apiserver
   versions are >= 1.16.4.

4. **Upload (but don't apply) kubelet/kubectl patch**

   This is PATCH.Z in the example above. The existing "sw-patch" CLIs will be
   used.

5. **Start kubernetes upgrade**

   ::

     # system kube-upgrade-start v1.17.1
     +-------------------+-------------------+
     | Property          | Value             |
     +-------------------+-------------------+
     | from_version      | v1.16.4           |
     | to_version        | v1.17.1           |
     | state             | upgrade-started   |
     +-------------------+-------------------+

   This will do semantic checks for applied/available patches, the upgrade
   path, application support for the new kubernetes version, etc.

   The states will include:

   * upgrade-started: semantic checks passed, upgrade started
   * upgrading-first-master: first master node control plane upgrade in
     progress
   * upgraded-first-master: first master node control plane upgrade complete
   * upgrading-networking: networking plugin upgrade in progress
   * upgraded-networking: networking plugin upgrade complete
   * upgrading-second-master: second master node control plane upgrade in
     progress
   * upgraded-second-master: second master node control plane upgrade complete
   * upgrading-kubelets: kubelet upgrades in progress
   * upgrade-complete: all nodes upgraded
   * upgrade-failed: upgrade has failed
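
On the happy path these states advance strictly in the order listed. A
minimal sketch of that ordering (the ``next_state`` helper is illustrative;
the real implementation also handles transitions into upgrade-failed):

```python
UPGRADE_STATES = [
    "upgrade-started",
    "upgrading-first-master",
    "upgraded-first-master",
    "upgrading-networking",
    "upgraded-networking",
    "upgrading-second-master",
    "upgraded-second-master",
    "upgrading-kubelets",
    "upgrade-complete",
]

def next_state(current):
    """Return the next happy-path state, or None once terminal."""
    if current in ("upgrade-complete", "upgrade-failed"):
        return None
    return UPGRADE_STATES[UPGRADE_STATES.index(current) + 1]

print(next_state("upgrade-started"))      # upgrading-first-master
print(next_state("upgraded-networking"))  # upgrading-second-master
```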

6. **Show kubernetes upgrade status for hosts**

   ::

     # system kube-host-upgrade-list
     +----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+
     | id | hostname     | personality | target_version | control_plane_version | kubelet_version | status            |
     +----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+
     | 1  | controller-0 | controller  | v1.16.4        | v1.16.4               | v1.16.4         |                   |
     | 2  | controller-1 | controller  | v1.16.4        | v1.16.4               | v1.16.4         |                   |
     | 3  | compute-0    | worker      | v1.16.4        | N/A                   | v1.16.4         |                   |
     | 4  | compute-1    | worker      | v1.16.4        | N/A                   | v1.16.4         |                   |
     +----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+

   * The control_plane_version must be calculated at runtime by examining the
     version of the image associated with each control plane pod on each
     controller.
   * The kubelet_version must be calculated at runtime based on the kubelet
     version running on each node (i.e. kubectl get nodes).
   * The status field will indicate the current action in progress (e.g.
     upgrading-control-plane).
   * The data will be retrieved by the sysinv-api, using the kubernetes REST
     API.
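
Extracting kubelet_version can be sketched from the node data the kubernetes
REST API returns (the ``status.nodeInfo.kubeletVersion`` field is what
"kubectl get nodes" reports; the helper itself is illustrative):

```python
def kubelet_versions(nodes):
    """Map hostname -> kubelet version from kubernetes node status data."""
    return {n["metadata"]["name"]: n["status"]["nodeInfo"]["kubeletVersion"]
            for n in nodes}

nodes = [
    {"metadata": {"name": "controller-0"},
     "status": {"nodeInfo": {"kubeletVersion": "v1.16.4"}}},
    {"metadata": {"name": "compute-0"},
     "status": {"nodeInfo": {"kubeletVersion": "v1.16.4"}}},
]
print(kubelet_versions(nodes))
# {'controller-0': 'v1.16.4', 'compute-0': 'v1.16.4'}
```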

7. **Upgrade control plane on first controller**

   ::

     # system kube-host-upgrade control-plane controller-0
     +-----------------------+-------------------+
     | Property              | Value             |
     +-----------------------+-------------------+
     | target_version        | v1.17.1           |
     | control_plane_version | v1.16.4           |
     | kubelet_version       | v1.16.4           |
     | status                |                   |
     +-----------------------+-------------------+

   * Either controller can be upgraded first.
   * The control plane upgrade involves the following steps:

     * upgrade kubernetes control plane components (must be done locally):

       * docker login <repo>
       * kubeadm config images pull --kubernetes-version <version> --image-repository=<repo>
       * docker logout <repo>
       * kubeadm upgrade apply <version>

     * update affinity for coredns pod (done through kubernetes API)

   * The local upgrade actions will be done by applying a runtime puppet
     manifest on the host.

8. **Show kubernetes upgrade status**

   ::

     # system kube-upgrade-show
     +-------------------+-------------------------+
     | Property          | Value                   |
     +-------------------+-------------------------+
     | from_version      | v1.16.4                 |
     | to_version        | v1.17.1                 |
     | state             | upgraded-first-master   |
     +-------------------+-------------------------+

9. **Upgrade networking**

   ::

     # system kube-upgrade-networking
     +-------------------+----------------------+
     | Property          | Value                |
     +-------------------+----------------------+
     | from_version      | v1.16.4              |
     | to_version        | v1.17.1              |
     | state             | upgrading-networking |
     +-------------------+----------------------+

   * The networking upgrade involves the following steps:

     * upgrade calico/multus/sriov if necessary (done through Ansible and the
       kubernetes API)

   * In the future, StarlingX may support different types of networking (e.g.
     tungsten). The upgrade networking step would perform the steps required
     for whatever networking was installed.

10. **Upgrade control plane on second controller**

    ::

      # system kube-host-upgrade control-plane controller-1
      +-----------------------+-------------------+
      | Property              | Value             |
      +-----------------------+-------------------+
      | target_version        | v1.17.1           |
      | control_plane_version | v1.16.4           |
      | kubelet_version       | v1.16.4           |
      | status                |                   |
      +-----------------------+-------------------+

    * The control plane upgrade involves the following steps:

      * upgrade kubernetes control plane components (must be done locally):

        * docker login <repo>
        * kubeadm config images pull --kubernetes-version <version> --image-repository=<repo>
        * docker logout <repo>
        * kubeadm upgrade node

11. **Show kubernetes upgrade status**

    ::

      # system kube-upgrade-show
      +-------------------+--------------------------+
      | Property          | Value                    |
      +-------------------+--------------------------+
      | from_version      | v1.16.4                  |
      | to_version        | v1.17.1                  |
      | state             | upgraded-second-master   |
      +-------------------+--------------------------+

12. **Show kubernetes upgrade status for hosts**

    ::

      # system kube-host-upgrade-list
      +----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+
      | id | hostname     | personality | target_version | control_plane_version | kubelet_version | status            |
      +----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+
      | 1  | controller-0 | controller  | v1.17.1        | v1.17.1               | v1.16.4         |                   |
      | 2  | controller-1 | controller  | v1.17.1        | v1.17.1               | v1.16.4         |                   |
      | 3  | compute-0    | worker      | v1.16.4        | N/A                   | v1.16.4         |                   |
      | 4  | compute-1    | worker      | v1.16.4        | N/A                   | v1.16.4         |                   |
      +----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+

13. **Apply/install kubelet/kubectl patch**

    This is PATCH.Z in the example above. The existing "sw-patch" CLIs will
    be used. This will place the v1.17.1 kubelet binary on each host, but
    will not restart kubelet.

14. **Upgrade kubelet on each controller**

    The first controller will first be locked using the existing "system
    host-lock" CLI (either controller can be done first). This results in
    services being migrated off the host and applies the NoExecute taint,
    which will evict any pods that can be evicted.

    The kubelet is then upgraded::

      # system kube-host-upgrade kubelet controller-<n>
      +-----------------------+-------------------+
      | Property              | Value             |
      +-----------------------+-------------------+
      | target_version        | v1.17.1           |
      | control_plane_version | v1.17.1           |
      | kubelet_version       | v1.16.4           |
      | status                |                   |
      +-----------------------+-------------------+

    * The kubelet upgrade involves the following steps:

      * restart kubelet (must be done locally)

    The controller is then unlocked using the existing "system host-unlock"
    CLI. The kubelet on the second controller is then upgraded in the same
    way.

15. **Show kubernetes upgrade status**

    ::

      # system kube-upgrade-show
      +-------------------+--------------------------+
      | Property          | Value                    |
      +-------------------+--------------------------+
      | from_version      | v1.16.4                  |
      | to_version        | v1.17.1                  |
      | state             | upgrading-kubelets       |
      +-------------------+--------------------------+

16. **Show kubernetes upgrade status for hosts**

    ::

      # system kube-host-upgrade-list
      +----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+
      | id | hostname     | personality | target_version | control_plane_version | kubelet_version | status            |
      +----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+
      | 1  | controller-0 | controller  | v1.17.1        | v1.17.1               | v1.17.1         |                   |
      | 2  | controller-1 | controller  | v1.17.1        | v1.17.1               | v1.17.1         |                   |
      | 3  | compute-0    | worker      | v1.16.4        | N/A                   | v1.16.4         |                   |
      | 4  | compute-1    | worker      | v1.16.4        | N/A                   | v1.16.4         |                   |
      +----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+

17. **Upgrade kubelet on all worker hosts**

    Each worker host will first be locked using the existing "system
    host-lock" CLI (worker hosts can be done in any order). This results in
    services being migrated off the host and applies the NoExecute taint,
    which will evict any pods that can be evicted.

    The kubelet is then upgraded::

      # system kube-host-upgrade kubelet worker-<n>
      +-----------------------+-------------------+
      | Property              | Value             |
      +-----------------------+-------------------+
      | target_version        | v1.17.1           |
      | control_plane_version | v1.17.1           |
      | kubelet_version       | v1.16.4           |
      | status                |                   |
      +-----------------------+-------------------+

    * The kubelet upgrade involves the following steps (must be done
      locally):

      * download new pause image if the version has changed
      * kubeadm upgrade node
      * restart kubelet

    The worker is then unlocked using the existing "system host-unlock" CLI.
    Multiple worker hosts can be upgraded at the same time, as long as there
    is sufficient capacity remaining on the other worker hosts.

18. **Show kubernetes upgrade status**

    ::

      # system kube-upgrade-show
      +-------------------+--------------------------+
      | Property          | Value                    |
      +-------------------+--------------------------+
      | from_version      | v1.16.4                  |
      | to_version        | v1.17.1                  |
      | state             | upgrade-complete         |
      +-------------------+--------------------------+

**Failure Handling**

* When a failure happens that cannot be resolved without manual intervention,
  the upgrade state will be set to upgrade-failed. The "kubeadm upgrade"
  commands will fall back to the previous configuration if (for example)
  image downloads fail.
* To recover, the user will resolve the issue that caused the failure and
  then re-attempt the upgrade (this will require a "system
  kube-upgrade-resume" command). Based on the kubernetes versions running on
  each host, the system will reset the upgrade state to the right point and
  the upgrade will resume.
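
Re-deriving the state from observed host versions can be sketched as follows.
This is an illustrative simplification (the helper name and the exact
precedence of checks are assumptions, not the actual sysinv algorithm):

```python
def resume_state(to_version, control_plane_versions, kubelet_versions):
    """Pick the state to resume from, based on what is already upgraded."""
    cp_done = [v == to_version for v in control_plane_versions]
    kubelets_done = [v == to_version for v in kubelet_versions]
    if kubelets_done and all(kubelets_done):
        return "upgrade-complete"
    if all(cp_done):
        return "upgrading-kubelets"
    if any(cp_done):
        return "upgraded-first-master"
    return "upgrade-started"

# Both control planes upgraded, kubelets still on the old version:
print(resume_state("v1.17.1",
                   ["v1.17.1", "v1.17.1"],
                   ["v1.16.4", "v1.16.4", "v1.16.4"]))
# upgrading-kubelets
```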

**Health Checks**

* In order to ensure the health and stability of the system, health checks
  will likely be done both before allowing a kubernetes upgrade to start and
  then as each upgrade CLI is run.
* The health checks will include:

  * basic system health (i.e. system health-query)
  * new kubernetes specific checks - for example:

    * verify that all kubernetes control plane pods are running
    * verify that all kubernetes applications are fully applied
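
The first kubernetes-specific check can be sketched against the pod data the
kubernetes REST API returns (kubeadm labels its static control plane pods
with ``tier: control-plane``; the helper name is illustrative):

```python
def control_plane_healthy(pods):
    """True if all control plane pods in the given listing are Running."""
    control_plane = [p for p in pods
                     if p["metadata"]["labels"].get("tier") == "control-plane"]
    return bool(control_plane) and all(
        p["status"]["phase"] == "Running" for p in control_plane)

pods = [
    {"metadata": {"labels": {"tier": "control-plane",
                             "component": "kube-apiserver"}},
     "status": {"phase": "Running"}},
    {"metadata": {"labels": {"tier": "control-plane",
                             "component": "kube-controller-manager"}},
     "status": {"phase": "Running"}},
]
print(control_plane_healthy(pods))  # True
```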

**Interactions with container applications**

* Before starting an upgrade, we also need to check that all installed
  applications are compatible with the new kubernetes version. Ideally this
  checking should be done by invoking a plugin provided by each application.
* When a kubernetes upgrade is in progress, we will prevent container
  application operations (e.g. system application-apply/remove/update). This
  will be done by introducing semantic checks in these APIs.
* When a kubernetes upgrade is in progress, we will prevent helm-override
  operations (e.g. system helm-override-update/delete). These operations can
  trigger the applications to be re-applied, which we wouldn't want to do
  during a kubernetes upgrade. This will be done by introducing semantic
  checks in these APIs.
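
The semantic check guard can be sketched as below. The exception type and
helper name are illustrative, not the actual sysinv code:

```python
class SemanticCheckError(Exception):
    pass

def check_no_kube_upgrade_in_progress(kube_upgrade_state):
    """Reject container application operations during a kubernetes upgrade.

    kube_upgrade_state is None when no upgrade exists.
    """
    if kube_upgrade_state not in (None, "upgrade-complete"):
        raise SemanticCheckError(
            "Operation rejected: kubernetes upgrade in progress "
            "(state: %s)" % kube_upgrade_state)

check_no_kube_upgrade_in_progress(None)  # no upgrade: allowed
try:
    check_no_kube_upgrade_in_progress("upgrading-kubelets")
except SemanticCheckError as e:
    print(e)
```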

Alternatives
------------

Given that StarlingX is using kubeadm to install and manage kubernetes, this
tool is the only reasonable choice for upgrading kubernetes.

The alternative to the approach described above would be to have the user
perform the kubernetes upgrades by running the docker and kubeadm commands
directly. This approach would be complex and error prone and would not be
acceptable to users of StarlingX.

Data model impact
-----------------

The following new tables in the sysinv DB will be required:

* kube_host_upgrade:

  * created/updated/deleted_at: as per other tables
  * id: as per other tables
  * uuid: as per other tables
  * forhostid: foreign key (i_host.id)
  * target_version: character (255)
  * status: character (255)

* kube_upgrade:

  * created/updated/deleted_at: as per other tables
  * id: as per other tables
  * uuid: as per other tables
  * from_version: character (255)
  * to_version: character (255)
  * state: character (255)
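
The two tables can be sketched as DDL, exercised here with the stdlib sqlite3
module for illustration (the real sysinv DB is not sqlite and adds the common
id/uuid/audit columns via its migration framework; this only mirrors the
column lists above):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE kube_upgrade (
    id INTEGER PRIMARY KEY,
    uuid VARCHAR(36),
    from_version VARCHAR(255),
    to_version VARCHAR(255),
    state VARCHAR(255)
);
CREATE TABLE kube_host_upgrade (
    id INTEGER PRIMARY KEY,
    uuid VARCHAR(36),
    forhostid INTEGER REFERENCES i_host(id),
    target_version VARCHAR(255),
    status VARCHAR(255)
);
""")
con.execute("INSERT INTO kube_upgrade (from_version, to_version, state) "
            "VALUES ('v1.16.4', 'v1.17.1', 'upgrade-started')")
row = con.execute("SELECT from_version, to_version, state "
                  "FROM kube_upgrade").fetchone()
print(row)  # ('v1.16.4', 'v1.17.1', 'upgrade-started')
```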

REST API impact
---------------

This impacts the sysinv REST API:

* The new resource /kube_versions is added.

  * URLS:

    * /v1/kube_versions

  * Request Methods:

    * GET /v1/kube_versions

      * Returns all kube_versions known to the system
      * Response body example::

          {"kube_versions": [{"state": "active",
                              "version": "v1.16.4",
                              "target": true}]}

    * GET /v1/kube_versions/{version}

      * Returns details of the specified kube_version
      * Response body example::

          {"target": true,
           "upgrade_from": ["v1.16.4"],
           "downgrade_to": [],
           "applied_patches": ["PATCH.Y"],
           "state": "active",
           "version": "v1.17.1",
           "available_patches": ["PATCH.Z"]}

* The new resource /kube_upgrade is added.

  * URLS:

    * /v1/kube_upgrade

  * Request Methods:

    * POST /v1/kube_upgrade

      * Creates (starts) a new kube_upgrade
      * Request body example::

          {"to_version": "v1.17.1"}

      * Response body example::

          {"from_version": "v1.16.4",
           "to_version": "v1.17.1",
           "state": "upgrade-started",
           "uuid": "223ba65e-45d1-4383-baa7-f03bb4c46773",
           "created_at": "2019-10-25T12:04:10.372399+00:00",
           "updated_at": "2019-10-25T12:04:10.372399+00:00"}

    * GET /v1/kube_upgrade

      * Returns the current kube_upgrade
      * Response body example::

          {"from_version": "v1.16.4",
           "to_version": "v1.17.1",
           "state": "upgrade-started",
           "uuid": "223ba65e-45d1-4383-baa7-f03bb4c46773",
           "created_at": "2019-10-25T12:04:10.372399+00:00",
           "updated_at": "2019-10-25T14:45:43.252964+00:00"}

    * PATCH /v1/kube_upgrade

      * Modifies the current kube_upgrade. Used to update the state of the
        upgrade (e.g. to upgrade networking).
      * Request body example::

          {"state": "upgrading-networking"}

      * Response body example::

          {"from_version": "v1.16.4",
           "to_version": "v1.17.1",
           "state": "upgrading-networking",
           "uuid": "223ba65e-45d1-4383-baa7-f03bb4c46773",
           "created_at": "2019-10-25T12:04:10.372399+00:00",
           "updated_at": "2019-10-25T14:45:43.252964+00:00"}

    * DELETE /v1/kube_upgrade

      * Deletes the current kube_upgrade (after it is completed)

* The existing resource /ihosts is modified to add new actions.

  * URLS:

    * /v1/ihosts/<hostid>

  * Request Methods:

    * POST /v1/ihosts/<hostid>/kube_upgrade_control_plane

      * Upgrades the control plane on the specified host
      * Response body example::

          {"id": "4",
           "hostname": "controller-1",
           "personality": "controller",
           "target_version": "v1.17.1",
           "control_plane_version": "v1.16.4",
           "kubelet_version": "v1.16.4",
           "status": ""}

    * POST /v1/ihosts/<hostid>/kube_upgrade_kubelet

      * Upgrades the kubelet on the specified host
      * Response body example::

          {"id": "4",
           "hostname": "controller-1",
           "personality": "controller",
           "target_version": "v1.17.1",
           "control_plane_version": "v1.17.1",
           "kubelet_version": "v1.16.4",
           "status": ""}

Security impact
---------------

This story provides a mechanism to upgrade kubernetes from one version
to another. It does not introduce any additional security impacts beyond
what already exists for the initial deployment of kubernetes.

Other end user impact
---------------------

End users will typically perform kubernetes upgrades using the sysinv (i.e.
system) CLI. The new CLI commands are shown in the `Proposed change`_ section
above.

Performance Impact
------------------

When a kubernetes upgrade is in progress, each host must be taken out of
service in order to upgrade the kubelet. This is necessary because running
containers can be adversely impacted by the restart of the kubelet. The
user must ensure that there is enough capacity in the system to handle
the removal from service of one (or more) hosts as the kubelet on each
host is upgraded.

Other deployer impact
---------------------

Deployers will now be able to upgrade kubernetes on a running system.

Developer impact
----------------

Developers working on the StarlingX components that manage container
applications may need to be aware that certain operations should be
prevented when a kubernetes upgrade is in progress. This is discussed in the
`Proposed change`_ section above.

Upgrade impact
--------------

Kubernetes upgrades are independent from the upgrade of the StarlingX
platform. However, when StarlingX platform upgrades are supported, checks
must be put in place to ensure that the kubernetes version is not allowed to
change due to a platform upgrade. In effect, the system must be upgraded to
the same version of kubernetes as is packaged in the new platform release,
to ensure this is the case. This will be enforced through semantic checking
in the platform upgrade APIs.


Implementation
==============

Assignee(s)
-----------

Primary assignee:

* Bart Wensley (bartwensley)

Other contributors:

* Al Bailey (albailey)
* Don Penney (dpenney)
* Kevin Smith (kevin.smith.wrs)

Repos Impacted
--------------

* config
* integ
* update

Work Items
----------

Sysinv:

* Define new metadata for kubernetes versions
* DB API for new tables
* kube-version-list/show CLI/API

  * basic infrastructure
  * calculate state for each known version

* kube-upgrade-start/show CLI/API

  * basic infrastructure
  * semantic checks for upgrade start

    * applied/available patches
    * installed applications support new kubernetes version
    * tiller/armada images support new kubernetes version

* kube-host-upgrade-list/show CLI/API

  * basic infrastructure
  * calculate versions for each host

* kube-host-upgrade control-plane CLI/API

  * basic infrastructure
  * semantic checks
  * conductor RPC/implementation (trigger puppet manifest apply, wait for
    completion, update coredns affinity, etc.)

* kube-upgrade-networking CLI/API

  * basic infrastructure
  * semantic checks
  * conductor RPC/implementation (trigger playbook apply, wait for
    completion, etc.)

* kube-host-upgrade kubelet CLI/API

  * basic infrastructure
  * semantic checks
  * conductor RPC/implementation (trigger puppet manifest apply, wait for
    completion, etc.)

* kube-upgrade-resume CLI/API

  * basic infrastructure
  * semantic checks
  * conductor RPC/implementation (determine what state the upgrade should be
    in, etc.)

* New KubeOperator functions, including:

  * retrieve versions of each control plane component
  * retrieve versions of each kubelet
  * utility to roll up versions into overall kubernetes version
  * update affinity (for coredns pod)

* Kubernetes specific health checks

  * add to existing health-query CLI
  * verify all control plane pods are running/healthy
  * verify that all applications are fully applied
  * figure out what else we should check

* Add semantic checks to existing APIs

  * application-apply/remove/etc. - prevent when kubernetes upgrade is in
    progress
  * helm-override-update/etc. - prevent when kubernetes upgrade is in
    progress

Ansible:

* enhance the upgrade networking playbook to support applying different
  manifests based on what kubernetes version is running

Puppet:

* kubernetes runtime manifest for control plane upgrade
* kubernetes runtime manifest for kubelet upgrade

Patching:

* Pre-apply/remove scripts to check the running kubernetes version


Dependencies
============

None


Testing
=======

Kubernetes upgrades must be tested in the following StarlingX configurations:

* AIO-SX
* AIO-DX
* Standard with controller storage
* Standard with dedicated storage
* Distributed cloud

The testing can be performed on hardware or in virtual environments.


Documentation Impact
====================

New end user documentation will be required to describe how kubernetes
upgrades should be done. The config API reference will also need updates.


References
==========

.. [1] https://kubernetes.io/docs/setup/release/version-skew-policy
.. [2] https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade


History
=======

.. list-table:: Revisions
   :header-rows: 1

   * - Release Name
     - Description
   * - stx-4.0
     - Introduced