Kubernetes Upgrades
This story will provide a mechanism to upgrade the kubernetes components on a running StarlingX system. This is required to allow bug fixes to be delivered for kubernetes components and to allow upgrades between kubernetes minor versions.

Change-Id: I4ec52bbfb651989fbcff89941e71e91c242c76a5
Signed-off-by: Bart Wensley <barton.wensley@windriver.com>
Parent: ec346d96fb
Commit: 51770771cc
|
@ -1,8 +0,0 @@
|
|||
.. placeholder:
|
||||
|
||||
===========
|
||||
Placeholder
|
||||
===========
|
||||
|
||||
This file is a placeholder and should be deleted when the first spec is moved
|
||||
to this directory.
|
|
@ -0,0 +1,838 @@
|
|||
..
|
||||
This work is licensed under a Creative Commons Attribution 3.0 Unported
|
||||
License. http://creativecommons.org/licenses/by/3.0/legalcode
|
||||
|
||||
|
||||
===================
|
||||
Kubernetes Upgrades
|
||||
===================
|
||||
|
||||
Storyboard:
|
||||
https://storyboard.openstack.org/#!/story/2006781
|
||||
|
||||
This story will provide a mechanism to upgrade the kubernetes components on
|
||||
a running StarlingX system. This is required to allow bug fixes to be
|
||||
delivered for kubernetes components and to allow upgrades between kubernetes
|
||||
minor versions.
|
||||
|
||||
|
||||
Problem description
|
||||
===================
|
||||
|
||||
The kubernetes components used by StarlingX are delivered in two ways:
|
||||
|
||||
* RPMs (e.g. kubeadm, kubelet)
|
||||
* docker images (e.g. kube-apiserver, kube-controller-manager)
|
||||
|
||||
StarlingX must provide a mechanism to allow bug fixes to be delivered for
|
||||
both types of components. In addition, the kubernetes version skew policy [1]_
|
||||
mandates that kubernetes minor releases cannot be skipped. Since these minor
|
||||
releases occur approximately every three months and the StarlingX release
|
||||
cycle is approximately six months long, StarlingX must provide a mechanism to
|
||||
support kubernetes minor version upgrades on a running system.
|
||||
|
||||
Kubernetes versions are in the format major.minor.patch (e.g. 1.16.2). Bug
|
||||
fixes are delivered with a new patch version and releases are delivered with
|
||||
a new minor version. In either case, since the StarlingX kubernetes
|
||||
components are managed using kubeadm, a specific upgrade procedure must be
|
||||
followed [2]_. The kubeadm tool provides commands that help perform these
|
||||
upgrades, but the upgrade also involves other actions such as installing
|
||||
new RPMs, pulling images, restarting processes, etc... These steps must all
|
||||
be performed in a specific sequence.
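
The versioning rules above can be sketched in a few lines of Python. This is only an illustration: the names ``KubeVersion``, ``parse_version`` and ``classify_upgrade`` are hypothetical and not part of sysinv, and treating a change of major version as unsupported is an assumption made for the sketch.

```python
from typing import NamedTuple


class KubeVersion(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> KubeVersion:
    """Parse 'v1.16.2' or '1.16.2' into its numeric parts."""
    major, minor, patch = text.lstrip("v").split(".")
    return KubeVersion(int(major), int(minor), int(patch))


def classify_upgrade(current: str, target: str) -> str:
    """Return 'patch', 'minor' or 'unsupported' for a proposed upgrade."""
    cur, tgt = parse_version(current), parse_version(target)
    # Assumption: major version changes and non-increasing versions are
    # rejected here (downgrades are handled separately).
    if tgt.major != cur.major or tgt <= cur:
        return "unsupported"
    if tgt.minor == cur.minor:
        return "patch"
    # The version skew policy does not allow skipping a minor release.
    return "minor" if tgt.minor == cur.minor + 1 else "unsupported"
```

With this sketch, ``classify_upgrade("v1.16.4", "v1.17.1")`` is a minor upgrade, while going from ``v1.15.3`` directly to ``v1.17.1`` is rejected because it skips 1.16.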
|
||||
|
||||
In order to provide a robust and simple kubernetes upgrade experience for
|
||||
users of StarlingX, the entire process must be automated as much as possible
|
||||
and controls must be in place to ensure the steps are followed in the right
|
||||
order.
|
||||
|
||||
Use Cases
|
||||
---------
|
||||
|
||||
* End User wants to upgrade to a new patch version of kubernetes on a running
|
||||
StarlingX system with minimal impact to running applications.
|
||||
* End User wants to upgrade to a new minor version of kubernetes on a running
|
||||
StarlingX system with minimal impact to running applications.
|
||||
* End User wants to downgrade to a previous patch version of kubernetes on a
|
||||
running system, because they experienced an issue with the new patch version.
|
||||
Note: downgrading between minor versions is not supported.
|
||||
|
||||
|
||||
Proposed change
|
||||
===============
|
||||
|
||||
StarlingX will only support specific kubernetes versions and upgrade paths.
|
||||
For each supported kubernetes version we will track (via metadata in the
|
||||
sysinv component):
|
||||
|
||||
* version (e.g. v1.16.1)
|
||||
* upgrade_from (e.g. v1.15.3)
|
||||
|
||||
Specifies which kubernetes versions can upgrade to this version.
|
||||
* downgrade_to (e.g. none)
|
||||
|
||||
Specifies which kubernetes versions this version can downgrade to.
|
||||
* applied_patches (e.g. PATCH.10, KUBE_PATCH.1)
|
||||
|
||||
These patches must be applied before the upgrade starts.
|
||||
* available_patches (e.g. KUBE_PATCH.2)
|
||||
|
||||
These patches must be available (but not applied) before the upgrade starts.
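
As an illustration of how this metadata could drive the checks performed before an upgrade starts, here is a minimal Python sketch. The class ``KubeVersionMetadata`` and the function ``check_upgrade`` are hypothetical names, and the way the patch lists are passed in (as sets of patch IDs) is an assumption; the real sysinv representation may differ.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class KubeVersionMetadata:
    version: str
    upgrade_from: List[str] = field(default_factory=list)
    downgrade_to: List[str] = field(default_factory=list)
    applied_patches: List[str] = field(default_factory=list)
    available_patches: List[str] = field(default_factory=list)


def check_upgrade(meta, running_version, applied, available):
    """Return the reasons the upgrade cannot start (empty list if it can)."""
    problems = []
    if running_version not in meta.upgrade_from:
        problems.append(
            f"no upgrade path from {running_version} to {meta.version}")
    for patch in meta.applied_patches:
        if patch not in applied:
            problems.append(f"{patch} must be applied")
    for patch in meta.available_patches:
        # "Available" means uploaded but not yet applied.
        if patch in applied:
            problems.append(f"{patch} must not be applied yet")
        elif patch not in available:
            problems.append(f"{patch} must be available")
    return problems
```

Returning a list of problems rather than a boolean lets the CLI report every unmet precondition in one pass.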
|
||||
|
||||
The existing patching mechanism will be used to deliver metadata and software
|
||||
to enable upgrades for specific kubernetes versions. The patches for a new
|
||||
kubernetes version will be structured as follows (using an upgrade from
|
||||
v1.16.4 to v1.17.1 as an example):
|
||||
|
||||
* PATCH.X
|
||||
|
||||
* software patches for sysinv component
|
||||
* contains metadata for new kubernetes version - for example:
|
||||
|
||||
* version=v1.17.1
|
||||
* upgrade_from=v1.16.4
|
||||
* applied_patches=PATCH.Y
|
||||
* available_patches=PATCH.Z
|
||||
|
||||
* PATCH.Y
|
||||
|
||||
* patches kubeadm RPM to version 1.17.1
|
||||
* patch metadata:
|
||||
|
||||
* requires: PATCH.X
|
||||
* pre-apply: running kube-apiservers are >= 1.16.4
|
||||
* pre-remove: running kube-apiservers are <= 1.16.4
|
||||
|
||||
* PATCH.Z
|
||||
|
||||
* patches kubelet and kubectl RPMs to version 1.17.1
|
||||
* patch metadata:
|
||||
|
||||
* requires: PATCH.Y
|
||||
* pre-apply: running kube-apiservers are >= 1.17.1
|
||||
* pre-remove: running kube-apiservers are <= 1.17.1
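
The pre-apply and pre-remove conditions above amount to a version comparison against every running kube-apiserver. A minimal sketch, assuming the running versions have already been collected into a list of strings (how they are collected is not shown, and the function names are hypothetical):

```python
def _as_tuple(version):
    """Convert 'v1.16.4' or '1.16.4' to (1, 16, 4) for numeric comparison."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def pre_apply_ok(running_apiservers, minimum):
    """True if at least one version was found and all are >= `minimum`."""
    versions = [_as_tuple(v) for v in running_apiservers]
    return bool(versions) and all(v >= _as_tuple(minimum) for v in versions)


def pre_remove_ok(running_apiservers, maximum):
    """True if at least one version was found and all are <= `maximum`."""
    versions = [_as_tuple(v) for v in running_apiservers]
    return bool(versions) and all(v <= _as_tuple(maximum) for v in versions)
```

For PATCH.Z in the example, ``pre_apply_ok(["v1.17.1", "v1.16.4"], "1.17.1")`` is false, so the kubelet patch cannot be applied until both control planes have been upgraded. Comparing numeric tuples instead of strings avoids mis-ordering versions such as 1.9 and 1.10.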
|
||||
|
||||
The following is a summary of the steps the user will take when performing
|
||||
a kubernetes upgrade (using an upgrade from v1.16.4 to v1.17.1 as an
|
||||
example). For each step, a summary of the actions the system will perform
|
||||
is given.
|
||||
|
||||
1. **Upload/apply/install metadata patch**
|
||||
|
||||
This is PATCH.X in the example above. The existing "sw-patch" CLIs will be
|
||||
used.
|
||||
|
||||
2. **List available kubernetes versions**
|
||||
|
||||
::
|
||||
|
||||
# system kube-version-list
|
||||
+---------+--------+-----------+
|
||||
| Version | Target | State |
|
||||
+---------+--------+-----------+
|
||||
| v1.16.4 | True | active |
|
||||
| v1.17.1 | False | available |
|
||||
+---------+--------+-----------+
|
||||
|
||||
This list comes from metadata in the sysinv component (updated by PATCH.X).
|
||||
The fields are:
|
||||
|
||||
* Target: denotes version currently selected for installation
|
||||
* State:
|
||||
|
||||
* active: version is running everywhere
|
||||
* partial: version is running somewhere
|
||||
* available: version that can be upgraded to
|
||||
|
||||
The state must be calculated at runtime by querying the kubernetes
|
||||
component versions running on each node.
|
||||
|
||||
3. **Upload/apply/install kubeadm patch**
|
||||
|
||||
This is PATCH.Y in the example above. The existing "sw-patch" CLIs will be
|
||||
used. The patch pre-apply script verifies that all kube-apiserver
|
||||
versions are >= 1.16.4.
|
||||
|
||||
4. **Upload (but don't apply) kubelet/kubectl patch**
|
||||
|
||||
This is PATCH.Z in the example above. The existing "sw-patch" CLIs will be
|
||||
used.
|
||||
|
||||
5. **Start kubernetes upgrade**
|
||||
|
||||
::
|
||||
|
||||
# system kube-upgrade-start v1.17.1
|
||||
+-------------------+-------------------+
|
||||
| Property | Value |
|
||||
+-------------------+-------------------+
|
||||
| from_version | v1.16.4 |
|
||||
| to_version | v1.17.1 |
|
||||
| state | upgrade-started |
|
||||
+-------------------+-------------------+
|
||||
|
||||
This will do semantic checks for applied/available patches, upgrade path,
|
||||
application support for the new kubernetes version, etc...
|
||||
|
||||
The states will include:
|
||||
|
||||
* upgrade-started: semantic checks passed, upgrade started
|
||||
* upgrading-first-master: first master node control plane upgrade in progress
|
||||
* upgraded-first-master: first master node control plane upgrade complete
|
||||
* upgrading-networking: networking plugin upgrade in progress
|
||||
* upgraded-networking: networking plugin upgrade complete
|
||||
* upgrading-second-master: second master node control plane upgrade in
|
||||
progress
|
||||
* upgraded-second-master: second master node control plane upgrade complete
|
||||
* upgrading-kubelets: kubelet upgrades in progress
|
||||
* upgrade-complete: all nodes upgraded
|
||||
* upgrade-failed: upgrade has failed
|
||||
|
||||
6. **Show kubernetes upgrade status for hosts**
|
||||
|
||||
::
|
||||
|
||||
# system kube-host-upgrade-list
|
||||
+----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+
|
||||
| id | hostname | personality | target_version | control_plane_version | kubelet_version | status |
|
||||
+----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+
|
||||
| 1 | controller-0 | controller | v1.16.4 | v1.16.4 | v1.16.4 | |
|
||||
| 2 | controller-1 | controller | v1.16.4 | v1.16.4 | v1.16.4 | |
|
||||
| 3 | compute-0 | worker | v1.16.4 | N/A | v1.16.4 | |
|
||||
| 4 | compute-1 | worker | v1.16.4 | N/A | v1.16.4 | |
|
||||
+----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+
|
||||
|
||||
* The control_plane_version must be calculated at runtime by examining the
|
||||
version of the image associated with each control plane pod on each
|
||||
controller.
|
||||
* The kubelet_version must be calculated at runtime based on the kubelet
|
||||
version running on each node (i.e. kubectl get nodes).
|
||||
* The status field will indicate the current action in progress (e.g.
|
||||
upgrading-control-plane)
|
||||
* The data will be retrieved by the sysinv-api, using the kubernetes REST
|
||||
API.
|
||||
|
||||
7. **Upgrade control plane on first controller**
|
||||
|
||||
::
|
||||
|
||||
# system kube-host-upgrade control-plane controller-0
|
||||
+-----------------------+-------------------+
|
||||
| Property | Value |
|
||||
+-----------------------+-------------------+
|
||||
| target_version | v1.17.1 |
|
||||
| control_plane_version | v1.16.4 |
|
||||
| kubelet_version | v1.16.4 |
|
||||
| status | |
|
||||
+-----------------------+-------------------+
|
||||
|
||||
* Either controller can be upgraded first
|
||||
* The control plane upgrade involves the following steps:
|
||||
|
||||
* upgrade kubernetes control plane components (must be done locally):
|
||||
|
||||
* docker login <repo>
|
||||
* kubeadm config images pull --kubernetes-version <version> --image-repository=<repo>
|
||||
* docker logout <repo>
|
||||
* kubeadm upgrade apply <version>
|
||||
|
||||
* update affinity for coredns pod (done through kubernetes API)
|
||||
|
||||
* The local upgrade actions will be done by applying a runtime puppet
|
||||
manifest on the host.
|
||||
|
||||
8. **Show kubernetes upgrade status**
|
||||
|
||||
::
|
||||
|
||||
# system kube-upgrade-show
|
||||
+-------------------+-------------------------+
|
||||
| Property | Value |
|
||||
+-------------------+-------------------------+
|
||||
| from_version | v1.16.4 |
|
||||
| to_version | v1.17.1 |
|
||||
| state | upgraded-first-master |
|
||||
+-------------------+-------------------------+
|
||||
|
||||
9. **Upgrade networking**
|
||||
|
||||
::
|
||||
|
||||
# system kube-upgrade-networking
|
||||
+-------------------+----------------------+
|
||||
| Property | Value |
|
||||
+-------------------+----------------------+
|
||||
| from_version | v1.16.4 |
|
||||
| to_version | v1.17.1 |
|
||||
| state | upgrading-networking |
|
||||
+-------------------+----------------------+
|
||||
|
||||
* The networking upgrade involves the following steps:
|
||||
|
||||
* upgrade calico/multus/sriov if necessary (done through Ansible and
|
||||
kubernetes API)
|
||||
|
||||
* In the future, StarlingX may support different types of networking (e.g.
|
||||
tungsten). The upgrade networking step would perform the steps required
|
||||
for whatever networking was installed.
|
||||
|
||||
10. **Upgrade control plane on second controller**
|
||||
|
||||
::
|
||||
|
||||
# system kube-host-upgrade control-plane controller-1
|
||||
+-----------------------+-------------------+
|
||||
| Property | Value |
|
||||
+-----------------------+-------------------+
|
||||
| target_version | v1.17.1 |
|
||||
| control_plane_version | v1.16.4 |
|
||||
| kubelet_version | v1.16.4 |
|
||||
| status | |
|
||||
+-----------------------+-------------------+
|
||||
|
||||
* The control plane upgrade involves the following steps:
|
||||
|
||||
* upgrade kubernetes control plane components (must be done locally):
|
||||
|
||||
* docker login <repo>
|
||||
* kubeadm config images pull --kubernetes-version <version> --image-repository=<repo>
|
||||
* docker logout <repo>
|
||||
* kubeadm upgrade node
|
||||
|
||||
11. **Show kubernetes upgrade status**
|
||||
|
||||
::
|
||||
|
||||
# system kube-upgrade-show
|
||||
+-------------------+--------------------------+
|
||||
| Property | Value |
|
||||
+-------------------+--------------------------+
|
||||
| from_version | v1.16.4 |
|
||||
| to_version | v1.17.1 |
|
||||
| state | upgraded-second-master |
|
||||
+-------------------+--------------------------+
|
||||
|
||||
12. **Show kubernetes upgrade status for hosts**
|
||||
|
||||
::
|
||||
|
||||
# system kube-host-upgrade-list
|
||||
+----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+
|
||||
| id | hostname | personality | target_version | control_plane_version | kubelet_version | status |
|
||||
+----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+
|
||||
| 1 | controller-0 | controller | v1.17.1 | v1.17.1 | v1.16.4 | |
|
||||
| 2 | controller-1 | controller | v1.17.1 | v1.17.1 | v1.16.4 | |
|
||||
| 3 | compute-0 | worker | v1.16.4 | N/A | v1.16.4 | |
|
||||
| 4 | compute-1 | worker | v1.16.4 | N/A | v1.16.4 | |
|
||||
+----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+
|
||||
|
||||
13. **Apply/install kubelet/kubectl patch**
|
||||
|
||||
This is PATCH.Z in the example above. The existing "sw-patch" CLIs will be
|
||||
used. This will place the v1.17.1 kubelet binary on each host, but will
|
||||
not restart kubelet.
|
||||
|
||||
14. **Upgrade kubelet on each controller**
|
||||
|
||||
The first controller will first be locked using the existing "system
|
||||
host-lock" CLI (either controller can be done first). This results in
|
||||
services being migrated off the host and applies the NoExecute taint, which
|
||||
will evict any pods that can be evicted.
|
||||
|
||||
The kubelet is then upgraded::
|
||||
|
||||
# system kube-host-upgrade kubelet controller-<n>
|
||||
+-----------------------+-------------------+
|
||||
| Property | Value |
|
||||
+-----------------------+-------------------+
|
||||
| target_version | v1.17.1 |
|
||||
| control_plane_version | v1.17.1 |
|
||||
| kubelet_version | v1.16.4 |
|
||||
| status | |
|
||||
+-----------------------+-------------------+
|
||||
|
||||
* The kubelet upgrade involves the following steps:
|
||||
|
||||
* restart kubelet (must be done locally)
|
||||
|
||||
The controller is then unlocked using the existing "system host-unlock" CLI.
|
||||
The kubelet on the second controller is then upgraded in the same way.
|
||||
|
||||
15. **Show kubernetes upgrade status**
|
||||
|
||||
::
|
||||
|
||||
# system kube-upgrade-show
|
||||
+-------------------+--------------------------+
|
||||
| Property | Value |
|
||||
+-------------------+--------------------------+
|
||||
| from_version | v1.16.4 |
|
||||
| to_version | v1.17.1 |
|
||||
| state | upgrading-kubelets |
|
||||
+-------------------+--------------------------+
|
||||
|
||||
16. **Show kubernetes upgrade status for hosts**
|
||||
|
||||
::
|
||||
|
||||
# system kube-host-upgrade-list
|
||||
+----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+
|
||||
| id | hostname | personality | target_version | control_plane_version | kubelet_version | status |
|
||||
+----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+
|
||||
| 1 | controller-0 | controller | v1.17.1 | v1.17.1 | v1.17.1 | |
|
||||
| 2 | controller-1 | controller | v1.17.1 | v1.17.1 | v1.17.1 | |
|
||||
| 3 | compute-0 | worker | v1.16.4 | N/A | v1.16.4 | |
|
||||
| 4 | compute-1 | worker | v1.16.4 | N/A | v1.16.4 | |
|
||||
+----+--------------+-------------+----------------+-----------------------+-----------------+-------------------+
|
||||
|
||||
17. **Upgrade kubelet on all worker hosts**
|
||||
|
||||
Each worker host will first be locked using the existing "system
|
||||
host-lock" CLI (worker hosts can be done in any order). This results in
|
||||
services being migrated off the host and applies the NoExecute taint, which
|
||||
will evict any pods that can be evicted.
|
||||
|
||||
The kubelet is then upgraded::
|
||||
|
||||
# system kube-host-upgrade kubelet compute-<n>
|
||||
+-----------------------+-------------------+
|
||||
| Property | Value |
|
||||
+-----------------------+-------------------+
|
||||
| target_version | v1.17.1 |
|
||||
| control_plane_version | v1.17.1 |
|
||||
| kubelet_version | v1.16.4 |
|
||||
| status | |
|
||||
+-----------------------+-------------------+
|
||||
|
||||
* The kubelet upgrade involves the following steps (must be done locally):
|
||||
|
||||
* download new pause image if the version has changed
|
||||
* kubeadm upgrade node
|
||||
* restart kubelet
|
||||
|
||||
The worker is then unlocked using the existing "system host-unlock" CLI.
|
||||
Multiple worker hosts can be upgraded at the same time, as long as there
|
||||
is sufficient capacity remaining on other worker hosts.
|
||||
|
||||
18. **Show kubernetes upgrade status**
|
||||
|
||||
::
|
||||
|
||||
# system kube-upgrade-show
|
||||
+-------------------+--------------------------+
|
||||
| Property | Value |
|
||||
+-------------------+--------------------------+
|
||||
| from_version | v1.16.4 |
|
||||
| to_version | v1.17.1 |
|
||||
| state | upgrade-complete |
|
||||
+-------------------+--------------------------+
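
The upgrade states listed in step 5 and exercised in the walkthrough above can be sketched as a small state machine. This is an illustration only: the state names come from this spec, but the strictly linear ordering and the ``can_transition`` helper are assumptions (for example, a system with a single controller would presumably skip the second-master states).

```python
from enum import Enum


class KubeUpgradeState(Enum):
    UPGRADE_STARTED = "upgrade-started"
    UPGRADING_FIRST_MASTER = "upgrading-first-master"
    UPGRADED_FIRST_MASTER = "upgraded-first-master"
    UPGRADING_NETWORKING = "upgrading-networking"
    UPGRADED_NETWORKING = "upgraded-networking"
    UPGRADING_SECOND_MASTER = "upgrading-second-master"
    UPGRADED_SECOND_MASTER = "upgraded-second-master"
    UPGRADING_KUBELETS = "upgrading-kubelets"
    UPGRADE_COMPLETE = "upgrade-complete"
    UPGRADE_FAILED = "upgrade-failed"


# Happy-path order, following the walkthrough (assumed to be linear).
_ORDER = [s for s in KubeUpgradeState
          if s is not KubeUpgradeState.UPGRADE_FAILED]


def can_transition(current, new):
    """Allow only the next state in sequence, or a move to upgrade-failed."""
    if current in (KubeUpgradeState.UPGRADE_COMPLETE,
                   KubeUpgradeState.UPGRADE_FAILED):
        # Leaving these states is not modelled here (delete or resume).
        return False
    if new is KubeUpgradeState.UPGRADE_FAILED:
        return True
    return _ORDER.index(new) == _ORDER.index(current) + 1
```

A check of this kind is one way to enforce that the CLI steps are run in the right order, for example by rejecting a second control plane upgrade before networking has been upgraded.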
|
||||
|
||||
**Failure Handling**
|
||||
|
||||
* When a failure happens and cannot be resolved without manual intervention,
|
||||
the upgrade state will be set to upgrade-failed. The "kubeadm upgrade"
|
||||
commands will fall back to the previous configuration if (for example) image
|
||||
downloads fail.
|
||||
* To recover, the user will resolve the issue that caused the failure and then
|
||||
re-attempt the upgrade (this will require a "system kube-upgrade-resume"
|
||||
command). Based on the kubernetes versions running on each host, the system
|
||||
will reset the upgrade state to the right point and the upgrade will resume.
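
One possible way to derive the resume point from the versions running on each host is sketched below. The function ``resume_state`` is hypothetical, and the sketch deliberately ignores the networking states, since they cannot be inferred from control plane and kubelet versions alone; a real implementation would need additional information for that.

```python
def resume_state(hosts, to_version):
    """Pick the state to resume from, given per-host running versions.

    `hosts` is a list of dicts using the field names shown by
    "system kube-host-upgrade-list"; control_plane_version is "N/A"
    on worker hosts.
    """
    control_planes = [h["control_plane_version"] for h in hosts
                      if h["personality"] == "controller"]
    upgraded = sum(1 for v in control_planes if v == to_version)
    kubelets = [h["kubelet_version"] for h in hosts]

    if upgraded == 0:
        return "upgrade-started"
    if upgraded < len(control_planes):
        return "upgraded-first-master"
    if all(v == to_version for v in kubelets):
        return "upgrade-complete"
    if any(v == to_version for v in kubelets):
        return "upgrading-kubelets"
    return "upgraded-second-master"
```

Applied to the host list shown in step 16 (both controllers fully upgraded, workers still on v1.16.4), this returns ``upgrading-kubelets``, which matches the state shown in step 15.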
|
||||
|
||||
**Health Checks**
|
||||
|
||||
* In order to ensure the health and stability of the system we will likely do
|
||||
health checks both before allowing a kubernetes upgrade to start and then as
|
||||
each upgrade CLI is run.
|
||||
* The health checks will include:
|
||||
|
||||
* basic system health (i.e. system health-query)
|
||||
* new kubernetes specific checks - for example:
|
||||
|
||||
* verify that all kubernetes control plane pods are running
|
||||
* verify that all kubernetes applications are fully applied
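
The two kubernetes specific checks listed above could look roughly like the following sketch. The input shapes are assumptions made for illustration: ``pods`` is a list of dicts with ``name`` and ``phase`` for the control plane pods, and ``applications`` is a list of dicts with ``name`` and ``status``, where the status string ``applied`` is assumed to mean fully applied.

```python
def kube_health_problems(pods, applications):
    """Return a list of human-readable problems (empty if healthy)."""
    problems = []
    for pod in pods:
        # "Running" is the kubernetes pod phase for a started pod.
        if pod["phase"] != "Running":
            problems.append(f"pod {pod['name']} is {pod['phase']}")
    for app in applications:
        # Assumption: "applied" is the status of a fully applied application.
        if app["status"] != "applied":
            problems.append(f"application {app['name']} is {app['status']}")
    return problems
```

An empty result would allow the upgrade step to proceed; otherwise the collected messages can be shown to the user, in the same spirit as the existing health-query output.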
|
||||
|
||||
**Interactions with container applications**
|
||||
|
||||
* Before starting an upgrade, we also need to check that all installed
|
||||
applications are compatible with the new kubernetes version. Ideally this
|
||||
checking should be done by invoking a plugin provided by each application.
|
||||
* When a kubernetes upgrade is in progress, we will prevent container
|
||||
application operations (e.g. system application-apply/remove/update). This
|
||||
will be done by introducing semantic checks in these APIs.
|
||||
* When a kubernetes upgrade is in progress, we will prevent helm-override
|
||||
operations (e.g. system helm-override-update/delete). These operations can
|
||||
trigger the applications to be re-applied, which we wouldn't want to do
|
||||
during a kubernetes upgrade. This will be done by introducing semantic
|
||||
checks in these APIs.
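
A semantic check of this kind might be sketched as follows. The names ``KubeUpgradeInProgress``, ``BLOCKED_OPERATIONS`` and ``check_operation_allowed`` are hypothetical, and the sketch assumes that an upgrade counts as "in progress" whenever a kube_upgrade record exists.

```python
class KubeUpgradeInProgress(Exception):
    """Raised when an operation is rejected because of a kubernetes upgrade."""


# Operation names follow the CLI examples given above.
BLOCKED_OPERATIONS = {
    "application-apply", "application-remove", "application-update",
    "helm-override-update", "helm-override-delete",
}


def check_operation_allowed(operation, kube_upgrade):
    """Reject blocked operations while a kube_upgrade record exists.

    `kube_upgrade` is the current upgrade record (a dict) or None.
    """
    if kube_upgrade is not None and operation in BLOCKED_OPERATIONS:
        raise KubeUpgradeInProgress(
            f"{operation} is not allowed while a kubernetes upgrade is in "
            f"progress (state: {kube_upgrade['state']})")
```

Each affected API handler would call the check before doing any work, so that a rejected request has no side effects.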
|
||||
|
||||
Alternatives
|
||||
------------
|
||||
|
||||
Given that StarlingX is using kubeadm to install and manage kubernetes, this
|
||||
tool is the only reasonable choice for upgrading kubernetes.
|
||||
|
||||
The alternative to the approach described above would be to have the user do
|
||||
the kubernetes upgrades by running the docker and kubeadm commands directly.
|
||||
This approach would be very complex and error prone and would not be
|
||||
acceptable to users of StarlingX.
|
||||
|
||||
Data model impact
|
||||
-----------------
|
||||
|
||||
The following new tables in the sysinv DB will be required:
|
||||
|
||||
* kube_host_upgrade:
|
||||
|
||||
* created/updated/deleted_at: as per other tables
|
||||
* id: as per other tables
|
||||
* uuid: as per other tables
|
||||
* forhostid: foreign key (i_host.id)
|
||||
* target_version: character (255)
|
||||
* status: character (255)
|
||||
|
||||
* kube_upgrade:
|
||||
|
||||
* created/updated/deleted_at: as per other tables
|
||||
* id: as per other tables
|
||||
* uuid: as per other tables
|
||||
* from_version: character (255)
|
||||
* to_version: character (255)
|
||||
* state: character (255)
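
For illustration, the two tables can be expressed as plain SQL and exercised with an in-memory SQLite database. This is only a sketch of the columns listed above: the ``uuid`` length, the timestamp types and the minimal ``i_host`` stand-in table are assumptions, and the real tables would be created through the sysinv DB migration mechanism.

```python
import sqlite3

SCHEMA = """
CREATE TABLE i_host (id INTEGER PRIMARY KEY);  -- stand-in for the real table

CREATE TABLE kube_host_upgrade (
    id INTEGER PRIMARY KEY,
    uuid VARCHAR(36) UNIQUE,
    created_at TIMESTAMP,
    updated_at TIMESTAMP,
    deleted_at TIMESTAMP,
    forhostid INTEGER REFERENCES i_host(id),
    target_version VARCHAR(255),
    status VARCHAR(255)
);

CREATE TABLE kube_upgrade (
    id INTEGER PRIMARY KEY,
    uuid VARCHAR(36) UNIQUE,
    created_at TIMESTAMP,
    updated_at TIMESTAMP,
    deleted_at TIMESTAMP,
    from_version VARCHAR(255),
    to_version VARCHAR(255),
    state VARCHAR(255)
);
"""


def create_tables(conn):
    """Create the sketch tables on an open sqlite3 connection."""
    conn.executescript(SCHEMA)
```

Note that the control plane and kubelet versions are not stored in ``kube_host_upgrade``; as described earlier, they are calculated at runtime.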
|
||||
|
||||
REST API impact
|
||||
---------------
|
||||
|
||||
This impacts the sysinv REST API:
|
||||
|
||||
* The new resource /kube_versions is added.
|
||||
|
||||
* URLs:
|
||||
|
||||
* /v1/kube_versions
|
||||
|
||||
* Request Methods:
|
||||
|
||||
* GET /v1/kube_versions
|
||||
|
||||
* Returns all kube_versions known to the system
|
||||
|
||||
* Response body example::
|
||||
|
||||
{"kube_versions": [{"state": "active",
|
||||
"version": "v1.16.4",
|
||||
"target": true}]}
|
||||
|
||||
* GET /v1/kube_versions/{version}
|
||||
|
||||
* Returns details of specified kube_version
|
||||
|
||||
* Response body example::
|
||||
|
||||
{"target": true,
|
||||
"upgrade_from": ["v1.16.4"],
|
||||
"downgrade_to": [],
|
||||
"applied_patches": ["PATCH.Y"],
|
||||
"state": "active",
|
||||
"version": "v1.17.1",
|
||||
"available_patches": ["PATCH.Z"]}
|
||||
|
||||
* The new resource /kube_upgrade is added.
|
||||
|
||||
* URLs:
|
||||
|
||||
* /v1/kube_upgrade
|
||||
|
||||
* Request Methods:
|
||||
|
||||
* POST /v1/kube_upgrade
|
||||
|
||||
* Creates (starts) a new kube_upgrade
|
||||
|
||||
* Request body example::
|
||||
|
||||
{"to_version": "v1.17.1"}
|
||||
|
||||
* Response body example::
|
||||
|
||||
{"from_version": "v1.16.4",
|
||||
"to_version": "v1.17.1",
|
||||
"state": "upgrade-started",
|
||||
"uuid": "223ba65e-45d1-4383-baa7-f03bb4c46773",
|
||||
"created_at": "2019-10-25T12:04:10.372399+00:00",
|
||||
"updated_at": "2019-10-25T12:04:10.372399+00:00"}
|
||||
|
||||
* GET /v1/kube_upgrade
|
||||
|
||||
* Returns the current kube_upgrade
|
||||
|
||||
* Response body example::
|
||||
|
||||
{"from_version": "v1.16.4",
|
||||
"to_version": "v1.17.1",
|
||||
"state": "upgrade-started",
|
||||
"uuid": "223ba65e-45d1-4383-baa7-f03bb4c46773",
|
||||
"created_at": "2019-10-25T12:04:10.372399+00:00",
|
||||
"updated_at": "2019-10-25T14:45:43.252964+00:00"}
|
||||
|
||||
* PATCH /v1/kube_upgrade
|
||||
|
||||
* Modifies the current kube_upgrade. Used to update the state of the
|
||||
upgrade (e.g. to upgrade networking).
|
||||
|
||||
* Request body example::
|
||||
|
||||
{"state": "upgrading-networking"}
|
||||
|
||||
* Response body example::
|
||||
|
||||
{"from_version": "v1.16.4",
|
||||
"to_version": "v1.17.1",
|
||||
"state": "upgrading-networking",
|
||||
"uuid": "223ba65e-45d1-4383-baa7-f03bb4c46773",
|
||||
"created_at": "2019-10-25T12:04:10.372399+00:00",
|
||||
"updated_at": "2019-10-25T14:45:43.252964+00:00"}
|
||||
|
||||
* DELETE /v1/kube_upgrade
|
||||
|
||||
* Deletes the current kube_upgrade (after it is completed)
|
||||
|
||||
* The existing resource /ihosts is modified to add new actions.
|
||||
|
||||
* URLs:
|
||||
|
||||
* /v1/ihosts/<hostid>
|
||||
|
||||
* Request Methods:
|
||||
|
||||
* POST /v1/ihosts/<hostid>/kube_upgrade_control_plane
|
||||
|
||||
* Upgrades the control plane on the specified host
|
||||
|
||||
* Response body example::
|
||||
|
||||
{"id": "2",
|
||||
"hostname": "controller-1",
|
||||
"personality": "controller",
|
||||
"target_version": "v1.17.1",
|
||||
"control_plane_version": "v1.16.4",
|
||||
"kubelet_version": "v1.16.4",
|
||||
"status": ""}
|
||||
|
||||
* POST /v1/ihosts/<hostid>/kube_upgrade_kubelet
|
||||
|
||||
* Upgrades the kubelet on the specified host
|
||||
|
||||
* Response body example::
|
||||
|
||||
{"id": "2",
|
||||
"hostname": "controller-1",
|
||||
"personality": "controller",
|
||||
"target_version": "v1.17.1",
|
||||
"control_plane_version": "v1.17.1",
|
||||
"kubelet_version": "v1.16.4",
|
||||
"status": ""}
|
||||
|
||||
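The requests described in this section could be built by a client along the following lines. This sketch only constructs the requests and does not send them. The base URL, the helper names and the use of a Keystone ``X-Auth-Token`` header are assumptions; the paths and bodies are taken from the examples above.

```python
import json
import urllib.request

BASE_URL = "http://controller:6385/v1"  # hypothetical sysinv API endpoint


def build_request(method, path, body=None, token=None):
    """Build (but do not send) a JSON request for the sysinv REST API."""
    data = json.dumps(body).encode() if body is not None else None
    headers = {"Content-Type": "application/json"}
    if token:
        headers["X-Auth-Token"] = token
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method)


def start_upgrade(to_version):
    return build_request("POST", "/kube_upgrade", {"to_version": to_version})


def upgrade_networking():
    return build_request(
        "PATCH", "/kube_upgrade", {"state": "upgrading-networking"})


def upgrade_control_plane(host_id):
    return build_request(
        "POST", f"/ihosts/{host_id}/kube_upgrade_control_plane")


def upgrade_kubelet(host_id):
    return build_request("POST", f"/ihosts/{host_id}/kube_upgrade_kubelet")
```

A caller would pass the returned object to ``urllib.request.urlopen`` and decode the JSON response, for example to read the ``state`` field after ``start_upgrade("v1.17.1")``.
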
Security impact
|
||||
---------------
|
||||
|
||||
This story is providing a mechanism to upgrade kubernetes from one version
|
||||
to another. It does not introduce any additional security impacts above
|
||||
what is already there regarding the initial deployment of kubernetes.
|
||||
|
||||
Other end user impact
|
||||
---------------------
|
||||
|
||||
End users will typically perform kubernetes upgrades using the sysinv (i.e.
|
||||
system) CLI. The new CLI commands are shown in the `Proposed change`_ section
|
||||
above.
|
||||
|
||||
Performance Impact
|
||||
------------------
|
||||
|
||||
When a kubernetes upgrade is in progress, each host must be taken out of
|
||||
service in order to upgrade the kubelet. This is necessary because running
|
||||
containers can be adversely impacted by the restart of the kubelet. The
|
||||
user must ensure that there is enough capacity in the system to handle
|
||||
the removal from service of one (or more) hosts as the kubelet on each
|
||||
host is upgraded.
|
||||
|
||||
Other deployer impact
|
||||
---------------------
|
||||
|
||||
Deployers will now be able to upgrade kubernetes on a running system.
|
||||
|
||||
Developer impact
|
||||
----------------
|
||||
|
||||
Developers working on the StarlingX components that manage container
|
||||
applications may need to be aware that certain operations should be
|
||||
prevented when a kubernetes upgrade is in progress. This is discussed in the
|
||||
`Proposed change`_ section above.
|
||||
|
||||
Upgrade impact
|
||||
--------------
|
||||
|
||||
Kubernetes upgrades are independent from the upgrade of the StarlingX platform.
|
||||
However, when StarlingX platform upgrades are supported, checks must be put
|
||||
in place to ensure that the kubernetes version is not allowed to change due
|
||||
to a platform upgrade. In effect, the system must be upgraded to the same
|
||||
version of kubernetes as is packaged in the new platform release, to ensure
|
||||
this is the case. This will be enforced through semantic checking in the
|
||||
platform upgrade APIs.
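
The platform upgrade check described above reduces to a comparison of two versions. A minimal sketch, in which the function name and the way the two versions are obtained are assumptions:

```python
def check_platform_upgrade(running_kube_version, new_release_kube_version):
    """Return an error message if the platform upgrade must be rejected.

    Returns None when the running kubernetes version already matches the
    version packaged in the new platform release.
    """
    if running_kube_version != new_release_kube_version:
        return (f"kubernetes must be upgraded to {new_release_kube_version} "
                f"before the platform upgrade "
                f"(currently {running_kube_version})")
    return None
```

Requiring an exact match, rather than "at least" the packaged version, follows the statement that the kubernetes version must not change as a result of a platform upgrade.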
|
||||
|
||||
|
||||
Implementation
|
||||
==============
|
||||
|
||||
Assignee(s)
|
||||
-----------
|
||||
|
||||
Primary assignee:
|
||||
|
||||
* Bart Wensley (bartwensley)
|
||||
|
||||
Other contributors:
|
||||
|
||||
* Al Bailey (albailey)
|
||||
* Don Penney (dpenney)
|
||||
* Kevin Smith (kevin.smith.wrs)
|
||||
|
||||
Repos Impacted
|
||||
--------------
|
||||
|
||||
* config
|
||||
* integ
|
||||
* update
|
||||
|
||||
Work Items
|
||||
----------
|
||||
|
||||
Sysinv:
|
||||
|
||||
* Define new metadata for kubernetes versions
|
||||
* DB API for new tables
|
||||
* kube-version-list/show CLI/API
|
||||
|
||||
* basic infrastructure
|
||||
* calculate state for each known version
|
||||
|
||||
* kube-upgrade-start/show CLI/API
|
||||
|
||||
* basic infrastructure
|
||||
* semantic checks for upgrade start
|
||||
|
||||
* applied/available patches
|
||||
* installed applications support new kubernetes version
|
||||
* tiller/armada images support new kubernetes version
|
||||
|
||||
* kube-host-upgrade-list/show CLI/API
|
||||
|
||||
* basic infrastructure
|
||||
* calculate versions for each host
|
||||
|
||||
* kube-host-upgrade control-plane CLI/API
|
||||
|
||||
* basic infrastructure
|
||||
* semantic checks
|
||||
* conductor RPC/implementation (trigger puppet manifest apply, wait for
|
||||
completion, update coredns affinity, etc...)
|
||||
|
||||
* kube-upgrade-networking CLI/API
|
||||
|
||||
* basic infrastructure
|
||||
* semantic checks
|
||||
* conductor RPC/implementation (trigger playbook apply, wait for completion,
|
||||
etc...)
|
||||
|
||||
* kube-host-upgrade kubelet CLI/API
|
||||
|
||||
* basic infrastructure
|
||||
* semantic checks
|
||||
* conductor RPC/implementation (trigger puppet manifest apply, wait for
|
||||
completion, etc...)
|
||||
|
||||
* kube-upgrade-resume CLI/API
|
||||
|
||||
* basic infrastructure
|
||||
* semantic checks
|
||||
* conductor RPC/implementation (determine what state the upgrade should be
|
||||
in, etc...)
|
||||
|
||||
* New KubeOperator functions, including:
|
||||
|
||||
* retrieve versions of each control plane component
|
||||
* retrieve versions of each kubelet
|
||||
* utility to roll up versions into overall kubernetes version
|
||||
* update affinity (for coredns pod)
|
||||
|
||||
* Kubernetes specific health checks
|
||||
|
||||
* add to existing health-query CLI
|
||||
* verify all control plane pods are running/healthy
|
||||
* verify that all applications are fully applied
|
||||
* figure out what else we should check
|
||||
|
||||
* Add semantic checks to existing APIs
|
||||
|
||||
* application-apply/remove/etc... - prevent when kubernetes upgrade in
|
||||
progress
|
||||
* helm-override-update/etc... - prevent when kubernetes upgrade in progress
|
||||
|
||||
Ansible:
|
||||
|
||||
* enhance upgrade networking playbook to support applying different manifests
|
||||
based on what kubernetes version is running
|
||||
|
||||
Puppet:
|
||||
|
||||
* kubernetes runtime manifest for control plane upgrade
|
||||
* kubernetes runtime manifest for kubelet upgrade
|
||||
|
||||
Patching:
|
||||
|
||||
* Pre-apply/remove scripts to check running kubernetes version
|
||||
|
||||
|
||||
Dependencies
|
||||
============
|
||||
|
||||
None
|
||||
|
||||
|
||||
Testing
|
||||
=======
|
||||
|
||||
Kubernetes upgrades must be tested in the following StarlingX configurations:
|
||||
|
||||
* AIO-SX
|
||||
* AIO-DX
|
||||
* Standard with controller storage
|
||||
* Standard with dedicated storage
|
||||
* Distributed cloud
|
||||
|
||||
The testing can be performed on hardware or virtual environments.
|
||||
|
||||
|
||||
Documentation Impact
|
||||
====================
|
||||
|
||||
New end user documentation will be required to describe how kubernetes
|
||||
upgrades should be done. The config API reference will also need updates.
|
||||
|
||||
|
||||
References
|
||||
==========
|
||||
|
||||
.. [1] https://kubernetes.io/docs/setup/release/version-skew-policy
|
||||
.. [2] https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade
|
||||
|
||||
|
||||
History
|
||||
=======
|
||||
|
||||
.. list-table:: Revisions
|
||||
:header-rows: 1
|
||||
|
||||
* - Release Name
|
||||
- Description
|
||||
* - stx-4.0
|
||||
- Introduced
|