[fedora-atomic][k8s] Support operating system upgrade

Along with the Kubernetes version upgrade support we just released, we're
adding support for upgrading the operating system of the k8s cluster
(including master and worker nodes). It's an in-place upgrade leveraging the
atomic/ostree upgrade capability.

Story: 2002210
Task: 33607

Change-Id: If6b9c054bbf5395c30e2803314e5695a531c22bc
Fei Long Wang 2019-07-08 15:45:21 +12:00 committed by Feilong Wang
parent 9e815f6af4
commit 2a81d62732
8 changed files with 143 additions and 23 deletions


@ -2,29 +2,33 @@ Rolling upgrade is one of the most important features users want to see in a
managed Kubernetes service, and in Magnum we're thinking more deeply about how
to provide a better user experience.
.. note::
The Kubernetes version upgrade is only supported by the Fedora Atomic driver.
Users can now use the command below to trigger a rolling upgrade of either
the Kubernetes version or the node operating system version.
.. code-block:: bash
#!/bin/bash -x
openstack coe cluster upgrade <cluster ID> <new cluster template ID>
IP="192.168.122.1"
CLUSTER="797b39e1-fac2-48d3-8377-d6e6cc443d39"
CT="e32c8cf7-394b-45e6-a17e-4fe6a30ad64b"
As you can see, the key parameter in the command is the new cluster template.
For a Kubernetes version upgrade, the new template should provide a newer
version in the `kube_tag` label. Downgrade is not supported.
# Upgrade curl
req_body=$(cat << EOF
{
"max_batch_size": 1,
"nodegroup": "master",
"cluster_template": "${CT}"
}
EOF
)
USER_TOKEN=$(openstack token issue -c id -f value)
curl -g -i -X PATCH https://${IP}:9511/v1/clusters/${CLUSTER}/actions/upgrade \
-H "OpenStack-API-Version: container-infra latest" \
-H "X-Auth-Token: $USER_TOKEN" \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-H "User-Agent: None" \
-d "$req_body"
Currently, a simple operating system upgrade uses a new image ID in the new
cluster template. With this approach, applications running on the cluster may
experience downtime, because all the nodes are rebuilt one by one.
The Magnum Fedora Atomic driver supports a more graceful operating
system upgrade. Similar to the k8s version upgrade, it tries to cordon
and drain the node before upgrading the operating system with the rpm-ostree
command. Two labels are introduced to support this feature: `ostree_commit`
and `ostree_remote`.
* ostree_commit: the ostree commit ID that the current system can be
upgraded to. For example,
`b25bde0109441817f912ece57ca1fc39efc60e6cef4a7a23ad9de51b1f36b742`
* ostree_remote: the ostree remote name that the current system can be rebased
to. For example, `fedora-atomic:fedora/27/x86_64/atomic-host`
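
As a rough illustration of how these labels might be passed, the sketch below assembles the `--labels` value for a new cluster template and prints the two commands instead of running them. The cluster name, template name, image name, external network, and the Fedora Atomic 29 refspec are placeholders chosen for illustration, not values taken from this change, and `build_labels` is a hypothetical helper.

```shell
#!/bin/bash
# Sketch only: every value below is a hypothetical placeholder.
CLUSTER="my-cluster"
NEW_TEMPLATE="k8s-atomic-29"
IMAGE="fedora-atomic-29"
EXTERNAL_NETWORK="public"
OSTREE_REMOTE="fedora-atomic-29:fedora/29/x86_64/atomic-host"

# Join key=value pairs with commas, the form the --labels option takes.
build_labels() {
    local IFS=,
    echo "$*"
}

LABELS=$(build_labels "ostree_remote=${OSTREE_REMOTE}")

# Print the commands for review rather than executing them.
echo "openstack coe cluster template create ${NEW_TEMPLATE}" \
     "--coe kubernetes --image ${IMAGE}" \
     "--external-network ${EXTERNAL_NETWORK} --labels ${LABELS}"
echo "openstack coe cluster upgrade ${CLUSTER} ${NEW_TEMPLATE}"
```

In practice the new template would normally carry over the remaining labels and settings of the template the cluster was created from; the sketch sets only `ostree_remote` to keep the example short.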


@ -6,8 +6,11 @@ set -x
ssh_cmd="ssh -F /srv/magnum/.ssh/config root@localhost"
kubecontrol="/var/lib/containers/atomic/heat-container-agent.0/rootfs/usr/bin/kubectl --kubeconfig /etc/kubernetes/kubelet-config.yaml"
new_kube_tag="$kube_tag_input"
new_ostree_remote="$ostree_remote_input"
new_ostree_commit="$ostree_commit_input"
HOSTNAME_OVERRIDE="$(cat /etc/hostname | head -1 | sed 's/\.novalocal//')"
if [ ${new_kube_tag}!=${KUBE_TAG} ]; then
function drain {
# If there is only one master and this is the master node, skip the drain, just cordon it
# If there is only one worker and this is the worker node, skip the drain, just cordon it
all_masters=$(${ssh_cmd} ${kubecontrol} get nodes --selector=node-role.kubernetes.io/master= -o name)
@ -17,6 +20,11 @@ if [ ${new_kube_tag}!=${KUBE_TAG} ]; then
else
${ssh_cmd} ${kubecontrol} cordon ${INSTANCE_NAME}
fi
}
if [ "${new_kube_tag}" != "${KUBE_TAG}" ]; then
drain
declare -A service_image_mapping
service_image_mapping=( ["kubelet"]="kubernetes-kubelet" ["kube-controller-manager"]="kubernetes-controller-manager" ["kube-scheduler"]="kubernetes-scheduler" ["kube-proxy"]="kubernetes-proxy" ["kube-apiserver"]="kubernetes-apiserver" )
@ -50,3 +58,55 @@ if [ ${new_kube_tag}!=${KUBE_TAG} ]; then
# Appending the new KUBE_TAG into the heat-params to log and indicate the current k8s version
echo "KUBE_TAG=$new_kube_tag" >> /etc/sysconfig/heat-params
fi
function setup_uncordon {
# Create a service to uncordon the node itself after reboot
if [ ! -f /etc/systemd/system/uncordon.service ]; then
$ssh_cmd cat > /etc/systemd/system/uncordon.service << EOF
[Unit]
Description=magnum-uncordon
After=network.target kubelet.service
[Service]
Restart=Always
RemainAfterExit=yes
ExecStart=${kubecontrol} uncordon ${HOSTNAME_OVERRIDE}
[Install]
WantedBy=multi-user.target
EOF
${ssh_cmd} systemctl enable uncordon.service
fi
}
remote_list=`${ssh_cmd} ostree remote list`
# Fedora Atomic 29 will be the last release before migrating to Fedora CoreOS, so we're OK to add 28 and 29 remotes directly
if [[ ! " ${remote_list[@]} " =~ "fedora-atomic-28" ]]; then
${ssh_cmd} ostree remote add --set=gpgkeypath=/etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-28-primary --contenturl=mirrorlist=https://ostree.fedoraproject.org/mirrorlist fedora-atomic-28 https://kojipkgs.fedoraproject.org/atomic/repo/
fi
if [[ ! " ${remote_list[@]} " =~ "fedora-atomic-29" ]]; then
${ssh_cmd} ostree remote add --set=gpgkeypath=/etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-29-primary --contenturl=mirrorlist=https://ostree.fedoraproject.org/mirrorlist fedora-atomic-29 https://kojipkgs.fedoraproject.org/atomic/repo/
fi
# The uri of existing Fedora Atomic 27 remote is not accessible now, so replace it with correct uri
if [[ " ${remote_list[@]} " =~ "fedora-atomic" ]]; then
sed -i '
/^url=/ s|=.*|=https://kojipkgs.fedoraproject.org/atomic/repo/|
' /etc/ostree/remotes.d/fedora-atomic.conf
fi
current_ostree_commit=`${ssh_cmd} rpm-ostree status | grep Commit | awk '{print $2}'`
current_ostree_remote=`${ssh_cmd} rpm-ostree status | awk '/* ostree/{print $0}' | awk '{match($0,"* ostree://([^ ]+)",a)}END{print a[1]}'`
# NOTE(flwang): 1. Either deploy or rebase for only one upgrade
# 2. Using rpm-ostree command instead of atomic command to keep the possibility of supporting fedora coreos 30
if [ "$new_ostree_commit" != "" ] && [ "$current_ostree_commit" != "$new_ostree_commit" ]; then
drain
setup_uncordon
${ssh_cmd} rpm-ostree deploy $new_ostree_commit
shutdown --reboot --no-wall -t 1
elif [ "$new_ostree_remote" != "" ] && [ "$current_ostree_remote" != "$new_ostree_remote" ]; then
drain
setup_uncordon
${ssh_cmd} rpm-ostree rebase $new_ostree_remote
shutdown --reboot --no-wall -t 1
fi
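
The selection between `rpm-ostree deploy` and `rpm-ostree rebase` above can be isolated as a small pure function, which makes the precedence easier to see: a new commit wins over a new remote, and nothing happens when neither differs from the current state. This is a minimal sketch for illustration; `choose_ostree_action` is a hypothetical helper that does not exist in the upgrade script, and it only prints the action instead of draining or rebooting the node.

```shell
#!/bin/bash
# Sketch of the deploy/rebase selection. choose_ostree_action is a
# hypothetical helper, not part of upgrade-kubernetes.sh.
choose_ostree_action() {
    local current_commit="$1" new_commit="$2"
    local current_remote="$3" new_remote="$4"

    if [ -n "$new_commit" ] && [ "$current_commit" != "$new_commit" ]; then
        # A different commit was requested: deploy takes precedence.
        echo "deploy $new_commit"
    elif [ -n "$new_remote" ] && [ "$current_remote" != "$new_remote" ]; then
        # No new commit, but a different remote was requested.
        echo "rebase $new_remote"
    else
        # Labels are empty or already match the running system.
        echo "none"
    fi
}

choose_ostree_action "abc" "def" "r1" "r2"   # prints: deploy def
choose_ostree_action "abc" "abc" "r1" "r2"   # prints: rebase r2
choose_ostree_action "abc" ""    "r1" "r1"   # prints: none
```

Because only one branch runs per invocation, setting both labels in the same template results in a deploy only; a rebase would need a later upgrade in which the commit label is empty or already matches the running commit.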


@ -142,7 +142,8 @@ class K8sFedoraTemplateDefinition(k8s_template_def.K8sTemplateDefinition):
'auto_healing_enabled', 'auto_scaling_enabled',
'auto_healing_controller', 'magnum_auto_healer_tag',
'draino_tag', 'autoscaler_tag',
'min_node_count', 'max_node_count', 'npd_enabled']
'min_node_count', 'max_node_count', 'npd_enabled',
'ostree_remote', 'ostree_commit']
for label in label_list:
label_value = cluster.labels.get(label)


@ -672,6 +672,16 @@ parameters:
default:
true
ostree_remote:
type: string
description: The ostree remote branch to upgrade
default: ''
ostree_commit:
type: string
description: The ostree commit to deploy
default: ''
resources:
######################################################################
@ -968,6 +978,8 @@ resources:
min_node_count: {get_param: min_node_count}
max_node_count: {get_param: max_node_count}
npd_enabled: {get_param: npd_enabled}
ostree_remote: {get_param: ostree_remote}
ostree_commit: {get_param: ostree_commit}
kube_cluster_config:
type: OS::Heat::SoftwareConfig
@ -1104,6 +1116,8 @@ resources:
auto_healing_enabled: {get_param: auto_healing_enabled}
npd_enabled: {get_param: npd_enabled}
auto_healing_controller: {get_param: auto_healing_controller}
ostree_remote: {get_param: ostree_remote}
ostree_commit: {get_param: ostree_commit}
outputs:


@ -506,6 +506,14 @@ parameters:
default:
true
ostree_remote:
type: string
description: The ostree remote branch to upgrade
ostree_commit:
type: string
description: The ostree commit to deploy
resources:
######################################################################
#
@ -774,6 +782,8 @@ resources:
group: script
inputs:
- name: kube_tag_input
- name: ostree_remote_input
- name: ostree_commit_input
config:
get_file: ../../common/templates/kubernetes/fragments/upgrade-kubernetes.sh
@ -786,6 +796,8 @@ resources:
actions: ['UPDATE']
input_values:
kube_tag_input: {get_param: kube_tag}
ostree_remote_input: {get_param: ostree_remote}
ostree_commit_input: {get_param: ostree_commit}
outputs:


@ -294,6 +294,16 @@ parameters:
default:
true
ostree_remote:
type: string
description: The ostree remote branch to upgrade
default: ''
ostree_commit:
type: string
description: The ostree commit to deploy
default: ''
resources:
agent_config:
@ -468,6 +478,8 @@ resources:
group: script
inputs:
- name: kube_tag_input
- name: ostree_remote_input
- name: ostree_commit_input
config:
get_file: ../../common/templates/kubernetes/fragments/upgrade-kubernetes.sh
@ -480,6 +492,8 @@ resources:
actions: ['UPDATE']
input_values:
kube_tag_input: {get_param: kube_tag}
ostree_remote_input: {get_param: ostree_remote}
ostree_commit_input: {get_param: ostree_commit}
outputs:


@ -539,6 +539,8 @@ class AtomicK8sTemplateDefinitionTestCase(BaseK8sTemplateDefinitionTestCase):
npd_enabled = mock_cluster.labels.get('npd_enabled')
master_image = mock_cluster_template.image_id
minion_image = mock_cluster_template.image_id
ostree_remote = mock_cluster.labels.get('ostree_remote')
ostree_commit = mock_cluster.labels.get('ostree_commit')
k8s_def = k8sa_tdef.AtomicK8sTemplateDefinition()
@ -620,6 +622,8 @@ class AtomicK8sTemplateDefinitionTestCase(BaseK8sTemplateDefinitionTestCase):
'kube_version': kube_tag,
'master_kube_tag': kube_tag,
'minion_kube_tag': kube_tag,
'ostree_remote': ostree_remote,
'ostree_commit': ostree_commit,
}}
mock_get_params.assert_called_once_with(mock_context,
mock_cluster_template,
@ -955,6 +959,8 @@ class AtomicK8sTemplateDefinitionTestCase(BaseK8sTemplateDefinitionTestCase):
npd_enabled = mock_cluster.labels.get('npd_enabled')
master_image = mock_cluster_template.image_id
minion_image = mock_cluster_template.image_id
ostree_remote = mock_cluster.labels.get('ostree_remote')
ostree_commit = mock_cluster.labels.get('ostree_commit')
k8s_def = k8sa_tdef.AtomicK8sTemplateDefinition()
@ -1038,6 +1044,8 @@ class AtomicK8sTemplateDefinitionTestCase(BaseK8sTemplateDefinitionTestCase):
'kube_version': kube_tag,
'master_kube_tag': kube_tag,
'minion_kube_tag': kube_tag,
'ostree_remote': ostree_remote,
'ostree_commit': ostree_commit,
}}
mock_get_params.assert_called_once_with(mock_context,
mock_cluster_template,


@ -0,0 +1,7 @@
---
features:
- |
Along with the Kubernetes version upgrade support we just released, we're
adding support for upgrading the operating system of the k8s cluster
(including master and worker nodes). It's an in-place upgrade leveraging the
atomic/ostree upgrade capability.