Without this, heat container agents using kubectl version
1.18.x (e.g. ussuri-dev) fail because they do not have the correct
KUBECONFIG in the environment.
Task: 39938
Story: 2007591
Change-Id: Ifc212478ae09c658adeb6ba4c8e8afc8943e3977
- Refactor helm installer to use a single meta chart install job and
config which use the Helm v3 client.
- Use upstream helm client binary instead of the helm-client container
maintained by us. To verify the checksum, a helm_client_sha256 label is
introduced for helm_client_tag (or, alternatively, for a URL specified
using the new helm_client_url label).
- Default helm_client_tag=v3.2.1.
- Default tiller_tag=v2.16.7, tiller_enabled=false.
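For example, a cluster template could pin the client roughly like this
(image, network and checksum values are placeholders, not verified):
  openstack coe cluster template create k8s-helm3 --coe kubernetes \
    --image fedora-coreos-latest --external-network public \
    --labels helm_client_tag=v3.2.1,helm_client_sha256=<sha256-of-binary>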
Story: 2007514
Task: 39295
Change-Id: I9b9633c81afb08b91576a9a4d3c5a0c445e0cee4
Heapster has been deprecated for a while and the new k8s dashboard
v2.0.0 supports metrics-server, so it's time to upgrade the default
k8s dashboard to v2.0.0.
Task: 39101
Story: 2007256
Change-Id: I02f8cb77b472142f42ecc59a339555e60f5f38d0
A new config option `post_install_manifest_url` is added to support
installing cloud provider/vendor specific manifests after the k8s
cluster is booted. It is a URL pointing to the manifest file. For
example, a cloud admin can put their specific storageclass into
this file, and it will be applied automatically after the cluster
is created.
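A sketch of the operator-side setting, assuming the option lives in
the [kubernetes] section of magnum.conf (the URL is a placeholder):
  [kubernetes]
  post_install_manifest_url = https://example.com/storageclass.yaml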
Task: 35798
Story: 2006209
Change-Id: Ib5a2c5cd7970085db941f189613e175f622aea3f
Add an ARCH parameter to handle arch-specific things, mostly the
docker image repo names.
Because not all of the docker images magnum uses support multi-arch
manifests[1] (e.g. kubernetes-dashboard), the arch name needs to be
specified in the docker image repo name.
[1]
https://kubernetes.io/docs/concepts/containers/images/#building-multi-architecture-images-with-manifests
Change-Id: Iccb3a030aefd2d4e55a455d1a0401cbc4eb7fd14
Task: 37884
Story: 2007026
Add support for out of tree Cinder CSI. This is installed when the
cinder_csi_enabled=true label is added. This will allow us to eventually
deprecate in-tree Cinder.
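A minimal usage sketch (image and network names are placeholders):
  openstack coe cluster template create k8s-csi --coe kubernetes \
    --image fedora-coreos-latest --external-network public \
    --labels cinder_csi_enabled=true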
story: 2007048
task: 37868
Change-Id: I8305b9f8c9c37518ec39198693adb6f18542bf2e
Signed-off-by: Bharat Kunwar <brtknr@bath.edu>
Add support for choosing the IPIP mode to use for the IPv4 pool
created at start up.
allowed_values: ["Always", "CrossSubnet", "Never", "Off"]
default: "Off"
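Assuming this is exposed as the calico_ipv4pool_ipip label (an
assumption here, not confirmed by this message), usage would look
roughly like:
  openstack coe cluster template create k8s-calico --coe kubernetes \
    --image fedora-atomic-latest --external-network public \
    --labels network_driver=calico,calico_ipv4pool_ipip=CrossSubnet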
Change-Id: Ib834a1f86a6db408047cc8f86fc7744d16d83904
Signed-off-by: Diogo Guerra <diogo.filipe.tomas.guerra@cern.ch>
Since we're using a public container registry as the default registry,
it would be nice to have verification for the image's digest.
Kubernetes already supports that so user can just use format like
@sha256:xxx for those addons' tags. This patch introduces the support
for hyperkube based on podman and fedora coreos driver.
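As a rough illustration, a digest-pinned tag might be passed like this
(the tag@digest form and the digest value are assumptions/placeholders):
  openstack coe cluster template create k8s-pinned --coe kubernetes \
    --image fedora-coreos-latest --external-network public \
    --labels kube_tag=v1.18.2@sha256:<digest>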
Task: 37776
Story: 2007001
Change-Id: I970c1b91254d2a375192420a9169f3a629c56ce7
There is harmless but unnecessary indentation in
/etc/sysconfig/heat-params for the master node only which would be nice
to remove going forward. This issue does not exist in minions.
This oddity was introduced in Ibbed59bc135969174a20e5243ff8464908801a23.
Task: 37812
Story: 2002210
Change-Id: I7ca2893b44ea5cb057e51013dbcb34efb18d54ed
Magnum allows the use of CONTAINER_INFRA_PREFIX to specify a local
repository from which we can pull container images. This repository
defaults to the upstream one that is specified in the metrics helm
chart.
* This patch allows CONTAINER_INFRA_PREFIX to be used to correctly
configure pulling the metrics-server container image from the
specified repo.
* Add label metrics_server_chart_tag to allow user to specify
stable/metrics-server chart tag to use
* Add label metrics_server_enabled to allow enable/disable of
component (defaults: true)
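A usage sketch (chart tag and other values are placeholders):
  openstack coe cluster template create k8s-metrics --coe kubernetes \
    --image fedora-atomic-latest --external-network public \
    --labels metrics_server_enabled=true,metrics_server_chart_tag=<chart-tag>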
Story: 2004816
Task: 37390
Change-Id: Idc315937a82317b76349bbe8466d900d00194953
Signed-off-by: Diogo Guerra <dy090.guerra@gmail.com>
This will install the prometheus-adapter stable
helm chart. Requires monitoring_enabled=true.
The chart version can be configured using
prometheus_adapter_chart_tag, and the default configuration rules
can be overridden with a user-defined ConfigMap referenced by the
prometheus_adapter_configmap label.
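For example, roughly (tag and ConfigMap name are placeholders):
  openstack coe cluster template create k8s-adapter --coe kubernetes \
    --image fedora-coreos-latest --external-network public \
    --labels monitoring_enabled=true,prometheus_adapter_chart_tag=<tag>,prometheus_adapter_configmap=<name>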
story: 2006765
task: 37278
Change-Id: I5b86f4455f88c8dbeac6e56942e1ca55f1d1726c
Signed-off-by: Diogo Guerra <diogo.filipe.tomas.guerra@cern.ch>
Additionally, bump up the chart version to 1.24.7, without which the
ingress controller fails to deploy on 1.16.x.
Also bump up the nginx_ingress_controller_tag version to 0.26.1.
This is to ensure that we are running an up to date nginx ingress
controller with fixes for known CVEs.
Story: 2006853
Task: 37444
Change-Id: Ibf045a06d19b02095e19d9a21d14a91a39a3751c
Choose whether system containers etcd, kubernetes and the heat-agent will be
installed with podman or atomic. This label is relevant for k8s_fedora drivers.
k8s_fedora_atomic_v1 defaults to use_podman=false, meaning atomic will
be used, pulling containers from docker.io/openstackmagnum.
use_podman=true is accepted as well, which will pull containers from
k8s.gcr.io.
k8s_fedora_coreos_v1 defaults and accepts only use_podman=true.
Fix upgrade for k8s_fedora_coreos_v1 and magnum-cordon systemd unit.
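For example (image and network names are placeholders):
  openstack coe cluster template create k8s-podman --coe kubernetes \
    --image fedora-atomic-latest --external-network public \
    --labels use_podman=true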
Task: 37242
Story: 2005201
Change-Id: I0d5e4e059cd4f0458746df7c09d2fd47c389c6a0
Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>
With this change each node will be labeled with the following:
* --node-labels=magnum.openstack.org/role=${NODEGROUP_ROLE}
* --node-labels=magnum.openstack.org/nodegroup=${NODEGROUP_NAME}
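This makes it possible to select nodes by role or nodegroup, e.g.
(the role value here is illustrative):
  kubectl get nodes -l magnum.openstack.org/role=worker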
Change-Id: Ic410a059b19a1252cdf6eed786964c5c7b03d01c
Sometimes, the fixed_network value gets rendered as a UUID. However,
OCCM's internal-network-name requires the network name; it does not
support UUIDs. This patch introduces a new parameter called
fixed_network_name which converts the fixed_network UUID to a name if
it is UUID-like.
Story: 2005333
Task: 36313
Change-Id: I3453bc0dbea285687d39c9782685cb1f2a3ecd39
We kept introspecting the name of the instance on the assumption
that the network always existed under .novalocal.
This is not always the case: with certain variables changed inside
Neutron it is possible to control this, which leads to failing
deploys.
With this change, we pass the instance name directly to the cluster
and therefore we always have the accurate name.
Task: 36160
Story: 2006371
Change-Id: I2ba32844b822ffc14da043e6ef7d071bb62a22ee
When there is more than one NIC attached to an instance, the openstack
cloud provider returns a random InternalIP back to the host, resulting
in instability with the API server, which only talks to a default
interface.
This patch incorporates the changes made in
https://github.com/kubernetes/cloud-provider-openstack/pull/444 which enables
OpenStack Cloud Controller Manager (OCCM) to respect the
`internal-network-name` in cloud-config file which ensures that InternalIP
remains stable.
Uses a separate cloud-config file for OCCM to ensure in-tree Cinder volumes
remain compatible.
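The relevant OCCM cloud-config fragment looks roughly like this (the
network name is a placeholder):
  [Networking]
  internal-network-name = <fixed-network-name>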
Change-Id: Idfa52ed2d512e7dc383a556371e896205dd542f9
Story: 2005333
Task: 30271
* prometheus-operator chart version upgraded from 0.1.31 to 5.12.3
* Fix an issue where, when using the Feature Gate Priority, the
scheduler would evict the prometheus monitoring node-exporter pods
* Fix an issue where intensive CPU utilization would make the
metrics fail intermittently or completely fail
* Prometheus resources are now calculated based on the MAX_NODE_COUNT
requested
* Change the sampling rate from the standard 30s to 1 minute (Rollback)
* Add the missing tiller CONTAINER_INFRA_PREFIX variable to the ConfigMap
* Add label prometheus_operator_chart_tag to enable the user to
specify the stable/prometheus-operator chart to use
* Fix breaking changes on CoreDNS metrics introduced by
8fb27da2fc
* Fix Grafana dashboard not showing data.
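A usage sketch (image and network names are placeholders):
  openstack coe cluster template create k8s-monitoring --coe kubernetes \
    --image fedora-atomic-latest --external-network public \
    --labels monitoring_enabled=true,prometheus_operator_chart_tag=5.12.3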
Change-Id: If42873cd6668c07e4e911e4eef5e4ae2232be66f
Task: 30777
Task: 30779
Story: 2005588
Signed-off-by: Diogo Guerra <dy090.guerra@gmail.com>
Rolling upgrade is an important feature for a managed k8s service;
at this stage, two use cases are covered:
1. Upgrading the base operating system
2. Upgrading the k8s version
Known limitation: when upgrading the operating system, there is no
chance to call kubectl drain to evict pods on that node.
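A rough usage sketch (cluster and template names are placeholders):
  openstack coe cluster upgrade my-cluster new-template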
Task: 30185
Story: 2002210
Change-Id: Ibbed59bc135969174a20e5243ff8464908801a23
The current magnum traefik deployment will always pull latest traefik
container image. With the new launch of traefik v2
(https://blog.containo.us/back-to-traefik-2-0-2f9aa17be305) this will
have an impact on how the ingress is described in k8s.
This patch:
* Sets the default traefik version to tag v1.7.9, the stable release
prior to v2.
* Adds a new label <traefik_ingress_controller_tag> to enable users
to specify a traefik release other than the default.
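For example (image and network names are placeholders):
  openstack coe cluster template create k8s-traefik --coe kubernetes \
    --image fedora-atomic-latest --external-network public \
    --labels ingress_controller=traefik,traefik_ingress_controller_tag=v1.7.9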
Task: 30143
Task: 30146
Story: 2005286
Change-Id: I031a594f7b6014d88df055664afcf51b1cd2cd94
Signed-off-by: Diogo Guerra <dy090.guerra@gmail.com>
Using Node Problem Detector, Draino and AutoScaler to support
auto healing for k8s clusters, users can use a new label
"auto_healing_enabled" to turn it on/off.
Meanwhile, a new label "auto_scaling_enabled" is also introduced
to enable the capability to let the k8s cluster auto scale based
on its workload.
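A usage sketch (image and network names are placeholders):
  openstack coe cluster template create k8s-healing --coe kubernetes \
    --image fedora-atomic-latest --external-network public \
    --labels auto_healing_enabled=true,auto_scaling_enabled=true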
Task: 28923
Story: 2004782
Change-Id: I25af2a72a7a960205929374d2300bd83d4d20960
Add an nginx based Ingress controller for Kubernetes.
The use case is to better support workloads which require either
L4 access or SSL passthrough, both of which lack proper support in
Traefik.
Selection is done via the same label 'ingress_controller' with value
'nginx'. Deployment relies on the upstream nginx-ingress helm chart.
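For example (image and network names are placeholders):
  openstack coe cluster template create k8s-nginx --coe kubernetes \
    --image fedora-atomic-latest --external-network public \
    --labels ingress_controller=nginx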
Change-Id: I1db2074fce9d43c03f479a6aaeb4f238d7101555
Story: 2005327
Task: 30255
When there is more than one NIC attached to an instance, the openstack
cloud provider returns a random InternalIP back to the host, resulting
in instability with the API server, which only talks to a default
interface.
This patch incorporates the changes made in
https://github.com/kubernetes/cloud-provider-openstack/pull/444 which enables
OpenStack Cloud Controller Manager to respect the `internal-network-name` in
cloud-config file which ensures that InternalIP remains stable.
Story: 2005333
Task: 30271
Change-Id: I9e3ad459dd05753b53cb4ce75ee3aed649fef196
The Kubernetes Helm repository includes in its stable distribution
a prometheus-operator Chart.
This stable/prometheus-operator chart can be used to install all the
dependencies and some default configurations to use prometheus.
The installed extra charts are:
* stable/prometheus-node-exporter (data scraping)
* stable/prometheus (prometheus and alertmanager server)
* stable/grafana (visualization dashboard)
* stable/prometheus-operator (supervision and simple configuration)
The prometheus-operator is installed by using the label
monitoring_enabled=True. Also, the label grafana_admin_passwd can be
used to set the admin password for access to the grafana dashboard.
This patch allows the maintenance work for prometheus monitoring to
be handed over to the kubernetes/helm community.
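For example, roughly (the password is a placeholder):
  openstack coe cluster template create k8s-prom --coe kubernetes \
    --image fedora-atomic-latest --external-network public \
    --labels monitoring_enabled=true,grafana_admin_passwd=<password>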
Task: 28544
Story: 2004623
depends_on: I99d3a78085ba10030200f12bbfe58a72964e2326
Change-Id: I80d590785bf30f9d634debeaf51c0d4cce0aeb93
Signed-off-by: Diogo Guerra <dy090.guerra@gmail.com>
Deploying Node Problem Detector to all nodes to detect problems which
can be leveraged by auto healing. This is the first step of enabling
the auto healing feature.
Task: 29886
Story: 2004782
Change-Id: I1b6075025c5f369821b4136783e68b16535dc6ef
Similar to calico, deploy flannel as a DS.
Flannel can use the kubernetes API to store
data, so it doesn't need to contact the etcd
server directly anymore.
This patch drops two relatively large files for
flannel's config, flannel-config-service.sh and
write-flannel-config.sh. All required config is
in the manifests.
Additional options to the controller manager:
--allocate-node-cidrs=true and --cluster-cidr.
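As a sketch, the resulting controller manager invocation carries flags
along these lines (the CIDR value is illustrative):
  kube-controller-manager --allocate-node-cidrs=true \
    --cluster-cidr=10.100.0.0/16 ...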
Change-Id: I4f1129e155e2602299394b5866165260f4ea0df8
story: 2002751
task: 24870
Add enable_tiller label to install tiller in k8s_fedora_atomic
clusters. Defaults to false.
Add tiller_tag label to select the version of tiller. If the
tag is not set the tag that matches the helm client version in
the heat-agent will be picked. The tiller image can be stored
in a private registry and the cluster can pull it using the
container_infra_prefix label.
Install tiller securely using a helper container.
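A usage sketch (the tag and other values are placeholders):
  openstack coe cluster template create k8s-tiller --coe kubernetes \
    --image fedora-atomic-latest --external-network public \
    --labels enable_tiller=true,tiller_tag=<tag>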
TODO:
* add instructions on how RBAC is designed
https://docs.helm.sh/using_helm/#example-deploy-tiller-in-a-namespace-restricted-to-deploying-resources-in-another-namespace
* add docs on how to install addon in the cluster using this tiller
* how users can get the creds to talk to tiller
NOTE:
The main goal of this tiller is internal usage!
Users can still deploy other tillers in other namespaces.
story: 2003902
task: 26780
Change-Id: I99d3a78085ba10030200f12bbfe58a72964e2326
Signed-off-by: dioguerra <dy090.guerra@gmail.com>
- Add "octavia" as one of the "ingress_controller" options.
- Add label "octavia_ingress_controller_tag".
- Use external network ID in the heat templates.
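For example (the tag and other values are placeholders):
  openstack coe cluster template create k8s-octavia --coe kubernetes \
    --image fedora-atomic-latest --external-network public \
    --labels ingress_controller=octavia,octavia_ingress_controller_tag=<tag>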
Story: 2004838
Change-Id: I7d889a054cd5feb2eeef523b20607a6c7630d777
Now cloud-provider-openstack of Kubernetes has a webhook to support
Keystone authorization and authentication. With this feature, users
can use a new label 'keystone-auth-enabled' to enable keystone
authN and authZ.
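A usage sketch, using the label spelling given above (other values
are placeholders):
  openstack coe cluster template create k8s-keystone --coe kubernetes \
    --image fedora-atomic-latest --external-network public \
    --labels keystone-auth-enabled=true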
DocImpact
Task: 21637
Story: 1755770
Change-Id: I3d21ad8f55c0d7308a302f62db9e9af147a604f8
* Use the external cloud-provider [0]
* Label master nodes
* Make the script that deploys the cloud-provider and clusterroles
for the apiserver a SoftwareDeployment
* Rename kube_openstack_config to cloud-config;
for cinder to work, the kubelet expects the cloud config with exactly
this name. Keep a copy of kube_openstack_config for backwards
compatibility.
Change-Id: Ife5558f1db4e581b64cc4a8ffead151f7b405702
Task: 22361
Story: 2002652
Co-Authored-By: Spyros Trigazis <spyridon.trigazis@cern.ch>
- Start workers as soon as the master VM is created, rather than
waiting for all the services to be ready.
- Move all the SoftwareDeployment outside of kubemaster stack.
- Tweak the scripts in SoftwareDeployment so that they can be combined
into a single script.
Story: 2004573
Task: 28347
Change-Id: Ie48861253615c8f60b34a2c1e9ad6b91d3ae685e
Co-Authored-By: Lingxian Kong <anlin.kong@gmail.com>
To upgrade a cluster, we need to be able to set image tags, so this
change adds two labels for the corresponding containers.
Task: 23314
Story: 2003171
Change-Id: I4cd0270a69fb889c59bdb28966821adb11fd0292
Add 'cloud_provider_enabled' label for the k8s_fedora_atomic
driver. Defaults to true. For specific kubernetes versions if
'cinder' is selected as a 'volume_driver', it is implied that
the cloud provider will be enabled since they are combined.
The motivation for this change is that in environments with
high load to the OpenStack APIs, users might want to disable
the cloud provider.
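For example (image and network names are placeholders):
  openstack coe cluster template create k8s-noccm --coe kubernetes \
    --image fedora-atomic-latest --external-network public \
    --labels cloud_provider_enabled=false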
story: 1775358
task: 1775358
Change-Id: I2920f699654af1f4ba45644ab60a04a3f70918fe
Kubernetes should initialize its Global configuration for the OpenStack
provider with the region specified in the Heat stack.
This will allow users to create Magnum Kubernetes clusters in a
multiregional OpenStack installation with different public endpoints
for services.
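The provider's Global configuration then carries the region roughly
like this (the region name is illustrative):
  [Global]
  region = RegionOne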
Task: 22576
Story: 2002728
Change-Id: I66820369b889e16445cad7a48cd0f458aae1c41f
Multi master deployments for k8s driver use different service account
keys for each api/controller manager server which leads to 401 errors
for service accounts. This patch creates a signed cert and private
key for the k8s service account keys explicitly, dedicated to the k8s
cluster, to avoid the inconsistent-keys issue.
Task: 21653
Story: 1766546
Change-Id: I61547405f866d3c5a84da63de66724b55af1066a
When creating a multi-master cluster, all master nodes will attempt to
create kubernetes resources in the cluster at this same time, like
coredns, the dashboard, calico etc. This race condition shouldn't be
a problem when doing declarative calls instead of imperative (kubectl
apply instead of create). However, due to [1], kubectl fails to apply
the changes and the deployment scripts fail, causing cluster creation
to fail in the case of Heat SoftwareDeployments. This patch passes the
ResourceGroup index of every master so that resource creation will be
attempted only from the first master node.
[1] https://github.com/kubernetes/kubernetes/issues/44165
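As an illustrative sketch only (variable name and manifest path are
hypothetical, not the actual patch), the guard amounts to:
  if [ "$MASTER_INDEX" = "0" ]; then
      kubectl apply -f /srv/magnum/kubernetes/manifests/
  fi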
Task: 21673
Story: 1775759
Change-Id: I83f78022481aeef945334c37ac6c812bba9791fd
This patch allows specification of Cgroup driver for Kubelet service.
The necessity of this patch was realised after upgrading Docker to the
new community edition (17.3+), which defaults to the `cgroupfs` Cgroup
driver, whereas Fedora Atomic (version 27) ships Docker 1.13. The
Cgroup drivers need to be identical for the two services, Docker and
Kubelet, to be able to work together.
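On the kubelet side the configured flag looks like this (the driver
value is illustrative):
  kubelet --cgroup-driver=systemd ...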
Story: 2002533
Task: 22079
Change-Id: Ia4b38a63ede59e18c8edb01e93acbb66f1e0b0e4
In an OpenStack deployment with the Octavia service enabled, the
octavia service should be used not only for master node high
availability, but also for the k8s LoadBalancer type service
implementation.
Change-Id: Ib61f59507510253794a4780a91e49aa6682c8039
Closes-Bug: #1770133
To allow the api server to access pods, we need
flannel to be running on the master node.
* Run flannel on the master node in a system
container.
Change-Id: Ic0996ba36e335e970f3d2255840b24a8b4f738b8
Closes-Bug: #1757936
Define a set of new labels to pass additional options to the kubernetes
daemons - kubelet_options, kubeapi_options, kubescheduler_options,
kubecontroller_options, kubeproxy_options.
In all cases the default value is "", meaning no extra options are
passed to the daemons.
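For example (option values are illustrative):
  openstack coe cluster template create k8s-tuned --coe kubernetes \
    --image fedora-atomic-latest --external-network public \
    --labels kubelet_options="--v=2",kubeproxy_options="--v=2"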
Change-Id: Idabe33b1365c7530edc53d1a81dee3c857a4ea47
Closes-Bug: #1701223