To upgrade a cluster we need to be able to set image tags,
so this change adds labels for the corresponding containers.
Task: 23314
Story: 2003171
Change-Id: I4cd0270a69fb889c59bdb28966821adb11fd0292
Add kubelet on the master nodes. This work was
already done for calico; this patch applies the
same config when flannel is used as well.
story: 2003521
task: 24797
Change-Id: Id33fb59ef23da740712d9a9b7ec4205bd6579b35
1. Pods with host networking cannot reach coredns or any svc, or resolve
their own hostname.
2. If webhooks are deployed in the cluster, the apiserver needs to
contact them, which means kube-proxy is required on the master nodes
with the cluster-cidr set.
Change-Id: Icb8e7c3b8c75a3ab087c818c8580c0c8a9111d30
story: 2003460
task: 24719
When we create a cluster and pass the ca.key in a software deployment,
we must ensure that the apiserver starts before calico, the dashboard
etc., which require the api to return ok. [0]
The heat agent processes the deployments serially, so if coredns
arrives first in the agent, it will wait forever for the coredns script
to complete.
Putting the cert_manager_api deployment first solves the issue.
[0] curl http://127.0.0.1:8080/healthz
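For illustration, a minimal wait on the check in [0] could look like
this (a sketch, not the exact deployment script):

    until curl -sf http://127.0.0.1:8080/healthz; do
        echo "waiting for the apiserver to report ok"
        sleep 5
    done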
Change-Id: I031ab34141045dde171bcf6206e227fa7eb5885d
story: 2003434
task: 24630
This is part of the fixes for k8s v1.11.1 that we have been doing
recently. When testing k8s v1.11.1 we found some small but annoying
issues:
1. The systemd cgroup-driver does not work well with Fedora Atomic, so
we are going to use cgroupfs as the default cgroup-driver.
2. The $ char needs to be escaped in wc-notify-master.sh (see the
sketch after this list).
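A hedged sketch of both fixes (the exact script wiring lives in the
patch; only the flag and file names above are from this change):

    kubelet --cgroup-driver=cgroupfs ...   # issue 1: default to cgroupfs
    # issue 2: in wc-notify-master.sh a literal $ must be written as \$
    echo "notified with status \$1"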
Task: 23223
Story: 2003103
Change-Id: I995f5b82abadfdb7f78f7c098ac7a7f1e5c34fd3
Add a 'cloud_provider_enabled' label for the k8s_fedora_atomic
driver. Defaults to true. For specific kubernetes versions, if
'cinder' is selected as the 'volume_driver', the cloud provider
is implicitly enabled, since the two are coupled.
The motivation for this change is that in environments with
high load on the OpenStack APIs, users might want to disable
the cloud provider.
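Example usage (CLI sketch; other template arguments elided):

    openstack coe cluster template create ... \
        --labels cloud_provider_enabled=false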
story: 1775358
task: 1775358
Change-Id: I2920f699654af1f4ba45644ab60a04a3f70918fe
Kubernetes should initialize the Global configuration of its OpenStack
provider with the region specified in the Heat stack.
This allows users to create Magnum Kubernetes clusters in
multi-regional OpenStack installations with different public endpoints
for services.
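A minimal sketch of the resulting provider configuration (file path and
values are illustrative):

    # /etc/kubernetes/cloud-config
    [Global]
    auth-url=http://keystone.example.com:5000/v3
    region=RegionOne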
Task: 22576
Story: 2002728
Change-Id: I66820369b889e16445cad7a48cd0f458aae1c41f
Scripts are the core of Magnum for COE deployment. To be more
clear and consistent, two changes are proposed in this patch:
1. Rename the network related scripts to xxx-flannel-xxx, given they
are all for flannel and we now have a calico driver.
2. Add .sh to some scripts to be consistent with the others.
Change-Id: I97f3e53b4b43648a4896193fb4ce469dbf42c611
The service account private key needs to be hidden for security
reasons. This patch fixes that.
Task: 22886
Story: 1766546
Change-Id: I16929c6de2c442865b2978ae6c18c22b8146f645
Scripts are the core of Magnum for COE deployment. To be more
clear and consistent, two changes are proposed in this patch:
1. Rename the network related scripts to xxx-flannel-xxx, given they
are all for flannel and we now have a calico driver.
2. Add .sh to some scripts to be consistent with the others.
Change-Id: I1a8dfe21d4ff0c58f7f52ebea05c9b22dff16bf0
Multi-master deployments for the k8s driver use different service
account keys on each api/controller-manager server, which leads to 401
errors for service accounts. This patch creates a signed cert and
private key for the k8s service account keys explicitly, dedicated to
the cluster, to avoid the inconsistent-keys issue.
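A hedged sketch of one shared service-account keypair (standard openssl
invocations; the actual patch signs the cert through Magnum's CA):

    openssl genrsa -out service_account.key 2048
    openssl rsa -in service_account.key -pubout -out service_account.pub

Every master then points the apiserver's --service-account-key-file and
the controller manager's --service-account-private-key-file at the same
key.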
Task: 21653
Story: 1766546
Change-Id: I61547405f866d3c5a84da63de66724b55af1066a
When creating a multi-master cluster, all master nodes attempt to
create kubernetes resources in the cluster at the same time, like
coredns, the dashboard, calico etc. This race condition shouldn't be
a problem when doing declarative calls instead of imperative ones
(kubectl apply instead of create). However, due to [1], kubectl fails
to apply the changes, the deployment scripts fail, and cluster creation
fails in the case of Heat SoftwareDeployments. This patch passes the
ResourceGroup index of every master so that resource creation is
attempted only from the first master node (see the sketch below).
[1] https://github.com/kubernetes/kubernetes/issues/44165
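A minimal sketch of the guard (variable and path are hypothetical; the
index value comes from the ResourceGroup):

    if [ "${MASTER_INDEX}" = "0" ]; then
        kubectl apply -f /srv/magnum/kubernetes/manifests/
    fi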
Task: 21673
Story: 1775759
Change-Id: I83f78022481aeef945334c37ac6c812bba9791fd
This patch allows specification of the Cgroup driver for the Kubelet
service.
The necessity of this patch was realised after upgrading Docker to the
new community edition (17.03+), which defaults to the `cgroupfs` Cgroup
driver, whereas Fedora Atomic (version 27) comes with Docker 1.13. The
Cgroup drivers need to be identical for the two services, Docker and
Kubelet, to be able to work together.
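One way to verify the two drivers agree (standard commands; the flag
value is only an example):

    docker info 2>/dev/null | grep -i 'cgroup driver'   # e.g. Cgroup Driver: cgroupfs
    kubelet --cgroup-driver=cgroupfs ...                # must match docker's value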
Story: 2002533
Task: 22079
Change-Id: Ia4b38a63ede59e18c8edb01e93acbb66f1e0b0e4
In OpenStack deployments with the Octavia service enabled, Octavia
should be used not only for master node high availability, but also to
implement k8s LoadBalancer type services.
Change-Id: Ib61f59507510253794a4780a91e49aa6682c8039
Closes-Bug: #1770133
After adding the autoscaler for coredns, the limit
for user_data was reached again. Make the coredns
config a SoftwareDeployment.
Change-Id: I0a9852e9293842e859947acf0c4b6da20394436a
Closes-Bug: #1757554
When the backend of the flanneld service is vxlan (it listens on UDP
port 8472), Magnum needs to open that port on the master.
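Expressed as a CLI sketch (the patch adds the equivalent rule in the
Heat templates; the security group name is illustrative):

    openstack security group rule create --protocol udp --dst-port 8472 <master-secgroup>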
Closes-Bug: #1767546
Change-Id: Iac5beac5a90d9f81a0cd9f481531677a710608a8
By the current design, pods under kube-system run on minion nodes. And
given that we are not running kubelet on the master nodes, calico-node
is not running on the k8s master node. As a result, kubectl proxy is
not working to access the dashboard. It has been confirmed with the
calico team that the calico-node container must be running on the
master node if users want to use kubectl proxy, see [1]. So the
solution is to enable kubelet on the master but disallow other pods
from being scheduled on the master with taints/tolerations.
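A hedged sketch of that approach (taint key and value are illustrative,
not necessarily what the patch uses):

    kubectl taint nodes <master-node> dedicated=master:NoSchedule
    # calico-node's DaemonSet then carries a matching toleration:
    #   tolerations:
    #   - key: dedicated
    #     value: master
    #     effect: NoSchedule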
Besides, this patch includes another fix for running calico on
Fedora Atomic. Because Fedora Atomic uses NetworkManager, it
manipulates the routing table for interfaces in the default network
namespace, where Calico veth pairs are anchored for connections to
containers. This can interfere with the Calico agent's ability to
route correctly. Please see more information about this at [2]; a
sketch of the documented workaround follows the references.
[1] https://docs.projectcalico.org/v3.0/getting-started/kubernetes/
installation/integration#about-the-calico-components
[2] https://docs.projectcalico.org/master/usage/troubleshooting/
#configure-networkmanager
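The workaround documented in [2] is to tell NetworkManager to ignore
Calico's interfaces, e.g.:

    # /etc/NetworkManager/conf.d/calico.conf
    [keyfile]
    unmanaged-devices=interface-name:cali*;interface-name:tunl*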
Closes-Bug: #1751978
Change-Id: Iacd964806a28b3ca6ba3e037c60060f0957d44aa
To allow the api server to access pods, we need
flannel to be running on the master node.
* Run flannel on the master node in a system
container.
Change-Id: Ic0996ba36e335e970f3d2255840b24a8b4f738b8
Closes-Bug: #1757936
Define a set of new labels to pass additional options to the kubernetes
daemons - kubelet_options, kubeapi_options, kubescheduler_options,
kubecontroller_options, kubeproxy_options.
In all cases the default value is "", meaning no extra options are
passed to the daemons.
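Example usage (CLI sketch; the option values are illustrative, other
template arguments elided):

    openstack coe cluster template create ... \
        --labels kubelet_options="--v=4",kubeproxy_options="--v=4"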
Change-Id: Idabe33b1365c7530edc53d1a81dee3c857a4ea47
Closes-Bug: #1701223
Add ingress controller configuration and backend to kubernetes clusters.
A new label 'ingress_controller' defines which backend should serve
ingress, with traefik added as the only option for now.
It is deployed as a DaemonSet, with instances on all nodes that carry a
certain role. This role is set via an additional cluster label,
'ingress_controller_role', with a default value of 'ingress'.
For now no node is automatically assigned this role; users or operators
have to do it manually after cluster creation.
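For example, to make a node eligible for the ingress DaemonSet (label
key assumed to match the default role):

    kubectl label node <node-name> role=ingress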
Change-Id: I5175cf91f37e2988dc3d33042558d994810842f3
Closes-Bug: #1738808
In Fedora Atomic 27, etcd and flanneld are removed from the base image.
Install them as system containers (see the sketch after this list).
* update the docker-storage configuration
* add etcd and flannel tags as labels
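A hedged sketch of such an installation (image name is illustrative;
the tag comes from the new label):

    atomic install --system --storage ostree --name=etcd \
        docker.io/openstackmagnum/etcd:${ETCD_TAG}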
Change-Id: I2103c7c3d50f4b68ddc11abff72bc9e3f22839f3
Closes-Bug: #1735381
Currently the default k8s version in Magnum is v1.7.4, but based on the
deprecation policy of k8s it will be deprecated in March 2018, see
https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/
So it would be nice to change the default k8s version to the latest
one.
Closes-Bug: #1750549
Change-Id: I053e50ac879b031c8438a2587a99de44e0360c47
Add a new label 'cert_manager_api' to kubernetes clusters, controlling
the enabling/disabling of the kubernetes certificate manager api.
The same cluster cert/key pair is used by this api. The heat agent is
used to install the key on the master node(s), as this is required for
kubernetes to later sign new certificate requests.
The master template init order is changed so the heat agent is launched
before the services are enabled - the controller manager requires the
CA key to be locally available before being launched.
Change-Id: Ibf85147316e3a194d8a3f92cbb4ae9ce8e16c98f
Partial-Bug: #1734318
Due to several small connected patches for the
fedora atomic driver, this patch squashes 4 smaller patches.
Patch 1:
k8s: Do not start kubelet and kube-proxy on master
Patch [1] misses the removal of kubelet and kube-proxy from
enable-services-master.sh, and therefore they are started if they
exist in the image, or the script fails.
https://review.openstack.org/#/c/533593/
Closes-Bug: #1726482
Patch 2:
k8s: Set require-kubeconfig when needed
From kubernetes 1.8 [1] --require-kubeconfig is deprecated, and
in kubernetes 1.9 it is removed.
Add --require-kubeconfig only for k8s <= 1.8 (version-gate sketch
below).
[1] https://github.com/kubernetes/kubernetes/issues/36745
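A hedged sketch of the version gate (assumes a 1.x version string; the
patch's exact comparison may differ):

    KUBE_VERSION=$(kubelet --version | awk '{print $2}')   # e.g. v1.8.4
    KUBE_MINOR=$(echo "${KUBE_VERSION#v}" | cut -d. -f2)
    if [ "${KUBE_MINOR}" -le 8 ]; then
        KUBELET_ARGS="${KUBELET_ARGS} --require-kubeconfig"
    fi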
Closes-Bug: #1718926
https://review.openstack.org/#/c/534309/
Patch 3:
k8s_fedora: Add RBAC configuration
* Make certificates and kubeconfigs compatible
with NodeAuthorizer [1].
* Add CoreDNS roles and rolebindings.
* Create the system:kube-apiserver-to-kubelet ClusterRole.
* Bind the system:kube-apiserver-to-kubelet ClusterRole to
the kubernetes user (see the CLI sketch below).
* Remove creation of the kube-system namespace; it is created
by default.
* Update client cert generation in the conductor to match
kubernetes' requirements.
* Add --insecure-bind-address=127.0.0.1 to work on
multi-master too. The controller manager on each
node needs to contact the apiserver (on the same node)
on 127.0.0.1:8080
[1] https://kubernetes.io/docs/admin/authorization/node/
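The binding from the list above is equivalent to this CLI sketch (the
patch ships it as a manifest):

    kubectl create clusterrolebinding kube-apiserver-to-kubelet \
        --clusterrole=system:kube-apiserver-to-kubelet \
        --user=kubernetes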
Closes-Bug: #1742420
Depends-On: If43c3d0a0d83c42ff1fceffe4bcc333b31dbdaab
https://review.openstack.org/#/c/527103/
Patch 4:
k8s_fedora: Update coredns config to pass e2e
To pass the e2e conformance tests, coredns needs to
be configured with its pods option set to verified;
otherwise, pods won't be resolvable [1].
[1] https://github.com/coredns/coredns/tree/master/plugin/kubernetes
https://review.openstack.org/#/c/528566/
Closes-Bug: #1738633
Change-Id: Ibd5245ca0f5a11e1d67a2514cebb2ffe8aa5e7de
Add a new label 'availability_zone' allowing users to specify the AZ
the nodes should be deployed in. Only one AZ can be passed for this
first implementation.
Change-Id: I9e55d7631191fffa6cc6b9bebbeb4faf2497815b
Partially-Implements: blueprint magnum-availability-zones
Currently there is no guarantee that all nodes of a cluster are
created on different compute hosts. It would be nice if we could create
a server group with an anti-affinity policy to get better HA for the
cluster. This patch creates a server group for the master and minion
nodes with the soft-anti-affinity policy by default (CLI sketch below).
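The equivalent as a standalone CLI sketch (group name illustrative;
soft-anti-affinity needs compute API microversion 2.15+):

    openstack --os-compute-api-version 2.15 server group create \
        --policy soft-anti-affinity <cluster-name>-nodes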
Closes-Bug: #1737802
Change-Id: Icc7a73ef55296a58bf00719ca4d1cdcc304fab86
In the drivers section of magnum.conf, add openstack_ca_file.
This file is expected to be a CA certificate or CA bundle
which will be passed to every node and installed
into the host's CA bundle.
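Example magnum.conf snippet (path illustrative):

    [drivers]
    openstack_ca_file = /etc/pki/ca-trust/source/anchors/openstack-ca.pem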
Update devstack plugin to use the ssl bundle if tls-proxy is
enabled.
Install the CA for drivers:
k8s_coreos_v1
k8s_fedora_atomic_v1
k8s_fedora_ironic_v1
mesos_ubuntu_v1
swarm_fedora_atomic_v1
swarm_fedora_atomic_v2
Add doc in troubleshooting-guide.
Add release notes.
Closes-Bug: #1580704
Partially-Implements: blueprint heat-agent
Change-Id: Id48fbea187da667a5e7334694c3ec17c8e2504db
Use the heat-container-agent from a system container.
This means that the docker daemon can be started later.
Pass the following software configurations as software deployments
via the heat-agent:
* prometheus-monitoring
** pin prometheus to v1.8.2, since its config is not compatible
with 2.0.0
Add heat-container-agent container image.
Implements: blueprint heat-agent
Related-Bug: #1680900
Change-Id: I084b7fe51eddb7b36c74f9fe76cda37e8b48f646
Allow any value to be passed in the docker_storage_driver field by
turning it into a StringField (it was an EnumField), and remove the
constraints limiting the values to 'devicemapper' and 'overlay'.
Change the docker storage setup to a generic setup for all drivers,
with the exception of 'devicemapper', which keeps its own specific
storage config function. For all the others, do the same as we already
did for overlay (with two cases, for usage of a cinder volume or not)
and simply set the storage driver in the docker configuration to the
value provided in the cluster template (example below).
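For example, for the overlay2 driver the generic setup reduces to
something like this (file layout as on Fedora Atomic; the driver value
comes from the cluster template):

    # /etc/sysconfig/docker-storage
    DOCKER_STORAGE_OPTIONS="--storage-driver overlay2"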
Change-Id: I9aa8f232ce64ece4d439c0a476f463820a499617
Closes-Bug: #1722522
Added a configuration parameter, verify_ca, to magnum.conf, with a
default value of True. This parameter is passed to the heat templates
to indicate whether the cluster nodes should validate the Certificate
Authority when making requests to the OpenStack APIs (Keystone, Magnum,
Heat). It can be set to False to disable CA validation (example below).
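Example snippet (assuming the option sits in the drivers section
alongside the other driver options):

    [drivers]
    verify_ca = False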
Co-Authored-By: Vijendar Komalla <vijendar.komalla@rackspace.com>
Change-Id: Iab02cb1338b811dac0c147378dbd0e63c83f0413
Partial-Bug: #1663757