This changes the existing cluster APIs and the cluster conductor to
take nodegroups into consideration:
* create: now creates the default nodegroups for the cluster
* update: updates the default nodegroups of the cluster
* delete: also deletes the nodegroups that belong to the cluster
* cluster_resize: takes into account the nodegroup provided by the API
story: 2005266
Change-Id: I5478c83ca316f8f09625607d5ae9d9f3c02eb65a
In the Rocky release, the k8s workers security group was wide open, but
in the Stein release it is more restrictive, which prevents access to
the Kubernetes dashboard (and other services) via the command:
$ kubectl proxy
This patch fixes it by allowing traffic from the master security group
to the workers security group.
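For illustration, an equivalent rule could be created manually with the
openstack CLI; the security group names are placeholders for the ones
generated by the cluster's Heat stack:
$ openstack security group rule create \
    --ingress --ethertype IPv4 \
    --remote-group <master-security-group> \
    <worker-security-group>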
Co-Authored-By: Feilong Wang <flwang@catalyst.net.nz>
Task: 30171
Story: 2005294
Change-Id: I546cd7324b87b267e945477c78539ea80534538f
This is a mechanically generated change to replace openstack.org
git:// URLs with https:// equivalents.
This is in aid of a planned future move of the git hosting
infrastructure to a self-hosted instance of gitea (https://gitea.io),
which does not support the git wire protocol at this stage.
This update should result in no functional change.
For more information see the thread at
http://lists.openstack.org/pipermail/openstack-discuss/2019-March/003825.html
Change-Id: Ie288c147a3cbdd19abd257bf14972c316db6d67c
Add file to the reno documentation build to show release notes for
stable/stein.
Use the pbr instruction to increment the minor version number
automatically so that master versions are higher than the versions on
stable/stein.
Change-Id: Ib327c9320ec306098769040df8188e8968913ef4
Sem-Ver: feature
The Kubernetes Helm repository includes a prometheus-operator chart in
its stable distribution.
This stable/prometheus-operator chart can be used to install all the
dependencies and some default configurations for using prometheus.
The installed extra charts are:
* stable/prometheus-node-exporter (data scraping)
* stable/prometheus (prometheus and alertmanager server)
* stable/grafana (visualization dashboard)
* stable/prometheus-operator (supervision and simple configuration)
The prometheus-operator is installed by using the label
monitoring_enabled=True. Also, the label grafana_admin_passwd can be
used to set the admin password for access to the grafana dashboard.
This patch allows the maintenance of prometheus monitoring to be
handed over to the kubernetes/helm team.
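For example, a cluster with the monitoring stack enabled could be
created along these lines (template name and password are
placeholders):
$ openstack coe cluster create monitored-cluster \
    --cluster-template <k8s-template> \
    --node-count 3 \
    --labels monitoring_enabled=True,grafana_admin_passwd=<passwd>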
Task: 28544
Story: 2004623
Depends-On: I99d3a78085ba10030200f12bbfe58a72964e2326
Change-Id: I80d590785bf30f9d634debeaf51c0d4cce0aeb93
Signed-off-by: Diogo Guerra <dy090.guerra@gmail.com>
Openstack-cloud-controller-manager restarts several times during
cluster creation.
This happens because cloud-controller-manager starts running before the
secrets it needs exist in kubernetes. Cloud-controller-manager lists
secrets; if a secret exists, it uses it and moves on, but if the secret
doesn't exist, it starts a watch until it does. As this is not allowed,
the pod fails.
This is triggered by Issue
https://github.com/kubernetes/cloud-provider-openstack/issues/545
Story: 2005270
Change-Id: If8f34dc45b3b8a76e3d561ed41b4d0a783ceecb5
Signed-off-by: Diogo Guerra <dy090.guerra@gmail.com>
- Never allocate a floating IP for the etcd service.
- Introduce a new label `master_lb_floating_ip_enabled` which controls
  whether Magnum allocates a floating IP for the master load balancer.
  This label only takes effect when `master_lb_enabled` is set. The
  default value is the same as `floating_ip_enabled` (sketched below).
- The `floating_ip_enabled` property now only controls whether Magnum
  should allocate the floating IPs for the master and worker nodes.
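As a sketch, a template that keeps floating IPs on the nodes but not
on the master load balancer could look like this (names are
placeholders):
$ openstack coe cluster template create <k8s-lb-template> \
    --coe kubernetes --image <fedora-atomic-image> \
    --external-network <public-net> \
    --master-lb-enabled --floating-ip-enabled \
    --labels master_lb_floating_ip_enabled=false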
Change-Id: I0a232406deaf112b0cb9e445735d7b49206c676d
Story: #2005153
Task: #29868
An OpenStack driver for the Kubernetes Cluster Autoscaler is being
proposed to support autoscaling when running a k8s cluster on top of
OpenStack. However, there is currently no way in Magnum to let an
external consumer control which node will be removed. The alternative
is calling the Heat API directly, but that is obviously not the best
solution and it confuses the k8s community. So this patch adds a new
API:
POST <ClusterID>/actions/resize
And the post body will be:
{
    "node_count": 3,
    "nodes_to_remove": ["dd9cc5ed-3a2b-11e9-9233-fa163e46bcc2"],
    "nodegroup": "production_group"
}
The API works in a declarative way. For example, if there are 3 nodes
in the cluster now, the user can send an API request like the one
above. Magnum will call Heat to remove the node
dd9cc5ed-3a2b-11e9-9233-fa163e46bcc2 first, then bring the node count
back to 3 again.
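For illustration, the new action can be invoked directly over HTTP;
the service endpoint and token below are assumptions:
$ curl -X POST "$MAGNUM_URL/v1/clusters/<ClusterID>/actions/resize" \
    -H "X-Auth-Token: $TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"node_count": 3,
         "nodes_to_remove": ["dd9cc5ed-3a2b-11e9-9233-fa163e46bcc2"],
         "nodegroup": "production_group"}'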
Task: 29563
Story: 2005052
Change-Id: I7e36ce82c3f442976cc498153950b19c56a1759f
Deploying Node Problem Detector to all nodes to detect problems which
can be leveraged by auto healing. This is the first step toward
enabling the auto healing feature.
Task: 29886
Story: 2004782
Change-Id: I1b6075025c5f369821b4136783e68b16535dc6ef
We currently run nested virtualization only on vexxhost. Due to a
kernel change, all functional jobs are failing.
Change-Id: I9ab45da36dbc5618587b4795658b4f4bb264f2c8
Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>
The scripts run by cloud-init for the master and minion nodes currently
write proxy environment variables into /etc/bashrc when they are
defined. These variables are only introduced into the running
environment when a new bash shell is started. The /bin/sh used by the
fragment scripts ignores /etc/bashrc, so the new shells invoked per
fragment do not have the http proxy variables present. This means that
master/minion node deployment fails when behind an http proxy.
This patch adds explicit exports for HTTP_PROXY and HTTPS_PROXY when
those variables are defined and not empty.
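A minimal sketch of the guarded exports added to the fragment scripts;
the exact wording in the templates may differ:
if [ -n "${HTTP_PROXY}" ]; then
    export HTTP_PROXY
fi
if [ -n "${HTTPS_PROXY}" ]; then
    export HTTPS_PROXY
fi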
Task: 29863
Change-Id: Id05c90d5bf99d720ae6002b38d3291e364e1e0c4
Similar to calico, deploy flannel as a DaemonSet.
Flannel can use the kubernetes API to store data, so it doesn't need
to contact the etcd server directly anymore.
This patch drops two relatively large files for flannel's config,
flannel-config-service.sh and write-flannel-config.sh. All required
config is in the manifests.
Additional options are passed to the controller manager:
--allocate-node-cidrs=true and --cluster-cidr.
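For example, in the controller manager's arguments (the CIDR value is
illustrative only):
kube-controller-manager ... \
    --allocate-node-cidrs=true \
    --cluster-cidr=10.100.0.0/16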
Change-Id: I4f1129e155e2602299394b5866165260f4ea0df8
story: 2002751
task: 24870
The commands used by constraints need at least tox 2.0.
Update the requirement to reflect reality, which should help with
running constraints targets locally.
Change-Id: Iece749b90ec90bec1f5324bc351878e6252720ed
Fixes the problem with Mesos cluster creation where the
nodes_affinity_policy was not properly conveyed; it is required
in order to create the corresponding server group in Nova.
Change-Id: Ie8d73247ba95f20e24d6cae27963d18b35f8715a
story: 2005116
The swarm functional job now fails due to a regression caused by
If11ba863a2aa538efe1e3e850084bdd33afd27d2. This patch fixes it.
Task: 29766
Story: 2004195
Change-Id: I830ab66775e0dd57766cdab25d06500d85651dc1
- Fix the indentation in the file.
- Use 'kubectl apply' instead of 'kubectl create' for a more robust
  service restart (example below).
- Do not retry infinitely when the Prometheus datasource has already
  been injected into Grafana.
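For example, re-applying a manifest is idempotent, where 'create'
would fail on an already existing resource (manifest name is a
placeholder):
$ kubectl apply -f <grafana-service>.yaml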
Story: #2005117
Task: #29765
Change-Id: I5857fe62f922d27860946fd318296950834a8797
The scripts included in the Heat kube_cluster_config resource should not exit
if the particular step is skipped.
Change-Id: I2d4cf54631c8ed3a9eb30b3e6c8e1af0007e23d5
Story: #2005109
Task: #29743
All unit tests using FakeLoopingCall raise an IOError if an initial
delay is not specified, because the default initial_delay is -1.
This changes the default initial delay to 0.
story: 2005112
task: 29748
Change-Id: I6cbae0996c2347e25d8be617e4b3fd93f4d9cc95
Defines stricter security group rules for kubernetes worker nodes. The
ports that are open by default: the default port range (30000-32767)
for external service ports; the kubelet healthcheck port; Calico BGP
network ports; flannel overlay network ports. The cluster admin should
manually configure the security group on the nodes where Traefik is
allowed.
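For example, to open Traefik's HTTP port on the worker nodes (port
and group name are placeholders):
$ openstack security group rule create \
    --ingress --protocol tcp --dst-port 80 \
    <worker-security-group>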
Story: #2005082
Task: #29661
Change-Id: Idbc67cb95133d3a4029105e6d4dc92519c816288