Commit Graph

335 Commits (06659759f1213df23f1d456503b1dd2ea8e90963)

Author SHA1 Message Date
Andreas Jaeger ae228bb5cc Update hacking for Python3
The repo is Python 3 now, so update hacking to version 3.0 which
supports Python 3.

Fix problems found.

Update local hacking checks for new flake8.

Remove hacking and friends from lower-constraints, those are not needed
for co-installing.

Change-Id: I926efaef501f190e78da9cab40c1e94203277258
3 years ago
Theodoros Tsioutsias 0ac4db955f ng-13: Support nodegroup upgrade
Adds support for upgrading nodegroups. All non-default nodegroups,
are allowed to be upgraded using the CT set in the cluster. The
only label that gets upgraded for now is kube_tag. All other labels
in the new cluster_template are ignored.

Change-Id: Icade1a70f160d5ec1c0e6f06ee642e29fe9b02ff
4 years ago
Theodoros Tsioutsias 5027e0daf8 ng-8: APIs for nodegroup CRUD operations
This adds the changes needed in the API and conductor level to support
creating updating and deleting nodegroups.

Change-Id: I4ad60994ad6b4cb9cac18129557e1e87e61ae98c
4 years ago
Theodoros Tsioutsias cbe05aa97d ng-6: Add new fields to nodegroup objects
Since each nodegroup will be one independent stack, we have to add
more fields to the table and object in order to track each stack
contained in the cluster. This adds the stack_id, version, status,
status_reason and version fields to the nodegroup object.

Change-Id: I6d36b2d3bc6476efbef6a9f702ffc73cfa0fab8c
4 years ago
Emanuel Andrecut e5eade03dc Add information about the cluster in magnum event notifications
Magnum is sending notifications like cluster create but has no
details regarding the cluster, like cluster UUID. Notifications
from other OpenStack projects contain full detailed information
(e.g. instance UUID in Nova instance create notification).
Detailed notifications are important for other OpenStack
projects like Searchlight or third party projects that cache
information regarding OpenStack objects or have custom actions
running on notification. Caching systems can efficiently update
one single object (e.g. cluster), while without notifications
they need to periodically retrieve object list, which is
inefficient.

Change-Id: I820fbe0659222ba31baf43ca09d2bbb0030ed61f
Story: #2006297
Task: 36009
4 years ago
Spyros Trigazis (strigazi) 9b1bd5da54 Add cluster upgrade to the API
To enable the rolling upgrade ability of Kubernetes Cluster, this
patch is proposing a new API /upgrade to support upgrade the
base operating system of nodes and the version of Kubernetes, even
add-ons running on the k8s cluster:

POST <ClusterID>/actions/upgrade

And the post body will be:

{
    "cluster_template": 'dd9cc5ed-3a2b-11e9-9233-fa163e46bcc2',
    "max_batch_size": 1,
    "nodegroup": "production_group"
}

Co-Authored-By: Feilong Wang <flwang@catalyst.net.nz>

Task: 30168
Story: 2002210

Change-Id: Ia168877778aa0d473383eb06b1c8a16dc06b0576
4 years ago
Theodoros Tsioutsias 3f80cbab06 ng-4: Adapt cluster object
This commit removes the fields node_addresses, master_addresses,
node_count and master_count from the cluster object since this info
will be stored in the nodegroups. At the same time, provides the way
to adapt existing clusters to the new schema.

story: 2005266

Change-Id: Iaf2cef3cc50b956c9b6d7bae13dbb716ae54eaf7
4 years ago
Theodoros Tsioutsias 18c77a288d ng-2: Adapt existing cluster APIs and conductor
This changes the existing cluster APIs and the cluster conductor to
take into consideration nodegroups:

* create: now creates the default nodegroups for the cluster
* update: updates the default nodegroups of the cluster
* delete: deletes also the nodegroups that belong to the cluster
* cluster_resize: takes into account the nodegroup provided by the API

story: 2005266

Change-Id: I5478c83ca316f8f09625607d5ae9d9f3c02eb65a
4 years ago
Zuul 0cd35dbcca Merge "Support <ClusterID>/actions/resize API" 4 years ago
Feilong Wang 15ecdb8033 Support <ClusterID>/actions/resize API
Now an OpenStack driver for Kubernetes Cluster Autoscaler is being
proposed to support autoscaling when running k8s cluster on top of
OpenStack. However, currently there is no way in Magnum to let
the external consumer to control which node will be removed. The
alternative option is calling Heat API directly but obviously it
is not the best solution and it's confusing k8s community. So with
this patch, we're going to add a new API:

POST <ClusterID>/actions/resize

And the post body will be:

{
    "node_count": 3,
    "nodes_to_remove": ["dd9cc5ed-3a2b-11e9-9233-fa163e46bcc2"],
    "nodegroup": "production_group"
}

The API will be working in a declarative way. For example, there
are 3 nodes in the cluser now, user can propose an API request
like above. Magnum will call Heat to remove the node
dd9cc5ed-3a2b-11e9-9233-fa163e46bcc2 firstly, then bring the node
count back to 3 again.

Task: 29563
Story: 2005052

Change-Id: I7e36ce82c3f442976cc498153950b19c56a1759f
4 years ago
Jake Yip ea362b1391 python3 fix: decode binary cert data if encountered
We are writing to files opened with text mode ('w+'), so binary data
will have to be decoded before writing

Task: 29577
Story: 2005057
Change-Id: I034d0230c3022e701111bdc71f0af43da1852c3c
4 years ago
Michal Arbet 54bea06b5a Fix python3 compatibility
Change-Id: Id8a0913dde556a3e59b1ffdb22ca5e2aabd257a2
Closes-Bug: #1803972
4 years ago
Erik Olof Gunnar Andersson f2fd732ce2 Trivial code cleanups
Cleaning up comments and logging to make sure they properly adhere
to Openstack standards.

* Consistently use """ instead of ''' for comments.
* Always lazy-load logging parameters.
* Fixed bad log line in cert_manager.

Change-Id: I547f5dfa61609a899aef9b1470be8d8a6d8e4b81
5 years ago
Kirsten G d9e590bdc6 Cache barbican certs for periodic tasks
Added configuration parameter, temp_cache_dir, to magnum.conf with
default value of "/var/lib/magnum/certificate-cache". This local
directory will hold cached cluster TLS credentials that are generated
during periodic tasks, to reduce load as the number of clusters
increases. If the temp_cache_dir does not exist, the certificates
will be created as tempfiles.

Closes-Bug: #1659545

Change-Id: I8808c4098a7c8d22dbfc841142c9f9c8b976dde1
5 years ago
Zuul 7aa0c0a285 Merge "federation api: api endpoints" 5 years ago
Clenimar Filemon ec950be894 federation api: api endpoints
this commit introduces a new '/federations'
endpoint to Magnum API, as well as its controllers,
entities and conductor handlers.

this corresponds to the first phase of the
federation-api spec. please refer to [1] for more
details.

[1] https://review.openstack.org/#/c/489609/

Change-Id: I662ac2d6ddec07b50712109541486fd26c5d21de
Partially-Implements: blueprint federation-api
5 years ago
Spyros Trigazis 2329cb7fb4 k8s: Fix kubelet, add RBAC and pass e2e tests
Due to a few several small connected patches for the
fedora atomic driver, this patch includes 4 smaller patches.

Patch 1:
k8s: Do not start kubelet and kube-proxy on master

Patch [1], misses the removal of kubelet and kube-proxy from
enable-services-master.sh and therefore they are started if they
exist in the image or the script will fail.

https://review.openstack.org/#/c/533593/
Closes-Bug: #1726482

Patch 2:
k8s: Set require-kubeconfig when needed

From kubernetes 1.8 [1] --require-kubeconfig is deprecated and
in kubernetes 1.9 it is removed.

Add --require-kubeconfig only for k8s <= 1.8.

[1] https://github.com/kubernetes/kubernetes/issues/36745

Closes-Bug: #1718926

https://review.openstack.org/#/c/534309/

Patch 3:
k8s_fedora: Add RBAC configuration

* Make certificates and kubeconfigs compatible
  with NodeAuthorizer [1].
* Add CoreDNS roles and rolebindings.
* Create the system:kube-apiserver-to-kubelet ClusterRole.
* Bind the system:kube-apiserver-to-kubelet ClusterRole to
  the kubernetes user.
* remove creation of kube-system namespaces, it is created
  by default
* update client cert generation in the conductor with
  kubernetes' requirements
* Add --insecure-bind-address=127.0.0.1 to work on
  multi-master too. The controller manager on each
  node needs to contact the apiserver (on the same node)
  on 127.0.0.1:8080

[1] https://kubernetes.io/docs/admin/authorization/node/

Closes-Bug: #1742420
Depends-On: If43c3d0a0d83c42ff1fceffe4bcc333b31dbdaab
https://review.openstack.org/#/c/527103/

Patch 4:
k8s_fedora: Update coredns config to pass e2e

To pass the e2e conformance tests, coredns needs to
be configured with POD-MODE verified. Otherwise, pods
won't be resolvable [1].

[1] https://github.com/coredns/coredns/tree/master/plugin/kubernetes

https://review.openstack.org/#/c/528566/
Closes-Bug: #1738633

Change-Id: Ibd5245ca0f5a11e1d67a2514cebb2ffe8aa5e7de
5 years ago
coldmoment ba8ad5e37f Add a hacking rule for string interpolation at logging
String interpolation should be delayed to be handled
by the logging code, rather than being done at the point
of the logging call.
See the oslo i18n guideline
* https://docs.openstack.org/oslo.i18n/latest/user/guidelines.html#adding-variables-to-log-messages
and
* https://github.com/openstack-dev/hacking/blob/master/hacking/checks/other.py#L39

Change-Id: I8a4f5f896865aebbff88ee894f0081e58cfce9ef
6 years ago
yuanpeng 71d25456d2 Remove log translations
Log messages are no longer being translated. This removes all use of
the _LE, _LI, and _LW translation markers to simplify logging and to
avoid confusion with new contributions.

See:
http://lists.openstack.org/pipermail/openstack-i18n/2016-November/002574.html
http://lists.openstack.org/pipermail/openstack-dev/2017-March/113365.html

Change-Id: If1f4bd2f6be967368f52fb367c5a428d3eb58a9d
Closes-Bug:#1674551
6 years ago
Johannes Grassler e93d82e8b3 Fix CVE-2016-7404
This commit addresses multiple potential vulnerabilities in
Magnum. It makes the following changes:

* Permissions for /etc/sysconfig/heat-params inside Magnum
  created instances are tightened to 0600 (used to be 0755).
* Certificate retrieval is modified to work without the need
  for a Keystone trust.
* The cluster's Keystone trust id is only passed into
  instances for clusters where that is actually needed. This
  prevents the trustee user from consuming the trust in cases
  where it is not needed.
* The configuration setting trust/cluster_user_trust (False by
  default) is introduced. It needs to be explicitely enabled
  by the cloud operator to allow clusters that need the
  trust_id to be passed into instances to work. Without this
  setting, attempts to create such clusters will fail.

Please note, that none of these changes apply to existing
clusters. They will have to be deleted and rebuilt to benefit
from these changes.

Change-Id: I643d408cde0d6e30812cf6429fb7118184793400
6 years ago
Jenkins 4e1ada7914 Merge "Integrate OSProfiler in Magnum" 6 years ago
Jason Dunsmore a65ef7d3c3 Add an API to rotate a cluster CA certificate
This will give admins a way to revoke access to an existing cluster
once a user has been granted access.

Bumped the API microversion to 1.5 for the new endpoint.

Deprecated policy certificate:get in favor of certificate:get_ca for
clarity and consistency.

Depends-On: Ie960464e45445e195e75b91e8d65a4046eb21e93
Implements: blueprint revoke-cluster-cert
Change-Id: Ief28bef3a79f212acf4166e443a96e5419fbb757
6 years ago
Tovin Seven 32d088b2c1 Integrate OSProfiler in Magnum
* Add osprofiler wsgi middleware. This middleware is used for 2 things:
  1) It checks that person who wants to trace is trusted and knows
     secret HMAC key.
  2) It starts tracing in case of proper trace headers
     and adds first wsgi trace point, with info about HTTP request

* Add initialization of osprofiler at start of service
  Currently that includes oslo.messaging notifer instance creation
  to send Ceilometer backend notifications.

* Traces HTTP/RPC/DB API calls

Demo: https://hieulq.github.io/cluster-create-false-new-html.html

Co-Authored-By: Hieu LE <hieulq@vn.fujitsu.com>
Implements: blueprint osprofiler-support-in-magnum
Change-Id: I7d68995aab81d365433950aada078ef1fcd5469b
6 years ago
Danil Golov 9cbb142c2e Add cluster record to db right after API request
This commit changes the incorrect behavior of cluster create workflow.
Now db record with status CREATE_IN_PROGRESS is created right after
related API request.

Change-Id: I11692c4126823d49672ba5172fa45774bf0ce544
Closes-bug: #1640729
7 years ago
Randall Burt 759c1b3b2b Move cluster status updates into driver
This is an alternative implementation to:

https://review.openstack.org/#/c/397961

This version implements an earlier proposal from the
spec that adds a driver method for synchronizing
cluster state. This method is optional so that drivers
that do not wish to leverage the existing periodic
synchronization task can do so in whatever manner
they wish and Magnum will not force them to do anything
unnecessarily.

1. add an update_cluster_status method to the driver
   interface
2. implment update_cluster_status for Heat drivers
   using the existing tested logic
3. Remove cluster status updates from the cluster conductor
   in favor of the periodic sync_cluster_status task - this
   should avoid timeouts and race conditions possible in the
   previous implementation
4. Update the periodic sync_cluster_status method to use
   the driver to update cluster status rather than calling
   Heat directly

Change-Id: Iae0ec7af2542343cc51e85f0efd21086d693e540
Partial-Blueprint: bp-driver-consolodation
7 years ago
Randall Burt 7890725c52 Refactor driver interface (pt 1)
Refactor driver interface to encapsulate the orchestration
strategy. This first patch only refactors the main driver
operations. A follow-on will handle the state synchronization
and removing the poller from the conductor.

1. Make driver interface abstract
2. Move external cluster operations into driver interface
3. Make Heat-based driver abstract and update based on
   driver interface changes
4. Move Heat driver code into its own module
5. Update existing Heat drivers based on interface changes

Change-Id: Icfa72e27dc496862d950ac608885567c911f47f2
Partial-Blueprint: bp-driver-consolodation
7 years ago
murali allada 0d6deae21b Move cluster delete method to driver
This patch moves the cluster deletion logic into driver to conform
with the driver spec.
https://github.com/openstack/magnum/blob/master/doc/source/userguide.rst#cluster-drivers

Change-Id: I8a4f9a94b70feddae4a137942bbb98fa36388676
Partially-Implements: blueprint bay-drivers
7 years ago
Jenkins ba11900215 Merge "Cluster Drivers" 7 years ago
murali allada 104501cfe6 Cluster Drivers
- Dynamically load drivers using stevedore
- Changed the entry points to reference drivers instead of
  template definitions
- Implement Create and update driver operations

Change-Id: I5c3259404c796e1935c872cf3109ffecae3cee02
Partially-Implements: blueprint bay-drivers
7 years ago
Wenzhi Yu 4a7d265aeb Implement mesos cluster smart scale down
We currently allow Magnum to scale down mesos cluster by removing nodes
from the cluster's ResourceGroup by updating the heat stack that created
the cluster. The problem with this approach is that Heat decides which
nodes to delete, and all containers on that node will also be deleted.

The smart cluster scale down feature has been implemented for k8s bays(
for k8s cluster, we'll ask Heat to delete nodes that have NO CONTAINERS
on them).

This patch proposes a similar implementation for a mesos cluster.

Change-Id: I00cda7f35c9db978bdc604cf86603ef58e339256
Implements: blueprint mesos-smart-bay-scale-down
7 years ago
Hieu LE 8f9eeb801a Centralize config option: cluster_heat section
Centralize config option of Cluster Heat section.
Replace oslo_conf cfg to magnum.conf.

Change-Id: I9118eeb17061a0aa26269ea9deaba28e79f28b76
Implements: blueprint centralize-config-magnum
7 years ago
Madhuri Kumari 9f7295475d Add exceptions to cluster db to show failures
After changing create to async operation, all the exceptions goes hidden
and no db entry is created to show the actual failure. One has to go through
the logs to find the actual error. This patch creates a db entry in cluster
table to show the actual error in column 'status_reason'.

Change-Id: Iad6e8bfce7326b34dea04914e4552f87d2796e86
Closes-bug: #1623387
7 years ago
Jaycen Grant 729c2d0ab4 Rename Bay DB, Object, and internal usage to Cluster
This is patch 3 of 3 to change the internal usage of the terms
Bay and BayModel.  This patch updates Bay to Cluster in DB and
Object as well as all the usages.  No functionality should be
changed by this patch, just naming and db updates.

Change-Id: Ife04b0f944ded03ca932d70e09e6766d09cf5d9f
Implements: blueprint rename-bay-to-cluster
7 years ago
Cao Xuan Hoang 33bd24252f Clean imports in code
This patch set modifies lines which are importing objects
instead of modules. As per openstack import guide lines, user should
import modules in a file not objects.

http://docs.openstack.org/developer/hacking/#imports

Closes-Bug: #1620161

Change-Id: I7ec9022a6b1cec36c678a2cec2a1856e70a51c68
7 years ago
Jaycen Grant 0b7c6401dd Rename BayModel DB, Object, and internal usage to ClusterTemplate
This patch is the first of 3 patches to change the internal
usage of the terms Bay and BayModel. This patch updates
BayModel to ClusterTemplate. No functionality should be
changed by this patch, just naming and db updates.

Change-Id: I0803e81be6482962be2878a8ea2c7480f89111ac
Implements: blueprint rename-bay-to-cluster
7 years ago
Jenkins 42abb07835 Merge "Rename bay to cluster in certificate object and references" 7 years ago
Jaycen Grant 8e0de76aff Rename bay to cluster in certificate object and references
This is patch #2 of 3 to rename the term bay to cluster within
the internal references and objects of magnum. This patch changes
all references to the certificate objects bay_uuid field to
cluster_uuid.  Certifcate does not have a db table so no db
changes were made. No functionality is changed by this patch,
just internal naming.

Change-Id: I68a3b87b75b49de43a7855355807b50a4ae695f3
Implements: blueprint rename-bay-to-cluster
7 years ago
Stephen Watson 6ead3e4780 Updates CONF usage from bay to cluster.
Renames bay-related CONF options to their respective cluster names.
Adds release notes for CONF changes.

Change-Id: I7bbe0927e54c1f40a47bfdea448a88b467fef106
Implements: blueprint rename-bay-to-cluster
7 years ago
Jenkins 5258fddf9d Merge "Include version info in bay/cluster show operation" 7 years ago
Vijendar Komalla 50bc376c4d Include version info in bay/cluster show operation
Currently bay-show operation does not return bay/cluster
version information. This change contain changes to return
bay/cluster version and container version info.

Change-Id: Ie12b6583e6d85faa3607f87295c04d72698034a5
Closes-Bug: #1613413
7 years ago
ztetfger 23b2610973 Fix bay status: after bay-delete status is not DELETE_IN_PROGRESS
After bay-delete, the bay status will not change until the bay is
deleted. To change the bay status, bay.save() should be added.

Change-Id: Ib16e9896cdf980c9f9c1f2fc581aa723dfc52ac0
Closes-Bug: #1615891
7 years ago
Jaycen Grant eaddb942fd Rename Bay to Cluster in api
This is the first of several patches to add new Cluster commands
that will replace the Bay terminalogy in Magnum. This patch adds
the new Cluster and ClusterTemplate commands in addition to the
Bay and Baymodel commands.  Additional patches will be created
for client, docs, and additional functional tests.

Change-Id: Ie686281a6f98a1a9931158d2a79eee6ac21ed9a1
Implements: blueprint rename-bay-to-cluster
7 years ago
Jenkins 5372cd92b4 Merge "Rollback bay on update failure" 7 years ago
Wenzhi Yu 63b5c21c8d Rollback bay on update failure
There is a rollback mechanism in heat after the stack
update failed. There should be a rollback mechanism in
magnum after bay update failed.

This patch add new microversion 1.3 to add rollback
support for Magnum bay, user can enable rollback on bay
update failure by specifying microversion 1.3 in header(
{'OpenStack-API-Version': 'container-infra 1.3'}) and
passing 'rollback=True'(http://XXX/v1/bays/XXX/?rollback=True)
when issuing bay update reqeust.

Change-Id: Idd02769f98078702404a11dc9f7a3339ce4e22eb
Partially-Implements: blueprint bay-rollback-on-update-failure
7 years ago
yatin karel 52f30927af Set bay status: DELETE_IN_PROGRESS before updated by poll
Set bay status to DELETE_IN_PROGRESS as soon as delete stack
request returns. Currently this status is updated by Heat
poller.

Change-Id: I44a3cea1a5d6e9735b0b74637d7c62db2e9cffa3
Closes-Bug: #1612408
7 years ago
Vijendar Komalla bf30b9b4cb Support for async bay operations
Current implementation of magnum bay operations are synchronous
and as a result API requests are blocked until response from HEAT
service is received. With this change bay-create, bay-update and
bay-delete calls will be asynchronous.
Please note that with this change bay-create/bay-update api calls
will return bay uuid instead of bay object and also microversion
1.2 is added for new behavior.

Change-Id: I4ca1f9f386b6417726154c466e7a9104b6e6e5e1
Closes-Bug: #1588425
7 years ago
Spyros Trigazis e6a29fb252 Remove ReplicationController object
Following the removal of service [1], pod [2] and container [3], remove
COE specific object ReplicationController.

This change also removes k8s_conductor.

[1] I4f06bb779caa0ad369a2b96b4714e1bf2db8acc6
[2] I8c2499ccb97aae39d80868ce02fbef292d762c10
[3] I288fa7a9717519b1ae8195820975676d99b4d6d2

Change-Id: Ica100c8d2dfdd7dc709feb1f5cdc5a3f3d6c7318
Partially-Implements: blueprint delete-container-endpoint
Partially-Implements: blueprint bay-drivers
7 years ago
Tom Cammann 40aa6550f1 Remove container object
Following on from removing the k8s specific APIs in
I1f6f04a35dfbb39f217487fea104ded035b75569 the objects associated with
these APIs need removal.

Remove the container object, drop the db table and remove references to
the container object. The docker_conductor has also been removed as this
was used for managing containers using Magnum objects.

Change-Id: I288fa7a9717519b1ae8195820975676d99b4d6d2
Partially-Implements: blueprint delete-container-endpoint
Co-Authored-By: Spyros Trigazis <strigazi@gmail.com>
7 years ago
Bin-Lu 4427374c06 Fix the permission of these files -rwxr-xr-x
As shown above,these files have executable attribute that need be removed.

Change-Id: I1955360afe17ec83c7b759d44a5bf60d9b055594
7 years ago
qinchunhua 894f03b9f0 Correct the rest of the reraising of exception
When an exception was caught and rethrown, it should call 'raise'
without any arguments because it shows the place where an exception
occured initially instead of place where the exception re-raised.

Change-Id: I696054fc273edb62e87e5d567da6b99cdf5ec358
7 years ago