When an error occurs in a flow, the provisioning status of the load
balancer should be set to ERROR in the revert method of the first task
of the flow. This update acts as an unlock of the LB object and cannot
occur in any other revert method because the API might consider the LB
as mutable before finishing a task/flow.
Remove all occurrences of mark_loadbalancer_prov_status_error calls in
revert method of tasks that are not specifically designed for unlocking
the load balancers. Add a LoadBalancerToErrorOnRevertTask task in the
amphora failover flow to prevent an LB from being left in an immutable state.
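The locking rule described above can be sketched in plain Python. This is only an illustration, not the Octavia implementation: `FakeLB`, `FailingTask` and `run_flow` are hypothetical stand-ins for the DB record, a later task and the TaskFlow engine. Only the task name LoadBalancerToErrorOnRevertTask and the ERROR status come from this change.

```python
# Hypothetical, simplified model of a flow: tasks run in order and, on
# failure, the reverts of the tasks that were started run in reverse order.
ERROR = "ERROR"
PENDING_UPDATE = "PENDING_UPDATE"


class FakeLB:
    """Stand-in for the load balancer DB record (assumption)."""

    def __init__(self):
        self.provisioning_status = PENDING_UPDATE


class LoadBalancerToErrorOnRevertTask:
    """First task of the flow: no-op on execute, unlocks the LB on revert."""

    def execute(self, lb):
        pass

    def revert(self, lb):
        # Reverts run in reverse order, so this one runs last: the LB is
        # only unlocked once every other task has finished cleaning up.
        lb.provisioning_status = ERROR


class FailingTask:
    """A later task whose revert must NOT touch the provisioning status."""

    def __init__(self, seen):
        self.seen = seen

    def execute(self, lb):
        raise RuntimeError("simulated failure")

    def revert(self, lb):
        # Record the status observed while cleanup is still in progress.
        self.seen.append(lb.provisioning_status)


def run_flow(tasks, lb):
    """Run tasks in order; on any failure revert them in reverse order."""
    started = []
    try:
        for task in tasks:
            started.append(task)
            task.execute(lb)
    except Exception:
        for task in reversed(started):
            task.revert(lb)
        return False
    return True


seen = []
lb = FakeLB()
ok = run_flow([LoadBalancerToErrorOnRevertTask(), FailingTask(seen)], lb)
```

In this sketch the LB stays in PENDING_UPDATE while the later task reverts and only moves to ERROR at the very end of the revert chain.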
Story 2009651
Task 43810
Story 2009652
Task 43811
Note for stable/train: the amphorav2 code is not updated in this
backport. The source files exist in train, but the feature was added in
ussuri. Backporting those changes would create many merge conflicts and
would not provide anything for train users.
Conflicts:
octavia/controller/worker/v1/tasks/database_tasks.py
octavia/controller/worker/v2/flows/amphora_flows.py
octavia/controller/worker/v2/tasks/amphora_driver_tasks.py
octavia/controller/worker/v2/tasks/database_tasks.py
octavia/tests/unit/controller/worker/v2/flows/test_amphora_flows.py
octavia/tests/unit/controller/worker/v2/tasks/test_amphora_driver_tasks.py
octavia/tests/unit/controller/worker/v2/tasks/test_database_tasks.py
Change-Id: I48b0f5a773209b1c1b056d71c0da05d6fd82ca73
(cherry picked from commit 4b8b198fec)
(cherry picked from commit 4039d35ce2)
(cherry picked from commit 844f1348ea)
(cherry picked from commit cf2a8bdf88)
(cherry picked from commit 95690e251d)
This patch refactors the failover flows to improve the performance
and reliability of failovers in Octavia.
Specific improvements are:
* More tasks and flows will retry when other OpenStack services are
failing.
* Failover can now succeed even when all of the amphora are missing
for a given load balancer.
* It will check and repair the load balancer VIP should the VIP
port(s) become corrupted in neutron.
* It will clean up extra resources that may be associated with a
load balancer in the event of a cloud service failure.
This patch also removes some dead code.
Conflicts:
octavia/amphorae/backends/agent/api_server/amphora_info.py
octavia/amphorae/drivers/haproxy/rest_api_driver.py
octavia/amphorae/drivers/keepalived/vrrp_rest_driver.py
octavia/api/drivers/utils.py
octavia/api/v2/controllers/load_balancer.py
octavia/common/constants.py
octavia/common/utils.py
octavia/controller/worker/v1/controller_worker.py
octavia/controller/worker/v1/flows/amphora_flows.py
octavia/controller/worker/v1/tasks/amphora_driver_tasks.py
octavia/controller/worker/v1/tasks/compute_tasks.py
octavia/controller/worker/v1/tasks/network_tasks.py
octavia/network/base.py
octavia/tests/unit/amphorae/backends/agent/api_server/test_loadbalancer.py
octavia/tests/unit/controller/worker/v1/flows/test_amphora_flows.py
octavia/tests/unit/controller/worker/v1/flows/test_load_balancer_flows.py
octavia/tests/unit/controller/worker/v1/tasks/test_network_tasks.py
octavia/tests/unit/controller/worker/v1/test_controller_worker.py
octavia/tests/unit/controller/worker/v2/tasks/test_amphora_driver_tasks.py
Change-Id: I04cb2f1f10ec566298834f81df0cf8b100ca916c
Story: 2003084
Task: 23166
Story: 2004440
Task: 28108
(cherry picked from commit 955bb88406)
(cherry picked from commit 2f9dc3693e)
(cherry picked from commit edebde748d0283a9948c8b7f6386d5a8835c617c)
When we create an amphora for a specific load balancer, the
loadbalancer-to-amphora mapping should be available as soon as
possible.
For example, if an admin needs to debug connectivity issues with amphora
VMs, loadbalancer_id won't be set until the AmphoraComputeConnectivityWait
task succeeds (which never happens in such a case), so they have to go to
the worker logs to find out which amphora belongs to the load balancer
that is currently being created.
Conflicts:
octavia/tests/unit/controller/worker/v2/flows/test_amphora_flows.py
NOTE(s10): conflict is due to I55d6c1a0b3e6060d6dacc13ee67d87f0219ef7de
not in Train and older stable branches.
Co-Authored-By: Ann Taraday <akamyshnikova@mirantis.com>
Change-Id: I865445af34bc63b90d965ef5e2c8f9f49aa9c2f3
(cherry picked from commit 005cd1e6a6)
The MapLoadbalancerToAmphora task performs some database queries but
doesn't catch any exceptions from them.
In case something goes wrong with the database or the query, this commit
catches any exception from the DB call, logs it and then returns None to
switch to the next decider branch (boot a new amphora).
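A minimal sketch of that pattern, assuming a hypothetical `allocate` callable in place of the real DB query and using the stdlib logger as a stand-in for the project's logger. The `choose_branch` helper is also hypothetical and only illustrates how a None result selects the "boot a new amphora" branch.

```python
import logging

LOG = logging.getLogger(__name__)


def map_loadbalancer_to_amphora(allocate, loadbalancer_id):
    """Return the mapped amphora id, or None if the lookup fails.

    `allocate` is a hypothetical callable standing in for the DB query.
    """
    try:
        amphora_id = allocate(loadbalancer_id)
    except Exception as e:
        # Log the failure and fall through to the "no amphora" result.
        LOG.error("Failed to map an amphora to load balancer %s: %s",
                  loadbalancer_id, e)
        return None
    return amphora_id


def broken_query(loadbalancer_id):
    """Simulates a database outage."""
    raise ConnectionError("database unavailable")


def choose_branch(amphora_id):
    """Hypothetical decider: None means no amphora was mapped."""
    return "boot new amphora" if amphora_id is None else "use mapped amphora"
```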
Story 2007320
Task 38831
Conflicts:
octavia/controller/worker/v1/tasks/database_tasks.py
Change-Id: Ifb6c34426e0927534d332a8bbf2c66aac6c002c5
(cherry picked from commit 3bbd32a2a5)
This patch allows listeners on a load balancer to continue to
operate should one listener fail to access secret content in
barbican. Previously if one listener failed to access barbican
content, all of the listeners would be impacted.
This patch also cleans up some unused code and unnecessary comments.
Change-Id: I300839fe7cf88763e1e0b8c484029662beb64f0a
Story: 2006676
Task: 36951
I had a few minor nits on the volume-based patch. This patch
corrects those.
Co-Authored-By: Adam Harwell <flux.adam@gmail.com>
Change-Id: I5f9ce36c878973f4ed96527af6f1024a362421d8
This patch creates an Amphora v2 provider driver as well as a
V2 controller worker.
This is in preparation for having the amphora driver use the new
provider driver data models and rely less on native Octavia database
access.
It is also a preparation step for enabling TaskFlow JobBoard as
this work will move to storing dictionaries in the flows instead
of database models.
Change-Id: Ia65539a8c39560e2276750d8e79a637be4c0f265
Story: 2005072
Task: 30806
Octavia creates certificates and keys to manage the encrypted
communication channel to amphorae.
When debug is enabled, the python taskflow module will log
all the information we provide to tasks (and sub-flows)
when we create amphorae or handle anything related to
certificate and key management (rotations, etc.).
There are ways to tell taskflow to exclude specific things
from being logged (e.g., I136081045787c1bbe3ee846d5845a34201c57864).
While this keeps some information in specific flows from being
logged, it is susceptible to code changes.
To avoid an everlasting whack-a-mole game, this patch simply
encrypts sensitive information so we can safely log it and decrypts
it only when we need to use it.
Change-Id: I06d329ca53bc36bd27f7870ae7c7ca0cf18575b2
This patch validates that a flavor is compatible with using spares
pool amphora. It will also update the amphora-agent config after
a spares pool amphora has been allocated.
This patch makes it possible to update a running amphora's agent
configuration and have the mutable options take effect.
The following amphora agent configuration options can be updated:
heartbeat_key
controller_ip_port_list
heartbeat_interval
loadbalancer_topology
This patch adds the support to the amphora-agent and the amphora
driver. A follow-on patch will expose this capability via the
amphora admin API.
Change-Id: I97bdf5188808193516509f20767e82c0f8d2f5a5
Operators want to be able to see amphora flavor information, but they
do not have permission to access the Octavia configuration file. It is
therefore necessary to show the amphora flavor information in the output
of the 'openstack loadbalancer amphora list/show' commands.
Story: 2002896
Task: 22986
Change-Id: Ib3ca05d816747d08ef7055ec532b81746468cbf9
If a load balancer loses more than one amphora at the same time
the failover process will fail and leave the load balancer in
provisioning status ERROR.
This patch resolves this by failing over one amphora at a time and
marking any other failed amphora with status ERROR. The health
manager will then fail over the other failed amphora in subsequent checks.
This patch will update multiple healthy amphora in parallel and will
time out failed amphora using the new "active_connection_max_retries"
configuration setting used for "fail-fast" connections.
The patch also updates the amphora failover flow documentation to
show the full flow and not just the spares failover flow.
It updates the amphora driver "get_diagnostics" method to pass instead
of raising an error.
It also adds an AmphoraComputeConnectivityWait task to explicitly wait
for a compute instance to come up and be reachable. This allows a longer
timeout and clarifies this may fail due to compute (nova) failures.
Previously the first plug vip task would do this wait.
Change-Id: Ief97ddda8261b5bbc54c6824f90ae9c7a2d81701
Story: 2001481
Task: 6202
This patch aligns all of Octavia to use oslo_log instead of the built
in python logging. This should provide consistent log formats.
It adds a hacking check to make sure "logging" doesn't come back into
the code.
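A hacking-style check of this kind could look roughly like the sketch below. The function name and the "X001" code are hypothetical, not the ones used in the project. The check follows the usual convention of yielding an (offset, message) pair for each offending logical line.

```python
import re

# Matches "import logging" and "from logging import ...", but not
# "from oslo_log import log as logging". Compiled once at import time.
_STDLIB_LOGGING_RE = re.compile(r"^\s*(import\s+logging\b|from\s+logging\b)")


def check_no_stdlib_logging(logical_line):
    """Yield (offset, message) when the stdlib logging module is imported.

    The function name and the X001 code are hypothetical.
    """
    if _STDLIB_LOGGING_RE.match(logical_line):
        yield (0, "X001: use oslo_log instead of the stdlib logging module")
```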
Change-Id: I9b76c2bb5a5c396faf85df4606f2ca00f23de913
This patch extends the Octavia v2 API to access qos_policy_id from neutron.
Users can pass it as 'vip_qos_policy_id' to Octavia request body to
create/update Loadbalancers, and the vrrp ports will have the qos
abilities.
This patch modifies the Loadbalancer Post/Put request body and response
body. It also extends the 'vip' table with the new column named
'qos_policy_id' to store the qos_id from neutron.
Co-Authored-By: Reedip <reedip.banerjee@nectechnologies.in>
Change-Id: I43aba9d2ae816b1498d16da077936d6bdb62e30a
A patch was added[1] that put an immutable check in the amphora failover
flow. However, when failover was triggered by the failover API, the load
balancer was already in PENDING_UPDATE. This caused the flow to fail and the
load balancer to be left in ERROR. This patch reconciles those two code paths
so that they handle load balancer locking the same way.
[1] https://review.openstack.org/#/c/479109/
Change-Id: I6c8549c15a302add8c69ab95bde57fb2646455b6
This patch makes sure we consistently update the member operating status
when the user adds/removes a health monitor from a pool.
Change-Id: I71184d66ee035b9afbfee62e9a0ebb5147c2b47e
This will enable a number of possible features that need to select
amphorae based on their availability zone.
This would allow for quick-lookups on large lists and could be stale,
but it would be expected that future code that uses this would check
with nova for an update if it needs fully accurate data.
Having it be explicitly "cached" should take care of concerns about
users (operators, in this case) being confused about correctness.
Using simply the word "zone" should address concerns about commonality
between compute providers.
Change-Id: I8e26f99bca3496a454ba7bae2570f517e07d5fc2
Story: 2001221
Task: 5732
pylint will be added in a follow-up patch.
************* Module octavia.api.handlers.controller_simulator.handler
E: 56,21: Too many positional arguments for method call (too-many-function-args)
E: 59,21: Too many positional arguments for method call (too-many-function-args)
E: 93,26: Too many positional arguments for method call (too-many-function-args)
E: 96,26: Too many positional arguments for method call (too-many-function-args)
E:119,24: Too many positional arguments for method call (too-many-function-args)
E:122,24: Too many positional arguments for method call (too-many-function-args)
E:150,20: Too many positional arguments for method call (too-many-function-args)
************* Module octavia.api.v2.types.load_balancer
E: 58,17: Bad first argument 'BaseLoadBalancerType' given to super() (bad-super-call)
************* Module octavia.controller.worker.tasks.amphora_driver_tasks
E:310, 8: Unsupported logging format character 'a' (0x61) at index 31 (logging-unsupported-format)
************* Module octavia.controller.worker.tasks.database_tasks
E:1437,12: Unexpected keyword argument 'provisioning_status' in method call (unexpected-keyword-arg)
E:1437,12: No value for argument 'pool_dict' in method call (no-value-for-parameter)
Change-Id: I2554a5f94f70058000ce33460fbfc6021e735eeb
This will allow an operator to force the failover of a load
balancer's underlying amphora for upgrades or other
maintenance.
- Adds a new failover endpoint to the queue
- Adds the functionality to the worker
- Adds the failover command to the producer
- Adds a failover controller so
/loadbalancer/123/failover will initiate
a failover and return 202
- Adds logic to insert the server group into the
failover flow
Change-Id: Ic4698066773828ae37b55a8d79bd2df6fc6624be
This patch fixes a revert method that was not handling extra parameters
being passed to it.
It also adds a hacking check to make sure this does not happen in the
future.
The patch also breaks the bad habit of compiling regex strings for every
line of code in the project.
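Both points can be sketched together: the patterns are compiled once at module import instead of once per checked line, and the check flags any revert method whose signature lacks **kwargs. The function name and the "X002" code are hypothetical, and the regular expressions are illustrative rather than the ones used in the project.

```python
import re

# Compiled once at import time rather than for every line that is checked.
_REVERT_DEF_RE = re.compile(r"^\s*def\s+revert\(")
_REVERT_WITH_KWARGS_RE = re.compile(r"^\s*def\s+revert\(.*\*\*kwargs\):")


def check_revert_has_kwargs(logical_line):
    """Yield (offset, message) for a revert() that does not take **kwargs.

    The function name and the X002 code are hypothetical.
    """
    if (_REVERT_DEF_RE.match(logical_line)
            and not _REVERT_WITH_KWARGS_RE.match(logical_line)):
        yield (0, "X002: revert() must accept **kwargs")
```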
Change-Id: If29e377204432e215bfea97f9d76bce0a442f4c8
Now that all objects have a proper provisioning status, update the revert
methods to properly set the provisioning status to error.
Change-Id: I74e44474e7cd05979f2a9ce24143641230e4b394
Closes-Bug: #1624166
This became a lot more complicated than originally anticipated... When
we made the decision to use the pool_id as the hm_id, someone should
have smacked us.
Depends-On: I8dd385e3c993942473e67d04367cdf74495dbeef
Change-Id: I8c9f3bfe6766eac93642b656efd8876279d3d378
Closes-Bug: #1692171
Still need to fix the entry-points for each individual type, but that
wasn't even in the original spec. Not sure if we even want that.
I think this may not do things EXACTLY how the old one did it, we'll
need to look into whether it matters, as we never published docs for it
and I don't think it ever actually worked properly in neutron-lbaas.
Also closing a few bugs that are only peripherally related, because we
(possibly me) forgot to tag them on the individual CRs, but I'm
considering them closed as of this patch. See below for my reasoning on
each individual bug, and feel free to post counter-arguments.
For #1673546 (single-call create): This is the obvious one!
For #1673499 (lb return pool object): Rolled into this patch as a matter
of course, abandoned the original fix as it is no longer relevant.
For #1544214 (root tags): All existing resources now have root tags. Any
new ones will also need root tags, but I would consider this bug closed.
For #1596636 (tenant facing API): Every object is now creatable via the
v2 API, so I would consider this to be complete. Quotas and some
additional work is being finished, but it's not necessary for this IMO.
For #1665446 (hm id): This was resolved in the HM patch, I just forgot
to close it, and including it here will ensure it is release-tracked.
For #1685789 (listener quota): Just shoving it in here as I do the
single-create quotas.
For #1685827 (hm quota): Same as listener quota.
Closes-Bug: #1673546
Closes-Bug: #1673499
Closes-Bug: #1544214
Closes-Bug: #1596636
Closes-Bug: #1665446
Closes-Bug: #1685789
Closes-Bug: #1685827
Depends-On: I3d86482a2999197a60a81d42afc5ef7a6e71e313
Change-Id: I4ff03593e1cfd8dca00a13c0550d6cf95b93d746
GET all - /v2.0/lbaas/healthmonitors
GET one - /v2.0/lbaas/healthmonitors/<hm_id>
POST - /v2.0/lbaas/healthmonitors {<body>}
PUT - /v2.0/lbaas/healthmonitors/<hm_id> {<body>}
DELETE - /v2.0/lbaas/healthmonitors/<hm_id>
Co-Authored-By: Sindhu Devale <sindhu.devale@intel.com>
Co-Authored-By: Adam Harwell <flux.adam@gmail.com>
Co-Authored-By: Reedip Banerjee <reedip14@gmail.com>
Partial-Bug: #1616643
Change-Id: I7f65bb9370530ae3b533b927fcdabc6c1a295231
The endpoints are as follows:
- /v2.0/lbaas/l7policies/
- /v2.0/l7policies/
GET all - /<policy-id>/l7rules/
GET one - /<policy-id>/l7rules/<rule-id>
POST - /<policy-id>/l7rules/ {<body>}
PUT - /<policy-id>/l7rules/<rule-id> {<body>}
DELETE - /<policy-id>/l7rules/<rule-id>
Partially Closes-Bug: #1616701
Co-Authored-By: Shashank Kumar Shankar <shashank.kumar.shankar@intel.com>
Co-Authored-By: Adam Harwell <flux.adam@gmail.com>
Change-Id: I247988a2ea19a92f827756504a0ee46679bbc53b
GET all - /v2.0/lbaas/l7policies
GET one - /v2.0/lbaas/l7policies/<l7policy-id>
POST - /v2.0/lbaas/l7policies {<body>}
PUT - /v2.0/lbaas/l7policies/<l7policy-id> {<body>}
DELETE - /v2.0/lbaas/l7policies/<l7policy-id>
Co-Authored-By: Nakul Dahiwade <nakul.dahiwade@intel.com>
Co-Authored-By: Adam Harwell <flux.adam@gmail.com>
Partially-Implements: #1616655
Change-Id: I91baf79df16d4a1eefd151ed87ec871b57ac6ef8
Octavia has no quota definitions, but needs them for parity with Neutron LBaaS.
This will provide an endpoint and support for retrieving, updating, and deleting
quotas for projects, as well as adding enforcement of those quotas.
Adds scenario test that simply validates quotas in a lb graph.
Co-Authored-By: Michael Johnson <johnsomor@gmail.com>
Co-Authored-By: Phillip Toohill <phillip.toohill@rackspace.com>
Co-Authored-By: Adam Harwell <flux.adam@gmail.com>
Change-Id: Ia1d85dcd931a57a2fa3f6276d3fe6dabfeadd15e
Closes-Bug: #1596652
Remove unneeded import_group lines which are not doing anything and just make
the code harder to understand.
Change-Id: I673dd04dd31ae9771e6af982d184eee0e9cbf2d4
This patch adds capstone tasks to our flows that make sure reverts
end in the proper final state for the impacted objects.
Previously some failure scenarios left objects in PENDING_* state.
To accomplish the above, this patch adds provisioning status
to all of the remaining top level objects (pools, members, etc.)
and adds tasks to maintain the proper provisioning status for
these objects.
Change-Id: I857b1aa28e4a6c7466c3abf4e088a74367e78faf
Closes-Bug: #1623686
A number of the revert methods in our flow tasks make database
repository calls but do not handle potential exceptions. This can
lead to the revert flow aborting at these tasks.
This patch adds exception handling to these repository calls inside
the revert methods.
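The pattern can be sketched as follows. `FakeRepo`, `MarkListenerErrorTask` and `revert_all` are hypothetical names used only for illustration, and the stdlib logger stands in for the project's logger. The point is that a failing repository call inside one revert no longer stops the remaining reverts from running.

```python
import logging

LOG = logging.getLogger(__name__)


class FakeRepo:
    """Hypothetical repository whose update call always fails."""

    def update(self, obj_id, **fields):
        raise ConnectionError("database unavailable")


class MarkListenerErrorTask:
    """Hypothetical task; only the revert pattern matters here."""

    def __init__(self, repo):
        self.repo = repo

    def revert(self, listener_id, *args, **kwargs):
        try:
            self.repo.update(listener_id, provisioning_status="ERROR")
        except Exception as e:
            # Log and continue so that the remaining reverts still run.
            LOG.error("Failed to update listener %s to ERROR: %s",
                      listener_id, e)


def revert_all(tasks, listener_id):
    """Run every revert in reverse order and count how many completed."""
    completed = 0
    for task in reversed(tasks):
        task.revert(listener_id)
        completed += 1
    return completed
```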
Change-Id: Ia9b07890fe2c56382041175018679b87298c1b10
Closes-Bug: #1624047
Deleting a load balancer fails in a corner case where the amphora agent
never reports a heartbeat to the Octavia controller.
After load balancer creation, the health manager updates the health status
of the amphora in the amphora_health table. If no heartbeat is ever received
from the amphora agent, no entry is created in the 'amphora_health'
table, and the load balancer eventually gets stuck in PENDING_DELETE during
delete.
This fix handles UnmappedInstanceError when there are no entries in the
Octavia DB for the amphora health.
Change-Id: Ib149c7c93ae2f0ad97b09cbcb5668a8872f01879
Closes-Bug: #1609235
This fixes some typos and copy-paste mistakes in docstrings and
logs. Missing docstrings were added as well.
Change-Id: Idc61068f36c3a30743fe7eff033d7e8b0d660661
They are modules and should be imported rather than run from a shell,
so they do not require the executable permission.
Change-Id: I869d73411ec8e308a80d780e10e77dcc48097d42
The new taskflow exposed areas where we forgot to update
the revert method with the correct object data.
Taskflow now validates the revert data, which exposed
those inconsistencies in the project.
Change-Id: I319d59b345aab07784ae4dba19e0ffb7fbba0b04
Closes-Bug: #1581615
During the failover process, the new amphora's DB info is populated from
the old one, including 'role' and 'vrrp_priority', both of which will be
updated again after plug_network. What's more, 'role' is an
appropriate flag for a failover monitoring tool to decide whether or not
the failover process is almost finished.
Change-Id: I9602b92b36ef265f8ae7c9171170cd86353b2944
Our code for updating a pool in the repository was off: You could
potentially update other pool parameters (i.e. not session_persistence)
and it would unexpectedly clear the session_persistence. This patch
corrects this problem and cleans up the code dealing with session
persistence so that it's more understandable what it's doing.
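The intended update semantics can be illustrated with plain dictionaries. This is only a sketch under stated assumptions: the `update_pool` function and the `_UNSET` sentinel are hypothetical and are not the repository code. It distinguishes an update that does not mention session_persistence (left untouched) from one that explicitly sets it to None (removed).

```python
_UNSET = object()  # sentinel: the caller did not mention session_persistence


def update_pool(pool, changes):
    """Apply `changes` to `pool` (both plain dicts) and return a new dict."""
    updated = dict(pool)
    sp = changes.get("session_persistence", _UNSET)
    for key, value in changes.items():
        if key != "session_persistence":
            updated[key] = value
    if sp is _UNSET:
        pass  # not mentioned: keep whatever the pool already had
    elif sp is None:
        updated["session_persistence"] = None  # explicit removal
    else:
        updated["session_persistence"] = dict(sp)  # explicit replacement
    return updated


pool = {"name": "p1", "session_persistence": {"type": "SOURCE_IP"}}
renamed = update_pool(pool, {"name": "p2"})
cleared = update_pool(pool, {"session_persistence": None})
```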
There was also a bug in Neutron-LBaaS' handling of session_persistence
which was making troubleshooting this problem more difficult. This patch
is a prerequisite for the Neutron-LBaaS bugfix.
Beyond this, this patch also cleans up calls to
repository.update_pool_and_sp to not require the sp_dict parameter
(which, it turns out, did not need to be split out of the pool data
structure prior to calling this method anyway).
Closes-Bug: #1547157
Change-Id: Idcf12e463fbaa3a61a211f13986d8472f52036d2
Sets up the flows and some new tasks required to create all the
resources needed for an entire load balancer graph. This includes
updating all listeners on all amphorae (depending on topology), and
plugging networks and setting up the routes and rules on every
amphora for a load balancer. Luckily this mostly reuses tasks and
flows that were already created, though some new tasks and flows
were created specifically for handling many listeners.
Co-Authored-By: Trevor Vardeman <trevor.vardeman@rackspace.com>
Change-Id: I43a838e80281a37537e179cd8d4768f45e1ca7f1
Adds a new cascading delete method to the REST API.
When a load balancer is deleted it will also delete
all associated listeners, pools, members, healthmonitors,
and L7 policies.
Change-Id: I0fd88923dc76e573b92d83f68d292ded913b13a6
https://blueprints.launchpad.net/octavia/+spec/anti-affinity
Added a new column in lb table for server group id;
Added a new task in compute tasks for creating server group;
Added a new task in database tasks to update server
group id info for lb;
Add server group id in create method in nova driver to support
anti-affinity when creating compute instance
Change-Id: If0d3a9ba1012651937a2bda9bc95ab4f4c8852d5
This commit modifies the controller worker and its tasks and flows
in order to enable the manipulation of L7 policies and rules
in Octavia as well as unit tests for the same. It is one in a chain
of commits designed to keep the size of each individual commit
manageable / reviewable. Jinja template and documentation updates
will come in later commits.
Partially-Implements: blueprint lbaas-l7-rules
Partially-Implements: blueprint layer-7-switching
Change-Id: I13de3f89d9236cf508744b04c1d8de04296a34f3
The introduction of shared_pools broke one of the flows for
assigning the listener peer_port. This went unnoticed for a little
while since this is presently only used in active-standby
topologies, and we don't have any scenario tests right now which
exercise active-standby regularly.
In looking to fix this flow, I realized that there's no reason we
can't assign the listener peer_port when the listener object is
created in the database. By doing this, we eliminate the need for
a couple controller worker database tasks and simplify the
listener creation flow.
This patch, therefore, updates the repository code to assign the
peer_port on listener creation, and eliminates the now redundant
controller worker database tasks and simplifies the create listener
flow.
Change-Id: I0c15dfa154c7cd57f1626945bb76c0ac0b9de071
Closes-Bug: 1547233
Some of the revert methods on tasks inside Octavia are missing
required parameters. This causes the revert to fail.
This patch corrects those revert methods.
Closes-Bug: #1527428
Change-Id: I2accbe55db710a312d31be5f3da8e06b7ab79025
This patch introduces shared pools functionality to
Octavia. This means that with this patch, listeners and
pools will have the ability to have an N:M relationship
instead of a simple 1:1 relationship, although they must
still be associated with the same loadbalancer object.
This patch includes a schema change to the database: pools
are now associated directly to loadbalancers instead of
listeners. The migration in this patch includes ETL which
should populate this new field in the pool table correctly.
Extensive API changes were necessary to facilitate this
change. However, all the changes to the API should be
backward compatible.
This patch is a necessary precursor to adding L7 switching
functionality to Octavia.
Partially-Implements: blueprint lbaas-l7-rules
Partially-Implements: blueprint layer-7-switching
Change-Id: I797c718412e756be067dd4c304c989a4d43bb8ef