298f729659
Change-Id: I813c2281073521a2c20b3846c4ccd42549884139
301 lines
10 KiB
ReStructuredText
301 lines
10 KiB
ReStructuredText
..
|
||
This work is licensed under a Creative Commons Attribution 3.0 Unported
|
||
License.
|
||
|
||
http://creativecommons.org/licenses/by/3.0/legalcode
|
||
|
||
=================================
|
||
Neutron/L3 High Availability VRRP
|
||
=================================
|
||
|
||
Launchpad blueprint:
|
||
|
||
https://blueprints.launchpad.net/neutron/+spec/l3-high-availability
|
||
|
||
The aim of this blueprint is to add High Availability Features on
|
||
virtual routers.
|
||
|
||
High availability features will be implemented as extensions and drivers.
|
||
A first driver on the agent side will be based on Keepalived.
|
||
|
||
A new scheduler will be also added in order to be able to spawn multiple
|
||
instances of a same router on many agents for the redundancy.
|
||
|
||
The DVR blueprint will leverage this proposal as a Service node specifically
|
||
for SNAT traffic. See the reference for the DVR BP at the end of this
|
||
specification
|
||
|
||
|
||
Problem description
|
||
===================
|
||
|
||
Currently we are able to spawn more than one l3 agent, and a l3 agent is able
|
||
to handle more than one external network, however each l3 agent is a SPOF.
|
||
|
||
If an l3 agent fails, all virtual routers of this agent will be lost,
|
||
and consequently all VMs connected to these virtual routers will be isolated.
|
||
|
||
Proposed change
|
||
===============
|
||
|
||
For the Neutron server side:
|
||
|
||
The idea of this blueprint is to schedule a virtual router to at least two l3
|
||
agents, but this limit could be increased by changing a parameter in the
|
||
neutron configuration file.
|
||
|
||
For the Neutron L3 agent side:
|
||
|
||
The current router interfaces management in the l3 agent will be abstracted in
|
||
order to introduce the possibility to add drivers for that purpose. As a first
|
||
implementation of a driver, an HA Keepalived driver will be added. All the IPs
|
||
will be converted to VIPs.
|
||
|
||
In order to hide the HA traffic from the tenant point of view a HA network will
|
||
be added and all the virtual router instances will be connected through a HA
|
||
port to this network.
|
||
|
||
Flows::
|
||
|
||
+----+ +----+
|
||
| | | |
|
||
+-------+ QG +------+ +-------+ QG +------+
|
||
| | | | | | | |
|
||
| +-+--+ | | +-+--+ |
|
||
| VIPs| | | |VIPs |
|
||
| | +--+-+ +--+-+ | |
|
||
| + | | | | + |
|
||
| KEEPALIVED+---+ HA +------+ HA +----+KEEPALIVED |
|
||
| + | | | | + |
|
||
| | +--+-+ +--+-+ | |
|
||
| VIPs| | | |VIPs |
|
||
| +-+--+ | | +-+--+ |
|
||
| | | | | | | |
|
||
+-------+ QR +------+ +-------+ QR +------+
|
||
| | | |
|
||
+----+ +----+
|
||
|
||
|
||
As a phase 2 of the keepalived driver implementation, the Keepalived driver
|
||
will start a conntrackd instance in order to not lose the established
|
||
connections when switching from the active to standby.
|
||
|
||
Alternatives
|
||
------------
|
||
|
||
The first driver is going to be based on Keepalived. We could use some
|
||
alternative drivers based on other protocols for ex: Common Address Redundancy
|
||
Protocol (CARP).
|
||
|
||
By default a config parameter will be added in order to specify whether the
|
||
virtual routers will be HA or not. In addition, an admin-only API is introduced
|
||
which will allow admins to migrate existing routers to HA mode.
|
||
|
||
Data model impact
|
||
-----------------
|
||
|
||
Two new columns will be added to the router_extra_attributes table in order to
|
||
specify whether the virtual router will be HA or not and to specify the virtual
|
||
router id.
|
||
|
||
+------------+-------+---------+---------+------------+---------------------+
|
||
|Attribute |Type |Access |Default |Validation/ |Description |
|
||
|Name | | |Value |Conversion | |
|
||
+============+=======+=========+=========+============+=====================+
|
||
|ha |bool |RW, admin|False |N/A |Set router as HA |
|
||
|ha_vr_id |int |RW, admin|N/A |N/A |HA virtual router id |
|
||
+------------+-------+---------+---------+------------+---------------------+
|
||
|
||
The ha_vr_id will be limited to 255 due to VRRP protocol. This limit will have
|
||
to be removed when introducing a new driver without this limitation.
|
||
|
||
A new table will be introduced to specify the association between a router,
|
||
the agents and the HA ports that are going to be used for the HA
|
||
administrative traffic.
|
||
|
||
+------------+-------+---------+---------+------------+----+---------------+
|
||
|Attribute |Type |Access |Default |Validation/ |Key |Description |
|
||
|Name | | |Value |Conversion | | |
|
||
+============+=======+=========+=========+============+====+===============+
|
||
|port_id |UUID |RW, admin|N/A |N/A |PRI |HA port id |
|
||
+------------+-------+---------+---------+------------+----+---------------+
|
||
|router_id |UUID |RW, admin|N/A |N/A | | |
|
||
+------------+-------+---------+---------+------------+----+---------------+
|
||
|l3_agent_id |UUID |RW, admin|N/A |N/A | | |
|
||
+------------+-------+---------+---------+------------+----+---------------+
|
||
|priority |int |RW, admin|50 |N/A | | |
|
||
+------------+-------+---------+---------+------------+----+---------------+
|
||
|state |enum |RW, admin|N/A |N/A | |active/standby |
|
||
+------------+-------+---------+---------+------------+----+---------------+
|
||
|
||
|
||
REST API impact
|
||
---------------
|
||
|
||
router-create Create a router for a given tenant.
|
||
|
||
::
|
||
router-create --name another_router --ha=true
|
||
|
||
Admin can only set this attribute. The tenants need not be aware about
|
||
this attribute in the router table. So it is not visible to the tenant.
|
||
|
||
Request
|
||
|
||
::
|
||
POST /v2.0/routers
|
||
Accept: application/json
|
||
|
||
{
|
||
"router":{
|
||
"name":"another_router",
|
||
"admin_state_up":true,
|
||
"ha":true}
|
||
}
|
||
|
||
|
||
Response
|
||
|
||
::
|
||
{
|
||
"router":{
|
||
"status":"ACTIVE",
|
||
"external_gateway_info":null,
|
||
"name":"another_router",
|
||
"admin_state_up":true,
|
||
"ha":true,
|
||
"tenant_id":"6b96ff0cb17a4b859e1e575d221683d3",
|
||
"id":"8604a0de-7f6b-409a-a47c-a1cc7bc77b2e"}
|
||
}
|
||
|
||
|
||
router-show Show information of a given router.
|
||
|
||
Request
|
||
|
||
::
|
||
GET /v2.0/routers/a9254bdb-2613-4a13-ac4c-adc581fba50d
|
||
Accept: application/json
|
||
|
||
Response
|
||
|
||
::
|
||
{
|
||
"routers":[{
|
||
"status":"ACTIVE",
|
||
"external_gateway_info":{
|
||
"network_id":""
|
||
},
|
||
"name":"router1",
|
||
"admin_state_up":true,
|
||
"ha":true,
|
||
"tenant_id":"33a40233088643acb66ff6eb0ebea679",
|
||
"id":"a9254bdb-2613-4a13-ac4c-adc581fba50d"}]
|
||
}
|
||
|
||
router-update Create a router for a given tenant.
|
||
|
||
Admin can only update the HA mode of a router.
|
||
|
||
Admin only context:
|
||
|
||
::
|
||
neutron router-update router1 --ha=True
|
||
|
||
Security impact
|
||
---------------
|
||
|
||
None
|
||
|
||
Notifications impact
|
||
--------------------
|
||
|
||
None
|
||
|
||
Other end user impact
|
||
---------------------
|
||
|
||
None
|
||
|
||
Performance Impact
|
||
------------------
|
||
|
||
There will be no network performance impact. Spawning a new virtual router may
|
||
be a bit longer due to the delay of starting the Keepalived/Conntrackd
|
||
processes.
|
||
|
||
Other deployer impact
|
||
---------------------
|
||
|
||
Since this implementation relies on Keepalived, Keepalived will have to be
|
||
deployed on each l3 node. The required version of Keepalived is the
|
||
version 1.2.0 in order to have the IPV6 support.
|
||
|
||
In addition, conntrackd will be required to be run on each node.
|
||
|
||
There is no plan to migrate automatically the original virtual routers to
|
||
the HA virtual routers when updating a previous Openstack installation.
|
||
So after a migration and with the l3_ha configuration parameter set to "True",
|
||
the new routers created will be HA while the older ones will be unchanged.
|
||
Cloud admins can migrate existing virtual routers to be HA routers by using
|
||
the new API. This API is not exposed to tenants.
|
||
|
||
Developer impact
|
||
----------------
|
||
|
||
None
|
||
|
||
|
||
Implementation
|
||
==============
|
||
|
||
Assignee(s)
|
||
-----------
|
||
|
||
Primary assignee:
|
||
Sylvain Afchain <sylvain-afchain>
|
||
|
||
Other contributors:
|
||
Assaf Muller <amuller>
|
||
|
||
Work Items
|
||
----------
|
||
|
||
1. HA L3 Extension, DB bases
|
||
2. HA L3 Scheduler
|
||
3. Keepalived manager
|
||
4. L3 agent driver abstraction introduction, Keepalived driver
|
||
5. Conntrackd support
|
||
|
||
|
||
Dependencies
|
||
============
|
||
|
||
None
|
||
|
||
|
||
Testing
|
||
=======
|
||
|
||
The code will be covered by unit tests.
|
||
When multi-nodes test will be available, tempest test will be introduced.
|
||
|
||
A document explaining how to test all the patches during the review
|
||
process will be updated here :
|
||
|
||
https://docs.google.com/document/d/1P2OnlKAGMeSZTbGENNAKOse6B2TRXJ8keUMVvtUCUSM
|
||
|
||
|
||
Documentation Impact
|
||
====================
|
||
|
||
Document deployer impacts.
|
||
|
||
|
||
References
|
||
==========
|
||
|
||
https://review.openstack.org/#/q/topic:bp/l3-high-availability,n,z
|
||
https://git.openstack.org/cgit/openstack/neutron-specs/tree/specs/juno/neutron-ovs-dvr.rst
|
||
https://wiki.openstack.org/wiki/Neutron/L3_High_Availability_VRRP
|