Merge "Make rst file line endings unix style"

This commit is contained in:
Jenkins 2016-10-05 08:56:22 +00:00 committed by Gerrit Code Review
commit 1782b2b756
5 changed files with 547 additions and 547 deletions

View File

@@ -1,50 +1,50 @@
..
This work is licensed under a Creative Commons Attribution 3.0 Unported
License.
http://creativecommons.org/licenses/by/3.0/legalcode
========================================
Configuration Refactoring and Generation
========================================
https://blueprints.launchpad.net/dragonflow/+spec/oslo-config-generator
This spec refactors the configuration handling and introduces the oslo config
generator to auto-generate all the options for Dragonflow.
Problem Description
===================
Currently Dragonflow has many options for different modules. They are
scattered across the tree and imported case by case. As more and more modules
are introduced, the options need to be managed centrally.
Neutron is a good example of this approach [1].
Proposed Change
===============
1. Use a dedicated plugin conf file, instead of neutron.conf.
2. Use the oslo config generator [2] to manage all the options.

First, we use ``tox -e genconfig`` to generate all the sample conf files.
If tox is not available, ./tools/generate_config_file_samples.sh can be run
instead.
Second, we use etc/oslo-config-generator/dragonflow.ini to manage the oslo
options.
For example::

    [DEFAULT]
    output_file = etc/dragonflow.ini.sample
    wrap_width = 79
    namespace = dragonflow
    namespace = oslo.log
Finally, we implement dragonflow/opts.py to collect the references to the
options from the different Dragonflow modules, as sketched below.
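A minimal sketch of what dragonflow/opts.py could look like, following the
pattern Neutron uses [1]; the imported module and attribute names are
assumptions for illustration::

    import itertools

    from dragonflow.common import common_params  # hypothetical module
    from dragonflow.db import db_api             # hypothetical module


    def list_opts():
        # oslo-config-generator calls this function, exposed as an
        # 'oslo.config.opts' entry point named 'dragonflow' in setup.cfg,
        # to discover every option in the 'dragonflow' namespace.
        return [
            ('df', itertools.chain(common_params.DF_OPTS, db_api.DB_OPTS)),
        ]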
References
==========
1. https://github.com/openstack/neutron/commit/71190773e14260fab96e78e65a290356cdc08581
2. http://docs.openstack.org/developer/oslo.config/generator.html

View File

@@ -1,176 +1,176 @@
..
This work is licensed under a Creative Commons Attribution 3.0 Unported
License.
http://creativecommons.org/licenses/by/3.0/legalcode
=============================
Local Controller Reliability
=============================
This spec describes the reliability design of Dragonflow.
Problem Description
====================
By default, OVS resets its flows when it loses the connection to the
controller. That means a restart of either the local controller or OVS
deletes the flows, resulting in a disruption of network traffic.
The goal of this design is to describe how to keep Dragonflow functioning
normally when such exceptions occur. The types of exception include, but are
not limited to, the following:
1. Local controller restart
2. OVS restart
3. Residual flows
4. Missing flows
Proposed Change
================
Solution to local controller restart
-------------------------------------
When the local controller restarts, OVS drops all existing flows. This
breaks network traffic until the flows are re-created.
The solution adds the ability to drop only old flows: a
controller_uuid_stamp is added to the local controller, this stamp is set as
the cookie on flows, and flows with stale cookies are then deleted during
cleanup.
In detail:
1. Change the fail mode to secure; with this setting, OVS won't delete flows
   when it loses the connection to the local controller.
2. Use a canary flow to hold the cookie.
3. When the local controller restarts, read the canary flow from OVS, take
   its cookie as the old cookie, generate a new cookie based on the old one,
   and update the canary flow with the new cookie.
4. Notify the Dragonflow apps to flush flows with the new cookie.
5. Delete the flows with the old cookie.
Since the cookie is also used by some apps for smart deletion, we should
share it with those apps. We could divide the 64-bit cookie into several
parts, each used for a specified purpose: e.g. the least significant bit can
be used for this solution, with a cookie_mask of 0x1, while apps use the
remaining 63 bits for smart deletion, as sketched below.
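A minimal sketch of this cookie layout and the rotation from the steps
above; the command-line calls are the plain OVS equivalents of what the
controller would issue through its OVS driver, so treat them as
illustrative::

    import subprocess

    AGING_MASK = 0x1  # least significant bit reserved for flow aging

    def set_secure_fail_mode(bridge):
        # Step 1: with fail mode 'secure', OVS keeps its flows when the
        # controller connection is lost.
        subprocess.check_call(['ovs-vsctl', 'set-fail-mode', bridge, 'secure'])

    def rotate_cookie(old_cookie):
        # Step 3: flip the aging bit; the other 63 bits stay available
        # to apps for smart deletion.
        return old_cookie ^ AGING_MASK

    def delete_stale_flows(bridge, old_cookie):
        # Step 5: delete only the flows still stamped with the old
        # aging bit, leaving the freshly flushed flows untouched.
        match = 'cookie=%#x/%#x' % (old_cookie & AGING_MASK, AGING_MASK)
        subprocess.check_call(['ovs-ofctl', 'del-flows', bridge, match])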
OVS 2.5 supports connection tracking (CT), which we will use to implement
security groups. The aging process does not delete CT zones when installing
the new flows and deleting the old ones; the contents of CT are removed by
CT's own timeout mechanism. So the aging process does not affect CT.
The aging process is depicted in the following diagram:
::
+------------------+ +------------------+ +------------------+
| | | | | |
| OVS | | Dragonflow | | CentralDB |
| | | | | |
+---------+--------+ +---------+--------+ +---------+--------+
| | |
| set fail mode to secure | |
| <---------------------------+ |
| | |
| +-----+ |
| | |restart |
| | | |
| +-----+ |
| | |
| notify all ports | |
+---------------------------> | get ports' detail info |
| +---------------------------> |
| | |
| | return ports' info |
| | <---------------------------+
| | |
| add flows with new cookie | |
| <---------------------------+ |
| | |
| | |
| get all flows | |
| <---------------------------+ |
| return | |
+---------------------------> | |
| | |
| delete flows with stale cookie |
| <---------------------------| |
| | |
| | |
+ + +
Solution to OVS restart
------------------------
An OVS restart deletes all flows and interrupts traffic.
After startup, OVS reconnects to the controller to set up new flows.
This process is depicted in the following diagram:
::
+------------------+ +------------------+ +------------------+
| | | | | |
| OVS | | Dragonflow | | CentralDB |
| | | | | |
+------------------+ +---------+--------+ +---------+--------+
+----+ | |
| |restart | |
| | | |
+----+ | |
| | |
| notify all ports | |
+---------------------------> | |
| | get ports' detail info |
| +---------------------------> |
| | |
| | return ports' info |
| +<--------------------------- |
| | |
| create bridges if needed | |
| <---------------------------+ |
| | |
| | |
| add flows with new cookie | |
| <---------------------------+ |
| | |
| | |
+ + +
Solution to residual flows
---------------------------
Residual flows are flows that no longer take any effect but remain in the
flow table. Backward-incompatible upgrades and incorrect implementations may
generate this kind of flow. Residual flows may not affect forwarding, but
they occupy flow table space and complicate maintenance.
To manage this issue, we could reuse the solution for 'local controller
restart': trigger the local controller to re-flush flows, then delete the
flows with the old cookie.
Pros
"""""
It's easy to implement, because we can reuse the solution for
'local controller restart'.
Cons
"""""
It's not efficient, because we need to regenerate all the flows again.
This method is suited to residual flows caused by a backward-incompatible
upgrade.
Solution to missing flows
--------------------------
When flows are missing, OVS cannot forward a packet by itself and forwards
it to the local controller instead. For example, in the context of DVR
forwarding, if there is no host route flow for the destination, OVS forwards
the packet to the local controller according to the network flow. Upon
receiving the packet, the local controller forwards it, regenerates the host
flow, and flushes it to OVS. We don't discuss this in more detail here; it
is handled by the specific Dragonflow application.
References
===========
[1] http://www.openvswitch.org/support/dist-docs-2.5/ovs-vswitchd.8.pdf
[2] http://www.openvswitch.org/support/dist-docs-2.5/ovsdb-server.1.pdf
[3] https://bugs.launchpad.net/mos/+bug/1480292
[4] https://bugs.launchpad.net/openstack-manuals/+bug/1487250
[5] https://www.kernel.org/doc/Documentation/networking/openvswitch.txt

View File

@@ -1,126 +1,126 @@
..
This work is licensed under a Creative Commons Attribution 3.0 Unported
License.
http://creativecommons.org/licenses/by/3.0/legalcode
===============
OVSDB Monitor
===============
This blueprint describes the addition of OVSDB monitor support for
Dragonflow. It implements a lightweight OVSDB driver based on the OVSDB
monitor/notification mechanism, which solves the performance problem of
Dragonflow fetching VM port/interface info from OVSDB.
===================
Problem Description
===================
In the current Dragonflow implementation of fetching OVSDB data, Dragonflow
runs a loop to detect the add/update/delete of logical ports. For example,
after Dragonflow finds a new logical port, it establishes a socket channel
to OVSDB, fetches a large amount of data from several OVSDB tables
(Bridge/Port/Interface) and extracts the few useful fields (ofport, chassis
id) for the new logical port. This implementation has several performance
problems:

* The loop consumes significant server resources, because it frequently
  pulls a large amount of data from the DB cluster and compares it with the
  local cache;
* For each new logical port, Dragonflow creates a socket channel to fetch
  data from OVSDB; if many new logical ports are created, possibly within a
  very short time, this consumes even more server resources;
* Each session between Dragonflow and OVSDB for a new logical port fetches
  a lot of unnecessary data from many OVSDB tables.
====================
Solution Description
====================
We bring in the OVSDB monitor/notification mechanism, which is described in
detail in the OVSDB protocol RFC
(https://tools.ietf.org/html/rfc7047#section-4.1.5).
Dragonflow and Open vSwitch run on the same server. When OVS starts up, the
OVSDB server listens on port 6640; when Dragonflow starts up, the OVSDB
driver attempts to connect to OVSDB and subscribe to the data it is
interested in. The details are shown below:
1. The OVSDB server starts up and listens on port 6640; Dragonflow starts
   up, and its OVSDB driver tries to connect to the OVSDB server as an
   OVSDB client at tcp:127.0.0.1:6640;
2. When the OVSDB driver establishes the channel to the OVSDB server, it
   sends the OVSDB monitor command with the following JSON-RPC content::

      method: monitor
      params: [<db-name>, <json-value>, <monitor-requests>]
      id: nonnull-json-value

   In our solution we only monitor the OVSDB Interface table, so the OVSDB
   driver sends a monitor request for the Interface table to the OVSDB
   server (see the sketch after this list);
3. When the OVSDB server receives the monitor message sent by the OVSDB
   driver, it sends back a reply containing the detailed info of all the
   interfaces it has (if any);
4. The OVSDB driver receives and decodes the monitor reply, maps each
   interface to an event of the corresponding type (bridge online, VM
   online, tunnel port online, patch port online), and notifies the
   upper-layer modules of these events;
5. When a tenant boots a VM on the host and its port is added to the OVS
   bridge, the OVSDB server sends a notification to the OVSDB driver for
   the update of the OVS Interface table; the notification contains only
   the new VM interface's details, and after receiving it the driver does
   the same work as in step 4;
6. When a tenant shuts down a VM on the host and its port is deleted from
   the OVS bridge, the OVSDB server sends a notification to the OVSDB
   driver for the update of the OVS Interface table; the notification
   contains only the deleted VM interface's details, and after receiving it
   the driver does the same work as in step 4.
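A minimal sketch of the monitor request from step 2, built as a raw
JSON-RPC message per RFC 7047; the column selection and helper name are
assumptions for illustration::

    import json
    import socket

    def send_interface_monitor(host='127.0.0.1', port=6640):
        # Monitor only the Interface table of the Open_vSwitch database,
        # asking for the columns needed to classify events (see Event
        # Classification below).
        request = {
            'method': 'monitor',
            'id': 1,  # any non-null JSON value
            'params': ['Open_vSwitch', None, {
                'Interface': {'columns': ['name', 'type', 'options',
                                          'external_ids', 'ofport']},
            }],
        }
        sock = socket.create_connection((host, port))
        sock.sendall(json.dumps(request).encode('utf-8'))
        # The monitor reply and later update notifications arrive on
        # this socket.
        return sock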
If the Dragonflow process or OVSDB restarts, the Dragonflow OVSDB driver
reconnects to the OVSDB server, and steps 1 to 6 are executed again.
====================
Event Classification
====================
We can determine the event type from the field contents in the monitor
reply or table change notification. To see the detailed content of the
messages, you can run ``ovsdb-client monitor Interface -v`` on the OVS
host. The fields used for the classification are shown below:
Bridge online/offline::

    type       internal
    name       br-int/br-tun/br-ex

VM online/offline::

    iface-id   4aa64e21-d9d6-497e-bfa9-cf6dbb574054
    name       tapxxx

Tunnel port online/offline::

    remote_ip  10.10.10.10
    name       dfxxx
    type       vxlan/gre/geneve

Patch port online/offline::

    type       patch
    options    peer=<peer port name>
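A sketch of how the driver could map one Interface row from the monitor
reply or a table-update notification to these event types; the event names
are illustrative::

    def classify_interface_event(row):
        # 'row' is one Interface row decoded into a dict of columns.
        iface_type = row.get('type', '')
        options = row.get('options', {})
        if iface_type == 'internal':
            return 'bridge'
        if iface_type in ('vxlan', 'gre', 'geneve') and 'remote_ip' in options:
            return 'tunnel_port'
        if iface_type == 'patch' and 'peer' in options:
            return 'patch_port'
        if 'iface-id' in row.get('external_ids', {}):
            return 'vm_port'
        return 'unknown'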
==========
Conclusion
==========
Our solution provides a lightweight OVSDB driver that implements OVSDB data
monitoring and synchronization, removes the Dragonflow polling loop,
maintains only one socket channel, and transfers less data.

View File

@@ -1,111 +1,111 @@
..
This work is licensed under a Creative Commons Attribution 3.0 Unported
License.
http://creativecommons.org/licenses/by/3.0/legalcode
=============================
Redis Availability
=============================
This spec describes the availability design for Redis in Dragonflow.
Problem Description
====================
Dragonflow's Redis driver reads the Redis cluster topology and caches it
locally during the driver's initialization, and then connects to the Redis
master nodes to run read/write/pub/sub commands.
This cluster topology may change, triggering HA, in scenarios such as a DB
master node restarting, and Dragonflow should detect this so that it can
move the connections from the old master node to the new one.
There are two scenarios for a Redis cluster topology change:
1. The connection is lost when a master node restarts.
2. The connection is not lost when a master is demoted to slave without
   restarting, e.g. by the "CLUSTER FAILOVER" command.
   In this case one slave is promoted to master, and the client gets no
   connection error but a MOVED error from the server after sending a
   request to the new slave node.
Some data may be lost during Redis HA, because Redis does not provide
strong consistency. So for this case the driver should notify the DB
Consistency module to resynchronize the local data to the Redis cluster
after the cluster finishes HA.
The goal of this design is to describe how to keep the Redis cluster
available when a node crash occurs.
It could be divided into 2 steps:
1. Detecting changes of cluster topology
2. Processing HA after detection
Proposed Change
================
Description of step 1
-------------------------------------
If this step were done in each controller, too many Dragonflow compute
nodes might read the DB cluster at the same time, and the Redis cluster
could hardly handle that.
So we create a detection thread in the NB plugin that reads the DB topology
information periodically once the Neutron server has started, and sends the
information to all Dragonflow controllers so they can check whether the DB
cluster nodes have changed. The controllers subscribe to an "HA" topic to
receive these messages from the plugin.
After initialization, the Dragonflow controller never reads node
information from the Redis cluster itself; it only listens for the messages
from the plugin's detection task.
There are two types of connection between the Redis client and the cluster:
1. Read/write connections: the client connects to every Redis master node.
2. Pub/sub connections: the client connects to one of the cluster nodes
   chosen by hash.
On a type 2 connection failure, the client should immediately hash to
another node. On a type 1 connection failure, the connections are updated
after receiving the messages sent by the detection task.
Either a connection error or a MOVED error detected in the Redis driver
indicates that the cluster topology may have changed.
Note that the driver reconnects after a connection error; if the
reconnection fails too, it means an HA event has occurred.
Description of step 2
------------------------
After receiving the cluster information from the plugin, the local
controller compares the new nodes with the old ones and updates the
topology information and connections; then a "dbrestart" message is sent to
the DB consistency module, as sketched below.
The following diagram shows the procedure in Dragonflow::
NB
+-------------------------------+
| 1.notify |
+--------+------> +----------+ |
||driver | |DB consist| |
|--------+ +----------+ |
+-------------------------------+
|
2.resync data|
|
+-------------------v------+
| |
| |
| Redis cluster |
| |
| |
+--------------------+-----+
^
2.resync data |
|
+-------------------------------+
| 1.notify | |
+--------+------> +--+-------+ |
||driver | |DB consist| |
|--------+ +----------+ |
+-------------------------------+
SB
References
===========
[1] http://redis.io/topics/cluster-tutorial
[2] http://redis.io/topics/cluster-spec

View File

@@ -1,84 +1,84 @@
..
This work is licensed under a Creative Commons Attribution 3.0 Unported
License.
http://creativecommons.org/licenses/by/3.0/legalcode
===========================
Remote Device Communication
===========================
https://blueprints.launchpad.net/dragonflow/+spec/remote-device-communication
This spec proposes a solution for communicating with a remote device that
is not managed by Dragonflow.
Problem Description
===================
In the common scenario, a VM needs to communicate not only with other VMs
but also with physical machines. However, the virtual or physical machine
may not be managed by Dragonflow; in this spec we call it a remote device.
If a VM in Dragonflow wants to communicate with a remote device, Dragonflow
needs to know some info about that device.
Usually a VTEP is deployed for the virtual or physical machine in the DC
network, such as an Open vSwitch VXLAN port, a VTEP ToR (top of rack)
switch, or a physical router that supports VTEP. So if Dragonflow knows the
correct VTEP IP, a VM in Dragonflow can reach the remote device over the
overlay network.
The remote device may belong to a tenant, or it may have no tenant info at
all. It could be managed by another cloud OS; how the remote device learns
the location of the VM in Dragonflow and accesses it is out of the scope of
this spec.
Proposed Change
===============
To resolve the problem, the general idea is to tell Dragonflow about the
remote device. We can invoke the Neutron create_port API and provide the
remote device's info; the plugin then assigns a specific chassis name to
the remote device and publishes the create_port message. When a chassis
receives the message, it creates the corresponding tunnel port to the
remote chassis and installs the forwarding rules.
Neutron Plugin
--------------
When we invoke the create_port API provided by the Neutron plugin in
Dragonflow, it is processed as follows:
1. We put the info indicating that the Neutron port is a remote device port
   into the binding_profile field, so that the Neutron plugin can recognize
   it::

      binding_profile = {"port_key": "remote_port",
                         "host_ip": remote_chassis_ip}
2. When the Neutron plugin finds from the binding_profile field of the
   create_port message that this is a remote port, it assigns the
   remote_chassis_ip as the chassis name of the remote port, because the
   remote_chassis_ip should be unique in the DC network. Then it stores the
   lport in the DF DB and publishes the message with the corresponding
   topic; if the lport belongs to some tenant, we can use the tenant_id as
   the topic. A client-side sketch follows this list.
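A minimal sketch of creating such a remote device port through
python-neutronclient; the credentials, network UUID and VTEP IP are
placeholders, not values from this spec::

    from neutronclient.v2_0 import client

    neutron = client.Client(username='admin', password='secret',
                            tenant_name='demo',
                            auth_url='http://controller:5000/v2.0')

    # binding:profile marks the port as a remote device port; host_ip is
    # the remote VTEP IP the tunnel should point at.
    port = neutron.create_port({'port': {
        'network_id': 'NETWORK_UUID',  # placeholder
        'binding:profile': {'port_key': 'remote_port',
                            'host_ip': '192.0.2.10'},
    }})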
DF Local Controller
-------------------
The DF local controller processes the above notification message:
1. The DF local controller analyses the create_port message, finds that it
   is a remote device port from the specific chassis name, and fetches the
   remote tunnel IP from that chassis name.
2. The local controller checks whether the local chassis already has a
   tunnel port to the specific remote chassis; if not, it creates the
   tunnel port and establishes the tunnel to the remote chassis.
3. After the tunnel port has been created, the local controller sends the
   create_lport message to the apps, and the port is treated as a normal
   remote port, as in the current implementation.
On the other hand, when the remote device port is deleted from the local
cache, the local controller no longer needs to communicate with that remote
chassis, so it should delete the corresponding tunnel port and forwarding
rules, as sketched below.
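A sketch of the controller-side tunnel bookkeeping described above, with
illustrative helper names rather than the actual Dragonflow ones::

    def on_remote_port_event(lport, ovsdb, cache, apps, deleted=False):
        # The chassis name of a remote device port is its VTEP IP.
        remote_ip = lport.chassis_name
        if deleted:
            # Tear the tunnel down only when no remote port still
            # points at this chassis.
            if not any(p.chassis_name == remote_ip
                       for p in cache.remote_ports()):
                ovsdb.delete_tunnel_port(remote_ip)
            return
        # Create the tunnel lazily, once per remote chassis (step 2).
        if not ovsdb.has_tunnel_port(remote_ip):
            ovsdb.create_tunnel_port(remote_ip, tunnel_type='vxlan')
        # Step 3: hand the port to the apps as a normal remote port.
        apps.notify_create_lport(lport)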