Tune up report and downtime intervals for l2 agent

If the neutron server erroneously thinks that the l2 agent is down, it will fail to bind a port, which can lead to VM spawn errors. However, the issue is only transient because the agent is effectively only 'late' in reporting back. The best solution would be an alpha-count algorithm (so that we can detect persistent failures more reliably), but for now let's be more tolerant before assuming that the agent is down, by waiting at least twice the report interval plus a tiny teeny bit.

Change-Id: I544135ce1f6b7eaefb34ac44af8f5844d92ddd95
Close-bug: #1244255
2013-11-01 15:47:22 -07:00 · 291048aba2
commit 291048aba2
parent 54d79acaab
3 changed files with 12 additions and 7 deletions
--- a/etc/neutron.conf
+++ b/etc/neutron.conf
@@ -211,8 +211,9 @@ notification_driver = neutron.openstack.common.notifier.rpc_notifier
 # max_fixed_ips_per_port = 5
 # =========== items for agent management extension =============
-# Seconds to regard the agent as down.
-# agent_down_time = 5
+# Seconds to regard the agent as down; should be at least twice
+# report_interval, to be sure the agent is down for good
+# agent_down_time = 9
 # ===========  end of items for agent management extension =====
 # =========== items for agent scheduler extension =============
@@ -301,8 +302,8 @@ notification_driver = neutron.openstack.common.notifier.rpc_notifier
 # root_helper = sudo
 # =========== items for agent management extension =============
-# seconds between nodes reporting state to server, should be less than
-# agent_down_time
+# seconds between nodes reporting state to server; should be less than
+# agent_down_time, best if it is half or less than agent_down_time
 # report_interval = 4
 # ===========  end of items for agent management extension =====
--- a/neutron/agent/common/config.py
+++ b/neutron/agent/common/config.py
@@ -33,7 +33,9 @@ ROOT_HELPER_OPTS = [
 AGENT_STATE_OPTS = [
    cfg.FloatOpt('report_interval', default=4,
-                 help=_('Seconds between nodes reporting state to server')),
+                 help=_('Seconds between nodes reporting state to server; '
+                        'should be less than agent_down_time, best if it '
+                        'is half or less than agent_down_time.')),
 ]
--- a/neutron/db/agents_db.py
+++ b/neutron/db/agents_db.py
@@ -31,8 +31,10 @@ from neutron.openstack.common import timeutils
 LOG = logging.getLogger(__name__)
 cfg.CONF.register_opt(
-    cfg.IntOpt('agent_down_time', default=5,
-               help=_("Seconds to regard the agent is down.")))
+    cfg.IntOpt('agent_down_time', default=9,
+               help=_("Seconds to regard the agent is down; should be at "
+                      "least twice report_interval, to be sure the "
+                      "agent is down for good.")))
 class Agent(model_base.BASEV2, models_v2.HasId):
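
To make the relationship between the two options concrete, here is a minimal sketch of the kind of staleness check that agent_down_time feeds into. It is not Neutron's actual code: the function names is_agent_down and missed_reports_tolerated are hypothetical, and the sketch assumes the server simply compares the age of the last report against agent_down_time. The defaults (4 and 9) are taken from the diff above.

```python
from datetime import datetime, timedelta

REPORT_INTERVAL = 4  # seconds; default of report_interval in the diff
AGENT_DOWN_TIME = 9  # seconds; new default of agent_down_time in the diff


def is_agent_down(last_heartbeat, now, agent_down_time=AGENT_DOWN_TIME):
    """Hypothetical check: True when the last report is older than
    agent_down_time seconds."""
    return (now - last_heartbeat) > timedelta(seconds=agent_down_time)


def missed_reports_tolerated(report_interval, agent_down_time):
    """Hypothetical helper: number of whole report intervals that fit
    inside agent_down_time."""
    return int(agent_down_time // report_interval)


if __name__ == "__main__":
    last = datetime(2013, 11, 1, 15, 0, 0)
    late = last + timedelta(seconds=5.5)  # report is 1.5 s late
    print(is_agent_down(last, late, agent_down_time=5))
    print(is_agent_down(last, late, agent_down_time=9))
    print(missed_reports_tolerated(REPORT_INTERVAL, AGENT_DOWN_TIME))
```

Under these assumptions, a report that arrives 5.5 seconds after the previous one marks the agent as down with the old default of 5, but not with the new default of 9, which leaves room for two full report intervals plus one second of slack.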