Call update_all_ha_network_port_statuses on agent start
As explained in bug [1], when the l3 agent fails to report state to the
server, its state is set to AGENT_REVIVED. This triggers
fetch_and_sync_all_routers, which sets all of the agent's HA network
ports to DOWN, resulting in:
1) the ovs agent rewiring these ports and setting their status to ACTIVE
2) the server sending a router update to the l3 agent once these ports
   are active
Because the server, the ovs agent and the l3 agent are all busy with this
processing, the l3 agent may again fail to report state, repeating the
process.

As the l3 agent repeatedly processes the same routers, SIGHUPs are
frequently sent to keepalived, resulting in multiple masters.

To fix this, we call update_all_ha_network_port_statuses on l3 agent
start instead of calling it from fetch_and_sync_all_routers.

[1] https://bugs.launchpad.net/neutron/+bug/1731595/comments/7

Change-Id: Ia9d5549f7d53b538c9c9f93fe6aa71ffff15524a
Related-bug: #1597461
Closes-Bug: #1731595
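The effect of moving the call can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions: `FakePluginRpc`, `AgentBefore`, `AgentAfter` and `count_resets` are hypothetical stand-ins invented for illustration, not neutron classes. Only the method names `update_all_ha_network_port_statuses`, `get_router_ids` and `fetch_and_sync_all_routers` come from the change itself.

```python
# Illustrative stand-ins only; none of these classes are the real neutron code.
class FakePluginRpc:
    """Hypothetical RPC stub that counts how often HA ports are set to DOWN."""

    def __init__(self):
        self.ha_port_resets = 0

    def update_all_ha_network_port_statuses(self, context):
        self.ha_port_resets += 1

    def get_router_ids(self, context):
        return ["r1", "r2", "r3"]


class AgentBefore:
    """Old behavior: every full sync resets the HA network ports."""

    def __init__(self, plugin_rpc, context=None):
        self.plugin_rpc = plugin_rpc

    def fetch_and_sync_all_routers(self, context):
        router_ids = self.plugin_rpc.get_router_ids(context)
        self.plugin_rpc.update_all_ha_network_port_statuses(context)
        return router_ids


class AgentAfter:
    """New behavior: HA network ports are reset once, at agent start."""

    def __init__(self, plugin_rpc, context=None):
        self.plugin_rpc = plugin_rpc
        try:
            self.plugin_rpc.update_all_ha_network_port_statuses(context)
        except Exception:
            # The actual change logs the failure; this sketch just continues.
            pass

    def fetch_and_sync_all_routers(self, context):
        return self.plugin_rpc.get_router_ids(context)


def count_resets(agent_cls, revivals):
    """Simulate `revivals` AGENT_REVIVED events, each forcing a full sync."""
    rpc = FakePluginRpc()
    agent = agent_cls(rpc)
    for _ in range(revivals):
        agent.fetch_and_sync_all_routers(context=None)
    return rpc.ha_port_resets
```

With five simulated revivals, `count_resets(AgentBefore, 5)` returns 5 while `count_resets(AgentAfter, 5)` returns 1: the number of port resets no longer grows with the number of resyncs, which is what breaks the feedback loop described above.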
parent c3ab39a279
commit 9ed693228f
@@ -255,6 +255,14 @@ class L3NATAgent(ha.AgentMixin,
         consumers = [[topics.NETWORK, topics.UPDATE]]
         agent_rpc.create_consumers([self], topics.AGENT, consumers)
 
+        # We set HA network port status to DOWN to let l2 agent update it
+        # to ACTIVE after wiring. This allows us to spawn keepalived only
+        # when l2 agent finished wiring the port.
+        try:
+            self.plugin_rpc.update_all_ha_network_port_statuses(self.context)
+        except Exception:
+            LOG.exception('update_all_ha_network_port_statuses failed')
+
     def _check_config_params(self):
         """Check items in configuration files.
 
@@ -600,10 +608,6 @@ class L3NATAgent(ha.AgentMixin,
                          lib_const.L3_AGENT_MODE_DVR_SNAT)
         try:
             router_ids = self.plugin_rpc.get_router_ids(context)
-            # We set HA network port status to DOWN to let l2 agent update it
-            # to ACTIVE after wiring. This allows us to spawn keepalived only
-            # when l2 agent finished wiring the port.
-            self.plugin_rpc.update_all_ha_network_port_statuses(context)
             # fetch routers by chunks to reduce the load on server and to
             # start router processing earlier
             for i in range(0, len(router_ids), self.sync_routers_chunk_size):