neutron/neutron
Sławek Kapłoński b22a9d5ad5 Fix race condition with enabling SG on many ports at once
When many calls to enable security groups on ports arrive at once,
a race condition can sometimes occur between refreshing resource_cache
with data fetched by a "pull" call to neutron server and data received
in a "push" rpc message from neutron server.
In that case, when the "push" message arrives with information about
an updated port (with port_security enabled), the port is already
updated in the local cache, so the local AFTER_UPDATE callback is not
triggered for that port and its firewall rules are not updated.

It happened quite often in the fullstack security groups test because
4 ports are created in this test and all 4 are updated, one by one,
to apply a SG to them.
Here is what happens in detail:
1. port 1 was updated in neutron-server so it sends push notification
   to L2 agent to update security groups,
2. port 1 info was saved in resource cache on L2 agent's side and agent
   started to configure security groups for this port,
3. as one of the steps, L2 agent called the
   SecurityGroupServerAPIShim._select_ips_for_remote_group() method;
   in that method RemoteResourceCache.get_resources() is called and this
   method asks neutron-server for details about ports from the given
   security_group,
4. in the meantime neutron-server got a port update call for the second port
   (with the same security group) so it sends the L2 agent information about
   2 ports (as a reply to the request sent from L2 agent in step 3),
5. resource cache updates information about the two ports in local cache,
   returns its data to
   SecurityGroupServerAPIShim._select_ips_for_remote_group() and all
   looks fine,
6. but now L2 agent receives push notification with info that port 2 is
   updated (changed security groups), so it checks info about this port
   in local cache,
7. in local cache, info about port 2 already includes the updated security
   group, so RemoteResourceCache doesn't trigger the local AFTER_UPDATE
   notification and L2 agent doesn't know that security groups for this
   port should be changed.

This patch fixes it by changing the way items are updated in
the resource_cache.
It is now done with the record_resource_update() method instead of
writing new values directly to the resource_cache._type_cache dict.
As a result, if a resource is updated during a "pull" call to neutron
server, local AFTER_UPDATE will still be triggered for that resource.
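
The sequence above can be sketched with a toy model. This is only an
illustration under stated assumptions, not the actual neutron code:
ToyResourceCache, its pull()/push() methods, the notifications list and
simulate() are hypothetical stand-ins. Only the names
record_resource_update and _type_cache come from the description above,
and their real behaviour in neutron is more involved than shown here.

```python
class ToyResourceCache:
    """Toy stand-in for the agent-side resource cache (hypothetical)."""

    def __init__(self, route_pull_through_update):
        self._type_cache = {}      # resource id -> last known state
        self.notifications = []    # recorded local AFTER_UPDATE events
        self._route_pull_through_update = route_pull_through_update

    def record_resource_update(self, resource):
        """Store the resource; emit AFTER_UPDATE only if its state changed."""
        existing = self._type_cache.get(resource["id"])
        if existing == resource:
            return  # cache already holds this state, nothing to notify
        self._type_cache[resource["id"]] = dict(resource)
        self.notifications.append(("AFTER_UPDATE", resource["id"]))

    def pull(self, server_resources):
        """Refresh the cache from the reply to a "pull" call."""
        for resource in server_resources:
            if self._route_pull_through_update:
                # Behaviour after the patch: go through the update path.
                self.record_resource_update(resource)
            else:
                # Behaviour before the patch: silent direct write.
                self._type_cache[resource["id"]] = dict(resource)

    def push(self, resource):
        """Handle a "push" notification from the server."""
        self.record_resource_update(resource)


def simulate(route_pull_through_update):
    """Replay steps 1-7 and return the local notifications that fired."""
    cache = ToyResourceCache(route_pull_through_update)
    port1 = {"id": "port1", "security_groups": ["sg1"]}
    port2 = {"id": "port2", "security_groups": ["sg1"]}
    # Assumption: port 2 is already cached without any security group.
    cache._type_cache["port2"] = {"id": "port2", "security_groups": []}

    cache.push(port1)            # steps 1-2: push for port 1
    cache.pull([port1, port2])   # steps 3-5: pull returns both ports
    cache.push(port2)            # steps 6-7: late push for port 2
    return cache.notifications


if __name__ == "__main__":
    print("direct write:", simulate(False))
    print("via record_resource_update:", simulate(True))
```

With the direct write, the late push for port 2 finds an identical entry
in the cache and no AFTER_UPDATE is recorded for it; when the pull goes
through record_resource_update(), the notification for port 2 fires
during the pull itself.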

Change-Id: I5a62cc5731c5ba571506a3aa26303a1b0290d37b
Closes-Bug: #1742401
(cherry picked from commit 725df3e038)
2018-06-20 18:16:07 +00:00
..
agent Fix race condition with enabling SG on many ports at once 2018-06-20 18:16:07 +00:00
api Fix lack of routes for neighbour IPv4 subnets 2018-06-01 19:58:18 +00:00
callbacks Merge "service: add callback AFTER_SPAWN" 2017-05-20 09:09:12 +00:00
cmd More efficiently clean up OVS ports 2018-01-25 15:29:49 +00:00
common Fix eventlet imports issue 2018-06-01 10:00:10 +00:00
conf DVR: Provide options for DVR North/South routing centralized 2017-08-18 22:09:37 +00:00
core_extensions Fix default qos policy when creating network 2017-08-24 14:35:38 +00:00
db Only allow SG port ranges for whitelisted protocols 2018-05-11 00:24:34 +02:00
debug Make code follow log translation guideline 2017-08-14 10:53:33 -07:00
extensions Only allow SG port ranges for whitelisted protocols 2018-05-11 00:24:34 +02:00
hacking hacking: Remove dead code 2017-07-19 13:43:44 +02:00
ipam Always pass device_owner to _ipam_get_subnets() 2018-02-17 18:32:02 +00:00
locale Imported Translations from Zanata 2017-07-18 08:36:04 +00:00
notifiers Make code follow log translation guideline 2017-08-14 10:53:33 -07:00
objects Fix Port OVO filtering based on security groups 2018-06-08 13:25:16 +00:00
pecan_wsgi Dont log about skipping notification in normal case 2017-09-23 20:14:57 +00:00
plugins [OVS] Fix for cleaning after skipped_devices 2018-06-12 20:57:08 +00:00
privileged DVR: Fix allowed_address_pair IP, ARP table update by neutron agent 2018-03-26 17:24:20 -07:00
quota CountableResource: try count/get functions for all plugins 2017-09-12 16:23:22 +00:00
scheduler Avoid redundant HA port creation during migration 2017-09-11 19:56:19 +00:00
server Make code follow log translation guideline 2017-08-14 10:53:33 -07:00
services Merge "Fix error message when duplicate QoS rule is created" into stable/pike 2018-03-08 14:29:07 +00:00
tests Fix race condition with enabling SG on many ports at once 2018-06-20 18:16:07 +00:00
__init__.py Hacking rule to check i18n usage 2016-03-30 21:28:37 -04:00
_i18n.py Make code follow log translation guideline 2017-08-14 10:53:33 -07:00
auth.py Use oslo.context class method to construct context object 2017-03-23 09:02:46 +00:00
manager.py Do not load default service plugins if core plugin is not DB based 2017-11-20 15:36:35 +00:00
neutron_plugin_base_v2.py Do not load default service plugins if core plugin is not DB based 2017-11-20 15:36:35 +00:00
opts.py fix missing l2pop config option docs 2017-10-23 17:40:00 +02:00
policy.py Log policy filters in one line 2017-08-23 21:23:01 +00:00
service.py Make code follow log translation guideline 2017-08-14 10:53:33 -07:00
version.py
worker.py replace WorkerSupportServiceMixin with neutron-lib's WorkerBase 2017-06-14 06:56:48 -06:00
wsgi.py Make code follow log translation guideline 2017-08-14 10:53:33 -07:00