Refactor the security_group_rules_for_devices call.

Refactoring the call into security_group_rules_for_devices_compact,
which will obsolete the old one during the next cycle.

blueprint security-group-rules-for-devices-rpc-call-refactor

Change-Id: I6830c48082d3886bbd9c3a9c29f9311e45e2cb17

..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode
=======================================================
'security_group_rules_for_devices' RPC call refactoring
=======================================================
https://blueprints.launchpad.net/neutron/+spec/security-group-rules-for-devices-rpc-call-refactor
Security group rules synchronization from neutron-server to L2 agents scales
poorly for high density clouds, leading to a blocked neutron-server in some
situations.
Problem description
===================
The security_group_rules_for_devices RPC call from L2 agents to
neutron-server doesn't scale well because all the security group rule entries
are expanded with each specific member IP address (see [#call_results]_) of a
security group, when that group is referenced in rules like::

    allow all from 'default' group for IPv4 and IPv6
This leads to:

* huge AMQP messages (>20-600 MB)
* very long processing times on the neutron-server side (>60 seconds) when
  there are lots of instances under the same tenant/security group
* neutron-server lockups when the RPC call times out on the agent side and
  the same security_group_rules_for_devices call is issued back to neutron
For a more detailed insight:

* security_group_rules_for_devices is an RPC call from the L2 agents to
  neutron-server, see [#call_results]_.
  The call's argument is a list of device_ids; device_ids are connected
  to ports. Neutron builds the list of security group rules
  (see [#call_results]_) and returns it keyed by device_id.
The message size between neutron-server and the L2 agent grows
according to the following formula::

    MessageSize ~= base + L * Instances_in_host *
                   (Instances_in_security_group - 1)

where L = len(str(security_group_rule)) ~= 440 bytes.
This problem is more likely to appear in big or dense clouds, and it is a
real issue because nova itself can scale to 10x as many instances without
incident: see [#n2nov_scaling]_.
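To make the growth concrete, here is a rough back-of-the-envelope estimate
based on the formula above; the instance counts and the fixed overhead are
purely illustrative assumptions, not measurements:

.. code-block:: python

    # Rough estimate of the old (expanded) reply size; numbers are illustrative.
    RULE_LEN = 440   # ~len(str(security_group_rule)), as stated above
    BASE = 1000      # assumed fixed overhead, in bytes

    def old_message_size(instances_in_host, instances_in_security_group):
        """MessageSize ~= base + L * Instances_in_host * (Instances_in_sg - 1)"""
        return (BASE + RULE_LEN * instances_in_host *
                (instances_in_security_group - 1))

    # 50 instances on the host, 2000 instances sharing one security group:
    print(old_message_size(50, 2000) / (1024.0 * 1024))  # ~42 MB per reply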
Proposed change
===============
Refactor security_group_rules_for_devices into a new call
'security_group_rules_for_devices_compact' which, instead of returning the
expanded result [#call_results]_, returns a more compact result that can be
expanded by the L2 agent.

The new 'security_group_rules_for_devices_compact' RPC call returns:
.. code-block:: python
{'security_groups': {'sg-id1': {'rules': [ {},{},{},{}]}
},
'security_group_member_ips': { 'sg-id1' : {'ipv4': ['192.168.11.2/32'],
'ipv6': [] },
'sg-id2-referenced-from-id1' : {...},
},
'devices': {'dev-id1': { ... ,
'fixed_ips': ['192.168.11.4'],
'security_groups': ['sg-id1', 'sg-id2', ...] }}
}
Security group rules would be passed in a non-expanded way, which implies
that any reference to src or dst security groups in rules would be stored
as security group ids.
A response like this (old result):
.. code-block:: python
{'dev-id1':
{...,
'security_group_rules': [{'direction': u'egress',
'ethertype': u'IPv6',
'security_group_id':
u'1809f907-4b0c-4445-a366-ff28eaab9c2e'},
{'direction': u'egress',
'ethertype': u'IPv4',
'security_group_id':
u'1809f907-4b0c-4445-a366-ff28eaab9c2e'},
{'direction': u'ingress',
'ethertype': u'IPv4',
'protocol': u'icmp',
'security_group_id':
u'1809f907-4b0c-4445-a366-ff28eaab9c2e'},
{'direction': u'ingress',
'ethertype': u'IPv4',
'remote_group_id':
u'1809f907-4b0c-4445-a366-ff28eaab9c2e',
'security_group_id':
u'1809f907-4b0c-4445-a366-ff28eaab9c2e',
'source_ip_prefix': '192.168.11.2/32'},
{'direction': u'ingress',
'ethertype': u'IPv4',
'remote_group_id':
u'1809f907-4b0c-4445-a366-ff28eaab9c2e',
'security_group_id':
u'1809f907-4b0c-4445-a366-ff28eaab9c2e',
'source_ip_prefix': '192.168.11.3/32'},
{'direction': u'ingress',
'ethertype': u'IPv4',
'remote_group_id':
u'1809f907-4b0c-4445-a366-ff28eaab9c2e',
'security_group_id':
u'1809f907-4b0c-4445-a366-ff28eaab9c2e',
'source_ip_prefix': '192.168.11.4/32'},
{'direction': u'ingress',
'ethertype': u'IPv4',
'remote_group_id':
u'23138476-4fde-454e-33ad-abc123456782',
'security_group_id':
u'1809f907-4b0c-4445-a366-ff28eaab9c2e',
'source_ip_prefix': '192.168.33.4/32'}
]
},
'dev-id2': {
...,
'security_group_rules': [{'direction': u'egress',
'ethertype': u'IPv6',
'security_group_id':
u'1809f907-4b0c-4445-a366-ff28eaab9c2e'},
{'direction': u'egress',
'ethertype': u'IPv4',
'security_group_id':
u'1809f907-4b0c-4445-a366-ff28eaab9c2e'},
{'direction': u'ingress',
'ethertype': u'IPv4',
'protocol': u'icmp',
'security_group_id':
u'1809f907-4b0c-4445-a366-ff28eaab9c2e'},
{'direction': u'ingress',
'ethertype': u'IPv4',
'remote_group_id':
u'1809f907-4b0c-4445-a366-ff28eaab9c2e',
'security_group_id':
u'1809f907-4b0c-4445-a366-ff28eaab9c2e',
'source_ip_prefix': '192.168.11.2/32'},
{'direction': u'ingress',
'ethertype': u'IPv4',
'remote_group_id':
u'1809f907-4b0c-4445-a366-ff28eaab9c2e',
'security_group_id':
u'1809f907-4b0c-4445-a366-ff28eaab9c2e',
'source_ip_prefix': '192.168.11.3/32'},
{'direction': u'ingress',
'ethertype': u'IPv4',
'remote_group_id':
u'1809f907-4b0c-4445-a366-ff28eaab9c2e',
'security_group_id':
u'1809f907-4b0c-4445-a366-ff28eaab9c2e',
'source_ip_prefix': '192.168.11.4/32'},
{'direction': u'ingress',
'ethertype': u'IPv4',
'remote_group_id':
u'1809f907-4b0c-4445-a366-ff28eaab9c2e',
'security_group_id':
u'1809f907-4b0c-4445-a366-ff28eaab9c2e',
'source_ip_prefix': '192.168.11.5/32'},
{'direction': u'ingress',
'ethertype': u'IPv4',
'remote_group_id':
u'23138476-4fde-454e-33ad-abc123456782',
'security_group_id':
u'1809f907-4b0c-4445-a366-ff28eaab9c2e',
'source_ip_prefix': '192.168.33.4/32'}
]
}
}
Would be like this in the new version:
.. code-block:: python
{'security_groups': {u'1809f907-4b0c-4445-a366-ff28eaab9c2e':
{'rules': [
{'direction': u'egress', 'ethertype': u'IPv6'},
{'direction': u'egress', 'ethertype': u'IPv4'},
{'direction': u'ingress',
'ethertype': u'IPv4',
'protocol': u'icmp',},
{'direction': u'ingress',
'ethertype': u'IPv4',
'remote_group_id':
u'1809f907-4b0c-4445-a366-ff28eaab9c2e'},
{'direction': u'ingress',
'ethertype': u'IPv4',
'remote_group_id':
u'23138476-4fde-454e-33ad-abc123456782'}
]
}
},
'security_group_member_ips': { u'1809f907-4b0c-4445-a366-ff28eaab9c2e' :
{u'ipv4': ['192.168.11.2/32',
'192.168.11.3/32',
'192.169.11.4/32',
'192.168.11.5/32'],
u'ipv6': []
},
u'23138476-4fde-454e-33ad-abc123456782' :
{u'ipv4': ['192.168.33.2/32'],
u'ipv6': []
}
},
'devices': {'dev-id1': { ... ,
'fixed_ips': ['192.168.11.4'],
'security_groups':
['1809f907-4b0c-4445-a366-ff28eaab9c2e'] },
'dev-id2': { ... ,
'fixed_ips': ['192.168.11.4'],
'security_groups':
['1809f907-4b0c-4445-a366-ff28eaab9c2e'] },
}
All security groups referenced from the devices will be included in the
response.

All member IP addresses of every security group referenced via
remote_group_id will also be included in the response.
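As an illustration of how an agent could consume this, here is a minimal
sketch (hypothetical helper names, not the actual agent code) that expands
the compact response back into per-device rules by resolving remote_group_id
against security_group_member_ips:

.. code-block:: python

    def expand_compact_response(resp):
        """Expand the compact reply into per-device security group rules."""
        sg_rules = resp['security_groups']
        member_ips = resp['security_group_member_ips']
        expanded = {}
        for dev_id, dev in resp['devices'].items():
            dev_rules = []
            for sg_id in dev['security_groups']:
                for rule in sg_rules[sg_id]['rules']:
                    remote = rule.get('remote_group_id')
                    if remote is None:
                        dev_rules.append(dict(rule, security_group_id=sg_id))
                        continue
                    # One expanded rule per member IP of the referenced group.
                    ip_ver = 'ipv4' if rule['ethertype'] == 'IPv4' else 'ipv6'
                    prefix = ('source_ip_prefix'
                              if rule['direction'] == 'ingress'
                              else 'dest_ip_prefix')
                    for ip in member_ips[remote][ip_ver]:
                        dev_rules.append(dict(rule, security_group_id=sg_id,
                                              **{prefix: ip}))
            expanded[dev_id] = dict(dev, security_group_rules=dev_rules)
        return expanded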
The old call could be marked as deprecated during this J cycle
and removed during the K cycle.

Making the refactor into a new call would have the following advantages:

#. Compatibility with older agents during a neutron-server upgrade.
#. Ability to split the patches (server/agents) into more steps, as we can
   add the new call on the server side while keeping the agents calling the
   old one, and then refactor the agents in further steps.
The resulting message size would be::

    MessageSize ~= base +
                   D * Instances_in_host +
                   L * Referenced_security_groups +
                   I * Instances_in_referenced_security_groups

where L = len(str(compact_security_group_rule)) ~= 220 bytes,
D = len(str(device_description_including_ips_and_sg_ids)), and
I = len(str(ip_address + ',')) ~= 17 bytes.
In the new message format no data is replicated, and no two variables are
multiplied together: the message size grows as a sum of the factors above
instead of as a product.
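Plugging in the same illustrative numbers used for the old formula above
(again, assumptions rather than measurements, including the assumed device
description size):

.. code-block:: python

    # Rough estimate of the compact reply size; numbers are illustrative.
    COMPACT_RULE_LEN = 220   # ~len(str(compact_security_group_rule))
    DEVICE_LEN = 500         # assumed device description size, in bytes
    IP_LEN = 17              # ~len(str(ip_address + ','))
    BASE = 1000              # assumed fixed overhead, in bytes

    def compact_message_size(instances_in_host, referenced_groups,
                             instances_in_referenced_groups):
        return (BASE +
                DEVICE_LEN * instances_in_host +
                COMPACT_RULE_LEN * referenced_groups +
                IP_LEN * instances_in_referenced_groups)

    # 50 instances on the host, 1 referenced group with 2000 members:
    print(compact_message_size(50, 1, 2000) / 1024.0)  # ~59 KB vs ~42 MB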
Next steps:

There is a proposal from Édouard Thuleau to use an RPC topic per security
group [#sec_fanout]_, which would be addressed in a second iteration after
this one.
Alternatives
------------
* Instead of including all the security groups in one RPC call, this could
  be split into a second call, 'security_groups_and_referenced_members',
  which would receive a list of security group ids and return a list of
  security groups plus a list of security group member IP addresses.
  A full sync would then require 2 calls to neutron-server.

* We could have a 'security_groups' and a 'security_groups_members' call:
  the first would provide the security groups without member IP addresses,
  and the second a list of IPv4 and IPv6 member addresses for each security
  group. A full sync would then require 3 calls to neutron-server, but this
  approach would allow new members or new rules in security groups to be
  communicated separately, further reducing the information transmitted in
  those cases. As for the first alternative, reducing the traffic volume by
  increasing the number of calls seems like a bad tradeoff due to the
  overhead/latency generated by each call.

* We could just compact CIDR ranges during rule generation (see the sketch
  after this list); that wouldn't require modifications to the agents, but
  it would increase the RPC request processing time.
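A minimal sketch of what CIDR compaction could look like, using the Python
standard library ipaddress module (Python 3.3+) purely for illustration;
the member IPs are taken from the examples above:

.. code-block:: python

    import ipaddress

    # Member IPs that would otherwise become one expanded rule each.
    member_ips = ['192.168.11.2/32', '192.168.11.3/32',
                  '192.168.11.4/32', '192.168.11.5/32']
    networks = [ipaddress.ip_network(ip) for ip in member_ips]

    # Collapse adjacent addresses into the fewest covering CIDRs.
    print(list(ipaddress.collapse_addresses(networks)))
    # [IPv4Network('192.168.11.2/31'), IPv4Network('192.168.11.4/31')]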
Data model impact
-----------------
None
REST API impact
---------------
None
Security impact
---------------
None
Notifications impact
--------------------
None
Other end user impact
---------------------
None
Performance Impact
------------------
The performance impact should be very positive in the following situations:

* Security group changes: neutron-server load and AMQP message sizes.
  At this moment oslo.messaging serializes structures to JSON because of
  AMQP version limitations (the string size limit for dictionaries being
  one of them). Reducing the message size will reduce the dictionary
  building time and also the serialization time; a quick way to measure
  the difference is sketched after this list.
* Creation time for new ports
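Assuming JSON serialization as described above, the size reduction can be
sanity-checked by serializing both payload formats directly; old_response
and compact_response below stand for the two example replies shown earlier
and are not defined here:

.. code-block:: python

    import json

    def payload_size(payload):
        """Approximate on-the-wire size of a JSON-serialized reply."""
        return len(json.dumps(payload))

    # Example comparison, with the payloads built elsewhere:
    # print(payload_size(old_response), payload_size(compact_response))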
An even higher performance impact on packet processing would be achieved
by the iptables-level optimization proposed in the ipset spec
[#ipset_spec]_.

If transmission times become our bottleneck instead of processing times,
we may consider compression, if that is available at the AMQP level.
Other deployer impact
---------------------
None
Developer impact
----------------
* I'm unsure whether there are proprietary L2 agents talking to this RPC
  interface. Introducing the refactor as a new RPC call instead of
  modifying the existing one allows some time for those agents to be
  upgraded.
* All the agents that use this RPC call would need to be updated to the
  new call before the old one is removed.
Implementation
==============
Assignee(s)
-----------
Primary assignee:
https://launchpad.net/~mangelajo
Other contributors:
http://launchpad.net/~shihanzhang
Work Items
----------
* Refactor the RPC call into a new one in neutron-server and
  the matching RPC call in the agent mixin.
* Add functional testing to validate the approach.
* Upgrade the agents to use the new call, one by one.
* Analyze db access and file a new spec if improvements can
be made in this area.
Dependencies
============
None
Testing
=======
Functional testing will validate the approach and make sure the result
that was possible with the old method can be consistently replicated
(only faster) with the new one. Covering both RPC calls with functional
tests will avoid regressions in either code path, while testing this in
Tempest only exercises the default RPC call.
Documentation Impact
====================
None
References
==========
.. [#call_results] http://www.fpaste.org/104401/14008522/
.. [#n2nov_scaling] http://javacruft.wordpress.com/2014/06/18/168k-instances/
.. [#iptables] http://www.fpaste.org/104431/40085672/
.. [#sec_fanout] http://lists.openstack.org/pipermail/openstack-dev/2014-June/038374.html
.. [#ipset_spec] https://review.openstack.org/#/c/100761/