The heal_and_optimize flow retrieves a list of VIFs using a method in
the util module. For the SR-IOV agent, the _find_vifs method is
invoked; it is common to CNA and VNIC objects. The vswitch_id is
retrieved and validated, but since vswitch_id is not present in the
VNIC object, the getattr call fails. Because this method is common to
both CNA and VNIC, a default return value of None should be included
in the list of parameters.
In the _find_vifs method:
Instead of getattr(vif, 'vswitch_id'),
getattr(vif, 'vswitch_id', None) should be used.
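A minimal sketch of the fix; the CNA and VNIC classes here are stand-ins for the real pypowervm wrappers, kept only to show why the default matters:

```python
class CNA:
    """Stand-in for a pypowervm CNA wrapper; carries a vswitch_id."""
    vswitch_id = 2

class VNIC:
    """Stand-in for a pypowervm VNIC wrapper; has no vswitch_id."""

def vswitch_of(vif):
    # getattr(vif, 'vswitch_id') raises AttributeError for a VNIC;
    # the None default makes the lookup safe for both VIF types.
    return getattr(vif, 'vswitch_id', None)
```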
The current implementation of list_vifs will return a VIF even if the
VIF is not intended for an SEA. This change filters out VIFs that aren't
in use by SEAs, i.e. those on either the RMC management switch or an
Open vSwitch, and removes them from the list_vifs return.
(cherry picked from commit c32da1a58b)
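The filtering described above can be sketched as follows; the dict shape and the vswitch names are hypothetical, not the real wrapper fields:

```python
def filter_sea_vifs(vifs, excluded_vswitches):
    """Drop VIFs that are not in use by an SEA, e.g. those on the RMC
    management vswitch or on an Open vSwitch."""
    return [v for v in vifs if v['vswitch'] not in excluded_vswitches]
```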
The context.py file has been moved from neutron to neutron-lib.
This updates the import statement accordingly.
(cherry picked from commit 240fa28d66)
VIFEventHandler creates a ProvisionRequest and passes it down to the
agents for processing. Currently, there is nothing in ProvisionRequest
to distinguish whether it is CNA or VNIC related. Due to this, the SEA
agent gets VNIC-related ProvisionRequests and processes them to
provision a VLAN on the SEA. This needs to be avoided. This patchset
introduces a vif_type attribute into ProvisionRequest.
Refer to https://review.openstack.org/#/c/396739/ for a change in
nova_powervm. vif_type is now part of the event generated when a vif
is plugged or unplugged. This patchset uses the newly introduced
vif_type in the event while generating the ProvisionRequest.
A new abstract property has been added to base_agent to carry the
corresponding vif_type. Both sea_agent and sriov_agent have been
updated to return the appropriate vif_type property. If the vif_type in
an incoming event does not match the vif_type supported by the agent,
no ProvisionRequest is generated, and so the agent is not notified to
provision the device.
For example, in the case of an SR-IOV based plug, the event carries
vif_type pvm_sriov, while the SEA agent supports vif_type pvm_sea.
No ProvisionRequest is generated and the sea_agent provision device
function is not invoked, so no VLAN is provisioned.
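The screening can be sketched as follows; the event dict and the request shape are hypothetical stand-ins for the real ProvisionRequest plumbing:

```python
def build_prov_request(event, agent_vif_type):
    """Return a provision request only when the event's vif_type matches
    the type this agent supports; otherwise return None so the agent is
    never notified to provision the device."""
    if event.get('vif_type') != agent_vif_type:
        return None
    return {'mac': event['mac'], 'vif_type': event['vif_type']}
```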
The exceptions module was previously being imported from
neutron.common. This has now been deprecated in favor of importing from
neutron_lib. This change updates to the correct import.
Add debug logs for when the agent reports updated bridge mappings back
to neutron. If a vNIC gets deployed on backing devices with the wrong
label (physical network name), these logs help figure out why.
Removal of the setup_adapter helper via Change-Id
Ie8a13da0b0cdc29d350ed91d96ff2384430c2676 introduced compatibility
issues; this change set restores it. The setup_adapter method simply
assigns self.adapter a valid pypowervm.adapter.Adapter.
This change set represents a major refactor of the neutron SEA and
SR-IOV agents:
- Proper abstract properties/methods allow much code to be consolidated
into the BasePVMNeutronAgent from the individual subclasses.
- Conversely, CNA-specific code, such as helpers for the
  sea_agent-specific heal_and_optimize loop, is moved to the sea_agent
  module.
- PVMRpcCallbacks methods are folded into the agent base class itself.
The agent is now the thing that's registered for those callbacks.
- The network_delete callback is gone (it wasn't being used).
- The port_update callback is a no-op by default. In the SR-IOV agent,
  it refreshes the physical port label:physloc mappings reported back
  to neutron.
- The vnic_required_vfs and vnic_vf_capacity conf options are
  relocated.
- There is no longer a short timer in the rpc_loop; everything is
event-driven. Thus, the polling_interval conf option is removed.
- The rpc_loop is common to both agents. It only runs
heal_and_optimize, on the configurable interval (default 30min).
- ProvisionRequest is moved into its own module, prov_req.py. It is now
a factory class capable of producing ProvisionRequest instances for:
- Custom VIF Events from the nova-powervm VIF driver;
- pypowervm Wrappers of VIF types (CNA, VNIC);
- The VIFEventHandler (formerly CNAEventHandler) is now driven by VIF
events produced by the nova-powervm VIF driver. This entails a
significant processing/performance/load improvement over the previous
code, which acted on all LPAR events.
- Utility methods (utils.py) are renamed and genericized to handle all
VIF types (CNA, VNIC) rather than just CNAs.
- Obsolete utility methods and their unit tests are removed.
- Unit tests overhauled and extended.
When the SR-IOV agent receives a port update notice from neutron, it
polls the server waiting for the corresponding vNIC to be created before
marking the port as active. If the vNIC create errors or doesn't
happen, the agent would keep polling for that vNIC indefinitely.
With this change set, if the vNIC doesn't spring into existence within
20 minutes, the agent gives up and stops polling. (If whatever caused
the vNIC not to be created hasn't already triggered rollback/cleanup of
the corresponding compute process, that process also times out after 20
minutes.)
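The give-up logic can be sketched as follows; the helper name and the bookkeeping of when a port was first queued are hypothetical, only the 20-minute cap comes from the text above:

```python
import time

VNIC_WAIT_TIMEOUT = 20 * 60  # seconds; the 20-minute cap described above

def should_keep_polling(first_seen, now=None):
    """Keep polling for the vNIC only while the port has been waiting
    for less than the timeout; afterwards the agent gives up."""
    now = time.time() if now is None else now
    return (now - first_seen) < VNIC_WAIT_TIMEOUT
```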
Commit 0fcfc379de removed the PVID looper,
but left references to the corresponding config option. This change set
removes the config option from sea_agent and references thereto from the
usage.rst and devstack README.rst.
The PowerVM SR-IOV neutron agent was sending update_device_up (almost)
immediately in response to the update_port push from neutron. If the
resulting network-vif-plugged event arrives at the compute process
*before* the PlugVifs task starts waiting for it, we'll miss it and thus
time out "waiting" for the vif to get "plugged" (even though the vNIC
gets created successfully).
With this change set, when the sriov_agent's rpc_loop pulls a port off
the queue, it will search the REST server for a vNIC with the matching
MAC address. If not found, it'll push the port back on the queue and go
back to sleep. It only sends the update_device_up once the vNIC is
found.
This is not an ideal solution for a number of reasons, including:
1) We have no way to know if the VNIC creation failed, in which case we
may wind up with stale port entries in the queue forever.
2) The current mechanism for finding the vNIC is very heavy: a feed GET
of all LPARs followed by a VNIC feed GET on each. We should find a more
lightweight way of doing this.
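The queue handling described above can be sketched as follows; the function names and port dict are hypothetical stand-ins for the real rpc_loop and RPC calls:

```python
from collections import deque

def process_port_queue(queue, vnic_exists, update_device_up):
    """One rpc_loop pass: send update_device_up only when a vNIC with
    the port's MAC exists on the REST server; otherwise push the port
    back onto the queue for the next pass."""
    port = queue.popleft()
    if vnic_exists(port['mac']):
        update_device_up(port)
    else:
        queue.append(port)
```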
With the ability to pass the VLAN over to nova, via the mechanism driver
and vif binding details, we no longer need the PVID Updater.
Previously the function would wait for nova to provision a VIF on VLAN
1. Then the sea networking agent would detect this new port, update the
SEA to support the VLAN (if needed) and then update the VLAN on the VIF.
Now that we can pass the VLAN to Nova at the VIF create time, that last
step is not needed.
Previously, the VF capacity for SR-IOV-backed vNICs had to come from the
vif's binding:profile; else it would default to None (allowing the
platform to default it).
With this change, the capacity can also be specified in the ml2
configuration file via the new 'vnic_vf_capacity' option in the AGENT
section. The binding:profile value takes precedence, followed by the
configuration value. The configuration default is None if unspecified.
Since binding:profile values will not be supported in the current
release (see Related-Bug), references thereto are removed from the
documentation.
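The precedence described above can be sketched as follows; the 'capacity' key and helper name are hypothetical, only the binding:profile-then-conf-then-None ordering comes from the text:

```python
def effective_vf_capacity(binding_profile, conf_capacity=None):
    """The vif's binding:profile value takes precedence, then the ml2
    conf value, then None (letting the platform default it)."""
    capacity = binding_profile.get('capacity')
    return capacity if capacity is not None else conf_capacity
```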
A new mechanism driver is included in this patchset. This has a
corresponding SR-IOV agent implementation as well.
The mechanism driver supports vnic_type direct. Both flat and vlan
network types are supported. The mappings reference maintained by this
driver is derived from agent configuration. The agent periodically
updates the mechanism driver with a list of physical networks derived
from the port labels of the physical ports of SR-IOV adapters. The
mechanism driver validates an incoming port for binding if it has a
valid physical network attached to it. To do this, the
check_segment_for_agent method is used by
try_to_bind_segment_for_agent. Under the covers, mappings from the
agent are matched with the physical network attribute of the segment.
This mechanism driver also provides VLAN information during binding to
the vif plug mechanism on the nova side. To do this, the mechanism
driver provides binding:vif_details with the VLAN id. On the nova side,
during the vif plug operation, the VLAN can be retrieved from the
['details']['vlan'] attribute of the vif.
This patchset also includes changes to setup.cfg. A console script is
added to set up networking-powervm-sriov-agent along with its SEA
counterpart. The setup.cfg changes also cover the mechanism driver
setup in ml2.conf and the entry_points.txt file in the runtime
environment.
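The vif_details handoff can be sketched as follows; the function name and segment dict are hypothetical, only the details['vlan'] key comes from the text above:

```python
def bind_port_details(segment):
    """Surface the segment's VLAN in binding:vif_details so nova's vif
    plug can read it from details['vlan']; a flat network carries no
    segmentation id and maps to None."""
    return {'vlan': segment.get('segmentation_id')}
```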
If the system has an excessive amount of VLANs to clean up, the heal and
optimization path can block the requests coming in for an extensive
period of time. This may eventually lead to VIF plugging timeouts on
the nova component.
This change set will stop doing deletes of VLANs after three passes. It
logs the remaining VLANs that need to be cleaned up, but does not
actually execute the delete. It assumes the next pass through will do
the appropriate cleanup.
The related_href was used to parse out the bridge_mappings table.
Unfortunately, *how* you queried for the data identified what the URI
would be. An earlier change to remove some host_uuid's ended up
malforming the URI and breaking the bridge_mapping logic.
This change fixes how we look at the URIs to see if a SEA is on a given
VIOS. This should be more reliable moving forward.
This change set reduces the number of host_uuids being passed around in
networking-powervm. It also switches many of the calls from the very
verbose adapter.read() format to the wrapper get method.
The VIOS type element can be running true Client Network Adapters (VEAs
without additional trunk adapters) on it. The heal and optimize code
should take these adapters into account.
This change set updates some of the utility methods to account for VIOS
types. It also updates the heal and optimize code to take these VEAs
into account for the whole system (LPARs and VIOSes).
This change allows the utils list_lpar_uuids to take in a new
parameter that can exclude the mgmt UUID. This is an important change
because we may wish to explicitly exclude it to avoid erroneous logging
messages. This differs from pypowervm's base implementation, which
explicitly includes it (and if that explicit include is not set to
True, it still does not explicitly exclude it).
Reworks to use the pypowervm get_partitions method where able.
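The new parameter can be sketched as follows; the signature is a hypothetical simplification of the real utils helper:

```python
def list_lpar_uuids(uuids, mgmt_uuid, exclude_mgmt=False):
    """With the new flag set, the management partition's UUID is
    explicitly dropped from the result; otherwise the list is returned
    unchanged."""
    if exclude_mgmt:
        return [u for u in uuids if u != mgmt_uuid]
    return list(uuids)
```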
This change set re-aligns the networking-powervm source code with
the global requirements. It also removes some of the deprecation
warnings that were popping up from neutron.
The periodic rpc loop ends up invoking get_device_details_list even
if there are no ports to update. This makes an unnecessary call to the
neutron server. This fix avoids such a call if there are no ports to
update.
handles empty port list scenario.
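The guard can be sketched as follows; the rpc_call parameter is a hypothetical stand-in for the neutron server RPC:

```python
def get_device_details_list(ports, rpc_call):
    """Skip the RPC to the neutron server entirely when there are no
    ports to update."""
    if not ports:
        return []
    return rpc_call(ports)
```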
Currently the heal and optimize loop interval is set to 300 seconds.
Due to this, it has been observed that CPU utilization is very
high. This is due to frequent port updates: during every loop, a
port is updated to DOWN and then made UP.
This work increases the interval to 1800 seconds (30 minutes).
We have seen scenarios where users have rebooted their systems and the
networking-powervm process has started before the backing PowerVM REST
server. This generally indicates a packaging issue (where the
networking-powervm package should depend on the others). However,
there is the ability in the PowerVM Session to make multiple attempts.
This change set takes advantage of the multiple attempts provided by the
pypowervm session. That means that if the REST server is booting,
networking-powervm will wait up to 10 minutes for the REST server to
finish starting. The REST server should start much faster than that,
and this is just an upper limit timeout.
1) We were not importing '_' in exceptions.py, so we were using the
default symbol from builtins, which isn't the right thing for
translation. Import from this project's i18n module instead (and use
_LE, for they are exceptions). It would also seem that it is now a
requirement to import _ from a module at <project>._i18n; so the
networking_powervm.plugins.ibm.agent.powervm.i18n module has been moved
to satisfy this.
2) The VIOS.xags property is deprecated. Rebase to use its replacement.
The heal code within the networking-powervm project would ensure that
the VLAN and client device was routed out to the network. However, due
to it calling 'get_device_details', the neutron code was changing the
state back to BUILD.
Given this behavior, it became apparent that the best path forward was
to have the heal code call a full provision request for the client
device. This actually will no-op very quickly if the VLAN is already on
the client device, but tells Neutron that it is not in fact in a build
state...but rather is now ACTIVE.
This allows for a more robust provisioning scheme and allows the neutron
state to reflect reality. It also updates any existing ports in the
field that may be affected by this with the next 'heal' cycle.
The agent should report the mappings so that the neutron server can
determine if the agent supports a given physical network. This change
set updates the powervm agents to properly return the physical network.
For compatibility's sake (e.g. when the neutron server is Mitaka but an
agent is still Liberty), it will fall back to the original logic. This
logic will be removed in the Newton release of OpenStack.
As we evaluate bringing new Neutron agents in for the PowerVM platform,
the CNA Event Handler is seen as needed by those agents (to identify
when a new request comes in).
This refactoring moves the Event Handler up to the base class so that
other agents can make use of it.
The list_cnas() method is logging via the pypowervm log_helper
each time a VM is not found. This can generate a lot of
unnecessary logging. This change removes the logger for
this method only as it's just listing the CNAs for the VM.
Since the translated messages will be in a message catalog
named networking-powervm, all the translation functions must
point to that domain. This change introduces the i18n.py module
to set up translation.
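The domain binding can be sketched as follows; this is a pure-Python stand-in for the oslo.i18n TranslatorFactory pattern, with a plain dict playing the role of the message catalog:

```python
DOMAIN = 'networking-powervm'

def make_translator(catalogs):
    """Return a '_' function bound to this project's own domain, so
    lookups never hit another project's catalog; unknown messages fall
    through untranslated."""
    def _(msg):
        return catalogs.get(DOMAIN, {}).get(msg, msg)
    return _
```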
If the REST API has a significant number of conflicting requests come
in, there are times when it can cause a conflict. These get
encapsulated into errors. The retry operator allows us to catch a set
of known errors and retry them automatically (ex. etag conflicts).
The retry 'domain' indicates how much of an operation would be retried.
Usually you want to keep this as small as possible. Retry as few
commands as needed to the API.
The networking-powervm code has a rather large retry domain in the
list_cnas code, and didn't have any retry in list_lpar_uuids. This
change set optimizes the retry domains to be small (one API call
instead of many).
The existing test cases cover the refactoring.
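A small retry domain can be sketched as follows; the helper and its arguments are hypothetical, not the pypowervm retry utility itself:

```python
def retry_call(fn, is_retriable, attempts=3):
    """Keep the retry domain to a single API call: replay just fn() on
    a known-retriable error (e.g. an etag conflict), re-raising
    anything else, or anything on the final attempt."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception as exc:
            if not is_retriable(exc) or attempt == attempts - 1:
                raise
```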
This change set provides operators with a flag to limit the automated
clean up of the VLANs from the neutron agent. If set to True, the
system will not periodically clean up VLANs that are not in use.
This change set allows for the PowerVM agents to listen to events from
the system and provision a port based off of an event in the system.
The original support to listen to ports from Neutron is still enabled.
However, the length of time that it waits is now dramatically reduced as
the system events themselves will also be considered.