Fix SR-IOV support on Cavium ThunderX hosts.

This change is a partial revert of
Ibf8dca4bd57b3bddb39955b53cc03564506f5754
to reintroduce a try-except which is required for
some non-standard hardware.

On the Cavium ThunderX platform, it's possible to have
virtual functions which are netdevs but which are not associated
with a PF. This causes the PF name lookup to fail.
Prior to Ibf8dca4bd57b3bddb39955b53cc03564506f5754,
when the lookup failed the exception was caught and we skipped
populating the parent PF interface name.

This change restores that behavior.

Closes-Bug: #1915255
Change-Id: Ia10ccdd9fbed3870d0592e3cbbff17f292651dd2
Sean Mooney 2021-05-21 14:45:45 +01:00
parent 771ea5bf1e
commit a569a51fed
2 changed files with 29 additions and 4 deletions


@@ -1269,11 +1269,24 @@ class Host(object):
             }
             parent_ifname = None
             # NOTE(sean-k-mooney): if the VF is a parent of a netdev
-            # the PF should also have a netdev.
+            # the PF should also have a netdev, however on some exotic
+            # hardware such as Cavium ThunderX this may not be the case
+            # see bug #1915255 for details. As such we wrap this in a
+            # try except block.
             if device.name() in net_dev_parents:
-                parent_ifname = pci_utils.get_ifname_by_pci_address(
-                    pci_address, pf_interface=True)
-                result['parent_ifname'] = parent_ifname
+                try:
+                    parent_ifname = (
+                        pci_utils.get_ifname_by_pci_address(
+                            pci_address, pf_interface=True))
+                    result['parent_ifname'] = parent_ifname
+                except exception.PciDeviceNotFoundById:
+                    # NOTE(sean-k-mooney): we ignore this error as it
+                    # is expected when the virtual function is not a
+                    # NIC or the VF does not have a parent PF with a
+                    # netdev. We do not log here as this is called
+                    # in a periodic task and that would be noisy at
+                    # debug level.
+                    pass
             if device.name() in vdpa_parents:
                 result['dev_type'] = fields.PciDeviceType.VDPA
             return result
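
The behavior restored by this hunk can be reproduced outside of nova. The following is a minimal sketch, not nova's implementation: `PciDeviceNotFoundById`, `get_ifname_by_pci_address`, `describe_vf` and `demo` are local stand-ins written for this example, the `'type-VF'` string is only an illustrative value, and the directory layout (`<address>/physfn/net/<ifname>`) is an assumption used to imitate a sysfs tree in a temporary directory.

```python
import os
import tempfile


class PciDeviceNotFoundById(Exception):
    """Local stand-in for the nova exception of the same name."""


def get_ifname_by_pci_address(sysfs_root, pci_address, pf_interface=False):
    """Hypothetical lookup: return the netdev name for a PCI address.

    Assumes the netdev is the single entry under ``<address>/net`` or,
    for the parent PF, under ``<address>/physfn/net``.
    """
    subdir = os.path.join('physfn', 'net') if pf_interface else 'net'
    path = os.path.join(sysfs_root, pci_address, subdir)
    try:
        return os.listdir(path).pop()
    except (OSError, IndexError):
        # Missing directory or empty directory: no netdev to report.
        raise PciDeviceNotFoundById(pci_address)


def describe_vf(sysfs_root, pci_address):
    """Build a VF description, skipping parent_ifname when lookup fails."""
    result = {'dev_type': 'type-VF'}
    try:
        result['parent_ifname'] = get_ifname_by_pci_address(
            sysfs_root, pci_address, pf_interface=True)
    except PciDeviceNotFoundById:
        # Expected when the VF has no parent PF with a netdev.
        pass
    return result


def demo():
    """Create a fake tree with one ordinary VF and one VF without a PF."""
    with tempfile.TemporaryDirectory() as root:
        # VF whose parent PF exposes a netdev named eth0.
        os.makedirs(
            os.path.join(root, '0000:01:00.1', 'physfn', 'net', 'eth0'))
        # VF that is itself a netdev but has no physfn entry at all.
        os.makedirs(os.path.join(root, '0000:02:00.1', 'net', 'eth1'))
        return (describe_vf(root, '0000:01:00.1'),
                describe_vf(root, '0000:02:00.1'))


if __name__ == '__main__':
    normal, orphan = demo()
    print(normal)
    print(orphan)
```

Without the try-except in `describe_vf`, the second call would raise instead of returning a description that simply lacks `parent_ifname`, which mirrors the failure described in bug #1915255.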


@@ -0,0 +1,12 @@
+---
+fixes:
+  - |
+    On some hardware platforms, an SR-IOV virtual function for a NIC port may
+    exist without being associated with a parent physical function that has
+    an associated netdev. In such a case the PF interface name lookup
+    will fail. As the ``PciDeviceNotFoundById`` exception was not handled
+    this would prevent the nova compute agent from starting on affected
+    hardware. See: https://bugs.launchpad.net/nova/+bug/1915255 for more
+    details. This edge case has now been addressed, however, features
+    that depend on the PF name such as minimum bandwidth based QoS cannot
+    be supported on these platforms.
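
Since `parent_ifname` can now be absent from a VF description, anything that relies on the PF name has to treat the key as optional. A minimal sketch of such a check follows; the function name `pf_dependent_features_available` and the sample dictionaries are hypothetical and only assume a dict shaped like the `result` built in the patched code.

```python
def pf_dependent_features_available(device_info):
    """Return True only if the VF description carries a parent PF name.

    ``device_info`` is assumed to be a dict shaped like the ``result``
    dict in the patched code; this helper is hypothetical.
    """
    return device_info.get('parent_ifname') is not None


# Illustrative descriptions: one VF with a PF netdev, one without.
vf_with_pf = {'dev_type': 'type-VF', 'parent_ifname': 'eth0'}
vf_without_pf = {'dev_type': 'type-VF'}

if __name__ == '__main__':
    print(pf_dependent_features_available(vf_with_pf))
    print(pf_dependent_features_available(vf_without_pf))
```

Using `dict.get` rather than indexing keeps the caller from raising `KeyError` on the affected platforms.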