Current code doesn't disconnect NVMe-oF subsystems when doing a
disconnect_volume or on connect_volume failure.

This is very problematic for systems that don't share subsystems across
multiple namespaces, because both the device (e.g., /dev/nvme0n1) and
the subsystem (e.g., /dev/nvme0) stay in the system until they are
manually removed. They now stay forever, because we connect with the
controller loss timeout set to infinite (it used to be 10 minutes).
Meanwhile the host keeps trying to connect to the remote subsystem, but
it can never succeed, because in this case drivers usually destroy both
the namespace and the subsystem simultaneously (so there's no AER
message to indicate the change in available namespaces within the
subsystem).

These leftover devices cause multiple issues, such as an ever-increasing
number of kernel log messages from the connection retries, possible
exhaustion of the number of connected NVMe subsystems and/or files in
/dev, and so on.

This patch makes sure the nvmeof connector disconnects a subsystem when
there is no longer a namespace present or when the only namespace
present is the one we are disconnecting. This is done both on the
disconnect_volume call and on failures during connect_volume.

This is not a full solution to the problem of leftover devices, because
for drivers that share the subsystem there are race conditions between
unexport/unmap of volumes on the Cinder side and os-brick
disconnect_volume calls. To fully prevent this situation Cinder needs
to start reporting the shared_targets value for NVMe volumes (something
it's already doing for iSCSI).

Partial-Bug: #1961102
Change-Id: Ia00be53420307d6ac1f100420d039da7b65dc349
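
The disconnect rule described in this message can be sketched roughly as follows. This is only an illustration of the decision, not the code in the patch: the helper name should_disconnect_subsystem and its parameters are hypothetical, and it assumes the caller has already listed the namespace device names currently present under the subsystem.

```python
from typing import Iterable, Optional


def should_disconnect_subsystem(present_namespaces: Iterable[str],
                                disconnecting: Optional[str] = None) -> bool:
    """Decide whether a subsystem can be disconnected.

    Hypothetical helper for illustration only; it is not part of the
    os-brick API.

    present_namespaces: namespace device names currently visible under
        the subsystem (e.g. "nvme0n1").
    disconnecting: the namespace being disconnected, or None when
        cleaning up after a failed connect_volume.
    """
    # Ignore the namespace we are about to remove; anything left over
    # belongs to another volume and must keep the subsystem alive.
    remaining = set(present_namespaces) - {disconnecting}
    return not remaining
```

With this rule a subsystem that exposes a single namespace is torn down together with that namespace, while a shared subsystem stays connected as long as another namespace is still present.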