
Recently the openstacksdk functional test test_volume_attachment started failing frequently. It mostly failed during the tearDown step while trying to delete the volume, because a volume delete had already been issued by the server delete (which it shouldn't have been).

Looking into the issue, the problem turned out to be a race between the instance's BDM record being deleted (during volume attachment delete) and the server delete. The sequence of operations that triggers this issue is:

1. Delete the volume attachment
2. Wait for the volume to become available
3. Delete the server

In step (2), nova sends a request to cinder to delete the volume attachment[1], which puts the volume in the available state[2], BUT the operation to delete the BDM record is still ongoing on the nova side[3]. Hence we end up in a race: while nova is deleting the BDM record, we issue a server delete (an overlapping request), which in turn consumes that BDM record and sends requests to (which it shouldn't):

1. delete the attachment (which is already deleted, hence returns 404)
2. delete the volume

Later, when the functional test issues another request to delete the volume, we fail since the volume is already in the process of being deleted (by the server delete operation -- delete_on_termination is set to true).

This analysis could yield a number of fixes in nova and cinder, namely:

1. Nova could prevent the race between the BDM record being deleted and being used at the same time.
2. Cinder could detect that the volume is being deleted and return success for subsequent delete requests (instead of failing with 400 BadRequest).

This patch focuses on fixing this on the SDK side, where the flow of operations happens too fast, triggering the race condition. We introduce a wait mechanism to wait for the VolumeAttachment resource to be deleted, and then verify that the number of attachments on the server is 0 before moving to the tearDown that deletes the server and the volume.

There is a 1-second race window, which can be seen here:

1. Server delete starting at 17:13:49

   2024-06-05 17:13:49,892 openstack.iterate_timeout ****Timeout is 300 --- wait is 2.0 --- start time is 1717607629.892198 ----
   2024-06-05 17:13:49,892 openstack.iterate_timeout $$$$ Count is 1 --- time difference is 299.99977254867554
   2024-06-05 17:13:50,133 openstack.iterate_timeout Waiting 2.0 seconds

2. BDM record being deleted at 17:13:50 (already used by the server delete to issue the attachment and volume delete calls)

   *************************** 2. row ***************************
       created_at: 2024-06-05 17:13:11
       ...
       deleted_at: 2024-06-05 17:13:50
       ...
      device_name: /dev/vdb
        volume_id: c13a3070-c5ab-4c8a-bb7e-5c7527fdf0df
    attachment_id: a1280ca9-4f88-49f7-9ba2-1e796688ebcc
    instance_uuid: 98bc13b2-50fe-4681-b263-80abf08929ac
       ...

[1] 7dc4b1ea62/nova/virt/block_device.py (L553)
[2] 9f1292ad06/cinder/volume/api.py (L2685)
[3] 7dc4b1ea62/nova/compute/manager.py (L7658-L7659)
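In outline, the added wait behaves like a simple poll-until-gone loop. The sketch below is illustrative only (the `wait_for_gone` helper and the fake fetch are hypothetical, not the SDK's implementation; the SDK exposes comparable behaviour through its resource wait utilities):

```python
import time

def wait_for_gone(fetch, interval=0.01, timeout=1.0):
    """Poll ``fetch`` until the resource is gone (``fetch`` returns None).

    Hypothetical helper mirroring the idea of the fix: after deleting the
    volume attachment, keep polling until the VolumeAttachment resource has
    really disappeared (a GET returning 404 maps to None here) before
    tearDown deletes the server and the volume.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if fetch() is None:
            return True
        time.sleep(interval)
    raise TimeoutError('resource was not deleted in time')

# Stand-in for the attachment GET: pretend the record vanishes on the 3rd poll.
polls = {'count': 0}

def fake_attachment_get():
    polls['count'] += 1
    return None if polls['count'] >= 3 else {'id': 'a1280ca9'}

print(wait_for_gone(fake_attachment_get))  # True
```

Only once the loop confirms the attachment is gone does the test go on to check that the server reports zero attachments, which closes the window in which a server delete could consume the still-live BDM record.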
Closes-Bug: #2067869
Change-Id: Ia59df9640d778bec4b22e608d111f82b759ac610
openstacksdk
openstacksdk is a client library for building applications to work with OpenStack clouds. The project aims to provide a consistent and complete set of interactions with OpenStack's many services, along with complete documentation, examples, and tools.
It also contains an abstraction interface layer. Clouds can do many things, but there are probably only about 10 of them that most people care about with any regularity. If you want to do complicated things, the per-service oriented portions of the SDK are for you. However, if what you want is to be able to write an application that talks to any OpenStack cloud regardless of configuration, then the Cloud Abstraction layer is for you.
More information about the history of openstacksdk can be found at https://docs.openstack.org/openstacksdk/latest/contributor/history.html
Getting started
Authentication and connection management
openstacksdk aims to talk to any OpenStack cloud. To do this, it requires a configuration file. openstacksdk favours `clouds.yaml` files, but can also use environment variables. The `clouds.yaml` file should be provided by your cloud provider or deployment tooling. An example:
```yaml
clouds:
  mordred:
    region_name: Dallas
    auth:
      username: 'mordred'
      password: XXXXXXX
      project_name: 'demo'
      auth_url: 'https://identity.example.com'
```
openstacksdk will look for `clouds.yaml` files in the following locations:

- If set, the path indicated by the `OS_CLIENT_CONFIG_FILE` environment variable
- `.` (the current directory)
- `$HOME/.config/openstack`
- `/etc/openstack`
You can create a connection using the `openstack.connect` function. The cloud name can either be passed directly to this function or specified using the `OS_CLOUD` environment variable. If you don't have a `clouds.yaml` file and instead use environment variables for configuration, then you can use the special `envvars` cloud name to load configuration from the environment. For example:
```python
import openstack

# Initialize connection from a clouds.yaml by passing a cloud name
conn_from_cloud_name = openstack.connect(cloud='mordred')

# Initialize connection from a clouds.yaml using the OS_CLOUD envvar
conn_from_os_cloud = openstack.connect()

# Initialize connection from environment variables
conn_from_env_vars = openstack.connect(cloud='envvars')
```
Note
How this is all achieved is described in more detail below.
The cloud layer
openstacksdk consists of four layers which all build on top of each other. The highest level layer is the cloud layer. Cloud layer methods are available via the top level `Connection` object returned by `openstack.connect`. For example:
```python
import openstack

# Initialize and turn on debug logging
openstack.enable_logging(debug=True)

# Initialize connection
conn = openstack.connect(cloud='mordred')

# List the servers
for server in conn.list_servers():
    print(server.to_dict())
```
The cloud layer is based on logical operations that can potentially touch multiple services. The benefit of this layer is mostly seen in more complicated operations that take multiple steps and where the steps vary across providers. For example:
```python
import openstack

# Initialize and turn on debug logging
openstack.enable_logging(debug=True)

# Initialize connection
conn = openstack.connect(cloud='mordred')

# Upload an image to the cloud
image = conn.create_image(
    'ubuntu-trusty', filename='ubuntu-trusty.qcow2', wait=True)

# Find a flavor with at least 512M of RAM
flavor = conn.get_flavor_by_ram(512)

# Boot a server, wait for it to boot, and then do whatever is needed
# to get a public IP address for it.
conn.create_server(
    'my-server', image=image, flavor=flavor, wait=True, auto_ip=True)
```
The proxy layer
The next layer is the proxy layer. Most users will make use of this layer. The proxy layer is service-specific, so methods are available under service-specific connection attributes of the `Connection` object such as `compute`, `block_storage`, `image`, etc. For example:
```python
import openstack

# Initialize and turn on debug logging
openstack.enable_logging(debug=True)

# Initialize connection
conn = openstack.connect(cloud='mordred')

# List the servers
for server in conn.compute.servers():
    print(server.to_dict())
```
Note
A list of supported services is given below.
The resource layer
Below this there is the resource layer. This provides support for the basic CRUD operations supported by REST APIs and is the base building block for the other layers. You will typically not need to use this directly, but it can be helpful for operations where you already have a `Resource` object to hand. For example:
```python
import openstack
import openstack.config.loader
import openstack.compute.v2.server

# Initialize and turn on debug logging
openstack.enable_logging(debug=True)

# Initialize connection
conn = openstack.connect(cloud='mordred')

# List the servers
for server in openstack.compute.v2.server.Server.list(session=conn.compute):
    print(server.to_dict())
```
The raw HTTP layer
Finally, there is the raw HTTP layer. This exposes raw HTTP semantics and is effectively a wrapper around the requests API, with added smarts to handle things like authentication and version management. As such, you can use the requests API methods you know and love, like `get`, `post` and `put`, and expect to receive a `requests.Response` object in response (unlike the other layers, which mostly return objects that subclass `openstack.resource.Resource`). Like the resource layer, you will typically not need to use this directly, but it can be helpful for interacting with APIs that have not or will not be supported by openstacksdk. For example:
```python
import openstack

# Initialize and turn on debug logging
openstack.enable_logging(debug=True)

# Initialize connection
conn = openstack.connect(cloud='mordred')

# List servers
for server in conn.compute.get('/servers').json()['servers']:
    print(server)
```
Configuration
openstacksdk uses the `openstack.config` module to parse configuration. `openstack.config` will find cloud configuration for as few as one cloud and as many as you want to put in a config file. It will read environment variables and config files, and it also contains some vendor-specific default values so that you don't have to know extra info to use OpenStack:
- If you have a config file, you will get the clouds listed in it
- If you have environment variables, you will get a cloud named envvars
- If you have neither, you will get a cloud named defaults with base defaults
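The fallback order above can be sketched as a small decision function. This is an illustrative stand-in, not the SDK's actual loader logic (the real `openstack.config` loader merges these sources rather than picking just one):

```python
def config_source(has_clouds_yaml: bool, env: dict) -> str:
    """Illustrative sketch of the precedence described above: a config
    file wins, then OS_* environment variables, then built-in defaults.
    (The real openstack.config loader merges these sources.)
    """
    if has_clouds_yaml:
        return 'clouds.yaml'
    if any(key.startswith('OS_') for key in env):
        return 'envvars'
    return 'defaults'

print(config_source(True, {}))                          # clouds.yaml
print(config_source(False, {'OS_AUTH_URL': 'https://identity.example.com'}))  # envvars
print(config_source(False, {}))                         # defaults
```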
You can view the configuration identified by openstacksdk in your current environment by running the `openstack.config.loader` module (`python -m openstack.config.loader`).
More information at https://docs.openstack.org/openstacksdk/latest/user/config/configuration.html
Supported services
The following services are currently supported. A full list of all available OpenStack services can be found in the Project Navigator.
Note
Support here does not guarantee full-support for all APIs. It simply means some aspect of the project is supported.
| Service | Description | Cloud Layer | Proxy & Resource Layer |
|---|---|---|---|
| **Compute** | | | |
| Nova | Compute | ✔ | ✔ (`openstack.compute`) |
| **Hardware Lifecycle** | | | |
| Ironic | Bare metal provisioning | ✔ | ✔ (`openstack.baremetal`, `openstack.baremetal_introspection`) |
| Cyborg | Lifecycle management of accelerators | ✔ | ✔ (`openstack.accelerator`) |
| **Storage** | | | |
| Cinder | Block storage | ✔ | ✔ (`openstack.block_storage`) |
| Swift | Object store | ✔ | ✔ (`openstack.object_store`) |
| Manila | Shared filesystems | ✔ | ✔ (`openstack.shared_file_system`) |
| **Networking** | | | |
| Neutron | Networking | ✔ | ✔ (`openstack.network`) |
| Octavia | Load balancing | ✔ | ✔ (`openstack.load_balancer`) |
| Designate | DNS | ✔ | ✔ (`openstack.dns`) |
| **Shared services** | | | |
| Keystone | Identity | ✔ | ✔ (`openstack.identity`) |
| Placement | Placement | ✔ | ✔ (`openstack.placement`) |
| Glance | Image storage | ✔ | ✔ (`openstack.image`) |
| Barbican | Key management | ✔ | ✔ (`openstack.key_manager`) |
| **Workload provisioning** | | | |
| Magnum | Container orchestration engine provisioning | ✔ | ✔ (`openstack.container_infrastructure_management`) |
| **Orchestration** | | | |
| Heat | Orchestration | ✔ | ✔ (`openstack.orchestration`) |
| Senlin | Clustering | ✔ | ✔ (`openstack.clustering`) |
| Mistral | Workflow | ✔ | ✔ (`openstack.workflow`) |
| Zaqar | Messaging | ✔ | ✔ (`openstack.message`) |
| **Application lifecycle** | | | |
| Masakari | Instances high availability service | ✔ | ✔ (`openstack.instance_ha`) |