From 8d374dd715ddb6e83ddd36fb83e471ccff2a0e9e Mon Sep 17 00:00:00 2001
From: Ghanshyam
Date: Tue, 18 Nov 2025 02:46:07 +0000
Subject: [PATCH] Add backlog spec for graceful shutdown of Nova services

PoC:

- Code change: https://review.opendev.org/c/openstack/nova/+/967261

Partial implement blueprint nova-services-graceful-shutdown

Change-Id: Ic1d5f039c4f1d9cc1c474d8750bff8c5274fd14e
Signed-off-by: Ghanshyam
---
 .../nova-services-graceful-shutdown.rst       | 711 ++++++++++++++++++
 1 file changed, 711 insertions(+)
 create mode 100644 specs/backlog/approved/nova-services-graceful-shutdown.rst

diff --git a/specs/backlog/approved/nova-services-graceful-shutdown.rst b/specs/backlog/approved/nova-services-graceful-shutdown.rst
new file mode 100644
index 000000000..d0c4d4ea0
--- /dev/null
+++ b/specs/backlog/approved/nova-services-graceful-shutdown.rst
@@ -0,0 +1,711 @@
+..
+ This work is licensed under a Creative Commons Attribution 3.0 Unported
+ License.
+
+ http://creativecommons.org/licenses/by/3.0/legalcode
+
+==================================
+Graceful Shutdown of Nova Services
+==================================
+
+https://blueprints.launchpad.net/nova/+spec/nova-services-graceful-shutdown
+
+This is a backlog spec proposing the design of graceful shutdown.
+
+Nova services do not shut down gracefully. Stopping a service also stops all
+of its in-progress operations, which not only interrupts those operations but
+can leave instances in an unwanted or unrecoverable state. The idea is to let
+services stop processing new requests but complete the in-progress operations
+before the service is terminated.
+
+Problem description
+===================
+
+Nova services do not have a way to shut down gracefully, meaning they do not
+wait for in-progress operations to complete. When shutdown is initiated, a
+service stops its RPC server and waits so that all the already-received
+request messages (RPC call/cast) are dispatched, but it does not wait for the
+resulting operations to complete.
+
+Each Nova compute service has a single worker running and listening on a
+single RPC server (topic: ``compute.<host>``). The same RPC server is used
+for new requests as well as for in-progress operations where other compute or
+conductor services communicate. When shutdown is initiated, the RPC server is
+stopped, meaning it stops handling new requests, which is fine, but at the
+same time it also stops the communication needed for the in-progress
+operations. For example, if a live migration is in progress, the source and
+destination computes communicate (synchronously and asynchronously) multiple
+times with each other. Once the RPC server on one compute service is stopped,
+it cannot communicate with the other compute and the live migration fails.
+This can leave the system as well as the instance in an unwanted or
+unrecoverable state. A simplified sketch of today's stop path is shown at the
+end of this section.
+
+Use Cases
+---------
+
+As an operator, I want to be able to gracefully shut down (SIGTERM) the Nova
+services so that it does not impact the users' in-progress operations and
+keeps resources in a usable state.
+
+As an operator, I want instances and other resources to stay in a usable
+state even if a service is gracefully terminated (SIGTERM).
+
+As an operator, I want to get the actual benefit of Kubernetes pod graceful
+shutdown when Nova services are running in k8s pods.
+
+As a user, I want in-progress operations to be completed before the service
+is gracefully terminated (SIGTERM).
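+
+The sketch below illustrates the problem described above. It is a simplified,
+illustrative view of how a Nova service stops today (not verbatim Nova code):
+the single RPC server is stopped and waited on, which also cuts off the RPC
+communication that in-progress operations such as live migration still need.
+
+.. code-block:: python
+
+   class Service:
+       """Simplified sketch of today's nova.service shutdown path."""
+
+       def stop(self):
+           if self.rpcserver:
+               # Stop consuming messages: new requests AND the messages
+               # needed to finish in-progress operations are cut off.
+               self.rpcserver.stop()
+               # Wait only for already-dispatched messages; the operations
+               # they triggered may still be running when we return.
+               self.rpcserver.wait()
+           # The service process exits even if, for example, a live
+           # migration is still in flight.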
+
+Proposed change
+===============
+
+Scope: The proposed solution is to gracefully shut down the services on the
+SIGTERM signal.
+
+The graceful shutdown is based on the following design principles:
+
+* When service shutdown is initiated by SIGTERM:
+
+  * Do not process any new requests.
+  * New requests should not be lost. Once the service is restarted, it
+    should process them.
+  * Allow in-progress operations to reach their quickest safe termination
+    point, either completion or abort.
+  * Properly log the state of in-progress operations.
+  * Keep instances and other resources in a usable state.
+
+* When service shutdown is completed:
+
+  * Properly log unfinished operations.
+    Ideally, all in-progress operations should be completed before the
+    service is terminated, but if the graceful shutdown times out (due to a
+    configured timeout; the timeout details are covered in a later section)
+    then all unfinished operations should be properly logged. This will help
+    to recover the system or instances.
+
+* When the service is started again:
+
+  * Start processing new requests in the normal way.
+  * If requests were not processed because shutdown was initiated, they stay
+    in the message broker queue and there are multiple possibilities:
+
+    * Requests might have been picked up by another worker of that service.
+      For example, you can run more than one Nova scheduler (or conductor)
+      worker. If one of the workers is shutting down, another worker will
+      process the request. This is not the case for Nova compute, which is
+      always a single worker per compute service on a specific host.
+    * If a service has a single worker running, the request can be picked up
+      once the service is up again.
+    * There is an opportunity for the compute service to clean up or recover
+      the interrupted operations on instances during init_host(). The action
+      taken will depend on the task and its status.
+    * If the service is in the stopped state for a long time, then based on
+      the RPC and message queue timeouts, there is a chance that:
+
+      * The RPC client or server will time out the call.
+      * The message broker queue may drop messages due to timeout.
+      * The order of requests and messages can be stale.
+
+As a graceful shutdown goal, we need to do two things:
+
+#. A way to stop new requests without interrupting in-progress operations.
+   This is proposed to be done via RPC.
+
+#. Give services enough time to finish the operations. As a first step,
+   this is proposed to be done via a time-based wait and later with a proper
+   tracking mechanism.
+
+This backlog spec proposes achieving the above goals in two steps. Each step
+will be proposed as a separate spec for a specific release.
+
+The Nova services which already shut down gracefully:
+------------------------------------------------------
+
+For the services below, graceful shutdown is handled by their deployment
+server or by a library they use.
+
+* Nova API & Nova metadata API:
+
+  Those services are deployed using a server with WSGI support. That server
+  will ensure that the Nova API services shut down gracefully, meaning it
+  finishes the in-progress requests and rejects the new ones.
+
+  I investigated this with uWSGI/mod_proxy_uwsgi (devstack environment). On
+  service start, the uWSGI server pre-spawns a number of workers for the API
+  service, which handle the API requests in a distributed way. When shutdown
+  is initiated by SIGTERM, the uWSGI SIGTERM handler checks whether there are
+  any in-progress requests on any worker. It waits for all the workers to
+  finish their requests and then terminates each worker. Once all workers
+  are terminated, it terminates the Nova API service.
+
+  If any new request comes in after the shutdown is initiated, it is
+  rejected with a "503 Service Unavailable" error.
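+
+  To make the in-progress requests observable in the tests below, an
+  artificial delay was injected into the API handler under test. A minimal
+  sketch of that test aid (illustrative only, not part of the proposal; the
+  method name is hypothetical):
+
+  .. code-block:: python
+
+     import time
+
+     def detail(self, req):
+         # Hold the request long enough that shutdown is initiated while
+         # this request is still in flight.
+         time.sleep(10)
+         ...  # normal server list handling continues here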
+
+  Testing:
+
+  I tested two types of requests:
+
+  #. Sync request: ``openstack server list``:
+
+     * To observe the graceful shutdown, I added 10 seconds of sleep in the
+       server list API code.
+     * Start an API request 'request1': ``openstack server list``
+     * Wait till the server list request reaches the Nova API (you can see
+       the log on the controller).
+     * Because of sleep(10), the server list takes time to finish.
+     * Initiate the Nova API service shutdown.
+     * Start a new API request 'request2': ``openstack server list``. This
+       new request came after shutdown was initiated, so it should be
+       denied.
+     * The Nova API service will wait because 'request1' is not finished.
+     * 'request1' will get the server list response before the service is
+       terminated.
+     * 'request2' is denied and will receive the error
+       "503 Service Unavailable".
+
+  #. Async request: ``openstack server pause <server>``:
+
+     * To observe the graceful shutdown, I added 10 seconds of sleep in the
+       server pause API code.
+     * Start an API request 'request1': ``openstack server pause server1``
+     * Wait till the pause server request reaches the Nova API (you can see
+       the log on the controller).
+     * Because of sleep(10), the pause server takes time to finish.
+     * Initiate the Nova API service shutdown.
+     * The service will wait because 'request1' is not finished.
+     * Nova API will make an RPC cast to the Nova compute service and
+       return.
+     * 'request1' is completed, and the response is returned to the user.
+     * The Nova API service is terminated now.
+     * The Nova compute service is processing the pause server request.
+     * Check if the server is paused: ``openstack server list``
+     * You can see the server is paused.
+
+* Nova console proxy services: nova-novncproxy, nova-serialproxy, and
+  nova-spicehtml5proxy:
+
+  All the console proxy services run as a websockify.websocketproxy_
+  service. The websockify_ library handles the SIGTERM signal and the
+  graceful shutdown, which is enough for the Nova services.
+
+  When a user accesses the console, the websockify library starts a new
+  process in start_service_ and calls Nova's new_websocket_client_. Nova
+  authorizes the token and creates a socket on the host & port, which will
+  be used to send the data/frames. After that, the user can access the
+  console.
+
+  If a shutdown request is initiated, websockify handles the signal. First,
+  it terminates all the child processes and then raises the terminate
+  exception, which ends up calling the Nova close_connection_ method. The
+  Nova close_connection_ method calls shutdown() on the socket first and
+  then close(), which makes sure the remaining data/frames are sent before
+  closing the socket.
+
+  This way, user console sessions will be terminated gracefully, and users
+  will get a "Disconnected" message. Once the service is up again, the user
+  can refresh the browser, and the console will be back (if the token has
+  not expired).
+
+Spec 1: Split the new and in-progress requests via RPC:
+--------------------------------------------------------
+
+RPC communication is an essential part of how services finish a particular
+operation. During shutdown, we need to make sure we keep the required RPC
+servers/buses up. If we stop the RPC communication, then it is no different
+from terminating the service.
+
+Nova implements, and this spec talks a lot about, RPC server ``start``,
+``stop``, and ``wait``, so let's cover them briefly from the
+oslo.messaging/RPC resources point of view, to make this proposal easier to
+understand. If you already know this, you can skip this part.
+
+* RPC server:
+
+  * creation and start():
+
+    * It creates the required resources on the oslo.messaging side, for
+      example, dispatcher, consumer, listener, and queues.
+    * It handles the binding to the required exchanges.
+
+  * stop():
+
+    * It disables the listener's ability to pick up any new message from the
+      queue, but already-picked messages are still dispatched to the
+      dispatcher.
+    * It deletes the consumer.
+    * It does not delete the queues and exchange on the message broker side.
+    * It does not stop RPC clients from sending new messages to the queue;
+      however, they will not be picked up because the consumer and listener
+      are stopped.
+
+  * wait():
+
+    * It waits for the thread pool to finish dispatching all the
+      already-picked messages. Basically, this makes sure the methods are
+      called on the manager.
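+
+As a minimal sketch (the topic, host, and endpoint class here are
+illustrative assumptions, not the exact Nova wiring), the lifecycle above
+maps to the oslo.messaging API like this:
+
+.. code-block:: python
+
+   import oslo_messaging
+   from oslo_config import cfg
+
+   transport = oslo_messaging.get_rpc_transport(cfg.CONF)
+   target = oslo_messaging.Target(topic='compute', server='host1')
+   endpoints = [ComputeManager()]  # object whose methods RPC dispatches to
+
+   server = oslo_messaging.get_rpc_server(transport, target, endpoints)
+   server.start()  # create listener/consumer, bind queues to the exchange
+
+   # ... service runs ...
+
+   server.stop()   # stop picking up new messages; queues stay on the broker
+   server.wait()   # drain messages that were already picked up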
+
+Analysis per service and the required proposed RPC design change:
+
+* The services listed below communicate with other Nova services' RPC
+  servers. Since they do not have their own RPC server, no change is needed:
+
+  * Nova API
+  * Nova metadata API
+  * nova-novncproxy
+  * nova-serialproxy
+  * nova-spicehtml5proxy
+
+* Nova scheduler: No RPC change needed.
+
+  * Request handling:
+    The Nova scheduler service runs as multiple workers, each having its own
+    RPC server, but all the Nova scheduler workers listen to the same RPC
+    topic and queue ``scheduler`` in a fanout way.
+
+    Currently, nova.service.py->stop() calls stop() and wait() on the RPC
+    server. Once the RPC server is stopped, it stops listening for new
+    messages. But this does not impact the other scheduler workers; they
+    continue listening to the same queue and processing requests. If one of
+    the scheduler workers is stopped, the other workers will process the
+    requests.
+
+  * Response handling:
+    Whenever there is an RPC call, oslo.messaging creates a reply queue
+    connected with the unique message id. This reply queue is used to send
+    the RPC call response to the caller. Even if the RPC server is stopped
+    on this worker, the reply queue is not impacted.
+
+    We still need to keep the worker up until all the responses are sent via
+    the reply queue, and for that, we need to implement in-progress task
+    tracking in the scheduler service, but that will be handled in step 2.
+
+    This way, stopping a Nova scheduler worker will not impact the RPC
+    communication of the scheduler service.
+
+* Nova conductor: No RPC change needed.
+
+  The Nova conductor binary is a stateless service that can spawn multiple
+  worker threads. Each instance of the Nova conductor has its own RPC
+  server, but all the Nova conductor instances listen to the same RPC topic
+  and queue ``conductor``. This allows the conductor instances to act as a
+  distributed worker pool, such that stopping an individual conductor
+  instance will not impact the RPC communication for the pool of conductor
+  instances, allowing other available workers to process the request. Each
+  cell has its own pool of conductors, meaning that as long as one conductor
+  is up for any given cell, the RPC communication will continue to function
+  even when one or more conductors are stopped.
+
+  The request and response handling is done in the same way as mentioned for
+  the scheduler.
+
+  .. note::
+
+     This spec does not cover the single-worker conductor case. That might
+     require an RPC redesign for the conductor as well, but it needs more
+     investigation.
+
+* Nova compute: RPC design change needed.
+
+  * Request handling:
+    Nova compute runs as a single worker per host, and each compute has its
+    own RPC server, listener, and separate queues. It handles new requests
+    as well as the communication needed for in-progress operations on the
+    same RPC server. To achieve the graceful shutdown, we need to separate
+    the communication for new requests and for in-progress operations. This
+    will be done by adding a new RPC server in the compute service (a sketch
+    of the two servers is shown at the end of this section).
+
+    For easy readability, we will be using a different term for each RPC
+    server:
+
+    * 'ops RPC server': This term will be used for the new RPC server, which
+      will be used to finish the in-progress requests and will stay up
+      during shutdown.
+    * 'new request RPC server': This term will be used for the current RPC
+      server, which is used for the new requests and will be stopped during
+      shutdown.
+
+  * 'new request RPC server' per compute:
+    No change in this RPC server, but it will be used for all the new
+    requests, so that we can stop it during shutdown and thereby stop the
+    new requests on the compute.
+
+  * 'ops RPC server' per compute:
+
+    * Each compute will have a new 'ops RPC server' which will listen to a
+      new topic ``compute-ops.<host>``. The ``compute-ops`` name is used
+      because it is mainly for compute operations, but a better name can be
+      used if needed.
+    * It will use the same transport layer/bus and exchange that the
+      'new request RPC server' uses.
+    * It will create its own dispatcher, listener, and queue.
+    * Both RPC servers will be bound to the same endpoints (the same compute
+      manager), so that requests coming from either server are handled by
+      the same compute manager.
+    * This server will be mainly used for the compute-to-compute operations
+      and server external events. The idea is to keep this RPC server up
+      during shutdown so that the in-progress operations can be finished.
+    * During shutdown, nova.service will wait for the compute to signal that
+      it has finished all its tasks, so that it can stop the
+      'ops RPC server' and finish the shutdown.
+
+  * Response handling:
+    Irrespective of which RPC server a request comes from, whenever there is
+    an RPC call, oslo.messaging creates a reply queue connected with the
+    unique message id. This reply queue is used to send the RPC call
+    response to the caller. Even if the RPC server is stopped on this
+    worker, the reply queue is not impacted.
+
+  * Compute service workflow:
+
+    * The SIGTERM signal is handled by oslo.service, which calls stop on
+      nova.service.
+    * nova.service stops the 'new request RPC server' so that no new
+      requests are picked up by the compute. The 'ops RPC server' stays up
+      and running.
+    * nova.service waits for the manager to signal that all in-progress
+      operations are finished.
+    * Once the compute signals nova.service, it stops the 'ops RPC server'
+      and proceeds with the service shutdown.
+
+  * Timeout:
+
+    * There is an existing graceful_shutdown_timeout_ config option present
+      in oslo.service which can be set per service.
+    * It is honoured to time out the service stop, and it will stop the
+      service irrespective of whether the compute has finished its work.
+
+  * RPC client:
+
+    * The RPC client stays a singleton class, created with the topic
+      ``compute.<host>``, meaning that by default messages will be sent to
+      the 'new request RPC server'.
+    * If any RPC cast/call wants to send a message via the
+      'ops RPC server', it needs to override the ``topic`` to
+      ``compute-ops.<host>`` during the client.prepare() call.
+    * Which RPC casts/calls will use the 'ops RPC server' will be decided
+      during implementation, so that we can better judge which methods are
+      used by the operations we want to finish during shutdown.
+      A draft list of where we can use the 'ops RPC server':
+
+      .. note::
+
+         This is a draft list and can be changed during implementation.
+
+      * Migrations:
+
+        - Live migration:
+
+          .. note::
+
+             We will be using the 'new request RPC server' for the
+             check_can_live_migrate_destination and
+             check_can_live_migrate_source methods, as this is the very
+             initial phase where the compute service has not started the
+             live migration. If shutdown is initiated before the live
+             migration request comes in, the migration should be rejected.
+
+          - pre_live_migration()
+          - live_migration()
+          - prep_snapshot_based_resize_at_dest()
+          - remove_volume_connection()
+          - post_live_migration_at_destination()
+          - rollback_live_migration_at_destination()
+          - drop_move_claim_at_destination()
+
+        - resize methods
+        - cold migration methods
+
+      * Server external events
+      * Rebuild instance
+      * validate_console_port()
+        This is when the console has already been requested; if a port
+        validation request is going on, the compute should finish it before
+        shutdown so that users can get their requested console.
+
+* Time-based waiting for services to finish the in-progress operations:
+
+  .. note::
+
+     The time-based waiting is a temporary solution in spec 1. In spec 2, it
+     will be replaced by proper tracking of the in-progress tasks.
+
+  * To make the graceful shutdown less complicated, spec 1 proposes a
+    configurable time-based wait for services to complete their operations.
+  * The wait time should be less than the global graceful shutdown timeout,
+    so that the external system or oslo.service does not shut down the
+    service before the service wait time is over.
+  * It will be configurable per service.
+  * Proposal for the default values:
+
+    * compute service: 150 sec, considering long-running operations on
+      compute.
+    * conductor service: 60 sec should be enough.
+    * scheduler service: 60 sec should be enough.
+
+* PoC:
+  This PoC shows the spec 1 proposal working.
+
+  * Code change: https://review.opendev.org/c/openstack/nova/+/967261
+  * PoC results: https://docs.google.com/document/d/1wd_VSw4fBYCXgyh5qwnjvjticNa8AnghzRmRH3H8pu4/
+
+* Some specific examples of the shutdown issues which will be solved by this
+  proposal:
+
+  * Migrations:
+
+    * Migration operations will use the 'ops RPC server'.
+    * If a migration is in progress, the service shutdown will not terminate
+      it; instead, the service will be able to wait for the migration to
+      complete.
+
+  * Instance boot:
+
+    * Instance boot operations will continue to use the
+      'new request RPC server'. Otherwise, we would not be able to stop the
+      new requests.
+    * If instance boot requests are in progress on the compute service, the
+      shutdown will wait for the compute to boot them successfully.
+    * If a new instance boot request arrives after the shutdown is
+      initiated, it will stay in the queue, and the compute will handle it
+      once it is started again.
+
+  * Any operation which has already reached the compute will be completed
+    before the service is shut down.
+
+.. note::
+
+   As per my PoC and manual testing so far, this does not require any change
+   on the oslo.messaging side.
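+
+As referenced in the compute section above, here is a minimal sketch of the
+two RPC servers bound to the same manager. The topic names follow this spec;
+the helper wiring is an illustrative assumption, not the final
+implementation:
+
+.. code-block:: python
+
+   import oslo_messaging
+   from oslo_config import cfg
+
+   def create_compute_rpc_servers(host, manager):
+       transport = oslo_messaging.get_rpc_transport(cfg.CONF)
+       endpoints = [manager]  # both servers dispatch to the same manager
+
+       # 'new request RPC server': stopped first on SIGTERM so that no new
+       # requests are picked up.
+       new_request_server = oslo_messaging.get_rpc_server(
+           transport,
+           oslo_messaging.Target(topic='compute', server=host),
+           endpoints)
+
+       # 'ops RPC server': stays up during shutdown so that in-progress
+       # operations (e.g. live migration) can keep communicating.
+       ops_server = oslo_messaging.get_rpc_server(
+           transport,
+           oslo_messaging.Target(topic='compute-ops', server=host),
+           endpoints)
+
+       return new_request_server, ops_server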
+
+Spec 2: Smartly track and wait for the in-progress operations:
+---------------------------------------------------------------
+
+* The graceful shutdown of the services below is handled by their deployment
+  server or library, so no work is needed for Spec 2:
+
+  * Nova API
+  * Nova metadata API
+  * nova-novncproxy
+  * nova-serialproxy
+  * nova-spicehtml5proxy
+
+* The services below need to implement the tracking system:
+
+  * Nova compute
+  * Nova conductor
+  * Nova scheduler
+
+This proposal is to make the services wait based on tracking the in-progress
+tasks. Once a service finishes its tasks, it can signal nova.service to
+proceed with shutting down. Basically, this replaces the wait-time approach
+mentioned above with a tracker-based approach. A sketch of such a tracker is
+shown at the end of this section.
+
+* There will be a task tracker introduced to track the in-progress tasks.
+* It will be a singleton object.
+* It maintains a list of 'method names' and ``request-id``. If a task is
+  related to an instance, we can also add the instance UUID, which can help
+  to filter or know which operations on a specific instance are in progress.
+  The unique ``request-id`` will help to track multiple calls to the same
+  method.
+* Whenever a new request comes to the compute, it is added to the task list
+  and removed once the task is completed. Modifications to the tracker are
+  done under a lock.
+* Once shutdown is initiated:
+
+  * The task tracker will either add new tasks to the tracker list or reject
+    them. The decision will be made case by case, for example, rejecting
+    tasks that are not critical to handle during shutdown.
+  * During shutdown, any new periodic tasks will be denied, but in-progress
+    periodic tasks will be finished.
+  * The exact list of tasks which will be rejected and accepted will be
+    decided during implementation.
+  * The task tracker will start logging the tasks which are in progress, and
+    log when they are completed. Basically, it logs a detailed view of the
+    in-progress work during shutdown.
+
+* nova.service will wait for the task tracker to finish the in-progress
+  tasks until the timeout.
+* An example of the flow of RPC server stop, wait, and task tracker wait
+  will be something like:
+
+  * Signal the task tracker to start logging the in-progress tasks.
+  * RPCserver1.stop()
+  * RPCserver1.wait()
+  * manager.finish_tasks(): wait for the manager to finish the in-progress
+    tasks.
+  * RPCserver2.stop()
+  * RPCserver2.wait()
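+
+As referenced above, a minimal sketch of what such a task tracker could look
+like (the class and method names are illustrative assumptions; the real
+interface will be decided during implementation):
+
+.. code-block:: python
+
+   import logging
+   import threading
+   import time
+
+   LOG = logging.getLogger(__name__)
+
+
+   class ShuttingDown(Exception):
+       """Raised when a new task is rejected during shutdown."""
+
+
+   class TaskTracker:
+       """Singleton-style tracker of in-progress tasks."""
+
+       def __init__(self):
+           self._lock = threading.Lock()
+           self._tasks = {}        # request-id -> (method, instance uuid)
+           self._draining = False  # set once shutdown is initiated
+
+       def start(self, request_id, method, instance_uuid=None):
+           with self._lock:
+               if self._draining:
+                   # Case-by-case policy: non-critical new tasks are
+                   # rejected once shutdown has been initiated.
+                   raise ShuttingDown()
+               self._tasks[request_id] = (method, instance_uuid)
+
+       def finish(self, request_id):
+           with self._lock:
+               self._tasks.pop(request_id, None)
+
+       def drain(self, timeout, interval=1.0):
+           """Log in-progress tasks and wait for them, up to timeout."""
+           with self._lock:
+               self._draining = True
+           deadline = time.monotonic() + timeout
+           while time.monotonic() < deadline:
+               with self._lock:
+                   if not self._tasks:
+                       return True
+                   LOG.info('Waiting on in-progress tasks: %s', self._tasks)
+               time.sleep(interval)
+           return False  # caller logs the unfinished tasks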
+
+Graceful Shutdown Timeouts:
+---------------------------
+
+* Nova service timeout:
+
+  * oslo.service already has a timeout (graceful_shutdown_timeout_) which is
+    configurable per service and is used to time out the SIGTERM signal
+    handler.
+  * oslo.service will terminate the Nova service based on
+    graceful_shutdown_timeout_, even if the Nova service's graceful shutdown
+    is not finished.
+  * No new configurable timeout will be added to Nova; instead it will use
+    the existing graceful_shutdown_timeout_.
+  * Its default value is 60 sec, which is too short for Nova services. The
+    proposal is to override its default value per Nova service (see the
+    sketch at the end of this section):
+
+    * compute service: 180 sec (considering the long-running tasks).
+    * conductor service: 80 sec
+    * scheduler service: 80 sec
+
+* External system timeout:
+
+  Depending on how the Nova services are deployed, there might be an
+  external system timeout for graceful shutdown (for example, Nova running
+  in k8s pods). That can impact the Nova graceful shutdown, so we need to
+  document clearly that if there is an external system timeout, the Nova
+  graceful_shutdown_timeout_ should be set accordingly. The external system
+  timeout should be higher than graceful_shutdown_timeout_, otherwise the
+  external system will time out and interrupt the Nova graceful shutdown.
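+
+A minimal sketch of how the per-service default override could look. This is
+an illustrative assumption (the exact mechanism will be decided during
+implementation); the option itself is registered by oslo.service:
+
+.. code-block:: python
+
+   from oslo_config import cfg
+
+   def override_shutdown_timeout_default(binary):
+       # Raise the oslo.service default (60 sec) per Nova service.
+       defaults = {
+           'nova-compute': 180,    # long-running tasks, e.g. migrations
+           'nova-conductor': 80,
+           'nova-scheduler': 80,
+       }
+       if binary in defaults:
+           cfg.CONF.set_default('graceful_shutdown_timeout',
+                                defaults[binary])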
+
+Alternatives
+------------
+
+One alternative to the RPC redesign is to handle two topics per RPC server.
+This needs a good amount of change in the oslo.messaging framework as well
+as in the driver implementations. The idea is to allow the oslo.messaging
+Target to take more than one topic (take the topic as a list) and ask the
+driver to create separate consumers, listeners, dispatchers, and queues for
+each topic, and to create each topic's binding to the exchange. This also
+requires oslo.messaging to provide a new way to let the RPC server
+unsubscribe from a particular topic and continue listening on the other
+topics. We would also need to redesign how RPC server stop() and wait() work
+today. This is too complicated and amounts to redesigning the oslo.messaging
+RPC concepts.
+
+Another alternative is to track and stop sending requests from the Nova API
+or the scheduler service, but that would neither stop all the new requests
+(compute-to-compute tasks) nor let in-progress operations complete.
+
+Data model impact
+-----------------
+
+None
+
+REST API impact
+---------------
+
+None
+
+Security impact
+---------------
+
+None
+
+Notifications impact
+--------------------
+
+None
+
+Other end user impact
+---------------------
+
+This should have a positive impact on end users: the shutdown will no longer
+stop their in-progress operations.
+
+Performance Impact
+------------------
+
+No impact on normal operations, but the service shutdown will take more
+time. There is a configurable timeout to control the service shutdown wait
+time.
+
+Other deployer impact
+---------------------
+
+None other than a longer shutdown process, but deployers can configure an
+appropriate timeout for the service shutdown.
+
+Developer impact
+----------------
+
+None
+
+Upgrade impact
+--------------
+
+Adding a new RPC server will impact the upgrade. An old compute will not
+have the new 'ops RPC server' listening on the topic RPC_TOPIC_OPS, so we
+need to handle it with RPC versioning. If the RPC client detects an old
+compute (based on version_cap), it will fall back to sending the message to
+the original RPC server (listening on RPC_TOPIC).
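+
+A minimal sketch of the fallback described above (the RPC version number and
+helper name are hypothetical; the spec only requires that the routing be
+gated on the negotiated version):
+
+.. code-block:: python
+
+   def _prepare_for_ops(self, host):
+       # can_send_version() checks the negotiated version cap; an old
+       # compute without the 'ops RPC server' fails this check.
+       if self.client.can_send_version('6.5'):  # hypothetical version
+           # New compute: route via the 'ops RPC server' topic.
+           return self.client.prepare(topic='compute-ops.%s' % host)
+       # Old compute: fall back to the original single RPC server.
+       return self.client.prepare(server=host)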
+
+Implementation
+==============
+
+Assignee(s)
+-----------
+
+Primary assignee:
+
+gmaan
+
+Other contributors:
+
+None
+
+Feature Liaison
+---------------
+
+gmaan
+
+Work Items
+----------
+
+* Implement the 'ops RPC server' in the compute service.
+* Use the 'ops RPC server' for the operations we need to finish during
+  shutdown, for example, compute-to-compute tasks and server external
+  events.
+* RPC versioning due to the upgrade impact.
+* Implement a task tracker for services to track and report the in-progress
+  tasks during shutdown.
+
+Dependencies
+============
+
+* No dependency as of now, but we will see during implementation if any
+  change is needed in oslo.messaging.
+
+
+Testing
+=======
+
+* We cannot write Tempest tests for this because Tempest is not able to stop
+  the services.
+* We can try some testing in the 'post-run' phase (with a heavy live
+  migration that takes time), like it is done for the evacuate tests.
+* Unit and functional tests will be added.
+
+
+Documentation Impact
+====================
+
+The graceful shutdown behaviour will be documented along with other
+considerations, for example, the timeout or wait time used for the graceful
+shutdown.
+
+References
+==========
+
+* PoC:
+
+  * Code change: https://review.opendev.org/c/openstack/nova/+/967261
+  * PoC results: https://docs.google.com/document/d/1wd_VSw4fBYCXgyh5qwnjvjticNa8AnghzRmRH3H8pu4/
+
+* PTG discussions:
+
+  * https://etherpad.opendev.org/p/nova-2026.1-ptg#L860
+  * https://etherpad.opendev.org/p/nova-2025.1-ptg#L413
+  * https://etherpad.opendev.org/p/r.3d37f484b24bb0415983f345582508f7#L180
+
+.. _`websockify.websocketproxy`: https://github.com/novnc/websockify/blob/e9bd68cbb81ab9b0c4ee5fa7a62faba824a142d1/websockify/websocketproxy.py#L300
+.. _`websockify`: https://github.com/novnc/websockify
+.. _`start_service`: https://github.com/novnc/websockify/blob/e9bd68cbb81ab9b0c4ee5fa7a62faba824a142d1/websockify/websockifyserver.py#L861
+.. _`new_websocket_client`: https://github.com/openstack/nova/blob/23b462d77df1a1d09c43d0918bca853ef3af1e3f/nova/console/websocketproxy.py#L164C9-L164C29
+.. _`close_connection`: https://github.com/openstack/nova/blob/23b462d77df1a1d09c43d0918bca853ef3af1e3f/nova/console/websocketproxy.py#L150
+.. _`graceful_shutdown_timeout`: https://github.com/openstack/oslo.service/blob/8969233a0a45dad06c445fdf4a66920bd5f3eef0/oslo_service/_options.py#L60
+
+History
+=======
+
+.. list-table:: Revisions
+   :header-rows: 1
+
+   * - Release Name
+     - Description
+   * - 2026.1 Gazpacho
+     - Introduced