diff --git a/doc/source/admin/concurrency.rst b/doc/source/admin/concurrency.rst
new file mode 100644
index 000000000000..5d1813961ead
--- /dev/null
+++ b/doc/source/admin/concurrency.rst
@@ -0,0 +1,106 @@
+Nova service concurrency
+========================
+
+For a long time nova services relied almost exclusively on the Eventlet library
+for processing multiple API requests, RPC requests and other tasks that needed
+concurrency. Since Eventlet is not expected to support the next major CPython
+version the OpenStack TC set a `goal`__ to replace Eventlet and therefore Nova
+has started transitioning its concurrency model to native threads. During this
+transition Nova maintains the Eventlet based concurrency mode while building
+up support for the native threading mode.
+
+.. __: https://governance.openstack.org/tc/goals/selected/remove-eventlet.html
+
+.. note::
+
+    The native threading mode is not ready yet. Do not use it in production.
+
+Selecting concurrency mode for a service
+----------------------------------------
+
+Nova still uses Eventlet by default, but allows switching services to native
+threading mode at service startup by setting the environment variable
+``OS_NOVA_DISABLE_EVENTLET_PATCHING=true``.
+
+.. note::
+
+    Since nova 32.0.0 (2025.2 Flamingo) the nova-scheduler can be switched to
+    native threading mode.
+
+
+Tunables for the native threading mode
+--------------------------------------
+As native threads are more expensive resources than greenthreads Nova provides
+a set of configuration options to allow fine tuning the deployment based on
+load and resource constraints. The default values are selected to support a
+basic, small deployment without consuming substantially more memory resources
+than the legacy Eventlet mode. Increasing the size of the below thread pools
+means that the given service will consume more memory but will also allow more
+tasks to be executed concurrently.
+
+* :oslo.config:option:`cell_worker_thread_pool_size`: Used to execute tasks
+  across all the cells within the deployment.
+
+  E.g. to generate the result of the ``openstack server list`` CLI command, the
+  nova-api service will use one native thread for each cell to load the nova
+  instances from the related cell database.
+
+  So if the deployment has many cells then the size of this pool probably needs
+  to be increased.
+
+  This option is only relevant for nova-api, nova-metadata, nova-scheduler, and
+  nova-conductor as these are the services doing cross cell operations.
+
+* :oslo.config:option:`executor_thread_pool_size`: Used to handle incoming RPC
+  requests. Services with many more inbound requests will need larger pools.
+  For example, a single conductor serves requests from many computes as well
+  as the scheduler. A compute node only serves requests from the API for
+  lifecycle operations and other computes during migrations.
+
+  This option is only relevant for nova-scheduler, nova-conductor, and
+  nova-compute as these are the services acting as RPC servers.
+
+* :oslo.config:option:`default_thread_pool_size`: Used by various concurrent
+  tasks in the service that are not categorized into the above pools.
+
+  This option is relevant to every nova service using ``nova.utils.spawn()``.
+
+Seeing the usage of the pools
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When new work is submitted to any of these pools, in both concurrency modes,
+Nova logs the statistics of the pool (work executed, threads available,
+work queued, etc.).
+This can be useful when fine tuning of the pool size is needed.
+The parameter :oslo.config:option:`thread_pool_statistic_period` defines how
+frequently such logging happens from a specific pool in seconds. A value of
+60 seconds means that stats will be logged from a pool at most once every
+60 seconds. The value 0 means that logging happens every time work is submitted
+to the pool.
The default value is -1, meaning that the stats logging is
+disabled.
+
+Preventing hanging threads
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Threads from a pool are not cancellable once they are executing a task,
+therefore it is important to ensure external dependencies cannot hold up a
+task execution indefinitely as that will lead to having fewer threads in the
+pool available for incoming work and therefore reduced overall capacity.
+
+Nova's RPC interface already uses proper timeout handling to avoid hanging
+threads. But adding timeout handling to Nova's database interface is
+database server and database client library dependent.
+
+For mysql-server the `max_execution_time`__ configuration option can be used
+to limit the execution time of a database query on the server side. Similar
+options exist for other database servers.
+
+.. __: https://dev.mysql.com/doc/refman/8.4/en/server-system-variables.html#sysvar_max_execution_time
+
+For the pymysql database client a client side timeout can be implemented by
+adding the `read_timeout`__ connection parameter to the connection string.
+
+.. __: https://pymysql.readthedocs.io/en/latest/modules/connections.html#module-pymysql.connections
+
+We recommend using both in deployments where Nova services are running in
+native threading mode.
diff --git a/doc/source/admin/index.rst b/doc/source/admin/index.rst
index 1bbfd6aa7683..a856075afcae 100644
--- a/doc/source/admin/index.rst
+++ b/doc/source/admin/index.rst
@@ -105,6 +105,9 @@ the defaults from the :doc:`install guide ` will be sufficient.
 * :doc:`Running nova-api on wsgi `: Considerations for using a real
   WSGI container instead of the baked-in eventlet web server.
 
+* :doc:`Nova service concurrency `: Considerations on how
+  to use and tune Nova services in threading mode.
+
 .. toctree::
    :maxdepth: 2
 
@@ -113,6 +116,7 @@ the defaults from the :doc:`install guide ` will be sufficient.
    default-ports
    availability-zones
    configuration/index
+   concurrency
 
 
 Basic configuration
diff --git a/doc/source/reference/threading.rst b/doc/source/reference/threading.rst
index 5b33e2dd7704..1fff71e76590 100644
--- a/doc/source/reference/threading.rst
+++ b/doc/source/reference/threading.rst
@@ -1,8 +1,11 @@
 Threading model
 ===============
 
-All OpenStack services use *green thread* model of threading, implemented
-through using the Python `eventlet `_ and
+Eventlet
+--------
+Before the Flamingo release all OpenStack services used the *green thread*
+model of threading, implemented through using the Python
+`eventlet `_ and
 `greenlet `_ libraries.
 
 Green threads use a cooperative model of threading: thread context
@@ -18,7 +21,7 @@ In addition, since there is only one operating system thread, a call that
 blocks that main thread will block the entire process.
 
 Yielding the thread in long-running tasks
------------------------------------------
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 If a code path takes a long time to execute and does not contain any methods
 that trigger an eventlet context switch, the long-running thread will block
 any pending threads.
@@ -37,7 +40,7 @@ time module is patched through eventlet.monkey_patch(). To be explicit, we recom
 contributors use ``greenthread.sleep()`` instead of ``time.sleep()``.
 
 MySQL access and eventlet
--------------------------
+~~~~~~~~~~~~~~~~~~~~~~~~~
 There are some MySQL DB API drivers for oslo.db, like `PyMySQL`_, MySQL-python
 etc. PyMySQL is the default MySQL DB API driver for oslo.db, and it works well
 with eventlet. MySQL-python uses an external C library for accessing the MySQL database.
@@ -54,3 +57,21 @@ a discussion of the `impact on performance`_.
 .. _mailing list thread: https://lists.launchpad.net/openstack/msg08118.html
 .. _impact on performance: https://lists.launchpad.net/openstack/msg08217.html
 ..
_PyMySQL: https://wiki.openstack.org/wiki/PyMySQL_evaluation
+
+Native threading
+----------------
+Since the Flamingo release OpenStack started to transition away from
+``eventlet``. During this transition Nova maintains support for running
+services with ``eventlet`` while working to add support for running services
+with ``native threading``.
+
+To support both modes with the same codebase Nova started using the
+`futurist`_ library. In native threading mode ``futurist.ThreadPoolExecutor``
+instances are used to run concurrent tasks and both the oslo.service and the
+oslo.messaging libraries are configured to use native threads to execute tasks
+like periodics and RPC message handlers.
+
+.. _futurist: https://docs.openstack.org/futurist/latest/
+
+To see how to configure and tune the native threading mode read the
+:doc:`/admin/concurrency` guide.
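
The capacity effect of a fixed-size thread pool, as described in the admin guide added by this patch, can be illustrated with a minimal sketch. It assumes the standard library ``concurrent.futures.ThreadPoolExecutor`` as a stand-in for the futurist executor Nova uses. The pool size, the task count and the ``run_tasks`` helper are hypothetical and are not part of Nova.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 2  # hypothetical pool size, not a Nova default


def run_tasks(pool_size, task_count, task_seconds=0.05):
    """Run task_count blocking tasks and report the peak concurrency."""
    lock = threading.Lock()
    running = 0
    peak = 0

    def task(index):
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        # Stands in for a blocking call such as a database query.
        time.sleep(task_seconds)
        with lock:
            running -= 1
        return index

    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        futures = [pool.submit(task, i) for i in range(task_count)]
        # The timeout only bounds how long the caller waits for a result;
        # it does not cancel a worker thread that is already running.
        results = [future.result(timeout=5) for future in futures]
    return results, peak


results, peak = run_tasks(POOL_SIZE, task_count=6)
print(f"completed={len(results)} peak_concurrency={peak}")
```

No more than ``POOL_SIZE`` tasks run at the same time and the rest wait in the queue, so a task that never returns permanently removes one worker from the pool. This is why the guide recommends timeouts on the external dependency itself rather than only on the caller side.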