diff --git a/doc/source/specs/stx-10.0/approved/starlingx-2011105-cstate-management.rst b/doc/source/specs/stx-10.0/approved/starlingx-2011105-cstate-management.rst
new file mode 100644
index 0000000..1e7d062
--- /dev/null
+++ b/doc/source/specs/stx-10.0/approved/starlingx-2011105-cstate-management.rst
@@ -0,0 +1,289 @@
+..
+ This work is licensed under a Creative Commons Attribution 3.0 Unported
+ License. http://creativecommons.org/licenses/by/3.0/legalcode
+
+..
+ Many thanks to the OpenStack Nova team for the Example Spec that formed the
+ basis for this document.
+
+===========================================
+C-state Management Application on StarlingX
+===========================================
+
+Storyboard: `#2011105`_
+
+The objective of this spec is to introduce the C-state Management
+Application in StarlingX Platform.
+
+Problem description
+===================
+
+StarlingX, in its current version, offers a comprehensive set of features
+for power management. Users and applications can control the acceptable
+frequency range (minimum and maximum frequency) per core; the behavior of
+cores within that range (governor); which idle sleep states (C-states) a given
+core can access; and the behavior of the system in the face of
+workloads with known intervals/demands. `Kubernetes Power Manager`_ powers
+the control of the aforementioned features in targeted CPUs/cores, allowing
+individualized configurations.
+
+Oftentimes, containerized applications require finer-grained control
+of their CPU idle states (C-states) at run time. The
+`C-state Management Application` offers a set of endpoints that enable pods to
+dynamically query and adjust their C-states. It therefore allows users to
+save energy through fine-grained control of the C-states of the cores
+assigned to their applications.
+
+Use Cases
+---------
+
+With the introduction of these new capabilities for C-state management,
+StarlingX end users and deployers gain enhanced control over the CPU core
+configurations. These new features are beneficial for optimizing power
+consumption and performance.
+
+We identify the following potential impacts on StarlingX's stakeholders with
+this dynamic C-state management integration:
+
+* End users: The ability to adjust the maximum C-state level of CPU cores
+  assigned to pods through REST API requests offers increased flexibility
+  without disrupting existing workflows. This feature ensures seamless
+  integration with applications running on StarlingX, enhancing user
+  experience.
+
+* Deployers: The introduction of dynamic C-state management may necessitate
+  minor adjustments for deployers, primarily related to ensuring that assigned
+  CPU cores are appropriately configured as application-isolated or
+  exclusively allocated to the pods. Additionally, deployers may need to ensure
+  that REST API requests for C-state adjustments originate from the same node
+  where the application's pods are deployed, maintaining security and
+  efficiency.
+
+* Developers: The integration of C-state management brings significant
+  enhancements to the development workflow within StarlingX. By incorporating
+  a dynamic C-state management functionality, developers gain a more granular
+  level of control over CPU core configurations, allowing for finer
+  optimization of power usage and system performance.
+
+Proposed change
+===============
+
+The new `C-state Management Application` will be introduced to StarlingX,
+resulting in the addition of a REST API that empowers pods to dynamically
+control their C-states. When disabled, the application will not change
+StarlingX's standard behavior. When enabled, the Kubernetes pods will be
+able to programmatically manage their C-states.
+
+`C-state Management Application` essentially provides endpoints that enable the
+following functionalities:
+
+* Change the maximum C-state Level of CPU Cores.
+
+  * The application, via its REST API, initiates a request to modify the
+    maximum C-state level of the CPU cores allocated to its pods.
+  * The assigned CPU cores must either adhere to application isolation or be
+    exclusively assigned to the pods.
+  * The request originates from the node on which the application's pods
+    are deployed.
+
+* Query the Maximum Available C-state Levels.
+
+  * The application, through its REST API, sends a request to inquire about
+    the maximum C-state levels available for modification.
+
+* Query the Maximum C-state Configuration.
+
+  * The application, utilizing its REST API, requests information regarding
+    the configured maximum C-state from the node where its pods are currently
+    deployed.
+
+
+This specification also requires that the cloud platform shall be able to:
+
+* Process the C-state level requests (change/query) and respond with whether
+  the change occurred or with the current maximum C-state level.
+
+* Process the max C-state level requests (change/query) on the Platform
+  cores; in other words, the API producer shall run on the Platform cores.
+
+* Fulfill the request to change the max C-state within a granularity of
+  seconds.
+
+Alternatives
+------------
+
+None
+
+Data model impact
+-----------------
+
+None
+
+REST API impact
+---------------
+
+None
+
+Security impact
+---------------
+
+None
+
+Other end user impact
+---------------------
+
+A new REST API will be available, resulting in procedural changes for
+dynamically managing C-states on StarlingX. The users should be aware that
+the `C-state Management Application` is not designed to work in tandem with
+`Kubernetes Power Manager`_. Therefore, we recommend the use of only one of
+the aforementioned applications at a time.
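
As a rough illustration of the three operations listed under Proposed change
(change the maximum C-state, query the available levels, query the configured
maximum), a pod-side client might build its requests as sketched below. This
spec does not define the API surface, so the base URL, port, paths, and JSON
field names are all hypothetical placeholders, and the requests are only
constructed, not sent.

```python
import json
import urllib.request

# ASSUMPTION: address, port, paths, and payload fields are hypothetical.
# The spec only states that requests originate from the pod's own node.
BASE_URL = "http://127.0.0.1:8080"


def build_set_max_cstate(cpus, max_cstate):
    """Build a request asking to cap the deepest C-state on the given cores."""
    body = json.dumps({"cpus": list(cpus), "max_cstate": max_cstate})
    return urllib.request.Request(
        f"{BASE_URL}/v1/cstate/max",
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def build_get_available_cstates():
    """Build a request asking which C-state levels can be selected."""
    return urllib.request.Request(f"{BASE_URL}/v1/cstate/available", method="GET")


def build_get_max_cstate():
    """Build a request asking for the currently configured maximum C-state."""
    return urllib.request.Request(f"{BASE_URL}/v1/cstate/max", method="GET")
```

A caller would pass one of these objects to ``urllib.request.urlopen`` once the
real endpoint layout is known from the published usage guide.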
+
+C-state availability might depend on the presence of a label such as
+`power-management`_. The `C-state Management Application` is able to manage the
+available C-states independently of the applied labels.
+
+Performance Impact
+------------------
+
+Given the nature of dynamic C-state management, impacts related to power
+consumption and latency are expected to vary based on the usage of
+`C-state Management Application`. The following shall be considered:
+
+* Power Consumption: By actively monitoring and controlling the C-states,
+  applications can optimize power consumption based on workload demands,
+  reducing the overall energy consumption in the cluster. On the other hand,
+  an incorrect or inconsistent configuration might lead to performance
+  degradation.
+
+* Latency: C-states range from C0 to Cn. C0 indicates an active state. All
+  other C-states (C1-Cn) represent idle sleep states with different parts of
+  the processor powered down. As the C-states get deeper, the exit latency
+  (the time to transition back to C0) becomes longer and the power savings
+  become greater. This can increase the time required to process
+  varying workloads based on pre-defined parameters.
+
+Other deployer impact
+---------------------
+
+None
+
+Developer impact
+----------------
+
+Please see the `Use Cases`_ section.
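
To observe the latency trade-off described under Performance Impact on a given
node, a developer could list each idle state together with the exit latency
that the Linux kernel reports for it. The sketch below reads the standard
``cpuidle`` sysfs attributes (``name``, ``latency`` in microseconds, and
``disable``). It is an inspection aid only: it assumes a Linux host with
cpuidle enabled and says nothing about how the `C-state Management Application`
itself is implemented. The function name and return shape are illustrative.

```python
from pathlib import Path


def read_idle_states(cpu, sysfs_root="/sys/devices/system/cpu"):
    """Return (index, name, exit_latency_us, disabled) tuples for one CPU."""
    cpuidle = Path(sysfs_root) / f"cpu{cpu}" / "cpuidle"
    states = []
    # Directories are named state0, state1, ...; sort numerically, not lexically.
    for state_dir in sorted(
        cpuidle.glob("state[0-9]*"), key=lambda p: int(p.name[len("state"):])
    ):
        index = int(state_dir.name[len("state"):])
        name = (state_dir / "name").read_text().strip()
        latency_us = int((state_dir / "latency").read_text())
        disabled = (state_dir / "disable").read_text().strip() != "0"
        states.append((index, name, latency_us, disabled))
    return states
```

Calling ``read_idle_states(0)`` on a node returns the states for CPU 0 in order
of increasing depth; deeper states typically report larger exit latencies.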
+
+Upgrade impact
+--------------
+
+None
+
+
+Implementation
+==============
+
+Assignee(s)
+-----------
+
+Primary assignee:
+
+* Guilherme Batista Leite (guilhermebatista)
+
+Other contributors:
+
+* Alyson Deives Pereira (adeivesp)
+* Eduardo Juliano Alberti (ealberti)
+* Fabio Studyny Higa (fstudyny)
+* Guilherme Henrique Pereira dos Santos (gsantos1)
+* Vinicius Fernando Rocha Lobo (vrochalo)
+
+Repos Impacted
+--------------
+
+* starlingx/docs
+* starlingx/config
+* starlingx/app-cstate-management (new)
+
+
+Work Items
+----------
+
+The following work items are expected to be carried out, with the understanding
+that the storyboard will be updated as more work items are found to be
+necessary.
+
+Spikes and Design
+*****************
+
+* Basic testing of per-CPU latency specification.
+* Review of the proposed design.
+* Evaluation of options to reduce latency and expected latency reduction.
+
+Development Work Items
+**********************
+
+* Merge proof of concept to StarlingX codebase.
+* Create FluxCD manifest for C-state DaemonSet.
+* Create StarlingX application to wrap the FluxCD manifest.
+* Enhance C-state application to support IPv6 addresses.
+* Enhance C-state application to prevent modification of CPUs allocated to
+  other Pods.
+* Installation via system application.
+
+Customer Documentation
+**********************
+
+* Publish the usage guide for what functionality is available and how to make
+  use of it.
+* Sample code showing how to make use of the functionality.
+
+Dependencies
+============
+
+None
+
+Testing
+=======
+
+System configuration
+--------------------
+The tests will be conducted in the following system configurations:
+
+* AIO-SX
+* AIO-DX
+* Standard
+
+Test Scenarios
+--------------
+
+* Functional tests for `C-state Management Application` and its customizations.
+* Unit testing the impacted code areas.
+* Performance testing to identify and address any performance impacts.
+* Backup and restore tests.
+
+Documentation Impact
+====================
+
+The end-user documentation must be created, adding a guide to
+`C-state Management Application` deployments, configurations and
+customizations.
+
+References
+==========
+#. `Kubernetes Power Manager`_
+
+
+History
+=======
+
+.. list-table:: Revisions
+   :header-rows: 1
+
+   * - Release Name
+     - Description
+   * - stx-10.0
+     - Introduced
+
+.. Links
+.. _#2011105: https://storyboard.openstack.org/#!/story/2011105
+.. _Kubernetes Power Manager: https://github.com/intel/kubernetes-power-manager
+.. _power-management: https://docs.starlingx.io/node_management/kubernetes/configurable-power-manager-04c24b536696.html