Watcher Overload standard deviation algorithm

Synopsis

display name: Workload stabilization

goal: workload_balancing

watcher.decision_engine.strategy.strategies.workload_stabilization.WorkloadStabilization

Requirements

Metrics

The workload_stabilization strategy requires the following metrics:

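======================== ============ ======= ========================================================================
metric                   service name plugins comment
======================== ============ ======= ========================================================================
compute.node.cpu.percent ceilometer   none    Need to set the compute_monitors option to cpu.virt_driver in nova.conf.
hardware.memory.used     ceilometer   SNMP
cpu                      ceilometer   none
instance_ram_usage       ceilometer   none
======================== ============ ======= ========================================================================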

Cluster data model

Watcher's default Compute cluster data model:

watcher.decision_engine.model.collector.nova.NovaClusterDataModelCollector

Actions

Watcher's default actions:

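========= =========================================
action    description
========= =========================================
migration watcher.applier.actions.migration.Migrate
========= =========================================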

Planner

Watcher's default planner:

watcher.decision_engine.planner.weight.WeightPlanner

Configuration

Strategy parameters are:

================ ====== ================================================== ================================================================
parameter        type   default value                                      description
================ ====== ================================================== ================================================================
metrics          array  ["instance_cpu_usage", "instance_ram_usage"]       Metrics used as rates of cluster loads.
thresholds       object {"instance_cpu_usage": 0.2,                        Dictionary mapping each metric to its trigger value.
                        "instance_ram_usage": 0.2}
weights          object {"instance_cpu_usage_weight": 1.0,                 Weights used to calculate the common standard deviation.
                        "instance_ram_usage_weight": 1.0}                  Each weight name is the metric name plus the _weight suffix.
instance_metrics object {"instance_cpu_usage": "compute.node.cpu.percent", Mapping to get hardware statistics using instance metrics.
                        "instance_ram_usage": "hardware.memory.used"}
host_choice      string retry                                              Method used to choose hosts: cycle, retry, or fullsearch.
                                                                           cycle iterates over the hosts in a cycle. retry returns randomly
                                                                           chosen hosts (the count is set by retry_count). fullsearch
                                                                           returns every host in the list.
retry_count      number 1                                                  Number of randomly chosen hosts to return.
periods          object {"instance": 720, "node": 600}                     Periods used to aggregate statistics for instance and host
                                                                           metrics. A period is a repeating interval of time into which
                                                                           samples are grouped for aggregation. Watcher uses only the last
                                                                           period of all those received.
================ ====== ================================================== ================================================================
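
The three host_choice methods described above can be pictured as generators of candidate host batches. The Python sketch below only illustrates that described behaviour and is not taken from Watcher's code; the function name yield_host_batches and its rng argument are hypothetical.

```python
import itertools
import random


def yield_host_batches(hosts, host_choice="retry", retry_count=1, rng=None):
    """Yield successive batches of candidate hosts (hypothetical helper)."""
    rng = rng or random.Random()
    if host_choice == "cycle":
        # One host per step, wrapping around the list indefinitely.
        for host in itertools.cycle(hosts):
            yield [host]
    elif host_choice == "retry":
        # A fresh random sample of retry_count hosts on every step.
        while True:
            yield rng.sample(hosts, retry_count)
    elif host_choice == "fullsearch":
        # The complete host list on every step.
        while True:
            yield list(hosts)
    else:
        raise ValueError(f"unknown host_choice: {host_choice!r}")


hosts = ["node-1", "node-2", "node-3"]
cycle_batches = list(itertools.islice(yield_host_batches(hosts, "cycle"), 4))
retry_batch = next(yield_host_batches(hosts, "retry", retry_count=2))
full_batch = next(yield_host_batches(hosts, "fullsearch"))
```

In this sketch a larger retry_count widens each random batch, while fullsearch trades randomness for exhaustive coverage at a higher cost per step.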

Efficacy Indicator

watcher.decision_engine.goal.efficacy.specs.ServerConsolidation.get_global_efficacy_indicator

Algorithm

A description of the overload algorithm and the role of standard deviation is available at https://specs.openstack.org/openstack/watcher-specs/specs/newton/implemented/sd-strategy.html
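
The linked spec remains the authoritative description. As a rough illustration of the idea only, the Python sketch below assumes that per-host loads are already normalized to the range 0 to 1, computes the population standard deviation of each metric across hosts, compares it with the thresholds parameter, and combines the deviations using the weights parameter. The helper names and the sample loads are hypothetical, and the weighted sum is an assumption about how the common standard deviation is formed.

```python
import statistics


def metric_deviations(host_loads, metrics):
    """Population standard deviation of normalized host load, per metric."""
    return {
        metric: statistics.pstdev([load[metric] for load in host_loads.values()])
        for metric in metrics
    }


def exceeded_thresholds(deviations, thresholds):
    """Metrics whose deviation is above the configured trigger value."""
    return [m for m, sd in deviations.items() if sd > thresholds[m]]


def weighted_deviation(deviations, weights):
    """Assumed combination: sum of each deviation times its <metric>_weight."""
    return sum(sd * weights[f"{m}_weight"] for m, sd in deviations.items())


# Hypothetical normalized loads for three hosts.
host_loads = {
    "node-1": {"instance_cpu_usage": 0.9, "instance_ram_usage": 0.5},
    "node-2": {"instance_cpu_usage": 0.1, "instance_ram_usage": 0.5},
    "node-3": {"instance_cpu_usage": 0.5, "instance_ram_usage": 0.5},
}
metrics = ["instance_cpu_usage", "instance_ram_usage"]
thresholds = {"instance_cpu_usage": 0.2, "instance_ram_usage": 0.2}
weights = {"instance_cpu_usage_weight": 1.0, "instance_ram_usage_weight": 1.0}

deviations = metric_deviations(host_loads, metrics)
exceeded = exceeded_thresholds(deviations, thresholds)
combined = weighted_deviation(deviations, weights)
```

With these sample loads the CPU deviation is about 0.33, above the default threshold of 0.2, while RAM usage is perfectly balanced, so only instance_cpu_usage would trigger rebalancing in this sketch.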

How to use it?

$ openstack optimize audittemplate create \
  at1 workload_balancing --strategy workload_stabilization

$ openstack optimize audit create -a at1 \
  -p thresholds='{"instance_ram_usage": 0.05}' \
  -p metrics='["instance_ram_usage"]'
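
When the parameter values come from a script, quoting the JSON by hand is error-prone. The Python sketch below assembles the same audit command shown above from a dictionary; the helper name audit_create_argv is hypothetical, and the sketch only covers object and array parameters such as thresholds and metrics.

```python
import json
import shlex


def audit_create_argv(template, parameters):
    """Build the argument list for the audit command (hypothetical helper)."""
    argv = ["openstack", "optimize", "audit", "create", "-a", template]
    for name, value in parameters.items():
        # Each strategy parameter is passed as -p name=<JSON value>.
        argv += ["-p", f"{name}={json.dumps(value)}"]
    return argv


argv = audit_create_argv(
    "at1",
    {
        "thresholds": {"instance_ram_usage": 0.05},
        "metrics": ["instance_ram_usage"],
    },
)

# shlex.join quotes every argument so the line can be pasted into a shell.
command = shlex.join(argv)
print(command)
```

The argument list can also be passed directly to subprocess.run, which avoids shell quoting altogether.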