
This patch adds a periodic healing mechanism to the monitor module and the monitoring plugins. With this change, the heal_reservations() method of the resource plugins receives the period to heal (the start_date and end_date arguments). Reservations for failed resources are no longer all healed (reallocated) immediately, because failed resources are expected to recover at some point in the future. The monitor heals only reservations that are active or will start soon; the remaining reservations are expected to be healed by the periodic healing.
Implements: blueprint healing-time
Change-Id: I6971c952fcde101ff2408f567fee9a7dab97b140
Compute Host Monitor
The compute host monitor detects failures and recoveries of compute hosts. When it detects a failure, it triggers healing of the host reservations and instance reservations that depend on the failed host. This document describes the compute host monitor plugin in detail.
Monitoring Type
The compute host monitor supports both push-based and polling-based monitoring. Each type can be enabled or disabled independently with the following configuration options:
- enable_notification_monitor: Set to True to enable the push-based (notification) monitor.
- enable_polling_monitor: Set to True to enable the polling-based monitor.
Failure Detection
The compute host monitor detects host failures and recoveries either by subscribing to Nova notifications or by polling the List Hypervisors API of Nova. If a failure is detected, Blazar sets the reservable field of the failed host to False and heals the affected reservations as follows.
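The polling path can be illustrated with a minimal sketch. It assumes the hypervisor records carry the `hypervisor_hostname`, `state`, and `status` fields that the Nova List Hypervisors API returns, and it treats a host as failed when it is reported down or disabled. That rule, the function names, and the in-memory `hosts` dictionary are illustrative assumptions, not Blazar's actual implementation.

```python
from typing import Iterable


def find_failed_hosts(hypervisors: Iterable[dict]) -> list[str]:
    """Return the hostnames of hypervisors that look failed."""
    failed = []
    for hv in hypervisors:
        # Assumption: "down" state or "disabled" status means failure.
        if hv.get("state") == "down" or hv.get("status") == "disabled":
            failed.append(hv["hypervisor_hostname"])
    return failed


def mark_unreservable(hosts: dict[str, dict], failed: Iterable[str]) -> None:
    """Set reservable to False for every known host in `failed`."""
    for name in failed:
        if name in hosts:
            hosts[name]["reservable"] = False


# Hypothetical polling result and host records, for illustration only.
polled = [
    {"hypervisor_hostname": "compute-1", "state": "up", "status": "enabled"},
    {"hypervisor_hostname": "compute-2", "state": "down", "status": "enabled"},
]
hosts = {
    "compute-1": {"reservable": True},
    "compute-2": {"reservable": True},
}

failed = find_failed_hosts(polled)
mark_unreservable(hosts, failed)
```

After this runs, only `compute-2` is marked as not reservable, so it would no longer be picked as a candidate for new or healed reservations.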
Reservation Healing
If a host failure is detected, Blazar tries to heal the host and instance reservations that use the failed host by reserving an alternative host. The length of the healing interval can be configured with the healing_interval option.
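As a rough sketch of how a healing interval can limit which reservations are healed right away, the snippet below selects only the reservations that overlap the window from now until now plus the interval. It assumes healing_interval is expressed in minutes and that reservations expose `start_date` and `end_date`; the helper names and the sample data are hypothetical.

```python
from datetime import datetime, timedelta


def healing_window(now: datetime, healing_interval_minutes: int):
    """Return the (start, end) period that should be healed now."""
    return now, now + timedelta(minutes=healing_interval_minutes)


def overlaps_window(reservation: dict, start: datetime, end: datetime) -> bool:
    """True if the reservation is active or starts within the window."""
    return reservation["start_date"] < end and reservation["end_date"] > start


now = datetime(2024, 1, 1, 12, 0)
start, end = healing_window(now, 60)  # assumed unit: minutes

# Hypothetical reservations that all use the failed host.
reservations = [
    {"id": "active", "start_date": datetime(2024, 1, 1, 11, 0),
     "end_date": datetime(2024, 1, 1, 13, 0)},
    {"id": "soon", "start_date": datetime(2024, 1, 1, 12, 30),
     "end_date": datetime(2024, 1, 1, 14, 0)},
    {"id": "later", "start_date": datetime(2024, 1, 1, 18, 0),
     "end_date": datetime(2024, 1, 1, 20, 0)},
]

to_heal = [r["id"] for r in reservations if overlaps_window(r, start, end)]
```

Here the `later` reservation is left untouched. If the failed host recovers before a later healing pass reaches that period, the reservation never needs to be reallocated.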
Configurations
To enable the compute host monitor, set the enable_notification_monitor option or the enable_polling_monitor option to True, and set healing_interval as appropriate for your cloud.
See also ../configuration/blazar-conf for details.
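The options above could appear in a configuration file roughly as in the sketch below. The `[physical:host]` section name and the value 60 are assumptions, so check the configuration reference for the authoritative section and defaults. The snippet parses the fragment with Python's standard configparser purely for illustration; Blazar itself does not load its options this way.

```python
import configparser

# Hypothetical fragment; section name and values are assumptions.
SAMPLE_CONF = """
[physical:host]
enable_notification_monitor = True
enable_polling_monitor = False
healing_interval = 60
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_CONF)
host_opts = parser["physical:host"]

# The monitor is active if at least one monitoring type is enabled.
monitor_enabled = (
    host_opts.getboolean("enable_notification_monitor")
    or host_opts.getboolean("enable_polling_monitor")
)
healing_interval = host_opts.getint("healing_interval")
```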