From 487d14aafc6a56d40ddbe40a301bef2e564c61bb Mon Sep 17 00:00:00 2001
From: Eric MacDonald
Date: Fri, 25 Jul 2025 08:37:41 -0400
Subject: [PATCH] Increase Maintenance Heartbeat period from 100 to 1000 msecs

This update changes the default Maintenance Heartbeat period
from 100 msecs to 1 second (1000 msecs).

Test Plan:

PASS: Verify full deployment of WRCP AIO DX Plus 1 worker
PASS: Verify full deployment of 2+4+2 Standard System
PASS: Verify heartbeat period default is 1 second

Regression:

PASS: Verify AIO DX enable handler heartbeat soak
PASS: Verify AIO DX add handler heartbeat soak
PASS: Verify Standard controller enable handler heartbeat soak
PASS: Verify Standard controller add handler heartbeat soak
PASS: Verify Standard worker node enable handler heartbeat soak
PASS: Verify Standard worker node add handler heartbeat soak
PASS: Verify heartbeat loss handling with new default heartbeat period
PASS: Verify MNFA handling with new default heartbeat period
PASS: Verify hostwd quorum process failure fault detection and
      handling timing is not affected by new default heartbeat period.
PASS: Run WRCP DX Sanity on AIO DX and 2+4+2 Standard system

Depends-On: https://review.opendev.org/c/starlingx/config/+/955893
Partial-Fix: 2117252
Change-Id: Iaae2cc0efca92aa751e9404c886ac569d238be86
Signed-off-by: Eric MacDonald
---
 .../adjusting-the-boot-timeout-interval.rst                   | 2 +-
 ...t-heartbeat-interval-and-heartbeat-response-thresholds.rst | 4 ++--
 .../configuring-heartbeat-failure-action.rst                  | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/doc/source/node_management/kubernetes/customizing_the_host_life_cycles/adjusting-the-boot-timeout-interval.rst b/doc/source/node_management/kubernetes/customizing_the_host_life_cycles/adjusting-the-boot-timeout-interval.rst
index 53f2281b0..abdc735e0 100644
--- a/doc/source/node_management/kubernetes/customizing_the_host_life_cycles/adjusting-the-boot-timeout-interval.rst
+++ b/doc/source/node_management/kubernetes/customizing_the_host_life_cycles/adjusting-the-boot-timeout-interval.rst
@@ -32,7 +32,7 @@ see :ref:`The Life Cycle of a Host `.
  | c3a9... | platform | maintenance   | heartbeat_degrade_threshold | 6     |
  | 9089... | platform | maintenance   | heartbeat_failure_action    | fail  |
  | 8df8... | platform | maintenance   | heartbeat_failure_threshold | 10    |
- | 16b5... | platform | maintenance   | heartbeat_period            | 100   |
+ | 16b5... | platform | maintenance   | heartbeat_period            | 1000  |
  | 4712... | platform | maintenance   | mnfa_threshold              | 2     |
  | 4ba7... | platform | maintenance   | mnfa_timeout                | 0     |
  +---------+----------+---------------+-----------------------------+-------+
diff --git a/doc/source/node_management/kubernetes/customizing_the_host_life_cycles/adjusting-the-host-heartbeat-interval-and-heartbeat-response-thresholds.rst b/doc/source/node_management/kubernetes/customizing_the_host_life_cycles/adjusting-the-host-heartbeat-interval-and-heartbeat-response-thresholds.rst
index 9485ab137..958436d05 100644
--- a/doc/source/node_management/kubernetes/customizing_the_host_life_cycles/adjusting-the-host-heartbeat-interval-and-heartbeat-response-thresholds.rst
+++ b/doc/source/node_management/kubernetes/customizing_the_host_life_cycles/adjusting-the-host-heartbeat-interval-and-heartbeat-response-thresholds.rst
@@ -35,7 +35,7 @@ see :ref:`The Life Cycle of a Host `.
  | c3a9... | platform | maintenance   | heartbeat_degrade_threshold | 6     |
  | 9089... | platform | maintenance   | heartbeat_failure_action    | fail  |
  | 8df8... | platform | maintenance   | heartbeat_failure_threshold | 10    |
- | 16b5... | platform | maintenance   | heartbeat_period            | 100   |
+ | 16b5... | platform | maintenance   | heartbeat_period            | 1000  |
  | 4712... | platform | maintenance   | mnfa_threshold              | 2     |
  | 4ba7... | platform | maintenance   | mnfa_timeout                | 0     |
  +---------+----------+---------------+-----------------------------+-------+
@@ -54,7 +54,7 @@ see :ref:`The Life Cycle of a Host `.
  **heartbeat_period**
      The time in milliseconds between heartbeat challenges from the
      controller to the other hosts (100–1000 ms). The default is
-     100 ms.
+     1000 ms.
  **heartbeat_degrade_threshold**
      The number of consecutive missing responses to heartbeat challenges
diff --git a/doc/source/node_management/kubernetes/customizing_the_host_life_cycles/configuring-heartbeat-failure-action.rst b/doc/source/node_management/kubernetes/customizing_the_host_life_cycles/configuring-heartbeat-failure-action.rst
index 6e7019490..03cab9085 100644
--- a/doc/source/node_management/kubernetes/customizing_the_host_life_cycles/configuring-heartbeat-failure-action.rst
+++ b/doc/source/node_management/kubernetes/customizing_the_host_life_cycles/configuring-heartbeat-failure-action.rst
@@ -36,7 +36,7 @@ immediately in the event of a persistent loss of maintenance heartbeat.
  | c3a9... | platform | maintenance   | heartbeat_degrade_threshold | 6     |
  | 9089... | platform | maintenance   | heartbeat_failure_action    | fail  |
  | 8df8... | platform | maintenance   | heartbeat_failure_threshold | 10    |
- | 16b5... | platform | maintenance   | heartbeat_period            | 100   |
+ | 16b5... | platform | maintenance   | heartbeat_period            | 1000  |
  | 4712... | platform | maintenance   | mnfa_threshold              | 2     |
  | 4ba7... | platform | maintenance   | mnfa_timeout                | 0     |
  +---------+----------+---------------+-----------------------------+-------+