Merge "Fixed exceptions in nova pods"

This commit is contained in:
Zuul 2019-06-06 20:42:31 +00:00 committed by Gerrit Code Review
commit 241a30e016
12 changed files with 94 additions and 10 deletions

View File

@ -29,6 +29,7 @@ Patch07: 0007-Horizon-Disable-apache2-status_module.patch
Patch08: 0008-Neutron-Add-support-for-disabling-Readiness-Liveness.patch
Patch09: 0009-Nova-Add-support-for-disabling-Readiness-Liveness-pr.patch
Patch10: 0010-Ironic-Add-pxe-boot-support-for-centos-image.patch
Patch11: 0011-Use-nova-s-ping-method-to-find-out-if-the-service-is.patch
BuildRequires: helm
BuildRequires: openstack-helm-infra
@ -49,6 +50,7 @@ Openstack Helm charts
%patch08 -p1
%patch09 -p1
%patch10 -p1
%patch11 -p1
%build
# initialize helm and build the toolkit

View File

@ -1,7 +1,7 @@
From 5ab3650ea105a53b97f7e0aec2086f141f847aa2 Mon Sep 17 00:00:00 2001
From: Angie Wang <angie.wang@windriver.com>
Date: Wed, 6 Mar 2019 15:26:25 -0500
Subject: [PATCH 01/10] Add Aodh Chart
Subject: [PATCH 01/11] Add Aodh Chart
This commit adds a helm chart to deploy aodh.
The default deployment for aodh is ocata.

View File

@ -1,7 +1,7 @@
From 5302aa4e87694e96cc3dfc56ae494a1a8211cc37 Mon Sep 17 00:00:00 2001
From: Angie Wang <angie.wang@windriver.com>
Date: Wed, 6 Mar 2019 18:06:06 -0500
Subject: [PATCH 02/10] Ceilometer chart: add the ability to publish events to
Subject: [PATCH 02/11] Ceilometer chart: add the ability to publish events to
panko
Ceilometer notification agent sends the events to panko via panko

View File

@ -1,7 +1,7 @@
From a0e8c7e3764b168eaaa82d17d965f62d34766573 Mon Sep 17 00:00:00 2001
From: Chris Friesen <chris.friesen@windriver.com>
Date: Wed, 28 Nov 2018 01:33:39 -0500
Subject: [PATCH 03/10] Remove stale Apache2 service pids when a POD starts.
Subject: [PATCH 03/11] Remove stale Apache2 service pids when a POD starts.
Stale Apache2 pids will prevent Apache2 from starting and will leave
the POD in a crashed state.

View File

@ -1,7 +1,7 @@
From 6a023c248b3cbd093b8f4480f4b2cca5a3c8600d Mon Sep 17 00:00:00 2001
From: Gerry Kopec <Gerry.Kopec@windriver.com>
Date: Thu, 10 Jan 2019 00:12:21 -0500
Subject: [PATCH 04/10] Fix ssh config in nova to support cold migrations
Subject: [PATCH 04/11] Fix ssh config in nova to support cold migrations
- Fix .ssh/config file mapping
- Move private key from nova-compute-ssh container to nova-compute

View File

@ -1,7 +1,7 @@
From 64b22037b53e6423c465367c26a6d7255768ae17 Mon Sep 17 00:00:00 2001
From: Gerry Kopec <Gerry.Kopec@windriver.com>
Date: Wed, 27 Mar 2019 00:35:57 -0400
Subject: [PATCH 05/10] Nova console/ip address search optionality
Subject: [PATCH 05/11] Nova console/ip address search optionality
Add options to nova to enable/disable the use of:
1. the vnc or spice server proxyclient address found by the console

View File

@ -1,7 +1,7 @@
From 4f6701c4cab07d9f54012e2a143173803f97ff3d Mon Sep 17 00:00:00 2001
From: Irina Mihai <irina.mihai@windriver.com>
Date: Tue, 26 Feb 2019 17:43:53 +0000
Subject: [PATCH 06/10] Nova chart: Support ephemeral pool creation
Subject: [PATCH 06/11] Nova chart: Support ephemeral pool creation
If libvirt images_type is rbd, then we need to have the
images_rbd_pool present. These changes add a new job

View File

@ -1,7 +1,7 @@
From 8fc7a67eb359d1dfe67b63bc2636386b76071891 Mon Sep 17 00:00:00 2001
From: Robert Church <robert.church@windriver.com>
Date: Fri, 22 Mar 2019 03:29:26 -0400
Subject: [PATCH 07/10] Horizon: Disable apache2 status_module
Subject: [PATCH 07/11] Horizon: Disable apache2 status_module
a2dismod is not present in the StarlingX httpd based images. Try
a2dismod first, then fail back to using sed to remove the module.

View File

@ -1,7 +1,7 @@
From 615b86e8f394f1648e5c2383364cd46230290182 Mon Sep 17 00:00:00 2001
From: Robert Church <robert.church@windriver.com>
Date: Fri, 22 Mar 2019 03:37:05 -0400
Subject: [PATCH 08/10] Neutron: Add support for disabling Readiness/Liveness
Subject: [PATCH 08/11] Neutron: Add support for disabling Readiness/Liveness
probes
With the introduction of Readiness/Liveness probes in

View File

@ -1,7 +1,7 @@
From af94c98eee44769a2c1e8f211029f8346a13ebc2 Mon Sep 17 00:00:00 2001
From: Robert Church <robert.church@windriver.com>
Date: Fri, 22 Mar 2019 03:42:08 -0400
Subject: [PATCH 09/10] Nova: Add support for disabling Readiness/Liveness
Subject: [PATCH 09/11] Nova: Add support for disabling Readiness/Liveness
probes
With the introduction of Readiness/Liveness probes in

View File

@ -1,7 +1,7 @@
From 8b52fcc187dcb2da5fd7453dbb564d24d475dd49 Mon Sep 17 00:00:00 2001
From: Mingyuan Qi <mingyuan.qi@intel.com>
Date: Thu, 11 Apr 2019 14:59:11 +0800
Subject: [PATCH 10/10] Ironic: Add pxe boot support for centos image
Subject: [PATCH 10/11] Ironic: Add pxe boot support for centos image
Current script does not consider centos distro as base image.
Different folder was checked to copy pxe files to tftpboot folder.

View File

@ -0,0 +1,82 @@
From baf5356a4fb61590a95f64a63c0dcabfebb3baaa Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ji=C5=99=C3=AD=20Suchomel?= <jiri.suchomel@suse.com>
Date: Tue, 9 Apr 2019 10:37:46 +0200
Subject: [PATCH 11/11] Use nova's ping method to find out if the service is
alive
Currently there is fake rpc call "pod_health_probe_method_ignore_errors"
that is passed to the service, just to find out if it is responding. Because
such method does not exist, it is needed to catch and handle the exception
that is inevitably thrown by the service.
While this is technically working correctly, the exceptions pollute the
log files and make it harder for user to see possible real errors.
This is how the error looks like:
ERROR oslo_messaging.rpc.server [-] Exception during message handling: oslo_messaging.rpc.dispatcher.UnsupportedVersion: Endpoint does not support RPC version 1.0. Attempted method: pod_health_probe_method_ignore_errors
ERROR oslo_messaging.rpc.server Traceback (most recent call last):
ERROR oslo_messaging.rpc.server File "/var/lib/openstack/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming
ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
ERROR oslo_messaging.rpc.server File "/var/lib/openstack/lib/python3.6/site-packages/oslo_messaging/rpc/dispatcher.py", line 276, in dispatch
ERROR oslo_messaging.rpc.server raise UnsupportedVersion(version, method=method)
ERROR oslo_messaging.rpc.server oslo_messaging.rpc.dispatcher.UnsupportedVersion: Endpoint does not support RPC version 1.0. Attempted method: pod_health_probe_method_ignore_errors
This situation is new since https://review.openstack.org/#/c/639711/
which (correctly) increased the default level of logging. Before 639711
error messages from oslo (both real and ones that could be ignored) were not
present in nova logs at all.
Fortunatelly, nova's BaseAPI class provides 'ping' method that is can
be used for this basic purpose by all nova components.
Change-Id: I0062e74bed399206becb8d9e00f9ec805da864a3
---
nova/templates/bin/_health-probe.py.tpl | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/nova/templates/bin/_health-probe.py.tpl b/nova/templates/bin/_health-probe.py.tpl
index 6434e45..4c1aa45 100644
--- a/nova/templates/bin/_health-probe.py.tpl
+++ b/nova/templates/bin/_health-probe.py.tpl
@@ -17,8 +17,8 @@
"""
Health probe script for OpenStack service that uses RPC/unix domain socket for
communication. Check's the RPC tcp socket status on the process and send
-message to service through rpc call method and expects a reply. It is expected
-to receive failure from the service's RPC server as the method does not exist.
+message to service through rpc call method and expects a reply.
+Use nova's ping method that is designed just for such simple purpose.
Script returns failure to Kubernetes only when
a. TCP socket for the RPC communication are not established.
@@ -28,7 +28,7 @@ Script returns failure to Kubernetes only when
sys.stderr.write() writes to pod's events on failures.
Usage example for Nova Compute:
-# python health-probe-rpc.py --config-file /etc/nova/nova.conf \
+# python health-probe.py --config-file /etc/nova/nova.conf \
# --service-queue-name compute
"""
@@ -50,12 +50,15 @@ def check_service_status(transport):
"""Verify service status. Return success if service consumes message"""
try:
target = oslo_messaging.Target(topic=cfg.CONF.service_queue_name,
- server=socket.gethostname())
+ server=socket.gethostname(),
+ namespace='baseapi',
+ version="1.1")
client = oslo_messaging.RPCClient(transport, target,
timeout=60,
retry=2)
client.call(context.RequestContext(),
- 'pod_health_probe_method_ignore_errors')
+ 'ping',
+ arg=None)
except oslo_messaging.exceptions.MessageDeliveryFailure:
# Log to pod events
sys.stderr.write("Health probe unable to reach message bus")
--
2.7.4