kdump config: remove Intel Ethernet drivers from ramdisk

Problem:
On a kernel crash, such as one triggered by the watchdog timer
firing, kexec boots the crash recovery kernel to capture a vmcore
so that the issue can be debugged. This normally succeeds unless
the platform has ice network hardware. The crash recovery kernel
has only a small amount of memory set aside for it, and the ice
driver allocates enough memory to exhaust it. The crash recovery
kernel then fails to start, leaving the platform completely hung.
Breaking out of the hang requires a manual hardware reset or
power cycle.

Solution:
Change kdump.conf to leave the ice driver module out of the
initramfs that is used by the crash recovery kernel. In fact,
leave all of the Intel Ethernet drivers out, since they are not
needed and increase the risk of memory exhaustion. Whenever
kdump.conf changes, the kdump service is restarted to regenerate
the initramfs.
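
The edit to kdump.conf is meant to be idempotent: the line is
replaced or appended once, and later runs leave the file alone, so
the kdump service is only notified when something actually changed.
A minimal Python sketch of that replace-or-append behaviour follows.
It assumes the semantics of a line-matching resource in general terms;
ensure_line and its changed flag are hypothetical names for
illustration, not part of Puppet or of this change.

```python
import re

# Values taken from the Puppet resource in this change.
OMIT_LINE = 'dracut_args --omit-drivers "ice e1000e i40e ixgbe ixgbevf iavf"'
MATCH = r'^dracut_args .*--omit-drivers'


def ensure_line(text, line=OMIT_LINE, match=MATCH):
    """Return (new_text, changed) with `line` present exactly once.

    Hypothetical helper: replaces the single line matching `match`,
    or appends `line` when nothing matches and it is not already there.
    """
    pattern = re.compile(match)
    lines = text.splitlines()
    hits = [i for i, existing in enumerate(lines) if pattern.search(existing)]
    if len(hits) > 1:
        # Ambiguous input: refuse rather than guess which line to replace.
        raise ValueError("more than one line matches")
    if hits:
        changed = lines[hits[0]] != line
        lines[hits[0]] = line
    elif line in lines:
        changed = False
    else:
        lines.append(line)
        changed = True
    return "\n".join(lines) + "\n", changed
```

Running ensure_line twice on the same text should report a change
only the first time, which corresponds to the "no unexpected kdump
service restarts" expectation in the verification steps.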

Verification:
Install, check the kdump.conf file, and unpack the initramfs file
to make sure that those modules are gone. Check the controller,
worker, and storage node types. Reboot a node and make sure things
behave as expected, i.e. no extra kdump.conf mangling and no
unexpected kdump service restarts.
Also crash a node with Intel Ethernet hardware and make sure it
comes back up with a vmcore left in /var/log/crash.
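
The "modules are gone" check can be scripted instead of done by eye.
The sketch below is an illustration only: leftover_drivers is a
hypothetical helper that scans a file listing of the unpacked
initramfs (for example, text produced by dracut's lsinitrd tool) for
kernel module files named after the omitted drivers. The listing
format and the optional compression suffix are assumptions.

```python
import re

# Driver names taken from the --omit-drivers list in this change.
OMITTED = ("ice", "e1000e", "i40e", "ixgbe", "ixgbevf", "iavf")


def leftover_drivers(listing, names=OMITTED):
    """Return the sorted names from `names` still present as .ko files.

    `listing` is free-form text; any whitespace-separated token ending
    in <name>.ko or <name>.ko.<suffix> is treated as a module path.
    """
    found = set()
    for entry in listing.split():
        m = re.search(r'/([^/]+?)\.ko(\.\w+)?$', entry)
        if m and m.group(1) in names:
            found.add(m.group(1))
    return sorted(found)
```

An empty result for the kdump initramfs on each node type would match
the expectation stated above; any name returned is a driver that was
not omitted.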

Change-Id: I9112f722cee8e199d94393bca887d3bb9bb89b39
Closes-Bug: 1923879
Signed-off-by: Jim Somerville <Jim.Somerville@windriver.com>
Jim Somerville
2021-04-14 17:13:59 -04:00
parent 2a80652598
commit f46c154188

@@ -247,6 +247,16 @@ class platform::config::tpm {
}
class platform::config::kdump {
  file_line { '/etc/kdump.conf dracut_args':
    path => '/etc/kdump.conf',
    line => 'dracut_args --omit-drivers "ice e1000e i40e ixgbe ixgbevf iavf"',
    match => '^dracut_args .*--omit-drivers',
  }
  ~> service { 'kdump': }
}
class platform::config::certs::ssl_ca
  inherits ::platform::config::certs::params {
@@ -353,6 +363,7 @@ class platform::config::pre {
  include ::platform::config::hosts
  include ::platform::config::file
  include ::platform::config::tpm
  include ::platform::config::kdump
  include ::platform::config::certs::ssl_ca
  if ($::platform::params::distributed_cloud_role =='systemcontroller' and
      $::personality == 'controller') {