Remove ceph mon restart during upgrade

Currently we are restarting the monitor to update the ceph configs
during the upgrade-activate step.

The restart sometimes fails for an unknown reason: the monitor is
restarted correctly, but its pid file is not created, so SM assumes the
monitor is not running and triggers a swact. This behavior was verified
in STX 5.0 and is easily reproduced in an AIO-DX config: keep running
"/etc/init.d/ceph restart mon" until the following message appears:

Change-Id: Icc55ab64a4d2a697e08935b120410a05dad16676

--------------------------------------------------
controller-0:~$ sudo /etc/init.d/ceph restart mon
Password:
=== mon.controller ===
=== mon.controller ===
Stopping Ceph mon.controller on controller-0...kill  99034...done
=== mon.controller ===
Starting Ceph mon.controller on controller-0...
Failed to start transient scope unit: Unit ceph-mon.scope already exists.
failed: 'ulimit -n 32768; TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728  systemd-run --scope --unit=ceph-mon --slice=system-ceph /usr/bin/ceph-mon -i controller --pid-file /var/run/ceph/mon.controller.pid -c /etc/ceph/ceph.conf --cluster ceph '
--------------------------------------------------
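
The reproduction step above can be scripted. The following is only a
sketch: the helper name repro_restart_failure and the attempt limit are
hypothetical and not part of this change. It reruns a command until its
output contains a given pattern or the attempt budget runs out.

```shell
#!/bin/sh
# Hypothetical helper (not part of this commit): rerun a command until its
# output contains PATTERN, or give up after MAX_ATTEMPTS runs.
# Usage: repro_restart_failure MAX_ATTEMPTS PATTERN COMMAND [ARGS...]
repro_restart_failure() {
    max_attempts=$1
    pattern=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        # Capture stdout and stderr together so the pattern is found
        # regardless of which stream the init script writes to.
        output=$("$@" 2>&1)
        if printf '%s\n' "$output" | grep -q -- "$pattern"; then
            printf 'reproduced on attempt %s\n' "$attempt"
            printf '%s\n' "$output"
            return 0
        fi
        attempt=$((attempt + 1))
    done
    printf 'not reproduced after %s attempts\n' "$max_attempts"
    return 1
}
```

Example invocation on a controller (the limit of 50 is an arbitrary
assumption):
repro_restart_failure 50 'Unit ceph-mon.scope already exists' sudo /etc/init.d/ceph restart mon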

This problem was verified on AIO-SX (harder to reproduce) and AIO-DX.
For some reason it could not be reproduced in a 2+2+2 or 2+2
configuration.

This change allows the ceph config file to be modified without
restarting the monitors.

On AIO-SX, the warnings that appear can be disabled with ceph commands,
without restarting the monitor.

The mon_host we add to the config file is not used by the monitors,
since they depend on the monmap that they create at runtime. For more
information about why the monitors do not need to be restarted after
the ceph config file is updated, see
https://docs.ceph.com/en/latest/rados/configuration/mon-config-ref/#consistency

Story: 2009074
Task: 44073

Testing performed:
 1) Upgrade AIO-SX
 2) Upgrade AIO-DX

Signed-off-by: Vinicius Lopes da Silva <vinicius.lopesdasilva@windriver.com>
Change-Id: Ie108392f6121d860da67da2af353a20917f45bbd
Vinicius Lopes da Silva 2021-11-24 16:13:58 -03:00
parent e223b96ac4
commit 42a9abe13d
1 changed files with 5 additions and 5 deletions

@@ -939,7 +939,11 @@ class platform::ceph::upgrade::runtime
 'mon/mon warn on insecure global id reclaim allowed': value => false;
 'mon/auth allow insecure global id reclaim': value => true;
 'mon.controller-0/mon_addr': ensure => absent;
-} -> Exec['Restart Ceph Monitor']
+} -> exec { 'Removing ceph warnings':
+command => 'ceph config set mon mon_warn_on_insecure_global_id_reclaim false;\
+ceph config set mon mon_warn_on_insecure_global_id_reclaim_allowed false;\
+ceph config set mon auth_allow_insecure_global_id_reclaim false'
+}
 } else {
 # 2 node configuration, we have a floating monitor
 $mon_host = $floating_mon_addr
@@ -955,10 +959,6 @@ class platform::ceph::upgrade::runtime
 ceph_config {
 'global/mon_host': value => $mon_host;
 } -> Exec['Restart Ceph Monitor']
-exec { 'Restart Ceph Monitor' :
-command => '/etc/init.d/ceph restart mon',
-}
 }
 }