Replace OSDs on an AIO-DX System (pick)

Signed-off-by: Juanita-Balaraj <juanita.balaraj@windriver.com>
Change-Id: Ic12380d71ac0779c52b1280fbcce95710f6a2214

@@ -119,9 +119,10 @@ Configure Ceph OSDs on a Host

    add-ssd-backed-journals-using-horizon
    add-ssd-backed-journals-using-the-cli
    add-a-storage-tier-using-the-cli
-   replace-osds-and-journal-disks
    provision-storage-on-a-controller-or-storage-host-using-horizon
    provision-storage-on-a-storage-host-using-the-cli
+   replace-osds-and-journal-disks
+   replace-osds-on-an-aio-dx-system-319b0bc2f7e6

 -------------------------
 Persistent Volume Support

@@ -13,8 +13,18 @@ You can replace failed storage devices on storage nodes.

 For best results, ensure the replacement disk is the same size as others in
 the same peer group. Do not substitute a smaller disk than the original.

-The replacement disk is automatically formatted and updated with data when the
-storage host is unlocked. For more information, see |node-doc|: :ref:`Change
-Hardware Components for a Storage Host
-<changing-hardware-components-for-a-storage-host>`.
+.. note::
+    Due to a limitation in **udev**, the device path of a disk connected through
+    a SAS controller changes when the disk is replaced. Therefore, in the
+    general procedure below, you must lock, delete, and re-install the node.
+    However, for an |AIO-DX| system, use the following alternative procedure to
+    replace |OSDs| without reinstalling the host:
+    :ref:`Replace OSDs on an AIO-DX System <replace-osds-on-an-aio-dx-system-319b0bc2f7e6>`.
+
+.. rubric:: |proc|
+
+Follow the procedure located at |node-doc|: :ref:`Change
+Hardware Components for a Storage Host <changing-hardware-components-for-a-storage-host>`.
+
+The replacement disk is automatically formatted and updated with data when the
+storage host is unlocked.

@@ -0,0 +1,117 @@

.. _replace-osds-on-an-aio-dx-system-319b0bc2f7e6:

================================
Replace OSDs on an AIO-DX System
================================

On systems that use a Ceph backend for persistent storage, you can replace
storage disks or swap an |AIO-DX| node while the system is running, even if
the storage resources are in active use.

.. note::
    All storage alarms must be cleared before starting this procedure.
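
    For example, you can list the active alarms from the active controller and
    confirm that none of them are storage related:

    .. code-block:: none

        ~(keystone_admin)$ fm alarm-list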

.. rubric:: |context|

You can replace |OSDs| in an |AIO-DX| system to increase capacity, or replace
faulty disks on the host without reinstalling the host.

.. rubric:: |proc|

#. Ensure that the controller with the |OSD| to be replaced is the standby
   controller.

   For example, if the disk to be replaced is on controller-1 and controller-1
   is currently the active controller, use the following commands to swact
   activity to controller-0:

   .. code-block:: none

      ~(keystone_admin)$ system host-show controller-1 | fgrep capabilities
      ~(keystone_admin)$ system host-swact controller-1

   After the swact, reconnect over SSH to the <oam-floating-ip> to reach the
   newly active controller-0.

#. Determine the **osdid** of the disk that is to be replaced.

   .. code-block:: none

      ~(keystone_admin)$ system host-stor-list controller-1
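
   If you also need to identify the physical device that backs the |OSD|, one
   way is to cross-reference the disk inventory for the host (the exact columns
   shown vary by release):

   .. code-block:: none

      ~(keystone_admin)$ system host-disk-list controller-1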

#. Lock the standby controller-1 to make the changes.

   .. code-block:: none

      ~(keystone_admin)$ system host-lock controller-1
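
   Before continuing, you can confirm that the host reports an administrative
   state of locked:

   .. code-block:: none

      ~(keystone_admin)$ system host-show controller-1 | fgrep administrative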

#. Run the :command:`ceph osd destroy osd.<osdid> --yes-i-really-mean-it`
   command, where <osdid> is the value identified earlier.

   .. code-block:: none

      ~(keystone_admin)$ ceph osd destroy osd.<osdid> --yes-i-really-mean-it
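
   You can confirm that the |OSD| is now reported as destroyed before powering
   the host down:

   .. code-block:: none

      ~(keystone_admin)$ ceph osd tree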

#. Power down controller-1.

#. Replace the storage disk.

#. Power on controller-1.
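
   If board management (BMC) is provisioned for the host, the power off and
   power on in the steps above can also be performed from the active
   controller; otherwise, power the host down and up physically. For example,
   assuming BMC connectivity is configured:

   .. code-block:: none

      ~(keystone_admin)$ system host-power-off controller-1
      ~(keystone_admin)$ system host-power-on controller-1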

#. Unlock controller-1.

   .. code-block:: none

      ~(keystone_admin)$ system host-unlock controller-1

#. Wait for the recovery process in the Ceph cluster to complete.

   .. code-block:: none

      ~(keystone_admin)$ ceph -s

        cluster:
          id:     50ce952f-bd16-4864-9487-6c7e959be95e
          health: HEALTH_WARN
                  Degraded data redundancy: 13/50 objects degraded (26.000%), 10 pgs degraded

        services:
          mon: 1 daemons, quorum controller (age 68m)
          mgr: controller-0(active, since 66m)
          mds: kube-cephfs:1 {0=controller-0=up:active} 1 up:standby
          osd: 2 osds: 2 up (since 9s), 2 in (since 9s)

        data:
          pools:   3 pools, 192 pgs
          objects: 25 objects, 300 MiB
          usage:   655 MiB used, 15 GiB / 16 GiB avail
          pgs:     13/50 objects degraded (26.000%)
                   182 active+clean
                   8   active+recovery_wait+degraded
                   2   active+recovering+degraded

        io:
          recovery: 24 B/s, 1 keys/s, 1 objects/s
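
   To follow the recovery continuously instead of re-running
   :command:`ceph -s`, you can also watch cluster status updates as they
   occur:

   .. code-block:: none

      ~(keystone_admin)$ ceph -w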

#. Ensure that the Ceph cluster is healthy.

   .. code-block:: none

      ~(keystone_admin)$ ceph -s

        cluster:
          id:     50ce952f-bd16-4864-9487-6c7e959be95e
          health: HEALTH_OK

        services:
          mon: 1 daemons, quorum controller (age 68m)
          mgr: controller-0(active, since 66m), standbys: controller-1
          mds: kube-cephfs:1 {0=controller-0=up:active} 1 up:standby
          osd: 2 osds: 2 up (since 36s), 2 in (since 36s)

        data:
          pools:   3 pools, 192 pgs
          objects: 25 objects, 300 MiB
          usage:   815 MiB used, 15 GiB / 16 GiB avail
          pgs:     192 active+clean
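
Optionally, confirm that the replacement disk has been picked up for the same
|OSD| by listing the storage functions on the host again and reviewing the
|OSD| tree:

.. code-block:: none

   ~(keystone_admin)$ system host-stor-list controller-1
   ~(keystone_admin)$ ceph osd tree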