
Update documentation on Ceph partitioning

Make the docs up to date:
* The previous version of the documentation assumed that
  the partitioning scheme is different for SSDs and HDDs;
  this is no longer the case.
* Ceph charts now have automatic partitioning for both
  OSDs and journals.

Change-Id: I74bd625522469e2860ada995f4e6a81a566107fa
tags/v19.03.06
Evgeny L 3 months ago
commit ba1dd3681a
1 changed file with 28 additions and 172 deletions:

doc/source/authoring_and_deployment.rst (+28, -172)

@@ -240,26 +240,27 @@ the order in which you should build your site files is as follows:
 Control Plane Ceph Cluster Notes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-Environment Ceph parameters for the control plane are located in:
+Configuration variables for the Ceph control plane are located in:
 
-``site/${NEW_SITE}/software/charts/ucp/ceph/ceph.yaml``
+- ``site/${NEW_SITE}/software/charts/ucp/ceph/ceph-osd.yaml``
+- ``site/${NEW_SITE}/software/charts/ucp/ceph/ceph-client.yaml``
 
 Setting highlights:
 
 -  data/values/conf/storage/osd[\*]/data/location: The block device that
   will be formatted by the Ceph chart and used as a Ceph OSD disk
--  data/values/conf/storage/osd[\*]/journal/location: The directory
+-  data/values/conf/storage/osd[\*]/journal/location: The block device
   backing the ceph journal used by this OSD. Refer to the journal
   paradigm below.
 -  data/values/conf/pool/target/osd: Number of OSD disks on each node
 
 Assumptions:
 
-1. Ceph OSD disks are not configured for any type of RAID (i.e., they
-   are configured as JBOD if connected through a RAID controller). (If
-   RAID controller does not support JBOD, put each disk in its own
-   RAID-0 and enable RAID cache and write-back cache if the RAID
-   controller supports it.)
+1. Ceph OSD disks are not configured for any type of RAID; they
+   are configured as JBOD when connected through a RAID controller.
+   If the RAID controller does not support JBOD, put each disk in
+   its own RAID-0 and enable the RAID cache and write-back cache if
+   the RAID controller supports them.
 2. Ceph disk mapping, disk layout, journal and OSD setup is the same
   across Ceph nodes, with only their role differing. Out of the 4
   control plane nodes, we expect to have 3 actively participating in
@@ -268,16 +269,12 @@ Assumptions:
   (cp\_*-secondary) than the other three (cp\_*-primary).
 3. If doing a fresh install, disk are unlabeled or not labeled from a
   previous Ceph install, so that Ceph chart will not fail disk
-   initialization
+   initialization.
 
-This document covers two Ceph journal deployment paradigms:
-
-1. Servers with SSD/HDD mix (disregarding operating system disks).
-2. Servers with no SSDs (disregarding operating system disks). In other
-   words, exclusively spinning disk HDDs available for Ceph.
+It's highly recommended to use SSD devices for Ceph journal partitions.
 
 If you have an operating system available on the target hardware, you
-can determine HDD and SSD layout with:
+can determine HDD and SSD devices with:
 
 ::
 
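The literal block with the actual command falls outside this hunk. As a rough sketch (assuming a standard Linux sysfs layout and the ``lsblk`` utility from util-linux; this is not the file's own literal block), the rotational flag can be listed per disk with:

::

    # ROTA = 1 indicates a spinning HDD, 0 indicates an SSD/NVMe device
    lsblk -d -o NAME,ROTA,SIZE,MODEL

    # Equivalent per-disk sysfs check
    for dev in /sys/block/sd*; do
        echo "$(basename "$dev"): $(cat "$dev/queue/rotational")"
    done

As the surrounding text notes, some SSDs still report ``1``, so cross-check the result against the server specifications.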
@@ -288,28 +285,23 @@ and where a value of ``0`` indicates non-spinning disk (i.e. SSD). (Note
 - Some SSDs still report a value of ``1``, so it is best to go by your
 server specifications).
 
-In case #1, the SSDs will be used for journals and the HDDs for OSDs.
-
 For OSDs, pass in the whole block device (e.g., ``/dev/sdd``), and the
 Ceph chart will take care of disk partitioning, formatting, mounting,
 etc.
 
-For journals, divide the number of journal disks as evenly as possible
-between the OSD disks. We will also use the whole block device, however
-we cannot pass that block device to the Ceph chart like we can for the
-OSD disks.
-
-Instead, the journal devices must be already partitioned, formatted, and
-mounted prior to Ceph chart execution. This should be done by MaaS as
-part of the Drydock host-profile being used for control plane nodes.
+For Ceph journals, you can pass in a specific partition (e.g., ``/dev/sdb1``).
+It is not required to pre-create these partitions; the Ceph chart
+will create journal partitions automatically if they don't exist.
+By default the size of every journal partition is 10G, so make sure
+there is enough space available to allocate all journal partitions.
 
-Consider the follow example where:
+Consider the following example where:
 
 -  /dev/sda is an operating system RAID-1 device (SSDs for OS root)
--  /dev/sdb is an operating system RAID-1 device (SSDs for ceph journal)
--  /dev/sd[cdef] are HDDs
+-  /dev/sd[bc] are SSDs for Ceph journals
+-  /dev/sd[efgh] are HDDs for OSDs
 
-Then, the data section of this file would look like:
+The data section of this file would look like:
 
 ::
 
@@ -318,98 +310,31 @@ Then, the data section of this file would look like:
        conf:
          storage:
            osd:
-              - data:
-                  type: block-logical
-                  location: /dev/sdd
-                journal:
-                  type: directory
-                  location: /var/lib/openstack-helm/ceph/journal/journal-sdd
              - data:
                  type: block-logical
                  location: /dev/sde
                journal:
-                  type: directory
-                  location: /var/lib/openstack-helm/ceph/journal/journal-sde
-              - data:
                  type: block-logical
-                  location: /dev/sdf
-                journal:
-                  type: directory
-                  location: /var/lib/openstack-helm/ceph/journal/journal-sdf
+                  location: /dev/sdb1
              - data:
                  type: block-logical
-                  location: /dev/sdg
+                  location: /dev/sdf
                journal:
-                  type: directory
-                  location: /var/lib/openstack-helm/ceph/journal/journal-sdg
-          pool:
-            target:
-              osd: 4
-
-where the following mount is setup by MaaS via Drydock host profile for
-the control-plane nodes:
-
-::
-
-    /dev/sdb is mounted to /var/lib/openstack-helm/ceph/journal
-
-In case #2, Ceph best practice is to allocate journal space on all OSD
-disks. The Ceph chart assumes this partitioning has been done
-beforehand. Ensure that your control plane host profile is partitioning
-each disk between the Ceph OSD and Ceph journal, and that it is mounting
-the journal partitions. (Drydock will drive these disk layouts via MaaS
-provisioning). Note the mountpoints for the journals and the partition
-mappings. Consider the following example where:
-
--  /dev/sda is the operating system RAID-1 device
--  /dev/sd[bcde] are HDDs
-
-Then, the data section of this file will look similar to the following:
-
-::
-
-    data:
-      values:
-        conf:
-          storage:
-            osd:
-              - data:
                  type: block-logical
                  location: /dev/sdb2
-                journal:
-                  type: directory
-                  location: /var/lib/openstack-helm/ceph/journal0/journal-sdb
              - data:
                  type: block-logical
-                  location: /dev/sdc2
+                  location: /dev/sdg
                journal:
-                  type: directory
-                  location: /var/lib/openstack-helm/ceph/journal1/journal-sdc
-              - data:
                  type: block-logical
-                  location: /dev/sdd2
-                journal:
-                  type: directory
-                  location: /var/lib/openstack-helm/ceph/journal2/journal-sdd
+                  location: /dev/sdc1
              - data:
                  type: block-logical
-                  location: /dev/sde2
+                  location: /dev/sdh
                journal:
-                  type: directory
-                  location: /var/lib/openstack-helm/ceph/journal3/journal-sde
-          pool:
-            target:
-              osd: 4
-
-where the following mounts are setup by MaaS via Drydock host profile
-for the control-plane nodes:
-
-::
+                  type: block-logical
+                  location: /dev/sdc2
 
-    /dev/sdb1 is mounted to /var/lib/openstack-helm/ceph/journal0
-    /dev/sdc1 is mounted to /var/lib/openstack-helm/ceph/journal1
-    /dev/sdd1 is mounted to /var/lib/openstack-helm/ceph/journal2
-    /dev/sde1 is mounted to /var/lib/openstack-helm/ceph/journal3
 
 Update Passphrases
 ~~~~~~~~~~~~~~~~~~~~
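With the ``pool`` section removed from this file, the OSD count highlighted earlier (``data/values/conf/pool/target/osd``) presumably moves to ``ceph-client.yaml``. A minimal sketch of that override, assuming the same ``data/values/conf`` layout as above (the exact structure of ``ceph-client.yaml`` is not shown in this diff):

::

    data:
      values:
        conf:
          pool:
            target:
              osd: 4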
@@ -685,75 +610,6 @@ permission denied errors from apparmor when the MaaS container tries to
 leverage libc6 for /bin/sh when MaaS container ntpd is forcefully
 disabled.
 
-Setup Ceph Journals
-~~~~~~~~~~~~~~~~~~~
-
-Until genesis node reprovisioning is implemented, it is necessary to
-manually perform host-level disk partitioning and mounting on the
-genesis node, for activites that would otherwise have been addressed by
-a bare metal node provision via Drydock host profile data by MaaS.
-
-Assuming your genesis HW matches the HW used in your control plane host
-profile, you should manually apply to the genesis node the same Ceph
-partitioning (OSDs & journals) and formatting + mounting (journals only)
-as defined in the control plane host profile. See
-``airship-treasuremap/global/profiles/host/base_control_plane.yaml``.
-
-For example, if we have a journal SSDs ``/dev/sdb`` on the genesis node,
-then use the ``cfdisk`` tool to format it:
-
-::
-
-    sudo cfdisk /dev/sdb
-
-Then:
-
-1. Select ``gpt`` label for the disk
-2. Select ``New`` to create a new partition
-3. If scenario #1 applies in
-   site/$NEW\_SITE/software/charts/ucp/ceph/ceph.yaml\_, then accept
-   default partition size (entire disk). If scenario #2 applies, then
-   only allocate as much space as defined in the journal disk partitions
-   mounted in the control plane host profile.
-4. Select ``Write`` option to commit changes, then ``Quit``
-5. If scenario #2 applies, create a second partition that takes up all
-   of the remaining disk space. This will be used as the OSD partition
-   (``/dev/sdb2``).
-
-Install package to format disks with XFS:
-
-::
-
-    sudo apt -y install xfsprogs
-
-Then, construct an XFS filesystem on the journal partition with XFS:
-
-::
-
-    sudo mkfs.xfs /dev/sdb1
-
-Create a directory as mount point for ``/dev/sdb1`` to match those
-defined in the same host profile ceph journals:
-
-::
-
-    sudo mkdir -p /var/lib/ceph/cp
-
-Use the ``blkid`` command to get the UUID for ``/dev/sdb1``, then
-populate ``/etc/fstab`` accordingly. Ex:
-
-::
-
-    sudo sh -c 'echo "UUID=01234567-ffff-aaaa-bbbb-abcdef012345 /var/lib/ceph/cp xfs defaults 0 0" >> /etc/fstab'
-
-Repeat all preceeding steps in this section for each journal device in
-the Ceph cluster. After this is completed for all journals, mount the
-partitions:
-
-::
-
-    sudo mount -a
-
 Promenade bootstrap
 ~~~~~~~~~~~~~~~~~~~
 
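Because the journal partitions are now created by the Ceph chart itself, a quick post-deployment check can stand in for the removed manual setup. A sketch, assuming ``lsblk`` (util-linux) and ``sgdisk`` (gdisk package) are available on the node and ``/dev/sd[bc]`` are the journal SSDs from the example above:

::

    # Partitions the ceph-osd chart created on the journal SSDs
    lsblk /dev/sdb /dev/sdc

    # GPT partition tables, including the ~10G journal partition sizes
    sudo sgdisk -p /dev/sdb
    sudo sgdisk -p /dev/sdc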
