Updates on site authoring and deployment guide

- Re-arranged the order of a few sections
- Use ``tools/airship`` instead of cloning separate repos for Pegleg,
  Promenade, and Shipyard
- Additional info on Airship VIPs
- Multiple grammar fixes after reviews

Change-Id: Icb18ad77844038d61046670cb327d27cfcabded3
Commit: 59a4dc2dd6 (parent: e14e56a44e)
for a standard Airship deployment. For the most part, the site
authoring guidance lives within the ``seaworthy`` reference site in the
form of YAML comments.

Support
-------

the component:

group <https://storyboard.openstack.org/#!/project_group/85>`__:

- `Airship Armada <https://storyboard.openstack.org/#!/project/1002>`__
- `Airship Berth <https://storyboard.openstack.org/#!/project/1003>`__
- `Airship
  Deckhand <https://storyboard.openstack.org/#!/project/1004>`__
- `Airship
  Promenade <https://storyboard.openstack.org/#!/project/1009>`__
- `Airship
  Shipyard <https://storyboard.openstack.org/#!/project/1010>`__
- `Airship in a
  Bottle <https://storyboard.openstack.org/#!/project/1006>`__
- `Airship Treasuremap
  <https://storyboard.openstack.org/#!/project/airship/treasuremap>`__

Terminology
-----------

**Cloud**: A platform that provides a standard set of interfaces for
`IaaS <https://en.wikipedia.org/wiki/Infrastructure_as_a_service>`__
consumers.

**OSH**: (`OpenStack Helm <https://docs.openstack.org/openstack-helm/latest/>`__) is a
collection of Helm charts used to deploy OpenStack on Kubernetes.

**Helm**: (`Helm <https://helm.sh/>`__) is a package manager for Kubernetes.
Helm charts help you define, install, and upgrade Kubernetes applications.

**Undercloud/Overcloud**: Terms used to distinguish which cloud is
deployed on top of the other. In Airship sites, OpenStack (overcloud)
is deployed on top of Kubernetes (undercloud).

**Airship**: A specific implementation of OpenStack Helm charts on
Kubernetes. This deployment is the primary focus of this document.

**Control Plane**: From the point of view of the cloud service provider,
the control plane refers to the set of resources (hardware, network,
storage, etc.) configured to provide cloud services for customers.

**Data Plane**: From the point of view of the cloud service provider,
the data plane is the set of resources (hardware, network, storage,
etc.) configured to run consumer workloads. When used in this document,
"data plane" refers to the data plane of the overcloud (OSH).

**Host Profile**: A host profile is a standard way of configuring a bare
metal host. It encompasses items such as the number of bonds, bond slaves,
physical storage mapping and partitioning, and kernel parameters.

Versioning
----------

Airship reference manifests are delivered monthly as release tags in the
`Treasuremap <https://github.com/airshipit/treasuremap/releases>`__ repository.

The releases are verified by the `Seaworthy
<https://airship-treasuremap.readthedocs.io/en/latest/seaworthy.html>`__,
`Airsloop
<https://airship-treasuremap.readthedocs.io/en/latest/airsloop.html>`__,
and `Airship-in-a-Bottle
<https://github.com/airshipit/treasuremap/blob/master/tools/deployment/aiab/README.rst>`__
pipelines before delivery, and are recommended for deployments instead of
using the master branch directly.

Component Overview
------------------

.. image:: diagrams/component_list.png


Node Overview
-------------

This document refers to several types of nodes, which vary in their
purpose, and to some degree in their orchestration / setup:

- **Build node**: This refers to the environment where configuration
  documents are built for your environment (e.g., your laptop)
- **Genesis node**: The "genesis" or "seed node" refers to a node used
  to get a new deployment off the ground, and is the first node built
  in a new deployment environment
- **Control / Master nodes**: The nodes that make up the control
  plane. (Note that the genesis node will be one of the controller
  nodes.)
- **Compute / Worker nodes**: The nodes that make up the data
  plane

Hardware Preparation
--------------------

The Seaworthy site reference shows a production-worthy deployment that includes
multiple disks, as well as redundant/bonded network configuration.

Airship hardware requirements are flexible, and the system can be deployed
with very minimal requirements if needed (e.g., single disk, single network).

For simplified non-bonded and single-disk examples, see
`Airsloop <https://airship-treasuremap.readthedocs.io/en/latest/airsloop.html>`__.

BIOS and IPMI
~~~~~~~~~~~~~

latest firmware version for your hardware. Otherwise, it is recommended to
perform an iLO/iDRAC reset, as IPMI bugs with long-running firmware are not
uncommon.
4. Set PXE as the first boot device and ensure the correct NIC is selected for PXE.

Disk
~~~~

1. For servers that are in the control plane (including genesis):

   - Two-disk RAID-1: Operating System
   - Two disks JBOD: Ceph Journal/Meta for control plane
   - Remaining disks JBOD: Ceph OSD for control plane

2. For servers that are in the tenant data plane (compute nodes):

   - Two-disk RAID-1: Operating System
   - Two disks JBOD: Ceph Journal/Meta for tenant-ceph
   - Two disks JBOD: Ceph OSD for tenant-ceph
   - Remaining disks configured according to the host profile target
     for each given server (e.g., RAID-10 for OpenStack ephemeral).

Network
~~~~~~~

1. You have a dedicated PXE interface on untagged/native VLAN,
   1x1G interface (eno1)
2. You have VLAN segmented networks,
   2x10G bonded interfaces (enp67s0f0 and enp68s0f1)

   - Management network (routed/OAM)
   - Calico network (Kubernetes control channel)
   - Storage network
   - Overlay network
   - Public network

See detailed network configuration in the
``site/${NEW_SITE}/networks/physical/networks.yaml`` configuration file.
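
For orientation only, a single routed network in that file takes roughly the
following Drydock ``Network`` shape. The name, VLAN, and addresses below are a
hypothetical sketch, not values from the reference site; consult the
``seaworthy`` ``networks.yaml`` for the authoritative schema:

.. code-block:: yaml

    ---
    # Hypothetical example of one network definition (values are placeholders)
    schema: 'drydock/Network/v1'
    metadata:
      schema: 'metadata/Document/v1'
      name: oam                # management (routed/OAM) network
      layeringDefinition:
        abstract: false
        layer: site
      storagePolicy: cleartext
    data:
      vlan: '41'
      cidr: 10.0.41.0/24
      routes:
        - subnet: '0.0.0.0/0'  # default route
          gateway: 10.0.41.1
          metric: 100
    ...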

Hardware sizing and minimum requirements
----------------------------------------

+-----------------+----------+----------+----------+
| Node            | Disk     | Memory   | CPU      |
+=================+==========+==========+==========+
| Build (laptop)  | 10 GB    | 4 GB     | 1        |
+-----------------+----------+----------+----------+
| Genesis/Control | 500 GB   | 64 GB    | 24       |
+-----------------+----------+----------+----------+
| Compute         | N/A*     | N/A*     | N/A*     |
+-----------------+----------+----------+----------+

\* Workload driven (determined by host profile)

See detailed hardware configuration in the
``site/${NEW_SITE}/networks/profiles`` folder.

Establishing build node environment
-----------------------------------

1. On the machine you wish to use to generate deployment files, install the
   required tooling:

   .. code-block:: bash

       sudo apt -y install docker.io git

2. Clone the ``treasuremap`` git repo as follows:

   .. code-block:: bash

       git clone https://opendev.org/airship/treasuremap.git
       cd treasuremap && git checkout <release-tag>

Building site documents
-----------------------

This section goes over how to put together site documents according to
your specific environment and generate the initial Promenade bundle
needed to start the site deployment.

Preparing deployment documents
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In its current form, Pegleg provides an organized structure for YAML
elements that separates common site elements (i.e., the ``global``
folder) from unique site elements (i.e., the ``site`` folder).

To gain a full understanding of the Pegleg structure, it is highly
recommended to read the `Pegleg documentation
<https://airship-pegleg.readthedocs.io/>`__ on this topic.

The ``seaworthy`` site may be used as a reference site. It is the
principal pipeline for integration and continuous deployment testing of Airship.

Change directory to the ``site`` folder and copy the
``seaworthy`` site as follows:

.. code-block:: bash

    NEW_SITE=mySite # replace with the name of your site
    cd treasuremap/site
    cp -r seaworthy $NEW_SITE

Remove the ``seaworthy``-specific certificates:

.. code-block:: bash

    rm -f site/${NEW_SITE}/secrets/certificates/certificates.yaml

You will then need to manually make changes to these files. These site
manifests are heavily commented to explain parameters and, more importantly,
to identify all of the parameters that need to change when authoring a new
site.

The areas that must be updated for a new site are flagged with the
label ``NEWSITE-CHANGEME`` in YAML comments. Search for all instances
of ``NEWSITE-CHANGEME`` in your new site definition. Then follow the
instructions that accompany the tag in order to make all needed changes
to author your new Airship site.
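
To enumerate the remaining work, the flagged lines can be listed with a simple
recursive search, for example:

```shell
# List every NEWSITE-CHANGEME marker left in the new site definition,
# with file name and line number for each occurrence
grep -rn 'NEWSITE-CHANGEME' "site/${NEW_SITE}/"
```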

the order in which you should build your site files is as follows:

Register DNS names
~~~~~~~~~~~~~~~~~~

Airship has two virtual IPs.

See the ``data.vip`` section of the
``site/${NEW_SITE}/networks/common-addresses.yaml`` configuration file.
Both are implemented via the Kubernetes ingress controller and require FQDNs/DNS.
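
For illustration only, the ``data.vip`` section has a shape like the following;
the addresses are placeholders, not real site values:

.. code-block:: yaml

    # Hypothetical sketch of the VIP section in common-addresses.yaml;
    # substitute the routable addresses allocated for your site.
    data:
      vip:
        ingress_vip: '10.24.49.200/32'  # VIP serving the ingress FQDNs
        maas_vip: '10.24.49.201/32'     # VIP serving the MaaS/Drydock FQDNs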

Register the following list of DNS names:

+---+---------------------------+-------------+
| A | iam-sw.DOMAIN             | ingress-vip |
| A | shipyard-sw.DOMAIN        | ingress-vip |
+---+---------------------------+-------------+
| A | cloudformation-sw.DOMAIN  | ingress-vip |
| A | compute-sw.DOMAIN         | ingress-vip |
| A | dashboard-sw.DOMAIN       | ingress-vip |
| A | grafana-sw.DOMAIN         | ingress-vip |
+---+---------------------------+-------------+
| A | identity-sw.DOMAIN        | ingress-vip |
| A | image-sw.DOMAIN           | ingress-vip |
| A | kibana-sw.DOMAIN          | ingress-vip |
| A | nagios-sw.DOMAIN          | ingress-vip |
| A | network-sw.DOMAIN         | ingress-vip |
| A | nova-novncproxy-sw.DOMAIN | ingress-vip |
| A | object-store-sw.DOMAIN    | ingress-vip |
| A | orchestration-sw.DOMAIN   | ingress-vip |
| A | placement-sw.DOMAIN       | ingress-vip |
| A | volume-sw.DOMAIN          | ingress-vip |
+---+---------------------------+-------------+
| A | maas-sw.DOMAIN            | maas-vip    |
| A | drydock-sw.DOMAIN         | maas-vip    |
+---+---------------------------+-------------+

Here ``DOMAIN`` is the name of the ingress domain; you can find it in the
``data.dns.ingress_domain`` section of the
``site/${NEW_SITE}/secrets/certificates/ingress.yaml`` configuration file.

Run the following command to get an up-to-date list of required DNS names:

.. code-block:: bash

    grep -E 'host: .+DOMAIN' site/${NEW_SITE}/software/config/endpoints.yaml | \
        sort -u | awk '{print $2}'
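
After registration, each name can be spot-checked from the build node. The loop
below is an illustrative sketch (the FQDNs are placeholders) that relies only on
the standard ``getent`` resolver interface:

```shell
# Report any required FQDN that does not resolve via the configured DNS
for name in iam-sw.example.com maas-sw.example.com; do
    getent hosts "$name" > /dev/null || echo "missing DNS entry: $name"
done
```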

Update Secrets
~~~~~~~~~~~~~~

Replace passphrases under ``site/${NEW_SITE}/secrets/passphrases/``
with randomly generated ones:

- Passphrase generation: ``openssl rand -hex 10``
- UUID generation: ``uuidgen`` (e.g., for the Ceph filesystem ID)
- Update ``secrets/passphrases/ipmi_admin_password.yaml`` with the IPMI password
- Update ``secrets/passphrases/ubuntu_crypt_password.yaml`` with a password hash:

.. code-block:: bash

    python3 -c "from crypt import *; print(crypt('<YOUR_PASSWORD>', METHOD_SHA512))"
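
As a concrete sketch of the generation commands above (the variable names are
illustrative, and the output values will differ on every run):

```shell
# Generate a 20-hex-character passphrase and a UUID (e.g., for the Ceph fsid)
PASSPHRASE="$(openssl rand -hex 10)"
CEPH_FSID="$(uuidgen)"
echo "passphrase: ${PASSPHRASE}"
echo "fsid:       ${CEPH_FSID}"
```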
Configure certificates in ``site/${NEW_SITE}/secrets/certificates/ingress.yaml``;
they need to be issued for the domains configured in the ``Register DNS names`` section.

.. caution::

    It is required to configure valid certificates. Self-signed certificates
    are not supported.

Control Plane & Tenant Ceph Cluster Notes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Configuration variables for tenant ceph are located in:

- ``site/${NEW_SITE}/software/charts/osh/openstack-tenant-ceph/ceph-osd.yaml``
- ``site/${NEW_SITE}/software/charts/osh/openstack-tenant-ceph/ceph-client.yaml``

Configuration summary:

- data/values/conf/storage/osd[\*]/data/location: The block device that
  will be formatted by the Ceph chart and used as a Ceph OSD disk

Assumptions:

1. Ceph OSD disks are not configured for any type of RAID. Instead, they
   are configured as JBOD when connected through a RAID controller.
   If the RAID controller does not support JBOD, put each disk in its
   own RAID-0 and enable RAID cache and write-back cache if the
   RAID controller supports it.
2. Ceph disk mapping, disk layout, journal, and OSD setup are the same
   across Ceph nodes, with only their role differing. Out of the 4
   control plane nodes, we expect to have 3 actively participating in
   the Ceph quorum, and the remaining 1 node designated as a standby
   Ceph node which uses a different control plane profile
   (cp\_*-secondary) than the other three (cp\_*-primary).
3. If performing a fresh install, disks are unlabeled or not labeled from a
   previous Ceph install, so that the Ceph chart will not fail disk
   initialization.

.. important::

    It is highly recommended to use SSD devices for Ceph Journal partitions.

If you have an operating system available on the target hardware, you
can determine HDD and SSD devices with:

.. code-block:: bash

    lsblk -d -o name,rota

where a ``rota`` (rotational) value of ``1`` indicates a spinning HDD,
and a value of ``0`` indicates a non-spinning disk (i.e., SSD). (Note:
Some SSDs still report a value of ``1``, so it is best to go by your
server specifications.)
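
To pick out just the non-rotational devices from that listing, the output can
be filtered, for example:

```shell
# Print only device names whose ROTA flag is 0 (reported as non-spinning/SSD);
# -n suppresses the header row so every line is "NAME ROTA"
lsblk -d -n -o name,rota | awk '$2 == 0 {print $1}'
```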

For OSDs, pass in the whole block device (e.g., ``/dev/sdd``), and the
Ceph chart will take care of disk partitioning, formatting, mounting,
etc.

For Ceph Journals, you can pass in a specific partition (e.g., ``/dev/sdb1``).
Note that it's not required to pre-create these partitions. The Ceph chart
will create journal partitions automatically if they don't exist.
By default, the size of every journal partition is 10G. Make sure
there is enough space available to allocate all journal partitions.

Consider the following example where:

The data section of this file would look like:

.. code-block:: yaml

    data:
      values:
|
@ -403,54 +448,48 @@ Manifest linting and combining layers
|
|||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
After constituent YAML configurations are finalized, use Pegleg to lint
|
||||
your manifests, and resolve any issues that result from linting before
|
||||
your manifests. Resolve any issues that result from linting before
|
||||
proceeding:
|
||||
|
||||
::
|
||||
.. code-block:: bash
|
||||
|
||||
sudo airship-pegleg/tools/pegleg.sh repo \
|
||||
-r airship-treasuremap lint
|
||||
sudo tools/airship pegleg site -r /target lint $NEW_SITE
|
||||
|
||||
Note: ``P001`` and ``P003`` linting errors are expected for missing
|
||||
Note: ``P001`` and ``P005`` linting errors are expected for missing
|
||||
certificates, as they are not generated until the next section. You may
|
||||
suppress these warnings by appending ``-x P001 -x P003`` to the lint
|
||||
suppress these warnings by appending ``-x P001 -x P005`` to the lint
|
||||
command.
|
||||

Next, use Pegleg to perform the merge that will yield the combined
global + site type + site YAML:

.. code-block:: bash

    sudo tools/airship pegleg site -r /target collect $NEW_SITE

Perform a visual inspection of the output. If any errors are discovered,
you may fix your manifests and re-run the ``lint`` and ``collect``
commands.

Once you have error-free output, save the resulting YAML as follows:

.. code-block:: bash

    sudo tools/airship pegleg site -r /target collect $NEW_SITE \
        -s ${NEW_SITE}_collected

This output is required for subsequent steps.

Lastly, you should also perform a ``render`` on the documents. The
resulting render from Pegleg will not be used as input in subsequent
steps, but is useful for understanding what the document will look like
once Deckhand has performed all substitutions, replacements, etc. This
is also useful for troubleshooting and addressing any Deckhand errors
prior to submitting via Shipyard:

.. code-block:: bash

    sudo tools/airship pegleg site -r /target render $NEW_SITE

Inspect the rendered document for any errors. If there are errors,
address them in your manifests and re-run this section of the document.

Building the Promenade bundle
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Create an output directory for Promenade certs and run:

.. code-block:: bash

    mkdir ${NEW_SITE}_certs
    sudo tools/airship promenade generate-certs \
        -o /target/${NEW_SITE}_certs /target/${NEW_SITE}_collected/*.yaml

Estimated runtime: About **1 minute**

After the certificates have been successfully created, copy the generated
certificates into the security folder. Example:

.. code-block:: bash

    mkdir -p site/${NEW_SITE}/secrets/certificates
    sudo cp ${NEW_SITE}_certs/certificates.yaml \
        site/${NEW_SITE}/secrets/certificates/certificates.yaml

Regenerate collected YAML files to include copied certificates:

.. code-block:: bash

    sudo rm -rf ${NEW_SITE}_collected ${NEW_SITE}_certs
    sudo tools/airship pegleg site -r /target collect $NEW_SITE \
        -s ${NEW_SITE}_collected

Finally, create the Promenade bundle:

.. code-block:: bash

    mkdir ${NEW_SITE}_bundle
    sudo tools/airship promenade build-all --validators \
        -o /target/${NEW_SITE}_bundle /target/${NEW_SITE}_collected/*.yaml

Genesis node
------------

stated previously in this document. Also ensure that the hardware RAID
is set up for this node per the control plane disk configuration stated
previously in this document.

Then, start with a manual install of Ubuntu 16.04 on the genesis node, the node
you will use to seed the rest of your environment. Use the standard `Ubuntu
ISO <http://releases.ubuntu.com/16.04>`__.
Be sure to select the following:

- UTC timezone
- Hostname that matches the genesis hostname given in
  ``data.genesis.hostname`` in
  ``site/${NEW_SITE}/networks/common-addresses.yaml``.
- At the ``Partition Disks`` screen, select ``Manual`` so that you can
|
||||
setup the same disk partitioning scheme used on the other control
|
||||
plane nodes that will be deployed by MaaS. Select the first logical
|
||||
device that corresponds to one of the RAID-1 arrays already setup in
|
||||
the hardware controller. On this device, setup partitions matching
|
||||
those defined for the ``bootdisk`` in your control plane host profile
|
||||
found in ``airship-treasuremap/site/${NEW_SITE}/profiles/host``.
|
||||
found in ``site/${NEW_SITE}/profiles/host``.
|
||||
(e.g., 30G for /, 1G for /boot, 100G for /var/log, and all remaining
|
||||
storage for /var). Note that the volume size syntax looking like
|
||||
``>300g`` in Drydock means that all remaining disk space is allocated
|
||||
to this volume, and that volume needs to be at least 300G in
|
||||
size.
|
||||
- Ensure that OpenSSH and Docker (Docker is needed because of
  miniMirror) are included as installed packages.
- When you get to the prompt, "How do you want to manage upgrades on
  this system?", choose "No automatic updates" so that packages are
  only updated at the time of our choosing (e.g., maintenance windows).
- Ensure the grub bootloader is also installed to the same logical
  device as in the previous step (this should be the default behavior).

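The ``>300g`` sizing notation mentioned above reads as "grow to fill the remaining space, with a 300G minimum." As an illustration only (this is not Drydock code), a tiny shell sketch of how such a spec can be interpreted:

```shell
# Illustrative parser for Drydock-style size specs (not Drydock code):
# ">300g" means "use remaining space, at least 300G"; "300g" is a fixed size.
min_size_gb() {
  spec="$1"
  grow=no
  case "$spec" in
    ">"*) grow=yes; spec="${spec#>}" ;;
  esac
  size="${spec%g}"            # strip the trailing unit
  echo "grow=$grow min_gb=$size"
}

min_size_gb ">300g"   # prints: grow=yes min_gb=300
min_size_gb "100g"    # prints: grow=no min_gb=100
```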
After installation, ensure the host has outbound internet access and can
resolve public DNS entries (e.g., ``nslookup google.com``,
``curl https://www.google.com``).

Ensure that the deployed genesis hostname matches the hostname in
``data.genesis.hostname`` in
``site/${NEW_SITE}/networks/common-addresses.yaml``.
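One way to script this check is to pull the hostname out of the manifest and compare it with the running hostname. The snippet below is a sketch against a stand-in file (the ``awk`` parsing and the hostname value are assumptions, not Airship tooling); in a real site you would point it at ``site/${NEW_SITE}/networks/common-addresses.yaml``:

```shell
# Stand-in for common-addresses.yaml; the real file lives under
# site/${NEW_SITE}/networks/. The hostname value here is hypothetical.
cat > /tmp/common-addresses.yaml <<'EOF'
data:
  genesis:
    hostname: genesis-node
EOF

# Naive extraction of data.genesis.hostname (assumes this simple layout).
expected=$(awk '/genesis:/{g=1} g && /hostname:/{print $2; exit}' /tmp/common-addresses.yaml)

if [ "$(hostname)" = "$expected" ]; then
  echo "genesis hostname matches"
else
  echo "mismatch: manifest expects '$expected'"
fi
```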
If it does not match, then either change the hostname of the node to
match the configuration documents, or re-generate the configuration with
the correct hostname.

To change the hostname of the deployed node, you may run the following:

.. code-block:: bash

    sudo hostname $NEW_HOSTNAME
    sudo sh -c "echo $NEW_HOSTNAME > /etc/hostname"
    sudo vi /etc/hosts   # Anywhere the old hostname appears in the
                         # file, replace it with the new hostname

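If you prefer a non-interactive edit of ``/etc/hosts``, a ``sed`` one-liner can replace the manual ``vi`` step. It is shown here against a sample file so it is safe to run anywhere; the ``OLD_HOSTNAME``/``NEW_HOSTNAME`` values are placeholders, and the ``\b`` word boundary assumes GNU sed (the default on Ubuntu):

```shell
OLD_HOSTNAME=old-genesis   # placeholder values for illustration
NEW_HOSTNAME=new-genesis

# Sample /etc/hosts content; a real run would target /etc/hosts (with sudo).
cat > /tmp/hosts.sample <<'EOF'
127.0.0.1 localhost
127.0.1.1 old-genesis
EOF

# Replace every whole-word occurrence of the old hostname.
sed -i "s/\b${OLD_HOSTNAME}\b/${NEW_HOSTNAME}/g" /tmp/hosts.sample
cat /tmp/hosts.sample
```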
Alternatively, update the genesis hostname in the site definition and
then repeat the steps in the previous two sections, "Manifest linting
and combining layers" and "Building the Promenade bundle".

Installing matching kernel version
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Install the same kernel version on the genesis host that MaaS will use
to deploy new baremetal nodes.

To do this, first determine the kernel version that will be deployed to
those nodes. Start by looking at the host profile definition used to
deploy other control plane nodes by searching for
``control-plane: enabled``. Most likely this will be a file under
``global/profiles/host``. In this file, find the kernel info. Example:
.. code-block:: yaml

    platform:
      image: 'xenial'
      # ... (additional platform settings elided) ...
      kernel_package: 'linux-image-4.15.0-46-generic'

It is recommended to install the latest kernel. Check the latest
available kernel, update the site specs, and regenerate the collected
YAML files.

Define any proxy environment variables needed for your environment to
reach public Ubuntu package repos, and install the matching kernel on
the genesis host (make sure to run this on the genesis host, not on the
build host):

To install the latest hwe-16.04 kernel:

.. code-block:: bash

    sudo apt-get install --install-recommends linux-generic-hwe-16.04

To install the latest ga-16.04 kernel:

.. code-block:: bash

    sudo apt-get install --install-recommends linux-generic

Check the installed packages on the genesis host with ``dpkg --list``.
If there are any later kernel versions installed, remove them with
``sudo apt remove``, so that the newly installed kernel is the latest
available. Boot the genesis node using the installed kernel.

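After the reboot, you can confirm that the intended kernel came up by comparing ``uname -r`` against the version pinned in the host profile. The version string below is the example used in this guide and is likely different in your site:

```shell
# Pinned version from the host profile example above; adjust per site.
want="4.15.0-46-generic"
have=$(uname -r)

if [ "$have" = "$want" ]; then
  echo "running kernel matches host profile"
else
  echo "running kernel is $have; host profile pins $want"
fi
```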
Install ntpdate/ntp
~~~~~~~~~~~~~~~~~~~

Install and run ntpdate, to ensure a reasonably sane time on the
genesis host before proceeding:

.. code-block:: bash

    sudo apt -y install ntpdate
    sudo ntpdate ntp.ubuntu.com

If your network policy does not allow connection to public NTP
sources, specify a local NTP server instead of using
``ntp.ubuntu.com``.

Then, install the NTP client:

.. code-block:: bash

    sudo apt -y install ntp

Add the list of NTP servers specified in ``data.ntp.servers_joined`` in
file ``site/${NEW_SITE}/networks/common-addresses.yaml``
to ``/etc/ntp.conf`` as follows:

::

    server <ntp server 1>
    server <ntp server 2>

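Because ``servers_joined`` is a single joined string in the manifest, the ``server`` lines can be generated rather than typed. A plain-shell sketch (not an Airship utility; the server names and the comma delimiter are assumptions):

```shell
# Placeholder value; take the real one from data.ntp.servers_joined.
servers_joined="0.pool.ntp.org,1.pool.ntp.org"

# One "server <host>" line per entry, ready to append to /etc/ntp.conf.
ntp_lines=$(echo "$servers_joined" | tr ',' '\n' | sed 's/^/server /')
echo "$ntp_lines"
```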
Then, restart the NTP service:

.. code-block:: bash

    sudo service ntp restart

If the public NTP service is not reachable or reliable from your site,
consider using alternate time sources for your deployment.

Disable the apparmor profile for ntpd:

.. code-block:: bash

    sudo ln -s /etc/apparmor.d/usr.sbin.ntpd /etc/apparmor.d/disable/
    sudo apparmor_parser -R /etc/apparmor.d/usr.sbin.ntpd

Afterward, confirm that the ntpd apparmor profile is disabled.

Promenade bootstrap
~~~~~~~~~~~~~~~~~~~

Copy the ``${NEW_SITE}_bundle`` directory from the build node to the
genesis node, into the home directory of the user there (e.g.,
``/home/ubuntu``). Then, run the following script as sudo on the
genesis node:

.. code-block:: bash

    cd ${NEW_SITE}_bundle
    sudo ./genesis.sh

Estimated runtime: **1h**

Following completion, run the ``validate-genesis.sh`` script to ensure
correct provisioning of the genesis node:

.. code-block:: bash

    cd ${NEW_SITE}_bundle
    sudo ./validate-genesis.sh

Estimated runtime: **2m**

Deploy Site with Shipyard
-------------------------

Export valid login credentials for one of the Airship Keystone users
defined for the site. Currently there are no authorization checks in
place, so the credentials for any of the site-defined users will work.
For example, we can use the ``shipyard`` user, with the password that
was defined in
``site/${NEW_SITE}/secrets/passphrases/ucp_shipyard_keystone_password.yaml``.
Example:

.. code-block:: bash

    export OS_AUTH_URL="https://iam-sw.DOMAIN:443/v3"
    export OS_USERNAME=shipyard
    export OS_PASSWORD=password123

(Note: default auth variables are defined
`here <https://opendev.org/airship/shipyard/src/branch/master/tools/shipyard_docker_base_command.sh>`__,
and should otherwise be correct, barring any customizations of these
site parameters.)

Next, load the collected site manifests into Shipyard:

.. code-block:: bash

    sudo -E tools/airship shipyard create configdocs ${NEW_SITE} \
      --directory=/target/${NEW_SITE}_collected
    sudo tools/airship shipyard commit configdocs

Estimated runtime: **3m**

Now deploy the site with shipyard:

.. code-block:: bash

    tools/airship shipyard create action deploy_site

Estimated runtime: **3h**

The message ``Site Successfully Deployed`` is the expected output at
the end of a successful deployment. In this example, this means that
Airship and OSH should be fully deployed.

Check periodically for successful deployment:

.. code-block:: bash

    tools/airship shipyard get actions
    tools/airship shipyard describe action/<ACTION>

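Since ``deploy_site`` runs for hours, the status check lends itself to a polling loop. Below is a generic helper sketch; the ``tools/airship`` invocation only works from the site directory, so the demonstration uses a stand-in command, and the ``grep`` pattern in the comment is an assumption about the CLI output:

```shell
# Generic poll helper: retries a command until it succeeds or gives up.
poll_until() {
  # usage: poll_until <max_tries> <sleep_seconds> <command ...>
  max=$1; shift
  delay=$1; shift
  n=0
  while [ "$n" -lt "$max" ]; do
    if "$@"; then
      return 0
    fi
    n=$((n + 1))
    sleep "$delay"
  done
  return 1
}

# In a real deployment you might poll something like:
#   poll_until 180 60 sh -c \
#     "tools/airship shipyard describe action/<ACTION> | grep -q Complete"
# Demonstrated here with a trivially-true command:
poll_until 3 0 true && echo "action complete"
```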
Disable password-based login on genesis
---------------------------------------

Before proceeding, verify that your SSH access to the genesis node is
working with your SSH key (i.e., not using password-based
authentication).

Then, disable password-based SSH authentication on genesis in
``/etc/ssh/sshd_config`` by uncommenting the ``PasswordAuthentication``
option and setting its value to ``no``. Example:

::

    PasswordAuthentication no

Then, restart the ssh service:

.. code-block:: bash

    sudo systemctl restart ssh

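You can also verify the effective setting before or after restarting sshd. The check below runs against a sample copy so it is safe anywhere; on the genesis node you would point it at ``/etc/ssh/sshd_config``:

```shell
# Sample sshd_config fragment; a real check would read /etc/ssh/sshd_config.
cat > /tmp/sshd_config.sample <<'EOF'
# Authentication settings
PasswordAuthentication no
EOF

# Print the configured value (first matching uncommented directive).
setting=$(awk '$1 == "PasswordAuthentication" {print $2; exit}' /tmp/sshd_config.sample)
echo "PasswordAuthentication is '$setting'"
```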