Doc updates for install and troubleshooting
Change-Id: I91c330ddda612b6cb708fe318859b82b363e3a90
parent
4bd3751578
commit
6e11ff367e
|
@ -1,21 +1,37 @@
|
||||||
Getting Started
|
Getting Started
|
||||||
===============
|
===============
|
||||||
|
|
||||||
|
Note: This document is meant to give a general understanding of how Promenade
|
||||||
|
could be exercised in a development environment or for general learning and
|
||||||
|
understanding. For holistic UCP deployment procedures, refer to `Treasuremap <https://github.com/att-comdev/treasuremap>`_.
|
||||||
|
|
||||||
Basic Deployment
|
Basic Deployment
|
||||||
----------------
|
----------------
|
||||||
|
|
||||||
This approach is quick to get started, but generates the scripts used for
|
This approach is quick to get started, but generates the scripts used for
|
||||||
joining up-front rather than generating them in the API as needed.
|
joining up-front rather than generating them in the API as needed.
|
||||||
|
|
||||||
Setup
|
Setup Build Machine
|
||||||
^^^^^
|
^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
On the machine you wish to use to generate deployment files, install docker:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
sudo apt -y install docker.io
|
||||||
|
|
||||||
|
This can be the same machine you intend to be the Genesis host, or it may be
|
||||||
|
a separate build machine.
|
||||||
|
|
||||||
|
Generate Build files
|
||||||
|
^^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
To create the certificates and scripts needed to perform a basic deployment,
|
To create the certificates and scripts needed to perform a basic deployment,
|
||||||
you can use the following helper script:
|
you can use the following helper script on your build machine:
|
||||||
|
|
||||||
.. code-block:: bash
|
.. code-block:: console
|
||||||
|
|
||||||
./tools/simple-deployment.sh examples/basic build
|
sudo ./tools/simple-deployment.sh examples/basic build
|
||||||
|
|
||||||
This will copy the configuration provided in the ``examples/basic`` directory
|
This will copy the configuration provided in the ``examples/basic`` directory
|
||||||
into the ``build`` directory. Then, it will generate self-signed certificates
|
into the ``build`` directory. Then, it will generate self-signed certificates
|
||||||
|
@ -23,18 +39,31 @@ for all the needed components in Deckhand-compatible format. Finally, it will
|
||||||
render the provided configuration into directly-usable ``genesis.sh`` and
|
render the provided configuration into directly-usable ``genesis.sh`` and
|
||||||
``join-<NODE>.sh`` scripts.
|
``join-<NODE>.sh`` scripts.
|
||||||
|
|
||||||
|
Genesis Host Provision
|
||||||
|
^^^^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
Install Ubuntu 16.04 on the machine intended to be the genesis host. Ensure
|
||||||
|
the host has outbound internet access and DNS resolution.
|
||||||
|
Ensure that the hostname matches the hostname specified in the Genesis.yaml
|
||||||
|
file used to build the above configurations.
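
The checks above can be scripted before anything is run on the host. The
following is a minimal sketch and is not part of Promenade: ``run_check`` is a
hypothetical helper, and the hostname ``n0`` and lookup target ``github.com``
used in the usage note are placeholders for your own values.

.. code-block:: bash

    # Hypothetical helper: run a command quietly and print a one-line result.
    # Returns non-zero when the wrapped command fails.
    run_check() {
        local label="$1"
        shift
        if "$@" > /dev/null 2>&1; then
            echo "ok: $label"
        else
            echo "FAIL: $label"
            return 1
        fi
    }

For example, ``run_check "DNS resolution" getent hosts github.com`` reports
whether name lookups work, and ``run_check "hostname" test "$(uname -n)" = n0``
reports whether the machine carries the expected hostname.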
|
||||||
|
|
||||||
Execution
|
Execution
|
||||||
^^^^^^^^^
|
^^^^^^^^^
|
||||||
|
|
||||||
Perform the following steps to execute the deployment:
|
Perform the following steps to execute the deployment:
|
||||||
|
|
||||||
1. Copy the ``genesis.sh`` script to the genesis node and run it.
|
1. Copy the ``genesis.sh`` script to the genesis node and run it with ``sudo``. In the
|
||||||
|
event of runtime errors, refer to :doc:`troubleshooting/genesis`.
|
||||||
2. Validate the genesis node by running ``validate-genesis.sh`` on it.
|
2. Validate the genesis node by running ``validate-genesis.sh`` on it.
|
||||||
3. Join master nodes by copying their respective ``join-<NODE>.sh`` scripts to
|
3. Nodes for which ``join-<NODE>.sh`` scripts have been generated should be
|
||||||
|
provisioned at this point, and need to have network connectivity to the
|
||||||
|
genesis node. (This could be a manual Ubuntu provision, or a
|
||||||
|
Drydock-initiated PXE boot in the case of a full-fledged UCP deployment.)
|
||||||
|
4. Join master nodes by copying their respective ``join-<NODE>.sh`` scripts to
|
||||||
them and running them.
|
them and running them.
|
||||||
4. Validate the master nodes by copying and running their respective
|
5. Validate the master nodes by copying and running their respective
|
||||||
``validate-<NODE>.sh`` scripts on each of them.
|
``validate-<NODE>.sh`` scripts on each of them.
|
||||||
5. Re-provision the Genesis node
|
6. Re-provision the Genesis node
|
||||||
|
|
||||||
a) Run the ``/usr/local/bin/promenade-teardown`` script on the Genesis node.
|
a) Run the ``/usr/local/bin/promenade-teardown`` script on the Genesis node.
|
||||||
b) Delete the node from the cluster via one of the other nodes ``kubectl delete node <GENESIS>``.
|
b) Delete the node from the cluster via one of the other nodes ``kubectl delete node <GENESIS>``.
|
||||||
|
@ -42,7 +71,7 @@ Perform the following steps to execute the deployment:
|
||||||
d) Join the genesis node as a normal node using its ``join-<GENESIS>.sh`` script.
|
d) Join the genesis node as a normal node using its ``join-<GENESIS>.sh`` script.
|
||||||
e) Validate the node using ``validate-<GENESIS>.sh``.
|
e) Validate the node using ``validate-<GENESIS>.sh``.
|
||||||
|
|
||||||
6. Join and validate all remaining nodes using the ``join-<NODE>.sh`` and
|
7. Join and validate all remaining nodes using the ``join-<NODE>.sh`` and
|
||||||
``validate-<NODE>.sh`` scripts described above.
|
``validate-<NODE>.sh`` scripts described above.
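
The copy-and-run steps above can be sketched as a small helper. This is an
illustration only: it assumes the generated scripts are in the ``build``
directory and that the node is reachable over SSH. The function name
``run_on_node`` and the ``ubuntu@n0`` target are hypothetical.

.. code-block:: bash

    # Copy a generated script to a node and run it there with sudo.
    # Assumption: scripts live in ./build and the target accepts SSH logins.
    # Set RUN=echo to print the commands instead of executing them.
    run_on_node() {
        local script="$1" target="$2"
        ${RUN:-} scp "build/$script" "$target:"
        ${RUN:-} ssh -t "$target" "sudo bash ./$script"
    }

Calling ``RUN=echo run_on_node genesis.sh ubuntu@n0`` prints the two commands
without touching the node, which is a cheap way to review them first.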
|
||||||
|
|
||||||
|
|
||||||
|
|
|
@ -33,4 +33,5 @@ Promenade Configuration Guide
|
||||||
design
|
design
|
||||||
getting-started
|
getting-started
|
||||||
configuration/index
|
configuration/index
|
||||||
|
troubleshooting/index
|
||||||
api
|
api
|
||||||
|
|
|
@ -0,0 +1,78 @@
|
||||||
|
Genesis Troubleshooting
|
||||||
|
=======================
|
||||||
|
|
||||||
|
genesis.sh
|
||||||
|
----------
|
||||||
|
|
||||||
|
Kubernetes services failures
|
||||||
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
Before the Armada manifests are applied, the ``genesis.sh`` script will bring basic
|
||||||
|
Kubernetes services online by starting Docker containers for these services.
|
||||||
|
|
||||||
|
One of the first services to be brought up is the Kubernetes API. If it fails to
|
||||||
|
come up, you may see a repeated error as follows from the genesis.sh script:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
.The connection to the server apiserver.kubernetes.promenade:6443 was
|
||||||
|
refused - did you specify the right host or port?
|
||||||
|
|
||||||
|
Check that the hostname in your Genesis.yaml matches the hostname of the
|
||||||
|
machine you are trying to install onto. If they do not match, change one to
|
||||||
|
match the other. If you change Genesis.yaml, then re-generate the Promenade
|
||||||
|
payloads.
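
A quick way to compare the two hostnames is sketched below. It assumes the
hostname appears in ``Genesis.yaml`` on a plain, unquoted line of the form
``hostname: <name>``. The helper names are hypothetical, and the ``awk``
pattern is a shortcut rather than a real YAML parser.

.. code-block:: bash

    # Print the first "hostname:" value found in the given file.
    # Assumption: the value is unquoted and sits on the same line as the key.
    genesis_hostname() {
        awk '$1 == "hostname:" { print $2; exit }' "$1"
    }

    # Succeed when that value equals this machine's hostname.
    hostname_matches() {
        [ "$(genesis_hostname "$1")" = "$(uname -n)" ]
    }

Running ``hostname_matches Genesis.yaml || echo "hostname mismatch"`` on the
target machine flags the mismatch described above.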
|
||||||
|
|
||||||
|
If the hostnames match, check the container logs under /var/log/pods to see the
|
||||||
|
reason for the provisioning failure. (The ``kubectl logs`` command will not be
|
||||||
|
available if the API container is not running.)
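
Reading those files by hand can be tedious, so a small helper such as the one
below may help. It is a sketch under assumptions: log files are taken to end in
``.log``, the function name ``tail_pod_logs`` is hypothetical, and reading the
directory normally requires root.

.. code-block:: bash

    # Print the last lines of every *.log file under a directory.
    # -L follows symlinks, since pod log entries may be links to the real files.
    tail_pod_logs() {
        local dir="${1:-/var/log/pods}" lines="${2:-20}"
        find -L "$dir" -name '*.log' | while IFS= read -r f; do
            echo "==> $f"
            tail -n "$lines" "$f"
        done
    }

With no arguments the function reads ``/var/log/pods`` and shows 20 lines per
file. Pass a different directory or line count to narrow the output.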
|
||||||
|
|
||||||
|
Armada failures
|
||||||
|
^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
When executing genesis.sh, you may encounter failures from Armada in the
|
||||||
|
provisioning of other containers. For example:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
CRITICAL armada [-] Unhandled error: armada.exceptions.tiller_exceptions.ReleaseException: Failed to Install release: barbican
|
||||||
|
|
||||||
|
Use ``kubectl logs`` on the failed pod to determine the reason for the failure.
|
||||||
|
E.g.:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
sudo kubectl logs barbican-api-5b8bccdf8f-x7sld --namespace=ucp
|
||||||
|
|
||||||
|
Other errors may point to configuration errors. For example:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
CRITICAL armada [-] Unhandled error: armada.exceptions.source_exceptions.GitLocationException: master is not a valid git repository.
|
||||||
|
|
||||||
|
In this case, the git branch name was inadvertently substituted for the git URL
|
||||||
|
in one of the chart definitions in ``bootstrap-armada.yaml``.
|
||||||
|
|
||||||
|
Post-run failures
|
||||||
|
^^^^^^^^^^^^^^^^^
|
||||||
|
|
||||||
|
At its conclusion, the genesis script will output the list of containers
|
||||||
|
provisioned and their status, as reported by Kubernetes. It is possible that
|
||||||
|
some containers may not be in a Running state. E.g.:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
ucp promenade-api-6696769cd-qwpzf 0/1 ImagePullBackOff 0 10h
|
||||||
|
|
||||||
|
For general failures, ``kubectl logs`` may be used as in the previous section.
|
||||||
|
In this case, it was necessary to run ``kubectl describe`` on the pod to get the
|
||||||
|
details of the image pull failure. E.g.:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
kubectl describe pod promenade-api-7dc54d47c-qw27m --namespace=ucp
|
||||||
|
|
||||||
|
In this particular incident report, the problem was a missing certificate on the
|
||||||
|
bare metal node which caused the image download to fail. Installing the
|
||||||
|
certificate, restarting the docker service, and then waiting for the container
|
||||||
|
to retry resolved this particular issue.
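
The fix described above might look like the following sketch on an Ubuntu
host. It assumes the missing certificate is a PEM-encoded CA certificate with a
``.crt`` extension that belongs in the system trust store. The function name
and the file name ``registry-ca.crt`` are hypothetical, and your registry may
need the certificate installed elsewhere.

.. code-block:: bash

    # Install a CA certificate into the Ubuntu trust store and restart Docker.
    # Assumption: $1 is a PEM file whose name ends in .crt.
    # Set RUN=echo to print the commands instead of executing them.
    install_ca_and_restart_docker() {
        local cert="$1"
        ${RUN:-sudo} cp "$cert" /usr/local/share/ca-certificates/
        ${RUN:-sudo} update-ca-certificates
        ${RUN:-sudo} systemctl restart docker
    }

After ``install_ca_and_restart_docker registry-ca.crt`` completes, the pod
retries the image pull on its own. ``kubectl describe`` on the pod shows
whether the pull succeeded.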
|
|
@ -0,0 +1,9 @@
|
||||||
|
Troubleshooting
|
||||||
|
===============
|
||||||
|
|
||||||
|
.. toctree::
|
||||||
|
:maxdepth: 2
|
||||||
|
:caption: Troubleshooting
|
||||||
|
|
||||||
|
genesis
|
||||||
|
|