docs: Update some of sysadmin details
Give a little more details on the current ci/cd setup; remove puppet cruft.

Change-Id: I684df4459cf5940d70b89e4c05103f8a8352af87
parent
642c6c2d88
commit
e3fb7d2be0
@ -6,147 +6,89 @@ System Administration
|
|||||||
#####################
|
#####################
|
||||||
|
|
||||||
Our infrastructure is code and contributions to it are handled just
|
Our infrastructure is code and contributions to it are handled just
|
||||||
like the rest of OpenStack. This means that anyone can contribute to
|
like the rest of OpenDev. This means that anyone can contribute to
|
||||||
the installation and long-running maintenance of systems without shell
|
the installation and long-running maintenance of systems without shell
|
||||||
access, and anyone who is interested can provide feedback and
|
access, and anyone who is interested can provide feedback and
|
||||||
collaborate on code reviews.
|
collaborate on code reviews.
|
||||||
|
|
||||||
The configuration of every system operated by the infrastructure team
|
The configuration of every system operated by the infrastructure team
|
||||||
is managed by a combination of Ansible and Puppet:
|
is managed by Ansible and driven by continuous integration and
|
||||||
|
deployment by Zuul.
|
||||||
|
|
||||||
https://opendev.org/opendev/system-config
|
https://opendev.org/opendev/system-config
|
||||||
|
|
||||||
All system configuration should be encoded in that repository so that
|
All system configuration should be encoded in that repository so that
|
||||||
anyone may propose a change in the running configuration to Gerrit.
|
anyone may propose a change in the running configuration to Gerrit.
|
||||||
|
|
||||||
Making a Change in Puppet
|
Guide to CI and CD
|
||||||
=========================
|
==================
|
||||||
|
|
||||||
Many changes to the Puppet configuration can safely be made while only
|
All development work is based around Zuul jobs and a continuous
|
||||||
performing syntax checks. Some more complicated changes merit local
|
integration and development workflow.
|
||||||
testing and an interactive development cycle. The system-config repo
|
|
||||||
is structured to facilitate local testing before proposing a change
|
|
||||||
for review. This is accomplished by separating the puppet
|
|
||||||
configuration into several layers with increasing specificity about
|
|
||||||
site configuration higher in the stack.
|
|
||||||
|
|
||||||
The `modules/` directory holds puppet modules that abstractly describe
|
The starting point for all services is generally the playbooks and
|
||||||
the configuration of a service. Ideally, these should have no
|
roles kept in :git_file:`playbooks`.
|
||||||
OpenStack-specific information in them, and eventually they should all
|
Most playbooks are named ``service-<name>.yaml`` and will indicate
|
||||||
become modules that are directly consumed from PuppetForge, only
|
which production areas they drive.
|
||||||
existing in the system-config repo during an initial incubation period.
|
|
||||||
This is not yet the case, so you may find OpenStack-specific
|
|
||||||
configuration in these modules, though we are working to reduce it.
|
|
||||||
|
|
||||||
The `modules/openstack_project/manifests/` directory holds
|
These playbooks run on groups of hosts which are defined in
|
||||||
configuration for each of the servers that the OpenStack project runs.
|
:git_file:`inventory/service/groups`. The production hosts are kept
|
||||||
Think of these manifests as describing how OpenStack runs a particular
|
in an inventory at :git_file:`inventory/base/hosts.yaml`. During
|
||||||
service. However, no site-specific configuration such as hostnames or
|
testing, these same playbooks are run against the test nodes. You can
|
||||||
credentials should be included in these files. This is what lets you
|
note that the testing hosts are given names that match the group
|
||||||
easily test an OpenStack project manifest on your own server.
|
configuration in the jobs defined in
|
||||||
|
:git_file:`zuul.d/system-config-run.yaml`.
|
||||||
|
|
||||||
Finally, the `manifests/site.pp` file contains the information that is
|
Deployment is run through a bastion host ``bridge.openstack.org``.
|
||||||
specific to the actual servers that OpenStack runs. These should be
|
After changes are approved, Zuul will run Ansible on this host, which
|
||||||
very simple node definitions that largely exist simply to provide
|
will then connect to the production hosts and run the orchestration
|
||||||
private data from hiera to the more robust manifests in the
|
using the latest committed code. The bridge is a special host because
|
||||||
`openstack_project` modules.
|
it holds production secrets, such as passwords or API keys, and
|
||||||
|
unredacted logs. As many logs as possible are provided in the public
|
||||||
|
Zuul job results, but they need to be audited to ensure they do not
|
||||||
|
leak secrets and thus in some cases may not be published.
|
||||||
|
|
||||||
This means that you can run the same configuration on your own server
|
For CI testing, each job creates a "fake" bridge, along with the
|
||||||
simply by providing a different manifest file instead of site.pp.
|
servers required for orchestration. Thus CI testing is performed by a
|
||||||
|
"nested" Ansible -- Zuul initially connects to the testing bridge node
|
||||||
.. note::
|
and deploys it, and then this node runs its own Ansible that tests the
|
||||||
The example below is for Debian / Ubuntu systems. If you are using a
|
orchestration to the other testing nodes, simulating the production
|
||||||
Red Hat based system be sure to setup sudo or simply run the commands as
|
environment. This is driven by playbooks kept in
|
||||||
the root user.
|
:git_file:`playbooks/zuul`. Here you will also find testing
|
||||||
|
definitions of host variables that are kept secret for production
|
||||||
As an example, to run the etherpad configuration on your own server,
|
hosts.
|
||||||
start by ensuring `git` is installed and then cloning the system-config
|
|
||||||
Git repo::
|
|
||||||
|
|
||||||
sudo su -
|
|
||||||
apt-get install git
|
|
||||||
git clone https://opendev.org/opendev/system-config
|
|
||||||
cd system-config
|
|
||||||
|
|
||||||
Then copy the etherpad node definition from `manifests/site.pp` to a new
|
|
||||||
file (be sure to specify the FQDN of the host you are working with in
|
|
||||||
the node specifier). It might look something like this::
|
|
||||||
|
|
||||||
# local.pp
|
|
||||||
class { 'openstack_project::etherpad':
|
|
||||||
ssl_cert_file_contents => hiera('etherpad_ssl_cert_file_contents'),
|
|
||||||
ssl_key_file_contents => hiera('etherpad_ssl_key_file_contents'),
|
|
||||||
ssl_chain_file_contents => hiera('etherpad_ssl_chain_file_contents'),
|
|
||||||
mysql_host => hiera('etherpad_db_host', 'localhost'),
|
|
||||||
mysql_user => hiera('etherpad_db_user', 'username'),
|
|
||||||
mysql_password => hiera('etherpad_db_password'),
|
|
||||||
}
|
|
||||||
|
|
||||||
.. note::
|
|
||||||
Be sure not to use any of the hiera functionality from manifests/site.pp
|
|
||||||
since it is not installed yet. You should be able to comment out the logic
|
|
||||||
safely.
|
|
||||||
|
|
||||||
Then to apply that configuration, run the following from the root of the
|
|
||||||
system-config repository::
|
|
||||||
|
|
||||||
./install_puppet.sh
|
|
||||||
./install_modules.sh
|
|
||||||
puppet apply -l /tmp/manifest.log --modulepath=modules:/etc/puppet/modules manifests/local.pp
|
|
||||||
|
|
||||||
That should turn the system you are logged into into an etherpad
|
|
||||||
server with the same configuration as that used by the OpenStack
|
|
||||||
project. You can edit the contents of the system-config repo and
|
|
||||||
iterate ``puppet apply`` as needed. When you're ready to propose the
|
|
||||||
change for review, you can propose the change with git-review. See the
|
|
||||||
`Development workflow section in the Developer's Guide
|
|
||||||
<https://docs.opendev.org/opendev/infra-manual/latest/developers.html#development-workflow>`_
|
|
||||||
for more information.
|
|
||||||
|
|
||||||
Accessing Clouds
|
|
||||||
================
|
|
||||||
|
|
||||||
As an unprivileged user who is a member of the `sudo` group on
|
|
||||||
bridge, you can access any of the clouds with::
|
|
||||||
|
|
||||||
sudo openstack --os-cloud <cloud name> --os-cloud-region <region name>
|
|
||||||
|
|
||||||
|
After the test environment is orchestrated, the
|
||||||
|
`testinfra <https://testinfra.readthedocs.io/en/latest/>`__ tests from
|
||||||
|
:git_file:`testinfra` are run. This validates the complete
|
||||||
|
orchestration testing environment; things such as ensuring user
|
||||||
|
creation, container readiness and service wellness checks are all
|
||||||
|
performed.
|
||||||
|
|
||||||
.. _adding_new_server:
|
.. _adding_new_server:
|
||||||
|
|
||||||
Adding a New Server
|
Adding a New Server
|
||||||
===================
|
===================
|
||||||
|
|
||||||
To create a new server, do the following:
|
Creating a new server for your service requires discussion with the
|
||||||
|
OpenDev administrators to ensure donor resources are being used
|
||||||
|
effectively.
|
||||||
|
|
||||||
* Add a file in :git_file:`modules/openstack_project/manifests/` that defines a
|
* Hosts should only be configured by Ansible. Nonetheless, in some
|
||||||
class which specifies the configuration of the server.
|
cases SSH access can be granted. Add your public key to
|
||||||
|
:git_file:`inventory/base/group_vars/all.yaml` and include a stanza
|
||||||
* Add a node pattern entry in :git_file:`manifests/site.pp` for the server
|
like this in your server ``host_vars``::
|
||||||
that uses that class. Make sure it supports an ordinal naming pattern
|
|
||||||
(e.g., fooserver01.openstack.org not just fooserver.openstack.org, even
|
|
||||||
if you're replacing an existing server) and that another server with the
|
|
||||||
same does not already exist in the ansible inventory.
|
|
||||||
|
|
||||||
* If your server needs private information such as passwords, use
|
|
||||||
hiera calls in the site manifest, and ask an infra-core team member
|
|
||||||
to manually add the private information to hiera.
|
|
||||||
|
|
||||||
* You should be able to install and configure most software only with
|
|
||||||
ansible or puppet. Nonetheless, if you need SSH access to the host,
|
|
||||||
add your public key to :git_file:`inventory/service/group_vars/all.yaml` and
|
|
||||||
include a stanza like this in your server class::
|
|
||||||
|
|
||||||
extra_users:
|
extra_users:
|
||||||
- your_user_name
|
- your_user_name
|
||||||
|
|
||||||
* Add an RST file with documentation about the server in :git_file:`doc/source`
|
* Add an RST file with documentation about the server and services in
|
||||||
and add it to the index in that directory.
|
:git_file:`doc/source` and add it to the index in that directory.
|
||||||
|
|
||||||
SSH Access
|
SSH Access
|
||||||
==========
|
==========
|
||||||
|
|
||||||
For any of the systems managed by the OpenStack Infrastructure team, the
|
For any of the systems managed by the OpenDev Infrastructure team, the
|
||||||
following practices must be observed for SSH access:
|
following practices must be observed for SSH access:
|
||||||
|
|
||||||
* SSH access is only permitted with SSH public/private key
|
* SSH access is only permitted with SSH public/private key
|
||||||
@ -171,14 +113,13 @@ following practices must be observed for SSH access:
|
|||||||
is received should be used, and the SSH keys should be added with
|
is received should be used, and the SSH keys should be added with
|
||||||
the confirmation constraint ('ssh-add -c').
|
the confirmation constraint ('ssh-add -c').
|
||||||
* The number of SSH keys that are configured to permit access to
|
* The number of SSH keys that are configured to permit access to
|
||||||
OpenStack machines should be kept to a minimum.
|
OpenDev machines should be kept to a minimum.
|
||||||
* OpenStack Infrastructure machines must use puppet to centrally manage and
|
* OpenDev Infrastructure machines must use Ansible to centrally manage
|
||||||
configure user accounts, and the SSH authorized_keys files from the
|
and configure user accounts, and the SSH authorized_keys files from
|
||||||
openstack-infra/system-config repository.
|
the opendev/system-config repository.
|
||||||
* SSH keys should be periodically rotated (at least once per year).
|
* SSH keys should be periodically rotated (at least once per year).
|
||||||
During rotation, a new key can be added to puppet for a time, and
|
During rotation, a new key can be added to puppet for a time, and
|
||||||
then the old one removed. Be sure to run puppet on the backup
|
then the old one removed.
|
||||||
servers to make sure they are updated.
|
|
||||||
|
|
||||||
|
|
||||||
GitHub Access
|
GitHub Access
|
||||||
@ -186,7 +127,7 @@ GitHub Access
|
|||||||
|
|
||||||
To ensure that code review and testing are not bypassed in the public
|
To ensure that code review and testing are not bypassed in the public
|
||||||
Git repositories, only Gerrit will be permitted to commit code to
|
Git repositories, only Gerrit will be permitted to commit code to
|
||||||
OpenStack repositories. Because GitHub always allows project
|
OpenDev repositories. Because GitHub always allows project
|
||||||
administrators to commit code, accounts that have access to manage the
|
administrators to commit code, accounts that have access to manage the
|
||||||
GitHub projects necessarily will have commit access to the
|
GitHub projects necessarily will have commit access to the
|
||||||
repositories.
|
repositories.
|
||||||
@ -197,7 +138,7 @@ would prefer to keep a separate account, it can be added to the
|
|||||||
organisation after discussion and noting the caveats around elevated
|
organisation after discussion and noting the caveats around elevated
|
||||||
access. The account must have 2FA enabled.
|
access. The account must have 2FA enabled.
|
||||||
|
|
||||||
In either case, the adminstrator accounts should not be used to check
|
In either case, the administrator accounts should not be used to check
|
||||||
out or commit code for any project.
|
out or commit code for any project.
|
||||||
|
|
||||||
Note that it is unlikely to be useful to use an account also used for
|
Note that it is unlikely to be useful to use an account also used for
|
||||||
@ -207,26 +148,16 @@ for all projects.
|
|||||||
Root only information
|
Root only information
|
||||||
#####################
|
#####################
|
||||||
|
|
||||||
Some information is only relevant if you have root access to the system - e.g.
|
Below is information relevant to members of the core team with root
|
||||||
you are an OpenStack CI root operator, or you are running a clone of the
|
access.
|
||||||
OpenStack CI infrastructure for another project.
|
|
||||||
|
|
||||||
Force configuration run on a server
|
Accessing Clouds
|
||||||
===================================
|
================
|
||||||
|
|
||||||
If you need to force a configuration run on a single server before the
|
As an unprivileged user who is a member of the `sudo` group on bridge,
|
||||||
usual cron job time, you can use the ``kick.sh`` script on
|
you can inspect any of the clouds with::
|
||||||
``bridge.openstack.org``.
|
|
||||||
|
|
||||||
You could do a single server::
|
sudo openstack --os-cloud <cloud name> --os-cloud-region <region name>
|
||||||
|
|
||||||
# /opt/system-config/production/tools/kick.sh 'review.openstack.org'
|
|
||||||
|
|
||||||
Or use matching to cover a range of servers::
|
|
||||||
|
|
||||||
# /opt/system-config/production/tools/kick.sh 'ze*.openstack.org'
|
|
||||||
|
|
||||||
# /opt/system-config/production/tools/kick.sh 'ze0[1-4].openstack.org'
|
|
||||||
|
|
||||||
Backups
|
Backups
|
||||||
=======
|
=======
|
||||||
@ -477,9 +408,8 @@ from misspelling the name of the file and is recommended.
|
|||||||
Examples
|
Examples
|
||||||
--------
|
--------
|
||||||
|
|
||||||
To disable an OpenStack instance called `amazing.openstack.org` temporarily
|
To disable an OpenDev instance called `foo.opendev.org` temporarily,
|
||||||
without landing a puppet change, ensure the following is in
|
ensure the following is in `/etc/ansible/hosts/emergency.yaml`
|
||||||
`/etc/ansible/hosts/emergency.yaml`
|
|
||||||
|
|
||||||
::
|
::
|
||||||
|
|
||||||
@ -489,6 +419,21 @@ without landing a puppet change, ensure the following is in
|
|||||||
disabled:
|
disabled:
|
||||||
- foo.opendev.org # 2020-05-23 bob is testing change 654321
|
- foo.opendev.org # 2020-05-23 bob is testing change 654321
|
||||||
|
|
||||||
|
Ad-hoc Ansible runs
|
||||||
|
===================
|
||||||
|
|
||||||
|
If you need to run Ansible manually against a host, you should
|
||||||
|
|
||||||
|
* disable automated Ansible runs following the section above
|
||||||
|
* ``su`` to the ``zuul`` user and run the playbook with something like
|
||||||
|
``ansible-playbook -vv
|
||||||
|
src/opendev.org/opendev/system-config/playbooks/service-<name>.yaml``
|
||||||
|
* Restore automated Ansible runs.
|
||||||
|
* You can also use the ``--limit`` flag to restrict which hosts run
|
||||||
|
when there are many in a group. However, be aware that some
|
||||||
|
roles/playbooks like ``letsencrypt`` and ``backup`` run across
|
||||||
|
multiple hosts (deploying DNS records or authorization keys), so
|
||||||
|
incorrect ``--limit`` flags could cause further failures.
|
||||||
|
|
||||||
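The ad-hoc run steps above can be sketched as a small shell snippet. This is only an illustration under stated assumptions: the service name ``etherpad`` and the host ``etherpad01.opendev.org`` are hypothetical placeholders, and the snippet prints the commands for review rather than executing them. It assumes you have already disabled automated runs and switched to the ``zuul`` user. ``--list-hosts`` is a standard ``ansible-playbook`` option that reports the matched hosts without running any tasks, which may help catch an incorrect ``--limit`` before the real run.

```shell
#!/bin/sh
# Hypothetical values for illustration only; substitute a real service and host.
service="etherpad"
limit_host="etherpad01.opendev.org"

# Playbook path relative to the zuul user's home directory, following the
# service-<name>.yaml naming convention described above.
playbook="src/opendev.org/opendev/system-config/playbooks/service-${service}.yaml"

# Step 1: preview which hosts the limit matches; --list-hosts runs no tasks.
preview_cmd="ansible-playbook --list-hosts --limit ${limit_host} ${playbook}"

# Step 2: the actual run, restricted to the same host.
run_cmd="ansible-playbook -vv --limit ${limit_host} ${playbook}"

# Print both commands for review instead of executing them.
printf '%s\n' "$preview_cmd" "$run_cmd"
```

Once the printed commands look right, they can be run by hand in that order. Remember that roles spanning several hosts, such as ``letsencrypt`` and ``backup``, may not behave correctly under a narrow ``--limit``.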
.. _cinder:
|
.. _cinder:
|
||||||
|
|
||||||
|