:title: Contributing Cloud Test Resources

.. _contributing_cloud:

Contributing Cloud Test Resources
#################################

OpenStack utilizes a "project gating" system based on `Zuul
<https://docs.openstack.org/infra/zuul/>`_ to ensure that every change
proposed to any OpenStack project passes tests before being added to
its source code repository. Each change may run several jobs which
test the change in various configurations, and each job may run
thousands of individual tests. To ensure the overall security of the
system as well as isolation between unrelated changes, each job is run
on an OpenStack compute instance that is created specifically to run
that job and is destroyed and replaced immediately after completing
that task.

This system operates across multiple OpenStack clouds, making the
OpenStack project infrastructure itself a substantial and very public
cross-cloud OpenStack application.

The compute instances used by this system are generously donated by
organizations that are contributing to OpenStack, and the project is
very appreciative of this.

By visiting https://zuul.openstack.org/ you can see the system in
action at any time.

You'll see every job that's currently running, as well as some graphs
that show activity over time. Each of those jobs is running on its
own compute instance. We create and destroy quite a number of those
each day (most compute instances last for about 1 hour).

Having resources from more providers will help us continue to grow the
project and deliver test results to developers quickly. OpenStack has
long since become too complicated for developers to effectively test
even the most common configurations on their own, so this process is
very important for developers.

If you have some capacity on an OpenStack cloud that you are able to
contribute to the project, it would be a big help. This is what we
need:

* Nova and Glance APIs (with the ability to upload images)
* A single instance with 500GB of disk (via Cinder is preferred, local
  is okay) per cloud region for our region-local mirror

Each test instance requires:

* 8GB RAM
* 8 vCPUs at 2.4GHz (or more or fewer vCPUs depending on speed)
* A public IP address (IPv4 and/or IPv6)
* 80GB of storage

In a setting where our instances will be segregated, our usage
patterns will cause us to be our own noisy neighbors at the worst
times, so it would be best to plan for little or no overcommitment.
In an unsegregated public cloud setting, the distribution of our jobs
over a larger number of hypervisors will allow for more
overcommitment.

Since there's a bit of setup and maintenance involved in adding a new
provider, a minimum of 100 instances would be helpful.

Benefits to Contributors
========================

Since we continuously use the OpenStack APIs and are familiar with how
they should operate, we occasionally discover potential problems with
contributing clouds before many of their other users (or sometimes
even their operations teams). In these cases, we work with contacts on
their operations teams to let them know and try to help fix problems
before they become an issue for their customers.

We collect numerous metrics about the performance of the clouds we
utilize. From these metrics we create dashboards which are freely
accessible via the Internet to help providers see and debug
performance issues.

The names and regions of providers are a primary component of
hostnames on job workers, and as such are noticeable to those
reviewing job logs from our CI system (as an example, developers
investigating test results on proposed source code changes). In this
way, names of providers contributing test resources become known to
the technical community in their day-to-day interaction with our
systems.

The OpenStack Foundation has identified Infrastructure Donors as a
special category of sponsoring organization and prominently identifies
those contributing a significant quantity of resources (as determined
by the Infra team) at:
https://www.openstack.org/foundation/companies/#infra-donors

If this sounds interesting, and you have some capacity to spare, it
would be very much appreciated. You are welcome to contact the
Infrastructure team on our public mailing list at
<service-discuss@lists.opendev.org>, or in our IRC channel,
``#opendev`` on OFTC.

Contribution Workflow
=====================

After discussing your welcome contribution with the infrastructure
team, it will be time to build and configure the cloud.

Initial setup
-------------

We require two projects to be provisioned:

* A ``zuul`` project for infrastructure testing nodes
* A ``ci`` project for control-plane services

The ``zuul`` project will be used by nodepool for running the testing
nodes. Note there may be references in configuration to projects named
with ``jenkins``; although Jenkins is not used any more, some original
clouds named their projects for the CI system in use at the time.

At a minimum, the ``ci`` project hosts the region-local mirror host(s)
for the cloud's region(s). This will be named
``mirror.<region>.<cloud>.openstack.org`` and all jobs running in the
``zuul`` project will be configured to use it as much as possible
(this might influence choices you make in network setup, etc.).
Depending on the resources available, and with prior co-ordination
with the provider, the infrastructure team may also run other services
in this project such as webservers, file servers or nodepool builders.

The exact project and user names are not particularly important;
usually something like ``openstack[ci|zuul]`` is chosen. Per below,
these will exist as ``openstackci-<provider>`` and
``openstackzuul-<provider>`` in various ``clouds.yaml`` configuration
files. To minimise the potential for problems it is probably best that
the provided users do not have "admin" credentials; although in some
clouds that are private to OpenStack infra, admin permissions may be
granted, or an alternative user with such permissions made available,
to help with various self-service troubleshooting. For example, the
infrastructure team does not require any particular access to subnet
or router configuration in the cloud, although where requested we are
happy to help with this level of configuration.

Add cloud configuration
-----------------------

After creating the two projects and users, configuration and
authentication details need to be added into configuration management.
The public portions can be proposed via the standard review process at
any time by anyone. Exact details of cloud configuration change from
time to time; the best way to begin the addition is to clone the
``system-config`` repository (i.e. this repo) with ``git clone
https://opendev.org/opendev/system-config`` and ``grep``
for an existing cloud (or go through ``git log`` and find the last
cloud added) and follow the pattern. After posting the review, CI
tests and reviewers will help with any issues.

These details largely consist of the public portions of the
``openstackclient`` configuration format, such as the endpoint and
version details. Note we require ``https`` communication to Keystone;
we can use self-signed certificates if required. Some non-commercial
clouds use `letsencrypt <https://letsencrypt.org>`__ while others use
their CA of preference.

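As a rough illustration only (the provider name, endpoint and layout
here are hypothetical; copy the structure from an existing cloud in
this repository rather than from this sketch), the public portion of
such an entry in a ``clouds.yaml`` file might look something like:

.. code-block:: yaml

   # Hypothetical sketch -- mirror an existing cloud's entry in
   # system-config for the real layout and options.
   clouds:
     openstackzuul-examplecloud:
       regions:
         - RegionOne
       auth:
         auth_url: https://keystone.examplecloud.example.com:5000/v3
         username: openstackzuul
         project_name: openstackzuul
         # The password is one of the secret values described below and
         # is never committed to the public repository.
       identity_api_version: 3
       user_domain_name: Default
       project_domain_name: Default
     # A matching openstackci-examplecloud entry covers the
     # control-plane project.
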
Once the public review is ready, the secret values used in the review
need to be manually entered by an ``infra-root`` member into the
secret storage on ``bridge.openstack.org``. You can communicate these
via GPG-encrypted mail to an ``infra-root`` member (ping ``infra-root``
in ``#opendev`` and someone will appear). If not told explicitly, most
sign the OpenStack signing key, so you can find their preferred key
via that; if the passwords can be changed, plain-text is also fine.
With those in place, the public review will be committed and the cloud
will become active.

Once active, ``bridge.openstack.org`` will begin regularly running
`ansible-role-cloud-launcher
<http://opendev.org/opendev/ansible-role-cloud-launcher/>`__
against the new cloud to configure keys, upload base images, set up
security groups and such.

Activate in nodepool
--------------------

After the cloud is configured, it can be added as a resource for
nodepool to use for testing nodes.

Firstly, an ``infra-root`` member will need to create the region-local
mirror server, configure any required storage for it and set up DNS.
With this active, the cloud is ready to start running testing nodes.

At this point, the cloud needs to be added to the nodepool
configuration in `project-config
<https://opendev.org/openstack/project-config/src/branch/master/nodepool>`__.
Again, existing entries provide useful templates for the initial
review proposal, which can be done by anyone. Some clouds provision
particular flavors for CI nodes; these need to be present at this
point and will be conveyed via the nodepool configuration. Again, CI
checks and reviewers will help with any fine details.

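As a sketch of the kind of entry involved (the provider, flavor and
label names below are placeholders, and the exact options should be
copied from an existing provider in ``project-config``), a new region
typically shows up in the nodepool configuration along these lines:

.. code-block:: yaml

   # Illustrative sketch only -- follow an existing provider entry for
   # the options actually in use.
   providers:
     - name: examplecloud-regionone
       cloud: openstackzuul-examplecloud
       region-name: RegionOne
       diskimages:
         - name: ubuntu-jammy
       pools:
         - name: main
           max-servers: 100
           labels:
             - name: ubuntu-jammy
               diskimage: ubuntu-jammy
               flavor-name: ci-standard-8  # provider-specific CI flavor
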
Once this is committed, nodepool will upload images into the new
region and start running nodes automatically. Don't forget to add the
region to the `grafana
<https://opendev.org/openstack/project-config/src/branch/master/grafana>`__
configuration to ensure we have a dashboard for the region's health.

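The dashboards in that directory are defined as YAML; a heavily
abbreviated, hypothetical sketch (the statistic names and panel layout
are illustrative, so copy an existing per-provider dashboard for the
real details) looks roughly like:

.. code-block:: yaml

   # Hypothetical sketch of a per-region dashboard definition.
   dashboard:
     title: 'Nodepool: Example Cloud'
     rows:
       - title: Test Nodes
         height: 250px
         panels:
           - title: Building
             type: graph
             span: 4
             targets:
               - target: stats.gauges.nodepool.provider.examplecloud-regionone.nodes.building
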
Ongoing operation
-----------------

If at any point the cloud needs to be disabled for maintenance, a
review can be proposed to set ``max-servers`` to zero in the
nodepool configuration. We usually propose a revert of this at the
same time with a negative workflow vote as a reminder to turn it back
on when appropriate. In an emergency, an ``infra-root`` member can
bypass the normal review process and apply such a change by hand.

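As an illustration (the provider and pool names are placeholders),
such a maintenance change is just a one-line adjustment in the
provider's pool definition:

.. code-block:: yaml

   # Temporarily stop launching test nodes in this region; a revert is
   # proposed at the same time so it is not forgotten.
   providers:
     - name: examplecloud-regionone
       pools:
         - name: main
           max-servers: 0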