This change adds a job to the Grafana chart that changes the
Grafana admin user password when required. Grafana only allows
this password to be changed via the grafana-cli admin command or
via an HTTP call that requires both the old and the new
password.
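For illustration, the HTTP route mentioned above could look roughly
like the sketch below. It only builds the request and never sends it.
The `/api/user/password` path and the `oldPassword`/`newPassword`
field names are assumptions about Grafana's user API and should be
checked against the Grafana version in use; the host and credentials
are hypothetical placeholders, and the job added by this change may
work differently.

```python
import base64
import json
import urllib.request


def build_password_change_request(base_url, user, old_password, new_password):
    """Build (but do not send) a password change request for the current user."""
    # Assumed payload shape: both the old and the new password are required.
    payload = {"oldPassword": old_password, "newPassword": new_password}
    # The caller authenticates with the OLD password via HTTP basic auth.
    token = base64.b64encode(f"{user}:{old_password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url}/api/user/password",
        data=json.dumps(payload).encode(),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Hypothetical values, for illustration only.
request = build_password_change_request(
    "http://grafana.example:3000", "admin", "old-secret", "new-secret"
)
print(request.get_method(), request.full_url)
```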
Change-Id: I59a5d26edc4aa4da16e80c5454ecdebbae3a1d15
This updates the Ceph dashboards for Grafana, as some of the ceph
metrics have changed with the Mimic release. This fixes issues
with the ceph OSD metrics that broke some Grafana panels, and also
removes the Ceph panel for displaying the number of monitors in
quorum, as that metric has been removed in Mimic
Change-Id: If6cbbfa7d2972ddd0e44b29a6c8277188d2d9ff0
This updates the Elasticsearch health status expressions used in
Prometheus, Nagios and Grafana. The previous Prometheus rule
defined for Elasticsearch health checked for a status that was
> 0 to trigger an alarm for a green health status. The correct
returned values are: 1 for green, 0 for both red and yellow. This
changes the expression to use arithmetic operators to give us a
result that maps to: 2 for green, 1 for yellow, 0 for red.
This also updates the Elasticsearch dashboard in Grafana to add a
new mapping for the updated 2g,1y,0r scale.
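As a rough illustration of the 2g,1y,0r scale, the sketch below
collapses per-colour indicators into a single score. It assumes the
exporter publishes one 0/1 value per colour; the actual Prometheus
expression is not reproduced in this message, so the arithmetic and
the function name are illustrative only.

```python
def health_score(green, yellow, red):
    """Collapse per-colour 0/1 indicators into 2 (green), 1 (yellow), 0 (red)."""
    # Exactly one colour is expected to be active at any time.
    if green + yellow + red != 1:
        raise ValueError("expected exactly one active health colour")
    return 2 * green + 1 * yellow + 0 * red


# Value mapping in the style of the dashboard's updated scale.
SCORE_TO_STATUS = {2: "green", 1: "yellow", 0: "red"}

for indicators in [(1, 0, 0), (0, 1, 0), (0, 0, 1)]:
    print(indicators, "->", SCORE_TO_STATUS[health_score(*indicators)])
```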
Finally, this also updates the Nagios service check to be a bit
more verbose in its output.
For reference, see:
https://github.com/justwatchcom/elasticsearch_exporter/issues/120
Change-Id: I6ef2a7c308c6ebfdb693b46127a285bceb6ba872
This adds the container security context to grafana, which
explicitly sets allowPrivilegeEscalation to false
Change-Id: I3723a0c96699b9a517dafa2df08bf8cc916bf117
This updates the Grafana chart to include the pod security context
on the grafana pod. This changes the pod's user from root to the
grafana user instead
Change-Id: Id64853640f1941001b83566865defe93227b4291
This PS implements the helm-toolkit function to generate the
egress section of the Kubernetes network policy manifest based on
overridable values.
It also enables the Kubernetes network policy in the osh-infra gate.
Change-Id: Icbe2a18c98dba795d15398dcdcac64228f6a7b4c
This updates the Grafana Ceph dashboards to use templating to
determine which ceph-mgr to use for displaying ceph related
metrics. This required setting the appropriate labels on the
ceph-mgr service to be able to distinguish between releases
Change-Id: Id2eceacadc5b6366d7bc6668bc16ccf5ba878e4a
This patch set implements the helm toolkit function to generate a
kubernetes network policy manifest based on overrideable values.
This also adds a chart that shuts down all ingress and egress
traffic in the namespace. This can be used to ensure the
whitelisted network policy works as intended.
Additionally, implementation is done for some infrastructure charts.
Change-Id: I78e87ef3276e948ae4dd2eb462b4b8012251c8c8
Co-Authored-By: Mike Pham <tp6510@att.com>
Signed-off-by: Tin Lam <tin@irrational.io>
This changes the image used for various jobs and helm tests in the
osh-infra charts. This replaces the kolla heat image with the loci
based heat image used for jobs and helm tests in openstack-helm in
order to drive consistency
Change-Id: Ie9deedadb7507282fe62723ec4641dd508040364
This updates the helm test pod templates in the charts with helm
tests defined. This change includes the addition of:
- Generate test pod cluster roles and role bindings
- Generate service accounts for test pods
- Add node selectors to the test pods
- Add service accounts to the test pods
- Addition of entrypoint container to the test pods
- Indentation fix for rabbitmq test pod template
Change-Id: I9a0dd8a1a87bfe5eaf1362e92b37bc004f9c2cdb
This PS adds the ability to attach a release uuid to pods and rc
objects as desired. A follow-up PS will add the ability to add arbitrary
annotations to the same objects.
Change-Id: Iceedba457a03387f6fc44eb763a00fd57f9d84a5
Signed-off-by: Pete Birley <pete@port.direct>
This removes the configuration value for enabling LDAP signup by
default in the Grafana chart, which restricts the ability for a
user to sign up for Grafana access via the login page.
Change-Id: Ifed1dbf7eda022541d7a1ab179788c92763bc310
This updates the osh-infra charts to use a secret for their
configuration files instead of a configmap, allowing for the
storage of sensitive information
Change-Id: Ia32587162288df0b297c45fd43b55cef381cb064
This updates the grafana dashboards to use a default refresh
value of 5m to prevent dashboards with intensive queries (like the
container dashboard) from submitting frequent, expensive requests
to Prometheus
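A minimal sketch of the refresh change, assuming the dashboards are
stored as JSON documents with a top-level `refresh` field. The helper
name and the trimmed dashboard below are hypothetical.

```python
import json


def with_default_refresh(dashboard, interval="5m"):
    """Return a copy of a dashboard definition with its refresh interval set."""
    updated = dict(dashboard)
    updated["refresh"] = interval
    return updated


# Hypothetical, heavily trimmed dashboard definition.
container_dashboard = {"title": "Containers", "refresh": "5s", "panels": []}
print(json.dumps(with_default_refresh(container_dashboard), indent=2))
```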
This also removes the override to disable the ingress service for
grafana in the developer deployment script, as it was overlooked
when enabling ingresses after the ingress chart was introduced
Change-Id: I0958a3978cec25a1350172cbe75996f1346858c5
This adds authentication to Prometheus with an apache reverse
proxy, similar to elasticsearch, kibana and nagios. This adds an
admin user and password via htpasswd along with adding ldap
support.
This required modifying the grafana chart to configure the
prometheus datasource's basic auth credentials in the data sources
provisioning configuration file by checking whether basic auth is
enabled and injecting the username/password defined in the
corresponding endpoint definition.
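The conditional injection described above can be sketched as follows.
This is not the chart's template: the endpoint structure and helper
name are hypothetical, and the `basicAuth`, `basicAuthUser` and
`basicAuthPassword` keys are assumed to match the data source
provisioning format of the Grafana version in use.

```python
def render_datasource(name, url, auth=None):
    """Build one data source entry, adding basic auth only when configured."""
    entry = {"name": name, "type": "prometheus", "access": "proxy", "url": url}
    if auth:  # basic auth is enabled for this endpoint
        entry["basicAuth"] = True
        entry["basicAuthUser"] = auth["username"]
        entry["basicAuthPassword"] = auth["password"]
    return entry


# Hypothetical endpoint definition, for illustration only.
endpoint = {
    "url": "http://prom-metrics.example:80",
    "auth": {"username": "admin", "password": "changeme"},
}

secured = render_datasource("prometheus", endpoint["url"], endpoint["auth"])
unsecured = render_datasource("prometheus", endpoint["url"])
print(sorted(secured))
print(sorted(unsecured))
```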
This also modifies the nagios chart to use the authenticated
endpoint for prometheus, which is required for nagios to
successfully query the prometheus endpoint for its service
checking mechanism
Change-Id: Ia4ccc3c44a89b2c56594be1f4cc28ac07169bf8c
This fixes two issues with the Ceph dashboards in Grafana: the
first fix addresses an incorrect heading for Utilized Capacity in
the ceph cluster dashboard (was reporting utilized as available),
and the second fix addresses the Pool Usage gauge to accurately
reflect the percentage of the pool used (was incorrectly
multiplying the percentage result by 100 a second time, resulting
in large and inaccurate results)
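The Pool Usage problem comes down to scaling a ratio to a percentage
twice. A small sketch with hypothetical byte counts (the function
names are illustrative and do not come from the dashboard):

```python
def pool_usage_percent(used_bytes, total_bytes):
    """Percentage of the pool in use, scaled to 0-100 exactly once."""
    return used_bytes / total_bytes * 100


def pool_usage_percent_buggy(used_bytes, total_bytes):
    """Reproduces the double scaling described above."""
    return pool_usage_percent(used_bytes, total_bytes) * 100


used, total = 25, 100  # hypothetical byte counts
print(pool_usage_percent(used, total))        # 25.0
print(pool_usage_percent_buggy(used, total))  # 2500.0
```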
Change-Id: I024a555cdb82ee181eb414337b84e7ad62717c97
In most cases, the ingress controller's nodeSelector key and value
are "node-role.kubernetes.io/ingress" and "true".
Use quote so that the nodeSelector value is treated as a string.
Change-Id: Ie1745629b90795e4d888d85f35565e6d6350e09b
This encloses the ldap admin bind password in single quotes
instead of double quotes, which allows for special characters to
be successfully included in the password.
Change-Id: I57649a92595c3fe643f32dd1fb3e7c5b2a0802e7
This updates the TLS secret templates to include the backend
service in the dict supplied to the manifest template, as it is
required for the TLS secret to render correctly.
This also removes the readiness probe from the nagios container in
the deployment for the nagios chart, as it wasn't functioning as
intended due to the port not being available for the probe
Change-Id: Iabcfd40c74938e0497d08ffeeebc98ab722fa660
Adds support for TLS on overridden FQDNs for public endpoints for
the services that have them in openstack-helm-infra. Currently this
implementation is limited, in that it does not provide support for
dynamically loading CAs into the containers, or specifying them manually
via configuration. As a result, only well-known CAs or CAs added manually
to the containers will be recognised.
Change-Id: I4ab4bbe24b6544b64cd365467e8efb2a421ac3f4
This changes the datasources provisioned by Grafana to be defined
via a template in values.yaml. This allows us to define
multiple datasource types that can be mapped directly to the
corresponding entries in endpoints, which enables us to generate
the data source URLs via endpoint lookups rather than hardcoding
them. This is the first step to support multiple data sources in
a single Grafana deployment
Change-Id: Iac7f4b1e07aaf83ae4d2a0c923cd06817f0d8c0d
This updates the LDAP configuration for grafana, using a template
defined in the values.yaml file. Using the template allows us to
dynamically define LDAP configuration values, such as the bind dn,
search base and group search base paths, the password, and the
LDAP fqdn. This also updates the volume mount for the
provisioning directory to be defined by the configuration value in
the values.yaml file
Change-Id: I1e4866d1189cf40b08b3443dc725646a1b76094c
This PS removes the use of the `quote and truncate` approach to
suppress output from gotpl actions in templates and replaces it
with the recommended practice of defining `$_` instead.
Change-Id: I5fedc3471dcbecef37d2fe1302bf9760b3163467
Signed-off-by: Pete Birley <pete@port.direct>
This PS moves to the current GA version for Kubernetes daemonsets.
Additionally, any remaining deployments that were using
`extensions/v1beta1` have been updated to `apps/v1`.
Story: 2002205
Task: 21735
Change-Id: If9703162dc472af1e6096bf2b9062802fd5ce8ab
Signed-off-by: Pete Birley <pete@port.direct>
This moves the charts in openstack-helm-infra closer towards a
standard structure. It addresses multiple deviations, including:
missing resources for init containers, incorrect indents for
disabled resources in some charts, incorrect indents for volumes
and volumemounts added via values, missing resources for some
helm test templates, missing helm-toolkit image functions, and
moving the resource template declarations to be under the image
template declarations
Change-Id: I4834a5d476ef7fc69c5583caacc0229050f20a76
This adds a grafana dashboard for Elasticsearch, providing insight
into the overall cluster health
Change-Id: I5e59a5a5c491b4416ba4505205910d6c6babbff8
This adds a dashboard for displaying prometheus specific metrics,
providing insight into the performance of prometheus as well as
metrics related to time series, rule evaluations, scrape delays,
and query latency
Change-Id: I2c23c6fc9d0a00236cd38c63d29207e04a368f5f
This adds ldap support to the grafana chart. This required updating
the version of Grafana to 5.0, as this version allows for using
configuration files to bootstrap the datasources and dashboards
instead of using the grafana http api. This was a necessary change
as using ldap for grafana presented issues trying to create the
datasource via the http api
This also adds a basic helm test for grafana. This test simply
verifies whether the prometheus datasource configured exists and
whether the number of dashboards reported by the admin api matches
the number of dashboards expected
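The two checks performed by the test can be sketched as below. This
is an illustration of the logic only, working on already fetched
data; the function name and the sample values are hypothetical and do
not reproduce the chart's test script.

```python
def grafana_checks_pass(datasources, dashboard_count,
                        expected_datasource, expected_dashboards):
    """Return True when the datasource exists and the dashboard count matches."""
    has_datasource = any(
        ds.get("name") == expected_datasource for ds in datasources
    )
    return has_datasource and dashboard_count == expected_dashboards


# Hypothetical responses, for illustration only.
datasources = [{"name": "prometheus", "type": "prometheus"}]
print(grafana_checks_pass(datasources, 12, "prometheus", 12))  # True
print(grafana_checks_pass(datasources, 11, "prometheus", 12))  # False
```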
Change-Id: I2e987cb425adba9f909722ffdb25b83f82710c4d
Move to v0.3.1 of kubernetes-entrypoint which has 2 breaking changes to
pod dependencies, and also adds support for depending on jobs via
labels.
Change-Id: I2bafc2153ddd46b3833b253a2e7950bccbccf8ed
This ps includes the following grafana dashboard changes:
- Renames the OpenStack dashboard title
- Removes redundant kubernetes dashboards
- Fixes datasource for the nginx dashboard
- Fixes templating variable for rabbitmq dashboard
Change-Id: I2fa1ff606746ce1f51d2ed01788bb5282bd53dfc
This ps proposes adding a common template for the image_repo_sync
jobs for consumption by the charts
Change-Id: I48476d1e4fd94bd1b08b13b46983e3d999f8d8ca
This adds the local image registry endpoint to elasticsearch,
fluent-logging and grafana. This endpoint was missing from the
values.yaml in those charts
Change-Id: I30dc1f0cab40ccf8a493e13f407e2f0d37af1eee
This moves all relevant charts in osh-infra to use the htk manifest
template for ingresses, bringing them in line with the charts in
openstack-helm
Change-Id: Ic9c3cc6f0051fa66b6f88ec2b2725698b36ce824
This ps adds more granular node selectors for the charts in
osh-infra to match what is currently done in osh
Change-Id: I8957a95053b9fb3ea329fd37ff049cd223a7695d
This PS simplifies the logic for dynamically merging the image management
dependencies into pod deps when active.
Change-Id: I0cf6c93173bc5fbce697ac15be8697d3b1326d0a