70 Commits

Author SHA1 Message Date
Steve Wilkerson
65ce9c73d7 Grafana: Add job to update admin password
This change adds a job to the Grafana chart that allows the
Grafana admin user password to be changed when required. Grafana
only allows this password to be changed via the grafana-admin CLI
or via an HTTP call that requires both the old and new
passwords.
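
To illustrate the HTTP route mentioned above, here is a minimal sketch that builds, but does not send, a password-change request. The endpoint path `/api/user/password` and the field names `oldPassword`, `newPassword` and `confirmNew` are assumptions modeled on Grafana's user API; they are not taken from this commit. The base URL and credentials are placeholders.

```python
import base64
import json
import urllib.request


def build_password_change_request(base_url, user, old_password, new_password):
    """Build (but do not send) a request that changes the current user's password.

    Assumption: the endpoint path and JSON field names follow Grafana's
    user API; they are not taken from the chart itself.
    """
    body = json.dumps({
        "oldPassword": old_password,
        "newPassword": new_password,
        "confirmNew": new_password,
    }).encode("utf-8")
    # The caller authenticates with the *old* password, which is why the
    # job needs to know both values.
    token = base64.b64encode(f"{user}:{old_password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/user/password",
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Placeholder host and credentials, for illustration only.
req = build_password_change_request(
    "http://grafana.example:3000", "admin", "old-secret", "new-secret"
)
print(req.get_method(), req.full_url)
```

Building the request separately from sending it keeps the sketch runnable without a Grafana instance.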

Change-Id: I59a5d26edc4aa4da16e80c5454ecdebbae3a1d15
2019-02-12 09:59:45 -06:00
Steve Wilkerson
cd4ec0b4b2 Grafana: Update Ceph dashboards for Mimic release
This updates the Ceph dashboards for Grafana, as some of the ceph
metrics have changed with the Mimic release.  This fixes issues
with the ceph OSD metrics that broke some Grafana panels, and also
removes the Ceph panel for displaying the number of monitors in
quorum, as that metric has been removed in Mimic

Change-Id: If6cbbfa7d2972ddd0e44b29a6c8277188d2d9ff0
2019-01-21 09:25:57 -06:00
Steve Wilkerson
9e5a295465 Update Elasticsearch health status expressions
This updates the Elasticsearch health status expressions used in
Prometheus, Nagios and Grafana.  The previous Prometheus rule
defined for Elasticsearch health checked for a status that was
> 0 to trigger an alarm for a green health status. The correct
returned values are: 1 for green, 0 for both red and yellow. This
changes the expression to use arithmetic operators to give us a
result that maps to: 2 for green, 1 for yellow, 0 for red.
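
The 2/1/0 mapping can be sketched as simple arithmetic over per-colour indicators. This assumes the exporter reports one 0/1 value per colour; the actual Prometheus expression used in the chart is not shown in this message, and `health_score` is a hypothetical name.

```python
def health_score(green: int, yellow: int, red: int) -> int:
    """Collapse three 0/1 colour indicators into one score.

    Result: 2 for green, 1 for yellow, 0 for red.
    Assumption: exactly one indicator is 1 at any time.
    """
    if (green, yellow, red) not in {(1, 0, 0), (0, 1, 0), (0, 0, 1)}:
        raise ValueError("exactly one colour indicator must be set to 1")
    # Weight each indicator so the sum identifies the colour.
    return 2 * green + 1 * yellow + 0 * red


for colours in [(1, 0, 0), (0, 1, 0), (0, 0, 1)]:
    print(colours, "->", health_score(*colours))
```

An alert can then fire on any score below 2 instead of testing a single series that is 0 for both red and yellow.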

This also updates the Elasticsearch dashboard in Grafana to add a
new mapping for the updated 2g,1y,0r scale.

Finally, this also updates the Nagios service check to be a bit
more verbose in its output.

For reference, see:
https://github.com/justwatchcom/elasticsearch_exporter/issues/120

Change-Id: I6ef2a7c308c6ebfdb693b46127a285bceb6ba872
2019-01-16 11:11:59 -06:00
Zuul
1c87af7856 Merge "Grafana: Add container security context" 2019-01-07 19:40:22 +00:00
Steve Wilkerson
7788a1ebea Grafana: Add dashboard for coredns
This adds a Grafana dashboard for coredns metrics

Change-Id: I5b6698675fad2562741569de559419a1898523ee
2019-01-04 12:00:04 -06:00
Zuul
0b66795342 Merge "Grafana: Add pod security context for grafana user" 2019-01-04 10:08:33 +00:00
Chris Wedgwood
0c4e37391f 'NOP' cleanup for more consistent white-space use in charts
Where we have the style '{{ ...' we should use the style '... }}'.

Change-Id: Ic3e779e4681370d396f95d3804ca27db5b9d3642
2019-01-03 22:45:49 +00:00
Steve Wilkerson
bf5840fa7a Grafana: Add container security context
This adds the container security context to grafana, which
explicitly sets allowPrivilegeEscalation to false

Change-Id: I3723a0c96699b9a517dafa2df08bf8cc916bf117
2019-01-03 16:19:03 -06:00
Steve Wilkerson
680f920312 Grafana: Add pod security context for grafana user
This updates the Grafana chart to include the pod security context
on the grafana pod. This changes the pod's user from root to the
grafana user instead

Change-Id: Id64853640f1941001b83566865defe93227b4291
2019-01-03 12:42:52 -06:00
Pete Birley
0bf3674539 Revert "Add Egress Helm-toolkit function & enforce the network policy at OSH-INFRA"
This reverts commit 8d33a2911c.

Change-Id: Ic861b9bf9b337449b47a3558da8355e7a5bcacee
2018-12-16 04:21:46 +00:00
Mike Pham
8d33a2911c Add Egress Helm-toolkit function & enforce the network policy at OSH-INFRA
This PS implements the helm-toolkit function to generate the
egress rules in a Kubernetes network policy manifest based on overrideable values.
It also enables the Kubernetes network policy at the OSH-infra gate.

Change-Id: Icbe2a18c98dba795d15398dcdcac64228f6a7b4c
2018-12-14 16:32:40 -05:00
Steve Wilkerson
f3d8bda9d6 Grafana: Support multiple Ceph clusters with dashboards
This updates the Grafana Ceph dashboards to use templating to
determine which ceph-mgr to use for displaying ceph related
metrics.  This required setting the appropriate labels on the
ceph-mgr service to be able to distinguish between releases

Change-Id: Id2eceacadc5b6366d7bc6668bc16ccf5ba878e4a
2018-10-16 21:32:13 +00:00
Tin Lam
92e68d33ea Add network policy toolkit function
This patch set implements the helm toolkit function to generate a
kubernetes network policy manifest based on overrideable values.
This also adds a chart that shuts down all ingress and egress
traffic in the namespace. This can be used to ensure the
whitelisted network policy works as intended.

Additionally, implementation is done for some infrastructure charts.

Change-Id: I78e87ef3276e948ae4dd2eb462b4b8012251c8c8
Co-Authored-By: Mike Pham <tp6510@att.com>
Signed-off-by: Tin Lam <tin@irrational.io>
2018-10-15 13:50:50 +00:00
Steve Wilkerson
c7cbb9f4dd Charts: Update heat image used for jobs and helm tests
This changes the image used for various jobs and helm tests in the
osh-infra charts. This replaces the kolla heat image with the loci
based heat image used for jobs and helm tests in openstack-helm in
order to drive consistency

Change-Id: Ie9deedadb7507282fe62723ec4641dd508040364
2018-10-11 14:47:58 -05:00
Steve Wilkerson
bfa237d347 Charts: Update helm test pod templates
This updates the helm test pod templates in the charts with helm
tests defined. This change includes the addition of:

- Generate test pod cluster roles and role bindings
- Generate service accounts for test pods
- Add node selectors to the test pods
- Add service accounts to the test pods
- Addition of entrypoint container to the test pods
- Indentation fix for rabbitmq test pod template

Change-Id: I9a0dd8a1a87bfe5eaf1362e92b37bc004f9c2cdb
2018-10-09 21:00:00 +00:00
Pete Birley
bb3ff98d53 Add release uuid to pods and rc objects
This PS adds the ability to attach a release uuid to pods and rc
objects as desired. A follow-up PS will add the ability to add arbitrary
annotations to the same objects.

Change-Id: Iceedba457a03387f6fc44eb763a00fd57f9d84a5
Signed-off-by: Pete Birley <pete@port.direct>
2018-09-13 05:35:35 +00:00
Steve Wilkerson
8c75dc7924 Grafana: Disable LDAP signup by default
This removes the configuration value for enabling LDAP signup by
default in the Grafana chart, which restricts the ability for a
user to sign up for Grafana access via the login page.

Change-Id: Ifed1dbf7eda022541d7a1ab179788c92763bc310
2018-09-10 18:10:22 +00:00
Steve Wilkerson
9a311475ba Charts: Use secrets for configs in chart
This updates the osh-infra charts to use a secret for their
configuration files instead of a configmap, allowing for the
storage of sensitive information

Change-Id: Ia32587162288df0b297c45fd43b55cef381cb064
2018-08-24 15:56:53 -05:00
Steve Wilkerson
9ee7561521 Grafana: Update default refresh intervals, enable gate ingress
This updates the grafana dashboards to use a default refresh
value of 5m to prevent dashboards with intensive queries (like the
container dashboard) from submitting frequent, expensive requests
to Prometheus

This also removes the override to disable the ingress service for
grafana in the developer deployment script, as it was overlooked
when enabling ingresses after the ingress chart was introduced

Change-Id: I0958a3978cec25a1350172cbe75996f1346858c5
2018-08-20 10:59:53 -05:00
Steve Wilkerson
8652e14acb Add auth for prometheus
This adds authentication to Prometheus with an apache reverse
proxy, similar to elasticsearch, kibana and nagios. This adds an
admin user and password via htpasswd along with adding ldap
support.

This required modifying the grafana chart to configure the
prometheus datasource's basic auth credentials in the data sources
provisioning configuration file by checking whether basic auth is
enabled and injecting the username/password defined in the
corresponding endpoint definition.
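
A minimal sketch of that conditional injection is shown below. The keys `basicAuth`, `basicAuthUser` and `basicAuthPassword` are assumptions modeled on Grafana's data source provisioning format, and the `endpoint` dictionary layout is hypothetical; the chart's real template is not shown here.

```python
def prometheus_datasource(url, endpoint, basic_auth_enabled):
    """Return a data source entry, adding credentials only when auth is enabled.

    Assumptions: key names mirror Grafana's provisioning format, and the
    endpoint carries its credentials under endpoint["auth"]["admin"].
    """
    datasource = {
        "name": "prometheus",
        "type": "prometheus",
        "access": "proxy",
        "url": url,
        "basicAuth": False,
    }
    if basic_auth_enabled:
        admin = endpoint["auth"]["admin"]
        datasource.update(
            basicAuth=True,
            basicAuthUser=admin["username"],
            basicAuthPassword=admin["password"],
        )
    return datasource


# Hypothetical endpoint definition with placeholder credentials.
endpoint = {"auth": {"admin": {"username": "admin", "password": "changeme"}}}
secured = prometheus_datasource("http://prom-metrics:80", endpoint, True)
unsecured = prometheus_datasource("http://prom-metrics:80", endpoint, False)
print(secured["basicAuth"], unsecured["basicAuth"])
```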

This also modifies the nagios chart to use the authenticated
endpoint for prometheus, which is required for nagios to
successfully query the prometheus endpoint for its service
checking mechanism

Change-Id: Ia4ccc3c44a89b2c56594be1f4cc28ac07169bf8c
2018-08-08 18:49:45 +00:00
Steve Wilkerson
c524931707 Grafana: Update Ceph Dashboards
This fixes two issues with the Ceph dashboards in Grafana: the
first fix addresses an incorrect heading for Utilized Capacity in
the ceph cluster dashboard (was reporting utilized as available),
and the second fix addresses the Pool Usage gauge to accurately
reflect the percentage of the pool used (was incorrectly
multiplying the percentage result by 100 a second time, resulting
in large and inaccurate results)
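
The second fix can be illustrated with a short sketch. The formula (used bytes over used plus available bytes) and both function names are assumptions chosen for illustration; the dashboard's actual query is not shown in this message.

```python
def pool_used_percent(used_bytes: float, available_bytes: float) -> float:
    """Return pool usage as a percentage in the range 0 to 100."""
    total = used_bytes + available_bytes
    if total == 0:
        return 0.0
    return 100.0 * used_bytes / total


def pool_used_percent_buggy(used_bytes: float, available_bytes: float) -> float:
    """Reproduce the described bug: the percentage is scaled by 100 again."""
    return pool_used_percent(used_bytes, available_bytes) * 100.0


print(pool_used_percent(25, 75))        # 25.0
print(pool_used_percent_buggy(25, 75))  # 2500.0
```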

Change-Id: I024a555cdb82ee181eb414337b84e7ad62717c97
2018-08-02 11:10:33 -05:00
Seungkyu Ahn
a430533e6a Quoting node_select_value in Ingress Controller
In most cases, the ingress controller's nodeSelector key and value
are "node-role.kubernetes.io/ingress" and "true".
Use quote to treat the nodeSelector value as a string.

Change-Id: Ie1745629b90795e4d888d85f35565e6d6350e09b
2018-08-01 02:39:05 +00:00
Steve Wilkerson
b6f5c19e9d Grafana: Update quotes for ldap admin bind password
This encloses the ldap admin bind password in single quotes
instead of double quotes, which allows for special characters to
be successfully included in the password.

Change-Id: I57649a92595c3fe643f32dd1fb3e7c5b2a0802e7
2018-07-24 15:50:12 -05:00
Steve Wilkerson
dc16a897d7 Add missing labels to helm test pods
This adds missing labels to the helm test pods in osh-infra

Change-Id: I618d9089bfde2d847411f5f876f0ff6afd9cce7f
2018-07-10 08:55:40 -05:00
Steve Wilkerson
c26a1b53f6 Update TLS secret templates, remove nagios readiness probe
This updates the TLS secret templates to include the backend
service in the dict supplied to the manifest template, as it is
required for the TLS secret to render correctly.

This also removes the readiness probe from the nagios container in
the deployment for the nagios chart, as it wasn't functioning as
intended due to the port not being available for the probe

Change-Id: Iabcfd40c74938e0497d08ffeeebc98ab722fa660
2018-06-27 18:56:45 -05:00
Zuul
714bc3e6da Merge "Ingress: Add initial TLS Support for osh-infra public endpoints" 2018-06-26 23:07:28 +00:00
Steve Wilkerson
b823954787 Ingress: Add initial TLS Support for osh-infra public endpoints
Adds support for TLS on overridden FQDNs for public endpoints for
the services that have them in openstack-helm-infra. Currently this
implementation is limited, in that it does not provide support for
dynamically loading CAs into the containers, or specifying them manually
via configuration. As a result, only well-known CAs or CAs added manually
to containers will be recognised.

Change-Id: I4ab4bbe24b6544b64cd365467e8efb2a421ac3f4
2018-06-26 14:47:19 -05:00
Steve Wilkerson
68fa1d6fbe Grafana: Provision data sources via dynamic template in values
This defines the data sources provisioned by Grafana via
a template in the values.yaml. This allows us to define
multiple data source types that can be mapped directly to the
corresponding entries in endpoints, which enables us to generate
the data source URLs via endpoint lookups rather than hardcoding
them. This is the first step to support multiple data sources in
a single Grafana deployment
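
The endpoint-lookup idea can be sketched as follows. The dictionary layout (`hosts`, `scheme`, `port`) and the cluster-local DNS suffix are illustrative assumptions; they are not the helm-toolkit implementation, and `endpoint_url` is a hypothetical helper.

```python
def endpoint_url(endpoints: dict, name: str, namespace: str) -> str:
    """Build a service URL from an endpoint entry instead of hardcoding it.

    Assumption: each entry has 'scheme', 'hosts' and 'port' sections with
    a 'default' key, loosely modeled on a chart's values.yaml.
    """
    entry = endpoints[name]
    scheme = entry["scheme"]["default"]
    host = entry["hosts"]["default"]
    port = entry["port"]["api"]["default"]
    return f"{scheme}://{host}.{namespace}.svc.cluster.local:{port}"


# Hypothetical endpoints section with placeholder names.
endpoints = {
    "monitoring": {
        "scheme": {"default": "http"},
        "hosts": {"default": "prom-metrics"},
        "port": {"api": {"default": 9090}},
    }
}
url = endpoint_url(endpoints, "monitoring", "osh-infra")
print(url)
```

Because each data source only names its endpoint entry, adding another data source type becomes a matter of adding one more entry and one more lookup.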

Change-Id: Iac7f4b1e07aaf83ae4d2a0c923cd06817f0d8c0d
2018-06-26 13:57:46 -05:00
Steve Wilkerson
497959371d Grafana: Update LDAP configuration, update volume mounts
This updates the LDAP configuration for grafana, using a template
defined in the values.yaml file. Using the template allows us to
dynamically define LDAP configuration values, such as the bind dn,
search base and group search base paths, the password, and the
LDAP fqdn.  This also updates the volume mount for the
provisioning directory to be defined by the configuration value in
the values.yaml file

Change-Id: I1e4866d1189cf40b08b3443dc725646a1b76094c
2018-06-26 07:36:15 -05:00
Pete Birley
abb00e97fd Gotpl: remove quote and trunc to suppress output
This PS removes the use of the `quote and truncate` approach to
suppress output from gotpl actions in templates and replaces it
with the recommended practice of defining `$_` instead.

Change-Id: I5fedc3471dcbecef37d2fe1302bf9760b3163467
Signed-off-by: Pete Birley <pete@port.direct>
2018-06-16 16:37:08 -05:00
Pete Birley
fa629cdbbd Daemonsets: Use current kubernetes daemonset api version
This PS moves to the current GA API version for Kubernetes daemonsets.
Additionally, any remaining deployments that were using
`extensions/v1beta1` have been updated to `apps/v1`.

Story: 2002205
Task: 21735

Change-Id: If9703162dc472af1e6096bf2b9062802fd5ce8ab
Signed-off-by: Pete Birley <pete@port.direct>
2018-06-13 21:53:18 +00:00
Zuul
c037e88071 Merge "Charts: Tidy up openstack-helm-infra charts" 2018-05-24 19:41:14 +00:00
Steve Wilkerson
de9c46bcfa Charts: Tidy up openstack-helm-infra charts
This moves the charts in openstack-helm-infra closer towards a
standard structure. It addresses multiple deviations, including:
missing resources for init containers, incorrect indents for
disabled resources in some charts, incorrect indents for volumes
and volumemounts added via values, missing resources for some
helm test templates, missing helm-toolkit image functions, and
moving the resource template declarations to be under the image
template declarations

Change-Id: I4834a5d476ef7fc69c5583caacc0229050f20a76
2018-05-21 12:58:22 -07:00
Zuul
976d7ba35c Merge "Grafana: Add Elasticsearch dashboard" 2018-05-21 19:17:41 +00:00
Zuul
704042ec1d Merge "Grafana: Add Prometheus dashboard" 2018-05-21 17:33:29 +00:00
Steve Wilkerson
b07f58379f Grafana: Add Elasticsearch dashboard
This adds a grafana dashboard for Elasticsearch, providing insight
into the overall cluster health

Change-Id: I5e59a5a5c491b4416ba4505205910d6c6babbff8
2018-05-18 15:47:29 -05:00
Steve Wilkerson
9c90f7d2a9 Grafana: Add Prometheus dashboard
This adds a dashboard for displaying prometheus specific metrics,
providing insight into the performance of prometheus as well as
metrics related to time series, rule evaluations, scrape delays,
and query latency

Change-Id: I2c23c6fc9d0a00236cd38c63d29207e04a368f5f
2018-05-18 15:24:42 -05:00
Rakesh Patnaik
7ea1b738ae improvements/fixes for openstack dashboards for grafana
Change-Id: I68ddffd4db6dab7e7ecc00adcdafc110279dee37
2018-05-17 12:55:07 +00:00
Steve Wilkerson
e081c19fe8 Add ldap support to grafana, update version, add helm tests
This adds ldap support to the grafana chart. This required updating
the version of Grafana to 5.0, as this version allows for using
configuration files to bootstrap the datasources and dashboards
instead of using the grafana http api. This was a necessary change
as using ldap for grafana presented issues trying to create the
datasource via the http api

This also adds a basic helm test for grafana. This test simply
verifies whether the prometheus datasource configured exists and
whether the number of dashboards reported by the admin api matches
the number of dashboards expected
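
The two checks described above can be sketched as a pure function over already-fetched API responses. The response shapes (lists of dictionaries with a `name` key) and the function name are assumptions for illustration; the chart's actual test script is not shown in this message.

```python
def check_grafana(datasources, dashboards, expected_datasource, expected_dashboard_count):
    """Return a list of failure messages; an empty list means the test passes.

    Assumption: both arguments are lists of dicts as decoded from JSON.
    """
    failures = []
    names = {ds.get("name") for ds in datasources}
    if expected_datasource not in names:
        failures.append(f"datasource {expected_datasource!r} not found")
    if len(dashboards) != expected_dashboard_count:
        failures.append(
            f"expected {expected_dashboard_count} dashboards, found {len(dashboards)}"
        )
    return failures


# Placeholder responses standing in for the admin API output.
datasources = [{"name": "prometheus", "type": "prometheus"}]
dashboards = [{"title": "Ceph Cluster"}, {"title": "Nodes"}]
print(check_grafana(datasources, dashboards, "prometheus", 2))
```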

Change-Id: I2e987cb425adba9f909722ffdb25b83f82710c4d
2018-05-15 01:42:04 +00:00
Sean Eagan
f402171e42 Move to v0.3.1 of kubernetes-entrypoint
Move to v0.3.1 of kubernetes-entrypoint which has 2 breaking changes to
pod dependencies, and also adds support for depending on jobs via
labels.

Change-Id: I2bafc2153ddd46b3833b253a2e7950bccbccf8ed
2018-04-25 12:38:44 -05:00
Zuul
330f1787a4 Merge "Grafana: Update dashboards" 2018-04-20 15:00:49 +00:00
Zuul
e36dfcd21d Merge "Add manifest for image_repo_sync job" 2018-04-19 20:55:15 +00:00
Steve Wilkerson
19137ccf48 Grafana: Update dashboards
This ps includes the following grafana dashboard changes:

- Renames the OpenStack dashboard title
- Removes redundant kubernetes dashboards
- Fixes datasource for the nginx dashboard
- Fixes templating variable for rabbitmq dashboard

Change-Id: I2fa1ff606746ce1f51d2ed01788bb5282bd53dfc
2018-04-19 14:10:39 +00:00
Steve Wilkerson
e166432a98 Add manifest for image_repo_sync job
This ps proposes adding a common template for the image_repo_sync
jobs for consumption by the charts

Change-Id: I48476d1e4fd94bd1b08b13b46983e3d999f8d8ca
2018-04-19 14:10:08 +00:00
Steve Wilkerson
ee7516f565 add elasticsearch, fluent-logging, grafana registry endpoints
This adds the local image registry endpoint to elasticsearch,
fluent-logging and grafana.  This endpoint was missing from the
values.yaml in those charts

Change-Id: I30dc1f0cab40ccf8a493e13f407e2f0d37af1eee
2018-04-19 01:12:47 +00:00
Zuul
49e9084679 Merge "OSH-Infra: Update labels for chart components" 2018-04-18 18:47:08 +00:00
Zuul
626b94e0c8 Merge "Helm-Toolkit: Kubernetes Entrypoint, simplify image dependencies" 2018-04-17 15:11:00 +00:00
Steve Wilkerson
7757400edc OSH-infra: move charts to use ingress manifest in htk
This moves all relevant charts in osh-infra to use the htk manifest
template for ingresses, bringing them in line with the charts in
openstack-helm

Change-Id: Ic9c3cc6f0051fa66b6f88ec2b2725698b36ce824
2018-04-13 15:41:12 -05:00
Steve Wilkerson
aaffc4caf0 OSH-Infra: Update labels for chart components
This ps adds more granular node selectors for the charts in osh
infra to match what is currently done in osh

Change-Id: I8957a95053b9fb3ea329fd37ff049cd223a7695d
2018-04-13 08:44:33 -05:00
Pete Birley
b9336ca613 Helm-Toolkit: Kubernetes Entrypoint, simplify image dependencies
This PS simplifies the logic for dynamically merging the image management
dependencies into pod deps when active.

Change-Id: I0cf6c93173bc5fbce697ac15be8697d3b1326d0a
2018-04-13 08:42:37 -05:00