Add Zuul gates for the hostconfig repository, update the cronjob feature code, and document resiliency observations

Scripts and files to build initial zuul gates for the
hostconfig repository.

Added cronjob feature: executes the HostConfig CRs
based on the reconcile-period. This feature also adds
support for reconcile execution based on the number
of iterations and the reconcile-interval specified.

Update the docs with the node-resiliency observations
made while testing the hostconfig-operator pod.

Change-Id: Ic0a2f110d709236f9eb23756e3776d4104dd832f
Sirisha Gopigiri 2020-08-24 15:47:07 +05:30 committed by Drew Walters
parent 7bf7989360
commit 46576c9cfe
74 changed files with 2144 additions and 1759 deletions

README.md

@ -1,58 +1,55 @@
# Airship HostConfig Using Ansible Operator
This repo contains the code for Airship HostConfig Application using Ansible Operator
# Airship HostConfig Operator
A Day-2 host management interface for Kubernetes
This repo contains the code for the Airship HostConfig Operator,
built on Ansible Operator
## How to Run
## Approach 1
If Kubernetes setup is not available please refer to README.md in kubernetes folder to bring up the kubernetes setup. It uses Vagrant and Virtual Box to bring up 3 master and 5 worker node VMs
If a Kubernetes setup is not available, you can build one using
the scripts in the tools/deployment folder. The scripts bring up a
kind-based setup with 3 master and 5 worker nodes.
After the VMs are up and running, connect to master node
Follow the steps below to bring up the Kubernetes setup
and then launch the hostconfig-operator pod for testing.
1. Clone the repository
```
vagrant ssh k8-master-1
git clone https://opendev.org/airship/hostconfig-operator.git
cd hostconfig-operator
```
Navigate to airship-host-config folder
2. Install the kind, kubectl and operator-sdk utilities
```
cd airship-host-config/airship-host-config/
./tools/deployment/00_install_kind.sh
./tools/deployment/01_install_kubectl.sh
./tools/deployment/02_install_operator_sdk.sh
```
Execute the create_labels.sh file so that the Kubernetes nodes are labelled accordingly as master and worker nodes. We are also are attaching some sample zones and regions to the kubernetes nodes
3. Create hostconfig kind cluster
```
./create_labels.sh
./tools/deployment/10_create_hostconfig_cluster.sh
```
Please note: As part of the tasks executed whenever we are creating a Hostconfig CR object, we are checking a "hello" file in the $HOME directory of the ansible ssh user. This file is created as part of the ./setup.sh script please feel free to comment the task if not needed before builing the image.
Execute the setup.sh script to build and copy the Airship Hostconfig Ansible Operator Image to worker nodes. It also deploys the application on the Kubernetes setup as deployment kind. The below script configures Airship HostConfig Ansible Operator to use "vagrant" as both username and password when it tries connecting to the Kubernetes Nodes. So when we create a HostConfig Kubernetes CR object the application tries to execute the hostconfig ansible role on the Kubernetes Nodes specified in the CR object by connecting using the "vagrant" username and password.
4. Configure SSH on the kind cluster nodes and create labels
on the nodes
```
./setup.sh
./tools/deployment/20_configure_ssh_on_nodes.sh
./tools/deployment/30_create_labels.sh
```
If you want to execute the ansible playbook in the hostconfig example with a different user, you can also set the username and password of the Kubernetes nodes when executing the setup.sh script. So this configures the HostConfig Ansible Operator pod to use the "username" and "password" passed when the hostconfig ansible role is executed on the kubernetes nodes.
5. Deploy HostConfig Operator on the kubernetes master node
```
./setup.sh <username> <password>
```
If you are planning for the ansible-operator to use username and private key when connecting to the kubernetes node. You can use the script available that creates the private and public keys, copy the public key to kubernetes nodes, creates the secret and attach the secret as annotation.
```
./install_ssh_private_key.sh
```
To try you own custom keys or custom names, follow the below commands to generate the private and public keys. Use this private key and username to generate the kuberenetes secret. Once the secret is available attach this secret name as annotation to the kubernetes node. Also copy the public key to the node.
```
ssh-keygen -q -t rsa -N '' -f <key_file_name>
ssh-copy-id -i <key_file_name> <username>@<node_ip>
kubectl create secret generic <secret_name> --from-literal=username=<username> --from-file=ssh_private_key=<key_file_name>
kubectl annotate node <node_name> secret=<secret_name>
./tools/deployment/40_deploy_hostconfig_operator.sh
```
Check the hostconfig-operator pod status; it should reach the
Running state.
## Approach 2
If Kubernetes setup is already available, please follow the below procedure
If a Kubernetes setup is already available, please follow the
procedure below.
**Pre-requisites: Access to the Kubernetes setup using kubectl**
@ -64,48 +61,66 @@ export KUBECONFIG=~/.kube/config
Clone the repository
```
git clone https://github.com/SirishaGopigiri/airship-host-config.git
git clone https://opendev.org/airship/hostconfig-operator.git
cd hostconfig-operator
```
Navigate to airship-host-config folder
1. Configure SSH keys, and create Kubernetes secrets and
annotations on the nodes. The script requires that SSH is
already installed and configured on the nodes, and that a
sample SSH user which the hostconfig-operator can use to
connect to the nodes is already configured on them.
```
cd airship-host-config/airship-host-config/
./tools/install_ssh_private_key.sh <username> <password>
```
Please note: As part of the tasks executed whenever we are creating a Hostconfig CR object, we are checking a "hello" file in the $HOME directory of the ansible ssh user. This file is created as part of the ./setup.sh script please feel free to comment the task if not needed before builing the image.
Execute the setup.sh script to build and copy the Airship Hostconfig Ansible Operator Image to worker nodes. It also deploys the application on the Kubernetes setup as deployment kind. The below script configures Airship HostConfig Ansible Operator to use "vagrant" as both username and password when it tries connecting to the Kubernetes Nodes. So when we create a HostConfig Kubernetes CR object the application tries to execute the hostconfig ansible role on the Kubernetes Nodes specified in the CR object by connecting using the "vagrant" username and password.
2. Deploy HostConfig Operator on the kubernetes cluster.
```
./setup.sh
./tools/deployment/40_deploy_hostconfig_operator.sh
```
Check the hostconfig-operator pod status; it should reach the
Running state.
If you want to execute the ansible playbook in the hostconfig example with a different user, you can also set the username and password of the Kubernetes nodes when executing the setup.sh script. So this configures the HostConfig Ansible Operator pod to use the "username" and "password" passed when the hostconfig ansible role is executed on the kubernetes nodes.
3. Before executing any HostConfig CR objects, please label
the nodes appropriately. A sample 30_create_labels.sh
script is available in the tools/deployment folder for reference.
The valid labels that can be configured in the HostConfig CR are:
* [`topology.kubernetes.io/region`](https://kubernetes.io/docs/reference/kubernetes-api/labels-annotations-taints/#topologykubernetesioregion)
* [`topology.kubernetes.io/zone`](https://kubernetes.io/docs/reference/kubernetes-api/labels-annotations-taints/#topologykubernetesiozone)
* [`kubernetes.io/hostname`](https://kubernetes.io/docs/reference/kubernetes-api/labels-annotations-taints/#kubernetes-io-hostname)
* [`kubernetes.io/arch`](https://kubernetes.io/docs/reference/kubernetes-api/labels-annotations-taints/#kubernetes-io-arch)
* [`kubernetes.io/os`](https://kubernetes.io/docs/reference/kubernetes-api/labels-annotations-taints/#kubernetes-io-os)
* `kubernetes.io/role`
```
./setup.sh <username> <password>
```
If you are planning for the ansible-operator to use username and private key when connecting to the kubernetes node. You can use the script available that creates the private and public keys, copy the public key to kubernetes nodes, creates the secret and attach the secret as annotation.
```
./install_ssh_private_key.sh
```
## SSH Keys
For the hostconfig operator to use your own custom keys or custom
secret names, follow the commands below to generate the private
and public keys. Use this private key and username to generate
the Kubernetes secret. Once the secret is available, attach the
secret name as an annotation to the Kubernetes node. Also
copy the public key to the node.
To try you own custom keys or custom names, follow the below commands to generate the private and public keys. Use this private key and username to generate the kuberenetes secret. Once the secret is available attach this secret name as annotation to the kubernetes node. Also copy the public key to the node.
```
ssh-keygen -q -t rsa -N '' -f <key_file_name>
ssh-copy-id -i <key_file_name> <username>@<node_ip>
kubectl create secret generic <secret_name> --from-literal=username=<username> --from-file=ssh_private_key=<key_file_name>
kubectl create secret generic <secret_name> \
--from-literal=username=<username> \
--from-file=ssh_private_key=<key_file_name>
kubectl annotate node <node_name> secret=<secret_name>
```
## Run Examples
After the scripts have executed successfully and the
hostconfig-operator pod is in the Running state, navigate
to demo_examples and execute the desired examples.
After the setup.sh file executed successfully, navigate to demo_examples and execute the desired examples
Before executing the examples keep tailing the logs of the airship-host-config pod to see the ansible playbook getting executed while running the examples
Before executing the examples, keep tailing the logs of the
airship-host-config pod to watch the ansible playbook being
executed while the examples run. You can also
check the status of the created CR object.
```
kubectl get pods
@ -119,18 +134,106 @@ cd demo_examples
kubectl apply -f example_host_groups.yaml
kubectl apply -f example_match_host_groups.yaml
kubectl apply -f example_parallel.yaml
```
Apart from the logs on the pod when we execute the hostconfig role we are creating a "tetsing" file on the kubernetes nodes, please check the contents in that file which states the time of execution of the hostconfig role by the HostConfig Ansible Operator Pod.
Execute below command on the kubernetes hosts to get the timestamp of execution.
```
To check the status of a hostconfig CR object you have
created, use the kubectl command.
```
cat /home/vagrant/testing
kubectl get hostconfig <hostconfig_cr_name> -o json
```
If the setup is configured using a different user, check using the below command
This displays the detailed output of each task that
has been executed.
You can also verify the operator by executing the
available validation scripts.
```
cat /home/<username>/testing
./tools/deployment/50_test_hostconfig_cr.sh
./tools/deployment/51_test_hostconfig_cr_reconcile.sh
```
These scripts execute some sample CRs and verify the execution.
## Airship HostConfig Operator CR object specification variables
Here we discuss the various variables that can be used in
the HostConfig CR object to control the execution flow on the
Kubernetes nodes.
host_groups: Dictionary specifying the key/value labels of the
Kubernetes nodes on which the playbook should be executed.
sequential: When set to true executes the host_groups labels sequentially.
match_host_groups: When set to true, performs an AND operation
on the host_group labels and executes the playbook only on hosts
that match all of the labels.
max_hosts_parallel: Caps the number of hosts that are executed
in each iteration.
stop_on_failure: When set to true, stops the playbook execution
on that host and all subsequent hosts whenever a task fails on a node.
max_failure_percentage: Sets the maximum percentage of
hosts that are allowed to fail in every iteration.
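For illustration, below is a minimal sketch of a HostConfig CR that
combines these execution-flow fields. The values are placeholders,
and the exact placement of max_hosts_parallel is assumed from the
description above; see the demo_examples folder for working manifests.
```
apiVersion: hostconfig.airshipit.org/v1alpha1
kind: HostConfig
metadata:
  name: example-execution-flow
spec:
  host_groups:
    - name: "kubernetes.io/role"
      values:
        - "master"
        - "worker"
  sequential: true
  match_host_groups: false
  max_hosts_parallel: 2
  stop_on_failure: false
  max_failure_percentage: 30
```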
Annotations:
reconcile-period: Executes the CR object every period of time
given in the annotation.
reconcile-iterations: Limits the number of reconcile-period
iterations to the number of iterations specified.
reconcile-interval: Runs the CR object at the frequency given as
reconcile-period, for the interval of time given as reconcile-interval.
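As a hedged sketch, the annotation keys below are the ones read by
the operator's callback plugin in this change; the values are
placeholders. For example, reconciling every 5 minutes over a
30-minute interval would give 6 iterations:
```
apiVersion: hostconfig.airshipit.org/v1alpha1
kind: HostConfig
metadata:
  name: example-reconcile
  annotations:
    ansible.operator-sdk/reconcile-period: "5m"
    ansible.operator-sdk/reconcile-interval: "30m"
    # Alternatively, cap the number of runs directly:
    # ansible.operator-sdk/reconcile-iterations: "6"
spec:
  host_groups:
    - name: "kubernetes.io/role"
      values:
        - "master"
```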
Config Roles:
ulimit, sysctl: Array objects specifying the configuration of
ulimit and sysctl on the kubernetes nodes.
kubeadm, shell: Array objects specifying the kubeadm and shell
commands that would be executed on the kubernetes nodes.
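The sketch below only illustrates where these roles sit in the CR:
the keys under config map to the role names that the operator
includes dynamically. The parameter fields shown for sysctl and
ulimit are assumptions for illustration; refer to
demo_examples/example_sysctl_ulimit.yaml for the actual format.
```
apiVersion: hostconfig.airshipit.org/v1alpha1
kind: HostConfig
metadata:
  name: example-config
spec:
  host_groups:
    - name: "kubernetes.io/role"
      values:
        - "worker"
  config:
    # The parameter fields below are illustrative assumptions only.
    sysctl:
      - name: "net.ipv4.ip_forward"
        value: "1"
    ulimit:
      - item: "nofile"
        value: "65536"
```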
The demo_examples folder has some examples listed which can be
used initially to play with the above variables.
1. example_host_groups.yaml - Gives an example of how to use host_groups
2. example_sequential.yaml - In this example the host_groups specified
are executed in sequence: in the first iteration the master nodes are
executed and then the worker nodes are executed
3. example_match_host_groups.yaml - In this example the playbook will
be executed on all hosts that are in zone "us-east-1a" and are
master nodes, in "us-east-1a" and are worker nodes, in "us-east-1b"
and are master nodes, or in "us-east-1b" and are worker nodes.
All the hosts matching the condition will be executed in parallel.
4. example_sequential_match_host_groups.yaml - This is the same example
as above, except that the execution goes in sequence.
5. example_parallel.yaml - In this example we execute on
only 2 hosts in every iteration.
6. example_stop_on_failure.yaml - This example shows that the execution
stops whenever a task fails on any of the kubernetes hosts
7. example_max_percentage.yaml - In this example the execution stops
only when the percentage of failing hosts exceeds 30% in a given iteration.
8. example_sysctl_ulimit.yaml - In this example we configure the kubernetes
nodes with the values specified for ulimit and sysctl in the CR object.
9. example_reconcile.yaml - Gives an example of how to use the
reconcile annotation to run the HostConfig CR periodically
at the frequency given in the annotation.
10. example_reconcile_iterations.yaml - In this example the CR object
executes at the frequency given in the "reconcile-period" annotation for
only the fixed number of times specified in the "reconcile-iterations" annotation.
11. example_reconcile_interval.yaml - Gives an example of running CR objects
at a particular frequency for a particular interval of time,
specified with the "reconcile-interval" annotation.


@ -1,37 +0,0 @@
# Airship HostConfig Using Ansible Operator
Here we discuss about the various variable that are used in the HostConfig CR Object to control the execution flow of the kubernetes nodes
host_groups: Dictionary specifying the key/value labels of the Kubernetes nodes on which the playbook should be executed
sequential: When set to true executes the host_groups labels sequentially
match_host_groups: Performs an AND operation of the host_group labels and executes the playbook on the hosts which have all the labels matched, when set to true
max_hosts_parallel: Caps the numbers of hosts that are executed in each iteration
stop_on_failure: When set to true stops the playbook execution on that host and subsequent hosts whenever a task fails on a node
max_failure_percentage: Sets the Maximum failure percentage of hosts that are allowed to fail on a every iteration
reexecute: Executes the playbook again on the successful hosts as well
ulimit, sysctl: Array objects specifiying the configuration of ulimit and sysctl on the kubernetes nodes
The demo_examples folder has some examples listed which can be used to initially to play with the above variables
1. example_host_groups.yaml - Gives example on how to use host_groups
2. example_sequential.yaml - In this example the host_groups specified goes by sequence and in the first iteration the master nodes get executed and then the worker nodes get executed
3. example_match_host_groups.yaml - In this example the playbook will be executed on all the hosts matching "us-east-1a" zone and are master nodes, "us-east-1a" and are worker nodes, "us-east-1b" and are "master" nodes, "us-east-1b" and are worker nodes. All the hosts matching the condition will be executed in parallel.
4. example_sequential_match_host_groups.yaml - This is the same example as above but just the execution goes in sequence
5. example_parallel.yaml - In this example we will be executing 2 hosts for every iteration
6. example_stop_on_failure.yaml - This example shows that the execution stops whenever a task fails on any kubernetes hosts
7. example_max_percentage.yaml - In this example the execution stops only when the hosts failing exceeds 30% at a given iteration.
8. example_sysctl_ulimit.yaml - In this example we configure the kubernetes nodes with the values specified for ulimit and sysclt in the CR object.


@ -1,22 +1,40 @@
# Ansible Operator base image
FROM quay.io/operator-framework/ansible-operator:v0.17.0
# Installing dependency libraries
COPY requirements.yml ${HOME}/requirements.yml
RUN ansible-galaxy collection install -r ${HOME}/requirements.yml \
&& chmod -R ug+rwx ${HOME}/.ansible
# Configuration for ansible
COPY build/ansible.cfg /etc/ansible/ansible.cfg
# CRD entrypoint definition YAML file
COPY watches.yaml ${HOME}/watches.yaml
# Installing ssh clients - used to connect to kubernetes nodes
USER root
RUN usermod --password rhEpSyEyZ9rxc root
RUN dnf install openssh-clients -y
RUN yum install -y wget && wget http://download.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm && rpm -ivh epel-release-6-8.noarch.rpm && yum --enablerepo=epel -y install sshpass
USER ansible-operator
# Copying the configuration roles
COPY roles/ ${HOME}/roles/
# Copying the entry-point playbook
COPY playbooks/ ${HOME}/playbooks/
# Copying inventory - used to build the kubernetes nodes dynamically
COPY inventory/ ${HOME}/inventory/
# Copying filter and callback plugins used for computation
COPY plugins/ ${HOME}/plugins/
# ansible-runner is unable to pick up custom callback plugins from any directory other than /usr/local/lib/python3.6/site-packages/ansible/plugins/callback
# because ansible-runner overrides the ANSIBLE_CALLBACK_PLUGINS environment variable
# https://github.com/ansible/ansible-runner/blob/stable/1.3.x/ansible_runner/runner_config.py#L178
COPY plugins/callback/hostconfig_k8_cr_status.py /usr/local/lib/python3.6/site-packages/ansible/plugins/callback/
# Initializing ssh folder
RUN mkdir ${HOME}/.ssh


@ -1,11 +1,8 @@
[defaults]
inventory_plugins = /opt/ansible/plugins/inventory
callback_plugins = /opt/ansible/plugins/callback
stdout_callback = yaml
callback_whitelist = profile_tasks,timer,hostconfig_k8_cr_status
module_utils = /opt/ansible/module_utils
roles_path = /opt/ansible/roles
library = /opt/ansible/library
inventory = /opt/ansible/inventory
filter_plugins = /opt/ansible/plugins/filter
remote_tmp = /tmp/ansible
# Custom callback plugin to update the CR object status
callback_whitelist = hostconfig_k8_cr_status


@ -1,28 +0,0 @@
#!/bin/bash
kubectl label node k8s-master-1 kubernetes.io/role=master
kubectl label node k8s-master-2 kubernetes.io/role=master
kubectl label node k8s-master-3 kubernetes.io/role=master
kubectl label node k8s-node-1 kubernetes.io/role=worker
kubectl label node k8s-node-2 kubernetes.io/role=worker
kubectl label node k8s-node-3 kubernetes.io/role=worker
kubectl label node k8s-node-4 kubernetes.io/role=worker
kubectl label node k8s-node-5 kubernetes.io/role=worker
kubectl label node k8s-master-1 topology.kubernetes.io/region=us-east
kubectl label node k8s-master-2 topology.kubernetes.io/region=us-west
kubectl label node k8s-master-3 topology.kubernetes.io/region=us-east
kubectl label node k8s-node-1 topology.kubernetes.io/region=us-east
kubectl label node k8s-node-2 topology.kubernetes.io/region=us-east
kubectl label node k8s-node-3 topology.kubernetes.io/region=us-east
kubectl label node k8s-node-4 topology.kubernetes.io/region=us-west
kubectl label node k8s-node-5 topology.kubernetes.io/region=us-west
kubectl label node k8s-master-1 topology.kubernetes.io/zone=us-east-1a
kubectl label node k8s-master-2 topology.kubernetes.io/zone=us-west-1a
kubectl label node k8s-master-3 topology.kubernetes.io/zone=us-east-1b
kubectl label node k8s-node-1 topology.kubernetes.io/zone=us-east-1a
kubectl label node k8s-node-2 topology.kubernetes.io/zone=us-east-1a
kubectl label node k8s-node-3 topology.kubernetes.io/zone=us-east-1b
kubectl label node k8s-node-4 topology.kubernetes.io/zone=us-west-1a
kubectl label node k8s-node-5 topology.kubernetes.io/zone=us-west-1a


@ -1,10 +0,0 @@
apiVersion: hostconfig.airshipit.org/v1alpha1
kind: HostConfig
metadata:
name: example1
spec:
# Add fields here
host_groups:
- name: "kubernetes.io/role"
values:
- "master"


@ -1,17 +0,0 @@
apiVersion: hostconfig.airshipit.org/v1alpha1
kind: HostConfig
metadata:
name: example3
spec:
# Add fields here
host_groups:
- name: "topology.kubernetes.io/zone"
values:
- "us-east-1a"
- "us-east-1b"
- name: "kubernetes.io/role"
values:
- "master"
- "worker"
sequential: false
match_host_groups: true


@ -1,14 +0,0 @@
apiVersion: hostconfig.airshipit.org/v1alpha1
kind: HostConfig
metadata:
name: example5
spec:
# Add fields here
host_groups:
- name: "kubernetes.io/role"
values:
- "master"
- "worker"
sequential: true
stop_on_failure: false
max_failure_percentage: 30


@ -1,17 +0,0 @@
apiVersion: hostconfig.airshipit.org/v1alpha1
kind: HostConfig
metadata:
name: example4
spec:
# Add fields here
host_groups:
- name: "topology.kubernetes.io/zone"
values:
- "us-east-1a"
- "us-east-1b"
- name: "kubernetes.io/role"
values:
- "master"
- "worker"
sequential: true
match_host_groups: true


@ -1,13 +0,0 @@
apiVersion: hostconfig.airshipit.org/v1alpha1
kind: HostConfig
metadata:
name: example6
spec:
# Add fields here
host_groups:
- name: "kubernetes.io/role"
values:
- "master"
- "worker"
sequential: true
stop_on_failure: true


@ -17,8 +17,19 @@ spec:
validation:
openAPIV3Schema:
type: object
x-kubernetes-preserve-unknown-fields: true
x-kubernetes-preserve-unknown-fields: true # Marking true so that the CR status can be updated
required:
- apiVersion
- kind
- metadata
- spec
properties:
apiVersion:
type: string
kind:
type: string
metadata:
type: object
spec:
description: "HostConfig Spec to perform hostconfig Opertaions."
type: object


@ -4,7 +4,7 @@ kind: Deployment
metadata:
name: airship-host-config
spec:
replicas: 1
replicas: 2
selector:
matchLabels:
name: airship-host-config
@ -14,16 +14,28 @@ spec:
name: airship-host-config
spec:
serviceAccountName: airship-host-config
affinity:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: name
operator: In
values:
- airship-host-config
topologyKey: "kubernetes.io/hostname"
nodeSelector:
kubernetes.io/role: master
tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
containers:
- name: airship-host-config
# Replace this with the built image name
image: "quay.io/sirishagopigiri/airship-host-config"
image: "airship-hostconfig:local"
imagePullPolicy: "IfNotPresent"
volumeMounts:
- mountPath: /tmp/ansible-operator/runner
name: runner
- mountPath: /opt/ansible/data
name: data
env:
- name: WATCH_NAMESPACE
valueFrom:
@ -52,5 +64,3 @@ spec:
volumes:
- name: runner
emptyDir: {}
- name: data
emptyDir: {}


@ -1,17 +1,27 @@
#!/usr/bin/env python3
import os
import sys
# Python code to build Inventory dynamically based on the kubernetes nodes
# present in the cluster and labels and annotations associated
# with the kubernetes nodes.
import argparse
import time
import base64
import kubernetes.client
from kubernetes.client.rest import ApiException
import yaml
import json
import os
import kubernetes.client
from kubernetes.client.rest import ApiException
interested_labels_annotations = [
"beta.kubernetes.io/arch", "beta.kubernetes.io/os",
"kubernetes.io/arch", "kubernetes.io/hostname", "kubernetes.io/os",
"kubernetes.io/role", "topology.kubernetes.io/region",
"topology.kubernetes.io/zone", "projectcalico.org/IPv4Address",
"projectcalico.org/IPv4IPIPTunnelAddr", "Kernel Version", "OS Image",
"Operating System", "Container Runtime Version",
"Kubelet Version", "Operating System"
]
interested_labels_annotations = ["beta.kubernetes.io/arch", "beta.kubernetes.io/os", "kubernetes.io/arch", "kubernetes.io/hostname", "kubernetes.io/os", "kubernetes.io/role", "topology.kubernetes.io/region", "topology.kubernetes.io/zone", "projectcalico.org/IPv4Address", "projectcalico.org/IPv4IPIPTunnelAddr", "Kernel Version", "OS Image", "Operating System", "Container Runtime Version", "Kubelet Version", "Operating System"]
class KubeInventory(object):
@ -19,7 +29,8 @@ class KubeInventory(object):
self.inventory = {}
self.read_cli_args()
self.api_instance = kubernetes.client.CoreV1Api(kubernetes.config.load_incluster_config())
self.api_instance = kubernetes.client.CoreV1Api(
kubernetes.config.load_incluster_config())
if self.args.list:
self.kube_inventory()
elif self.args.host:
@ -33,10 +44,14 @@ class KubeInventory(object):
# Kube driven inventory
def kube_inventory(self):
self.inventory = {"group": {"hosts": [], "vars": {}}, "_meta": {"hostvars": {}}}
self.inventory = {
"group": {"hosts": [], "vars": {}},
"_meta": {"hostvars": {}}
}
self.get_nodes()
# Sets the ssh username and password using the secret name given in the label
# Sets the ssh username and password using
# the secret name given in the label
def _set_ssh_keys(self, labels, node_internalip, node_name):
namespace = ""
if "SECRET_NAMESPACE" in os.environ:
@ -45,46 +60,63 @@ class KubeInventory(object):
namespace = "default"
if "secret" in labels.keys():
try:
secret_value = self.api_instance.read_namespaced_secret(labels["secret"], namespace)
secret_value = self.api_instance.read_namespaced_secret(
labels["secret"], namespace)
except ApiException as e:
print("Exception when calling Secret: %s\n" % e)
return False
if "username" in secret_value.data.keys():
username = (base64.b64decode(secret_value.data['username'])).decode("utf-8")
self.inventory["_meta"]["hostvars"][node_internalip]["ansible_ssh_user"] = username
username = (base64.b64decode(
secret_value.data['username'])).decode("utf-8")
self.inventory["_meta"]["hostvars"]\
[node_internalip]["ansible_ssh_user"] = username
elif "USER" in os.environ:
self.inventory["_meta"]["hostvars"][node_internalip]["ansible_ssh_user"] = os.environ.get("USER")
self.inventory["_meta"]["hostvars"][node_internalip]\
["ansible_ssh_user"] = os.environ.get("USER")
else:
self.inventory["_meta"]["hostvars"][node_internalip]["ansible_ssh_user"] = 'kubernetes'
self.inventory["_meta"]["hostvars"]\
[node_internalip]["ansible_ssh_user"] = 'kubernetes'
if "password" in secret_value.data.keys():
password = (base64.b64decode(secret_value.data['password'])).decode("utf-8")
self.inventory["_meta"]["hostvars"][node_internalip]["ansible_ssh_pass"] = password
password = (base64.b64decode(
secret_value.data['password'])).decode("utf-8")
self.inventory["_meta"]["hostvars"]\
[node_internalip]["ansible_ssh_pass"] = password
elif "ssh_private_key" in secret_value.data.keys():
private_key = (base64.b64decode(secret_value.data['ssh_private_key'])).decode("utf-8")
private_key = (base64.b64decode(
secret_value.data['ssh_private_key'])).decode("utf-8")
fileName = "/opt/ansible/.ssh/"+node_name
with open(os.open(fileName, os.O_CREAT | os.O_WRONLY, 0o644), 'w') as f:
with open(os.open(
fileName, os.O_CREAT | os.O_WRONLY, 0o644), 'w') as f:
f.write(private_key)
f.close()
os.chmod(fileName, 0o600)
self.inventory["_meta"]["hostvars"][node_internalip][
"ansible_ssh_private_key_file"] = fileName
elif "PASS" in os.environ:
self.inventory["_meta"]["hostvars"][node_internalip]["ansible_ssh_pass"] = os.environ.get("PASS")
self.inventory["_meta"]["hostvars"]\
[node_internalip]["ansible_ssh_pass"] = os.environ.get("PASS")
else:
self.inventory["_meta"]["hostvars"][node_internalip]["ansible_ssh_pass"] = 'kubernetes'
self.inventory["_meta"]["hostvars"]\
[node_internalip]["ansible_ssh_pass"] = 'kubernetes'
else:
return False
return True
# Sets default username and password from environment variables or some default username/password
# Sets default username and password from environment variables or
# some default username/password
def _set_default_ssh_keys(self, node_internalip):
if "USER" in os.environ:
self.inventory["_meta"]["hostvars"][node_internalip]["ansible_ssh_user"] = os.environ.get("USER")
self.inventory["_meta"]["hostvars"]\
[node_internalip]["ansible_ssh_user"] = os.environ.get("USER")
else:
self.inventory["_meta"]["hostvars"][node_internalip]["ansible_ssh_user"] = 'kubernetes'
self.inventory["_meta"]["hostvars"]\
[node_internalip]["ansible_ssh_user"] = 'kubernetes'
if "PASS" in os.environ:
self.inventory["_meta"]["hostvars"][node_internalip]["ansible_ssh_pass"] = os.environ.get("PASS")
self.inventory["_meta"]["hostvars"]\
[node_internalip]["ansible_ssh_pass"] = os.environ.get("PASS")
else:
self.inventory["_meta"]["hostvars"][node_internalip]["ansible_ssh_pass"] = 'kubernetes'
self.inventory["_meta"]["hostvars"]\
[node_internalip]["ansible_ssh_pass"] = 'kubernetes'
return
# Gets the Kubernetes nodes labels and annotations and build the inventory
@ -111,28 +143,42 @@ class KubeInventory(object):
node_name = node["metadata"]["name"]
self.inventory["_meta"]["hostvars"][node_internalip][
"kube_node_name"] = node_name
if not self._set_ssh_keys(node["metadata"]["annotations"], node_internalip, node_name):
if not self._set_ssh_keys(
node["metadata"]["annotations"],
node_internalip, node_name):
self._set_default_ssh_keys(node_internalip)
# As the annotations are not of interest so not adding them to ansible host groups
# Only updating the host variable with annotations
# As the annotations are not of interest, they are not
# added to the ansible host groups.
# Only updating the host variables with the annotations
for key, value in node["metadata"]["annotations"].items():
self.inventory["_meta"]["hostvars"][node_internalip][key] = value
self.inventory["_meta"]["hostvars"]\
[node_internalip][key] = value
# Add groups based on labels and also updates the host variables
for key, value in node["metadata"]["labels"].items():
self.inventory["_meta"]["hostvars"][node_internalip][key] = value
self.inventory["_meta"]["hostvars"]\
[node_internalip][key] = value
if key in interested_labels_annotations:
if key+'_'+value not in self.inventory.keys():
self.inventory[key+'_'+value] = {"hosts": [], "vars": {}}
if node_internalip not in self.inventory[key+'_'+value]["hosts"]:
self.inventory[key+'_'+value]["hosts"].append(node_internalip)
self.inventory[key+'_'+value] = {
"hosts": [], "vars": {}
}
if node_internalip not in \
self.inventory[key+'_'+value]["hosts"]:
self.inventory[key+'_'+value]\
["hosts"].append(node_internalip)
# Add groups based on node info and also updates the host variables
for key, value in node['status']['node_info'].items():
self.inventory["_meta"]["hostvars"][node_internalip][key] = value
self.inventory["_meta"]["hostvars"]\
[node_internalip][key] = value
if key in interested_labels_annotations:
if key+'_'+value not in self.inventory.keys():
self.inventory[key+'_'+value] = {"hosts": [], "vars": {}}
if node_internalip not in self.inventory[key+'_'+value]["hosts"]:
self.inventory[key+'_'+value]["hosts"].append(node_internalip)
self.inventory[key+'_'+value] = {
"hosts": [], "vars": {}
}
if node_internalip not in \
self.inventory[key+'_'+value]["hosts"]:
self.inventory[key+'_'+value]\
["hosts"].append(node_internalip)
return
def empty_inventory(self):


@ -1,9 +1,9 @@
---
#playbook.yaml
# create_playbook.yaml
# Ansible play to initialize custom variables
# The below blocks of helps in setting the ansible variables
# according to the CR object passed
# The below role helps in setting the ansible variables
# according to the CR object passed
- name: DISPLAY THE INVENTORY VARS
collections:
- community.kubernetes
@ -16,72 +16,74 @@
- import_role:
name: setvariables
# The play gets executed when the stop_on_failure is undefined or set to false
# stating that the play book execution shouldn't stop even if the tasks fail on the hosts
# The play gets executed when
# stop_on_failure is undefined or set to false
# and max_failure_percentage is undefined.
# When stop_on_failure is false the playbook execution continues on the other hosts,
# except that the execution stops on the failed nodes.
# The below tasks considers the host_config_serial_variable variable value set from the previous block
# Executes the number of hosts set in the host_config_serial_variable at every iteration
# Dynamically gets the roles defined in the config section of the HostConfig CR spec and
# executes the role with the parameters specified in that role section of the CR.
- name: Execute Roles based on hosts and based on the Failure condition
collections:
- community.kubernetes
- operator_sdk.util
hosts: "{{ hostvars['localhost']['hostconfig_host_groups'] | default('all')}}"
serial: "{{ hostvars['localhost']['hostconfig_serial_variable'] | default('100%') }}"
any_errors_fatal: "{{ stop_on_failure|default(false) }}"
any_errors_fatal: false
gather_facts: no
tasks:
- name: HostConfig Block
block:
- import_role:
name: sysctl
when: config.sysctl is defined
- import_role:
name: ulimit
when: config.ulimit is defined
- name: Update the file for success hosts
local_action: lineinfile line={{ inventory_hostname }} create=yes dest=/opt/ansible/data/hostconfig/{{ meta.name }}/success_hosts
throttle: 1
rescue:
- name: Update the file for Failed hosts
local_action: lineinfile line={{ inventory_hostname }} create=yes dest=/opt/ansible/data/hostconfig/{{ meta.name }}/failed_hosts
throttle: 1
when: ((stop_on_failure is undefined or stop_on_failure is defined) and max_failure_percentage is undefined) or (stop_on_failure is true and max_failure_percentage is defined)
- include_role:
name: "{{ item.key }}"
with_dict: "{{ config }}"
when: (config is defined) and (stop_on_failure is undefined or stop_on_failure is false) and (max_failure_percentage is undefined)
# The below play executes the hostconfig role only when stop_on_failure is false
# and when the max_failure_percentage variable is defined.
# The play gets executed when
# stop_on_failure is set to true (max_failure_percentage may be defined or undefined).
# When stop_on_failure is true the playbook execution stops on all hosts
# whenever one task fails on any one of the nodes.
# The below tasks considers the host_config_serial_variable variable value set from the previous block
# Executes the number of hosts set in the host_config_serial_variable at every iteration
- name: Execute Roles based on hosts and based on percentage of Failure
hosts: "{{ hostvars['localhost']['hostconfig_host_groups'] | default('all')}}"
serial: "{{ hostvars['localhost']['hostconfig_serial_variable'] | default('100%') }}"
max_fail_percentage: "{{ hostvars['localhost']['max_failure_percentage'] }}"
gather_facts: no
tasks:
- name: Max Percetage Block
block:
- import_role:
name: sysctl
when: config.sysctl is defined
- import_role:
name: ulimit
when: config.ulimit is defined
when: (stop_on_failure is false or stop_on_failure is undefined) and (max_failure_percentage is defined)
# Update K8 CR Status
- name: Update CR Status
# Dynamically gets the roles defined in the config section of the HostConfig CR spec and
# executes the role with the parameters specified in that role section of the CR.
- name: Execute Roles based on hosts and based on the Failure condition
collections:
- community.kubernetes
- operator_sdk.util
hosts: localhost
hosts: "{{ hostvars['localhost']['hostconfig_host_groups'] | default('all')}}"
serial: "{{ hostvars['localhost']['hostconfig_serial_variable'] | default('100%') }}"
any_errors_fatal: true
gather_facts: no
tasks:
- name: Update CR Status
block:
- name: Write results to resource status
k8s_status:
api_version: hostconfig.airshipit.org/v1alpha1
kind: HostConfig
name: '{{ meta.name }}'
namespace: '{{ meta.namespace }}'
status:
hostConfigStatus: "{{ hostConfigStatus }}"
when: hostConfigStatus is defined
- name: HostConfig Block
block:
- include_role:
name: "{{ item.key }}"
with_dict: "{{ config }}"
when: config is defined and stop_on_failure is defined and stop_on_failure is true
# The below play executes the hostconfig role only when stop_on_failure is false
# and when the max_failure_percentage variable is defined.
# The below tasks considers the host_config_serial_variable variable value set from the previous block
# Executes the number of hosts set in the host_config_serial_variable at every iteration.
# Dynamically gets the roles defined in the config section of the HostConfig CR spec and
# executes the role with the parameters specified in that role section of the CR.
- name: Execute Roles based on hosts and based on percentage of Failure
collections:
- community.kubernetes
- operator_sdk.util
hosts: "{{ hostvars['localhost']['hostconfig_host_groups'] | default('all')}}"
serial: "{{ hostvars['localhost']['hostconfig_serial_variable'] | default('100%') }}"
max_fail_percentage: "{{ max_failure_percentage | default(0) }}"
gather_facts: no
tasks:
- name: Max Percentage Block
block:
- include_role:
name: "{{ item.key }}"
with_dict: "{{ config }}"
when: (config is defined) and (stop_on_failure is false or stop_on_failure is undefined) and (max_failure_percentage is defined)


@ -1,10 +0,0 @@
---
- name: Delete LocalHosts
hosts: localhost
gather_facts: no
tasks:
- name: delete the files
file:
path: "/opt/ansible/data/hostconfig/{{ meta.name }}"
state: absent
register: output


@ -1,112 +1,389 @@
from __future__ import (absolute_import, division, print_function)
__metaclass__ = type
import ast
import itertools
import os
import re
from ansible.plugins.callback import CallbackBase
from datetime import datetime, timedelta
from kubernetes import client, config
from kubernetes.client.rest import ApiException
DOCUMENTATION = '''
callback: hostconfig_k8_cr_status
callback_type: aggregate
requirements:
- whitelist in configuration
short_description: Adds time to play stats
version_added: "2.0"
short_description: Adds status field to CR object
version_added: "1.0"
description:
- This callback just adds total play duration to the play stats.
- This callback module updates the status field in the
HostConfig CR object with each task status.
'''
from ansible.plugins.callback import CallbackBase
class CallbackModule(CallbackBase):
"""
This callback module tells you how long your plays ran for.
This callback module updates the status field in the
HostConfig CR object with each task status.
"""
CALLBACK_VERSION = 2.0
CALLBACK_VERSION = 1.0
CALLBACK_TYPE = 'aggregate'
CALLBACK_NAME = 'hostconfig_k8_cr_status'
CALLBACK_NEEDS_WHITELIST = True
# Initialize the kubernetes api object and class variables
def __init__(self):
super(CallbackModule, self).__init__()
config.load_incluster_config()
self.custom_api_instance = client.CustomObjectsApi()
self.api_instance = client.CoreV1Api()
self.host_config_status = dict()
self.ansible_summary = dict()
self.start_time = datetime.utcnow()
# Defines which tasks status to skip in the CR object status
self.skip_status_tasks = [
"debug", "k8s_status",
"local_action", "set_fact", "k8s_info",
"lineinfile", "include_role", "file", "fail"
]
# Initializing the variable manager and host variables
# Also initializes with the CR object name and namespace
def v2_playbook_on_play_start(self, play):
self.vm = play.get_variable_manager()
self.skip_status_tasks = ["debug", "k8s_status", "local_action", "set_fact", "k8s_info", "lineinfile"]
self.host_vars = self.vm.get_vars()['hostvars']
self.hostConfigName = self.host_vars['localhost']['meta']['name']
self.namespace = self.host_vars['localhost']['meta']['namespace']
# This function is triggered when a certain task fails with
# unreachable state
def runner_on_unreachable(self, host, result):
self.v2_runner_on_unreachable(result)
# This function is triggered when a certain task fails with
# failed state
def runner_on_failed(self, host, result, ignore_errors=False):
self.v2_runner_on_failed(result, ignore_errors=False)
# This function is triggered when a certain task completes with
# ok state
def runner_on_ok(self, host, res):
self.v2_runner_on_ok(res)
def v2_runner_on_failed(self, result, ignore_errors=False):
self.set_host_config_status(result, True)
# This function is triggered when a certain task fails with
# unreachable state in ansible v2 version
def v2_runner_on_unreachable(self, result):
self.set_host_config_status(result, False, True)
return
# This function is triggered when a certain task fails with
# failed state in ansible v2 version
def v2_runner_on_failed(self, result, ignore_errors=False):
if result._task_fields["action"] == "fail":
return
self.set_host_config_status(result, True, False)
return
# This function is triggered when a certain task completes with
# ok state in ansible v2 version
def v2_runner_on_ok(self, result):
hostname = result._host.name
if result._task_fields["action"] in self.skip_status_tasks:
# Even if the task is set skip if the
# "set_cr_status" task variable is defined
# then the task status will be updated in the CR object.
if "vars" in result._task_fields.keys() and \
"set_cr_status" in \
result._task_fields["vars"].keys() and \
result._task_fields["vars"]["set_cr_status"]:
self.set_host_config_status(result)
return
return
self.set_host_config_status(result)
return
def set_host_config_status(self, result, failed=False):
# Builds the hostConfigStatus object, which is used to update the CR
# The hostConfigStatus is updated based on the task status
def set_host_config_status(self, result, failed=False, unreachable=False):
hostname = result._host.name
task_name = result.task_name
task_result = result._result
status = dict()
hostConfigStatus = dict()
host_vars = self.vm.get_vars()['hostvars'][hostname]
host_vars = self.host_vars[hostname]
k8_hostname = ''
if 'kubernetes.io/hostname' in host_vars.keys():
k8_hostname = host_vars['kubernetes.io/hostname']
else:
k8_hostname = hostname
if 'hostConfigStatus' in self.vm.get_vars()['hostvars']['localhost'].keys():
hostConfigStatus = self.vm.get_vars()['hostvars']['localhost']['hostConfigStatus']
if k8_hostname not in hostConfigStatus.keys():
hostConfigStatus[k8_hostname] = dict()
if task_name in hostConfigStatus[k8_hostname].keys():
status[task_name] = hostConfigStatus[k8_hostname][task_name]
if k8_hostname not in self.host_config_status.keys():
self.host_config_status[k8_hostname] = dict()
if task_name in self.host_config_status[k8_hostname].keys():
status[task_name] = self.host_config_status[k8_hostname][task_name]
status[task_name] = dict()
if 'stdout' in task_result.keys() and task_result['stdout'] != '':
status[task_name]['stdout'] = task_result['stdout']
if 'stderr' in task_result.keys() and task_result['stderr'] != '':
status[task_name]['stderr'] = task_result['stderr']
if 'msg' in task_result.keys() and task_result['msg'] != '':
status['msg'] = task_result['msg'].replace('\n', ' ')
if 'results' in task_result.keys() and len(task_result['results']) != 0:
check_keys = ["stdout", "stderr", "msg"]
for key in check_keys:
if key in task_result.keys() and task_result[key] != "":
status[task_name][key] = task_result[key]
# If the task executes in for loop; collecting
# the results of each iteration
if 'results' in task_result.keys() and \
len(task_result['results']) != 0:
status[task_name]['results'] = list()
check_keys_res = [
"stdout", "stderr", "msg",
"module_stdout", "module_stderr", "item"
]
for res in task_result['results']:
stat = dict()
if 'stdout' in res.keys() and res['stdout']:
stat['stdout'] = res['stdout']
if 'stderr' in res.keys() and res['stderr']:
stat['stderr'] = res['stderr']
if 'module_stdout' in res.keys() and res['module_stdout']:
stat['module_stdout'] = res['module_stdout']
if 'module_stderr' in res.keys() and res['module_stderr']:
stat['module_stderr'] = res['module_stderr']
if 'msg' in res.keys() and res['msg']:
stat['msg'] = res['msg'].replace('\n', ' ')
if 'item' in res.keys() and res['item']:
stat['item'] = res['item']
if res['failed']:
for key in check_keys_res:
if key in res.keys() and res[key]:
stat[key] = res[key]
if 'failed' in res.keys() and res['failed']:
stat['status'] = "Failed"
elif 'unreachable' in res.keys() and res['unreachable']:
stat['status'] = "Unreachable"
else:
stat['status'] = "Successful"
stat['stderr'] = ""
stat['module_stderr'] = ""
if "msg" not in stat.keys():
stat['msg'] = ""
if "vars" in result._task_fields.keys() and \
"cr_status_vars" in \
result._task_fields["vars"].keys():
for var in result._task_fields["vars"]["cr_status_vars"]:
if var in res.keys():
stat[var] = res[var]
if "ansible_facts" in res.keys() and \
var in res["ansible_facts"].keys():
stat[var] = res["ansible_facts"][var]
status[task_name]['results'].append(stat)
if failed:
status[task_name]['status'] = "Failed"
elif unreachable:
status[task_name]['status'] = "Unreachable"
else:
status[task_name]['status'] = "Successful"
# As the k8s_status module is merging the current and previous status, if there are any previous failure messages overriding them https://github.com/fabianvf/ansible-k8s-status-module/blob/master/k8s_status.py#L322
status[task_name]['stderr'] = ""
if "msg" not in status[task_name].keys():
status[task_name]['msg'] = ""
hostConfigStatus[k8_hostname].update(status)
self.vm.set_host_variable('localhost', 'hostConfigStatus', hostConfigStatus)
if "vars" in result._task_fields.keys() and \
"cr_status_vars" in result._task_fields["vars"].keys():
for var in result._task_fields["vars"]["cr_status_vars"]:
if var in task_result.keys():
status[var] = task_result[var]
if "ansible_facts" in task_result.keys() and \
var in task_result["ansible_facts"].keys():
status[var] = task_result["ansible_facts"][var]
self.host_config_status[k8_hostname].update(status)
self._display.display(str(status))
return
# Determines the iterations completed and further updates the CR object
# not to schedule further iterations if the reconcile-iterations or
# reconcile-interval is met
def update_reconcile_status(self):
cr_obj = self.get_host_config_cr()
annotations = self.host_vars['localhost']\
['_hostconfig_airshipit_org_hostconfig']\
['metadata']['annotations']
reconcile_status = dict()
if "ansible.operator-sdk/reconcile-period" in annotations.keys():
iterations = 0
pre_iter = None
if 'reconcileStatus' in cr_obj['status'].keys() and \
'completed_iterations' in \
cr_obj['status']['reconcileStatus'].keys():
pre_iter = cr_obj['status']['reconcileStatus']\
['completed_iterations']
# Checks if the reconcile-interval or period is specified
if "ansible.operator-sdk/reconcile-interval" in annotations.keys():
# Calculates the iterations based on the reconcile-interval
# This executes for the very first iteration only
if pre_iter is None:
interval = annotations["ansible.operator-sdk/reconcile-interval"]
period = annotations["ansible.operator-sdk/reconcile-period"]
iterations = self.get_iterations_from_interval(
interval, period)
reconcile_status['total_iterations'] = iterations
elif 'total_iterations' in \
cr_obj['status']['reconcileStatus'].keys():
iterations = cr_obj['status']['reconcileStatus']\
['total_iterations']
else:
reconcile_status['msg'] = "Unable to retrieve "+\
"total iterations to be executed."
return reconcile_status
if isinstance(iterations, str) \
and "greater than or equal to" in iterations:
reconcile_status['msg'] = iterations
return reconcile_status
elif isinstance(iterations, str) and \
"format not specified" in iterations:
reconcile_status['msg'] = iterations
return reconcile_status
elif "ansible.operator-sdk/reconcile-iterations" \
in annotations.keys():
iterations = annotations["ansible.operator-sdk/reconcile-iterations"]
else:
reconcile_status['msg'] = "Reconcile iterations or interval "+\
"not specified. Running simple reconcile."
return reconcile_status
if int(iterations) <= 1:
reconcile_status['msg'] = "Reconcile iterations or interval "+\
"should have iterations more than 1. "+\
"Running simple reconcile."
return reconcile_status
# If any host fails execution the iteration is not counted
if self.check_failed_hosts():
if pre_iter is not None:
reconcile_status['completed_iterations'] = pre_iter
reconcile_status['total_iterations'] = iterations
reconcile_status['msg'] = "One or More Hosts Failed, "+\
"so not considering reconcile."
return reconcile_status
if pre_iter is None:
pre_iter = 0
current_iter = int(pre_iter)+1
reconcile_status['total_iterations'] = iterations
reconcile_status['completed_iterations'] = current_iter
if int(current_iter) == int(iterations)-1:
cr_obj["metadata"]["annotations"]\
["ansible.operator-sdk/reconcile-period"] = "0"
self.custom_api_instance.patch_namespaced_custom_object(
group="hostconfig.airshipit.org",
version="v1alpha1",
plural="hostconfigs",
name=self.hostConfigName,
namespace=self.namespace,
body=cr_obj)
reconcile_status['msg'] = "Running reconcile based "+\
"on reconcile-period. Updated the CR to stop "+\
"running reconcile again."
return reconcile_status
elif int(current_iter) == int(iterations):
del reconcile_status['total_iterations']
del reconcile_status['completed_iterations']
reconcile_status['msg'] = "Running reconcile completed. "+\
"Total iterations completed are "+str(current_iter)
return reconcile_status
reconcile_status['msg'] = "Running reconcile based on "+\
"reconcile-period."
else:
reconcile_status['msg'] = "No reconcile annotations specified."
return reconcile_status
def check_failed_hosts(self):
if len(self.ansible_summary["failures"].keys()) > 0 or \
len(self.ansible_summary["unreachable"].keys()) > 0 or \
len(self.ansible_summary["rescued"].keys()) > 0:
return True
return False
# Determines the reconcile iteration from the reconcile-interval specified
def get_iterations_from_interval(self, interval, period):
endsubstring = ['h', 'm', 's', 'ms', 'us', 'ns']
try:
if not interval.endswith(tuple(endsubstring)) or \
not period.endswith(tuple(endsubstring)):
return "Reconcile parameters format not \
specified appropriately!!"
regex = re.compile(r'((?P<hours>\d+?)h)?((?P<minutes>\d+?)m)?((?P<seconds>\d+?)s)?((?P<millisecond>\d+?)ms)?((?P<microsecond>\d+?)us)?((?P<nanosecond>\d+?)ns)?')
interval_re = regex.match(interval)
period_re = regex.match(period)
period_dict = dict()
interval_dict = dict()
for key, value in period_re.groupdict().items():
if value:
period_dict[key] = int(value)
for key, value in interval_re.groupdict().items():
if value:
interval_dict[key] = int(value)
inter = timedelta(**interval_dict)
peri = timedelta(**period_dict)
if inter.seconds >= peri.seconds:
return int(inter/peri)
else:
return "The reconcile-interval should be greater than or "+\
"equal to reconcile-period!!"
except Exception as e:
return "Reconcile parameters format not specified appropriately!!"
# Calculates the days, hours, minutes and seconds from the execution time
def days_hours_minutes_seconds(self, runtime):
minutes = (runtime.seconds // 60) % 60
r_seconds = runtime.seconds % 60
return runtime.days, runtime.seconds // 3600, minutes, r_seconds
# Computes the execution time taken for the playbook to complete
def execution_time(self, end_time):
runtime = end_time - self.start_time
return "Playbook run took %s days, %s hours, %s minutes, %s seconds" % (self.days_hours_minutes_seconds(runtime))
# Triggered when the playbook execution is completed
def v2_playbook_on_stats(self, stats):
self.playbook_on_stats(stats)
return
# Triggered when the playbook execution is completed
# This function updates the CR object with the tasks
# status and reconcile status
def playbook_on_stats(self, stats):
end_time = datetime.utcnow()
summary_fields = [
"ok", "failures", "dark", "ignored",
"rescued", "skipped", "changed"
]
for field in summary_fields:
stat = stats.__dict__[field]
status = dict()
if 'localhost' in stat.keys():
del stat['localhost']
for key, value in stat.items():
if 'kubernetes.io/hostname' in self.host_vars[key].keys():
status[self.host_vars[key]['kubernetes.io/hostname']] = \
value
else:
status[key] = value
if field == "dark":
self.ansible_summary["unreachable"] = status
else:
self.ansible_summary[field] = status
# Gets the reconcile status for the current execution
reconcile_status = self.update_reconcile_status()
cr_status = dict()
self.ansible_summary["completion_timestamp"] = end_time
self.ansible_summary["execution_time"] = self.execution_time(end_time)
cr_status['ansibleSummary'] = self.ansible_summary
cr_obj = self.get_host_config_cr()
# If the current run has not executed on some hosts,
# carry over their details from the previous iteration's status
if 'hostConfigStatus' in cr_obj['status'].keys():
status = cr_obj['status']['hostConfigStatus']
for key, value in status.items():
if key not in self.host_config_status and key != "localhost":
self.host_config_status[key] = value
cr_obj['status']['hostConfigStatus'] = self.host_config_status
cr_status['reconcileStatus'] = reconcile_status
cr_obj['status'].update(cr_status)
self._display.display("Updating CR Status with below object!!")
self._display.display(str(cr_status))
resp = self.custom_api_instance.\
replace_namespaced_custom_object_status(
group="hostconfig.airshipit.org",
version="v1alpha1",
plural="hostconfigs",
name=self.hostConfigName,
namespace=self.namespace,
body=cr_obj)
self._display.display("Response from KubeAPI server after "+\
"sending status update request")
self._display.display(str(resp))
return
# Returns the HostConfig CR object
# based on the CR object name and namespace
def get_host_config_cr(self):
return self.custom_api_instance.\
get_namespaced_custom_object(
group="hostconfig.airshipit.org",
version="v1alpha1",
namespace=self.namespace,
plural="hostconfigs",
name=self.hostConfigName)


@ -1,57 +1,59 @@
#!/usr/bin/python3
from __future__ import (absolute_import, division, print_function)
__metaclass__ = type
import ast
import itertools
import os
# This plugin calculates the list of list of hosts that need to be executed in
# sequence as given by the host_groups variable. The AND and OR conditions on the
# host_groups variable is calculated based on the match_host_groups variable
# This returns the list of list of hosts that the ansible should execute the playbook on
from kubernetes import client, config
from kubernetes.client.rest import ApiException
# This plugin calculates the list of lists of hosts that need to be executed in
# sequence as given by the host_groups variable.
# The AND and OR conditions on the host_groups variable are calculated
# based on the match_host_groups variable.
# This returns the list of lists of hosts on which
# ansible should execute the playbook.
# Returns: [[192.168.1.12, 192.168.1.11], [192.168.1.14], [192.168.1.5]]
# Returns list of keys and values defined in host_groups
# of the CR object
def host_groups_get_keys(host_groups):
keys = list()
values = list()
for hg in host_groups:
keys.append(hg['name'])
values.append(hg['values'])
print(keys)
print(values)
return keys, values
# Performs a permutation and combination among all the
# host_groups specified in the CR object
def host_groups_combinations(host_groups):
keys, values = host_groups_get_keys(host_groups)
for instance in itertools.product(*values):
yield dict(zip(keys, instance))
def removeSuccessHosts(hostGroups, hostConfigName):
filename = '/opt/ansible/data/hostconfig/'+hostConfigName+'/success_hosts'
print(filename)
if os.path.isfile(filename):
hosts = list()
with open(filename) as f:
hosts = [line.rstrip() for line in f]
print(hosts)
for host in hosts:
for hostGroup in hostGroups:
if host in hostGroup:
hostGroup.remove(host)
print(hostGroups)
return hostGroups
def hostconfig_host_groups(host_groups, groups, hostConfigName, match_host_groups, reexecute):
# Determines the hosts on which the execution has to happen
# based on the host_groups defined in the CR object
def hostconfig_host_groups(
host_groups, groups, match_host_groups
):
host_groups_list = list()
host_group_list = list()
if type(host_groups) != list:
return ''
# Performs an AND operation to select the hosts
# which match all of the host_group labels specified
if match_host_groups:
hgs_list = list()
for host_group in host_groups_combinations(host_groups):
hg = list()
for k,v in host_group.items():
for k, v in host_group.items():
hg.append(k+'_'+v)
hgs_list.append(hg)
for hgs in hgs_list:
@ -59,6 +61,8 @@ def hostconfig_host_groups(host_groups, groups, hostConfigName, match_host_group
for i in range(1, len(hgs)):
host_group = list(set(host_group) & set(groups[hgs[i]]))
host_groups_list.append(host_group)
# Performs an OR operation to select the hosts which match
# any one of the host_group labels specified
else:
for host_group in host_groups:
for value in host_group["values"]:
@ -69,13 +73,15 @@ def hostconfig_host_groups(host_groups, groups, hostConfigName, match_host_group
hg = groups[key+'_'+value]
host_group_list = hg.copy()
else:
hg = list((set(groups[key+'_'+value])) - (set(host_group_list) & set(groups[key+'_'+value])))
hg = list(
(set(groups[key+'_'+value]))
- (set(host_group_list) &
set(groups[key+'_'+value]))
)
host_group_list.extend(hg)
host_groups_list.append(hg)
else:
return "Invalid Host Groups "+key+" and "+value
if not reexecute:
return str(removeSuccessHosts(host_groups_list, hostConfigName))
return str(host_groups_list)
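
To make the AND/OR selection described in the comments above concrete, here is a simplified, self-contained sketch of the same idea. It is not the plugin code (the real filter returns the structure as a string and handles success-host bookkeeping); the inventory group names and IP addresses are made-up examples, and the sketch assumes inventory groups are keyed as `<label>_<value>` as in the calling playbook.

```
#!/usr/bin/python3
# Simplified sketch of the AND/OR host selection -- not the plugin itself.
# Group names and IPs below are made-up examples.
import itertools

# Ansible groups built from node labels, keyed as "<label>_<value>" -> hosts
groups = {
    "kubernetes.io/role_master": ["192.168.1.11", "192.168.1.12"],
    "kubernetes.io/role_worker": ["192.168.1.14", "192.168.1.15"],
    "topology.kubernetes.io/zone_us-east-1a": ["192.168.1.11", "192.168.1.14"],
}

host_groups = [
    {"name": "kubernetes.io/role", "values": ["master", "worker"]},
    {"name": "topology.kubernetes.io/zone", "values": ["us-east-1a"]},
]

def select(host_groups, groups, match_host_groups):
    result = []
    if match_host_groups:  # AND: intersect one group per label/value combination
        keys = [hg["name"] for hg in host_groups]
        values = [hg["values"] for hg in host_groups]
        for combo in itertools.product(*values):
            sets = [set(groups[k + "_" + v]) for k, v in zip(keys, combo)]
            result.append(sorted(set.intersection(*sets)))
    else:                  # OR: union of every label/value group, de-duplicated
        seen = set()
        for hg in host_groups:
            for value in hg["values"]:
                hosts = [h for h in groups[hg["name"] + "_" + value] if h not in seen]
                seen.update(hosts)
                result.append(hosts)
    return result

print(select(host_groups, groups, match_host_groups=True))
# [['192.168.1.11'], ['192.168.1.14']]
print(select(host_groups, groups, match_host_groups=False))
# [['192.168.1.11', '192.168.1.12'], ['192.168.1.14', '192.168.1.15'], []]
```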

View File

@ -1,5 +1,5 @@
#!/usr/bin/python3
from __future__ import (absolute_import, division, print_function)
__metaclass__ = type
@ -7,6 +7,7 @@ __metaclass__ = type
# that is accepted by the ansible playbook for execution
# Returns: [192.168.1.12, 192.168.1.11, 192.168.1.14, 192.168.1.5]
def hostconfig_host_groups_to_list(hostconfig_host_groups):
host_groups_list = list()
if type(hostconfig_host_groups) != list:
@ -17,7 +18,8 @@ def hostconfig_host_groups_to_list(hostconfig_host_groups):
class FilterModule(object):
''' Fake test plugin for ansible-operator '''
''' Plugin to convert a list of lists into a flat list of \
strings for ansible-operator '''
def filters(self):
return {

View File

@ -3,40 +3,44 @@
from __future__ import (absolute_import, division, print_function)
__metaclass__ = type
# Futher divides the host_config_serial variable into a new list
# so that for each iteration there will be not more than the
# This plugin further divides the host_config_serial variable into a new list
# so that for each iteration there will be not more than the
# max_hosts_parallel(int variable) number of hosts executing
# If we have 3 masters and 5 workers, the labels sent are masters and workers
# and max_hosts_parallel is 2
# Returns: [2, 2, 2, 2] if sequential is false
# Returns: [2, 1, 2, 2, 1] if sequential is true
def hostconfig_max_hosts_parallel(max_hosts_parallel, hostconfig_host_groups, sequential=False):
def hostconfig_max_hosts_parallel(
max_hosts_parallel, hostconfig_host_groups, sequential=False):
parallel_list = list()
if type(max_hosts_parallel) != int and type(hostconfig_host_groups) != list and (sequential) != bool:
if type(max_hosts_parallel) != int and \
type(hostconfig_host_groups) != list and (sequential) != bool:
return ''
if sequential:
for hg in hostconfig_host_groups:
length = len(hg)
parallel_list += int(length/max_hosts_parallel) * [max_hosts_parallel]
if length%max_hosts_parallel != 0:
parallel_list.append(length%max_hosts_parallel)
parallel_list += (int(length/max_hosts_parallel)
* [max_hosts_parallel])
if length % max_hosts_parallel != 0:
parallel_list.append(length % max_hosts_parallel)
else:
hgs = list()
for hg in hostconfig_host_groups:
hgs.extend(hg)
length = len(hgs)
parallel_list += int(length/max_hosts_parallel) * [max_hosts_parallel]
if length%max_hosts_parallel != 0:
parallel_list.append(length%max_hosts_parallel)
if length % max_hosts_parallel != 0:
parallel_list.append(length % max_hosts_parallel)
return str(parallel_list)
class FilterModule(object):
''' HostConfig Max Hosts in Parallel plugin for ansible-operator to calucate the ansible serial variable '''
''' HostConfig Max Hosts in Parallel plugin for ansible-operator to \
calculate the ansible serial variable '''
def filters(self):
return {
'hostconfig_max_hosts_parallel': hostconfig_max_hosts_parallel
}
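
As a quick check of the `[2, 2, 2, 2]` / `[2, 1, 2, 2, 1]` example in the comment above, a small standalone sketch of the same splitting rule (not the plugin code itself); the host IPs are made-up placeholders.

```
#!/usr/bin/python3
# Standalone sketch of the serial-splitting rule: 3 masters, 5 workers,
# max_hosts_parallel=2. Host IPs are made-up examples.
def split(max_hosts_parallel, host_groups, sequential=False):
    serial = []
    # sequential: split each host group on its own; otherwise flatten first
    chunks = host_groups if sequential else [[h for g in host_groups for h in g]]
    for group in chunks:
        full, rest = divmod(len(group), max_hosts_parallel)
        serial += full * [max_hosts_parallel]
        if rest:
            serial.append(rest)
    return serial

masters = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
workers = ["10.0.0.4", "10.0.0.5", "10.0.0.6", "10.0.0.7", "10.0.0.8"]

print(split(2, [masters, workers], sequential=False))  # [2, 2, 2, 2]
print(split(2, [masters, workers], sequential=True))   # [2, 1, 2, 2, 1]
```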

View File

@ -7,6 +7,7 @@ __metaclass__ = type
# Interested Groups are defined using the host_groups
# Returns a list of integers [2, 1, 3] based on the host_groups variables
def hostconfig_sequential(hostconfig_host_groups, groups):
seq_list = list()
if type(hostconfig_host_groups) != list:

View File

@ -1,3 +1,4 @@
# Install the kubernetes and operator_sdk ansible collections
collections:
- community.kubernetes
- operator_sdk.util

View File

@ -0,0 +1,15 @@
---
# To execute kubeadm commands on the kubernetes nodes
# A list of commands can be specified
- name: Execute kubeadm commands
shell: "/usr/bin/kubeadm {{ kubeadm_item.command }}"
with_items: "{{ config.kubeadm }}"
when: config.kubeadm is defined
loop_control:
loop_var: kubeadm_item
become: yes
register: kubeadm_output
- name: kubeadm output
debug: msg={{ kubeadm_output }}
when: kubeadm_output is defined

View File

@ -0,0 +1,26 @@
---
# To check the kubernetes components certificate expiration details
# Will also annotate the node with the expiration details
- name: Execute kubeadm cert expiration command
shell: >
kubeadm alpha certs check-expiration | tail -n +4 | sed -r "/^\s*$/d"|grep -v "CERTIFICATE" | awk '{printf("{ %s: %s %s %s %s %s },", $1, $2, $3, $4, $5, $6)}' | sed "s/.$//"
when: config.kubeadm_check_cert_expiration is defined and config.kubeadm_check_cert_expiration is true
become: yes
register: kubeadm_output
- name: kubeadm output
debug: msg={{ kubeadm_output }}
when: kubeadm_output is defined
- name: Annotate kubernetes nodes
delegate_to: localhost
k8s_raw:
state: present
definition:
apiVersion: v1
kind: Node
metadata:
name: "{{ lookup('vars', 'kubernetes.io/hostname') }}"
annotations:
cert-expiration: "{{ kubeadm_output.stdout }}"
when: kubeadm_output.stdout is defined

View File

@ -1,17 +1,23 @@
---
# The below blocak of code helps in intializing the hosts, serial variables
# The below block of code helps in initializing the hosts, serial variables
# that would be used by the ansible playbook to control sequential or parallel execution
- name: Host Groups
block:
- set_fact:
reexecute: false
when: reexecute is undefined
- set_fact:
match_host_groups: false
when: match_host_groups is undefined
- set_fact:
sequential: false
when: sequential is undefined
- set_fact:
stop_on_failure: false
when: stop_on_failure is undefined
- set_fact:
max_failure_percentage: 0
when: max_failure_percentage is undefined
- debug:
msg: "{{ config }}"
when: config is defined
# This hostconfig_host_groups custom filter plugin helps in computing the AND or OR
# operation on the host_groups labels passed through the CR object.
# The AND and OR operation is controlled using the match_host_groups variable.
@ -19,7 +25,7 @@
# every iteration.
# Returns: [[192.168.1.5, 192.168.1.3], [192.168.1.4]]
- set_fact:
hostconfig_host_groups: "{{ host_groups|hostconfig_host_groups(groups, meta.name, match_host_groups, reexecute) }}"
hostconfig_host_groups: "{{ host_groups|hostconfig_host_groups(groups, match_host_groups) }}"
- debug:
msg: "Host Groups Variable {{ hostconfig_host_groups }}"
# The hostconfig_serial custom filter plugin helps in calculating the list of hosts
@ -59,16 +65,3 @@
- debug:
msg: "{{ hostconfig_serial_variable }}"
when: host_groups is undefined and max_hosts_parallel is defined
- name: Failure Testing
block:
# This block of max_failure_percentage helps in intaializing default value
# to the max_failure_percentage variable so that the below play would be selected
# appropriately
- set_fact:
max_failure_percentage: "{{ max_failure_percentage }}"
when: max_failure_percentage is defined
# Please note we are just setting some default value to the max_failure_percentage
# so that we can check the conditions below
- set_fact:
max_failure_percentage: 100
when: max_failure_percentage is undefined

View File

@ -0,0 +1,13 @@
---
# To execute shell commands on the kubernetes nodes
# List of commands can be specified
- name: Execute shell command on nodes
shell: "{{ shell_item.command }}"
with_items: "{{ config.shell }}"
loop_control:
loop_var: shell_item
register: shell_output
- name: shell output
debug: msg={{ shell_output }}
when: shell_output is defined

View File

@ -1,4 +1,6 @@
---
# Configures the array of sysctl configuration provided in the CR
# object on the corresponding kubernetes node selected.
- name: sysctl configuration
sysctl:
name: "{{ item.name }}"

View File

@ -1,4 +1,6 @@
---
# Configures ulimit on the selected kubernetes nodes
# Accepts a list of ulimit configurations
- name: ulimit configuration
pam_limits:
domain: "{{ item.user }}"

View File

@ -1,72 +0,0 @@
#!/bin/bash
RELEASE_VERSION=v0.8.0
AIRSHIP_PATH=airship-host-config/airship-host-config
IMAGE_NAME=airship-host-config
install_operator_sdk(){
echo "Installing Operator-SDK to build image"
curl -OJL https://github.com/operator-framework/operator-sdk/releases/download/${RELEASE_VERSION}/operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu
chmod +x operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu
sudo mv operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu /usr/local/bin/operator-sdk
}
build_host_config_image(){
echo "Building Airship host-config Ansible operator Image"
cd $HOME/$AIRSHIP_PATH
operator-sdk build $IMAGE_NAME
}
get_worker_ips(){
echo >&2 "Getting other master and worker node IPs to copy Airship host-config Ansible Operator Image"
IP_ADDR=`ifconfig enp0s8 | grep Mask | awk '{print $2}'| cut -f2 -d:`
worker_node_ips=`kubectl get nodes -o wide | grep -v $IP_ADDR | awk '{print $6}' | sed -e '1d'`
echo $worker_node_ips
}
save_and_load_docker_image(){
cd $HOME/$AIRSHIP_PATH
echo "Saving Airship host-config Ansible Operator Image so that it would be copied to other worker nodes"
docker save $IMAGE_NAME -o $IMAGE_NAME
worker_node_ips=$(get_worker_ips)
echo "Copying Image to following worker Nodes"
echo $worker_node_ips
touch $HOME/hello
for i in $worker_node_ips
do
sshpass -p "vagrant" scp -o StrictHostKeyChecking=no $IMAGE_NAME vagrant@$i:~/.
sshpass -p "vagrant" ssh vagrant@$i docker load -i $IMAGE_NAME
sshpass -p "vagrant" ssh vagrant@$i touch hello
done
}
get_username_password(){
if [ -z "$1" ]; then USERNAME="vagrant"; else USERNAME=$1; fi
if [ -z "$2" ]; then PASSWORD="vagrant"; else PASSWORD=$2; fi
echo $USERNAME $PASSWORD
}
deploy_airship_ansible_operator(){
read USERNAME PASSWORD < <(get_username_password $1 $2)
echo "Setting up Airship host-config Ansible operator"
echo "Using Username: $USERNAME and Password: $PASSWORD of K8 nodes for host-config pod setup"
sed -i "s/AIRSHIP_HOSTCONFIG_IMAGE/$IMAGE_NAME/g" $HOME/$AIRSHIP_PATH/deploy/operator.yaml
sed -i "s/PULL_POLICY/IfNotPresent/g" $HOME/$AIRSHIP_PATH/deploy/operator.yaml
sed -i "s/USERNAME/$USERNAME/g" $HOME/$AIRSHIP_PATH/deploy/operator.yaml
sed -i "s/PASSWORD/$PASSWORD/g" $HOME/$AIRSHIP_PATH/deploy/operator.yaml
kubectl apply -f $HOME/$AIRSHIP_PATH/deploy/crds/hostconfig.airshipit.org_hostconfigs_crd.yaml
kubectl apply -f $HOME/$AIRSHIP_PATH/deploy/role.yaml
kubectl apply -f $HOME/$AIRSHIP_PATH/deploy/service_account.yaml
kubectl apply -f $HOME/$AIRSHIP_PATH/deploy/role_binding.yaml
kubectl apply -f $HOME/$AIRSHIP_PATH/deploy/cluster_role_binding.yaml
kubectl apply -f $HOME/$AIRSHIP_PATH/deploy/operator.yaml
}
configure_github_host_config_setup(){
install_operator_sdk
build_host_config_image
save_and_load_docker_image
deploy_airship_ansible_operator $1 $2
}
configure_github_host_config_setup $1 $2

View File

@ -2,7 +2,4 @@
- version: v1alpha1
group: hostconfig.airshipit.org
kind: HostConfig
playbook: playbooks/create_playbook.yaml
finalizer:
name: finalizer.hostconfig.airshipit.org
playbook: playbooks/delete_playbook.yaml
playbook: playbooks/create_playbook.yaml # Main playbook that gets executed whenever a new HostConfig CR object is created or updated

View File

@ -0,0 +1,18 @@
# This example executes the shell and kubeadm commands
# on all the nodes labelled as master nodes in the
# kubernetes cluster
apiVersion: hostconfig.airshipit.org/v1alpha1
kind: HostConfig
metadata:
name: example1
spec:
host_groups:
- name: "kubernetes.io/role"
values:
- "master"
config:
shell:
- command: "date;hostname"
kubeadm:
- command: "alpha certs check-expiration"

View File

@ -0,0 +1,29 @@
# This example executes the shell command on the nodes
# which match the below conditions. Please note that the execution
# on all the nodes happens in parallel as "sequential" is false
#
# Only one iteration: Nodes matching labels:
# 1. "us-east-1a" and "master"
# 2. "us-east-1a" and "worker"
# 3. "us-east-1b" and "master"
# 4. "us-east-1b" and "worker"
apiVersion: hostconfig.airshipit.org/v1alpha1
kind: HostConfig
metadata:
name: example3
spec:
host_groups:
- name: "topology.kubernetes.io/zone"
values:
- "us-east-1a"
- "us-east-1b"
- name: "kubernetes.io/role"
values:
- "master"
- "worker"
sequential: false
match_host_groups: true
config:
shell:
- command: "date;hostname"

View File

@ -0,0 +1,21 @@
# This HostConfig CR object executes the shell and kubeadm
# commands on the master nodes in the cluster.
# Execution stops for all nodes at the task where
# the percentage of failed nodes exceeds 30%.
apiVersion: hostconfig.airshipit.org/v1alpha1
kind: HostConfig
metadata:
name: example5
spec:
host_groups:
- name: "kubernetes.io/role"
values:
- "master"
stop_on_failure: false
max_failure_percentage: 30
config:
shell:
- command: "date;hostname"
kubeadm:
- command: "alpha certs check-expiration"

View File

@ -1,12 +1,18 @@
# This HostConfig CR object when created executes
# the shell command on the master and worker nodes
# scheduling no more than 2 nodes per iteration.
apiVersion: hostconfig.airshipit.org/v1alpha1
kind: HostConfig
metadata:
name: example7
spec:
# Add fields here
host_groups:
- name: "kubernetes.io/role"
values:
- "master"
- "worker"
max_hosts_parallel: 2
config:
shell:
- command: "date;hostname;sleep 5"

View File

@ -0,0 +1,20 @@
# This example CR when created executes the shell and
# kubeadm commands on the nodes labelled as master
# every 30s.
apiVersion: hostconfig.airshipit.org/v1alpha1
kind: HostConfig
metadata:
annotations:
ansible.operator-sdk/reconcile-period: "30s"
name: example9
spec:
host_groups:
- name: "kubernetes.io/role"
values:
- "master"
config:
shell:
- command: "date;hostname"
kubeadm:
- command: "alpha certs check-expiration"

View File

@ -0,0 +1,19 @@
# This HostConfig CR executes the kubeadm command on the
# master nodes every 30s for an interval of 2m40s (160s).
# This covers 5 iterations (five 30s periods fit in 160s).
apiVersion: hostconfig.airshipit.org/v1alpha1
kind: HostConfig
metadata:
annotations:
ansible.operator-sdk/reconcile-period: "30s"
ansible.operator-sdk/reconcile-interval: "2m40s"
name: example11
spec:
host_groups:
- name: "kubernetes.io/role"
values:
- "master"
config:
kubeadm:
- command: "alpha certs check-expiration"

View File

@ -0,0 +1,21 @@
# This CR executes the shell and kubeadm commands on the
# master nodes every 30s. The execution happens
# 3 times.
apiVersion: hostconfig.airshipit.org/v1alpha1
kind: HostConfig
metadata:
annotations:
ansible.operator-sdk/reconcile-period: "30s"
ansible.operator-sdk/reconcile-iterations: "3"
name: example10
spec:
host_groups:
- name: "kubernetes.io/role"
values:
- "master"
config:
shell:
- command: "date;hostname"
kubeadm:
- command: "alpha certs check-expiration"

View File

@ -1,12 +1,18 @@
# In this example the shell command is executed
# on all the master nodes first and then on all
# the worker nodes in the second iteration.
apiVersion: hostconfig.airshipit.org/v1alpha1
kind: HostConfig
metadata:
name: example2
spec:
# Add fields here
host_groups:
- name: "kubernetes.io/role"
values:
- "master"
- "worker"
sequential: true
config:
shell:
- command: "date;hostname;sleep 5"

View File

@ -0,0 +1,32 @@
# This example HostConfig CR when created executes
# the shell command on the nodes which match the
# below labels in a sequence.
#
# Iteration 1: nodes with labels "us-east-1a" and "master" nodes
# Iteration 2: nodes with labels "us-east-1a" and "worker" nodes
# Iteration 3: nodes with labels "us-east-1b" and "master" nodes
# Iteration 4: nodes with labels "us-east-1b" and "worker" nodes
#
# Please note that the nodes which have already executed in a previous
# iteration will not be scheduled again in the next iteration even
# if the labels match.
apiVersion: hostconfig.airshipit.org/v1alpha1
kind: HostConfig
metadata:
name: example4
spec:
host_groups:
- name: "topology.kubernetes.io/zone"
values:
- "us-east-1a"
- "us-east-1b"
- name: "kubernetes.io/role"
values:
- "master"
- "worker"
sequential: true
match_host_groups: true
config:
shell:
- command: "date;hostname;sleep 5"

View File

@ -0,0 +1,20 @@
# This CR executes the shell and kubeadm commands
# on the master nodes and stops execution on all the
# nodes whenever a master node fails at any task
# as part of the execution.
apiVersion: hostconfig.airshipit.org/v1alpha1
kind: HostConfig
metadata:
name: example6
spec:
host_groups:
- name: "kubernetes.io/role"
values:
- "master"
stop_on_failure: true
config:
shell:
- command: "date;hostname"
kubeadm:
- command: "alpha certs check-expiration"

View File

@ -1,15 +1,15 @@
# This CR when executed configures the passed sysctl and ulimit
# configuration on the kubernetes master nodes.
apiVersion: hostconfig.airshipit.org/v1alpha1
kind: HostConfig
metadata:
name: example8
spec:
# Add fields here
host_groups:
- name: "kubernetes.io/role"
values:
- "master"
sequential: false
reexecute: false
config:
sysctl:
- name: "net.ipv6.route.gc_interval"
@ -17,7 +17,7 @@ spec:
- name: "net.netfilter.nf_conntrack_frag6_timeout"
value: "120"
ulimit:
- user: "sirisha"
- user: "vagrant"
type: "hard"
item: "cpu"
value: "unlimited"

BIN docs/Node_resiliency.png Normal file
Binary image (356 KiB) not shown.

View File

@ -2,35 +2,57 @@
## Overview
An ansible based operator for performing host configuration LCM operations on Kubernetes Nodes. It is built to perform the configuration on kubernetes nodes after the intial kubernetes setup is done on the nodes. It is managed by the cluster itself.
An ansible based operator to perform host configuration LCM operations
on Kubernetes Nodes. It is built to execute the required configuration
on kubernetes nodes after the initial kubernetes setup is done on the nodes.
The application is deployed as a pod on the existing cluster itself.
Current implementation have been tested with running one replica of the hostconfig operator deployment on one of the master node in the kubernetes setup.
The current implementation has been tested running three replicas
of the hostconfig operator deployment launched on different master nodes
in the kubernetes setup.
Once the hostconfig operator is deployed and the corresponding CRD is created on the kubernetes cluster, we can then create the HostConfig CR objects to perform the required configuration on the nodes.
Once the hostconfig operator is deployed and the corresponding CRD is
created on the kubernetes cluster, we can then create the HostConfig CR objects
to perform the required configuration on the nodes.
The host configuration on the kubernetes nodes is done by executing the appropriate ansible playbook on that Kubernetes node by the hostconfig operator pod.
The host configuration on the kubernetes nodes is done by executing the
appropriate ansible playbook on that Kubernetes node by the
hostconfig operator pod.
## Scope and Features
* Perform host configuration LCM operations on Kubernetes hosts
* LCM operations managed using HostConfig CR objects
* Inventory built dynamically, at the time of playbook execution
* Connects to hosts using the secrets associated with the nodes, which have the ssh keys associated in them.
* Supports execution based on host-groups, which are built based out of labels associated with kubernetes nodes
* Connects to hosts using the secrets associated with the nodes, which contain
the ssh username and private key
* Supports execution based on host-groups, which are built based out of labels
associated with kubernetes nodes
* Supports serial/parallel execution of configuration on hosts
* Supports host selection with AND and OR operations of the labels mentioned in the host-groups of the CR object
* Reconcile on failed nodes, based on reconcile period - feature available from ansible-operator
* Current support is available to perform `sysctl` and `ulimit` operations on the kubernetes nodes
* WIP: Display the status of each Hostconfig CR object as part of the `kubectl describe hostconfig <name>`
* Supports host selection with AND and OR operations of the labels mentioned
in the host-groups of the CR object
* Reconcile on failed nodes, based on reconcile period - feature available
from ansible-operator
* Current support is available to perform `sysctl` and `ulimit` operations
on the kubernetes nodes, as well as any shell command that needs to be
executed on the nodes.
* Display the status of each Hostconfig CR object as part of the
`kubectl describe hostconfig <name>`
* We have also added an ansible role to execute the
"kubeadm alpha certs check-expiration" command and annotate the nodes
with the expiration details.
## Architecture
![](Deployment_Architecture.png)
Hostconfig operator will be running as a kubernetes deployment on the target kubernetes cluster.
Hostconfig operator will be running as a kubernetes deployment on the target
kubernetes cluster.
This repository also have vagrants scripts to build kubernetes cluster on the Vagrant VMs and has to deploy and configure the hostconfig-operator pod on the K8 setup.
This repository also has vagrant scripts to build a kubernetes cluster on
Vagrant VMs and to deploy and configure the hostconfig-operator pod
on the K8s setup.
## Deployment and Host Configuration Flow
@ -47,7 +69,9 @@ Using operator pod to perform host configuration on kubernetes nodes
**Pre-requisite:**
1. The Kubernetes nodes should be labelled with any one of the below label to execute based on host-groups, if not labelled by default executes on all the nodes as no selection happens.
1. The Kubernetes nodes should be labelled with any one of the below labels
to execute based on host-groups; if not labelled, by default the execution
happens on all the nodes as no selection takes place.
Valid labels:
* [`topology.kubernetes.io/region`](https://kubernetes.io/docs/reference/kubernetes-api/labels-annotations-taints/#topologykubernetesioregion)
* [`topology.kubernetes.io/zone`](https://kubernetes.io/docs/reference/kubernetes-api/labels-annotations-taints/#topologykubernetesiozone)
@ -58,34 +82,41 @@ Using operator pod to perform host configuration on kubernetes nodes
2. **Operator pod connecting to Kubernetes Nodes:**
The kubernetes nodes should be annotated with secret name having the username and private key as part of the contents.
The kubernetes nodes should be annotated with the secret name, where
the secret contents have the username and private key.
git clone the hostconfig repository
git clone the hostconfig repository
`git clone https://github.com/SirishaGopigiri/airship-host-config.git`
`git clone https://opendev.org/airship/hostconfig-operator.git`
Move to airship-host-config directory
Move to airship-host-config directory
`cd airship-host-config/airship-host-config`
`cd hostconfig-operator/airship-host-config`
Create a HostConfig CRD
Create a HostConfig CRD
`kubectl create -f deploy/crds/hostconfig.airshipit.org_hostconfigs_crd.yaml`
`kubectl create -f deploy/crds/hostconfig.airshipit.org_hostconfigs_crd.yaml`
Create hostconfig role, service account, role-binding and cluster-role-binding which is used to deploy and manage the operations done using the hostconfig operator pod
Create hostconfig role, service account, role-binding and
cluster-role-binding which is used to deploy and manage the operations done
using the hostconfig operator pod
`kubectl create -f deploy/role.yaml`
`kubectl create -f deploy/service_account.yaml`
`kubectl create -f deploy/role_binding.yaml`
`kubectl create -f deploy/cluster_role_binding.yaml`
`kubectl create -f deploy/role.yaml`
`kubectl create -f deploy/service_account.yaml`
`kubectl create -f deploy/role_binding.yaml`
`kubectl create -f deploy/cluster_role_binding.yaml`
Now deploy the hostconfig operator pod
Now deploy the hostconfig operator pod
`kubectl create -f deploy/operator.yaml`
`kubectl create -f deploy/operator.yaml`
Once the hostconfig operator pod is deployed, we can create the desired HostConfig CR with the required configuration. And this CR can be passed to the operator pod which performs the required operation.
Once the hostconfig operator pod is deployed, we can create the desired
HostConfig CR with the required configuration. And this CR is passed to the
operator pod which performs the required operation.
Some example CRs are available in the demo_examples directory.
Some example CRs are available in the demo_examples directory.
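
As an illustration, a minimal Python sketch of applying one of those example CRs programmatically, assuming the `kubernetes` and `PyYAML` packages, a working kubeconfig, that it is run from the repository root, and that the `default` namespace is used; it is roughly equivalent to `kubectl create -f demo_examples/example_host_groups.yaml`.

```
#!/usr/bin/python3
# Sketch: apply a demo_examples HostConfig CR with the kubernetes Python client.
# The "default" namespace is an assumption.
import yaml
from kubernetes import client, config

config.load_kube_config()
api = client.CustomObjectsApi()

with open("demo_examples/example_host_groups.yaml") as f:
    cr = yaml.safe_load(f)

api.create_namespaced_custom_object(
    group="hostconfig.airshipit.org",
    version="v1alpha1",
    namespace="default",
    plural="hostconfigs",
    body=cr)
```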
Please refer to README.md file for more detailed steps.
## References

78
docs/Resiliency.md Normal file
View File

@ -0,0 +1,78 @@
## HostConfig-Operator tested with Node Resiliency
Tested hostconfig-operator in HA mode, including the case where the node
on which the leader pod is running goes down. Here are the
scenarios tested and the configuration used.
### Topology and Configuration Details
Launched hostconfig-operator with 3 replicas on a
kubernetes cluster having 3 master and 5 worker nodes.
The deployment has a node anti-affinity rule and tolerations
so that each pod is launched on a different master node.
So each of the 3 masters has one replica of the hostconfig-operator pod
running on it.
Please refer to operator.yaml file for deployment details.
Once all the pods come to running state, we can check
the pod logs to see which pod has been elected as the leader.
### Scenarios tested for Node Resiliency
Once the pods are up and running and the leader pod is elected,
applied a sample CR object (demo_examples/example_host_groups.yaml).
This CR object is executed by the leader pod.
After the above CR executed successfully, simulated a
failure scenario on a couple of nodes and applied a new CR object.
This CR object keeps failing, and the hostconfig-operator pod keeps
trying to reconcile it until it is successful.
Now drained the kubernetes node on which the leader pod is running,
having applied a new CR object just before draining the node.
Once the node is drained and the pod is deleted, a new pod is
elected as leader. The third replica pod which tries to come up
stays in pending state as there are no more master nodes
available to schedule it.
The newly elected leader reads all the HostConfig CR
objects present in the cluster and executes each CR sequentially.
If any CR fails to execute, the pod keeps reconciling that CR
until it is successful.
#### Scenarios
1. Launch a HA setup with 3 pods running on different masters.
Create a new CR and validate the execution by the leader pod.
Check the status of the hc (kubectl get hc example1 -o json).
Now delete the leader node (cordoned).
Behavior: The CR is executed again by the new HostConfig leader pod.
2. Launch a HA setup, create a CR and, when the CR has started
executing, delete the leader node (cordoned). Then check whether the
new leader executes the CR again.
Behavior: The new pod re-executes the CR from the start and stops
when all nodes have executed successfully.
3. Launch a HA setup, apply a CR that succeeds and, once it completes,
apply a new CR (e.g. sysctl-ulimit) such that the second CR succeeds on a few
nodes and fails on the remaining nodes. While the pod keeps trying to
execute on all the nodes, delete the leader pod and uncordon the node.
Behavior: The new leader picks up each CR object and re-executes it until it
succeeds. The second CR which was failing also gets executed again, and
the new pod keeps re-attempting that CR until it is successful.
4. When the number of hostconfig-operator replicas is greater than the number
of master nodes, the extra pods should be in pending state as only one pod
runs per node.
Behavior: The pod is expected to stay in pending state, waiting for a node
to become available.
5. Multiple CRs are applied continuously.
Behavior: Tested with 20 HostConfig CRs, which are applied sequentially.
Video demonstrating the node resiliency behaviour of the HostConfig-Operator pod:
It covers all the above scenarios consolidated.
[![Alt text](Node_resiliency.png)
](https://drive.google.com/file/d/1lwA_Zqrax0ECc0K2b2BKTj3eCMYVmbqJ/view?usp=sharing)

View File

@ -1,81 +0,0 @@
# Kubernetes cluster
A vagrant script for setting up a Kubernetes cluster using Kubeadm
## Pre-requisites
* **[Vagrant 2.1.4+](https://www.vagrantup.com)**
* **[Virtualbox 5.2.18+](https://www.virtualbox.org)**
## How to Run
Git clone the repo on the host machine which has vagrant and virtual box installed
```
git clone https://github.com/SirishaGopigiri/airship-host-config.git
```
Navigate to the kubernetes folder
```
cd airship-host-config/kubernetes/
```
Execute the following vagrant command to start a new Kubernetes cluster, this will start three master and five nodes:
```
vagrant up
```
You can also start individual machines by vagrant up k8s-head, vagrant up k8s-node-1 and vagrant up k8s-node-2
If you would need more master nodes, you can edit the servers array in the Vagrantfile. Please change the name, and IP address for eth1.
```
servers = [
{
:name => "k8s-master-1",
:type => "master",
:box => "ubuntu/xenial64",
:box_version => "20180831.0.0",
:eth1 => "192.168.205.10",
:mem => "2048",
:cpu => "2"
}
]
```
Also update the haproxy.cfg file to add more master servers.
```
balance roundrobin
server k8s-api-1 192.168.205.10:6443 check
server k8s-api-2 192.168.205.11:6443 check
server k8s-api-3 192.168.205.12:6443 check
server k8s-api-4 <ip:port> check
```
If more than five nodes are required, you can edit the servers array in the Vagrantfile. Please change the name, and IP address for eth1.
```
servers = [
{
:name => "k8s-node-3",
:type => "node",
:box => "ubuntu/xenial64",
:box_version => "20180831.0.0",
:eth1 => "192.168.205.14",
:mem => "2048",
:cpu => "2"
}
]
```
As you can see above, you can also configure IP address, memory and CPU in the servers array.
## Clean-up
Execute the following command to remove the virtual machines created for the Kubernetes cluster.
```
vagrant destroy -f
```
You can destroy individual machines by vagrant destroy k8s-node-1 -f

220
kubernetes/Vagrantfile vendored
View File

@ -1,220 +0,0 @@
# -*- mode: ruby -*-
# vi: set ft=ruby :
servers = [
{
:name => "k8s-lbhaproxy",
:type => "lbhaproxy",
:box => "ubuntu/xenial64",
:box_version => "20180831.0.0",
:eth1 => "192.168.205.13",
:mem => "2048",
:cpu => "2"
},
{
:name => "k8s-master-1",
:type => "master",
:box => "ubuntu/xenial64",
:box_version => "20180831.0.0",
:eth1 => "192.168.205.10",
:mem => "2048",
:cpu => "2"
},
{
:name => "k8s-master-2",
:type => "master-join",
:box => "ubuntu/xenial64",
:box_version => "20180831.0.0",
:eth1 => "192.168.205.11",
:mem => "2048",
:cpu => "2"
},
{
:name => "k8s-master-3",
:type => "master-join",
:box => "ubuntu/xenial64",
:box_version => "20180831.0.0",
:eth1 => "192.168.205.12",
:mem => "2048",
:cpu => "2"
},
{
:name => "k8s-node-1",
:type => "node",
:box => "ubuntu/xenial64",
:box_version => "20180831.0.0",
:eth1 => "192.168.205.14",
:mem => "2048",
:cpu => "2"
},
{
:name => "k8s-node-2",
:type => "node",
:box => "ubuntu/xenial64",
:box_version => "20180831.0.0",
:eth1 => "192.168.205.15",
:mem => "2048",
:cpu => "2"
},
{
:name => "k8s-node-3",
:type => "node",
:box => "ubuntu/xenial64",
:box_version => "20180831.0.0",
:eth1 => "192.168.205.16",
:mem => "2048",
:cpu => "2"
},
{
:name => "k8s-node-4",
:type => "node",
:box => "ubuntu/xenial64",
:box_version => "20180831.0.0",
:eth1 => "192.168.205.17",
:mem => "2048",
:cpu => "2"
},
{
:name => "k8s-node-5",
:type => "node",
:box => "ubuntu/xenial64",
:box_version => "20180831.0.0",
:eth1 => "192.168.205.18",
:mem => "2048",
:cpu => "2"
}
]
# This script to install k8s using kubeadm will get executed after a box is provisioned
$configureBox = <<-SCRIPT
# install docker v17.03
# reason for not using docker provision is that it always installs latest version of the docker, but kubeadm requires 17.03 or older
apt-get update
apt-get install -y apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -
add-apt-repository "deb https://download.docker.com/linux/$(. /etc/os-release; echo "$ID") $(lsb_release -cs) stable"
apt-get update && apt-get install -y docker-ce=$(apt-cache madison docker-ce | grep 17.03 | head -1 | awk '{print $3}')
# run docker commands as vagrant user (sudo not required)
usermod -aG docker vagrant
# install kubeadm
apt-get install -y apt-transport-https curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
apt-get update
apt-get install -y kubelet kubeadm kubectl
apt-mark hold kubelet kubeadm kubectl
# kubelet requires swap off
swapoff -a
# keep swap off after reboot
sudo sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
# ip of this box
IP_ADDR=`ifconfig enp0s8 | grep Mask | awk '{print $2}'| cut -f2 -d:`
# set node-ip
sudo wget https://raw.githubusercontent.com/SirishaGopigiri/airship-host-config/master/kubernetes/config/kubelet -O /etc/default/kubelet
sudo sed -i "/^[^#]*KUBELET_EXTRA_ARGS=/c\KUBELET_EXTRA_ARGS=--node-ip=$IP_ADDR" /etc/default/kubelet
sudo systemctl restart kubelet
sudo --user=vagrant touch /home/vagrant/.Xauthority
# required for setting up password less ssh between guest VMs
sudo sed -i "/^[^#]*PasswordAuthentication[[:space:]]no/c\PasswordAuthentication yes" /etc/ssh/sshd_config
sudo service sshd restart
SCRIPT
$configureMaster = <<-SCRIPT
echo -e "\nThis is master:\n"
# ip of this box
IP_ADDR=`ifconfig enp0s8 | grep Mask | awk '{print $2}'| cut -f2 -d:`
# install k8s master
HOST_NAME=$(hostname -s)
kubeadm init --apiserver-advertise-address=$IP_ADDR --apiserver-cert-extra-sans=$IP_ADDR --node-name $HOST_NAME --pod-network-cidr=172.16.0.0/16 --control-plane-endpoint "192.168.205.13:443" --upload-certs
#copying credentials to regular user - vagrant
sudo --user=vagrant mkdir -p /home/vagrant/.kube
cp -i /etc/kubernetes/admin.conf /home/vagrant/.kube/config
chown $(id -u vagrant):$(id -g vagrant) /home/vagrant/.kube/config
# install Calico pod network addon
export KUBECONFIG=/etc/kubernetes/admin.conf
kubectl apply -f https://raw.githubusercontent.com/SirishaGopigiri/airship-host-config/master/kubernetes/calico/calico.yaml
kubeadm init phase upload-certs --upload-certs > /etc/upload_cert
kubeadm token create --print-join-command >> /etc/kubeadm_join_cmd.sh
chmod +x /etc/kubeadm_join_cmd.sh
cat /etc/kubeadm_join_cmd.sh > /etc/kubeadm_join_master.sh
CERT=`tail -1 /etc/upload_cert`
sed -i '$ s/$/ --control-plane --certificate-key '"$CERT"'/' /etc/kubeadm_join_master.sh
#Install sshpass for futher docker image copy
apt-get install -y sshpass
SCRIPT
$configureMasterJoin = <<-SCRIPT
echo -e "\nThis is Master with Join Commadn:\n"
apt-get install -y sshpass
sshpass -p "vagrant" scp -o StrictHostKeyChecking=no vagrant@192.168.205.10:/etc/kubeadm_join_master.sh .
IP_ADDR=`ifconfig enp0s8 | grep Mask | awk '{print $2}'| cut -f2 -d:`
sed -i '$ s/$/ --apiserver-advertise-address '"$IP_ADDR"'/' kubeadm_join_master.sh
sh ./kubeadm_join_master.sh
SCRIPT
$configureNode = <<-SCRIPT
echo -e "\nThis is worker:\n"
apt-get install -y sshpass
sshpass -p "vagrant" scp -o StrictHostKeyChecking=no vagrant@192.168.205.10:/etc/kubeadm_join_cmd.sh .
sh ./kubeadm_join_cmd.sh
SCRIPT
Vagrant.configure("2") do |config|
servers.each do |opts|
config.vm.define opts[:name] do |config|
config.vm.box = opts[:box]
config.vm.box_version = opts[:box_version]
config.vm.hostname = opts[:name]
config.vm.network :private_network, ip: opts[:eth1]
config.vm.network :forwarded_port, guest: 22, host: 2222, id: "ssh", disabled: true
config.vm.network :forwarded_port, guest: 22, host: 2200, auto_correct: true
config.vm.provider "virtualbox" do |v|
v.name = opts[:name]
v.customize ["modifyvm", :id, "--groups", "/Ballerina Development"]
v.customize ["modifyvm", :id, "--memory", opts[:mem]]
v.customize ["modifyvm", :id, "--cpus", opts[:cpu]]
end
if opts[:type] == "master"
config.vm.provision "shell", inline: $configureBox
config.vm.provision "shell", inline: $configureMaster
config.vm.provision "file", source: "../airship-host-config", destination: "/home/vagrant/airship-host-config/airship-host-config"
elsif opts[:type] == "lbhaproxy"
config.vm.provision "shell", :path => "haproxy.sh"
elsif opts[:type] == "master-join"
config.vm.provision "shell", inline: $configureBox
config.vm.provision "shell", inline: $configureMasterJoin
else
config.vm.provision "shell", inline: $configureBox
config.vm.provision "shell", inline: $configureNode
end
end
end
end

View File

@ -1,839 +0,0 @@
---
# Source: calico/templates/calico-config.yaml
# This ConfigMap is used to configure a self-hosted Calico installation.
kind: ConfigMap
apiVersion: v1
metadata:
name: calico-config
namespace: kube-system
data:
# Typha is disabled.
typha_service_name: "none"
# Configure the backend to use.
calico_backend: "bird"
# Configure the MTU to use for workload interfaces and the
# tunnels. For IPIP, set to your network MTU - 20; for VXLAN
# set to your network MTU - 50.
veth_mtu: "1440"
# The CNI network configuration to install on each node. The special
# values in this config will be automatically populated.
cni_network_config: |-
{
"name": "k8s-pod-network",
"cniVersion": "0.3.1",
"plugins": [
{
"type": "calico",
"log_level": "info",
"datastore_type": "kubernetes",
"nodename": "__KUBERNETES_NODE_NAME__",
"mtu": __CNI_MTU__,
"ipam": {
"type": "calico-ipam"
},
"policy": {
"type": "k8s"
},
"kubernetes": {
"kubeconfig": "__KUBECONFIG_FILEPATH__"
}
},
{
"type": "portmap",
"snat": true,
"capabilities": {"portMappings": true}
},
{
"type": "bandwidth",
"capabilities": {"bandwidth": true}
}
]
}
---
# Source: calico/templates/kdd-crds.yaml
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: bgpconfigurations.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: BGPConfiguration
plural: bgpconfigurations
singular: bgpconfiguration
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: bgppeers.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: BGPPeer
plural: bgppeers
singular: bgppeer
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: blockaffinities.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: BlockAffinity
plural: blockaffinities
singular: blockaffinity
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: clusterinformations.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: ClusterInformation
plural: clusterinformations
singular: clusterinformation
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: felixconfigurations.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: FelixConfiguration
plural: felixconfigurations
singular: felixconfiguration
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: globalnetworkpolicies.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: GlobalNetworkPolicy
plural: globalnetworkpolicies
singular: globalnetworkpolicy
shortNames:
- gnp
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: globalnetworksets.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: GlobalNetworkSet
plural: globalnetworksets
singular: globalnetworkset
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: hostendpoints.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: HostEndpoint
plural: hostendpoints
singular: hostendpoint
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: ipamblocks.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: IPAMBlock
plural: ipamblocks
singular: ipamblock
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: ipamconfigs.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: IPAMConfig
plural: ipamconfigs
singular: ipamconfig
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: ipamhandles.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: IPAMHandle
plural: ipamhandles
singular: ipamhandle
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: ippools.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: IPPool
plural: ippools
singular: ippool
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: kubecontrollersconfigurations.crd.projectcalico.org
spec:
scope: Cluster
group: crd.projectcalico.org
version: v1
names:
kind: KubeControllersConfiguration
plural: kubecontrollersconfigurations
singular: kubecontrollersconfiguration
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: networkpolicies.crd.projectcalico.org
spec:
scope: Namespaced
group: crd.projectcalico.org
version: v1
names:
kind: NetworkPolicy
plural: networkpolicies
singular: networkpolicy
---
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
name: networksets.crd.projectcalico.org
spec:
scope: Namespaced
group: crd.projectcalico.org
version: v1
names:
kind: NetworkSet
plural: networksets
singular: networkset
---
---
# Source: calico/templates/rbac.yaml
# Include a clusterrole for the kube-controllers component,
# and bind it to the calico-kube-controllers serviceaccount.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: calico-kube-controllers
rules:
# Nodes are watched to monitor for deletions.
- apiGroups: [""]
resources:
- nodes
verbs:
- watch
- list
- get
# Pods are queried to check for existence.
- apiGroups: [""]
resources:
- pods
verbs:
- get
# IPAM resources are manipulated when nodes are deleted.
- apiGroups: ["crd.projectcalico.org"]
resources:
- ippools
verbs:
- list
- apiGroups: ["crd.projectcalico.org"]
resources:
- blockaffinities
- ipamblocks
- ipamhandles
verbs:
- get
- list
- create
- update
- delete
# kube-controllers manages hostendpoints.
- apiGroups: ["crd.projectcalico.org"]
resources:
- hostendpoints
verbs:
- get
- list
- create
- update
- delete
# Needs access to update clusterinformations.
- apiGroups: ["crd.projectcalico.org"]
resources:
- clusterinformations
verbs:
- get
- create
- update
# KubeControllersConfiguration is where it gets its config
- apiGroups: ["crd.projectcalico.org"]
resources:
- kubecontrollersconfigurations
verbs:
# read its own config
- get
# create a default if none exists
- create
# update status
- update
# watch for changes
- watch
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: calico-kube-controllers
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: calico-kube-controllers
subjects:
- kind: ServiceAccount
name: calico-kube-controllers
namespace: kube-system
---
# Include a clusterrole for the calico-node DaemonSet,
# and bind it to the calico-node serviceaccount.
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: calico-node
rules:
# The CNI plugin needs to get pods, nodes, and namespaces.
- apiGroups: [""]
resources:
- pods
- nodes
- namespaces
verbs:
- get
- apiGroups: [""]
resources:
- endpoints
- services
verbs:
# Used to discover service IPs for advertisement.
- watch
- list
# Used to discover Typhas.
- get
# Pod CIDR auto-detection on kubeadm needs access to config maps.
- apiGroups: [""]
resources:
- configmaps
verbs:
- get
- apiGroups: [""]
resources:
- nodes/status
verbs:
# Needed for clearing NodeNetworkUnavailable flag.
- patch
# Calico stores some configuration information in node annotations.
- update
# Watch for changes to Kubernetes NetworkPolicies.
- apiGroups: ["networking.k8s.io"]
resources:
- networkpolicies
verbs:
- watch
- list
# Used by Calico for policy information.
- apiGroups: [""]
resources:
- pods
- namespaces
- serviceaccounts
verbs:
- list
- watch
# The CNI plugin patches pods/status.
- apiGroups: [""]
resources:
- pods/status
verbs:
- patch
# Calico monitors various CRDs for config.
- apiGroups: ["crd.projectcalico.org"]
resources:
- globalfelixconfigs
- felixconfigurations
- bgppeers
- globalbgpconfigs
- bgpconfigurations
- ippools
- ipamblocks
- globalnetworkpolicies
- globalnetworksets
- networkpolicies
- networksets
- clusterinformations
- hostendpoints
- blockaffinities
verbs:
- get
- list
- watch
# Calico must create and update some CRDs on startup.
- apiGroups: ["crd.projectcalico.org"]
resources:
- ippools
- felixconfigurations
- clusterinformations
verbs:
- create
- update
# Calico stores some configuration information on the node.
- apiGroups: [""]
resources:
- nodes
verbs:
- get
- list
- watch
# These permissions are only requried for upgrade from v2.6, and can
# be removed after upgrade or on fresh installations.
- apiGroups: ["crd.projectcalico.org"]
resources:
- bgpconfigurations
- bgppeers
verbs:
- create
- update
# These permissions are required for Calico CNI to perform IPAM allocations.
- apiGroups: ["crd.projectcalico.org"]
resources:
- blockaffinities
- ipamblocks
- ipamhandles
verbs:
- get
- list
- create
- update
- delete
- apiGroups: ["crd.projectcalico.org"]
resources:
- ipamconfigs
verbs:
- get
# Block affinities must also be watchable by confd for route aggregation.
- apiGroups: ["crd.projectcalico.org"]
resources:
- blockaffinities
verbs:
- watch
# The Calico IPAM migration needs to get daemonsets. These permissions can be
# removed if not upgrading from an installation using host-local IPAM.
- apiGroups: ["apps"]
resources:
- daemonsets
verbs:
- get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: calico-node
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: calico-node
subjects:
- kind: ServiceAccount
name: calico-node
namespace: kube-system
---
# Source: calico/templates/calico-node.yaml
# This manifest installs the calico-node container, as well
# as the CNI plugins and network config on
# each master and worker node in a Kubernetes cluster.
kind: DaemonSet
apiVersion: apps/v1
metadata:
name: calico-node
namespace: kube-system
labels:
k8s-app: calico-node
spec:
selector:
matchLabels:
k8s-app: calico-node
updateStrategy:
type: RollingUpdate
rollingUpdate:
maxUnavailable: 1
template:
metadata:
labels:
k8s-app: calico-node
annotations:
# This, along with the CriticalAddonsOnly toleration below,
# marks the pod as a critical add-on, ensuring it gets
# priority scheduling and that its resources are reserved
# if it ever gets evicted.
scheduler.alpha.kubernetes.io/critical-pod: ''
spec:
nodeSelector:
kubernetes.io/os: linux
hostNetwork: true
tolerations:
# Make sure calico-node gets scheduled on all nodes.
- effect: NoSchedule
operator: Exists
# Mark the pod as a critical add-on for rescheduling.
- key: CriticalAddonsOnly
operator: Exists
- effect: NoExecute
operator: Exists
serviceAccountName: calico-node
# Minimize downtime during a rolling upgrade or deletion; tell Kubernetes to do a "force
# deletion": https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods.
terminationGracePeriodSeconds: 0
priorityClassName: system-node-critical
initContainers:
# This container performs upgrade from host-local IPAM to calico-ipam.
# It can be deleted if this is a fresh installation, or if you have already
# upgraded to use calico-ipam.
- name: upgrade-ipam
image: calico/cni:v3.14.0
command: ["/opt/cni/bin/calico-ipam", "-upgrade"]
env:
- name: KUBERNETES_NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: CALICO_NETWORKING_BACKEND
valueFrom:
configMapKeyRef:
name: calico-config
key: calico_backend
volumeMounts:
- mountPath: /var/lib/cni/networks
name: host-local-net-dir
- mountPath: /host/opt/cni/bin
name: cni-bin-dir
securityContext:
privileged: true
# This container installs the CNI binaries
# and CNI network config file on each node.
- name: install-cni
image: calico/cni:v3.14.0
command: ["/install-cni.sh"]
env:
# Name of the CNI config file to create.
- name: CNI_CONF_NAME
value: "10-calico.conflist"
# The CNI network config to install on each node.
- name: CNI_NETWORK_CONFIG
valueFrom:
configMapKeyRef:
name: calico-config
key: cni_network_config
# Set the hostname based on the k8s node name.
- name: KUBERNETES_NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
# CNI MTU Config variable
- name: CNI_MTU
valueFrom:
configMapKeyRef:
name: calico-config
key: veth_mtu
# Prevents the container from sleeping forever.
- name: SLEEP
value: "false"
volumeMounts:
- mountPath: /host/opt/cni/bin
name: cni-bin-dir
- mountPath: /host/etc/cni/net.d
name: cni-net-dir
securityContext:
privileged: true
# Adds a Flex Volume Driver that creates a per-pod Unix Domain Socket to allow Dikastes
# to communicate with Felix over the Policy Sync API.
- name: flexvol-driver
image: calico/pod2daemon-flexvol:v3.14.0
volumeMounts:
- name: flexvol-driver-host
mountPath: /host/driver
securityContext:
privileged: true
containers:
# Runs calico-node container on each Kubernetes node. This
# container programs network policy and routes on each
# host.
- name: calico-node
image: calico/node:v3.14.0
env:
# Use Kubernetes API as the backing datastore.
- name: DATASTORE_TYPE
value: "kubernetes"
# Wait for the datastore.
- name: WAIT_FOR_DATASTORE
value: "true"
# Set based on the k8s node name.
- name: NODENAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
# Choose the backend to use.
- name: CALICO_NETWORKING_BACKEND
valueFrom:
configMapKeyRef:
name: calico-config
key: calico_backend
# Cluster type to identify the deployment type
- name: CLUSTER_TYPE
value: "k8s,bgp"
# Auto-detect the BGP IP address.
- name: IP
value: "autodetect"
# Enable IPIP
- name: CALICO_IPV4POOL_IPIP
value: "Always"
# Set MTU for tunnel device used if ipip is enabled
- name: FELIX_IPINIPMTU
valueFrom:
configMapKeyRef:
name: calico-config
key: veth_mtu
# Set MTU for the VXLAN tunnel device.
- name: FELIX_VXLANMTU
valueFrom:
configMapKeyRef:
name: calico-config
key: veth_mtu
# The default IPv4 pool to create on startup if none exists. Pod IPs will be
# chosen from this range. Changing this value after installation will have
# no effect. This should fall within `--cluster-cidr`.
# - name: CALICO_IPV4POOL_CIDR
# value: "192.168.0.0/16"
# Disable file logging so `kubectl logs` works.
- name: CALICO_DISABLE_FILE_LOGGING
value: "true"
# Set Felix endpoint to host default action to ACCEPT.
- name: FELIX_DEFAULTENDPOINTTOHOSTACTION
value: "ACCEPT"
# Disable IPv6 on Kubernetes.
- name: FELIX_IPV6SUPPORT
value: "false"
# Set Felix logging to "info"
- name: FELIX_LOGSEVERITYSCREEN
value: "info"
- name: FELIX_HEALTHENABLED
value: "true"
securityContext:
privileged: true
resources:
requests:
cpu: 250m
livenessProbe:
exec:
command:
- /bin/calico-node
- -felix-live
- -bird-live
periodSeconds: 10
initialDelaySeconds: 10
failureThreshold: 6
readinessProbe:
exec:
command:
- /bin/calico-node
- -felix-ready
- -bird-ready
periodSeconds: 10
volumeMounts:
- mountPath: /lib/modules
name: lib-modules
readOnly: true
- mountPath: /run/xtables.lock
name: xtables-lock
readOnly: false
- mountPath: /var/run/calico
name: var-run-calico
readOnly: false
- mountPath: /var/lib/calico
name: var-lib-calico
readOnly: false
- name: policysync
mountPath: /var/run/nodeagent
volumes:
# Used by calico-node.
- name: lib-modules
hostPath:
path: /lib/modules
- name: var-run-calico
hostPath:
path: /var/run/calico
- name: var-lib-calico
hostPath:
path: /var/lib/calico
- name: xtables-lock
hostPath:
path: /run/xtables.lock
type: FileOrCreate
# Used to install CNI.
- name: cni-bin-dir
hostPath:
path: /opt/cni/bin
- name: cni-net-dir
hostPath:
path: /etc/cni/net.d
# Mount in the directory for host-local IPAM allocations. This is
# used when upgrading from host-local to calico-ipam, and can be removed
# if not using the upgrade-ipam init container.
- name: host-local-net-dir
hostPath:
path: /var/lib/cni/networks
# Used to create per-pod Unix Domain Sockets
- name: policysync
hostPath:
type: DirectoryOrCreate
path: /var/run/nodeagent
# Used to install Flex Volume Driver
- name: flexvol-driver-host
hostPath:
type: DirectoryOrCreate
path: /usr/libexec/kubernetes/kubelet-plugins/volume/exec/nodeagent~uds
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: calico-node
namespace: kube-system
---
# Source: calico/templates/calico-kube-controllers.yaml
# See https://github.com/projectcalico/kube-controllers
apiVersion: apps/v1
kind: Deployment
metadata:
name: calico-kube-controllers
namespace: kube-system
labels:
k8s-app: calico-kube-controllers
spec:
# The controllers can only have a single active instance.
replicas: 1
selector:
matchLabels:
k8s-app: calico-kube-controllers
strategy:
type: Recreate
template:
metadata:
name: calico-kube-controllers
namespace: kube-system
labels:
k8s-app: calico-kube-controllers
annotations:
scheduler.alpha.kubernetes.io/critical-pod: ''
spec:
nodeSelector:
kubernetes.io/os: linux
tolerations:
# Mark the pod as a critical add-on for rescheduling.
- key: CriticalAddonsOnly
operator: Exists
- key: node-role.kubernetes.io/master
effect: NoSchedule
serviceAccountName: calico-kube-controllers
priorityClassName: system-cluster-critical
containers:
- name: calico-kube-controllers
image: calico/kube-controllers:v3.14.0
env:
# Choose which controllers to run.
- name: ENABLED_CONTROLLERS
value: node
- name: DATASTORE_TYPE
value: kubernetes
readinessProbe:
exec:
command:
- /usr/bin/check-status
- -r
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: calico-kube-controllers
namespace: kube-system
---
# Source: calico/templates/calico-etcd-secrets.yaml
---
# Source: calico/templates/calico-typha.yaml
---
# Source: calico/templates/configure-canal.yaml

View File

@ -1 +0,0 @@
KUBELET_EXTRA_ARGS=

View File

@ -1,73 +0,0 @@
#!/bin/bash
if [ ! -f /etc/haproxy/haproxy.cfg ]; then
# Install haproxy
sudo sed -i "/^[^#]*PasswordAuthentication[[:space:]]no/c\PasswordAuthentication yes" /etc/ssh/sshd_config
sudo service sshd restart
/usr/bin/apt-get -y install haproxy
cp /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.orig
# Configure haproxy
cat > /etc/default/haproxy <<EOD
# Set ENABLED to 1 if you want the init script to start haproxy.
ENABLED=1
# Add extra flags here.
#EXTRAOPTS="-de -m 16"
EOD
cat > /etc/haproxy/haproxy.cfg <<EOD
# /etc/haproxy/haproxy.cfg
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
log /dev/log local0
log /dev/log local1 notice
daemon
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
mode http
log global
option httplog
option dontlognull
option http-server-close
option forwardfor except 127.0.0.0/8
option redispatch
retries 1
timeout http-request 10s
timeout queue 20s
timeout connect 5s
timeout client 20s
timeout server 20s
timeout http-keep-alive 10s
timeout check 10s
#---------------------------------------------------------------------
# apiserver frontend which proxys to the masters
#---------------------------------------------------------------------
frontend k8-apiserver
bind *:443
mode tcp
option tcplog
default_backend k8-apiserver
#---------------------------------------------------------------------
# round robin balancing for apiserver
#---------------------------------------------------------------------
backend k8-apiserver
option httpchk GET /healthz
http-check expect status 200
mode tcp
option ssl-hello-chk
balance roundrobin
server k8s-api-1 192.168.205.10:6443 check
server k8s-api-2 192.168.205.11:6443 check
server k8s-api-3 192.168.205.12:6443 check
EOD
/usr/sbin/service haproxy restart
fi

View File

@ -0,0 +1,25 @@
# Copyright 2017 The Openstack-Helm Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
- hosts: primary
vars:
logs_dir: "/tmp/logs"
roles:
- gather-system-logs
- airship-gather-runtime-logs
tasks:
- name: save logs for ephemeral cluster
include_role:
name: airship-gather-pod-logs

View File

@ -0,0 +1,29 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
- hosts: primary
tasks:
- name: "set default gate scripts"
set_fact:
gate_scripts_default:
- ./tools/deployment/host_config/01_install_kubectl.sh
- ./tools/deployment/host_config/10_test_config.sh
- ./tools/deployment/host_config/20_deploy_k8cluster_vagrant.sh
- ./tools/deployment/host_config/30_deploy_host_config.sh
- ./tools/deployment/host_config/40_test_host_config_cr.sh
- name: "Run gate scripts"
include_role:
name: hostconfig-run-script
vars:
gate_script_path: "{{ item }}"
with_items: "{{ gate_scripts | default(gate_scripts_default) }}"

View File

@ -0,0 +1,15 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
- hosts: all
roles:
- docker-install

View File

@ -0,0 +1,14 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
kctl_context: kind-hostconfig
kubeconfig: "{{ ansible_env.HOME }}/.kube/kubeconfig"

View File

@ -0,0 +1,66 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# NOTE this role has been copied from https://github.com/openstack/openstack-helm-infra/blob/8617c8c1e0ea5fc55d652ccd2a8c2eedf16f69ad/roles/gather-pod-logs/tasks/main.yaml
- name: "creating directory for pod logs"
file:
path: "{{ logs_dir }}/pod-logs/{{ kctl_context }}"
state: directory
- name: "creating directory for failed pod logs"
file:
path: "{{ logs_dir }}/pod-logs/{{ kctl_context }}/failed-pods"
state: directory
- name: "retrieve all container logs, current and previous (if they exist)"
shell: |-
set -e
export KUBECONFIG="{{ kubeconfig }}"
PARALLELISM_FACTOR=2
function get_namespaces () {
kubectl get namespaces -o name | awk -F '/' '{ print $NF }'
}
function get_pods () {
NAMESPACE=$1
kubectl get pods -n ${NAMESPACE} -o name | awk -F '/' '{ print $NF }' | xargs -L1 -P 1 -I {} echo ${NAMESPACE} {}
}
export -f get_pods
function get_pod_logs () {
NAMESPACE=${1% *}
POD=${1#* }
INIT_CONTAINERS=$(kubectl get pod $POD -n ${NAMESPACE} -o jsonpath='{.spec.initContainers[*].name}')
CONTAINERS=$(kubectl get pod $POD -n ${NAMESPACE} -o jsonpath='{.spec.containers[*].name}')
for CONTAINER in ${INIT_CONTAINERS} ${CONTAINERS}; do
echo "${NAMESPACE}/${POD}/${CONTAINER}"
mkdir -p "{{ logs_dir }}/pod-logs/{{ kctl_context }}/${NAMESPACE}/${POD}"
mkdir -p "{{ logs_dir }}/pod-logs/{{ kctl_context }}/failed-pods/${NAMESPACE}/${POD}"
kubectl logs ${POD} -n ${NAMESPACE} -c ${CONTAINER} > "{{ logs_dir }}/pod-logs/{{ kctl_context }}/${NAMESPACE}/${POD}/${CONTAINER}.txt"
kubectl logs --previous ${POD} -n ${NAMESPACE} -c ${CONTAINER} > "{{ logs_dir }}/pod-logs/{{ kctl_context }}/failed-pods/${NAMESPACE}/${POD}/${CONTAINER}.txt"
done
}
export -f get_pod_logs
kubectl config use-context {{ kctl_context | default("kind-hostconfig") }}
get_namespaces | \
xargs -r -n 1 -P ${PARALLELISM_FACTOR} -I {} bash -c 'get_pods "$@"' _ {} | \
xargs -r -n 2 -P ${PARALLELISM_FACTOR} -I {} bash -c 'get_pod_logs "$@"' _ {}
args:
executable: /bin/bash
ignore_errors: True
- name: "Downloads pod logs to executor"
synchronize:
src: "{{ logs_dir }}/pod-logs"
dest: "{{ zuul.executor.log_root }}/{{ inventory_hostname }}"
mode: pull
ignore_errors: True

View File

@ -0,0 +1,45 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
- name: populate service facts
service_facts:
- name: set runtime logs dir
set_fact:
runtime_logs_dir: "{{ logs_dir }}/runtime"
- name: ensure directory for runtime logs exists
file:
state: directory
path: "{{ runtime_logs_dir }}"
- name: dump docker logs
shell: |-
journalctl --unit "docker" --no-pager > "{{ runtime_logs_dir }}/docker.log"
when: ansible_facts.services['docker.service'] is defined
args:
executable: /bin/bash
become: yes
- name: dump containerd logs
shell: |-
journalctl --unit "containerd" --no-pager > "{{ runtime_logs_dir }}/containerd.log"
when: ansible_facts.services['containerd.service'] is defined
args:
executable: /bin/bash
become: yes
- name: "Downloads logs to executor"
synchronize:
src: "{{ runtime_logs_dir }}"
dest: "{{ zuul.executor.log_root }}/{{ inventory_hostname }}"
mode: pull

View File

@ -0,0 +1,28 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
docker_config_path: "/etc/docker"
docker_config_log_driver: "journald"
docker_config_log_opts: {}
docker_config: |
{
"log-driver": "{{ docker_config_log_driver }}",
"log-opts": {{ docker_config_log_opts | to_json }}
}
proxy:
enabled: false
http:
https:
noproxy:
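For reference, with the default variables above the role's `docker_config` template ends up as the Docker daemon configuration written by the tasks; a minimal sketch of what to expect on a node using these defaults (values shown as comments are assumptions derived from the defaults in this file):
```
# Inspect the daemon configuration the role writes with its default variables
cat /etc/docker/daemon.json
# Expected contents with the defaults above:
# {
#     "log-driver": "journald",
#     "log-opts": {}
# }
```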

View File

@ -0,0 +1,80 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
---
- name: Ensuring docker and support packages are present
become: yes
package:
name:
- docker.io
- runc
update_cache: yes
state: present
- name: Ensure docker group exists
group:
name: docker
state: present
- name: Add user "{{ ansible_user }}" to docker group
become: yes
user:
name: "{{ ansible_user }}"
groups:
- docker
append: yes
- name: Reset ssh connection to add docker group to user
meta: reset_connection
ignore_errors: true
- block:
- name: Create docker directory
file:
path: /etc/systemd/system/docker.service.d/
state: directory
mode: '0755'
- name: Configure proxy for docker if enabled
template:
src: http-proxy-conf.j2
dest: /etc/systemd/system/docker.service.d/http-proxy.conf
when: proxy.enabled|bool == true
become: yes
- name: Create docker directory
file:
path: "{{ docker_config_path }}"
state: directory
mode: '0755'
become: yes
- name: Save docker daemon configuration
copy:
content: "{{ docker_config | to_nice_json }}"
dest: "{{ docker_config_path }}/daemon.json"
become: yes
- name: Start docker
become: yes
systemd:
name: docker
state: restarted
daemon_reload: yes
enabled: true
- name: Change group ownership on docker sock
become: yes
file:
path: /var/run/docker.sock
group: docker

View File

@ -0,0 +1,4 @@
[Service]
Environment="HTTP_PROXY={{ proxy.http }}"
Environment="HTTPS_PROXY={{ proxy.https }}"
Environment="NO_PROXY={{ proxy.noproxy }}"

View File

@ -0,0 +1,25 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
- name: install docker
include_role:
name: docker-install
- name: check if docker is installed
shell: "docker version"
register: docker_version
- name: verify docker is installed
assert:
that:
- docker_version.rc == 0

View File

@ -0,0 +1,41 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
- name: "creating directory for system status"
file:
path: "{{ logs_dir }}/system"
state: directory
- name: "Get logs for each host"
become: yes
shell: |-
set -x
systemd-cgls --full --all --no-pager > {{ logs_dir }}/system/systemd-cgls.txt
ip addr > {{ logs_dir }}/system/ip-addr.txt
ip route > {{ logs_dir }}/system/ip-route.txt
lsblk > {{ logs_dir }}/system/lsblk.txt
mount > {{ logs_dir }}/system/mount.txt
docker images > {{ logs_dir }}/system/docker-images.txt
ps aux --sort=-%mem > {{ logs_dir }}/system/ps.txt
netstat -plantu > {{ logs_dir }}/system/netstat.txt
iptables -vxnL > {{ logs_dir }}/system/iptables_filter.txt
iptables -t nat -vxnL > {{ logs_dir }}/system/iptables_nat.txt
args:
executable: /bin/bash
ignore_errors: True
- name: "Downloads logs to executor"
synchronize:
src: "{{ logs_dir }}/system"
dest: "{{ zuul.executor.log_root }}/{{ inventory_hostname }}"
mode: pull
ignore_errors: True

View File

@ -0,0 +1,21 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
- name: "Run script {{ gate_script_path }}"
shell: |
set -xe;
{{ gate_script_path }}
args:
chdir: "{{ zuul.project.src_dir }}"
environment:
remote_work_dir: "{{ ansible_user_dir }}/{{ zuul.project.src_dir }}"
zuul_site_mirror_fqdn: "{{ zuul_site_mirror_fqdn }}"

View File

@ -0,0 +1,27 @@
#!/bin/bash
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# This script downloads kind to a temporary directory and installs it to /usr/local/bin
set -e
KIND_VERSION="v0.8.1"
echo "Installing Kind Version $KIND_VERSION"
: ${KIND_URL:="https://kind.sigs.k8s.io/dl/$KIND_VERSION/kind-$(uname)-amd64"}
TMP=$(mktemp -d)
KIND="${TMP}/kind"
wget -O ${KIND} ${KIND_URL}
chmod +x ${KIND}
sudo cp ${KIND} /usr/local/bin
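A quick way to confirm the installation succeeded (a usage sketch; the reported version string depends on the release pinned above):
```
# Verify the kind binary is on PATH and matches the pinned release
kind version
```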

View File

@ -0,0 +1,24 @@
#!/usr/bin/env bash
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -xe
: ${KUBE_VERSION:="v1.17.4"}
# Install kubectl
URL="https://storage.googleapis.com"
sudo wget -O /usr/local/bin/kubectl \
"${URL}"/kubernetes-release/release/"${KUBE_VERSION}"/bin/linux/amd64/kubectl
sudo chmod +x /usr/local/bin/kubectl

View File

@ -0,0 +1,23 @@
#!/usr/bin/env bash
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -xe
RELEASE_VERSION=v0.8.0
echo "Installing Operator-SDK to build image"
wget -O operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu https://github.com/operator-framework/operator-sdk/releases/download/${RELEASE_VERSION}/operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu
chmod +x operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu
sudo mv operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu /usr/local/bin/operator-sdk
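As with the other installers, the result can be checked before moving on (a usage sketch):
```
# Verify the operator-sdk binary is available on PATH
operator-sdk version
```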

View File

@ -0,0 +1,58 @@
#!/usr/bin/env bash
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -xe
# Default wait timeout is 3600 seconds
export TIMEOUT=${TIMEOUT:-3600}
REMOTE_WORK_DIR=/tmp
echo "Create Kind Cluster"
cat <<EOF > ${REMOTE_WORK_DIR}/kind-hostconfig.yaml
kind: Cluster
apiVersion: kind.sigs.k8s.io/v1alpha3
nodes:
- role: control-plane
- role: control-plane
- role: control-plane
- role: worker
- role: worker
EOF
kind create cluster --config ${REMOTE_WORK_DIR}/kind-hostconfig.yaml --name hostconfig -v 2
# Wait until the HostConfig Cluster is ready
end=$(($(date +%s) + $TIMEOUT))
echo "Waiting $TIMEOUT seconds for HostConfig Cluster to be ready."
hosts=(`kubectl get nodes -o wide | awk '{print $1}' | sed -e '1d'`)
for i in "${!hosts[@]}"
do
while true; do
if (kubectl --request-timeout 20s get nodes ${hosts[i]} -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}' | grep -q True) ; then
echo -e "\nHostConfig Cluster Nodes are ready."
kubectl --request-timeout 20s get nodes
break
else
now=$(date +%s)
if [ $now -gt $end ]; then
echo -e "\nHostConfig Cluster Nodes were not ready before TIMEOUT."
exit 1
fi
fi
echo -n .
sleep 15
done
done

View File

@ -0,0 +1,40 @@
#!/usr/bin/env bash
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -xe
hosts=(`kubectl get nodes -o wide | awk '{print $1}' | sed -e '1d'`)
hosts_ips=(`kubectl get nodes -o wide | awk '{print $6}' | sed -e '1d'`)
export USERNAME=${USERNAME:-"hostconfig"}
export PASSWORD=${PASSWORD:-"hostconfig"}
# Installing openssl, sshpass and jq packages
sudo apt-get install -y openssl sshpass jq
ENCRYPTED_PASSWORD=`openssl passwd -crypt $PASSWORD`
# Configuring SSH on Kubernetes nodes
for i in "${!hosts[@]}"
do
sudo docker exec ${hosts[i]} apt-get update
sudo docker exec ${hosts[i]} apt-get install -y sudo openssh-server
sudo docker exec ${hosts[i]} service sshd start
sudo docker exec ${hosts[i]} useradd -m -p $ENCRYPTED_PASSWORD -s /bin/bash $USERNAME
sudo docker exec ${hosts[i]} bash -c "echo '$USERNAME ALL=(ALL) NOPASSWD:ALL' >> /etc/sudoers.d/hostconfig"
printf 'Working on host %s with Index %s and having IP %s\n' "${hosts[i]}" "$i" "${hosts_ips[i]}"
ssh-keygen -q -t rsa -N '' -f ${hosts[i]}
sshpass -p $PASSWORD ssh-copy-id -o StrictHostKeyChecking=no -i ${hosts[i]} $USERNAME@${hosts_ips[i]}
kubectl create secret generic ${hosts[i]} --from-literal=username=$USERNAME --from-file=ssh_private_key=${hosts[i]}
kubectl annotate node ${hosts[i]} secret=${hosts[i]}
done
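To confirm the per-node credentials were wired up correctly, the secret and node annotation created above can be inspected; a sketch, assuming the default kind node name hostconfig-control-plane:
```
# The secret holds the SSH username and private key for the node
kubectl get secret hostconfig-control-plane -o jsonpath='{.data.username}' | base64 -d
# The node annotation points the operator at that secret
kubectl get node hostconfig-control-plane -o jsonpath='{.metadata.annotations.secret}'
```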

View File

@ -0,0 +1,36 @@
#!/usr/bin/env bash
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -xe
# Labeling kubernetes nodes with role
kubectl label node hostconfig-control-plane kubernetes.io/role=master
kubectl label node hostconfig-control-plane2 kubernetes.io/role=master
kubectl label node hostconfig-control-plane3 kubernetes.io/role=master
kubectl label node hostconfig-worker kubernetes.io/role=worker
kubectl label node hostconfig-worker2 kubernetes.io/role=worker
# Labeling kubernetes nodes with region
kubectl label node hostconfig-control-plane topology.kubernetes.io/region=us-east
kubectl label node hostconfig-control-plane2 topology.kubernetes.io/region=us-west
kubectl label node hostconfig-control-plane3 topology.kubernetes.io/region=us-east
kubectl label node hostconfig-worker topology.kubernetes.io/region=us-east
kubectl label node hostconfig-worker2 topology.kubernetes.io/region=us-west
# Labeling kubernetes nodes with zone
kubectl label node hostconfig-control-plane topology.kubernetes.io/zone=us-east-1a
kubectl label node hostconfig-control-plane2 topology.kubernetes.io/zone=us-west-1a
kubectl label node hostconfig-control-plane3 topology.kubernetes.io/zone=us-east-1b
kubectl label node hostconfig-worker topology.kubernetes.io/zone=us-east-1a
kubectl label node hostconfig-worker2 topology.kubernetes.io/zone=us-west-1a
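The labels applied above can be verified per node using kubectl's label columns; a short usage sketch:
```
# Show the role, region and zone labels assigned to each node
kubectl get nodes -L kubernetes.io/role \
  -L topology.kubernetes.io/region \
  -L topology.kubernetes.io/zone
```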

View File

@ -0,0 +1,46 @@
#!/usr/bin/env bash
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -xe
export HOSTCONFIG_WS=${HOSTCONFIG_WS:-$PWD}
export HOSTCONFIG=${HOSTCONFIG:-"$HOSTCONFIG_WS/airship-host-config"}
export IMAGE_NAME=${IMAGE_NAME:-"airship-hostconfig:local"}
# Building hostconfig image
cd $HOSTCONFIG
operator-sdk build $IMAGE_NAME
# Copying hostconfig image to nodes
kind load docker-image $IMAGE_NAME --name hostconfig
# Deploying HostConfig Operator Pod
cd $HOSTCONFIG_WS
sed -i "s/AIRSHIP_HOSTCONFIG_IMAGE/$IMAGE_NAME/g" $HOSTCONFIG/deploy/operator.yaml
sed -i "s/PULL_POLICY/IfNotPresent/g" $HOSTCONFIG/deploy/operator.yaml
kubectl apply -f $HOSTCONFIG/deploy/crds/hostconfig.airshipit.org_hostconfigs_crd.yaml
kubectl apply -f $HOSTCONFIG/deploy/role.yaml
kubectl apply -f $HOSTCONFIG/deploy/service_account.yaml
kubectl apply -f $HOSTCONFIG/deploy/role_binding.yaml
kubectl apply -f $HOSTCONFIG/deploy/cluster_role_binding.yaml
kubectl apply -f $HOSTCONFIG/deploy/operator.yaml
kubectl wait --for=condition=available deploy --all --timeout=1000s -A
kubectl get pods -o wide
kubectl get pods -A
kubectl get nodes -o wide --show-labels
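Once the deployment reports available, the CRD registration and the operator pod can be checked; a sketch, where the CRD name is assumed from the manifest file applied above:
```
# Confirm the HostConfig CRD is registered (name assumed from the CRD manifest)
kubectl get crd hostconfigs.hostconfig.airshipit.org
# Confirm the operator deployment and pod rolled out
kubectl get deploy,pods -A -o wide | grep -i hostconfig
```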

View File

@ -0,0 +1,129 @@
#!/usr/bin/env bash
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -xe
export TIMEOUT=${TIMEOUT:-3600}
export AIRSHIP_HOSTCONFIG=${AIRSHIP_HOSTCONFIG:-$PWD}
check_status(){
hostconfig=$1
end=$(($(date +%s) + $TIMEOUT))
while true; do
# Getting the failed and unreachable nodes status
failures=$(kubectl get hostconfig $hostconfig -o jsonpath='{.status.ansibleSummary.failures}')
unreachable=$(kubectl get hostconfig $hostconfig -o jsonpath='{.status.ansibleSummary.unreachable}')
if [[ $failures == "map[]" && $unreachable == "map[]" ]]; then
kubectl get hostconfig $hostconfig -o json
hosts=$2
ok=$(kubectl get hostconfig $hostconfig -o json | jq '.status.ansibleSummary.ok | keys')
hostNames=$(kubectl get hostconfig $hostconfig -o json | jq '.status.hostConfigStatus | keys')
ok_array=${ok[@]}
hostNames_array=${hostNames[@]}
# Checking if all hosts have executed
if [ "$hosts" == "$ok_array" ] && [ "$hosts" == "$hostNames_array" ]; then
if $3; then
# Checking if the execution has happened in sequence
# based on the date command executed on the nodes at the time of execution
# Please refer to the demo_examples sample CRs for the configuration
loop=$4
shift 4
pre_hosts_date=""
for ((i=0;i<loop;i++)); do
hosts=( "${@:2:$1}" ); shift "$(( $1 + 1 ))"
pre_host_date=""
for j in "${!hosts[@]}"; do
kubectl_stdout=$(kubectl get hostconfig $hostconfig -o "jsonpath={.status.hostConfigStatus.${hosts[j]}.Execute\ shell\ command\ on\ nodes.results[0].stdout}" | head -1)
echo $kubectl_stdout
host_date=$(date --date="$kubectl_stdout" +"%s")
if [ ! -z "$pre_host_date" ]; then
differ=$((pre_host_date-host_date))
if [[ $differ -lt 0 ]]; then
differ=${differ#-}
fi
if [[ $differ -gt 5 ]] ; then
echo "HostConfig CR $hostconfig didn't execute in sequence!"
exit 1
fi
fi
pre_host_date=$host_date
hosts_date=$host_date
done
if [ ! -z "$hosts_date" ] && [ ! -z "$pre_hosts_date" ]; then
hosts_differ=$((hosts_date-pre_hosts_date ))
if [[ $hosts_differ -lt 0 ]]; then
hosts_differ=${hosts_differ#-}
fi
if [[ $hosts_differ -lt 5 ]]; then
echo "HostConfig CR $hostconfig didn't execute in sequence!"
exit 1
fi
fi
pre_hosts_date=$hosts_date
done
fi
echo "$hostconfig hostconfig executed successfully"
return 0
else
# Failing the execution if the hosts haven't matched.
echo "$hostconfig hostconfig execution failed!"
exit 1
fi
elif [ -z "$failures" ] && [ -z "$unreachable" ]; then
# Waiting for the HostConfig CR status until the timeout is reached.
now=$(date +%s)
if [ $now -gt $end ]; then
kubectl get hostconfig $hostconfig -o json
echo -e "HostConfig CR execution not completed even after timeout"
exit 1
fi
else
# Failing the execution if the HostConfig CR object execution has failed.
kubectl get hostconfig $hostconfig -o json
echo "HostConfig CR execution failed"
exit 1
fi
sleep 30
done
}
# Checking HostConfig CR with host_groups configuration
kubectl apply -f $AIRSHIP_HOSTCONFIG/demo_examples/example_host_groups.yaml
hosts=("hostconfig-control-plane" "hostconfig-control-plane2" "hostconfig-control-plane3")
check_status example1 '[ "hostconfig-control-plane", "hostconfig-control-plane2", "hostconfig-control-plane3" ]' false
# Checking HostConfig CR, if nodes are executing in sequence
kubectl apply -f $AIRSHIP_HOSTCONFIG/demo_examples/example_sequential.yaml
hosts1=("hostconfig-control-plane" "hostconfig-control-plane2" "hostconfig-control-plane3")
hosts2=("hostconfig-worker" "hostconfig-worker2")
check_status example2 '[ "hostconfig-control-plane", "hostconfig-control-plane2", "hostconfig-control-plane3", "hostconfig-worker", "hostconfig-worker2" ]' true 2 "${#hosts1[@]}" "${hosts1[@]}" "${#hosts2[@]}" "${hosts2[@]}"
# Checking if the nodes are matched with the given labels in the host_groups
kubectl apply -f $AIRSHIP_HOSTCONFIG/demo_examples/example_match_host_groups.yaml
check_status example3 '[ "hostconfig-control-plane", "hostconfig-control-plane3", "hostconfig-worker" ]' false
# Checking if the executing is happening in sequence on the host_groups matched
kubectl apply -f $AIRSHIP_HOSTCONFIG/demo_examples/example_sequential_match_host_groups.yaml
hosts1=("hostconfig-control-plane")
hosts2=("hostconfig-worker")
hosts3=("hostconfig-control-plane3")
check_status example4 '[ "hostconfig-control-plane", "hostconfig-control-plane3", "hostconfig-worker" ]' true 3 "${#hosts1[@]}" "${hosts1[@]}" "${#hosts2[@]}" "${hosts2[@]}" "${#hosts3[@]}" "${hosts3[@]}"
# Executing configuration on hosts in parallel, with the given number of hosts executed in sequence
kubectl apply -f $AIRSHIP_HOSTCONFIG/demo_examples/example_parallel.yaml
check_status example7 '[ "hostconfig-control-plane", "hostconfig-control-plane2", "hostconfig-control-plane3", "hostconfig-worker", "hostconfig-worker2" ]' false
# Executing sample sysctl and ulimit configuration on the kubernetes nodes
kubectl apply -f $AIRSHIP_HOSTCONFIG/demo_examples/example_sysctl_ulimit.yaml
check_status example8 '[ "hostconfig-control-plane", "hostconfig-control-plane2", "hostconfig-control-plane3" ]' false
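The same status fields that check_status parses can be inspected manually while a CR is running; a sketch against the example1 CR applied above (jq is installed by 20_configure_ssh_on_nodes.sh):
```
# Ansible run summary recorded by the operator on the CR status
kubectl get hostconfig example1 -o jsonpath='{.status.ansibleSummary}'
# Hosts that reported per-task results
kubectl get hostconfig example1 -o json | jq '.status.hostConfigStatus | keys'
```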

View File

@ -0,0 +1,86 @@
#!/usr/bin/env bash
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -xe
export TIMEOUT=${TIMEOUT:-3600}
export AIRSHIP_HOSTCONFIG=${AIRSHIP_HOSTCONFIG:-$PWD}
check_status(){
hostconfig=$1
end=$(($(date +%s) + $TIMEOUT))
while true; do
failures=$(kubectl get hostconfig $hostconfig -o jsonpath='{.status.ansibleSummary.failures}')
unreachable=$(kubectl get hostconfig $hostconfig -o jsonpath='{.status.ansibleSummary.unreachable}')
# Checking for failures and unreachable hosts in the HostConfig CR
if [[ $failures == "map[]" && $unreachable == "map[]" ]]; then
hosts=$2
ok=$(kubectl get hostconfig $hostconfig -o json | jq '.status.ansibleSummary.ok | keys')
echo $ok
hostNames=$(kubectl get hostconfig $hostconfig -o json | jq '.status.hostConfigStatus | keys')
ok_array=${ok[@]}
hostNames_array=${hostNames[@]}
# Checking if all the hosts have executed
if [ "$hosts" == "$ok_array" ] && [ "$hosts" == "$hostNames_array" ]; then
reconcile=$(kubectl get hostconfig $hostconfig -o "jsonpath={.status.reconcileStatus.msg}")
reconcilemsg=$3
if [[ $reconcile == $reconcilemsg ]]; then
# Checking if the reconcile iterations have completed
echo "$hostconfig hostconfig executed successfully"
return 0
else
# Waiting for reconcile executions to complete
now=$(date +%s)
if [ $now -gt $end ]; then
echo -e "HostConfig CR execution not completed even after timeout"
exit 1
fi
fi
else
echo "$hostconfig hostconfig execution failed!"
exit 1
fi
elif [ -z "$failures" ] && [ -z "$unreachable" ]; then
# Checking the status until the timeout is reached
now=$(date +%s)
if [ $now -gt $end ]; then
echo -e "HostConfig CR execution not completed even after timeout"
exit 1
fi
else
# Failing execution if HostConfig CR has failed
echo "HostConfig CR execution failed"
exit 1
fi
sleep 30
done
}
# Executing HostConfig CR in simple reconcile loop
kubectl apply -f $AIRSHIP_HOSTCONFIG/demo_examples/example_reconcile.yaml
hosts=("hostconfig-control-plane" "hostconfig-control-plane2" "hostconfig-control-plane3")
check_status example9 '[ "hostconfig-control-plane", "hostconfig-control-plane2", "hostconfig-control-plane3" ]' "Reconcile iterations or interval not specified. Running simple reconcile."
kubectl delete -f $AIRSHIP_HOSTCONFIG/demo_examples/example_reconcile.yaml
# Executing HostConfig CR with reconcile_iterations configuration
kubectl apply -f $AIRSHIP_HOSTCONFIG/demo_examples/example_reconcile_iterations.yaml
hosts=("hostconfig-control-plane" "hostconfig-control-plane2" "hostconfig-control-plane3")
check_status example10 '[ "hostconfig-control-plane", "hostconfig-control-plane2", "hostconfig-control-plane3" ]' "Running reconcile completed. Total iterations completed are 3"
kubectl delete -f $AIRSHIP_HOSTCONFIG/demo_examples/example_reconcile_iterations.yaml
# Executing HostConfig CR with reconcile_interval configuration
kubectl apply -f $AIRSHIP_HOSTCONFIG/demo_examples/example_reconcile_interval.yaml
hosts=("hostconfig-control-plane" "hostconfig-control-plane2" "hostconfig-control-plane3")
check_status example11 '[ "hostconfig-control-plane", "hostconfig-control-plane2", "hostconfig-control-plane3" ]' "Running reconcile completed. Total iterations completed are 5"
kubectl delete -f $AIRSHIP_HOSTCONFIG/demo_examples/example_reconcile_interval.yaml
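The reconcile progress message that the gate waits for can also be watched directly; a sketch against the example10 CR used above:
```
# Message set by the operator once the requested reconcile iterations finish
kubectl get hostconfig example10 -o jsonpath='{.status.reconcileStatus.msg}'
```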

View File

@ -0,0 +1,73 @@
#!/usr/bin/env bash
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -xe
export TIMEOUT=${TIMEOUT:-3600}
export AIRSHIP_HOSTCONFIG=${AIRSHIP_HOSTCONFIG:-$PWD}
check_status(){
hostconfig=$1
echo $2
end=$(($(date +%s) + $TIMEOUT))
while true; do
failures=$(kubectl get hostconfig $hostconfig -o json | jq ".status.ansibleSummary.failures | length")
unreachable=$(kubectl get hostconfig $hostconfig -o jsonpath='{.status.ansibleSummary.unreachable}')
# Checking for number of failed hosts and unreachable hosts
if [[ $failures == $3 && $unreachable == "map[]" ]]; then
hosts=$2
ok=$(kubectl get hostconfig $hostconfig -o json | jq '.status.ansibleSummary.ok | keys')
ok_array=${ok[@]}
# Checking if all the remaining hosts have executed successfully
if [ "$hosts" == "$ok_array" ]; then
echo "$hostconfig hostconfig executed successfully"
return 0
else
echo "$hostconfig hostconfig execution failed!"
exit 1
fi
elif [ $failures == 0 ] && [ -z "$unreachable" ]; then
# Waiting for HostConfig CR to complete and stopping after timeout
now=$(date +%s)
if [ $now -gt $end ]; then
echo -e "HostConfig CR execution not completed even after timeout"
exit 1
fi
else
# Stopping execution in case the HostConfig CR fails
echo "HostConfig CR execution failed"
exit 1
fi
sleep 30
done
}
# Removing sudo access for the user so that execution fails on this node
sudo docker exec hostconfig-control-plane3 rm -rf /etc/sudoers.d/hostconfig
# Executing the stop_on_failure example
kubectl apply -f $AIRSHIP_HOSTCONFIG/demo_examples/example_stop_on_failure.yaml
check_status example6 '[ "hostconfig-control-plane", "hostconfig-control-plane2" ]' 1
# Executing the max failure nodes example
kubectl apply -f $AIRSHIP_HOSTCONFIG/demo_examples/example_max_percentage.yaml
check_status example5 '[ "hostconfig-control-plane", "hostconfig-control-plane2" ]' 1
kubectl delete -f $AIRSHIP_HOSTCONFIG/demo_examples/example_max_percentage.yaml
# Failing more master nodes
sudo docker exec hostconfig-control-plane2 rm -rf /etc/sudoers.d/hostconfig
kubectl apply -f $AIRSHIP_HOSTCONFIG/demo_examples/example_max_percentage.yaml
check_status example5 '[ "hostconfig-control-plane" ]' 2
kubectl delete -f $AIRSHIP_HOSTCONFIG/demo_examples/example_max_percentage.yaml
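After the failure scenarios run, sudo access can be restored on the affected nodes so that any later tests start from a clean state; a sketch mirroring the grant in 20_configure_ssh_on_nodes.sh, with the default test user hostconfig assumed:
```
# Re-create the sudoers entries removed above (default test user assumed)
for node in hostconfig-control-plane2 hostconfig-control-plane3; do
  sudo docker exec ${node} bash -c "echo 'hostconfig ALL=(ALL) NOPASSWD:ALL' > /etc/sudoers.d/hostconfig"
done
```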

View File

@ -1,16 +1,11 @@
#!/bin/bash
hosts=(`kubectl get nodes -o wide | awk '{print $1}' | sed -e '1d'`)
hosts_ips=(`kubectl get nodes -o wide | awk '{print $6}' | sed -e '1d'`)
if [[ $1 ]] && [[ $2 ]]; then
USERNAME=$1
PASSWORD=$2
get_username_password(){
if [ -z "$1" ]; then USERNAME="vagrant"; else USERNAME=$1; fi
if [ -z "$2" ]; then PASSWORD="vagrant"; else PASSWORD=$2; fi
echo $USERNAME $PASSWORD
}
copy_ssh_keys(){
read USERNAME PASSWORD < <(get_username_password $1 $2)
hosts=(`kubectl get nodes -o wide | awk '{print $1}' | sed -e '1d'`)
hosts_ips=(`kubectl get nodes -o wide | awk '{print $6}' | sed -e '1d'`)
for i in "${!hosts[@]}"
do
printf 'Working on host %s with Index %s and having IP %s\n' "${hosts[i]}" "$i" "${hosts_ips[i]}"
@ -19,6 +14,6 @@ copy_ssh_keys(){
kubectl create secret generic ${hosts[i]} --from-literal=username=$USERNAME --from-file=ssh_private_key=${hosts[i]}
kubectl annotate node ${hosts[i]} secret=${hosts[i]}
done
}
copy_ssh_keys $1 $2
else
echo "Please send username/password as arguments to the script."
fi

33
zuul.d/jobs.yaml Normal file
View File

@ -0,0 +1,33 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
- job:
name: airship-host-config
nodeset: airship-hostconfig-single-node
pre-run: playbooks/airship-hostconfig-deploy-docker.yaml
timeout: 3600
run: playbooks/airship-host-config.yaml
post-run: playbooks/airship-collect-logs.yaml
attempts: 1
vars:
gate_scripts:
- ./tools/deployment/00_install_kind.sh
- ./tools/deployment/01_install_kubectl.sh
- ./tools/deployment/02_install_operator_sdk.sh
- ./tools/deployment/10_create_hostconfig_cluster.sh
- ./tools/deployment/20_configure_ssh_on_nodes.sh
- ./tools/deployment/30_create_labels.sh
- ./tools/deployment/40_deploy_hostconfig_operator.sh
- ./tools/deployment/50_test_hostconfig_cr.sh
- ./tools/deployment/51_test_hostconfig_cr_reconcile.sh
- ./tools/deployment/52_test_hostconfig_cr_failure.sh
voting: false

27
zuul.d/nodesets.yaml Normal file
View File

@ -0,0 +1,27 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
- nodeset:
name: airship-hostconfig-single-node
nodes:
- name: primary
label: ubuntu-bionic
- nodeset:
name: airship-hostconfig-single-16GB-bionic-node
nodes:
- name: primary
label: ubuntu-bionic-expanded
- nodeset:
name: airship-hostconfig-single-32GB-bionic-node
nodes:
- name: primary
label: ubuntu-bionic-32GB

View File

@ -13,7 +13,7 @@
- project:
check:
jobs:
- noop
- airship-host-config
gate:
jobs:
- noop