Support dcos installation on centos vm cluster

This patch provides support for installing DC/OS on CentOS using Magnum.
A new COE type (dcos) is added, supporting the vm server type.
Design spec and steps on how to test can be found in
contrib/drivers/dcos_centos_v1/README.md.

Public agent nodes are not supported yet.

Co-Authored-By: fengbeihong (fengbeihong@gmail.com)
Co-Authored-By: vmud213 (vinay50muddu@yahoo.com)

Change-Id: I58b378b4bd6df34fd43307e4252cfbbd9bf593d8
Partially-Implements: blueprint mesos-dcos
fengbeihong 2016-10-12 10:17:02 +08:00
parent 1f2f002c52
commit 977f3af83f
18 changed files with 2195 additions and 3 deletions

@@ -0,0 +1,103 @@
How to build a centos image which contains DC/OS 1.8.x
======================================================
Here is the advanced DC/OS 1.8 installation guide.
See [Advanced DC/OS Installation Guide](https://dcos.io/docs/1.8/administration/installing/custom/advanced/)
See [Install Docker on CentOS](https://dcos.io/docs/1.8/administration/installing/custom/system-requirements/install-docker-centos/)
See [Adding agent nodes](https://dcos.io/docs/1.8/administration/installing/custom/add-a-node/)
Create a centos image using DIB, following the steps outlined in the DC/OS installation guide.
1. Install and configure docker in chroot.
2. Install system requirements in chroot.
3. Download `dcos_generate_config.sh` outside chroot.
   This file will be used to run `dcos_generate_config.sh --genconf` to generate
   config files on the node during magnum cluster creation.
4. Some configuration changes are required for DC/OS, e.g. disabling the firewalld
   service and adding the group named nogroup.
   See comments in the script file.
Use the centos image to build a DC/OS cluster with the following commands:
`magnum cluster-template-create`
`magnum cluster-create`
After all the instances with the centos image are created, the following steps
run during cluster creation:
1. Pass parameters to config.yaml with magnum cluster template properties.
2. Run `dcos_generate_config.sh --genconf` to generate config files.
3. Run `dcos_install.sh master` on master nodes and `dcos_install.sh slave` on slave nodes.
To scale the DC/OS cluster, use:
`magnum cluster-update`
The steps are the same as for cluster creation:
1. Create new instances, generate config files on them, and install.
2. Or delete those agent nodes where no containers are running.
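The config.yaml hand-off in step 1 above can be sketched in Python. The label names below match the cluster-template examples later in this README, but the mapping function itself is a hypothetical illustration, not magnum's actual code:

```python
# Hypothetical sketch: turn magnum cluster template labels into the
# key/value entries that end up in DC/OS's genconf config.yaml.
# The label names match the cluster-template examples in this README;
# the mapping function itself is illustrative, not magnum's real code.

def labels_to_genconf(labels):
    """Map magnum --labels entries to DC/OS config.yaml keys."""
    # Only copy labels the operator actually set; genconf falls back
    # to its own defaults for anything missing.
    known = ('oauth_enabled', 'dcos_overlay_enable',
             'dcos_overlay_config_attempts', 'dcos_overlay_mtu',
             'dcos_overlay_network')
    return {key: labels[key] for key in known if key in labels}

config = labels_to_genconf({'oauth_enabled': 'false',
                            'dcos_overlay_enable': 'true'})
# config now holds only the two labels that were set
```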
How to use magnum dcos coe
===============================================
We assume that magnum has been installed and the magnum path is `/opt/stack/magnum`.
1. Copy the dcos magnum coe source code
$ cp -r /opt/stack/magnum/contrib/drivers/dcos_centos_v1 /opt/stack/magnum/magnum/drivers/
$ cp /opt/stack/magnum/contrib/drivers/common/dcos_* /opt/stack/magnum/magnum/drivers/common/
$ cd /opt/stack/magnum
$ sudo python setup.py install
2. Add the driver to the `magnum.drivers` entry point section in setup.cfg
dcos_centos_v1 = magnum.drivers.dcos_centos_v1.driver:Driver
3. Restart your magnum services.
4. Prepare centos image with elements dcos and docker installed
See how to build a centos image in /opt/stack/magnum/magnum/drivers/dcos_centos_v1/image/README.md
5. Create glance image
$ glance image-create --name centos-7-dcos.qcow2 \
--visibility public \
--disk-format qcow2 \
--container-format bare \
--os-distro=centos \
< centos-7-dcos.qcow2
6. Create magnum cluster template
Configure DC/OS cluster with --labels
See https://dcos.io/docs/1.8/administration/installing/custom/configuration-parameters/
$ magnum cluster-template-create --name dcos-cluster-template \
--image-id centos-7-dcos.qcow2 \
--keypair-id testkey \
--external-network-id public \
--dns-nameserver 8.8.8.8 \
--flavor-id m1.medium \
--labels oauth_enabled=false \
--coe dcos
Here is an example specifying the overlay network in DC/OS;
'dcos_overlay_network' should be a JSON string.
$ magnum cluster-template-create --name dcos-cluster-template \
--image-id centos-7-dcos.qcow2 \
--keypair-id testkey \
--external-network-id public \
--dns-nameserver 8.8.8.8 \
--flavor-id m1.medium \
--labels oauth_enabled=false \
--labels dcos_overlay_enable='true' \
--labels dcos_overlay_config_attempts='6' \
--labels dcos_overlay_mtu='9001' \
--labels dcos_overlay_network='{"vtep_subnet": "44.128.0.0/20",\
"vtep_mac_oui": "70:B3:D5:00:00:00","overlays":\
[{"name": "dcos","subnet": "9.0.0.0/8","prefix": 26}]}' \
--coe dcos
7. Create magnum cluster
$ magnum cluster-create --name dcos-cluster --cluster-template dcos-cluster-template --node-count 1
8. After magnum cluster creation completes, wait a few minutes for the
DC/OS web interface to become accessible.

@@ -0,0 +1,36 @@
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

from magnum.drivers.dcos_centos_v1 import monitor
from magnum.drivers.dcos_centos_v1.scale_manager import DcosScaleManager
from magnum.drivers.dcos_centos_v1 import template_def
from magnum.drivers.heat import driver


class Driver(driver.HeatDriver):

    @property
    def provides(self):
        return [
            {'server_type': 'vm',
             'os': 'centos',
             'coe': 'dcos'},
        ]

    def get_template_definition(self):
        return template_def.DcosCentosVMTemplateDefinition()

    def get_monitor(self, context, cluster):
        return monitor.DcosMonitor(context, cluster)

    def get_scale_manager(self, context, osclient, cluster):
        return DcosScaleManager(context, osclient, cluster)
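The `provides` list is how magnum matches a (server_type, os, coe) request to this driver. A minimal sketch of that matching rule, with a stub class standing in for the real entry-point machinery:

```python
# Minimal sketch of how a (server_type, os, coe) tuple is matched
# against a driver's `provides` list. The real lookup goes through
# setup.cfg entry points; this stub only illustrates the matching rule.

class FakeDcosDriver(object):
    provides = [{'server_type': 'vm', 'os': 'centos', 'coe': 'dcos'}]

def driver_supports(driver_cls, server_type, os, coe):
    # A driver may advertise several tuples; any exact match qualifies.
    return any(p['server_type'] == server_type and
               p['os'] == os and
               p['coe'] == coe
               for p in driver_cls.provides)

assert driver_supports(FakeDcosDriver, 'vm', 'centos', 'dcos')
assert not driver_supports(FakeDcosDriver, 'bm', 'centos', 'dcos')
```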

@@ -0,0 +1,74 @@
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from oslo_serialization import jsonutils

from magnum.common import urlfetch
from magnum.conductor import monitors


class DcosMonitor(monitors.MonitorBase):

    def __init__(self, context, cluster):
        super(DcosMonitor, self).__init__(context, cluster)
        self.data = {}

    @property
    def metrics_spec(self):
        return {
            'memory_util': {
                'unit': '%',
                'func': 'compute_memory_util',
            },
            'cpu_util': {
                'unit': '%',
                'func': 'compute_cpu_util',
            },
        }

    # See https://github.com/dcos/adminrouter#ports-summary
    # Use http://<mesos-master>/mesos/ instead of http://<mesos-master>:5050
    def _build_url(self, url, protocol='http', server_name='mesos', path='/'):
        return protocol + '://' + url + '/' + server_name + path

    def _is_leader(self, state):
        return state['leader'] == state['pid']

    def pull_data(self):
        self.data['mem_total'] = 0
        self.data['mem_used'] = 0
        self.data['cpu_total'] = 0
        self.data['cpu_used'] = 0
        for master_addr in self.cluster.master_addresses:
            mesos_master_url = self._build_url(master_addr,
                                               server_name='mesos',
                                               path='/state')
            master = jsonutils.loads(urlfetch.get(mesos_master_url))
            if self._is_leader(master):
                for slave in master['slaves']:
                    self.data['mem_total'] += slave['resources']['mem']
                    self.data['mem_used'] += slave['used_resources']['mem']
                    self.data['cpu_total'] += slave['resources']['cpus']
                    self.data['cpu_used'] += slave['used_resources']['cpus']
                break

    def compute_memory_util(self):
        if self.data['mem_total'] == 0 or self.data['mem_used'] == 0:
            return 0
        else:
            return self.data['mem_used'] * 100 / self.data['mem_total']

    def compute_cpu_util(self):
        if self.data['cpu_total'] == 0 or self.data['cpu_used'] == 0:
            return 0
        else:
            return self.data['cpu_used'] * 100 / self.data['cpu_total']
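Fed the leading master's `/mesos/state` payload, the aggregation in `pull_data` produces cluster-wide utilization. A self-contained run of the same arithmetic against a fabricated two-agent state document (sample numbers only):

```python
# Stand-alone run of the monitor's aggregation logic against a
# fabricated Mesos /state payload (two agents, sample numbers only).

state = {
    'leader': 'master@10.0.0.5:5050',
    'pid': 'master@10.0.0.5:5050',   # leader == pid: this master leads
    'slaves': [
        {'resources': {'mem': 4096.0, 'cpus': 2.0},
         'used_resources': {'mem': 1024.0, 'cpus': 1.0}},
        {'resources': {'mem': 4096.0, 'cpus': 2.0},
         'used_resources': {'mem': 3072.0, 'cpus': 1.0}},
    ],
}

data = {'mem_total': 0, 'mem_used': 0, 'cpu_total': 0, 'cpu_used': 0}
if state['leader'] == state['pid']:
    for slave in state['slaves']:
        data['mem_total'] += slave['resources']['mem']
        data['mem_used'] += slave['used_resources']['mem']
        data['cpu_total'] += slave['resources']['cpus']
        data['cpu_used'] += slave['used_resources']['cpus']

memory_util = data['mem_used'] * 100 / data['mem_total']   # 50.0
cpu_util = data['cpu_used'] * 100 / data['cpu_total']      # 50.0
```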

@@ -0,0 +1,29 @@
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

from marathon import MarathonClient

from magnum.conductor.scale_manager import ScaleManager


class DcosScaleManager(ScaleManager):

    def __init__(self, context, osclient, cluster):
        super(DcosScaleManager, self).__init__(context, osclient, cluster)

    def _get_hosts_with_container(self, context, cluster):
        marathon_client = MarathonClient(
            'http://' + cluster.api_address + '/marathon/')
        hosts = set()
        for task in marathon_client.list_tasks():
            hosts.add(task.host)

        return hosts
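The scale manager only needs the set of hosts where Marathon tasks are running, so those nodes can be spared when scaling down. A sketch of that logic with a stub standing in for `MarathonClient`:

```python
# Sketch of the hosts-with-container logic, with a stub standing in
# for the real MarathonClient (which talks to /marathon/ over HTTP).

class FakeTask(object):
    def __init__(self, host):
        self.host = host

class FakeMarathonClient(object):
    def list_tasks(self):
        # Two tasks land on the same host; the set collapses them.
        return [FakeTask('10.0.0.4'), FakeTask('10.0.0.6'),
                FakeTask('10.0.0.4')]

def hosts_with_container(client):
    # Same shape as DcosScaleManager._get_hosts_with_container:
    # collect each task's host into a set.
    return {task.host for task in client.list_tasks()}

busy = hosts_with_container(FakeMarathonClient())
# busy == {'10.0.0.4', '10.0.0.6'}: only idle nodes remain candidates
# for removal during a scale-down.
```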

@@ -0,0 +1,28 @@
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.

import os

from magnum.drivers.heat import dcos_centos_template_def as dctd


class DcosCentosVMTemplateDefinition(dctd.DcosCentosTemplateDefinition):
    """DC/OS template for Centos VM."""

    @property
    def driver_module_path(self):
        return __name__[:__name__.rindex('.')]

    @property
    def template_path(self):
        return os.path.join(os.path.dirname(os.path.realpath(__file__)),
                            'templates/dcoscluster.yaml')
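`driver_module_path` strips the trailing module name from `__name__` to get the containing driver package. The same expression in isolation:

```python
# The expression behind driver_module_path: drop everything after the
# last dot of a dotted module name to get the containing package.

def module_package(name):
    return name[:name.rindex('.')]

pkg = module_package('magnum.drivers.dcos_centos_v1.template_def')
# pkg == 'magnum.drivers.dcos_centos_v1'
```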

@@ -0,0 +1,674 @@
heat_template_version: 2014-10-16
description: >
This template will boot a DC/OS cluster with one or more masters
(as specified by number_of_masters, default is 1) and one or more slaves
(as specified by the number_of_slaves parameter, which
defaults to 1).
parameters:
cluster_name:
type: string
description: human readable name for the DC/OS cluster
default: my-cluster
number_of_masters:
type: number
description: how many DC/OS masters to spawn initially
default: 1
# In DC/OS, there are two types of slave nodes, public and private.
# Public slave nodes have external access and private slave nodes don't.
# Magnum only supports one type of slave node, and the cluster template
# properties are left unchanged, so slave nodes are created as private agents.
number_of_slaves:
type: number
description: how many DC/OS agents or slaves to spawn initially
default: 1
master_flavor:
type: string
default: m1.medium
description: flavor to use when booting the master servers
slave_flavor:
type: string
default: m1.medium
description: flavor to use when booting the slave servers
server_image:
type: string
default: centos-dcos
description: glance image used to boot the server
ssh_key_name:
type: string
description: name of ssh key to be provisioned on our server
external_network:
type: string
description: uuid/name of a network to use for floating ip addresses
default: public
fixed_network:
type: string
description: uuid/name of an existing network to use to provision machines
default: ""
fixed_subnet:
type: string
description: uuid/name of an existing subnet to use to provision machines
default: ""
fixed_network_cidr:
type: string
description: network range for fixed ip network
default: 10.0.0.0/24
dns_nameserver:
type: string
description: address of a dns nameserver reachable in your environment
http_proxy:
type: string
description: http proxy address for docker
default: ""
https_proxy:
type: string
description: https proxy address for docker
default: ""
no_proxy:
type: string
description: no proxies for docker
default: ""
######################################################################
#
# Trustee Configuration
#
trustee_domain_id:
type: string
description: domain id of the trustee
default: ""
trustee_user_id:
type: string
description: user id of the trustee
default: ""
trustee_username:
type: string
description: username of the trustee
default: ""
trustee_password:
type: string
description: password of the trustee
default: ""
hidden: true
trust_id:
type: string
description: id of the trust which is used by the trustee
default: ""
hidden: true
######################################################################
#
# Rexray Configuration
#
volume_driver:
type: string
description: volume driver to use for container storage
default: ""
username:
type: string
description: user name
tenant_name:
type: string
description: >
tenant_name is used to isolate access to cloud resources
domain_name:
type: string
description: >
domain is to define the administrative boundaries for management
of Keystone entities
region_name:
type: string
description: a logically separate section of the cluster
rexray_preempt:
type: string
description: >
enables any host to take control of a volume irrespective of whether
other hosts are using the volume
default: "false"
auth_url:
type: string
description: url for keystone
slaves_to_remove:
type: comma_delimited_list
description: >
List of slaves to be removed when doing an update. An individual slave may
be referenced in several ways: (1) the resource name (e.g. ['1', '3']),
(2) the private IP address (e.g. ['10.0.0.4', '10.0.0.6']). Note: the list
should be empty when doing a create.
default: []
wait_condition_timeout:
type: number
description: >
timeout for the Wait Conditions
default: 6000
password:
type: string
description: >
user password, not set in current implementation, only used to
fill in the DC/OS config file
default:
password
hidden: true
######################################################################
#
# DC/OS parameters
#
# cluster_name
exhibitor_storage_backend:
type: string
default: "static"
exhibitor_zk_hosts:
type: string
default: ""
exhibitor_zk_path:
type: string
default: ""
aws_access_key_id:
type: string
default: ""
aws_region:
type: string
default: ""
aws_secret_access_key:
type: string
default: ""
exhibitor_explicit_keys:
type: string
default: ""
s3_bucket:
type: string
default: ""
s3_prefix:
type: string
default: ""
exhibitor_azure_account_name:
type: string
default: ""
exhibitor_azure_account_key:
type: string
default: ""
exhibitor_azure_prefix:
type: string
default: ""
# master_discovery default set to "static"
# If --master-lb-enabled is specified,
# master_discovery will be set to "master_http_loadbalancer"
master_discovery:
type: string
default: "static"
# master_list
# exhibitor_address
# num_masters
####################################################
# Networking
dcos_overlay_enable:
type: string
default: ""
constraints:
- allowed_values:
- "true"
- "false"
- ""
dcos_overlay_config_attempts:
type: string
default: ""
dcos_overlay_mtu:
type: string
default: ""
dcos_overlay_network:
type: string
default: ""
dns_search:
type: string
description: >
This parameter specifies a space-separated list of domains that
are tried when an unqualified domain is entered
default: ""
# resolvers
# use_proxy
####################################################
# Performance and Tuning
check_time:
type: string
default: "true"
constraints:
- allowed_values:
- "true"
- "false"
docker_remove_delay:
type: number
default: 1
gc_delay:
type: number
default: 2
log_directory:
type: string
default: "/genconf/logs"
process_timeout:
type: number
default: 120
####################################################
# Security And Authentication
oauth_enabled:
type: string
default: "true"
constraints:
- allowed_values:
- "true"
- "false"
telemetry_enabled:
type: string
default: "true"
constraints:
- allowed_values:
- "true"
- "false"
resources:
######################################################################
#
# network resources. allocate a network and router for our server.
#
network:
type: ../../common/templates/network.yaml
properties:
existing_network: {get_param: fixed_network}
existing_subnet: {get_param: fixed_subnet}
private_network_cidr: {get_param: fixed_network_cidr}
dns_nameserver: {get_param: dns_nameserver}
external_network: {get_param: external_network}
api_lb:
type: lb.yaml
properties:
fixed_subnet: {get_attr: [network, fixed_subnet]}
external_network: {get_param: external_network}
######################################################################
#
# security groups. we need to permit network traffic of various
# sorts.
#
secgroup:
type: secgroup.yaml
######################################################################
#
# resources that expose the IPs of either the dcos master or a given
# LBaaS pool depending on whether LBaaS is enabled for the cluster.
#
api_address_lb_switch:
type: Magnum::ApiGatewaySwitcher
properties:
pool_public_ip: {get_attr: [api_lb, floating_address]}
pool_private_ip: {get_attr: [api_lb, address]}
master_public_ip: {get_attr: [dcos_masters, resource.0.dcos_master_external_ip]}
master_private_ip: {get_attr: [dcos_masters, resource.0.dcos_master_ip]}
######################################################################
#
# Master SoftwareConfig.
#
write_params_master:
type: OS::Heat::SoftwareConfig
properties:
group: script
config: {get_file: fragments/write-heat-params.sh}
inputs:
- name: HTTP_PROXY
type: String
- name: HTTPS_PROXY
type: String
- name: NO_PROXY
type: String
- name: AUTH_URL
type: String
- name: USERNAME
type: String
- name: PASSWORD
type: String
- name: TENANT_NAME
type: String
- name: VOLUME_DRIVER
type: String
- name: REGION_NAME
type: String
- name: DOMAIN_NAME
type: String
- name: REXRAY_PREEMPT
type: String
- name: CLUSTER_NAME
type: String
- name: EXHIBITOR_STORAGE_BACKEND
type: String
- name: EXHIBITOR_ZK_HOSTS
type: String
- name: EXHIBITOR_ZK_PATH
type: String
- name: AWS_ACCESS_KEY_ID
type: String
- name: AWS_REGION
type: String
- name: AWS_SECRET_ACCESS_KEY
type: String
- name: EXHIBITOR_EXPLICIT_KEYS
type: String
- name: S3_BUCKET
type: String
- name: S3_PREFIX
type: String
- name: EXHIBITOR_AZURE_ACCOUNT_NAME
type: String
- name: EXHIBITOR_AZURE_ACCOUNT_KEY
type: String
- name: EXHIBITOR_AZURE_PREFIX
type: String
- name: MASTER_DISCOVERY
type: String
- name: MASTER_LIST
type: String
- name: EXHIBITOR_ADDRESS
type: String
- name: NUM_MASTERS
type: String
- name: DCOS_OVERLAY_ENABLE
type: String
- name: DCOS_OVERLAY_CONFIG_ATTEMPTS
type: String
- name: DCOS_OVERLAY_MTU
type: String
- name: DCOS_OVERLAY_NETWORK
type: String
- name: DNS_SEARCH
type: String
- name: RESOLVERS
type: String
- name: CHECK_TIME
type: String
- name: DOCKER_REMOVE_DELAY
type: String
- name: GC_DELAY
type: String
- name: LOG_DIRECTORY
type: String
- name: PROCESS_TIMEOUT
type: String
- name: OAUTH_ENABLED
type: String
- name: TELEMETRY_ENABLED
type: String
- name: ROLES
type: String
######################################################################
#
# DC/OS configuration SoftwareConfig.
# Configuration files are rendered and injected into the instance.
#
dcos_config:
type: OS::Heat::SoftwareConfig
properties:
group: script
config: {get_file: fragments/configure-dcos.sh}
######################################################################
#
# Master SoftwareDeployment.
#
write_params_master_deployment:
type: OS::Heat::SoftwareDeploymentGroup
properties:
config: {get_resource: write_params_master}
servers: {get_attr: [dcos_masters, attributes, dcos_server_id]}
input_values:
HTTP_PROXY: {get_param: http_proxy}
HTTPS_PROXY: {get_param: https_proxy}
NO_PROXY: {get_param: no_proxy}
AUTH_URL: {get_param: auth_url}
USERNAME: {get_param: username}
PASSWORD: {get_param: password}
TENANT_NAME: {get_param: tenant_name}
VOLUME_DRIVER: {get_param: volume_driver}
REGION_NAME: {get_param: region_name}
DOMAIN_NAME: {get_param: domain_name}
REXRAY_PREEMPT: {get_param: rexray_preempt}
CLUSTER_NAME: {get_param: cluster_name}
EXHIBITOR_STORAGE_BACKEND: {get_param: exhibitor_storage_backend}
EXHIBITOR_ZK_HOSTS: {get_param: exhibitor_zk_hosts}
EXHIBITOR_ZK_PATH: {get_param: exhibitor_zk_path}
AWS_ACCESS_KEY_ID: {get_param: aws_access_key_id}
AWS_REGION: {get_param: aws_region}
AWS_SECRET_ACCESS_KEY: {get_param: aws_secret_access_key}
EXHIBITOR_EXPLICIT_KEYS: {get_param: exhibitor_explicit_keys}
S3_BUCKET: {get_param: s3_bucket}
S3_PREFIX: {get_param: s3_prefix}
EXHIBITOR_AZURE_ACCOUNT_NAME: {get_param: exhibitor_azure_account_name}
EXHIBITOR_AZURE_ACCOUNT_KEY: {get_param: exhibitor_azure_account_key}
EXHIBITOR_AZURE_PREFIX: {get_param: exhibitor_azure_prefix}
MASTER_DISCOVERY: {get_param: master_discovery}
MASTER_LIST: {list_join: [' ', {get_attr: [dcos_masters, dcos_master_ip]}]}
EXHIBITOR_ADDRESS: {get_attr: [api_lb, address]}
NUM_MASTERS: {get_param: number_of_masters}
DCOS_OVERLAY_ENABLE: {get_param: dcos_overlay_enable}
DCOS_OVERLAY_CONFIG_ATTEMPTS: {get_param: dcos_overlay_config_attempts}
DCOS_OVERLAY_MTU: {get_param: dcos_overlay_mtu}
DCOS_OVERLAY_NETWORK: {get_param: dcos_overlay_network}
DNS_SEARCH: {get_param: dns_search}
RESOLVERS: {get_param: dns_nameserver}
CHECK_TIME: {get_param: check_time}
DOCKER_REMOVE_DELAY: {get_param: docker_remove_delay}
GC_DELAY: {get_param: gc_delay}
LOG_DIRECTORY: {get_param: log_directory}
PROCESS_TIMEOUT: {get_param: process_timeout}
OAUTH_ENABLED: {get_param: oauth_enabled}
TELEMETRY_ENABLED: {get_param: telemetry_enabled}
ROLES: master
dcos_config_deployment:
type: OS::Heat::SoftwareDeploymentGroup
depends_on:
- write_params_master_deployment
properties:
config: {get_resource: dcos_config}
servers: {get_attr: [dcos_masters, attributes, dcos_server_id]}
######################################################################
#
# DC/OS masters. This is a resource group that will create
# <number_of_masters> masters.
#
dcos_masters:
type: OS::Heat::ResourceGroup
depends_on:
- network
properties:
count: {get_param: number_of_masters}
resource_def:
type: dcosmaster.yaml
properties:
ssh_key_name: {get_param: ssh_key_name}
server_image: {get_param: server_image}
master_flavor: {get_param: master_flavor}
external_network: {get_param: external_network}
fixed_network: {get_attr: [network, fixed_network]}
fixed_subnet: {get_attr: [network, fixed_subnet]}
secgroup_base_id: {get_attr: [secgroup, secgroup_base_id]}
secgroup_dcos_id: {get_attr: [secgroup, secgroup_dcos_id]}
api_pool_80_id: {get_attr: [api_lb, pool_80_id]}
api_pool_443_id: {get_attr: [api_lb, pool_443_id]}
api_pool_8080_id: {get_attr: [api_lb, pool_8080_id]}
api_pool_5050_id: {get_attr: [api_lb, pool_5050_id]}
api_pool_2181_id: {get_attr: [api_lb, pool_2181_id]}
api_pool_8181_id: {get_attr: [api_lb, pool_8181_id]}
######################################################################
#
# DC/OS slaves. This is a resource group that will initially
# create <number_of_slaves> public or private slaves,
# and needs to be manually scaled.
#
dcos_slaves:
type: OS::Heat::ResourceGroup
depends_on:
- network
properties:
count: {get_param: number_of_slaves}
removal_policies: [{resource_list: {get_param: slaves_to_remove}}]
resource_def:
type: dcosslave.yaml
properties:
ssh_key_name: {get_param: ssh_key_name}
server_image: {get_param: server_image}
slave_flavor: {get_param: slave_flavor}
fixed_network: {get_attr: [network, fixed_network]}
fixed_subnet: {get_attr: [network, fixed_subnet]}
external_network: {get_param: external_network}
wait_condition_timeout: {get_param: wait_condition_timeout}
secgroup_base_id: {get_attr: [secgroup, secgroup_base_id]}
# DC/OS params
auth_url: {get_param: auth_url}
username: {get_param: username}
password: {get_param: password}
tenant_name: {get_param: tenant_name}
volume_driver: {get_param: volume_driver}
region_name: {get_param: region_name}
domain_name: {get_param: domain_name}
rexray_preempt: {get_param: rexray_preempt}
http_proxy: {get_param: http_proxy}
https_proxy: {get_param: https_proxy}
no_proxy: {get_param: no_proxy}
cluster_name: {get_param: cluster_name}
exhibitor_storage_backend: {get_param: exhibitor_storage_backend}
exhibitor_zk_hosts: {get_param: exhibitor_zk_hosts}
exhibitor_zk_path: {get_param: exhibitor_zk_path}
aws_access_key_id: {get_param: aws_access_key_id}
aws_region: {get_param: aws_region}
aws_secret_access_key: {get_param: aws_secret_access_key}
exhibitor_explicit_keys: {get_param: exhibitor_explicit_keys}
s3_bucket: {get_param: s3_bucket}
s3_prefix: {get_param: s3_prefix}
exhibitor_azure_account_name: {get_param: exhibitor_azure_account_name}
exhibitor_azure_account_key: {get_param: exhibitor_azure_account_key}
exhibitor_azure_prefix: {get_param: exhibitor_azure_prefix}
master_discovery: {get_param: master_discovery}
master_list: {list_join: [' ', {get_attr: [dcos_masters, dcos_master_ip]}]}
exhibitor_address: {get_attr: [api_lb, address]}
num_masters: {get_param: number_of_masters}
dcos_overlay_enable: {get_param: dcos_overlay_enable}
dcos_overlay_config_attempts: {get_param: dcos_overlay_config_attempts}
dcos_overlay_mtu: {get_param: dcos_overlay_mtu}
dcos_overlay_network: {get_param: dcos_overlay_network}
dns_search: {get_param: dns_search}
resolvers: {get_param: dns_nameserver}
check_time: {get_param: check_time}
docker_remove_delay: {get_param: docker_remove_delay}
gc_delay: {get_param: gc_delay}
log_directory: {get_param: log_directory}
process_timeout: {get_param: process_timeout}
oauth_enabled: {get_param: oauth_enabled}
telemetry_enabled: {get_param: telemetry_enabled}
outputs:
api_address:
value: {get_attr: [api_address_lb_switch, public_ip]}
description: >
This is the API endpoint of the DC/OS master. Use this to access
the DC/OS API from outside the cluster.
dcos_master_private:
value: {get_attr: [dcos_masters, dcos_master_ip]}
description: >
This is a list of the "private" addresses of all the DC/OS masters.
dcos_master:
value: {get_attr: [dcos_masters, dcos_master_external_ip]}
description: >
This is the "public" ip address of the DC/OS master server. Use this address to
log in to the DC/OS master via ssh or to access the DC/OS API
from outside the cluster.
dcos_slaves_private:
value: {get_attr: [dcos_slaves, dcos_slave_ip]}
description: >
This is a list of the "private" addresses of all the DC/OS slaves.
dcos_slaves:
value: {get_attr: [dcos_slaves, dcos_slave_external_ip]}
description: >
This is a list of the "public" addresses of all the DC/OS slaves.

@@ -0,0 +1,161 @@
heat_template_version: 2014-10-16
description: >
This is a nested stack that defines a single DC/OS master. This stack is
included by a ResourceGroup resource in the parent template
(dcoscluster.yaml).
parameters:
server_image:
type: string
description: glance image used to boot the server
master_flavor:
type: string
description: flavor to use when booting the server
ssh_key_name:
type: string
description: name of ssh key to be provisioned on our server
external_network:
type: string
description: uuid/name of a network to use for floating ip addresses
fixed_network:
type: string
description: Network from which to allocate fixed addresses.
fixed_subnet:
type: string
description: Subnet from which to allocate fixed addresses.
secgroup_base_id:
type: string
description: ID of the security group for base.
secgroup_dcos_id:
type: string
description: ID of the security group for DC/OS master.
api_pool_80_id:
type: string
description: ID of the load balancer pool for HTTP (port 80).
api_pool_443_id:
type: string
description: ID of the load balancer pool for HTTPS (port 443).
api_pool_8080_id:
type: string
description: ID of the load balancer pool for Marathon.
api_pool_5050_id:
type: string
description: ID of the load balancer pool for the Mesos master.
api_pool_2181_id:
type: string
description: ID of the load balancer pool for ZooKeeper.
api_pool_8181_id:
type: string
description: ID of the load balancer pool for Exhibitor.
resources:
######################################################################
#
# DC/OS master server.
#
dcos_master:
type: OS::Nova::Server
properties:
image: {get_param: server_image}
flavor: {get_param: master_flavor}
key_name: {get_param: ssh_key_name}
user_data_format: SOFTWARE_CONFIG
networks:
- port: {get_resource: dcos_master_eth0}
dcos_master_eth0:
type: OS::Neutron::Port
properties:
network: {get_param: fixed_network}
security_groups:
- {get_param: secgroup_base_id}
- {get_param: secgroup_dcos_id}
fixed_ips:
- subnet: {get_param: fixed_subnet}
replacement_policy: AUTO
dcos_master_floating:
type: Magnum::Optional::DcosMaster::Neutron::FloatingIP
properties:
floating_network: {get_param: external_network}
port_id: {get_resource: dcos_master_eth0}
api_pool_80_member:
type: Magnum::Optional::Neutron::LBaaS::PoolMember
properties:
pool: {get_param: api_pool_80_id}
address: {get_attr: [dcos_master_eth0, fixed_ips, 0, ip_address]}
subnet: { get_param: fixed_subnet }
protocol_port: 80
api_pool_443_member:
type: Magnum::Optional::Neutron::LBaaS::PoolMember
properties:
pool: {get_param: api_pool_443_id}
address: {get_attr: [dcos_master_eth0, fixed_ips, 0, ip_address]}
subnet: { get_param: fixed_subnet }
protocol_port: 443
api_pool_8080_member:
type: Magnum::Optional::Neutron::LBaaS::PoolMember
properties:
pool: {get_param: api_pool_8080_id}
address: {get_attr: [dcos_master_eth0, fixed_ips, 0, ip_address]}
subnet: { get_param: fixed_subnet }
protocol_port: 8080
api_pool_5050_member:
type: Magnum::Optional::Neutron::LBaaS::PoolMember
properties:
pool: {get_param: api_pool_5050_id}
address: {get_attr: [dcos_master_eth0, fixed_ips, 0, ip_address]}
subnet: { get_param: fixed_subnet }
protocol_port: 5050
api_pool_2181_member:
type: Magnum::Optional::Neutron::LBaaS::PoolMember
properties:
pool: {get_param: api_pool_2181_id}
address: {get_attr: [dcos_master_eth0, fixed_ips, 0, ip_address]}
subnet: { get_param: fixed_subnet }
protocol_port: 2181
api_pool_8181_member:
type: Magnum::Optional::Neutron::LBaaS::PoolMember
properties:
pool: {get_param: api_pool_8181_id}
address: {get_attr: [dcos_master_eth0, fixed_ips, 0, ip_address]}
subnet: { get_param: fixed_subnet }
protocol_port: 8181
outputs:
dcos_master_ip:
value: {get_attr: [dcos_master_eth0, fixed_ips, 0, ip_address]}
description: >
This is the "private" address of the DC/OS master node.
dcos_master_external_ip:
value: {get_attr: [dcos_master_floating, floating_ip_address]}
description: >
This is the "public" address of the DC/OS master node.
dcos_server_id:
value: {get_resource: dcos_master}
description: >
This is the logical id of the DC/OS master node.

@@ -0,0 +1,338 @@
heat_template_version: 2014-10-16
description: >
This is a nested stack that defines a single DC/OS slave. This stack is
included by a ResourceGroup resource in the parent template
(dcoscluster.yaml).
parameters:
server_image:
type: string
description: glance image used to boot the server
slave_flavor:
type: string
description: flavor to use when booting the server
ssh_key_name:
type: string
description: name of ssh key to be provisioned on our server
external_network:
type: string
description: uuid/name of a network to use for floating ip addresses
wait_condition_timeout:
type: number
description: >
timeout for the Wait Conditions
http_proxy:
type: string
description: http proxy address for docker
https_proxy:
type: string
description: https proxy address for docker
no_proxy:
type: string
description: no proxies for docker
auth_url:
type: string
description: >
url for DC/OS to authenticate before sending request
username:
type: string
description: user name
password:
type: string
description: >
user password, not set in current implementation, only used to
fill in the DC/OS config file
hidden: true
tenant_name:
type: string
description: >
tenant_name is used to isolate access to Compute resources
volume_driver:
type: string
description: volume driver to use for container storage
region_name:
type: string
description: A logically separate section of the cluster
domain_name:
type: string
description: >
domain is to define the administrative boundaries for management
of Keystone entities
fixed_network:
type: string
description: Network from which to allocate fixed addresses.
fixed_subnet:
type: string
description: Subnet from which to allocate fixed addresses.
secgroup_base_id:
type: string
description: ID of the security group for base.
rexray_preempt:
type: string
description: >
enables any host to take control of a volume irrespective of whether
other hosts are using the volume
######################################################################
#
# DC/OS parameters
#
cluster_name:
type: string
description: human readable name for the DC/OS cluster
default: my-cluster
exhibitor_storage_backend:
type: string
exhibitor_zk_hosts:
type: string
exhibitor_zk_path:
type: string
aws_access_key_id:
type: string
aws_region:
type: string
aws_secret_access_key:
type: string
exhibitor_explicit_keys:
type: string
s3_bucket:
type: string
s3_prefix:
type: string
exhibitor_azure_account_name:
type: string
exhibitor_azure_account_key:
type: string
exhibitor_azure_prefix:
type: string
master_discovery:
type: string
master_list:
type: string
exhibitor_address:
type: string
default: 127.0.0.1
num_masters:
type: number
dcos_overlay_enable:
type: string
dcos_overlay_config_attempts:
type: string
dcos_overlay_mtu:
type: string
dcos_overlay_network:
type: string
dns_search:
type: string
resolvers:
type: string
check_time:
type: string
docker_remove_delay:
type: number
gc_delay:
type: number
log_directory:
type: string
process_timeout:
type: number
oauth_enabled:
type: string
telemetry_enabled:
type: string
resources:
slave_wait_handle:
type: OS::Heat::WaitConditionHandle
slave_wait_condition:
type: OS::Heat::WaitCondition
depends_on: dcos_slave
properties:
handle: {get_resource: slave_wait_handle}
timeout: {get_param: wait_condition_timeout}
secgroup_all_open:
type: OS::Neutron::SecurityGroup
properties:
rules:
- protocol: icmp
- protocol: tcp
- protocol: udp
######################################################################
#
# software configs. these are components that are combined into
# a multipart MIME user-data archive.
#
write_heat_params:
type: OS::Heat::SoftwareConfig
properties:
group: ungrouped
config:
str_replace:
template: {get_file: fragments/write-heat-params.sh}
params:
"$HTTP_PROXY": {get_param: http_proxy}
"$HTTPS_PROXY": {get_param: https_proxy}
"$NO_PROXY": {get_param: no_proxy}
"$AUTH_URL": {get_param: auth_url}
"$USERNAME": {get_param: username}
"$PASSWORD": {get_param: password}
"$TENANT_NAME": {get_param: tenant_name}
"$VOLUME_DRIVER": {get_param: volume_driver}
"$REGION_NAME": {get_param: region_name}
"$DOMAIN_NAME": {get_param: domain_name}
"$REXRAY_PREEMPT": {get_param: rexray_preempt}
"$CLUSTER_NAME": {get_param: cluster_name}
"$EXHIBITOR_STORAGE_BACKEND": {get_param: exhibitor_storage_backend}
"$EXHIBITOR_ZK_HOSTS": {get_param: exhibitor_zk_hosts}
"$EXHIBITOR_ZK_PATH": {get_param: exhibitor_zk_path}
"$AWS_ACCESS_KEY_ID": {get_param: aws_access_key_id}
"$AWS_REGION": {get_param: aws_region}
"$AWS_SECRET_ACCESS_KEY": {get_param: aws_secret_access_key}
"$EXHIBITOR_EXPLICIT_KEYS": {get_param: exhibitor_explicit_keys}
"$S3_BUCKET": {get_param: s3_bucket}
"$S3_PREFIX": {get_param: s3_prefix}
"$EXHIBITOR_AZURE_ACCOUNT_NAME": {get_param: exhibitor_azure_account_name}
"$EXHIBITOR_AZURE_ACCOUNT_KEY": {get_param: exhibitor_azure_account_key}
"$EXHIBITOR_AZURE_PREFIX": {get_param: exhibitor_azure_prefix}
"$MASTER_DISCOVERY": {get_param: master_discovery}
"$MASTER_LIST": {get_param: master_list}
"$EXHIBITOR_ADDRESS": {get_param: exhibitor_address}
"$NUM_MASTERS": {get_param: num_masters}
"$DCOS_OVERLAY_ENABLE": {get_param: dcos_overlay_enable}
"$DCOS_OVERLAY_CONFIG_ATTEMPTS": {get_param: dcos_overlay_config_attempts}
"$DCOS_OVERLAY_MTU": {get_param: dcos_overlay_mtu}
"$DCOS_OVERLAY_NETWORK": {get_param: dcos_overlay_network}
"$DNS_SEARCH": {get_param: dns_search}
"$RESOLVERS": {get_param: resolvers}
"$CHECK_TIME": {get_param: check_time}
"$DOCKER_REMOVE_DELAY": {get_param: docker_remove_delay}
"$GC_DELAY": {get_param: gc_delay}
"$LOG_DIRECTORY": {get_param: log_directory}
"$PROCESS_TIMEOUT": {get_param: process_timeout}
"$OAUTH_ENABLED": {get_param: oauth_enabled}
"$TELEMETRY_ENABLED": {get_param: telemetry_enabled}
"$ROLES": slave
dcos_config:
type: OS::Heat::SoftwareConfig
properties:
group: ungrouped
config: {get_file: fragments/configure-dcos.sh}
slave_wc_notify:
type: OS::Heat::SoftwareConfig
properties:
group: ungrouped
config:
str_replace:
template: |
#!/bin/bash -v
wc_notify --data-binary '{"status": "SUCCESS"}'
params:
wc_notify: {get_attr: [slave_wait_handle, curl_cli]}
dcos_slave_init:
type: OS::Heat::MultipartMime
properties:
parts:
- config: {get_resource: write_heat_params}
- config: {get_resource: dcos_config}
- config: {get_resource: slave_wc_notify}
######################################################################
#
# a single DC/OS slave.
#
dcos_slave:
type: OS::Nova::Server
properties:
image: {get_param: server_image}
flavor: {get_param: slave_flavor}
key_name: {get_param: ssh_key_name}
user_data_format: RAW
user_data: {get_resource: dcos_slave_init}
networks:
- port: {get_resource: dcos_slave_eth0}
dcos_slave_eth0:
type: OS::Neutron::Port
properties:
network: {get_param: fixed_network}
security_groups:
- get_resource: secgroup_all_open
- get_param: secgroup_base_id
fixed_ips:
- subnet: {get_param: fixed_subnet}
dcos_slave_floating:
type: Magnum::Optional::DcosSlave::Neutron::FloatingIP
properties:
floating_network: {get_param: external_network}
port_id: {get_resource: dcos_slave_eth0}
outputs:
dcos_slave_ip:
value: {get_attr: [dcos_slave_eth0, fixed_ips, 0, ip_address]}
description: >
This is the "private" address of the DC/OS slave node.
dcos_slave_external_ip:
value: {get_attr: [dcos_slave_floating, floating_ip_address]}
description: >
This is the "public" address of the DC/OS slave node.
@@ -0,0 +1,187 @@
#!/bin/bash
. /etc/sysconfig/heat-params
GENCONF_SCRIPT_DIR=/opt/dcos
sudo mkdir -p $GENCONF_SCRIPT_DIR/genconf
sudo chown -R centos $GENCONF_SCRIPT_DIR/genconf
# Configure ip-detect
cat > $GENCONF_SCRIPT_DIR/genconf/ip-detect <<EOF
#!/usr/bin/env bash
set -o nounset -o errexit
export PATH=/usr/sbin:/usr/bin:\$PATH
echo \$(ip addr show eth0 | grep -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' | head -1)
EOF
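The ip-detect helper above simply reports the first dotted-quad token found in the `ip addr show eth0` output. A minimal standalone sketch of that extraction, run against canned output (interface details and addresses are illustrative):

```shell
# Canned "ip addr show eth0" output; addresses are illustrative.
sample='2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
    inet 10.0.0.5/24 brd 10.0.0.255 scope global eth0'

# The same pipeline ip-detect runs: first IPv4-looking token wins.
ip=$(echo "$sample" | grep -Eo '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' | head -1)
echo "$ip"    # -> 10.0.0.5
```

Note that because `head -1` takes the first match, a host with several addresses on eth0 gets whichever appears first; a stricter filter would be needed to prefer a particular subnet.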
# Configure config.yaml
CONFIG_YAML_FILE=$GENCONF_SCRIPT_DIR/genconf/config.yaml
####################################################
# Cluster Setup
# bootstrap_url is not configurable
echo "bootstrap_url: file://$GENCONF_SCRIPT_DIR/genconf/serve" > $CONFIG_YAML_FILE
# cluster_name
echo "cluster_name: $CLUSTER_NAME" >> $CONFIG_YAML_FILE
# exhibitor_storage_backend
if [ "static" == "$EXHIBITOR_STORAGE_BACKEND" ]; then
echo "exhibitor_storage_backend: static" >> $CONFIG_YAML_FILE
elif [ "zookeeper" == "$EXHIBITOR_STORAGE_BACKEND" ]; then
echo "exhibitor_storage_backend: zookeeper" >> $CONFIG_YAML_FILE
echo "exhibitor_zk_hosts: $EXHIBITOR_ZK_HOSTS" >> $CONFIG_YAML_FILE
echo "exhibitor_zk_path: $EXHIBITOR_ZK_PATH" >> $CONFIG_YAML_FILE
elif [ "aws_s3" == "$EXHIBITOR_STORAGE_BACKEND" ]; then
echo "exhibitor_storage_backend: aws_s3" >> $CONFIG_YAML_FILE
echo "aws_access_key_id: $AWS_ACCESS_KEY_ID" >> $CONFIG_YAML_FILE
echo "aws_region: $AWS_REGION" >> $CONFIG_YAML_FILE
echo "aws_secret_access_key: $AWS_SECRET_ACCESS_KEY" >> $CONFIG_YAML_FILE
echo "exhibitor_explicit_keys: $EXHIBITOR_EXPLICIT_KEYS" >> $CONFIG_YAML_FILE
echo "s3_bucket: $S3_BUCKET" >> $CONFIG_YAML_FILE
echo "s3_prefix: $S3_PREFIX" >> $CONFIG_YAML_FILE
elif [ "azure" == "$EXHIBITOR_STORAGE_BACKEND" ]; then
echo "exhibitor_storage_backend: azure" >> $CONFIG_YAML_FILE
echo "exhibitor_azure_account_name: $EXHIBITOR_AZURE_ACCOUNT_NAME" >> $CONFIG_YAML_FILE
echo "exhibitor_azure_account_key: $EXHIBITOR_AZURE_ACCOUNT_KEY" >> $CONFIG_YAML_FILE
echo "exhibitor_azure_prefix: $EXHIBITOR_AZURE_PREFIX" >> $CONFIG_YAML_FILE
fi
# master_discovery
if [ "static" == "$MASTER_DISCOVERY" ]; then
echo "master_discovery: static" >> $CONFIG_YAML_FILE
echo "master_list:" >> $CONFIG_YAML_FILE
for ip in $MASTER_LIST; do
echo "- ${ip}" >> $CONFIG_YAML_FILE
done
elif [ "master_http_loadbalancer" == "$MASTER_DISCOVERY" ]; then
echo "master_discovery: master_http_loadbalancer" >> $CONFIG_YAML_FILE
echo "exhibitor_address: $EXHIBITOR_ADDRESS" >> $CONFIG_YAML_FILE
echo "num_masters: $NUM_MASTERS" >> $CONFIG_YAML_FILE
echo "master_list:" >> $CONFIG_YAML_FILE
for ip in $MASTER_LIST; do
echo "- ${ip}" >> $CONFIG_YAML_FILE
done
fi
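For illustration, with `MASTER_DISCOVERY=static` and `MASTER_LIST="10.0.0.10 10.0.0.11"` (hypothetical addresses), the branch above appends the following to config.yaml:

```yaml
master_discovery: static
master_list:
- 10.0.0.10
- 10.0.0.11
```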
####################################################
# Networking
# dcos_overlay_enable
if [ "false" == "$DCOS_OVERLAY_ENABLE" ]; then
echo "dcos_overlay_enable: false" >> $CONFIG_YAML_FILE
elif [ "true" == "$DCOS_OVERLAY_ENABLE" ]; then
echo "dcos_overlay_enable: true" >> $CONFIG_YAML_FILE
echo "dcos_overlay_config_attempts: $DCOS_OVERLAY_CONFIG_ATTEMPTS" >> $CONFIG_YAML_FILE
echo "dcos_overlay_mtu: $DCOS_OVERLAY_MTU" >> $CONFIG_YAML_FILE
echo "dcos_overlay_network:" >> $CONFIG_YAML_FILE
echo "$DCOS_OVERLAY_NETWORK" >> $CONFIG_YAML_FILE
fi
# dns_search
if [ -n "$DNS_SEARCH" ]; then
echo "dns_search: $DNS_SEARCH" >> $CONFIG_YAML_FILE
fi
# resolvers
echo "resolvers:" >> $CONFIG_YAML_FILE
for ip in $RESOLVERS; do
echo "- ${ip}" >> $CONFIG_YAML_FILE
done
# use_proxy
if [ -n "$HTTP_PROXY" ] && [ -n "$HTTPS_PROXY" ]; then
echo "use_proxy: true" >> $CONFIG_YAML_FILE
echo "http_proxy: $HTTP_PROXY" >> $CONFIG_YAML_FILE
echo "https_proxy: $HTTPS_PROXY" >> $CONFIG_YAML_FILE
if [ -n "$NO_PROXY" ]; then
echo "no_proxy:" >> $CONFIG_YAML_FILE
for ip in $NO_PROXY; do
echo "- ${ip}" >> $CONFIG_YAML_FILE
done
fi
fi
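When both proxy variables are set, the block above emits a fragment like the following (proxy endpoints and no_proxy entries are illustrative):

```yaml
use_proxy: true
http_proxy: http://proxy.example.com:3128
https_proxy: http://proxy.example.com:3129
no_proxy:
- 192.168.0.1
- 192.168.0.2
```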
####################################################
# Performance and Tuning
# check_time
if [ "false" == "$CHECK_TIME" ]; then
echo "check_time: false" >> $CONFIG_YAML_FILE
fi
# docker_remove_delay
if [ "1" != "$DOCKER_REMOVE_DELAY" ]; then
echo "docker_remove_delay: $DOCKER_REMOVE_DELAY" >> $CONFIG_YAML_FILE
fi
# gc_delay
if [ "2" != "$GC_DELAY" ]; then
echo "gc_delay: $GC_DELAY" >> $CONFIG_YAML_FILE
fi
# log_directory
if [ "/genconf/logs" != "$LOG_DIRECTORY" ]; then
echo "log_directory: $LOG_DIRECTORY" >> $CONFIG_YAML_FILE
fi
# process_timeout
if [ "120" != "$PROCESS_TIMEOUT" ]; then
echo "process_timeout: $PROCESS_TIMEOUT" >> $CONFIG_YAML_FILE
fi
####################################################
# Security And Authentication
# oauth_enabled
if [ "false" == "$OAUTH_ENABLED" ]; then
echo "oauth_enabled: false" >> $CONFIG_YAML_FILE
fi