diff --git a/doc/source/index.rst b/doc/source/index.rst
index 69cb8e8d83..7ac29cbb64 100644
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -29,7 +29,6 @@ Contents:
    test-infra-requirements
    sysadmin
    systems
-   infra-cloud
 
 .. toctree::
    :hidden:
diff --git a/doc/source/infra-cloud.rst b/doc/source/infra-cloud.rst
deleted file mode 100644
index dd0bf7c91a..0000000000
--- a/doc/source/infra-cloud.rst
+++ /dev/null
@@ -1,249 +0,0 @@
-:title: Infra Cloud
-
-.. _infra_cloud:
-
-Infra Cloud
-###########
-
-Introduction
-============
-
-With donated hardware and datacenter space, we can run an optimized
-semi-private cloud for the purpose of adding testing capacity, and also
-with an eye toward "dog fooding" OpenStack itself.
-
-Current Status
-==============
-
-Currently this cloud is in the planning and design phases. This section
-will be updated or removed as that changes.
-
-Mission
-=======
-
-The infra-cloud's mission is to turn donated raw hardware resources into
-expanded capacity for the OpenStack infrastructure nodepool.
-
-Methodology
-===========
-
-Infra-cloud is run like any other infra-managed service. Puppet modules
-and Ansible do the bulk of configuring hosts, and Gerrit code review
-drives 99% of activities, with logins used only for debugging and
-repairing the service.
-
-Requirements
-============
-
- * Compute - The intended workload is mostly nodepool-launched Jenkins
-   slaves. Thus flavors that are capable of running these tests in a
-   reasonable amount of time must be available. The flavor(s) must provide:
-
-   * 8GB RAM
-
-   * 8 * `vcpu`
-
-   * 30GB root disk
-
- * Images - Image upload must be allowed for nodepool.
-
- * Uptime - Because there are other clouds that can keep some capacity
-   running, 99.9% uptime should be acceptable.
-
- * Performance - The performance of compute and networking in infra-cloud
-   should be at least as good as, if not better than, the other nodepool
-   clouds that infra uses today.
-
- * Infra-core - Infra-core is in charge of running the service.
-
-Implementation
-==============
-
-Multi-Site
-----------
-
-Despite the servers being in the same physical location and network,
-they are divided into at least two logical "sites", vanilla and chocolate.
-Each site will have its own cloud, and these clouds will share no data.
-
-Vanilla
-~~~~~~~
-
-The vanilla cloud has 48 machines. Each machine has 96G of RAM, 1.8TiB
-of disk, and 24 cores of Intel Xeon X5650 @ 2.67GHz processors.
-
-Chocolate
-~~~~~~~~~
-
-The chocolate cloud has 100 machines. Each machine has 96G of RAM, 1.8TiB
-of disk, and 32 cores of Intel Xeon E5-2670 0 @ 2.60GHz processors.
-
-Software
---------
-
-Infra-cloud runs the most recent OpenStack stable release. During the
-period following a release, plans must be made to upgrade as soon as
-possible. In the future the cloud may be continuously deployed.
-
-Management
-----------
-
- * Currently a single "Ironic Controller" is installed by hand and used by
-   both sites. That machine is enrolled into the puppet/ansible
-   infrastructure and can be reached at baremetal00.vanilla.ic.openstack.org.
-
- * The "Ironic Controller" will have bifrost installed on it. All of the
-   other machines in that site will be enrolled in the Ironic that bifrost
-   manages. bifrost will be responsible for booting a base OS with an IP
-   address and ssh key for each machine.
-
- * You can interact with the Bifrost Ironic installation by sourcing
-   ``/opt/stack/bifrost/env-vars`` and then running the ironic CLI client
-   (for example: ``ironic node-list``).
-
- * The machines will all be added to a manual ansible inventory file adjacent
-   to the dynamic inventory that ansible currently uses to run puppet. Any
-   metadata that the ansible infrastructure for running puppet needs that
-   would have come from OpenStack infrastructure will simply be put into
-   static ansible group_vars.
-
- * The static inventory should be put into puppet so that it is public, with
-   the IPMI passwords in hiera.
-
- * An OpenStack Cloud with KVM as the hypervisor will be installed using
-   OpenStack puppet modules, as per a normal infra installation of services.
-
- * As with all OpenStack services, metrics will be collected in public
-   cacti and graphite services. The particular metrics are TBD.
-
- * As a cloud has a large amount of pertinent log data, a public ELK cluster
-   will be needed to capture and expose it.
-
- * All Infra services run on the public internet, and the same will be true
-   for the Infra Clouds and the Ironic Clouds. Insecure services that need
-   to be accessible across machine boundaries will employ per-IP iptables
-   rules rather than relying on a squishy middle.
-
-Architecture
-------------
-
-The generally accepted "Controller" and "Compute" layout is used,
-with controllers running all non-compute services and compute nodes
-running only nova-compute and supporting services.
-
- * The cloud is deployed with two controllers in a DRBD storage pair
-   with ACTIVE/PASSIVE configured and a VIP shared between the two.
-   This is done to avoid complications with Galera and RabbitMQ at
-   the cost of making failovers more painful and under-utilizing the
-   passive stand-by controller.
-
- * The cloud will use KVM because it is the default free hypervisor and
-   has the widest user base in OpenStack.
-
- * The cloud will use Neutron configured for Provider VLAN because we
-   do not require tenant isolation and this simplifies our networking on
-   compute nodes.
-
- * The cloud will not use floating IPs because every node will need to be
-   reachable via routable IPs and thus there is no need for separation.
-   Also, Nodepool is under our control, so we don't have to worry about DNS
-   TTLs or anything else causing a need for a particular endpoint to remain
-   at a stable IP.
-
- * The cloud will not use security groups because these are single-use VMs
-   and they will configure any firewall inside the VM.
-
- * The cloud will use MySQL because it is the default in OpenStack and has
-   the widest user base.
-
- * The cloud will use RabbitMQ because it is the default in OpenStack and
-   has the widest user base. We don't have scaling demands that come close
-   to pushing the limits of RabbitMQ.
-
- * The cloud will run swift as a backend for glance so that we can scale
-   image storage out as the need arises.
-
- * The cloud will run the keystone v3 and glance v2 APIs because these are
-   the versions upstream recommends using.
-
- * The cloud will run keystone on port 443.
-
- * The cloud will not use the glance task API for image uploads; it will use
-   the PUT interface instead, because the task API does not function and we
-   are not expecting a wide user base to be uploading many images
-   simultaneously.
-
- * The cloud will provide DHCP directly to its nodes because we trust DHCP.
-
- * The cloud will have config drive enabled because we believe it to be more
-   robust than the EC2-style metadata service.
-
- * The cloud will not have the metadata service enabled because we do not
-   believe it to be robust.
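[Editor's note: the Management list above describes sourcing ``/opt/stack/bifrost/env-vars`` before using the ironic CLI on the "Ironic Controller". A minimal sketch of such a session, using only the paths and commands named there (the node name in the last command is hypothetical)::

    # On baremetal00.vanilla.ic.openstack.org, load the Bifrost environment
    source /opt/stack/bifrost/env-vars
    # List every node enrolled in the Bifrost-managed Ironic
    ironic node-list
    # Inspect a single node in detail (hypothetical node name)
    ironic node-show compute000.vanilla.ic.openstack.org
]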
-
-Networking
-----------
-
-Neutron is used, with a single `provider VLAN`_ attached to VMs for the
-simplest possible networking. DHCP is configured to hand the machine a
-routable IP which can be reached directly from the internet to facilitate
-nodepool/zuul communications.
-
-.. _provider VLAN: http://docs.openstack.org/networking-guide/scenario-provider-lb.html
-
-Each site will need two VLANs: one for the public IPs, to which every NIC
-of every host will be attached. That VLAN will get a publicly routable /23.
-There should also be a second VLAN that is connected only to the NIC of the
-Ironic Cloud and is routed to the IPMI management network of all of the
-other nodes. Whether we use LinuxBridge or Open vSwitch is still TBD.
-
-SSL
----
-
-Since we are the single user of Infracloud, we have configured the Vanilla
-and Chocolate controllers to use the snakeoil SSL certs for each controller.
-This gives us simple-to-generate certs with long lifetimes, which we can
-trust directly by asserting trust against the public cert.
-
-If you need to update certs in one of the clouds, simply run::
-
-  /usr/sbin/make-ssl-cert generate-default-snakeoil --force-overwrite
-
-on the controller in question. Then copy the contents of
-``/etc/ssl/certs/ssl-cert-snakeoil.pem`` to public system-config hiera and
-``/etc/ssl/private/ssl-cert-snakeoil.key`` to private hiera on the
-puppetmaster.
-
-Puppet will then ensure we trust the public key everywhere that talks to the
-controller (puppetmaster, nodepool, the controller itself, compute nodes,
-etc.) and deploy the private key so that it is used by services.
-
-Troubleshooting
-===============
-
-Regenerating images
--------------------
-
-When redeploying servers with bifrost, we may need to refresh the image that
-is deployed to them: for example, to add some packages, update the elements
-that we use, or consume the latest versions of projects.
-
-To generate an image, follow these steps:
-
-1. On the baremetal server, remove everything under the /httpboot directory.
-   This will clean out the generated qcow2 image that is consumed by servers.
-
-2. If there is a need to also update the CoreOS image, remove everything
-   under the /tftpboot directory. This will clean out the ramdisk image that
-   is used when PXE booting.
-
-3. Run the install playbook again so that it generates the image. Be sure
-   to pass the ``skip_install`` flag, to avoid updating all of the
-   bifrost-related projects (ironic, dib, etc.)::
-
-     ansible-playbook -vvv -e @/etc/bifrost/bifrost_global_vars \
-       -e skip_install=true \
-       -i /opt/stack/bifrost/playbooks/inventory/bifrost_inventory.py \
-       /opt/stack/bifrost/playbooks/install.yaml
-
-4. After the install finishes, you can redeploy the servers again
-   using the ``run_bifrost.sh`` script.
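[Editor's note: the SSL section above says to copy the regenerated snakeoil cert and key into hiera. Before doing so, they can be sanity-checked with stock openssl commands — a sketch; these are standard openssl invocations, not infra tooling::

    # Confirm the new cert's subject and expiry date on the controller
    openssl x509 -in /etc/ssl/certs/ssl-cert-snakeoil.pem -noout -subject -enddate
    # Verify the private key matches the public cert: both digests must agree
    openssl x509 -in /etc/ssl/certs/ssl-cert-snakeoil.pem -noout -modulus | sha1sum
    openssl rsa -in /etc/ssl/private/ssl-cert-snakeoil.key -noout -modulus | sha1sum
]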
diff --git a/modules/openstack_project/manifests/infracloud/baremetal.pp b/modules/openstack_project/manifests/infracloud/baremetal.pp
deleted file mode 100644
index 7df7b5e7ba..0000000000
--- a/modules/openstack_project/manifests/infracloud/baremetal.pp
+++ /dev/null
@@ -1,38 +0,0 @@
-# == Class: openstack_project::infracloud::baremetal
-#
-class openstack_project::infracloud::baremetal (
-  $ironic_inventory,
-  $ironic_db_password,
-  $ipmi_passwords,
-  $mysql_password,
-  $ssh_private_key,
-  $ssh_public_key,
-  $bridge_name,
-  $vlan,
-  $gateway_ip,
-  $default_network_interface,
-  $dhcp_pool_start,
-  $dhcp_pool_end,
-  $network_interface,
-  $ipv4_nameserver,
-  $ipv4_subnet_mask,
-) {
-  class { '::infracloud::bifrost':
-    bridge_name               => $bridge_name,
-    ironic_inventory          => $ironic_inventory,
-    ironic_db_password        => $ironic_db_password,
-    mysql_password            => $mysql_password,
-    ipmi_passwords            => $ipmi_passwords,
-    ssh_private_key           => $ssh_private_key,
-    ssh_public_key            => $ssh_public_key,
-    vlan                      => $vlan,
-    gateway_ip                => $gateway_ip,
-    default_network_interface => $default_network_interface,
-    dhcp_pool_start           => $dhcp_pool_start,
-    dhcp_pool_end             => $dhcp_pool_end,
-    network_interface         => $network_interface,
-    ipv4_nameserver           => $ipv4_nameserver,
-    ipv4_subnet_mask          => $ipv4_subnet_mask,
-  }
-
-}
diff --git a/modules/openstack_project/manifests/infracloud/base.pp b/modules/openstack_project/manifests/infracloud/base.pp
deleted file mode 100644
index 7755171536..0000000000
--- a/modules/openstack_project/manifests/infracloud/base.pp
+++ /dev/null
@@ -1,10 +0,0 @@
-# == Class: openstack_project::infracloud::base
-#
-# A template host with no running services
-#
-class openstack_project::infracloud::base (
-) {
-  class { '::unbound':
-    install_resolv_conf => true,
-  }
-}
diff --git a/modules/openstack_project/manifests/infracloud/compute.pp b/modules/openstack_project/manifests/infracloud/compute.pp
deleted file mode 100644
index fc5ab3ab4f..0000000000
--- a/modules/openstack_project/manifests/infracloud/compute.pp
+++ /dev/null
@@ -1,21 +0,0 @@
-class openstack_project::infracloud::compute (
-  $nova_rabbit_password,
-  $neutron_rabbit_password,
-  $neutron_admin_password,
-  $ssl_key_file_contents,
-  $ssl_cert_file_contents,
-  $br_name,
-  $controller_public_address,
-) {
-  include ::openstack_project::infracloud::base
-
-  class { '::infracloud::compute':
-    nova_rabbit_password      => $nova_rabbit_password,
-    neutron_rabbit_password   => $neutron_rabbit_password,
-    neutron_admin_password    => $neutron_admin_password,
-    ssl_key_file_contents     => $ssl_key_file_contents,
-    ssl_cert_file_contents    => $ssl_cert_file_contents,
-    br_name                   => $br_name,
-    controller_public_address => $controller_public_address,
-  }
-}
diff --git a/modules/openstack_project/manifests/infracloud/controller.pp b/modules/openstack_project/manifests/infracloud/controller.pp
deleted file mode 100644
index ce0c6e301d..0000000000
--- a/modules/openstack_project/manifests/infracloud/controller.pp
+++ /dev/null
@@ -1,53 +0,0 @@
-class openstack_project::infracloud::controller (
-  $keystone_rabbit_password,
-  $neutron_rabbit_password,
-  $nova_rabbit_password,
-  $root_mysql_password,
-  $keystone_mysql_password,
-  $glance_mysql_password,
-  $neutron_mysql_password,
-  $nova_mysql_password,
-  $glance_admin_password,
-  $keystone_admin_password,
-  $neutron_admin_password,
-  $nova_admin_password,
-  $keystone_admin_token,
-  $ssl_key_file_contents,
-  $ssl_cert_file_contents,
-  $br_name,
-  $controller_public_address = $::fqdn,
-  $openstackci_password = 'tmpvalue',
-  $openstackci_email = 'infra-root@openstack.org',
-  $openstackjenkins_password = 'tmpvalue',
-  $openstackjenkins_email = 'infra-root@openstack.org',
-  $neutron_subnet_cidr,
-  $neutron_subnet_gateway,
-  $neutron_subnet_allocation_pools,
-  $mysql_max_connections = 1024,
-) {
-  include ::openstack_project::infracloud::base
-
-  class { '::infracloud::controller':
-    keystone_rabbit_password        => $keystone_rabbit_password,
-    neutron_rabbit_password         => $neutron_rabbit_password,
-    nova_rabbit_password            => $nova_rabbit_password,
-    root_mysql_password             => $root_mysql_password,
-    keystone_mysql_password         => $keystone_mysql_password,
-    glance_mysql_password           => $glance_mysql_password,
-    neutron_mysql_password          => $neutron_mysql_password,
-    nova_mysql_password             => $nova_mysql_password,
-    keystone_admin_password         => $keystone_admin_password,
-    glance_admin_password           => $glance_admin_password,
-    neutron_admin_password          => $neutron_admin_password,
-    nova_admin_password             => $nova_admin_password,
-    keystone_admin_token            => $keystone_admin_token,
-    ssl_key_file_contents           => $ssl_key_file_contents,
-    ssl_cert_file_contents          => $ssl_cert_file_contents,
-    br_name                         => $br_name,
-    controller_public_address       => $controller_public_address,
-    neutron_subnet_cidr             => $neutron_subnet_cidr,
-    neutron_subnet_gateway          => $neutron_subnet_gateway,
-    neutron_subnet_allocation_pools => $neutron_subnet_allocation_pools,
-    mysql_max_connections           => $mysql_max_connections,
-  }
-}
diff --git a/playbooks/allow_all_traffic_default_secgroup.yml b/playbooks/allow_all_traffic_default_secgroup.yml
deleted file mode 100644
index 67cd66ca49..0000000000
--- a/playbooks/allow_all_traffic_default_secgroup.yml
+++ /dev/null
@@ -1,28 +0,0 @@
----
-- hosts: localhost
-  connection: local
-  gather_facts: false
-  user: root
-  roles:
-    - { role: allow_all_traffic_default_secgroup, os_client_config_cloud: 'openstackci-infracloud-west' }
-
-- hosts: localhost
-  connection: local
-  gather_facts: false
-  user: root
-  roles:
-    - { role: allow_all_traffic_default_secgroup, os_client_config_cloud: 'openstackjenkins-infracloud-west' }
-
-- hosts: localhost
-  connection: local
-  gather_facts: false
-  user: root
-  roles:
-    - { role: allow_all_traffic_default_secgroup, os_client_config_cloud: 'openstackci-infracloud-east' }
-
-- hosts: localhost
-  connection: local
-  gather_facts: false
-  user: root
-  roles:
-    - { role: allow_all_traffic_default_secgroup, os_client_config_cloud: 'openstackjenkins-infracloud-east' }
diff --git a/playbooks/infracloud/manage_power.yml b/playbooks/infracloud/manage_power.yml
deleted file mode 100755
index 046b6656e8..0000000000
--- a/playbooks/infracloud/manage_power.yml
+++ /dev/null
@@ -1,16 +0,0 @@
-# This playbook allows managing the power state of the baremetal group.
-# The power_state setting must be passed, with on/off values.
-# The following settings are available (passed with the -e flag):
-# - target: group or host on which to run the play. -e target=baremetal will
-#   run this play on all servers managed by bifrost.
-# - power_state: takes on/off values, and sets the servers to this power
-#   state.
---- -- hosts: "{{ target }}" - connection: local - gather_facts: true - tasks: - - name: 'Manage power state of the given host' - shell: "ironic node-set-power-state '{{ inventory_hostname }}' '{{ power_state }}'" - failed_when: ( power_state != 'on' ) and ( power_state != 'off' ) - diff --git a/playbooks/remote_puppet_afs.yaml b/playbooks/remote_puppet_afs.yaml index 2d5b0d875b..7359a6a445 100644 --- a/playbooks/remote_puppet_afs.yaml +++ b/playbooks/remote_puppet_afs.yaml @@ -1,4 +1,3 @@ ---- - hosts: "afs*:!disabled" strategy: free roles: diff --git a/playbooks/remote_puppet_else.yaml b/playbooks/remote_puppet_else.yaml index 6dc0346ff9..e4ef951e31 100644 --- a/playbooks/remote_puppet_else.yaml +++ b/playbooks/remote_puppet_else.yaml @@ -1,4 +1,4 @@ -- hosts: 'puppet:!review:!git0*:!zuul-scheduler:!afs*:!baremetal*:!controller*:!compute*:!puppetmaster*:!disabled' +- hosts: 'puppet:!review:!git0*:!zuul-scheduler:!afs*:!puppetmaster*:!disabled' strategy: free roles: - puppet diff --git a/playbooks/remote_puppet_infracloud.yaml b/playbooks/remote_puppet_infracloud.yaml deleted file mode 100644 index 26453f24df..0000000000 --- a/playbooks/remote_puppet_infracloud.yaml +++ /dev/null @@ -1,10 +0,0 @@ -- hosts: "controller*.ic.openstack.org:!disabled" - serial: 1 - roles: - - puppet - -- hosts: "compute*.ic.openstack.org:!disabled" - max_fail_percentage: 100 - serial: "10%" - roles: - - puppet diff --git a/playbooks/remote_puppet_infracloud_baremetal.yaml b/playbooks/remote_puppet_infracloud_baremetal.yaml deleted file mode 100644 index 3bb219197a..0000000000 --- a/playbooks/remote_puppet_infracloud_baremetal.yaml +++ /dev/null @@ -1,4 +0,0 @@ ---- -- hosts: "baremetal*.ic.openstack.org:!disabled" - roles: - - puppet diff --git a/playbooks/roles/allow_all_traffic_default_secgroup/tasks/main.yml b/playbooks/roles/allow_all_traffic_default_secgroup/tasks/main.yml deleted file mode 100644 index c5a0247aab..0000000000 --- a/playbooks/roles/allow_all_traffic_default_secgroup/tasks/main.yml +++ /dev/null @@ -1,21 +0,0 @@ -- name: Delete any previously default security group rules - shell: /usr/local/bin/openstack security group rule delete "{{ item }}" - environment: - OS_CLOUD: "{{ os_client_config_cloud }}" - with_lines: OS_CLOUD="{{ os_client_config_cloud }}" /usr/local/bin/openstack security group rule list -f value -c ID default - -- name: Allow all IPv4 traffic on default security group - os_security_group_rule: - cloud: "{{ os_client_config_cloud }}" - security_group: default - direction: ingress - ethertype: IPv4 - remote_ip_prefix: 0.0.0.0/0 - -- name: Allow all IPv6 traffic on default security group - os_security_group_rule: - cloud: "{{ os_client_config_cloud }}" - security_group: default - direction: ingress - ethertype: IPv6 - remote_ip_prefix: ::0/0 diff --git a/playbooks/set_infracloud_project_quotas.yml b/playbooks/set_infracloud_project_quotas.yml deleted file mode 100644 index 045e7b3251..0000000000 --- a/playbooks/set_infracloud_project_quotas.yml +++ /dev/null @@ -1,12 +0,0 @@ ---- -- hosts: localhost - connection: local - gather_facts: false - tasks: - - shell: 'openstack quota set openstackzuul --cores 800 --ram 800000 --instances 100' - environment: - OS_CLOUD: admin-infracloud-vanilla - - - shell: 'openstack quota set openstackzuul --cores 800 --ram 800000 --instances 100' - environment: - OS_CLOUD: admin-infracloud-chocolate diff --git a/run_bifrost.sh b/run_bifrost.sh deleted file mode 100755 index ef910f8000..0000000000 --- a/run_bifrost.sh +++ /dev/null @@ -1,31 +0,0 @@ 
-#!/bin/bash
-
-# Copyright 2014 Hewlett-Packard Development Company, L.P.
-#
-# Licensed under the Apache License, Version 2.0 (the "License"); you may
-# not use this file except in compliance with the License. You may obtain
-# a copy of the License at
-#
-#    http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
-# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
-# License for the specific language governing permissions and limitations
-# under the License.
-
-set -e
-
-export BIFROST_INVENTORY_SOURCE=/opt/stack/baremetal.json
-
-apt-get update
-
-# Enroll-dynamic
-ansible-playbook -e @/etc/bifrost/bifrost_global_vars -vvvv \
-    -i /opt/stack/bifrost/playbooks/inventory/bifrost_inventory.py \
-    /opt/stack/bifrost/playbooks/enroll-dynamic.yaml
-
-# Deploy-dynamic
-ansible-playbook -e @/etc/bifrost/bifrost_global_vars -vvvv \
-    -i /opt/stack/bifrost/playbooks/inventory/bifrost_inventory.py \
-    /opt/stack/bifrost/playbooks/deploy-dynamic.yaml
diff --git a/run_infracloud.sh b/run_infracloud.sh
deleted file mode 100755
index 88ec2ca789..0000000000
--- a/run_infracloud.sh
+++ /dev/null
@@ -1,34 +0,0 @@
-#!/bin/bash
-
-# Copyright 2014 Hewlett-Packard Development Company, L.P.
-#
-# Licensed under the Apache License, Version 2.0 (the "License"); you may
-# not use this file except in compliance with the License. You may obtain
-# a copy of the License at
-#
-#    http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
-# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
-# License for the specific language governing permissions and limitations
-# under the License.
-
-# If updating the puppet system-config repo or installing puppet modules
-# fails then abort the puppet run as we will not get the results we
-# expect.
-set -e
-export ANSIBLE_LOG_PATH=/var/log/puppet_run_all_infracloud.log
-SYSTEM_CONFIG=/opt/system-config
-ANSIBLE_PLAYBOOKS=$SYSTEM_CONFIG/playbooks
-
-# It's possible for connectivity to a server or manifest application to break
-# for indeterminate periods of time, so the playbooks should be run without
-# errexit
-set +e
-
-# Run all the ansible playbooks under timeout to prevent them from getting
-# stuck if they are oomkilled
-
-timeout -k 2m 120m ansible-playbook -f 10 ${ANSIBLE_PLAYBOOKS}/remote_puppet_infracloud_baremetal.yaml
-timeout -k 2m 120m ansible-playbook -f 10 ${ANSIBLE_PLAYBOOKS}/remote_puppet_infracloud.yaml
diff --git a/tools/infracloud_dns_from_bifrost.py b/tools/infracloud_dns_from_bifrost.py
deleted file mode 100755
index 989be4da7c..0000000000
--- a/tools/infracloud_dns_from_bifrost.py
+++ /dev/null
@@ -1,13 +0,0 @@
-#!/usr/bin/python
-
-import yaml
-
-f = open('hiera/group/infracloud.yaml')
-
-bf = yaml.load(f.read())
-
-for node in bf['ironic_inventory_hpuswest']:
-    name = node
-    ip = bf['ironic_inventory_hpuswest'][node]['ipv4_public_address']
-    print "rackdns record-create --name {0} --type A".format(name),
-    print "--data {0} --ttl 3600 openstack.org".format(ip)
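[Editor's note: the removed ``manage_power.yml`` playbook documents its ``target`` and ``power_state`` extra-vars in its header comments. A hypothetical invocation, assuming the Bifrost dynamic inventory and the ``BIFROST_INVENTORY_SOURCE`` export used by ``run_bifrost.sh``::

    # Power off every server managed by bifrost; the inventory path and
    # environment setup are assumptions based on run_bifrost.sh above
    export BIFROST_INVENTORY_SOURCE=/opt/stack/baremetal.json
    ansible-playbook -i /opt/stack/bifrost/playbooks/inventory/bifrost_inventory.py \
        -e target=baremetal -e power_state=off \
        playbooks/infracloud/manage_power.yml
]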