
POC of ML2/OVS to OVN migration using ansible.

Closes-bug: 1528674
Co-authored-by: Numan Siddique <nusiddiq@redhat.com>
Co-authored-by: Chris Cuzner <cuzner@gmail.com>
Change-Id: I0ea50031d75052bf0c56ecfd9ab9267a1c7fc25e
Commit e5fb48caf5 by Russell Bryant
 migration/README.rst         |  40 +
 migration/hosts.sample       |  37 +
 migration/migrate-to-ovn.yml | 187 +
 3 files changed, 264 insertions(+)

migration/README.rst

@@ -0,0 +1,40 @@
Migration from ML2/OVS to ML2/OVN
=================================

Proof-of-concept ansible script for migrating an OpenStack deployment
that uses ML2/OVS to OVN.

Prerequisites:

1. Ansible 2.2 or greater.
2. ML2/OVS must be using the OVS firewall driver.

To use:

1. Create an ansible inventory with the expected set of groups and variables
   as indicated by the hosts.sample file.
2. Run the playbook::

    $ ansible-playbook migrate-to-ovn.yml -i hosts

Testing Status:

- Tested on an RDO cloud on CentOS 7.3 based on Ocata.
- The cloud had 3 controller nodes and 6 compute nodes.
- Observed network downtime was 10 seconds.
- The "--forks 10" option was used with ansible-playbook to ensure
  that commands could be run across the entire environment in parallel.
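
  For example, the tested invocation was::

      $ ansible-playbook migrate-to-ovn.yml -i hosts --forks 10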

MTU:

- If migrating an ML2/OVS deployment using VXLAN tenant networks
  to an OVN deployment using Geneve for tenant networks, there is
  an unresolved issue around MTU. The VXLAN overhead is 30 bytes,
  while OVN with Geneve has an overhead of 38 bytes. The tenant
  networks' MTU must be adjusted for OVN, and all VMs must receive
  the updated MTU value through DHCP before the migration can take
  place. For testing purposes, we've simply hacked the Neutron code
  to report a VXLAN overhead of 38 bytes instead of 30, bypassing
  the issue at migration time.
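
  As an illustrative back-of-the-envelope check (based only on the overhead
  numbers above), with a 1500-byte underlying network the tenant MTU works
  out as::

      VXLAN tenant MTU:  1500 - 30 = 1470
      Geneve tenant MTU: 1500 - 38 = 1462

  so each tenant network's MTU must drop by 8 bytes before the migration.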

migration/hosts.sample

@@ -0,0 +1,37 @@
# All controller nodes running OpenStack control services, particularly
# neutron-api. Also indicate which controller you'd like to have run
# the OVN central control services.
[controller]
overcloud-controller-0 ovn_central=true
overcloud-controller-1
overcloud-controller-2

# All compute nodes. We will replace the openvswitch agent
# with ovn-controller on these nodes.
#
# The ovn_encap_ip variable should be filled in with the IP
# address that other compute hosts should use as the tunnel
# endpoint for tunnels to that host.
[compute]
overcloud-novacompute-0 ovn_encap_ip=192.0.2.10
overcloud-novacompute-1 ovn_encap_ip=192.0.2.11
overcloud-novacompute-2 ovn_encap_ip=192.0.2.12
overcloud-novacompute-3 ovn_encap_ip=192.0.2.13
overcloud-novacompute-4 ovn_encap_ip=192.0.2.14
overcloud-novacompute-5 ovn_encap_ip=192.0.2.15

# Configure bridge mappings to be used on compute hosts.
[compute:vars]
ovn_bridge_mappings=net1:br-em1
is_compute_node=true

[overcloud:children]
controller
compute

# Fill in "ovn_db_ip" with an IP address on a management network
# that the controller and compute nodes should reach. This address
# should not be reachable otherwise.
[overcloud:vars]
ovn_db_ip=192.0.2.50
remote_user=heat-admin

migration/migrate-to-ovn.yml

@@ -0,0 +1,187 @@
# Migrate a Neutron deployment using ML2/OVS to OVN.
#
# See hosts.sample for the expected contents of the ansible inventory.
---
- hosts: compute
  remote_user: "{{ remote_user }}"
  become: true
  tasks:
    - name: Ensure OVN packages are installed on compute nodes.
      yum:
        name: openvswitch-ovn-host
        state: present
    # TODO To make ansible-lint happy, all of these commands should be run
    # conditionally, only if the config value needs to be changed (see the
    # commented sketch after the next task).
    - name: Configure ovn-encap-type.
      command: "ovs-vsctl set open . external_ids:ovn-encap-type=geneve"
    - name: Configure ovn-encap-ip.
      command: "ovs-vsctl set open . external_ids:ovn-encap-ip={{ ovn_encap_ip }}"
    - name: Configure ovn-remote.
      command: "ovs-vsctl set open . external_ids:ovn-remote=tcp:{{ ovn_db_ip }}:6642"
    # TODO We could discover the appropriate value for ovn-bridge-mappings
    # based on the openvswitch agent configuration instead of requiring it
    # to be configured in the inventory (a commented sketch follows the
    # task below).
    - name: Configure ovn-bridge-mappings.
      command: "ovs-vsctl set open . external_ids:ovn-bridge-mappings={{ ovn_bridge_mappings }}"
    - name: Get hostname.
      shell: hostname -f
      register: hostname
      check_mode: no
    - name: Set hostname.
      command: "ovs-vsctl set Open_vSwitch . external-ids:hostname={{ hostname.stdout }}"
    # TODO Ansible has an "iptables" module, but it does not allow you to
    # specify a "rule number", which we require here.
    - name: Open the Geneve UDP port for tunneling.
      command: iptables -I INPUT 10 -m state --state NEW -p udp --dport 6081 -j ACCEPT
    - name: Persist our iptables changes after a reboot.
      shell: iptables-save > /etc/sysconfig/iptables.save
    # TODO Remove this once the metadata API is supported.
    # https://bugs.launchpad.net/networking-ovn/+bug/1562132
    - name: Force config drive until the metadata API is supported.
      ini_file:
        dest: /etc/nova/nova.conf
        section: DEFAULT
        option: force_config_drive
        value: true
    - name: Restart nova-compute service to reflect force_config_drive value.
      systemd:
        name: openstack-nova-compute
        state: restarted
        enabled: yes
- hosts: controller
  remote_user: "{{ remote_user }}"
  become: true
  tasks:
    - name: Ensure OVN packages are installed on the central OVN host.
      when: ovn_central is defined
      yum:
        name: openvswitch-ovn-central
        state: present
    # TODO Set up SSL for the OVN databases.
    # TODO Ansible has an "iptables" module, but it does not allow you to
    # specify a "rule number", which we require here.
    - name: Open the OVN database ports.
      command: "iptables -I INPUT 10 -m state --state NEW -p tcp --dport {{ item }} -j ACCEPT"
      with_items: [ 6641, 6642 ]
    - name: Persist our iptables changes after a reboot.
      shell: iptables-save > /etc/sysconfig/iptables.save
    # TODO Integrate HA support for the OVN control services.
    - name: Start ovn-northd and the OVN databases.
      when: ovn_central is defined
      systemd:
        name: ovn-northd
        state: started
        enabled: yes
    - name: Enable remote access to the northbound database.
      command: "ovn-nbctl set-connection ptcp:6641:{{ ovn_db_ip }}"
      when: ovn_central is defined
    - name: Enable remote access to the southbound database.
      command: "ovn-sbctl set-connection ptcp:6642:{{ ovn_db_ip }}"
      when: ovn_central is defined
    - name: Ensure the Neutron ML2 plugin is installed on neutron-api hosts.
      yum:
        name: python-networking-ovn
        state: present
    - name: Update the Neutron configuration files.
      ini_file: dest={{ item.dest }} section={{ item.section }} option={{ item.option }} value={{ item.value }}
      with_items:
        - { dest: '/etc/neutron/neutron.conf', section: 'DEFAULT', option: 'service_plugins', value: 'qos,networking_ovn.l3.l3_ovn.OVNL3RouterPlugin' }
        - { dest: '/etc/neutron/neutron.conf', section: 'DEFAULT', option: 'notification_drivers', value: 'ovn-qos' }
        - { dest: '/etc/neutron/plugins/ml2/ml2_conf.ini', section: 'ml2', option: 'mechanism_drivers', value: 'ovn' }
        - { dest: '/etc/neutron/plugins/ml2/ml2_conf.ini', section: 'ml2', option: 'type_drivers', value: 'geneve,vxlan,vlan,flat' }
        - { dest: '/etc/neutron/plugins/ml2/ml2_conf.ini', section: 'ml2', option: 'tenant_network_types', value: 'geneve' }
        - { dest: '/etc/neutron/plugins/ml2/ml2_conf.ini', section: 'ml2_type_geneve', option: 'vni_ranges', value: '1:65536' }
        - { dest: '/etc/neutron/plugins/ml2/ml2_conf.ini', section: 'ml2_type_geneve', option: 'max_header_size', value: '38' }
        - { dest: '/etc/neutron/plugins/ml2/ml2_conf.ini', section: 'ovn', option: 'ovn_nb_connection', value: '"tcp:{{ ovn_db_ip }}:6641"' }
        - { dest: '/etc/neutron/plugins/ml2/ml2_conf.ini', section: 'ovn', option: 'ovn_sb_connection', value: '"tcp:{{ ovn_db_ip }}:6642"' }
        - { dest: '/etc/neutron/plugins/ml2/ml2_conf.ini', section: 'ovn', option: 'ovsdb_connection_timeout', value: '180' }
        - { dest: '/etc/neutron/plugins/ml2/ml2_conf.ini', section: 'ovn', option: 'neutron_sync_mode', value: 'repair' }
        - { dest: '/etc/neutron/plugins/ml2/ml2_conf.ini', section: 'ovn', option: 'ovn_l3_mode', value: 'true' }
        - { dest: '/etc/neutron/plugins/ml2/ml2_conf.ini', section: 'ovn', option: 'vif_type', value: 'ovs' }
    - name: Note that API downtime begins now.
      debug:
        msg: NEUTRON API DOWNTIME STARTING NOW FOR THIS HOST
    - name: Shut down neutron-server so that we can begin data sync to OVN.
      systemd:
        name: neutron-server
        state: stopped
- hosts: controller
  remote_user: "{{ remote_user }}"
  become: true
  tasks:
    - name: Sync Neutron state to OVN.
      when: ovn_central is defined
      command: neutron-ovn-db-sync-util --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini
- hosts: overcloud
  remote_user: "{{ remote_user }}"
  become: true
  tasks:
    - name: Note that data plane impact starts now.
      debug:
        msg: DATA PLANE IMPACT BEGINS NOW.
    - name: Stop the metadata, DHCP, L3, and openvswitch agents if needed.
      systemd: name={{ item.name }} state={{ item.state }} enabled=no
      with_items:
        - { name: 'neutron-metadata-agent', state: 'stopped' }
        - { name: 'neutron-dhcp-agent', state: 'stopped' }
        - { name: 'neutron-l3-agent', state: 'stopped' }
        - { name: 'neutron-openvswitch-agent', state: 'stopped' }
- hosts: compute
  remote_user: "{{ remote_user }}"
  become: true
  tasks:
    - name: Note that the data plane is being restored.
      debug:
        msg: DATA PLANE IS NOW BEING RESTORED.
    - name: Delete br-tun, as it is no longer used.
      command: "ovs-vsctl del-br br-tun"
    - name: Reset the OpenFlow protocol version before ovn-controller takes over.
      with_items: [ br-int, br-ex ]
      command: "ovs-vsctl set Bridge {{ item }} protocols=[]"
      ignore_errors: True
    - name: Start ovn-controller.
      systemd:
        name: ovn-controller
        state: started
        enabled: yes
- hosts: controller
  remote_user: "{{ remote_user }}"
  become: true
  tasks:
    # TODO The sync util scheduling gateway routers depends on this patch:
    # https://review.openstack.org/#/c/427020/
    # If the patch is not merged, this command is harmless, but the gateway
    # routers won't get scheduled until later, when neutron-server starts.
    - name: Schedule gateway routers by running the sync util.
      when: ovn_central is defined
      command: neutron-ovn-db-sync-util --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini
- hosts: overcloud
  remote_user: "{{ remote_user }}"
  become: true
  tasks:
    # TODO Make this smarter so that it only deletes net namespaces that were
    # created by neutron. In the simple case this is fine, but it will break
    # once containers are in use on the overcloud (a commented sketch follows).
    - name: Delete network namespaces.
      command: ip -all netns delete
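    # A smarter variant might look like the following (a sketch only; it
    # assumes the standard qrouter-/qdhcp-/snat-/fip- prefixes that neutron
    # gives its namespaces):
    #
    #   - name: Delete only neutron-created network namespaces.
    #     shell: |
    #       for ns in $(ip netns list | awk '{print $1}' | grep -E '^(qrouter|qdhcp|snat|fip)-'); do
    #         ip netns delete "$ns"
    #       done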
- hosts: controller
  remote_user: "{{ remote_user }}"
  become: true
  tasks:
    - name: Note that the Neutron API is coming back online.
      debug:
        msg: THE NEUTRON API IS NOW BEING RESTORED.
    - name: Start neutron-server.
      systemd:
        name: neutron-server
        state: started
# TODO In our grenade script we had to restart rabbitmq. Is that needed?