The correct path should be hashed-container-startup-config-step_4.json
instead of hashed-docker-container-startup-config-step_4.json in
cleanup-dataplane.yml.
When this file was originally committed the path was
hashed-docker-container-startup-config-step_4.json, but it was changed
a few releases ago.
Change-Id: I901eedf64f7bd4281f94c156c33489355ff97aee
Run an extra round of validation after the last call to tripleo's
overcloud deploy, which ensures that tripleo knows about "br-int" and
not "br-migration".
Update the migration guide with this detail and also expand the
description of what's happening under the hood, since
administrators need to know.
Change-Id: I6c9d2537c0d04b46fae3340f522ab660c56d3c1e
We remove a workaround which was applied because of [1], which
is now fixed in openvswitch. The workaround introduced
unnecessary downtime.
Change-Id: I20a17896bc911b59ad85eefa9c76c41c058a0c16
Closes-Bug: 1814831
We need to configure them in standalone mode (to keep traffic flowing
in the absence of a controller), and clean up the controller.
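A minimal sketch of the two operations as Ansible tasks, assuming "them"
refers to Open vSwitch bridges. The bridge name br-example is hypothetical.
set-fail-mode and del-controller are standard ovs-vsctl commands, but the
tasks in this change may look different.

```yaml
# Hypothetical sketch: br-example stands in for each affected bridge.
- name: Forward traffic as a normal switch when no controller is reachable
  command: ovs-vsctl set-fail-mode br-example standalone

- name: Remove the controller configured on the bridge
  command: ovs-vsctl del-controller br-example
```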
Change-Id: If9ecf21ec178a1e73eda5684a6b41ded8e3db8b3
Closes-Bug: 1814817
During the migration process we were cleaning up all ml2/ovs
neutron agents, but in some cases we still rely on the
neutron-dhcp-agents, so we need to avoid cleaning those up.
Change-Id: I6064bb207e27928f29d39773b30d98a2345efd43
Closes-bug: 1814812
Fixed Ansible lint errors ANSIBLE0012 and ANSIBLE0013
which address the usage of shell and command modules properly.
* When a shell/command task creates a file, it should explicitly
specify it with a 'creates' argument.
* When shell is not required, use the command module.
* If a task is not changing anything (e.g. cat <file>)
it should be marked as "changed_when: false".
Also, fixed lines with 'ignore_error' to be 'ignore_errors'.
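The rules above could look roughly like this in a task file. This is an
illustrative sketch only: the task names, paths and commands are
hypothetical and are not taken from the migration roles.

```yaml
# Illustrative tasks only; file names and commands are hypothetical.
- name: Generate a marker file once
  command: touch /tmp/example-marker
  args:
    creates: /tmp/example-marker   # task is skipped when the file exists

- name: Read a file without reporting a change
  command: cat /etc/hostname       # no shell features needed, so use command
  register: hostname_output
  changed_when: false              # read-only task never reports "changed"

- name: Tolerate a failure in a best-effort step
  command: /bin/false
  ignore_errors: true              # plural keyword, not 'ignore_error'
```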
Change-Id: I1ef35242e2d1427d96320c47aff22e86bc035ada
This patch avoids the duplication/conflict of resources
(pre-migration and post-migration) when a migration that was stopped
by an error is retried.
It also makes sure that a non-working validation will fail.
Change-Id: I5ce17e23ee52e936aeee78f64fb0df127816b1bd
Closes-Bug: #1804477
The migration script "delete-neutron-resources" was being
executed on controller-0 instead of the undercloud node regardless
of the defer_to:localhost ansible setting.
In that controller ~/overcloudrc does not exist, so the cleanup
was failing silently.
This commit moves the cleanup to a separate role that we tie
to localhost (undercloud) from ovn-migration.yml.
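A minimal sketch of a play that ties a role to localhost, assuming the
playbook is launched from the undercloud. The role name below is
hypothetical and only reuses the script name for illustration.

```yaml
# Hypothetical sketch: the role name is illustrative.
- name: Clean up migration resources from the undercloud
  hosts: localhost            # the node running ansible-playbook
  connection: local           # no SSH hop to an overcloud controller
  roles:
    - delete-neutron-resources
```

Because the whole play targets localhost, the cleanup no longer depends on
per-task delegation to reach the node where ~/overcloudrc exists.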
Closes-Bug: #1804194
Change-Id: I05f2411604ba01d170440ac655491a624f98aafc
Because of [1], when a controller is disconnected and you add a new
port to an Open vSwitch bridge, ovs-vswitchd crashes.
There's a patch upstream, but this will allow us to continue
without
Closes-Bug: #1798377
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1640045
Change-Id: Id47086d50167fd417d0be67b720f63be62e3ff35
This patch checks the availability of the external networks
and floating IPs before generating the inventory (so the admin
is alerted as soon as possible) and again before starting the migration
to make sure that enough floating IPs are still available.
This patch also removes the DVR flag, as the ansible scripts
are independent of the DVR configuration on the system.
Change-Id: Ifca00065ce2457437794e34fd6700394db2c3dc5
The ovn_migration.yaml playbook uses 'include_role' as a play-level
attribute, which is not allowed and produces the error below.
This patch fixes it.
ERROR! 'include_role' is not a valid attribute for a Play
The error appears to have been in '/home/stack/ovn_migration/
tripleo_environment/playbooks/ovn-migration.yml': line 76, column 3, but may
be elsewhere in the file depending on the exact syntax problem.
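For illustration, a sketch of the difference between the rejected and the
accepted form. The play name and role name are hypothetical; the actual
fix in ovn-migration.yml may be structured differently.

```yaml
# Rejected: include_role used as an attribute of the play itself.
# - name: Example play
#   hosts: localhost
#   include_role:
#     name: example-role

# Accepted: include_role used as a task inside the play.
- name: Example play
  hosts: localhost
  tasks:
    - name: Run the role as a task
      include_role:
        name: example-role
```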
This patch also fixes another error. When the pre-migration VM is created, we
test it by running the command '/sbin/ip'. In some of the cirros images
the right path is '/bin/ip'. So change this to the command 'date'.
Change-Id: Id8f8e2bbc71fee63f14339bc6041c04f3a177495
Closes-bug: #1788158
- We only have dhcp and migration top level ymls
- Role default variables like 'var: "{{ var }}"' have been removed
- Containerless variables removed, as we have not supported that
since Queens.
Change-Id: I02b0b2c644e931592c14ade5b3de66ffbd136b03
Something recently changed how the inventory json file is generated
and now the hosts are available straight away under the
Controller / Compute roles, but we still keep support for the old
form.
We add an extra lookup to get the ip address and include it as
ansible_host.
In this same commit we silence the greps which verify proper
configuration of the deployment scripts to avoid grep findings
being output to the console.
Change-Id: I676a8dfbbac350943d4586bb09fd1e97086091fc
This patch doesn't stop ovn-controller during migration, but
instead changes the strategy to replicating the provider
bridges into fake br-mig-{0..N} bridges, and also copies
existing ports (non-router, etc) from br-int to br-migration,
so ovn-controller is able to do work in advance while the
neutron database is synchronized to the NBDB.
Then, when we're ready to do the dataplane switch we simply
point ovn-controller back to br-int and we delete all the
dummy br-migration + br-mig-{0..N} bridges.
Change-Id: Icbf98361f8cadbf8e45af9503e53b3ab9f71a9c7
This is very helpful during development to, for example,
run only the specific tags you're developing, while still
retaining all the configuration options.
Change-Id: Ic61638451b207efa7bf9ef1a2259b6a02f360e6b
Without this fix "$port" wasn't properly cleaned up, and
the subsequent ovs-vsctl calls failed (seen in the logs).
With this change it works.
Change-Id: I7acfc5ce7f584d99d4e4e041531191b6fd4e71ec
Otherwise all the ovn-controllers, even though they are connected
to br-migration to avoid br-int manipulation at that stage,
will start sending gratuitous ARPs which will switch FIPs and
router IPs to the wrong hosts.
After the db sync operation has finished, and the configuration
has been pointed to br-int, the ovn_controller containers are brought
back up.
Change-Id: I323d6e56c0b9bcbb2d7810dc03b1644356de153e
Previously we were restarting all the sidecar containers for DHCP;
now we target only the dhcp_agent containers themselves.
Change-Id: I190f6f8ffd16bf6c01dbc4d87d81543c67795ea1
We add a callback module for the ansible-playbooks which will profile
all tasks and provide execution timestamps. We need to raise verbosity
with -vv to show the timestamps during execution along with other
useful details.
Change-Id: I3cce5ab17f01d807e74941662c61ca16ea72cd61
Prior to this patch, we were relying on overcloud node names to decide
where to run ovn-controller and ovn-dbs. This is a bad idea as names
can be anything.
With TripleO composable roles, the safest option would be to run
ovn-dbs everywhere where neutron_api was running in ML2/OVS and run
ovn-controller where the OVS agent was. This patch figures that out
from the tripleo-ansible-inventory output.
Change-Id: Ic053f5a1ae152d564bf9f0f052ee8c80de786313
Signed-off-by: Daniel Alvarez <dalvarez@redhat.com>
The changes in this commit make use of neutron-netns-cleanup within
the kolla containers used by tripleo to clean up the neutron
resources found on each host.
Also neutron security group filtering rules inserted in iptables
are removed.
Change-Id: I43fb721d9b73ff09e981f3c2555efa8aca067cf9
Without this change it will try to update the dhcp-agents on
compute nodes too, which don't exist.
Change-Id: I4f5968c9a3939f57e1011ff6c0d24aedaa8c8a5a
The DHCP MTU steps need to be separated into
1) Updating the dhcp agent t1 parameter, so the VMs
will start renewing the MTU details more quickly
2) Updating the MTU parameters on private vxlan networks.
Otherwise, when the MTU is updated the VMs will still be renewing
on the old lease time for a very long time (>24h) until they
attempt a new renewal and get the new T1 parameter from the DHCP
server.
Change-Id: I0570fdc38ea62ba254cf425dab7c3ccc5796aebf
This patch adds the tool to update the networks MTU to the
entry points for networking-ovn. The idea is to have a command
for the migration tool so that we can better productize it.
NOTE: As this patch is adding new binaries, the tripleo job will
fail while trying to build the RPM as the new files are not present
in the spec. This is a chicken-egg problem as we can't merge the
spec file without this patch so I'm making the job non-voting for
now.
Change-Id: I265baabdc80a3df9df07d7c33d850184b432d9d6
Signed-off-by: Daniel Alvarez <dalvarez@redhat.com>
This patch adds a few commands to the ovn_migration.sh script so that the migration
process can be run in steps. It also adds a new ansible playbook file to configure
'dhcp_renewal_time' in the dhcp agent configuration file.
With this patch, the user has to run
Step 1. ./ovn_migration.sh generate-inventory -> This generates the ansible inventory
file - hosts_for_migration
Step 2. ./ovn_migration.sh reduce-mtu -> This reduces the MTU of the pre-migration
VxLAN tenant networks and configures 'dhcp_renewal_time' in the dhcp agent
configuration file on the nodes where the DHCP agent docker service is running.
Step 3. ./ovn_migration.sh start-migration -> This kick-starts the migration.
This patch also updates the documentation accordingly.
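As a rough sketch, the 'dhcp_renewal_time' change in step 2 could be
expressed as a single Ansible task. The variable dhcp_agent_ini_path and
the value 30 are hypothetical, the option is assumed to live in the
DEFAULT section, and the playbook added by this patch may be written
differently. On recent Ansible releases the module is named
community.general.ini_file.

```yaml
# Hypothetical sketch: the path variable and the value are illustrative.
- name: Shorten the DHCP renewal time before reducing the MTU
  ini_file:
    path: "{{ dhcp_agent_ini_path }}"   # hypothetical variable
    section: DEFAULT
    option: dhcp_renewal_time
    value: "30"                         # seconds, example value only
```

The DHCP agent has to be restarted afterwards for the new value to take
effect.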
Change-Id: I27ec5d2e42fa8bb1c9e10d9e45a491e0e448cf9f
This patch adds ansible playbooks and roles to carry out migration of
an existing ML2/OVS tripleo setup to ML2/OVN.
In this patch, all the migration tasks are carried out using ansible.
Below are the migration steps
1. Generates a hosts file with the ip addresses of the controllers
and computes.
2. Creates pre migration resources (including a VM) and validates them
3. Runs the overcloud deploy script with the ovn-controllers configured to
use a temporary ovs bridge (OVNIntegrationBridge: "br-migration")
4. Carries out the migration tasks
- Generate the OVN north db by running ovn sync util
- Configure ovn-controllers to take over br-int
- Delete all the qrouter/q-dhcp namespaces
5. Validates the pre migration resources again to check if everything is fine
or not.
6. Deletes the pre migration resources and creates and validates post migration
resources.
7. Deletes the post migration resources
8. And finally runs the overcloud deploy script again, setting
OVNIntegrationBridge: "br-int"
Co-authored-by: Miguel Angel Ajo <majopela@redhat.com>
Change-Id: I29f0d729f8e2ad644aa1eead7c0802995ee279a9
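Steps 3 and 8 of the migration above differ only in the value of
OVNIntegrationBridge. A minimal sketch, assuming the parameter is passed
to the overcloud deploy script through a Heat environment file under
parameter_defaults (the usual TripleO convention; how the file is
generated here is not shown in this commit message):

```yaml
# Step 3: ovn-controller uses a temporary bridge during the migration.
parameter_defaults:
  OVNIntegrationBridge: "br-migration"

# Step 8 (a separate, later deploy): switch back to the regular bridge.
# parameter_defaults:
#   OVNIntegrationBridge: "br-int"
```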