nova/nova
Stephen Finucane ac05bc3b38 Handle multiple 'vcpusched' elements during live migrate
When live migrating a pinned instance, we recalculate pinning
information for the destination host and then update the instance's XML
before spawning the instance there. As part of the pinning information
recalculation, we must also recalculate information for realtime cores,
which are configured using the '<vcpusched>' element. The
'nova.virt.libvirt.migration._update_numa_xml' function, which handles
this updating, was assuming there would only be one of these elements.
This is a reasonably sane assumption since this is all we create in the
'nova.virt.libvirt.LibvirtDriver._get_guest_numa_config' function used
to generate the initial instance XML. However, a look at the logs shows
that at least some (all?) versions of libvirt actually rewrite the XML
we're providing them. Compare what is returned from '_get_guest_xml':

  DEBUG nova.virt.libvirt.driver [...] [instance: ...] End _get_guest_xml xml=<domain type="kvm">
    ...
    <cputune>
      <shares>4096</shares>
      ...
      <vcpusched vcpus="2-3" scheduler="fifo" priority="1"/>
    </cputune>
    ...
  </domain>
   {{(pid=12600) _get_guest_xml /opt/stack/nova/nova/virt/libvirt/driver.py:6331}}

to what is seen when we enter '_update_numa_xml' (or via 'virsh dumpxml'
at any point after instance creation):

  DEBUG nova.virt.libvirt.migration [-] _update_numa_xml input xml=<domain type="kvm">
    ...
    <cputune>
      <shares>4096</shares>
      ...
      <vcpusched vcpus="2" scheduler="fifo" priority="1"/>
      <vcpusched vcpus="3" scheduler="fifo" priority="1"/>
    </cputune>
    ...
  </domain>
   {{(pid=12600) _update_numa_xml /opt/stack/nova/nova/virt/libvirt/migration.py:97}}

The solution is simple: rather than trying to modify the existing XML,
simply scrap it and rebuild the elements from scratch. We should
probably do this for all elements, but that can/should be tackled
separately.
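
A minimal sketch of the "scrap and rebuild" approach follows. It is
illustrative only and is not the code from this change: the function name
'rebuild_vcpusched', its parameters, and the use of the standard library's
'xml.etree.ElementTree' are all assumptions made for the example. It emits one
'<vcpusched>' element per realtime vCPU, which is the form libvirt was seen to
produce in the second log excerpt above.

```python
import xml.etree.ElementTree as ET


def rebuild_vcpusched(domain_xml, realtime_vcpus, scheduler="fifo", priority=1):
    """Hypothetical helper: replace all <vcpusched> elements in a domain XML.

    Rather than editing whatever <vcpusched> elements are present (there may
    be one, or one per vCPU), remove them all and recreate them from the
    realtime vCPU set calculated for the destination host.
    """
    root = ET.fromstring(domain_xml)
    cputune = root.find("./cputune")
    if cputune is None:
        # Nothing to update: the guest has no CPU tuning section.
        return domain_xml

    # Scrap every existing element, however many libvirt left us with.
    for elem in cputune.findall("vcpusched"):
        cputune.remove(elem)

    # Rebuild from scratch, one element per realtime vCPU.
    for vcpu in sorted(realtime_vcpus):
        ET.SubElement(
            cputune,
            "vcpusched",
            vcpus=str(vcpu),
            scheduler=scheduler,
            priority=str(priority),
        )

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    source_xml = (
        '<domain type="kvm"><cputune><shares>4096</shares>'
        '<vcpusched vcpus="2" scheduler="fifo" priority="1"/>'
        '<vcpusched vcpus="3" scheduler="fifo" priority="1"/>'
        "</cputune></domain>"
    )
    updated = rebuild_vcpusched(source_xml, {4, 5})
    scheds = ET.fromstring(updated).findall("./cputune/vcpusched")
    print([s.get("vcpus") for s in scheds])  # prints ['4', '5']
```

Because the old elements are discarded wholesale, the result does not depend
on whether the source XML contained a single ranged element or several
per-vCPU elements. Sibling elements such as '<shares>' are left untouched.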

Change-Id: Ic01603a91f6099f1068af0e955f3e1056021d673
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Closes-Bug: #1889257
(cherry picked from commit b6aef1ec4f)
2020-07-29 18:00:43 +01:00
accelerator Delete ARQs for an instance when the instance is deleted. 2020-03-24 22:44:18 -07:00
api Merge "Pass the actual target in quota class policy" 2020-04-23 01:28:44 +00:00
cmd Add nova-status upgrade check and reno for policy new defaults 2020-05-04 18:33:37 +00:00
compute compute: Allow snapshots to be created from PAUSED volume backed instances 2020-05-19 09:46:57 +01:00
conductor Support live migration with vpmem 2020-04-07 13:13:13 +00:00
conf Reserve DISK_GB resource for the image cache 2020-05-20 07:25:32 +00:00
console Merge "Allow TLS ciphers/protocols to be configurable for console proxies" 2020-02-24 17:27:02 +00:00
db Merge "remove DISTINCT ON SQL instruction that does nothing on MySQL" 2020-03-25 23:18:58 +00:00
hacking Switch to hacking 2.x 2020-01-17 11:30:40 +00:00
image Remove 'nova.image.api' module 2020-02-18 11:45:39 +00:00
keymgr
locale Imported Translations from Zanata 2020-04-28 08:35:34 +00:00
network Merge "nova-net: Remove unused parameters" 2020-03-26 21:46:16 +00:00
notifications Remove 'nova.image.api' module 2020-02-18 11:45:39 +00:00
objects objects: Update keypairs when saving an instance 2020-07-23 09:46:14 +00:00
pci support pci numa affinity policies in flavor and image 2019-12-11 14:39:12 +00:00
policies Merge "Add new default roles in quota class policies" 2020-04-21 08:39:41 +00:00
privsep images: Make JSON the default output format of calls to qemu-img info 2020-04-16 16:38:24 +01:00
scheduler Enable and use COMPUTE_ACCELERATORS trait. 2020-03-27 22:42:37 -07:00
servicegroup Handle ServiceNotFound in DbDriver._report_state 2019-12-04 09:50:17 -05:00
tests Handle multiple 'vcpusched' elements during live migrate 2020-07-29 18:00:43 +01:00
virt Handle multiple 'vcpusched' elements during live migrate 2020-07-29 18:00:43 +01:00
volume Merge "Add retry to cinder API calls related to volume detach" 2020-04-20 17:36:33 +00:00
__init__.py
availability_zones.py trivial: Fetch 'Service' objects once when building AZs 2020-02-05 21:26:23 +00:00
baserpc.py
block_device.py
cache_utils.py trivial: Remove unused 'cache_utils' APIs 2020-02-05 17:20:28 +00:00
config.py remove support of oslo.messaging 9.8.0 warning message 2020-05-22 12:51:46 +00:00
context.py Reset the cell cache for database access in Service 2020-04-08 17:48:18 +00:00
crypto.py
debugger.py
exception.py Merge "libvirt: Add support for stable device rescue" 2020-04-10 14:56:13 +00:00
exception_wrapper.py
filters.py
hooks.py
i18n.py
loadables.py trivial: Remove dead code 2019-12-12 10:55:02 +00:00
manager.py
middleware.py Rename 'nova.common.config' module to 'nova.middleware' 2019-08-16 00:53:03 +01:00
monkey_patch.py Monkey patch original current_thread _active 2020-02-12 16:34:56 -05:00
policy.py Use oslo policy flag to disable default change warning instead of all 2020-04-15 02:23:32 +00:00
profiler.py
quota.py Make quotas respect instance_list_per_project_cells 2020-05-19 02:20:28 +00:00
rpc.py
safe_utils.py
service.py Reset the cell cache for database access in Service 2020-04-08 17:48:18 +00:00
service_auth.py
test.py func tests: move _run_periodics() into base class 2020-03-24 10:10:53 -04:00
utils.py compute: Extract _get_bdm_image_metadata into nova.utils 2020-04-09 08:39:36 +01:00
version.py
weights.py
wsgi.py