Enhanced VNF placements blueprint

This spec introduces specifying NUMA, CPU pinning, huge pages and the SR-IOV
NIC type as part of the VNFD to improve the performance of VNF VDUs.

Change-Id: I4414df00b72b98626d7564e383a1a60893fd6535
Author: gong yong sheng <gong.yongsheng@99cloud.net>
Co-Authored-By: Vishwanath Jayaraman <vishwanathj@hotmail.com>
New file: specs/mitaka/enhanced-placement.rst

..
  This work is licensed under a Creative Commons Attribution 3.0 Unported
  License.

  http://creativecommons.org/licenses/by/3.0/legalcode

=======================
Enhanced VNF placements
=======================

Include the URL of your launchpad blueprint:

https://blueprints.launchpad.net/tacker/+spec/enhanced-vnf-placement

This spec proposes a declarative way to place a VNF's VDUs effectively.

Problem description
===================

VNF VDUs are placed just like normal VMs. This does not satisfy the
performance requirements of VNFs that are:

* IO intensive

* Computation intensive

Proposed change
===============

Introduce new host properties in the VNFD template that allow specifying CPU
pinning, huge pages, NUMA placement and vCPU topology per VDU. Additionally,
provide a way to specify SR-IOV NICs for the VDU network interfaces.

CPU pinning avoids unpredictable latency and host CPU overcommit by pinning
guest vCPUs to host pCPUs, thereby improving the performance of applications
running in the guest.

Huge pages help ensure that the guest has 100% dedicated RAM that will never
be swapped out.

NUMA placement decreases latency by avoiding cross-node memory and I/O device
access by guests.

SR-IOV port allocation to a guest enables network traffic to bypass the
software layer of the hypervisor and flow directly between the SR-IOV NIC and
the guest, thereby improving performance.

VNFD host properties schema:

.. code-block:: yaml

  topology_template:
    node_templates:
      vdu1:
        type: tosca.nodes.nfv.VDU
        capabilities:
          nfv_compute:
            properties:
              disk_size: {get_input: dsize}
              # Disk size of the VM in GB.

              num_cpus: {get_input: cpu_count}
              # CPU count for the VM.

              mem_size: {get_input: msize}
              # Memory size in MB for the VM.

              cpu_allocation:
                cpu_affinity: {get_input: affinity}
                # The only supported value is 'dedicated', which ensures
                # that the guest vCPUs associated with the VDU are strictly
                # pinned to a set of host pCPUs. With any other value, or
                # none, guest vCPUs float freely across host pCPUs.

                thread_allocation: {get_input: threadalloc}
                # Valid values are 'avoid', 'separate', 'isolate' and
                # 'prefer'. They apply only if 'cpu_affinity' is set to
                # 'dedicated'. 'avoid' means the guest should not be
                # placed on a host that has hyperthreads. 'separate'
                # places each vCPU on a different core if the host has
                # threads. 'isolate' places each vCPU on a different core,
                # and no vCPUs from other guests are placed on the same
                # core. 'prefer' places vCPUs on the same core if the host
                # has threads, so that they are thread siblings.

                socket_count: {get_input: sock_cnt}
                # Preferred number of sockets to expose to the guest. A
                # socket count greater than 1 enables a VM to be spread
                # across NUMA nodes. Note: while the template specifies
                # exact socket, core and thread counts, the underlying
                # IaaS system (in this case Nova) might optimize into a
                # slightly different combination of sockets, cores and
                # threads.

                core_count: {get_input: core_cnt}
                # Preferred number of cores per socket to expose to the
                # guest.

                thread_count: {get_input: thrdcnt}
                # Preferred number of threads per core to expose to the
                # guest.

              mem_page_size: {get_input: mem_pg_sz}
              # Page size to use when huge pages are requested. Allowed
              # values are 'small', 'large', 'any' and a custom page size
              # in MB. 'small' usually maps to 4 KB pages on x86, 'large'
              # maps to either 2 MB or 1 GB on x86, and 'any' leaves the
              # choice to the driver implementation.

              numa_node_count: {get_input: numa_count}
              # Number of NUMA nodes to expose to the guest. When
              # numa_node_count is specified, the CPU and memory resources
              # of the guest are allocated symmetrically across the NUMA
              # nodes. Specifying only one of numa_node_count or
              # numa_nodes is supported; if both are specified, the
              # numa_node_count value takes precedence.

              numa_nodes:
                # Allows asymmetrical allocation of CPUs and RAM. A
                # minimum of two nodes with unique node labels must be
                # defined for this to take effect.

                <node_label>:
                  # A unique name for the node label.

                  id: {get_input: numa_id}
                  # NUMA node id.

                  vcpus: {get_input: vcpu_list}
                  # List of guest vCPUs mapped to this NUMA node.

                  mem_size: {get_input: mem_size}
                  # RAM in MB mapped to this NUMA node.

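Although this spec does not mandate how these properties are realized, on an
OpenStack deployment they roughly correspond to Nova flavor extra specs (see
the Nova specs under References). The mapping below is an illustrative
assumption, not part of the VNFD schema:

.. code-block:: yaml

  # Assumed Nova flavor extra specs for a VDU requesting dedicated pinning,
  # isolated threads, large pages and two NUMA nodes. The extra spec keys
  # are real Nova keys; the property-to-key mapping is an assumption about
  # the eventual implementation.
  hw:cpu_policy: dedicated        # from cpu_allocation.cpu_affinity
  hw:cpu_thread_policy: isolate   # from cpu_allocation.thread_allocation
  hw:mem_page_size: large         # from mem_page_size
  hw:numa_nodes: 2                # from numa_node_count
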
For SR-IOV support, a new property called "type", which accepts the value
'sriov', is introduced on the tosca.nodes.nfv.CP type.

Below are examples of using the above VNFD template schema.

CPU pinning: below is an example of pinning guest vCPUs to host pCPUs.

.. code-block:: yaml

  topology_template:
    node_templates:
      VDU1:
        type: tosca.nodes.nfv.VDU

        capabilities:
          nfv_compute:
            properties:
              num_cpus: 8
              mem_size: 4096 # memory size in MB
              disk_size: 8 # value in GB

              cpu_allocation:
                cpu_affinity: dedicated
                thread_allocation: isolate

Huge pages: an example of specifying that huge pages be used for a guest VM.

.. code-block:: yaml

  topology_template:
    node_templates:
      VDU1:
        type: tosca.nodes.nfv.VDU

        capabilities:
          nfv_compute:
            properties:
              num_cpus: 8
              mem_size: 4096 # memory size in MB
              disk_size: 8 # value in GB
              mem_page_size: large

NUMA placement: below is an example of specifying asymmetrical allocation of
CPUs and RAM across NUMA nodes. Note that all six vCPUs and all 6144 MB of
RAM are distributed across the two nodes.

.. code-block:: yaml

  topology_template:
    node_templates:
      VDU1:
        type: tosca.nodes.nfv.VDU

        capabilities:
          nfv_compute:
            properties:
              num_cpus: 6
              mem_size: 6144
              disk_size: 8
              numa_nodes:

                node1:
                  id: 0
                  vcpus: [ 0, 1 ]
                  mem_size: 2048
                node2:
                  id: 1
                  vcpus: [ 2, 3, 4, 5 ]
                  mem_size: 4096

NUMA placement: below is an example of specifying symmetrical allocation of
CPUs and RAM across NUMA nodes. With numa_node_count set to 2, the 8 vCPUs
and 6144 MB of RAM are split evenly, i.e. 4 vCPUs and 3072 MB per node.

.. code-block:: yaml

  topology_template:
    node_templates:
      VDU1:
        type: tosca.nodes.nfv.VDU

        capabilities:
          nfv_compute:
            properties:
              num_cpus: 8
              mem_size: 6144
              disk_size: 8
              numa_node_count: 2

Combination example: below is an example that specifies huge pages, CPU
pinning, NUMA placement and avoidance of hyper-threaded hosts, as well as the
socket, core and thread counts to be exposed to the guest.

.. code-block:: yaml

  topology_template:
    node_templates:
      VDU1:
        type: tosca.nodes.nfv.VDU

        capabilities:
          nfv_compute:
            properties:
              num_cpus: 8
              mem_size: 4096
              disk_size: 80
              mem_page_size: 1G
              cpu_allocation:
                cpu_affinity: dedicated
                thread_allocation: avoid
                socket_count: 2
                core_count: 2
                thread_count: 2

              numa_node_count: 2

Network interfaces example: below is an example that defines multiple network
interfaces, two of them with the SR-IOV NIC type.

.. code-block:: yaml

  topology_template:
    node_templates:
      VDU1:
        type: tosca.nodes.nfv.VDU

        capabilities:
          nfv_compute:
            properties:
              num_cpus: 8
              mem_size: 4096 MB
              disk_size: 8 GB
              mem_page_size: 1G

              cpu_allocation:
                cpu_affinity: dedicated
                thread_allocation: isolate
                socket_count: 2
                core_count: 2
                thread_count: 2

              numa_node_count: 2

      CP11:
        type: tosca.nodes.nfv.CP
        requirements:
          - virtualBinding: VDU1
          - virtualLink: net_mgmt

      CP12:
        type: tosca.nodes.nfv.CP
        properties:
          anti_spoof_protection: false
          type: sriov
        requirements:
          - virtualBinding: VDU1
          - virtualLink: net_ingress

      CP13:
        type: tosca.nodes.nfv.CP
        properties:
          anti_spoof_protection: false
          type: sriov
        requirements:
          - virtualBinding: VDU1
          - virtualLink: net_egress

      net_mgmt:
        type: tosca.nodes.nfv.VL.ELAN

      net_ingress:
        type: tosca.nodes.nfv.VL.ELAN

      net_egress:
        type: tosca.nodes.nfv.VL.ELAN

Alternatives
------------

The alternative would be to create a Nova flavor ahead of time and use that
flavor in the VNFD template, as sketched below.

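For illustration, a pre-created flavor encoding a placement similar to the
combination example above might be defined as follows. This is a sketch: the
flavor name and values are hypothetical, though the commands and extra spec
keys are real Nova/openstackclient ones.

.. code-block:: console

  $ openstack flavor create vnf.pinned --vcpus 8 --ram 4096 --disk 80
  $ openstack flavor set vnf.pinned \
      --property hw:cpu_policy=dedicated \
      --property hw:mem_page_size=large \
      --property hw:numa_nodes=2
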
Data model impact
-----------------

None

REST API impact
---------------

None

Security impact
---------------

None

Other end user impact
---------------------

None

Performance Impact
------------------

None on Tacker itself. VDUs deployed with these properties are expected to
show improved I/O and compute performance, as described in the proposed
change.

Other deployer impact
---------------------

The deployer is expected to prepare the host OS (GRUB changes) on the compute
nodes to reserve huge pages, isolate CPUs and enable SR-IOV. Configuration
changes are also expected in the nova and neutron configuration files. A
sketch of this preparation follows.

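For illustration only, the preparation might look like the following sketch.
The option names are Mitaka-era nova/neutron options; all values, CPU ranges
and device names are assumptions to be adapted to the actual hosts.

.. code-block:: ini

  # Kernel boot parameters on the compute node (e.g. via GRUB): reserve
  # 1 GB huge pages, isolate host CPUs 8-15 for guests and enable the
  # IOMMU for SR-IOV. Values are examples only.
  # default_hugepagesz=1G hugepagesz=1G hugepages=16 isolcpus=8-15 intel_iommu=on

  # nova.conf on the compute node: pin guests to the isolated host CPUs
  # and whitelist an SR-IOV capable NIC (device name is an assumption).
  [DEFAULT]
  vcpu_pin_set = 8-15
  pci_passthrough_whitelist = {"devname": "eth3", "physical_network": "physnet2"}

  # nova.conf on the controller: enable NUMA- and PCI-aware scheduling by
  # appending to the existing filter list, e.g.:
  # scheduler_default_filters = <existing filters>,NUMATopologyFilter,PciPassthroughFilter

  # ml2_conf.ini: add the SR-IOV mechanism driver alongside the existing one.
  [ml2]
  mechanism_drivers = openvswitch,sriovnicswitch
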
Developer impact
----------------

None

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  gong yong sheng <gong.yongsheng@99cloud.net>

Other contributors:
  Vishwanath Jayaraman <vishwanathj@hotmail.com>

Work Items
----------

1) NUMA support
2) SR-IOV support

Dependencies
============

* https://blueprints.launchpad.net/tacker/+spec/automatic-resource-creation

Testing
=======

Testing NUMA, SR-IOV and PCI passthrough requires special hardware; the
normal OpenStack CI environment does not provide it.

Manual testing is therefore a must, and hopefully someone can provide their
own hosts in a lab to do third-party testing.

Other options are:

1. Approach the openstack-infra / QA teams to request that compute resources
   be added at the gate for testing the capabilities in this spec.
2. Have a vendor support a third-party CI job that votes on the features
   called out in this spec.

Documentation Impact
====================

Documentation will be updated to describe how to use this feature.

References
==========

[1] http://docs.openstack.org/developer/nova/testing/libvirt-numa.html

[2] http://redhatstackblog.redhat.com/2015/05/05/cpu-pinning-and-numa-topology-awareness-in-openstack-compute/

[3] https://wiki.openstack.org/wiki/VirtDriverGuestCPUMemoryPlacement

[4] https://specs.openstack.org/openstack/nova-specs/specs/kilo/implemented/input-output-based-numa-scheduling.html

[5] http://specs.openstack.org/openstack/nova-specs/specs/mitaka/approved/virt-driver-cpu-pinning.html

[6] http://redhatstackblog.redhat.com/2015/03/05/red-hat-enterprise-linux-openstack-platform-6-sr-iov-networking-part-i-understanding-the-basics/