[admin-guide] Add huge page documentation
Document what huge pages are, why you may wish to use them, and how you can go about doing so. A minor typo fix is included in the 'compute-cpu-topologies' to ensure "huge pages" is the term used throughout all docs. Change-Id: I08b27f70bbcae69fa2bc4d9319ddd27ba706b496
This commit is contained in:
parent
5d799a67b1
commit
0aa8f16e1f
@ -23,5 +23,6 @@ instance for these kind of workloads.
|
|||||||
.. toctree::
|
.. toctree::
|
||||||
:maxdepth: 2
|
:maxdepth: 2
|
||||||
|
|
||||||
compute-pci-passthrough.rst
|
compute-pci-passthrough
|
||||||
compute-cpu-topologies.rst
|
compute-cpu-topologies
|
||||||
|
compute-huge-pages
|
||||||
|
@ -56,11 +56,11 @@ also should be local. Finally, PCI devices are directly associated with
|
|||||||
specific NUMA nodes for the purposes of DMA. Instances that use PCI or SR-IOV
|
specific NUMA nodes for the purposes of DMA. Instances that use PCI or SR-IOV
|
||||||
devices should be placed on the NUMA node associated with these devices.
|
devices should be placed on the NUMA node associated with these devices.
|
||||||
|
|
||||||
By default, an instance floats across all NUMA nodes on a host. NUMA
|
By default, an instance floats across all NUMA nodes on a host. NUMA awareness
|
||||||
awareness can be enabled implicitly through the use of hugepages or pinned
|
can be enabled implicitly through the use of huge pages or pinned CPUs or
|
||||||
CPUs or explicitly through the use of flavor extra specs or image metadata.
|
explicitly through the use of flavor extra specs or image metadata. In all
|
||||||
In all cases, the ``NUMATopologyFilter`` filter must be enabled. Details on
|
cases, the ``NUMATopologyFilter`` filter must be enabled. Details on this
|
||||||
this filter are provided in `Scheduling`_ configuration guide.
|
filter are provided in `Scheduling`_ configuration guide.
|
||||||
|
|
||||||
.. caution::
|
.. caution::
|
||||||
|
|
||||||
|
239
doc/admin-guide/source/compute-huge-pages.rst
Normal file
239
doc/admin-guide/source/compute-huge-pages.rst
Normal file
@ -0,0 +1,239 @@
|
|||||||
|
.. _compute-huge-pages:
|
||||||
|
|
||||||
|
==========
|
||||||
|
Huge pages
|
||||||
|
==========
|
||||||
|
|
||||||
|
The huge page feature in OpenStack provides important performance improvements
|
||||||
|
for applications that are highly memory IO-bound.
|
||||||
|
|
||||||
|
.. note::
|
||||||
|
|
||||||
|
Huge pages may also be referred to hugepages or large pages, depending on
|
||||||
|
the source. These terms are synonyms.
|
||||||
|
|
||||||
|
Pages, the TLB and huge pages
|
||||||
|
-----------------------------
|
||||||
|
|
||||||
|
Pages
|
||||||
|
Physical memory is segmented into a series of contiguous regions called
|
||||||
|
pages. Each page contains a number of bytes, referred to as the page size.
|
||||||
|
The system retrieves memory by accessing entire pages, rather than byte by
|
||||||
|
byte.
|
||||||
|
|
||||||
|
Translation Lookaside Buffer (TLB)
|
||||||
|
A TLB is used to map the virtual addresses of pages to the physical addresses
|
||||||
|
in actual memory. The TLB is a cache and is not limitless, storing only the
|
||||||
|
most recent or frequently accessed pages. During normal operation, processes
|
||||||
|
will sometimes attempt to retrieve pages that are not stored in the cache.
|
||||||
|
This is known as a TLB miss and results in a delay as the processor iterates
|
||||||
|
through the pages themselves to find the missing address mapping.
|
||||||
|
|
||||||
|
Huge Pages
|
||||||
|
The standard page size in x86 systems is 4 kB. This is optimal for general
|
||||||
|
purpose computing but larger page sizes - 2 MB and 1 GB - are also available.
|
||||||
|
These larger page sizes are known as huge pages. Huge pages result in less
|
||||||
|
efficient memory usage as a process will not generally use all memory
|
||||||
|
available in each page. However, use of huge pages will result in fewer
|
||||||
|
overall pages and a reduced risk of TLB misses. For processes that have
|
||||||
|
significant memory requirements or are memory intensive, the benefits of huge
|
||||||
|
pages frequently outweigh the drawbacks.
|
||||||
|
|
||||||
|
Persistent Huge Pages
|
||||||
|
On Linux hosts, persistent huge pages are huge pages that are reserved
|
||||||
|
upfront. The HugeTLB provides for the mechanism for this upfront
|
||||||
|
configuration of huge pages. The HugeTLB allows for the allocation of varying
|
||||||
|
quantities of different huge page sizes. Allocation can be made at boot time
|
||||||
|
or run time. Refer to the `Linux hugetlbfs guide`_ for more information.
|
||||||
|
|
||||||
|
Transparent Huge Pages (THP)
|
||||||
|
On Linux hosts, transparent huge pages are huge pages that are automatically
|
||||||
|
provisioned based on process requests. Transparent huge pages are provisioned
|
||||||
|
on a best effort basis, attempting to provision 2 MB huge pages if available
|
||||||
|
but falling back to 4 kB small pages if not. However, no upfront
|
||||||
|
configuration is necessary. Refer to the `Linux THP guide`_ for more
|
||||||
|
information.
|
||||||
|
|
||||||
|
Enabling huge pages on the host
|
||||||
|
-------------------------------
|
||||||
|
|
||||||
|
Persistent huge pages are required owing to their guaranteed availability.
|
||||||
|
However, persistent huge pages are not enabled by default in most environments.
|
||||||
|
The steps for enabling huge pages differ from platform to platform and only the
|
||||||
|
steps for Linux hosts are described here. On Linux hosts, the number of
|
||||||
|
persistent huge pages on the host can be queried by checking ``/proc/meminfo``:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
$ grep Huge /proc/meminfo
|
||||||
|
AnonHugePages: 0 kB
|
||||||
|
ShmemHugePages: 0 kB
|
||||||
|
HugePages_Total: 0
|
||||||
|
HugePages_Free: 0
|
||||||
|
HugePages_Rsvd: 0
|
||||||
|
HugePages_Surp: 0
|
||||||
|
Hugepagesize: 2048 kB
|
||||||
|
|
||||||
|
In this instance, there are 0 persistent huge pages (``HugePages_Total``) and 0
|
||||||
|
transparent huge pages (``AnonHugePages``) allocated. Huge pages can be
|
||||||
|
allocated at boot time or run time. Huge pages require a contiguous area of
|
||||||
|
memory - memory that gets increasingly fragmented the long a host is running.
|
||||||
|
Identifying contiguous areas of memory is a issue for all huge page sizes, but
|
||||||
|
it's particularly problematic for larger huge page sizes such as 1 GB huge
|
||||||
|
pages. Allocating huge pages at boot time will ensure the correct number of huge
|
||||||
|
pages is always available, while allocating them at run time can fail if memory
|
||||||
|
has become too fragmented.
|
||||||
|
|
||||||
|
To allocate huge pages at run time, the kernel boot parameters must be extended
|
||||||
|
to include some huge page-specific parameters. This can be achieved by
|
||||||
|
modifying ``/etc/default/grub`` and appending the ``hugepagesz``,
|
||||||
|
``hugepages``, and ``transparent_hugepages=never`` arguments to
|
||||||
|
``GRUB_CMDLINE_LINUX``. To allocate, for example, 2048 persistent 2 MB huge
|
||||||
|
pages at boot time, run:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
# echo 'GRUB_CMDLINE_LINUX="$GRUB_CMDLINE_LINUX hugepagesz=2M hugepages=2048 transparent_hugepage=never"' > /etc/default/grub
|
||||||
|
$ grep GRUB_CMDLINE_LINUX /etc/default/grub
|
||||||
|
GRUB_CMDLINE_LINUX="..."
|
||||||
|
GRUB_CMDLINE_LINUX="$GRUB_CMDLINE_LINUX hugepagesz=2M hugepages=2048 transparent_hugepage=never"
|
||||||
|
|
||||||
|
.. important::
|
||||||
|
|
||||||
|
Persistent huge pages are not usable by standard host OS processes. Ensure
|
||||||
|
enough free, non-huge page memory is reserved for these processes.
|
||||||
|
|
||||||
|
Reboot the host, then validate that huge pages are now available:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
$ grep "Huge" /proc/meminfo
|
||||||
|
AnonHugePages: 0 kB
|
||||||
|
ShmemHugePages: 0 kB
|
||||||
|
HugePages_Total: 2048
|
||||||
|
HugePages_Free: 2048
|
||||||
|
HugePages_Rsvd: 0
|
||||||
|
HugePages_Surp: 0
|
||||||
|
Hugepagesize: 2048 kB
|
||||||
|
|
||||||
|
There are now 2048 2 MB huge pages totalling 4 GB of huge pages. These huge
|
||||||
|
pages must be mounted. On most platforms, this happens automatically. To verify
|
||||||
|
that the huge pages are mounted, run:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
# mount | grep huge
|
||||||
|
hugetlbfs on /dev/hugepages type hugetlbfs (rw)
|
||||||
|
|
||||||
|
In this instance, the huge pages are mounted at ``/dev/hugepages``. This mount
|
||||||
|
point varies from platform to platform. If the above command did not return
|
||||||
|
anything, the hugepages must be mounted manually. To mount the huge pages at
|
||||||
|
``/dev/hugepages``, run:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
# mkdir -p /dev/hugepages
|
||||||
|
# mount -t hugetlbfs hugetlbfs /dev/hugepages
|
||||||
|
|
||||||
|
There are many more ways to configure huge pages, including allocating huge
|
||||||
|
pages at run time, specifying varying allocations for different huge page
|
||||||
|
sizes, or allocating huge pages from memory affinitized to different NUMA
|
||||||
|
nodes. For more information on configuring huge pages on Linux hosts, refer to
|
||||||
|
the `Linux hugetlbfs guide`_.
|
||||||
|
|
||||||
|
Customizing instance huge pages allocations
|
||||||
|
-------------------------------------------
|
||||||
|
|
||||||
|
.. important::
|
||||||
|
|
||||||
|
The functionality described below is currently only supported by the
|
||||||
|
libvirt/KVM driver.
|
||||||
|
|
||||||
|
.. important::
|
||||||
|
|
||||||
|
For performance reasons, configuring huge pages for an instance will
|
||||||
|
implicitly result in a NUMA topology being configured for the instance.
|
||||||
|
Configuring a NUMA topology for an instance requires enablement of
|
||||||
|
``NUMATopologyFilter``. Refer to :doc:`compute-cpu-topologies` for more
|
||||||
|
information.
|
||||||
|
|
||||||
|
By default, an instance does not use huge pages for its underlying memory.
|
||||||
|
However, huge pages can bring important or required performance improvements
|
||||||
|
for some workloads. Huge pages must be requested explicitly through the use of
|
||||||
|
flavor extra specs or image metadata. To request an instance use huge pages,
|
||||||
|
run:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
$ openstack flavor set m1.large --property hw:mem_page_size=large
|
||||||
|
|
||||||
|
Different platforms offer different huge page sizes. For example: x86-based
|
||||||
|
platforms offer 2 MB and 1 GB huge page sizes. Specific huge page sizes can be
|
||||||
|
also be requested, with or without a unit suffix. The unit suffix must be one
|
||||||
|
of: Kb(it), Kib(it), Mb(it), Mib(it), Gb(it), Gib(it), Tb(it), Tib(it), KB,
|
||||||
|
KiB, MB, MiB, GB, GiB, TB, TiB. Where a unit suffix is not provided, Kilobytes
|
||||||
|
are assumed. To request an instance to use 2 MB huge pages, run one of:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
$ openstack flavor set m1.large --property hw:mem_page_size=2Mb
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
$ openstack flavor set m1.large --property hw:mem_page_size=2048
|
||||||
|
|
||||||
|
Enabling huge pages for an instance can have negative consequences for other
|
||||||
|
instances by consuming limited huge pages resources. To explicitly request
|
||||||
|
an instance use small pages, run:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
$ openstack flavor set m1.large --property hw:mem_page_size=small
|
||||||
|
|
||||||
|
.. note::
|
||||||
|
|
||||||
|
Explicitly requesting any page size will still result in a NUMA topology
|
||||||
|
being applied to the instance, as described earlier in this document.
|
||||||
|
|
||||||
|
Finally, to leave the decision of huge or small pages to the compute driver,
|
||||||
|
run:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
$ openstack flavor set m1.large --property hw:mem_page_size=any
|
||||||
|
|
||||||
|
For more information about the syntax for ``hw:mem_page_size``, refer to the
|
||||||
|
`Flavors`_ guide.
|
||||||
|
|
||||||
|
Applications are frequently packaged as images. For applications that require
|
||||||
|
the IO performance improvements that huge pages provides, configure image
|
||||||
|
metadata to ensure instances always request the specific page size regardless
|
||||||
|
of flavor. To configure an image to use 1 GB huge pages, run:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
$ openstack image set [IMAGE_ID] --property hw_mem_page_size=1GB
|
||||||
|
|
||||||
|
Image metadata takes precedence over flavor extra specs. Thus, configuring
|
||||||
|
competing page sizes causes an exception. By setting a ``small`` page size
|
||||||
|
through image metadata, administrators can prevent users requesting huge pages
|
||||||
|
in flavors and impacting resource utilization. To configure this page size,
|
||||||
|
run:
|
||||||
|
|
||||||
|
.. code-block:: console
|
||||||
|
|
||||||
|
$ openstack image set [IMAGE_ID] --property hw_mem_page_size=small
|
||||||
|
|
||||||
|
.. note::
|
||||||
|
|
||||||
|
Explicitly requesting any page size will still result in a NUMA topology
|
||||||
|
being applied to the instance, as described earlier in this document.
|
||||||
|
|
||||||
|
For more information about image metadata, refer to the `Image metadata`_
|
||||||
|
guide.
|
||||||
|
|
||||||
|
.. Links
|
||||||
|
.. _`Linux THP guide`: https://www.kernel.org/doc/Documentation/vm/transhuge.txt
|
||||||
|
.. _`Linux hugetlbfs guide`: https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt
|
||||||
|
.. _`Flavors`: http://docs.openstack.org/admin-guide/compute-flavors.html
|
||||||
|
.. _`Image metadata`: http://docs.openstack.org/image-guide/image-metadata.html
|
Loading…
Reference in New Issue
Block a user