diskimage-builder/doc/source/developer/design.rst
SamYaple 5b6716cee8 Use fstrim to prep the block device
This cuts the image size down alot, esspecially if there were lots of
small file deletes.

The fstrim utility is in the util-linux package and should be on
most all systems. fstrim also works with XFS, ext4, btrfs, etc
prodiving the kernel is new enough.

A reduction of 25% or more in size is common.

Change-Id: I269b4416be450369616f9b8e030f84c30e329804
2016-03-13 16:24:59 +00:00

3.3 KiB

Design

Images are built using a chroot and bind mounted /proc /sys and /dev. The goal of the image building process is to produce blank slate machines that have all the necessary bits to fulfill a specific purpose in the running of an OpenStack cloud: e.g. a nova-compute node. Images produce either a filesystem image with a label of cloudimg-rootfs, or can be customised to produce whole disk images (but will still contain a filesystem labelled cloudimg-rootfs). Once the file system tree is assembled a loopback device with filesystem (or partition table and file system) is created and the tree copied into it. The file system created is an ext4 filesystem just large enough to hold the file system tree and can be resized up to 1PB in size.

To produce the smallest image the utility fstrim is used. When deleting a file the space is simply marked as free on the disk, the file is still there until it is overwritten. fstrim informs the underlying disk to drop those bytes the end result of which is like writting zeros over those sectors. The same effect could be achieved by creating a large file full of zeros and removing that file, however that method is far more IO intensive.

An element is a particular set of code that alters how the image is built, or runs within the chroot to prepare the image. E.g. the local-config element copies in the http proxy and ssh keys of the user running the image build process into the image, whereas the vm element makes the image build a regular VM image with partition table and installed grub boot sector. The mellanox element adds support for mellanox infiniband hardware to both the deploy ramdisk and the built images.

Images must specify a base distribution image element. Currently base distribution elements exist for fedora, rhel, ubuntu, debian and opensuse. Other distributions may be added in future, the infrastructure deliberately makes few assumptions about the exact operating system in use. The base image has opensshd running (a new key generated on first boot) and accepts keys via the cloud metadata service, loading them into the distribution specific default user account.

The goal of a built image is to have any global configuration ready to roll, but nothing that ties it to a specific cloud instance: images should be able to be dropped into a test cloud and validated, and then deployed into a production cloud (usually via bare metal nova) for production use. As such, the image contents can be modelled as three distinct portions:

  • global content: the actual code, kernel, always-applicable config (like disabling password authentication to sshd).
  • metadata / config management provided configuration: user ssh keys, network address and routes, configuration management server location and public key, credentials to access other servers in the cloud. These are typically refreshed on every boot.
  • persistent state: sshd server key, database contents, swift storage areas, nova instance disk images, disk image cache. These would typically be stored on a dedicated partition and not overwritten when re-deploying the image.

The goal of the image building tools is to create machine images that contain the correct global content and are ready for 'last-mile' configuration by the nova metadata API, after which a configuration management system can take over (until the next deploy, when it all starts over from scratch).