nodepool/doc/source/aws.rst
James E. Blair d5b0dee642 AWS driver create/delete improvements
The default AWS rate limit is 2 instances/sec, but in practice, we
can achieve something like 0.6 instances/sec with the current code.
That's because the create instance REST API call itself takes more
than a second to return.  To achieve even the default AWS rate
(much less a potentially faster one which may be obtainable via
support request), we need to alter the approach.  This change does
the following:

* Parallelizes create API calls.  We create a threadpool with
  (typically) 8 workers to execute create instance calls in the
  background.  2 or 3 workers should be sufficient to meet the
  2/sec rate, more allows for the occasional longer execution time
  as well as a customized higher rate.  We max out at 8 to protect
  nodepool from too many threads.
* The state machine uses the new background create calls instead
  of synchronously creating instances.  This allows other state
  machines to progress further (i.e., advance to ssh keyscan faster
  in the case of a rush of requests).
* Delete calls are batched.  They don't take as long as create calls,
  yet their existence at all uses up rate limiting slots which could
  be used for creating instances.  By batching deletes, we make
  more room for creates.
* A bug in the RateLimiter could cause it not to record the initial
  time and therefore avoid actually rate limiting.  This is fixed.
* The RateLimiter is now thread-safe.
* The default rate limit for AWS is changed to 2 requests/sec.
* Documentation for the 'rate' parameter for the AWS driver is added.
* Documentation for the 'rate' parameter for the Azure driver is
  corrected to describe the rate as requests/sec instead of delay
  between requests.

Change-Id: Ida2cbc59928e183eb7da275ff26d152eae784cfe
2022-06-22 13:28:58 -07:00



AWS Driver

If using the AWS driver to upload diskimages, see VM Import/Export service role for information on configuring the required permissions in AWS. You must also create an S3 Bucket for use by Nodepool.

Selecting the aws driver adds the following options to the providers section of the configuration.

providers.[aws]

An AWS provider's resources are partitioned into groups called pools (see providers.[aws].pools for details), and within a pool, the node types which are to be made available are listed (see providers.[aws].pools.labels for details).

See Boto Configuration for information on how to configure credentials and other settings for AWS access in Nodepool's runtime environment.

Note

For documentation purposes, the option names are prefixed with providers.[aws] to disambiguate them from other drivers, but [aws] is not required in the configuration (e.g. below, providers.[aws].pools refers to the pools key in the providers section when the aws driver is selected).

Example:

providers:
  - name: ec2-us-west-2
    driver: aws
    region-name: us-west-2
    cloud-images:
      - name: debian9
        image-id: ami-09c308526d9534717
        username: admin
    pools:
      - name: main
        max-servers: 5
        subnet-id: subnet-0123456789abcdef0
        security-group-id: sg-01234567890abcdef
        labels:
          - name: debian9
            cloud-image: debian9
            instance-type: t3.medium
            iam-instance-profile:
              arn: arn:aws:iam::123456789012:instance-profile/s3-read-only
            key-name: zuul
            tags:
              key1: value1
          - name: debian9-large
            cloud-image: debian9
            instance-type: t3.large
            key-name: zuul
            tags:
              key1: value1
              key2: value2

name

A unique name for this provider configuration.

region-name

Name of the AWS region to interact with.

profile-name

The AWS credentials profile to load for this provider. If unspecified, the boto3 library will select a profile.

See Boto Configuration for more information.

rate

The maximum number of API operations per second to perform against the provider. The default is 2.
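
As a minimal sketch, the provider-level credential and rate settings described above might be combined as follows. The profile name nodepool-ci is a hypothetical placeholder, and the rate value is illustrative only:

```yaml
providers:
  - name: ec2-us-west-2
    driver: aws
    region-name: us-west-2
    # Hypothetical profile name; it must exist in the boto3
    # credentials configuration available to Nodepool.
    profile-name: nodepool-ci
    # Limit Nodepool to about 2 API operations per second
    # against this provider (illustrative value).
    rate: 2
```

A higher value only makes sense if the AWS account's own API rate limit has been raised accordingly.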

boot-timeout

Once an instance is active, how long to keep trying to connect to the instance via SSH. If the timeout is exceeded, the node launch is aborted and the instance deleted.

launch-timeout

The time to wait from issuing the command to create a new instance until that instance is reported as "active". If the timeout is exceeded, the node launch is aborted and the instance deleted.

launch-retries

The number of times to retry launching a node before considering the request failed.
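
The three launch-related options above could be set together as in the following sketch. The numbers are illustrative rather than recommended values, and the example assumes the timeouts are expressed in seconds:

```yaml
providers:
  - name: ec2-us-west-2
    driver: aws
    region-name: us-west-2
    # Assumption: timeouts are given in seconds.
    # Wait up to 2 minutes for SSH once the instance is active.
    boot-timeout: 120
    # Wait up to 15 minutes for the instance to become active.
    launch-timeout: 900
    # Try each node launch up to 3 times before failing the request.
    launch-retries: 3
```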

post-upload-hook

Filename of an optional script that is called after an image has been uploaded to a provider but before it is put into use. This is useful for performing last-minute validation tests before an image is actually used for build nodes. The script will be called as follows:

<SCRIPT> <PROVIDER> <EXTERNAL_IMAGE_ID> <LOCAL_IMAGE_FILENAME>

If the script exits with result code 0, the upload is treated as successful; otherwise it is treated as failed and the image is deleted.
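
A hook might be wired into a provider as sketched below. The script path is a hypothetical placeholder; any executable that accepts the three arguments shown above and reports its result through its exit code should fit:

```yaml
providers:
  - name: ec2-us-west-2
    driver: aws
    region-name: us-west-2
    # Hypothetical script path. It receives the provider name, the
    # external image ID, and the local image filename as arguments.
    post-upload-hook: /usr/local/bin/validate-image.sh
```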

object-storage

This section is only required when using Nodepool to upload diskimages.

bucket-name

The name of a bucket to use for temporary storage of diskimages while creating snapshots. The bucket must already exist.

image-format

The image format that should be requested from diskimage-builder and also specified to AWS when importing images. One of: ova, vhd, vhdx, vmdk, raw (not all of which are supported by diskimage-builder).
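
The upload-related settings could look like the sketch below. The bucket name is a hypothetical placeholder, and the example assumes that image-format is set at the provider level rather than inside object-storage; check this against your Nodepool version:

```yaml
providers:
  - name: ec2-us-west-2
    driver: aws
    region-name: us-west-2
    object-storage:
      # Hypothetical bucket name; the bucket must already exist.
      bucket-name: example-nodepool-images
    # Assumed to be a provider-level key. The chosen format must be
    # one that diskimage-builder can actually produce.
    image-format: raw
```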

cloud-images

Each entry in this section must refer to an entry in the labels section.

cloud-images:
  - name: ubuntu1804
    image-id: ami-082fd9a18128c9e8c
    username: ubuntu
  - name: ubuntu1804-by-filters
    image-filters:
      - name: name
        values:
         - named-ami
    username: ubuntu
  - name: my-custom-win2k3
    connection-type: winrm
    username: admin

Each entry is a dictionary with the following keys:

name

Identifier used to refer to this cloud-image from the providers.[aws].pools.labels section. Since this name appears elsewhere in the Nodepool configuration file, you may want to use your own descriptive name here and use image-id to specify the cloud image, so that if the image ID changes in the cloud, the impact on your Nodepool configuration will be minimal. However, if image-id is not provided, this name is assumed to be the image ID in the cloud.

image-id

If this is provided, it is used to select the image from the cloud provider by ID. Either this field or providers.[aws].cloud-images.image-filters must be provided.

image-filters

If provided, this is used to select an AMI by filters. If the filters provided match more than one image, the most recent will be returned. Either this field or providers.[aws].cloud-images.image-id must be provided.

Each entry is a dictionary with the following keys:

name

The filter name. See Boto describe images for a list of valid filters.

values

A list of string values on which to filter.

username

The username that a consumer should use when connecting to the node.

python-path

The path of the default python interpreter. Used by Zuul to set ansible_python_interpreter. The special value auto will direct Zuul to use inbuilt Ansible logic to select the interpreter on Ansible >=2.8, and default to /usr/bin/python2 for earlier versions.

connection-type

The connection type that a consumer should use when connecting to the node. For most images this is not necessary. However, when creating Windows images this could be winrm to enable access via Ansible.

connection-port

The port that a consumer should use when connecting to the node. For most diskimages this is not necessary. This defaults to 22 for ssh and 5986 for winrm.

shell-type

The shell type of the node's default shell executable. Used by Zuul to set ansible_shell_type. This setting should only be used in the following cases:

  • For a Windows image with the experimental connection-type ssh, in which case cmd or powershell should be set to reflect the node's DefaultShell configuration.
  • If the default shell is not Bourne compatible (sh) but is instead, for example, csh or fish, and the user is aware that there is a long-standing issue with ansible_shell_type in combination with become.
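
The interpreter and shell options above might be used as in the following sketch. The second image ID and both image names are hypothetical placeholders:

```yaml
cloud-images:
  - name: debian9-auto-python
    image-id: ami-09c308526d9534717
    username: admin
    # Let Ansible (2.8 or later) discover the interpreter itself.
    python-path: auto
  - name: windows-over-ssh
    # Hypothetical AMI ID for a Windows image with OpenSSH enabled.
    image-id: ami-0123456789abcdef0
    username: Administrator
    # Experimental: SSH to a Windows node instead of WinRM.
    connection-type: ssh
    connection-port: 22
    # Must match the node's DefaultShell configuration.
    shell-type: powershell
```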

diskimages

Each entry in a provider's diskimages section must correspond to an entry in diskimages. Such an entry indicates that the corresponding diskimage should be uploaded for use in this provider. Additionally, any nodes that are created using the uploaded image will have the associated attributes (such as flavor or metadata).

If an image is removed from this section, any previously uploaded images will be deleted from the provider.

diskimages:
  - name: bionic
    pause: False
  - name: windows
    connection-type: winrm
    connection-port: 5986

Each entry is a dictionary with the following keys:

name

Identifier used to refer to this image from the providers.[aws].pools.labels and diskimages sections.

pause

When set to True, nodepool-builder will not upload the image to the provider.

username

The username that should be used when connecting to the node.

connection-type

The connection type that a consumer should use when connecting to the node. For most diskimages this is not necessary. However, when creating Windows images this could be winrm to enable access via Ansible.

connection-port

The port that a consumer should use when connecting to the node. For most diskimages this is not necessary. This defaults to 22 for ssh and 5986 for winrm.

python-path

The path of the default python interpreter. Used by Zuul to set ansible_python_interpreter. The special value auto will direct Zuul to use inbuilt Ansible logic to select the interpreter on Ansible >=2.8, and default to /usr/bin/python2 for earlier versions.

shell-type

The shell type of the node's default shell executable. Used by Zuul to set ansible_shell_type. This setting should only be used in the following cases:

  • For a Windows image with the experimental connection-type ssh, in which case cmd or powershell should be set to reflect the node's DefaultShell configuration.
  • If the default shell is not Bourne compatible (sh) but is instead, for example, csh or fish, and the user is aware that there is a long-standing issue with ansible_shell_type in combination with become.

pools

A pool defines a group of resources from an AWS provider. Each pool has a maximum number of nodes which can be launched from it, along with a number of cloud-related attributes used when launching nodes.

name

A unique name within the provider for this pool of resources.

priority

The priority of this provider pool (a lesser number is a higher priority). Nodepool launchers will yield requests to other provider pools with a higher priority as long as they are not paused. This means that, in general, higher priority pools will reach quota before lower priority pools begin to be used.

This setting may be specified at the provider level in order to apply to all pools within that provider, or it can be overridden here for a specific pool.

node-attributes

A dictionary of key-value pairs that will be stored with the node data in ZooKeeper. The keys and values can be any arbitrary string.
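
As an illustration of priority and node-attributes, the sketch below defines a preferred pool and an overflow pool. The pool names, priority numbers, and attribute keys are hypothetical:

```yaml
pools:
  - name: main
    # Lower number means higher priority, so this pool is preferred.
    priority: 1
    max-servers: 5
    node-attributes:
      # Arbitrary strings stored with the node data in ZooKeeper.
      team: ci
      cost-center: "1234"
  - name: overflow
    # Used mainly once the higher priority pool has reached quota.
    priority: 10
    max-servers: 5
```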

subnet-id

If provided, specifies the subnet to assign to the primary network interface of nodes.

security-group-id

If provided, specifies the security group ID to assign to the primary network interface of nodes.

public-ip-address

Deprecated alias for providers.[aws].pools.public-ipv4.

public-ipv4

Specify if a public IPv4 address shall be attached to nodes.

public-ipv6

Specify if a public IPv6 address shall be attached to nodes.

use-internal-ip

If a public IP is attached but Nodepool should prefer the private IP, set this to true.

host-key-checking

Whether to validate SSH host keys. When true, this helps ensure that nodes are ready to receive SSH connections before they are supplied to the requestor. When set to false, nodepool-launcher will not attempt to ssh-keyscan nodes after they are booted. Disable this if nodepool-launcher and the nodes it launches are on different networks, where the launcher is unable to reach the nodes directly, or when using Nodepool with non-SSH node platforms. The default value is true.
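
The networking options above could be combined as follows. This is a sketch under assumed conditions: the first pool attaches a public IPv4 address but has Nodepool use the private one, and the second pool is assumed to be unreachable from the launcher. The pool names are hypothetical and the resource IDs are placeholders:

```yaml
pools:
  - name: reachable
    max-servers: 5
    subnet-id: subnet-0123456789abcdef0
    security-group-id: sg-01234567890abcdef
    public-ipv4: true
    # Prefer the private address even though a public one exists.
    use-internal-ip: true
  - name: isolated
    max-servers: 5
    subnet-id: subnet-0123456789abcdef0
    security-group-id: sg-01234567890abcdef
    public-ipv4: false
    # The launcher cannot reach these nodes, so skip ssh-keyscan.
    host-key-checking: false
```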

labels

Each entry in a pool's labels section indicates that the corresponding label is available for use in this pool. When creating nodes for a label, the flavor-related attributes in that label's section will be used.

labels:
  - name: bionic
    instance-type: m5a.large

Each entry is a dictionary with the following keys:

name

Identifier to refer to this label.

cloud-image

Refers to the name of an externally managed image that already exists on the provider. The value of cloud-image should match the name of a previously configured entry from the cloud-images section of the provider. See providers.[aws].cloud-images. Mutually exclusive with providers.[aws].pools.labels.diskimage.

diskimage

Refers to the name of an entry in the provider's diskimages section; see providers.[aws].diskimages. Mutually exclusive with providers.[aws].pools.labels.cloud-image.
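
Because cloud-image and diskimage are mutually exclusive, each label names exactly one image source. The sketch below reuses the debian9 and bionic names from the examples above and assumes they are defined in the provider's cloud-images and diskimages sections respectively:

```yaml
labels:
  # Boots from an externally managed AMI listed under cloud-images.
  - name: debian9
    cloud-image: debian9
    instance-type: t3.medium
  # Boots from an image built and uploaded by nodepool-builder.
  - name: bionic
    diskimage: bionic
    instance-type: m5a.large
```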

ebs-optimized

Indicates whether EBS optimization (additional, dedicated throughput between Amazon EC2 and Amazon EBS) is enabled for the instance.

instance-type

Name of the EC2 instance type (flavor) to use, for example t3.medium.

iam-instance-profile

Used to attach an IAM instance profile. Useful for giving nodes access to AWS services without needing any secrets.

name

Name of the instance profile. Mutually exclusive with providers.[aws].pools.labels.iam-instance-profile.arn.

arn

ARN identifier of the profile. Mutually exclusive with providers.[aws].pools.labels.iam-instance-profile.name.

key-name

The name of a keypair that will be used when booting each server.

volume-type

If given, the type of the root EBS volume.

volume-size

If given, the size of the root EBS volume, in GiB.
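
The instance and storage options above might appear together on one label, as in this sketch. The label name, volume type, and size are illustrative, and the instance profile is referenced by name rather than by ARN:

```yaml
labels:
  - name: bionic-large-disk
    diskimage: bionic
    instance-type: m5a.large
    ebs-optimized: true
    iam-instance-profile:
      # Use either name or arn, never both.
      name: s3-read-only
    key-name: zuul
    # Illustrative root volume settings: 80 GiB of gp3 storage.
    volume-type: gp3
    volume-size: 80
```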

userdata

A string of userdata for a node. For example, if the cloud-init package is installed on the image, it will apply the userdata at boot. Additional information about cloud-config options: https://cloudinit.readthedocs.io/en/latest/topics/examples.html
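
A minimal sketch of passing cloud-config userdata through a label is shown below. It assumes cloud-init is present on the image; the package list is purely illustrative:

```yaml
labels:
  - name: bionic
    diskimage: bionic
    instance-type: m5a.large
    # The block scalar keeps the cloud-config document as one string.
    userdata: |
      #cloud-config
      packages:
        - git
```

The first line of the string must be #cloud-config for cloud-init to interpret it as a cloud-config document.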

tags

A dictionary of tags to add to the EC2 instances. Values must be supplied as strings.