first cut complete

Sandy Walsh
2011-05-30 05:03:45 -07:00
parent f7a8b5b0be
commit cefb58271f


@@ -119,8 +119,49 @@ Finally, we need to give the user a way to get information on each of the instan
Host Filter
--------------
As we mentioned earlier, filtering hosts is a very deployment-specific process. Service Providers may have a different set of criteria for filtering Compute nodes than a University. To facilitate this, the `nova.scheduler.host_filter` module supports a variety of filtering strategies as well as an easy means for plugging in your own algorithms.
The filter used is determined by the `--default_host_filter` flag, which points to a Python Class. By default this flag is set to `nova.scheduler.host_filter.AllHostsFilter` which simply returns all available hosts. But there are others.
`nova.scheduler.host_filter.InstanceTypeFilter` provides host filtering based on the memory and disk size specified in the `InstanceType` record passed into `run_instance`.
`nova.scheduler.host_filter.JSONFilter` filters hosts based on a simple JSON expression grammar. Using a LISP-like JSON structure, the caller can request instances based on criteria well beyond what `InstanceType` specifies. See `nova.tests.test_host_filter` for examples.
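For illustration, the sort of query `JSONFilter` accepts might look like the sketch below. The operator names and the `$compute.*` variables are assumptions based on the grammar described above, so treat `nova.tests.test_host_filter` as the authority:
::
    # A hedged sketch of a JSONFilter query asking for hosts with at least
    # 1024 MB of free RAM and 200 GB of available disk. Variable and
    # operator names are illustrative only.
    import json
    query = json.dumps(
        ['and',
         ['>=', '$compute.host_memory_free', 1024],
         ['>=', '$compute.disk_available', 200]])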
To create your own `HostFilter`, simply derive from `nova.scheduler.host_filter.HostFilter` and implement two methods: `instance_type_to_filter` and `filter_hosts`. Since Nova is currently dependent on the `InstanceType` structure, the `instance_type_to_filter` method should take an `InstanceType` and turn it into an internal data structure usable by your filter. This is for backward compatibility with existing OpenStack and EC2 API calls. If you decide to create your own call for creating instances not based on `Flavors` or `InstanceTypes`, you can ignore this method. The real work is done in `filter_hosts`, which must return a list of weight tuples for each appropriate host. Both the `ZoneManager` object (which holds the set of all available hosts) and the filter query are passed into the call. The weight tuple contains (`<hostname>`, `<additional data>`) where `<additional data>` is whatever you want it to be.
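As a rough sketch only (the `service_states` attribute and the capability keys used here are assumptions, not a documented API), a custom filter might look something like this:
::
    from nova.scheduler import host_filter

    class RamOnlyFilter(host_filter.HostFilter):
        """Hypothetical filter that only considers free memory."""

        def instance_type_to_filter(self, instance_type):
            # Reduce the InstanceType record to the internal query
            # structure that filter_hosts() understands.
            return (self.__class__.__name__, instance_type['memory_mb'])

        def filter_hosts(self, zone_manager, query):
            _name, required_ram = query
            selected = []
            # zone_manager.service_states and the 'host_memory_free'
            # capability key are assumed names for illustration.
            for hostname, services in zone_manager.service_states.items():
                free_ram = services.get('compute', {}).get(
                    'host_memory_free', 0)
                if free_ram >= required_ram:
                    # Weight tuple: (<hostname>, <additional data>).
                    selected.append((hostname, services.get('compute', {})))
            return selected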
Cost Scheduler Weighing
--------------
Every `ZoneAwareScheduler` derivation must also override the `weigh_hosts` method. This takes the list of filtered hosts (generated by the `filter_hosts` method) and returns a list of weight dicts. The weight dicts must contain two keys: `weight` and `hostname`, where `weight` is simply an integer (lower is better) and `hostname` is the name of the host. The list does not need to be sorted; this is done by the `ZoneAwareScheduler` base class once all the results have been assembled.
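A minimal sketch of such an override, assuming a `weigh_hosts(self, hosts)` signature and a hypothetical `running_vms` capability key (the real base class may pass additional arguments), might look like:
::
    from nova.scheduler import zone_aware_scheduler

    class LeastBusyScheduler(zone_aware_scheduler.ZoneAwareScheduler):
        """Hypothetical scheduler preferring hosts running fewer instances."""

        def weigh_hosts(self, hosts):
            # `hosts` is the list of (hostname, capabilities) weight tuples
            # produced by filter_hosts(). Lower weight is better.
            weighted = []
            for hostname, capabilities in hosts:
                weight = capabilities.get('running_vms', 0)  # assumed key
                weighted.append({'weight': weight, 'hostname': hostname})
            # No need to sort; the base class sorts the assembled results.
            return weighted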
Simple Zone Aware Scheduling
--------------
The easiest way to get started with the Zone Aware Scheduler is to use the `nova.scheduler.host_filter.HostFilterScheduler`. This scheduler uses the default Host Filter, and its `weigh_hosts` method simply returns a weight of 1 for all hosts. But from this you can see calls being routed from Zone to Zone and follow the flow of things.
The `--scheduler_driver` flag is how you specify the Scheduler class name.
Flags
--------------
All this Zone and Distributed Scheduler stuff can seem a little daunting to configure, but it's actually not too bad. Here are some of the main flags you should set in your `nova.conf` file:
::
--allow_admin_api=true
--enable_zone_routing=true
--zone_name=zone1
--build_plan_encryption_key=c286696d887c9aa0611bbb3e2025a45b
--scheduler_driver=nova.scheduler.host_filter.HostFilterScheduler
--default_host_filter=nova.scheduler.host_filter.AllHostsFilter
`--allow_admin_api` must be set for OS API to enable the new `/zones/*` commands.
`--enable_zone_routing` must be set for OS API commands such as `create()`, `pause()` and `delete()` to get routed from Zone to Zone when looking for instances.
`--zone_name` is only required in Child Zones. The default Zone name is `nova`, but you may want to name your child Zones something useful. Duplicate Zone names are not an issue.
`--build_plan_encryption_key` is the SHA-256 key for encrypting/decrypting the Host information when it leaves a Zone. Be sure to change this key for each Zone you create; do not duplicate keys. (A quick way to generate a fresh key is shown in the sketch after these flag descriptions.)
`--scheduler_driver` is the real workhorse of the operation. For the Distributed Scheduler, you need to specify a class derived from `nova.scheduler.zone_aware_scheduler.ZoneAwareScheduler`.
`--default_host_filter` is the host filter to be used for filtering candidate Compute nodes.
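As a convenience (not part of Nova itself), a fresh 32-hex-character value for `--build_plan_encryption_key` can be generated with a small Python snippet like this:
::
    # Generate a random 128-bit key as 32 hex characters, the same length
    # as the sample build_plan_encryption_key shown above.
    import binascii
    import os
    print(binascii.hexlify(os.urandom(16)))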
Some optional flags which are handy for debugging are:
::
--connection_type=fake
--verbose
Using the `Fake` virtualization driver is handy when you're setting this stuff up so you're not dealing with a million possible issues at once. When things seem to be working correctly, switch back to whatever hypervisor your deployment uses.