Pool-aware Scheduler Support

This change introduces a pool-aware scheduler to address the need to
support multiple pools from one storage controller.

Terminology
-----------
Pool - A logical concept describing a set of storage resources that
can serve core Cinder requests, e.g. volumes/snapshots.  This notion
is almost identical to a Cinder Volume Backend, for it has similar
attributes (capacity, capabilities).  The main difference is that a
Pool cannot exist on its own; it must reside in a Volume Backend.
One Volume Backend can have multiple Pools, but Pools do not have
sub-Pools (even if a backend has them, sub-Pools are not exposed to
Cinder, yet).  A Pool has a unique name within its backend namespace,
which means a Volume Backend cannot have two Pools with the same
name.
Legacy Volume - A volume that was created before pools were
introduced.  There are several corner cases where legacy volumes can
cause issues, especially for drivers that already managed pools
internally (e.g. 3Par, NetApp).  Please refer to 'Limitation/Known
Issues' for details.

Design
------
The workflow in this change is simple:
 1) Volume Backends report to the scheduler how many pools they have,
 what those pools look like and what they are capable of;
 2) When a request comes in, the scheduler picks the pool that best
 fits the need and passes the request to the backend where the target
 pool resides;
 3) The volume driver gets the message and lets the target pool serve
 the request as the scheduler instructed.

To support placing resources (volumes/snapshots) onto a pool, the
following pieces are currently missing in Cinder:
1. Volume Backends reporting capacity/capabilities at pool level;
2. The scheduler filtering/weighing based on pool capacity/capability
and placing volumes/snapshots onto a pool of a certain backend;
3. Recording which pool of a backend a resource is located on, and
passing that information between the scheduler and the volume backend.

Missing piece 1 is solved by a) updating the format of the periodic
volume stats message to carry pool stats; b) altering the
manager/driver to collect and report pool stats.  Below is an example
of the updated report message that contains two pools:

    capability = {
        'volume_backend_name': 'Local iSCSI', #\
        'vendor_name': 'OpenStack',           #  backend level
        'driver_version': '1.0',              #  mandatory/fixed
        'storage_protocol': 'iSCSI',          #- stats&capabilities

        'active_volumes': 10,                 #\
        'IOPS_provisioned': 30000,            #  optional custom
        'fancy_capability_1': 'eat',          #  stats & capabilities
        'fancy_capability_2': 'drink',        #/

        'pools': [
            {'pool_name': '1st pool',         #\
             'total_capacity_gb': 500,        #  mandatory stats for
             'free_capacity_gb': 230,         #  pools
             'allocated_capacity_gb': 270,    # |
             'QoS_support': 'False',          # |
             'reserved_percentage': 0,        #/

             'dying_disks': 100,              #\
             'super_hero_1': 'spider-man',    #  optional custom
             'super_hero_2': 'flash',         #  stats & capabilities
             'super_hero_3': 'neoncat'        #/
             },
            {'pool_name': '2nd pool',
             'total_capacity_gb': 1024,
             'free_capacity_gb': 1024,
             'allocated_capacity_gb': 0,
             'QoS_support': 'False',
             'reserved_percentage': 0,

             'dying_disks': 200,
             'super_hero_1': 'superman',
             'super_hero_2': 'Hulk',
             }
        ]
    }

Notice that there are now two levels of mandatory/fixed stats &
capabilities that every volume driver should report.
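
For illustration only (this is not code from any particular driver;
the function name and the shape of the 'pools' input below are made
up), here is roughly how a driver could assemble that two-level
report:

    def build_capability_report(backend_name, driver_version, pools):
        """Assemble the two-level stats dict a driver would report.

        'pools' is assumed to be a list of dicts like
        {'name': ..., 'total_gb': ..., 'free_gb': ..., 'allocated_gb': ...}.
        """
        capability = {
            # Backend-level mandatory/fixed stats & capabilities.
            'volume_backend_name': backend_name,
            'vendor_name': 'OpenStack',
            'driver_version': driver_version,
            'storage_protocol': 'iSCSI',
            'pools': [],
        }
        for pool in pools:
            # Pool-level mandatory stats; drivers may append any
            # custom stats/capabilities here as well.
            capability['pools'].append({
                'pool_name': pool['name'],
                'total_capacity_gb': pool['total_gb'],
                'free_capacity_gb': pool['free_gb'],
                'allocated_capacity_gb': pool['allocated_gb'],
                'QoS_support': False,
                'reserved_percentage': 0,
            })
        return capability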

The scheduler change is mostly done in scheduler/host_manager.py:
* HostState adds a list element to hold PoolState(s) (a HostState
sub-class).  Each PoolState can be treated like a HostState since it
carries as much information as a HostState and shares the same
structure.  HostState.update_capabilities()/
update_from_volume_capability() are changed to handle both legacy and
new report messages.
* HostManager.get_all_host_states() now returns a PoolState iterator
that includes all pools the scheduler tracks.  To filters and
weighers, PoolState() and HostState() are identical, so there is no
need to change them: they deal with the same kind of information and
the exact same data structure as before.  What filters and weighers
deal with looks like this:
    # Before this change
      HostState() for Backend 1
         ...
      HostState() for Backend N
    # After this change
      PoolState() for Backend 1 Pool 1
         ...
      PoolState() for Backend 1 Pool N

        ...

      PoolState() for Backend N Pool 1
         ...
      PoolState() for Backend N Pool N

With this change, the filter scheduler picks a specific pool on a
host instead of just a host.
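
A minimal sketch of that relationship (simplified; the real
HostState/PoolState classes in scheduler/host_manager.py carry many
more fields and methods):

    class HostState(object):
        """Per-backend state as seen by filters and weighers."""
        def __init__(self, host, capabilities=None):
            self.host = host
            self.capabilities = capabilities or {}
            self.pools = {}                # pool name -> PoolState

    class PoolState(HostState):
        """Looks just like a HostState, so filters/weighers need no change."""
        def __init__(self, host, capabilities, pool_name):
            super(PoolState, self).__init__(host, capabilities)
            self.pool_name = pool_name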

Now that we are able to report and decide at pool level, the 3rd
missing piece is easy to fix.  Just as with multi-backend volume
service support, we encode the pool name into the 'host' field of the
Volume table.  The 'host' field is now 'host@backend#pool'.  Notice
that this change doesn't mean the cinder-volume service has to
subscribe to multiple RPC channels.  There is no need to touch the
message queue subscription at all, because we do a little trick when
determining the RPC target in VolumeRPCAPI: the correct host info,
'host@backend', is extracted from 'host@backend#pool' before RPC
messages are sent.  Therefore, once the scheduler decides which pool
on a backend shall serve a request, it updates the 'host' field of
the volume record in the DB to 'host@backend#pool', but it still
sends the RPC message to 'host@backend', which cinder-volume is
listening on.
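
A minimal sketch of the trick (the wrapper function below is
illustrative, not the exact VolumeRPCAPI code; extract_host() is the
helper added by this patch and described further down):

    from cinder.volume import utils as vol_utils

    def rpc_server_for(volume):
        # volume['host'] may look like 'host@backend#pool'; cinder-volume
        # only listens on 'host@backend', so the pool suffix is stripped
        # before the RPC message is cast to the service.
        return vol_utils.extract_host(volume['host'])

    # e.g. rpc_server_for({'host': 'node1@lvmdriver-1#pool0'})
    #      returns 'node1@lvmdriver-1'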

A similar action is taken when creating backups for volumes:
'host@backend' is extracted from volume['host'] so that the correct
backup service can be picked.

Other changes done in this patch:

* Change get_volume_stats() in ISCSIVolumeDriver/ISERDriver to include
pool stats, and change the default total/free_capacity_gb from
'infinite' to 0.
* Add logic in the volume manager's init_host() to detect legacy
volumes and try to update their host info if the driver is able to
provide pool info.
* Add a get_pool() driver API that returns the pool name of a given
volume; this helps the volume manager handle legacy volumes,
especially for backends that already support pooling internally
(e.g. 3Par, NetApp).
* Implement get_pool() for the LVM driver to return the volume backend
name as the pool name.
* Add an extract_host() helper function in cinder/volume/utils.py to
handle cases where 'host', 'backend' or 'pool' information needs to be
extracted from volume['host'].
* Add an append_host() helper function in cinder/volume/utils.py to
concatenate the host and pool strings into one value for the
volume['host'] field.  (A sketch of both helpers follows this list.)
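
A rough sketch of what these two helpers might look like; the real
implementations in cinder/volume/utils.py may differ in details such
as default-pool handling:

    def extract_host(host, level='backend'):
        """Pull a piece out of a 'host@backend#pool' string.

        level='host'    -> 'host'
        level='backend' -> 'host@backend'
        level='pool'    -> 'pool', or None if no pool is encoded
        """
        if host is None:
            return None
        if level == 'host':
            return host.split('#')[0].split('@')[0]
        elif level == 'backend':
            return host.split('#')[0]
        elif level == 'pool':
            parts = host.split('#')
            return parts[1] if len(parts) == 2 else None

    def append_host(host, pool):
        """Encode pool into host: ('host@backend', 'pool') -> 'host@backend#pool'."""
        if not host or not pool:
            return None
        return '#'.join([host, pool])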

Limitation/Known Issues
-----------------------
* The term 'host' in Cinder used to refer to 'backend', and it was
consistent from the view of end users/admins down to Cinder
internals.  Now that pools are exposed to the Cinder scheduler, the
scheduler starts treating different pools on the same backend as
different hosts.  Therefore, we have to expose pools to admins at
least, because migrating a volume now has to include the pool in the
'host' parameter in order to work.  As for end users, the idea that a
volume's 'host' equals its storage backend works well for them: they
can decide the migration policy when retyping volumes, or choose to
create a new volume on the same or a different host as existing
volumes.  It is now *not* easy to hide pools from end users and make
retype or the affinity filter work like before.  This change has a
special code path for legacy volumes, to allow (potential) migration
between pools even when migration_policy is set to 'never'.  But not
every driver has the magic to move volumes from one pool to another
at minimum cost.  This behavioral inconsistency between drivers (the
same command may take totally different amounts of time to finish)
could be very confusing.

* Drivers that want to support pools need to be updated, but
un-updated drivers should work just like they used to, without any
change, except:
 - creating a volume with same/different host hints against legacy
 volumes may NOT work as expected, because 'hostA' is considered
 different from 'hostA#pool0' and 'hostA#pool1', while a legacy
 volume on 'hostA' might actually reside in pool0; only the driver
 has this knowledge.
 - the retype issue for legacy volumes, as mentioned above.

The ultimate solution for all these corner cases is to update the
Cinder DB to add 'pool' info to legacy volumes.  The problem is that
only the driver knows such info, which is why we add the new driver
API get_pool(): it allows the volume manager to learn pool info from
the driver and update the 'host' field of legacy volumes in the DB.
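
A hedged sketch of that fix-up as done during init_host()
(simplified; the function and parameter names here are illustrative):

    from cinder.volume import utils as vol_utils

    def fix_legacy_volume_hosts(db, driver, ctxt, volumes):
        """Illustrative only: add '#pool' to legacy volume 'host' fields."""
        for volume in volumes:
            host = volume['host']
            # A legacy volume has no '#pool' suffix encoded in 'host'.
            if host and '#' not in host:
                pool = driver.get_pool(volume)   # driver API added here
                if pool:
                    new_host = vol_utils.append_host(host, pool)
                    db.volume_update(ctxt, volume['id'],
                                     {'host': new_host})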

User-Visible Change
-------------------
DocImpact
For managing and migrating volumes, the user now needs to provide
pool information as part of the host string. For example:
  cinder manage --source-name X --name newX host@backend#POOL
  cinder migrate UUID host@backend#POOL

implement blueprint: pool-aware-cinder-scheduler

Change-Id: Id8eacb8baeb56558aa3d9de19402e2075822b7b4