..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

==========================================
More periodic tasks to slave for Juno
==========================================

https://blueprints.launchpad.net/nova/+spec/juno-slaveification

In the Icehouse development cycle we gave deployers the option to offload
most reads from nova-compute periodic tasks to a DB replication slave.
We will continue this work in Juno by "slaveifying" the rest of the
periodic tasks where appropriate.

Problem description
===================

Currently the accepted way to scale the database for reads and writes in Nova
is to do a multi-master setup or use some sort of database clustering. The
problem with this approach is that while read scalability is potentially
increased by making more hardware resources available (CPU, RAM, IOPS, etc.),
write scalability is decreased and more operational complexity is inherited.

Proposed change
===============

I would like to continue the work done in Icehouse by completing the
"slaveification" of periodic tasks.

Alternatives
------------

There are alternative ways to scale both reads and writes:

- Handling scaling within the application through some sort of sharding
  scheme.
- Handling scaling at the DB level.

We already have a sharding model in Nova: cells. It could be argued that
time would be better spent improving that approach rather than trying to
scale the database with the available DB technologies.

Data model impact
-----------------

None

REST API impact
---------------

None

Security impact
---------------

None

Notifications impact
--------------------

None

Other end user impact
---------------------

None

Performance Impact
------------------

No negative changes are expected; ideally this allows us to take some of the
read load off of the "write master" and shift it to a slave or slaves.

Other deployer impact
---------------------

If a deployer changes the slave_connection configuration parameter in the
database section, it is assumed that they accept the behavior of having all
reads from periodic tasks sent to that connection. The deployer needs to be
educated about and aware of the implications of running a database
replication slave and fetching actionable data from it. These include, but
may not be limited to:

- The need to monitor the slave's status.
- Operational staff familiar with the maintenance of replication slaves.
- The possibility of operating on data that is slightly out of date.

See https://wiki.openstack.org/wiki/Slave_usage
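
For illustration, enabling the behavior is a matter of pointing
slave_connection at the replica. A minimal nova.conf fragment (hostnames
and credentials here are placeholders)::

  [database]
  # Writes, and reads that have not been slaveified, use the master.
  connection = mysql://nova:secret@db-master/nova
  # Reads from slaveified periodic tasks are sent to the replica.
  slave_connection = mysql://nova:secret@db-replica/nova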

Developer impact
----------------

Developers should consider which reads might benefit from optionally using
a slave handle. When new reads are introduced, consider the context in which
the code is called. Will it matter if this code operates on possibly
out-of-date data? Is the benefit of offloading reads greater than the
inconvenience caused by acting on old data?
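
As a hypothetical illustration of that judgement call::

  # A pure reporting read tolerates replication lag, so the slave
  # handle is a good fit:
  instances = objects.InstanceList.get_by_host(
      context, self.host, use_slave=True)

  # A read-modify-write needs the latest row, so it stays on the
  # master (the default when use_slave is omitted):
  instance = objects.Instance.get_by_uuid(context, instance_uuid)
  instance.task_state = task_states.RESIZE_MIGRATING
  instance.save()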

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  <geekinutah>

Other contributors:
  <None>

Work Items
----------

Slaveify the following periodic tasks in nova/compute/manager.py (a sketch
of the change for one of these follows the list):

- update_available_resource
- _run_pending_deletes
- _instance_usage_audit
- _poll_bandwidth_usage
- _poll_volume_usage
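
For example, slaveifying _run_pending_deletes could look something like this
sketch (the exact call sites depend on the object conversions noted under
Dependencies below)::

  from nova import objects
  from nova import utils

  # Inside ComputeManager in nova/compute/manager.py.
  def _run_pending_deletes(self, context):
      """Retry any pending instance file deletes."""
      filters = {'deleted': True, 'soft_deleted': False,
                 'host': CONF.host, 'cleaned': False}
      with utils.temporary_mutation(context, read_deleted='yes'):
          # The only slaveification change: read the candidate
          # instances from the replica rather than the master.
          instances = objects.InstanceList.get_by_filters(
              context, filters, expected_attrs=['system_metadata'],
              use_slave=True)
      for instance in instances:
          pass  # retry the pending delete as before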

Dependencies
============

We will need to have an object for bw_usage; this is covered by
https://blueprints.launchpad.net/nova/+spec/compute-manager-objects-juno

Testing
=======

Currently there is no testing in Tempest for reads going to the alternate
slave handle. We should add a replication slave to our test runs and
exercise the periodic tasks both with and without slave_connection set.

Documentation Impact
====================

The operations guide should be updated to provide instructions, with
references to the MySQL and PostgreSQL documentation, on setting up and
maintaining replication slaves. We should also cover HA possibilities with
asynchronous slaves and the various automation frameworks that address this
problem. It would also be worth explaining that while the ability to specify
a slave_connection is primarily a scaling feature, it can also be used for
availability purposes.

References
==========

https://wiki.openstack.org/wiki/Slave_usage

The original blueprint, with code history and discussion:
https://blueprints.launchpad.net/nova/+spec/db-slave-handle

The Icehouse blueprint:
https://blueprints.launchpad.net/nova/+spec/periodic-tasks-to-db-slave