As discussed at our nova meetings, reorganize the juno specs into three
directories:

- proposed: things proposed which weren't approved
- approved: things we approved but didn't implement
- implemented: things approved and implemented

The first I suspect is the most controversial. I've done this because I worry
about the case where a future developer wants to pick up something dropped by
a previous developer, but has trouble finding previous proposed specifications
on the topic.

Note that the actual proposed specs for Juno are added in a later commit.

Change-Id: Idcf55ca37a83d7098dcb7c2971240c4e8fd23dc8

..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

==========================================
More periodic tasks to slave for Juno
==========================================

https://blueprints.launchpad.net/nova/+spec/juno-slaveification

In the Icehouse development cycle we gave deployers the option to offload
most reads from nova-compute periodic tasks to a DB replication slave.
We will continue this work in Juno by "slaveifying" the rest of the
periodic tasks where appropriate.

Problem description
===================

Currently the accepted way to scale the database for reads and writes in Nova
is to do a multi-master setup or use some sort of database clustering. The
problem with this approach is that while read scalability is potentially
increased by making more hardware resources available (CPU, RAM, IOPS, etc.),
write scalability is decreased and more operational complexity is inherited.

Proposed change
===============

I would like to continue the work done in Icehouse by completing the
"slaveification" of periodic tasks.

Alternatives
------------

There are alternative ways to scale both reads and writes:

- Handle scaling within the application through some sort of sharding scheme.
- Handle scaling at the DB level.

We have a sharding model, cells, in Nova currently. It could be argued that
time would be better spent improving this approach rather than spending time
trying to scale the database using available DB technologies.

Data model impact
-----------------

None

REST API impact
---------------

None

Security impact
---------------

None

Notifications impact
--------------------

None

Other end user impact
---------------------

None

Performance Impact
------------------

No negative changes; hopefully this allows us to take some of the read load
off of a "write master" and offload it to a slave or slaves.

Other deployer impact
---------------------

If a deployer changes the slave_connection configuration parameter in the
database section, it is assumed that they are accepting the behavior of
having all reads from periodic tasks be sent to that connection. The
deployer needs to be educated about and aware of the implications of running
a database replication slave and fetching actionable data from said slave.
These include, but may not be limited to:

- Need for monitoring of the slave status
- Operational staff familiar with maintenance of replication slaves
- Possibility of operating on data that is slightly out of date

See https://wiki.openstack.org/wiki/Slave_usage
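
For illustration, a deployer opts in by pointing slave_connection at the
replication slave in nova.conf; the host names and credentials below are
placeholders, not a recommended deployment::

  [database]
  # Writes and non-slaveified reads continue to use the master.
  connection = mysql://nova:secret@db-master/nova
  # Reads issued by slaveified periodic tasks are sent here instead.
  slave_connection = mysql://nova:secret@db-slave/nova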

Developer impact
----------------

Developers should consider which reads might benefit from optionally using
a slave handle. When new reads are introduced, consider the context in which
the code is called. Will it matter if this code operates on possibly out of
date data? Is the benefit of offloading reads greater than the inconvenience
caused by acting on old data?
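
When in doubt, the safe default is the master. A minimal, hypothetical sketch
of the convention (these names are illustrative, not actual Nova APIs)::

    def usage_get_all_by_host(context, host, use_slave=False):
        """Hypothetical read helper.

        use_slave defaults to False, so existing callers keep reading from
        the master; only callers that can tolerate slightly stale data opt
        in, and the flag is threaded down to whichever layer picks between
        connection and slave_connection.
        """
        return _do_query(context, host, use_slave=use_slave)

    # A reporting or audit path can tolerate staleness, so it opts in:
    usages = usage_get_all_by_host(context, host, use_slave=True)

    # A path that acts on the result immediately stays on the master:
    usages = usage_get_all_by_host(context, host)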

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  <geekinutah>

Other contributors:
  <None>

Work Items
----------

Slaveify the following periodic tasks in nova/compute/manager.py (a sketch of
the intended pattern follows the list):

- update_available_resource
- _run_pending_deletes
- _instance_usage_audit
- _poll_bandwidth_usage
- _poll_volume_usage
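
The change for each task has the same shape; a simplified, hypothetical
sketch for one of them (the real methods take more arguments and do
considerably more work)::

    from nova import objects
    from nova.openstack.common import periodic_task


    class ComputeManagerSketch(object):
        """Stand-in for the relevant slice of ComputeManager."""

        def __init__(self, host):
            self.host = host

        @periodic_task.periodic_task
        def _instance_usage_audit(self, context):
            # Auditing tolerates slightly stale data, so the instance
            # lookup is directed at the slave handle when slave_connection
            # is configured.
            instances = objects.InstanceList.get_by_host(
                context, self.host, use_slave=True)
            # ... the existing audit logic then runs over `instances` ...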

Dependencies
============

We will need an object for bw_usage; this is covered by
https://blueprints.launchpad.net/nova/+spec/compute-manager-objects-juno

Testing
=======

Currently there is no testing in Tempest for reads going to the alternate
slave handle. We should add a replication slave to our test runs and test
the periodic tasks with and without slave_connection enabled.
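
Until that exists, the plumbing can at least be exercised at the unit level;
a hypothetical sketch matching the Work Items sketch above (names and
fixtures are illustrative)::

    # Inside a compute manager unit test case; `mock`, self.compute and
    # self.context come from the existing test infrastructure.
    @mock.patch('nova.objects.InstanceList.get_by_host')
    def test_instance_usage_audit_uses_slave(self, mock_get_by_host):
        mock_get_by_host.return_value = []
        self.compute._instance_usage_audit(self.context)
        # The periodic task should direct its read at the slave handle.
        mock_get_by_host.assert_called_once_with(
            self.context, self.compute.host, use_slave=True)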

Documentation Impact
====================

The operations guide should be updated to provide instructions, with
references to MySQL and Postgres documentation, on setting up and maintaining
slaves. We should also talk about HA possibilities with asynchronous slaves
and various automation frameworks that deal with this problem. It would also
be good to explain that while being able to specify a slave_connection is
primarily a scaling feature, the ability to use it for availability purposes
is there.

References
==========

https://wiki.openstack.org/wiki/Slave_usage

The original blueprint with code history and discussion:
https://blueprints.launchpad.net/nova/+spec/db-slave-handle

The Icehouse blueprint:
https://blueprints.launchpad.net/nova/+spec/periodic-tasks-to-db-slave