Add NOTIFY throttling and Mitaka directory

Minor typos fix

related-bug: #1436210
related-bug: #1498462

Change-Id: If15594099eb7cf74f1c534a05884c2d2501e57e6
parent 2da64ceb36
commit 3e3439c5c8
@@ -28,6 +28,14 @@ Liberty approved specs:
 
    specs/liberty/*
 
+Mitaka approved specs:
+
+.. toctree::
+   :glob:
+   :maxdepth: 1
+
+   specs/mitaka/*
+
 
 ==================
 Indices and tables
@@ -14,7 +14,7 @@ database.
 Problem description
 ===================
 
-Once deleted, domains are not removed immediatly from the database, mostly for
+Once deleted, domains are not removed immediately from the database, mostly for
 billing reasons. They are flagged as deleted in the "deleted" database column
 and the "deleted_at" column is populated with a timestamp.
 
@@ -45,7 +45,7 @@ plugin. The task will select a group of domains and send a RPC call to Central.
 Central will run a query against the database to purge any deleted domain if
 needed and log the number of purged domains.
 
-Configuration paramenters:
+Configuration parameters:
 
 Purging run frequency.
 Default: hourly. Users might want to run it frequently to minimize the cycle duration.
@@ -122,7 +122,7 @@ Milestones
 ----------
 
 Target Milestone for completion:
-Libery-3
+Liberty-3
 
 Work Items
 ----------
@@ -0,0 +1,129 @@
+..
+
+This work is licensed under a Creative Commons Attribution 3.0 Unported License.
+http://creativecommons.org/licenses/by/3.0/legalcode
+
+=============================
+Bulk zone update throttling
+=============================
+
+https://blueprints.launchpad.net/designate/+spec/notify-throttling
+
+Implement a mechanism to throttle the delivery of NOTIFY transactions when
+a large number of zones are updated at the same time.
+
+
+Problem description
+===================
+
+If a large number of zones are updated in a short time, this will generate a
+correspondingly large number of NOTIFY transactions to be sent to the nameservers
+with no delay, leading to a burst of incoming AXFR requests.
+This might hit bottlenecks in MiniDNS and the storage layer in terms of
+CPU, I/O or network bandwidth.
+
+A typical trigger is the update of an NS record in a Pool containing many zones.
+
+The autonomous refreshing of zones performed by resolvers can also trigger a
+similar burst of AXFR requests. This can happen on recently started resolvers, where the
+refresh timers can share the same values across many zones.
+
+Related to bug https://bugs.launchpad.net/designate/+bug/1498462
+
+Proposed change
+===============
+
+Implement a mechanism for enqueuing and delayed delivery of notify transactions
+at a configurable throttle speed.
+
+Also, implement staggering of zone refresh requests by randomizing the refresh
+interval.
+
+API Changes
+-----------
+
+Expose the count of zones flagged for delayed notify in the Admin
+API as "/reports/counts/zones_pending_notify".
+
+Central Changes
+---------------
+
+Implement support for a new database column "pending_notify" and set it to
+True every time a Pool NS record is updated.
+
+Storage Changes
+---------------
+
+Add a new boolean database column "pending_notify" on Zones.
+Implement a migration script to add the column to existing databases,
+defaulting to False. In the future, the column might default to True.
+
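The storage change above can be sketched with Python's built-in `sqlite3` module and an in-memory database. This is only an illustration under stated assumptions: the table name `zones`, the helper `add_pending_notify_column` and the sample rows are hypothetical, and the actual migration tooling used by the project is not shown.

```python
import sqlite3


def add_pending_notify_column(conn):
    """Add the boolean flag to an existing (hypothetical) zones table."""
    # SQLite has no native BOOLEAN type: 0 stands for False, 1 for True.
    # A NOT NULL column added to a populated table needs a default value.
    conn.execute(
        "ALTER TABLE zones "
        "ADD COLUMN pending_notify BOOLEAN NOT NULL DEFAULT 0"
    )
    conn.commit()


# Build a small pre-migration database to run the sketch against.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE zones (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO zones (name) VALUES (?)",
    [("example.com.",), ("example.org.",)],
)

add_pending_notify_column(conn)

# Rows that existed before the migration pick up the default (False).
flags = [row[0] for row in conn.execute("SELECT pending_notify FROM zones")]
```

Defaulting to False means that zones already in the database are not queued for a delayed Notify merely because the migration ran.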
+Other Changes
+-------------
+
+Implement a Task in Zone Manager to periodically fetch a set of zones that need
+to receive a Notify, starting with the oldest in terms of last update time.
+The task frequency and the maximum set size can be configured to throttle the
+number of outgoing Notify transactions.
+Zone Manager will reset the "pending_notify" flag once done.
+
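A minimal in-memory sketch of one run of the periodic task described above. The `Zone` dataclass, the `run_notify_task` function and the `send_notify` callback are hypothetical names chosen for illustration; a real implementation would read and update the "pending_notify" column in the database instead of a Python list.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    name: str
    updated_at: float             # last update time, seconds since the epoch
    pending_notify: bool = False


def run_notify_task(zones, batch_size, send_notify):
    """Notify at most batch_size pending zones, oldest update first."""
    pending = sorted(
        (zone for zone in zones if zone.pending_notify),
        key=lambda zone: zone.updated_at,
    )
    batch = pending[:batch_size]
    for zone in batch:
        send_notify(zone)
        zone.pending_notify = False   # reset the flag once done
    return [zone.name for zone in batch]


zones = [
    Zone("c.example.", 300.0, True),
    Zone("a.example.", 100.0, True),
    Zone("b.example.", 200.0, True),
    Zone("d.example.", 50.0, False),  # not flagged, so never notified
]
sent = []
first_run = run_notify_task(zones, 2, lambda zone: sent.append(zone.name))
second_run = run_notify_task(zones, 2, lambda zone: sent.append(zone.name))
```

With a maximum set size of `batch_size` and one run every `interval` seconds, the outgoing rate is bounded by `batch_size / interval` Notify transactions per second, which is the throttle the two configuration values provide.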
+Alternatives
+------------
+
+N/A
+
+Implementation
+==============
+
+The throttling queue is implemented as a new database column containing a
+boolean flag. See Central Changes and Storage Changes.
+
+Also, new zones will be created with a uniformly random refresh time between a minimum and a maximum value.
+
+
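The refresh staggering described above could look roughly like the following sketch. The function name `pick_refresh_time` and the bounds of 3000 and 4200 seconds are assumptions made for the example; the spec does not name the configuration options or their defaults.

```python
import random


def pick_refresh_time(min_seconds, max_seconds, rng=random):
    """Return a uniformly random refresh time within the inclusive bounds."""
    if min_seconds > max_seconds:
        raise ValueError("min_seconds must not exceed max_seconds")
    # randint is inclusive on both ends.
    return rng.randint(min_seconds, max_seconds)


# A seeded generator keeps the example reproducible; production code
# would normally use the default, unseeded generator.
rng = random.Random(42)
refresh_times = [pick_refresh_time(3000, 4200, rng) for _ in range(1000)]
```

Because each zone gets its own value, zones created at the same moment no longer reach their refresh timers at the same moment, which spreads the resulting AXFR requests over the chosen window.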
+Design considerations
+---------------------
+
+The throttling queue could be implemented outside of the database, with these advantages:
+- No need to create an extra database column
+- No increased database I/O
+
+We propose using the database for the following reasons:
+- Zone Manager is the best candidate to handle the delayed Notify. Currently there is no way for Central to send a list of Zones to Zone Manager other than through the database
+- The queue can support delayed Notify for changes other than Pool NS record updates
+- Ability to monitor the queue size and ETA to inform the user and for debugging
+- A persistent queue can survive Zone Manager unhandled exceptions or restarts
+- The increased database load is negligible compared to the existing traffic
+
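The queue ETA mentioned in the list above can be estimated from the queue size and the two throttling settings. The sketch below is a rough upper-level estimate under stated assumptions: every run sends a full batch, no new zones are flagged while the queue drains, and the first run happens one interval from now. The function name `estimate_drain_seconds` and the sample numbers are hypothetical.

```python
import math


def estimate_drain_seconds(queue_length, batch_size, task_interval):
    """Estimate how long the pending-notify queue takes to empty."""
    if batch_size <= 0 or task_interval <= 0:
        raise ValueError("batch_size and task_interval must be positive")
    # Number of task runs needed; a partial final batch still costs one run.
    runs = math.ceil(queue_length / batch_size)
    return runs * task_interval


# 10000 flagged zones, 100 zones per run, one run every 5 seconds.
eta = estimate_drain_seconds(10000, 100, 5)
```

In this example the queue needs 100 runs, so the estimate is 500 seconds. The same arithmetic shows why a very large queue can delay later changes: they wait behind everything already flagged.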
+Risk analysis
+-------------
+
+- Zone Manager fails to run the Notify delivery task. The nameservers will eventually refresh the zone anyway. Impact: slow update propagation. Mitigation: expose the notification queue length to the user through the Admin API and by logging.
+- A big notification queue takes a considerable time to be handled. Impact: potentially prevents more urgent changes from being delivered quickly. Mitigation: encourage users to configure the throttling parameters; provide sensible default values. Implementing a concept of notification priority seems unnecessary.
+
+Assignee(s)
+-----------
+
+Primary assignee:
+Federico Ceratto https://launchpad.net/~federico-ceratto
+
+Milestones
+----------
+
+Target Milestone for completion:
+Liberty-3
+
+Work Items
+----------
+
+- Implement refresh time staggering
+- Implement Notify throttling
+- Add throttle parameters to configuration files
+- Document throttling mechanism
+- Write unit and functional tests
+- Test throttling and staggering on devstack
+
+Dependencies
+============
+
+N/A