Reload configuration files on SIGHUP signal

No need to restart the glance api service when user modifies configuration files. Operator/User can send SIGHUP signal to glance service which will reload the configuration file. Change-Id: Ie1862af08dbdd06653c1c515603b4e1a68fc7875
2014-10-09 06:09:07 -07:00 · 2014-10-09 06:09:07 -07:00 · 36839191e3
commit 36839191e3
parent b9313819c3
1 changed files with 176 additions and 0 deletions
--- a/specs/kilo/sighup-conf-reload.rst
+++ b/specs/kilo/sighup-conf-reload.rst
@ -0,0 +1,176 @@
 ===========================================
 Reload configuration files on SIGHUP signal
 ===========================================
 https://blueprints.launchpad.net/glance/+spec/sighup-conf-reload
 We propose to eliminate the need to restart the glance api service when
 configuration files are modified. Operator can send SIGHUP signal to glance
 service which will reload the configuration file.
 Problem description
 ===================
 In a production environment, an administrator will modify the glance-api.conf
 configuration parameters like filesystem_store_datadirs when the storage
 is almost full to add more capacity by adding more disks, or to increase
 the number of workers or log configuration etc. Then they need to restart the
 glance services explicitly for these changes to be loaded. Restarting
 service would break users connected to it which is not good from users point
 of view.
 Proposed change
 ===============
 Add the ability to dynamically change configuration settings of a running
 glance server with no impact to service.
 A running glance server consists of a parent process and one or
 more child processes.
 On receipt of a SIGHUP signal the parent process will:
 - reload the configuration
 - send a SIGHUP to the original child processes
 - start new child processes with the new configuration
 - its listening socket will not be closed
 On receipt of a SIGHUP signal each original child process will:
 - close the listening socket so as not to accept new requests
 - complete any in-flight requests
 - exit
 This approach is based on nginx's behaviour and avoids some of the
 disadvantages of the current oslo's Launcher reload:
 - Race conditions: Launcher does not shutdown eventlet cleanly, existing
  requests can fail.
 - If all child processes are busy there can be a lengthy delay when new
  requests are not processed.
 - Long lived pre-SIGHUP idle client connections can stall request
  processing indefinitely.
 - Not all parameters can be changed, eg number of workers.
 - The wsgi pipeline cannot be changed, for example to enable caching.
 Alternatives
 ------------
 An alternative may be to attempt to save and then restore long running tasks
 using taskflow. The process restart would then only need to deal with
 short lived requests (e.g. API DB lookups) and then no user visible downtime
 is required for regular restarts
 Data model impact
 -----------------
 None
 REST API impact
 ---------------
 None
 Security impact
 ---------------
 None
 Notifications impact
 --------------------
 None
 Other end user impact
 ---------------------
 None
 Performance Impact
 ------------------
 If the reload takes too long (e.g., >50ms) then the API requests will be
 noticeably delayed.
 We are proposing current worker processes to stop accepting requests and
 continue with what they are doing, while the parent process starts and
 spawn new child processes with the new configuration. So there is a
 possibility that the glance node will be running twice as many child processes
 as it is configured to run for a while. It could impact performance,
 especially if it is an underpowered node that is already configured to run
 as many child processes as it can handle without degradation.
 In the author's opinion, it is the responsibility of the operator to make sure
 the node will not be over-provisioned with child processes (workers). If an
 operator wants to run a node with no headroom for additional child processes,
 the author suggests that such an operator not use dynamic configuration via
 SIGHUP. Instead, such an operator should use the old fashioned technique of
 restarting the api service.
 .. _other_deployer:
 Other deployer impact
 ---------------------
 Need to document the impact of config changes for some params like workers,
 host, port etc.
 Developer impact
 ----------------
 None
 Implementation
 ==============
 Assignee(s)
 -----------
 Primary assignee:
  stuart-mclaren
 Other contributors:
  abhishek-kekane
 Reviewers
 ---------
 Core reviewer(s):
  nikhil-komawar
  flaper87
 Other reviewer(s):
  icordasc
 Work Items
 ----------
 - Add handler for SIGHUP signal
 - Reload configuration parameters
 - Unit and functional tests for coverage
 Dependencies
 ============
 None
 Testing
 =======
 None
 Documentation Impact
 ====================
 Please refer to :ref:`other_deployer`
 References
 ==========
 https://etherpad.openstack.org/p/sighup-conf-reload