Spec to improve containerized Apache WSGI footprint

Co-Authored-by: Michael Johnson <johnsomor@gmail.com> Change-Id: I96621a7f8234dc126eff0b9ae5a3a109a566459f Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2021-10-29 15:59:40 +02:00 · 2021-10-29 15:59:40 +02:00 · 23d0c73894
parent 50365e46dc
commit 23d0c73894
1 changed files with 183 additions and 0 deletions
--- a/specs/yoga/compact-wsgi-footprint.rst
+++ b/specs/yoga/compact-wsgi-footprint.rst
@@ -0,0 +1,183 @@
+..
+ This work is licensed under a Creative Commons Attribution 3.0 Unported
+ License.
+
+ http://creativecommons.org/licenses/by/3.0/legalcode
+
+=========================================
+Improve WSGI Apps Footprint in Containers
+=========================================
+
+https://blueprints.launchpad.net/tripleo/+spec/compact-wsgi-footprint
+
+OpenStack API service workers in TripleO are isolated in Python interpreters
+instantiated by mod_wsgi in their own Apache servers, which is a share-nothing
+execution model.
+
+This specification proposes an alternative deployment and configuration method
+for such WSGI applications. The method reduces their memory footprint at the
+price of a failure domain enlarged to the Apache server(s) that control them.
+This layout may eventually become the default, and it can be applied on a
+service-by-service basis.
+
+Problem Description
+===================
+
+Isolated Apache servers in containers make their WSGI applications consume
+more memory than necessary. The containers do not cooperate but compete for
+more and more RAM, because each Python interpreter process there keeps its own
+individual cache. More details can be found in `[0]`_.
+
+Proposed Change
+===============
+
+Provide alternative deployment methods for Apache in containers. Wrap its WSGI
+applications with podman-py `[1]`_ to run them in containers of a pod, while
+sharing the cache and the Python interpreters within a shared "vhost pod".
+
+That would require a new Apache module, like `mod_pod(man)`, to start each
+vhost as a podman pod that shares the needed namespaces (cgroup, ipc, net,
+uts, pid, and so on). An example `mod_container` `[2]`_ shows the idea.
+Alternatively, we could fork `mod_container` and modify it to prepare the
+needed namespaces for a vhost, and make the WSGI scripts aware of those
+pre-created namespaces. The first process group in an application group would
+then start a new podman pod from the pre-created namespaces and add a new
+container to it. Each subsequent process group would only add a new container
+to the app group pod.
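+
+The start-up order described above can be modelled in a few lines of Python.
+This is only a minimal in-memory sketch, not the proposed module: the
+``VhostPods`` class, its methods, and the process group names in the usage
+lines are hypothetical, and no podman API is called. It shows only that the
+first process group of an application group creates the pod, and that later
+process groups just add containers to it.
+

```python
from collections import defaultdict


class VhostPods:
    """Hypothetical in-memory model of one pod per application group."""

    def __init__(self):
        # Maps an application group name to the containers in its pod.
        self._pods = defaultdict(list)

    def attach(self, app_group, process_group):
        """Add a container for process_group to the pod of app_group.

        Returns True when this call had to create the pod first.
        """
        created = app_group not in self._pods
        # One container per process group; the pod appears on first use.
        self._pods[app_group].append(process_group)
        return created

    def containers(self, app_group):
        """Return the containers currently in the pod of app_group."""
        return list(self._pods.get(app_group, []))


pods = VhostPods()
# The first process group creates the pod, the second only joins it.
first = pods.attach("Nova_And_Cinder_Py39", "nova-api")
second = pods.attach("Nova_And_Cinder_Py39", "cinder-api")
print(first, second, pods.containers("Nova_And_Cinder_Py39"))
```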
+
+Depending on the use case, WSGI applications can be packed into dedicated
+Apache container pods by their main service tag, like Nova or Neutron, or by
+other grouping methods, like expected API load or Python interpreter version
+(for environments running mixed component versions). Compact all-in-one
+deployments may instead share a global context and interpreter.
+
+Apache VirtualHost configs may then use these segregation examples:
+
+* for a `pod1`:
+  ``WSGIApplicationGroup %{ENV:SVC}``, ``SetEnv SVC Nova_And_Cinder_Py39``.
+* for a `pod2`:
+  ``WSGIApplicationGroup %{ENV:SVC}``, ``SetEnv SVC Neutron_And_Glance_Py42``.
+
+Or use a ``GLOBAL`` app group and ``WSGIProcessGroup %{ENV:SVC}``.
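+
+As an illustration only, the directives listed above could be generated per
+pod by a small helper. The function names ``group_name`` and
+``vhost_directives`` are hypothetical and not part of TripleO; the directive
+values are taken from the examples above, and ``%{GLOBAL}`` is the mod_wsgi
+spelling of the global application group.
+

```python
def group_name(services, python_tag):
    """Build a group name such as 'Nova_And_Cinder_Py39'."""
    return "_And_".join(services) + "_" + python_tag


def vhost_directives(services, python_tag, global_app_group=False):
    """Return the mod_wsgi related vhost lines for one pod (a sketch)."""
    lines = ["SetEnv SVC " + group_name(services, python_tag)]
    if global_app_group:
        # One shared interpreter context, processes split per group.
        lines.append("WSGIApplicationGroup %{GLOBAL}")
        lines.append("WSGIProcessGroup %{ENV:SVC}")
    else:
        # One application group (sub interpreter name) per pod.
        lines.append("WSGIApplicationGroup %{ENV:SVC}")
    return lines


for line in vhost_directives(["Nova", "Cinder"], "Py39"):
    print(line)
```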
+
+Open question: can we share a Python interpreter of the application group
+without `mod_pod(man)`? Or, if we start process groups in distinct containers,
+would they share nothing? See `[3]`_.
+
+Overview
+--------
+
+According to `[3]`_, WSGI applications within the same application group will
+execute within the context of the same Python sub interpreter. In each distinct
+process of a named group of processes, there will be a separate sub interpreter
+instance of the same name.
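+
+The rule above implies a simple count, sketched here as an upper bound. It
+assumes that every process of the group ends up loading at least one
+application from every application group, which mod_wsgi does not guarantee;
+the function name ``max_sub_interpreters`` is hypothetical.
+

```python
def max_sub_interpreters(processes, app_groups):
    """Upper bound of sub interpreter instances in one process group.

    Each process holds its own instance per distinct application group.
    """
    return processes * len(set(app_groups))


# Four processes serving a single application group: four instances.
print(max_sub_interpreters(4, ["Nova_And_Cinder_Py39"]))
```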
+
+For example, `Nova`, `Cinder`, etc. can become either `WSGIProcessGroup` or
+`WSGIApplicationGroup` groups of a service-scoped or global Python context.
+
+TripleO could support both proposed Apache deployment layouts. The global
+context targets compact undercloud, standalone, and low-memory footprints,
+with an API failure domain as large as the common web server. The layout with
+a per-service Apache server targets HA environments, with failure domains
+reduced to a single service.
+
+Alternatives
+------------
+
+* Switch to uWSGI (and add it to RDO). uWSGI behind Apache is how the upstream
+  devstack is set up, and it has a proven track record in many production
+  environments. It would also require additions to existing puppet modules and
+  changes to existing container builds, and would introduce a new mechanism
+  for web services integration.
+
+Security Impact
+---------------
+
+A compromised Apache web-server would also compromise its WSGI applications.
+
+Upgrade Impact
+--------------
+
+No special upgrade procedures are required. Apache and API containers would
+pick up the deployment method changes by the standard means of the framework.
+
+Other End User Impact
+---------------------
+
+No end user impact (the new deployment method of Apache is not user-facing).
+
+Performance Impact
+------------------
+
+Depending on the global (per host) or service-based scoping of the deployed
+Apache server(s), the memory footprint would decrease drastically, while
+improved caching among Python WSGI processes would benefit undercloud,
+standalone, and multi-node overcloud setups.
+
+Performance could also decrease, depending on the design, due to the NAT
+required at the pod boundary (i.e. exposing the port) and the
+containerization overhead in general.
+
+When all the API WSGI applications run under one WSGI framework and/or Python
+instance, performance can degrade at scale due to the GIL. There is some
+benefit to giving high-rate-of-change APIs their own Python instance and
+placing them into separate pods.
+
+Other Deployer Impact
+---------------------
+
+No deployer impact (the new deployment method of Apache provides no new
+configurable switches).
+
+Developer Impact
+----------------
+
+No developer impact aside from maintaining podman-py wrappers of WSGI scripts.
+
+Implementation
+==============
+
+Assignee(s)
+-----------
+
+Primary assignee:
+  * Bogdando - Bogdan Dobrelya
+
+Other contributors:
+  * ???
+
+Work Items
+----------
+
+* Implement `mod_pod(man)` for HTTPD based on podman libs or `mod_container`.
+* Implement a podman-py wrapper that starts WSGI applications in containers.
+* Provide a switch to control the deployment mode for Apache in containers.
+* Default OpenStack APIs to the service-scoped containerized Apache servers.
+* Switch Undercloud and Standalone deployments to the global Apache container.
+
+Dependencies
+============
+
+External dependencies include ``podman-remote`` and ``python-podman-api``
+that provide the Python Podman API (podman-py `[1]`_) via Unix sockets. The
+latter relies on the systemd socket activation feature. These packages are
+already available via CentOS 8 AppStream and container-tools modules.
+
+Unfortunately, podman-py won't be available downstream until RHEL 8.5, which
+means that this feature cannot be backported.
+
+Testing
+=======
+
+No special test coverage is required.
+
+Documentation Impact
+====================
+
+No documentation impact, since the alternative deployment method is internal to
+the TripleO framework.
+
+References
+==========
+
+.. _[0]: https://etherpad.opendev.org/p/containerized-memory-sharing
+.. _[1]: https://github.com/containers/podman-py
+.. _[2]: https://github.dev/avagin/mod_container
+.. _[3]: https://modwsgi.readthedocs.io/en/develop/configuration-directives/WSGIApplicationGroup.html