stx-7.0 initial spec for armada deprecation and removal

This spec proposes the replacement of Armada with FluxCD.

Story: 2009138
Change-Id: I3bbe4c452a09915031029e4bc6a7080ef08e6167
Signed-off-by: Mihnea Saracin <Mihnea.Saracin@windriver.com>
Mihnea Saracin 2022-02-16 20:36:26 +02:00
parent 960766e114
commit 31407ef891
1 changed files with 329 additions and 0 deletions

@ -0,0 +1,329 @@
..
  This work is licensed under a Creative Commons Attribution 3.0 Unported
  License. http://creativecommons.org/licenses/by/3.0/legalcode

==========================================
Armada Deprecation and Replacement
==========================================
Storyboard: https://storyboard.openstack.org/#!/story/2009138
This story describes how Armada will be replaced in StarlingX with FluxCD,
a tool that provides similar functionality.
Problem description
===================
Airship Armada [1]_ is used by the StarlingX application framework
to manage the life-cycle of StarlingX applications.
It provides the ability to manage (i.e., install, upgrade, remove) multiple
Helm charts in an application with dependencies by having all chart
configurations in a single Armada YAML and having it processed with a single
command.
However, the Armada project in Airship is no longer actively developed
upstream and is no longer used by current versions of Airship.
Therefore an alternative underlying application management solution
needs to be used by the StarlingX Application Framework.
Use Cases
---------
As a developer/tester/operator, I need the ability to define and manage
various applications on a running StarlingX system in a similar fashion
to how this was done with Armada, i.e., defining applications with one or
more Helm charts and specifying install dependencies between Helm charts.
The new solution must also support both Helm v2 and Helm v3 charts.
Proposed change
===============
A good replacement for Armada in the StarlingX project is ``FluxCD`` [2]_.
It provides the same mechanisms to manage the life-cycle of
StarlingX applications.
Flux is a tool for keeping Kubernetes clusters in sync with sources of
configuration (like Git repositories), and automating updates to
configuration when there is new code to deploy.
For our specific needs, we will use only the Flux Helm controller [3]_
and Source controller [4]_. The ``kustomize`` [5]_ tool will also
be used to deploy FluxCD-related resources.
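
As a rough illustration of how Armada's install dependencies between charts
could map onto the Helm controller, the sketch below shows a ``HelmRelease``
that waits for another release through ``spec.dependsOn``. The chart,
repository, and namespace names are hypothetical placeholders, and the
``apiVersion`` is the one published by Flux v0.x releases; the version
deployed in StarlingX may differ.

.. code-block:: yaml

   # Hypothetical example: chart-n is installed only after chart-0 is ready.
   apiVersion: helm.toolkit.fluxcd.io/v2beta1
   kind: HelmRelease
   metadata:
     name: chart-n
     namespace: example-app        # assumed application namespace
   spec:
     releaseName: chart-n
     interval: 5m                  # reconciliation interval (assumed value)
     dependsOn:
       - name: chart-0             # HelmRelease in the same namespace
     chart:
       spec:
         chart: chart-n
         version: 1.0.0            # assumed chart version
         sourceRef:
           kind: HelmRepository
           name: example-repo      # assumed HelmRepository name

With a definition along these lines, the Helm controller holds back
``chart-n`` until the ``chart-0`` release reports ready, which covers the
ordered installation that Armada chart groups provided.
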
Packaging FluxCD and Kustomize into the Load
--------------------------------------------
The Flux Helm and Source controllers will be deployed as pods during the
Ansible bootstrap. To enable incremental delivery and eventual removal of
Armada, a new common Ansible role for installing the Helm/Source
controllers will be added.
The ``kustomize`` binary is used to install FluxCD resources and
will be packaged into one of the Kubernetes RPMs in the integ repository.
Newer Kubernetes versions do not need the separate binary because Kustomize
support is built into ``kubectl`` (``kubectl apply -k``).
StarlingX application changes
-----------------------------
The Armada tar file components for every
StarlingX application are the following:
* charts - A mandatory directory which contains
the helm charts for the application.
* checksum.md5 - An MD5 checksum for the tar file contents.
* files - An optional directory for resource files.
* metadata.yaml - Application metadata.
* plugins - An optional directory containing python app plugins.
* templates - An optional directory for template files.
* **ArmadaManifest.yaml - The Armada manifest file for the application.**
The tar file components of a FluxCD application will be almost the same as
the Armada components. The only difference is that the Armada manifest file
(**ArmadaManifest.yaml**) will be replaced with the **fluxcd-manifests**
directory, which will contain the files needed to deploy the application
using FluxCD.
The following structure is proposed:

* fluxcd-manifests

  * base

    * helmrepository.yaml
    * kustomization.yaml
    * namespace.yaml

  * kustomization.yaml
  * chart-0

    * helmrelease.yaml
    * kustomization.yaml
    * chart-0-static-overrides.yaml
    * chart-0-system-overrides.yaml

  * chart-n

    * helmrelease.yaml
    * kustomization.yaml
    * chart-n-static-overrides.yaml
    * chart-n-system-overrides.yaml

In the new **fluxcd-manifests** directory, there is a **base** directory
that contains the basic resources that must be created for this
particular application (e.g., the Helm repository).
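
A minimal sketch of what the **base** directory and the top-level
``kustomization.yaml`` might contain is shown below. The namespace name, the
repository name, and the repository URL are hypothetical placeholders, and
the ``HelmRepository`` ``apiVersion`` is the one published by Flux v0.x
releases; none of these values are defined by this spec.

.. code-block:: yaml

   # fluxcd-manifests/base/namespace.yaml
   apiVersion: v1
   kind: Namespace
   metadata:
     name: example-app                  # assumed application namespace
   ---
   # fluxcd-manifests/base/helmrepository.yaml
   apiVersion: source.toolkit.fluxcd.io/v1beta1
   kind: HelmRepository
   metadata:
     name: example-repo                 # assumed repository name
     namespace: example-app
   spec:
     url: http://helm-repo.example/charts   # placeholder URL
     interval: 60m                      # how often the index is refreshed
   ---
   # fluxcd-manifests/base/kustomization.yaml
   apiVersion: kustomize.config.k8s.io/v1beta1
   kind: Kustomization
   resources:
     - namespace.yaml
     - helmrepository.yaml
   ---
   # fluxcd-manifests/kustomization.yaml
   apiVersion: kustomize.config.k8s.io/v1beta1
   kind: Kustomization
   resources:
     - base
     - chart-0
     - chart-n

Each block above is a separate file; the ``---`` separators and path
comments are only there to keep the example compact. Under these
assumptions, running ``kubectl apply -k fluxcd-manifests`` would create the
namespace, the Helm repository, and every chart's resources in one step.
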
Each chart directory has a helmrelease YAML file that defines
the ``HelmRelease`` object for that Helm chart and two override YAML files
that contain the static and the system overrides for the chart.
The static overrides and system overrides will be stored in ``Secret``
objects and referenced from the chart's ``HelmRelease`` object.
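
One way the chart directory could be wired together is sketched below: the
chart's ``kustomization.yaml`` turns the two override files into ``Secret``
objects, and the ``HelmRelease`` reads them through ``spec.valuesFrom``.
Using ``secretGenerator`` to create the Secrets is an assumption, since this
spec does not say how they are produced. The namespace, repository name,
chart version, and ``apiVersion`` values are likewise illustrative.

.. code-block:: yaml

   # fluxcd-manifests/chart-0/kustomization.yaml
   apiVersion: kustomize.config.k8s.io/v1beta1
   kind: Kustomization
   namespace: example-app               # assumed application namespace
   resources:
     - helmrelease.yaml
   secretGenerator:                     # assumption: Secrets built from files
     - name: chart-0-static-overrides
       files:
         - chart-0-static-overrides.yaml
     - name: chart-0-system-overrides
       files:
         - chart-0-system-overrides.yaml
   generatorOptions:
     disableNameSuffixHash: true        # keep names stable for valuesFrom
   ---
   # fluxcd-manifests/chart-0/helmrelease.yaml
   apiVersion: helm.toolkit.fluxcd.io/v2beta1
   kind: HelmRelease
   metadata:
     name: chart-0
   spec:
     releaseName: chart-0
     interval: 5m                       # assumed reconciliation interval
     chart:
       spec:
         chart: chart-0
         version: 1.0.0                 # assumed chart version
         sourceRef:
           kind: HelmRepository
           name: example-repo           # assumed repository name
     valuesFrom:
       - kind: Secret
         name: chart-0-static-overrides
         valuesKey: chart-0-static-overrides.yaml
       - kind: Secret
         name: chart-0-system-overrides
         valuesKey: chart-0-system-overrides.yaml

The Helm controller merges ``valuesFrom`` entries in the order listed, with
later entries overriding earlier ones, so listing the system overrides last
lets them take precedence over the static overrides.
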
All the existing applications need to be updated to this new structure.
We also need to ensure that all the existing application repositories that
contain 'armada' in the name are renamed and that the StarlingX repo
manifests are updated to reflect the name changes.
StarlingX framework changes
---------------------------
* All existing system application-[upload/apply/remove/delete/abort/update]
commands need to operate on both Armada and FluxCD tarball formats.
To enable incremental delivery and eventual removal of Armada,
we want to maintain the existing infrastructure to call Armada based
applications and introduce new functionality to use the Helm/Source
controllers for applications formatted for them.
* Support application upgrade (to FluxCD) and rollback (to Armada)
if problems occur during a platform upgrade.
Alternatives
------------
**Argo CD** [6]_ - A declarative, GitOps continuous delivery tool for k8s
application deployment and automated lifecycle management. Each workload is
defined declaratively through a resource manifest in an Application YAML file.
Argo CD checks whether the state defined in the Git repository matches what is
running on the cluster and synchronizes it if changes are detected. Argo CD
applications can be built from Kustomize applications, Helm charts, and so on.
A Helm repository can be used instead of a Git repository as the source of a
Helm chart.
If an application is made of multiple Helm charts and a specific installation
sequence is required, the Argo CD app for each Helm chart needs to be set to
manual sync mode and the user has to sync the apps in the specified order.
In other words, it does not support the auto-sync (deploy) of multiple Helm
charts in a specific order.
**Argo Workflows** [7]_ - A container-native workflow engine for
orchestrating parallel/sequential jobs on Kubernetes.
**Kubeapps** [8]_ - A web-based UI for deploying and managing
applications in Kubernetes clusters.
Data model impact
-----------------
None
REST API impact
---------------
None
Security impact
---------------
None
Other end user impact
---------------------
None
Performance Impact
------------------
The impact is unknown at this time.
Other deployer impact
---------------------
None
Developer impact
----------------
Developers adding new StarlingX applications need to adopt the
new application structure that is compatible with FluxCD.
Upgrade impact
--------------
* All platform applications are updated over a
platform upgrade (from Armada to FluxCD)
* When a platform upgrade is in progress an application update
failure must trigger a rollback using Armada
* Armada pod is removed from the system after all applications are
upgraded to the FluxCD compatible framework on platform upgrade activation
* Armada application use is rejected (Armada-based application tarball
  uploads are detected and rejected) once the platform upgrade is completed
  and all applications have been migrated to FluxCD
Implementation
==============
Assignee(s)
-----------
Primary assignee:
* Mihnea Saracin (msaracin)
Repos Impacted
--------------
* config
* ansible-playbooks
* integ
* audit-armada-app
* metrics-server-armada-app
* monitor-armada-app
* nginx-ingress-controller-armada-app
* oidc-auth-armada-app
* openstack-armada-app
* platform-armada-app
* portieris-armada-app
* ptp-notification-armada-app
* rook-ceph
* SDO-rv-service
* snmp-armada-app
* vault-armada-app
Work Items
----------
* Implement Ansible bootstrap changes to install FluxCD Helm/Source controllers
* Enable the Helm and Source controllers
* Update the application framework to use the Helm/Source controllers
* Update all applications to use the Helm/Source controllers
for deploying/updating applications
* Enable the postgres backend for the Helm controller
* Provide upgrade support to update all applications to the Helm Controller
and remove the Armada pod
* Ensure that the Helm/Source controller versions will align/work with
future kubernetes versions
* Remove Armada-related code from the application
  framework/Ansible playbooks
* Provide documentation support for setting up applications that use
the Helm/Source controllers
Dependencies
============
* FluxCD
Testing
=======
* the usual unit testing in the impacted code areas
* full system regression of all StarlingX applications functionality
(system application commands, lifecycle actions etc)
* performance testing in order to identify and address any performance impacts.
In addition, this story changes the way a StarlingX system is installed and
configured, which will require changes in existing automated installation
and testing tools.
Documentation Impact
====================
Only the **developer** documentation needs the following additions:
* Brief description of the new FluxCD Helm/Source controllers.
* Note to be added to Armada references that Armada is deprecated
  and Armada applications can no longer be uploaded.
References
==========
.. [1] https://github.com/openstack/airship-armada
.. [2] https://fluxcd.io/
.. [3] https://github.com/fluxcd/helm-controller
.. [4] https://github.com/fluxcd/source-controller
.. [5] https://kustomize.io/
.. [6] https://github.com/argoproj/argo-cd
.. [7] https://github.com/argoproj/argo-workflows
.. [8] https://github.com/kubeapps/kubeapps
History
=======

.. list-table:: Revisions
   :header-rows: 1

   * - Release Name
     - Description
   * - STX-7.0
     - Introduced