Merge "Database Synchronization between Tacker and K8s"

2022-07-29 08:23:20 +00:00 · 2022-07-29 08:23:20 +00:00 · a876c667dd
parent 762649479b 7213b97071
commit a876c667dd
1 changed files with 360 additions and 0 deletions
--- a/specs/zed/database-synchronization.rst
+++ b/specs/zed/database-synchronization.rst
@ -0,0 +1,360 @@
+==========================================================
+Database Synchronization between TackerDB and KubernetesDB
+==========================================================
+
+https://blueprints.launchpad.net/tacker/+spec/database-synchronization
+
+Problem description
+===================
+
+In the current implementation, TackerDB and VIM DB(KubernetesDB)
+such as etcd in Kubernetes are only synchronized when SOL002 heal is executed.
+However, when Kubernetes auto-heals or auto-scales and the pod restarts,
+the information is only reflected in the KubernetesDB and not in TackerDB.
+As a result, the following problems occurs.
+
+* When a user uses a command or API to get information about
+  a Pod resource, they see the information stored in TackerDB
+  and get an out-of-date output.
+
+* If a user attempts to scale when the database is inconsistent, the Pod
+  scales at a number that the user does not intend because it sees and
+  processes old information.
+
+.. note:: The existing heal process already supports the ability
+          to synchronize information about a Pod.
+
+This specification supposes the below changes:
+
+* Adds periodic synchronization of Pod information between
+  TackerDB and KubernetesDB
+
+Proposed Change
+===============
+
+The following functions are incorporated in this specification.
+
+ Regularly retrieve information of pods that TackerDB and
+  KubernetesDB hold, compare the data,
+  and update TackerDB from KubernetesDB if there are differences.
+
+  .. note:: When the TackerDB is updated by synchronization processing,
+            the information is shown in a log.
+            The log contains the name of the old pod and the new one.
+
+ Enable config to specify the frequency of synchronization.
+
+The attributes in the TackerDB to be updated are as follows:
+
+ vnf_instantiated_info.scaleStatus.scaleLevel: Calculate the `scaleLevel`
+  from the number of Pods retrieved from KubernetesDB and update it if it
+  differs from TackerDB.
+ vnf_instantiated_info.vnfc_resource_info: If the number of
+  running Pods changes, increase or decrease the number of data in
+  `vnfcResourceInfo` to match the number of running Pods.
+ vnf_instantiated_info.vnfc_resource_info.computeResource.resourceId:
+  Update resourceId with the new resource ID as Pod name.
+ VnfInstanceV2.instantiatedVnfInfo.scaleStatus.scaleLevel: Calculate
+  the `scaleLevel` from the number of Pods retrieved from KubernetesDB
+  and update it if it differs from TackerDB.
+ VnfInstanceV2.instantiatedVnfInfo.vnfcResourceInfo: If the number of
+  running Pods changes, increase or decrease the number of data in
+  `vnfcResourceInfo` and `VnfInstanceV2.instantiatedVnfInfo.vnfcInfo`
+  to match the number of running Pods.
+ VnfInstanceV2.instantiatedVnfInfo.vnfcResourceInfo.
+  computeResource.resourceId:
+  Update resourceId with the new resource ID as Pod name.
+
+.. note:: Get the following resource information from KubernetesDB
+
+  * Pod
+  * Deployment
+  * Replicaset
+  * Daemonset
+  * StatefulSet
+
+  Pods under the control of StatefulSet have static names,
+  thus resourceId does not change even if they are recreated by
+  Kubernetes. However, Tacker must fetch the current number of
+  Pods to update ScaleStatus during DB synchronization.
+
+Synchronizes between TackerDB and KubernetesDB
+----------------------------------------------
+
+Periodic database synchronization
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This feature periodically synchronizes information related to
+Pod resources between TackerDB and KubernetesDB.
+
+The diagram below shows the database synchronization.
+
+::
+
+
+                                                    +----------------------------------------------------------+
+                                                    |   VNFM                                                   |
+                                                    |  +-----------------------+                               |
+   +-------------------------------+                |  |    Tacker-conductor   |                               |
+   |     Kubernetes                |                |  |  +--------------+     |                               |
+   |                               |                |  |  | VnflcmDriver |     |                               |
+   | +--------------------------+  |                |  |  | (v1/v2)      |     |                               |
+   | |           Master         |  |                |  |  +---+----------+     |                               |
+   | |                          |  |                |  |      | 1. Start periodic database synchronization     |
+   | |                          |  |                |  |      |                |                               |
+   | |                          |  |                |  |      |                |                               |
+   | | +------+    +---------+  |  |                |  |  +---v----------+     |                               |
+   | | | etcd |    |         |  |  |                |  |  | Kubernetes   |     |              +-----------+    |
+   | | |      |<-->|kube-    |<---------------------------+ InfraDriver  +------------------->| TackerDB  |    |
+   | | +------+    |apiserver|  |  | 3. Get pod     |  |  |              |     |              |           |    |
+   | |             |         |  |  |    information |  |  +--------------+     |              +-----------+    |
+   | |             |         |  |  |                |  |   4. Check difference |   2. Get pod information      |
+   | |             |         |  |  |                |  |                       |                               |
+   | |             |         |  |  |                |  |                       |   5. If there are difference, |
+   | |             +---------+  |  |                |  |                       |      update pod information   |
+   | |                          |  |                |  |                       |                               |
+   | +--------------------------+  |                |  +-----------------------+                               |
+   |                               |                |                                                          |
+   +-------------------------------+                +----------------------------------------------------------+
+
+
+
+
+Sequence for periodic database synchronization:
+
+.. seqdiag::
+
+  seqdiag {
+    node_width = 80;
+    edge_length = 100;
+
+    "Client"
+    "Tacker-server"
+    "Tacker-conductor"
+    "VnfLcmDriver/VnfLcmDriverV2"
+    "KubernetesInfraDriver"
+    "TackerDB"
+    "Kubernetes client"
+
+    === Start periodic database synchronization. ===
+    "Tacker-conductor" -> "VnfLcmDriver/VnfLcmDriverV2"
+      [label = "1. Start periodic database synchronization."];
+    "VnfLcmDriver/VnfLcmDriverV2" -> "KubernetesInfraDriver"
+      [label = "check pod status"];
+    "KubernetesInfraDriver" -> "TackerDB"
+      [label = "2. Get pod information"];
+    "KubernetesInfraDriver" <-- "TackerDB"
+      [label = "return pod information"];
+    "KubernetesInfraDriver" -> "Kubernetes client"
+      [label = "3. Get pod information"];
+    "KubernetesInfraDriver" <-- "Kubernetes client"
+      [label = "return pod information"];
+    "KubernetesInfraDriver" -> "KubernetesInfraDriver"
+      [label = "4. Check differences"];
+    "KubernetesInfraDriver" -> "TackerDB"
+      [label = "5. If there are difference, update pod information"];
+
+    "Client" -> "Tacker-server"
+      [label = "LCM request", rightnote = "This is a sample of LCM request,\nnot intending to describe specific LCM."];
+    "Tacker-server" -> "Tacker-conductor"
+      [label = "LCM request"];
+    "Tacker-conductor" -> "VnfLcmDriver/VnfLcmDriverV2"
+      [label = "LCM request"];
+    "VnfLcmDriver/VnfLcmDriverV2" -> "KubernetesInfraDriver"
+      [label = "LCM request"];
+    "VnfLcmDriver/VnfLcmDriverV2" <-- "KubernetesInfraDriver"
+      [label = "Return error due to synchronization", rightnote = "LCM requests will be rejected\nduring synchronization."];
+    "Tacker-conductor" <-- "VnfLcmDriver/VnfLcmDriverV2"
+      [label = "Return error due to synchronization"];
+    "Tacker-server" <-- "Tacker-conductor"
+      [label = "Return error due to synchronization"];
+    "Client" <-- "Tacker-server"
+      [label = "409 Conflict"];
+
+    "KubernetesInfraDriver" <-- "TackerDB"
+      [label = ""];
+    "VnfLcmDriver/VnfLcmDriverV2" <-- "KubernetesInfraDriver"
+      [label = ""];
+    "Tacker-conductor" <-- "VnfLcmDriver/VnfLcmDriverV2"
+      [label = ""];
+    "Client" <-- "Tacker-conductor"
+      [style=none];
+
+    === In case of conflicting DB synchronization during LCMs ===
+    "Client" -> "Tacker-server"
+      [label = "LCM request", rightnote = "This is a sample of LCM request,\nnot intending to describe specific LCM."];
+    "Tacker-server" -> "Tacker-conductor"
+      [label = "LCM request"];
+    "Tacker-conductor" -> "VnfLcmDriver/VnfLcmDriverV2"
+      [label = "LCM request"];
+    "VnfLcmDriver/VnfLcmDriverV2" -> "KubernetesInfraDriver"
+      [label = "LCM request"];
+
+    "Tacker-server" -> "Tacker-conductor"
+      [style=none]
+    "Tacker-conductor" -> "VnfLcmDriver/VnfLcmDriverV2"
+      [label = "1'. Start periodic database synchronization."];
+    "VnfLcmDriver/VnfLcmDriverV2" -> "KubernetesInfraDriver"
+      [label = "check pod status"];
+    "KubernetesInfraDriver" -> "TackerDB"
+      [label = "2'. Get pod information"];
+    "KubernetesInfraDriver" <-- "TackerDB"
+      [label = "return pod information"];
+    "KubernetesInfraDriver" -> "Kubernetes client"
+      [label = "3'. Get pod information"];
+    "KubernetesInfraDriver" <-- "Kubernetes client"
+      [label = "return pod information"];
+    "KubernetesInfraDriver" -> "KubernetesInfraDriver"
+      [label = "4'. Check differences"];
+    "KubernetesInfraDriver" -> "KubernetesInfraDriver"
+      [label = "5'. Skip DB update if the target instance is same as current ongoing LCM."];
+    "VnfLcmDriver/VnfLcmDriverV2" <-- "KubernetesInfraDriver"
+      [label = ""];
+    "Tacker-conductor" <-- "VnfLcmDriver/VnfLcmDriverV2"
+      [label = ""];
+    "Tacker-server" <-- "Tacker-conductor"
+      [style=none]
+
+    "VnfLcmDriver/VnfLcmDriverV2" <-- "KubernetesInfraDriver"
+      [label = "return"];
+    "Tacker-conductor" <-- "VnfLcmDriver/VnfLcmDriverV2"
+      [label = "return"];
+    "Tacker-server" <-- "Tacker-conductor"
+      [label = "return"];
+    "Client" <-- "Tacker-server"
+      [label = "return"];
+  }
+
+#. Start the database synchronization process periodically at
+   intervals specified in Config
+
+#. KubernetesInfraDriver retrieves Pod information stored in TackerDB
+
+#. KubernetesInfraDriver retrieves information about currently running
+   Pod name and number of Pod from KubernetesDB
+
+#. KubernetesInfraDriver compares these two pieces of information
+
+   Comparisons include:
+
+   * Pod name
+
+   * Number of running Pods
+
+#. If there are differences, KubernetesInfraDriver updates the
+   Pod information and scaleLevel to TackerDB.
+
+   * Any LCM requests of target instances is rejected by HTTP 409
+     (Conflict).
+
+Step 1' to 5' describes the detail of conflict of DB synchronization during
+LCM process. If DB synchronization is needed for the instance which is
+affected by ongoing LCM procedure, DB synchronization will be skipped for that
+instance.
+
+Configuration for Interval timer
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Config can set a timer to synchronize pod information between
+TackerDB and KubernetesDB.
+
+This configuration is common to the v1 and v2 APIs.
+
+The default is 300 seconds.
+
+
+Data model impact
+-----------------
+
+None
+
+REST API impact
+---------------
+
+None
+
+Security impact
+---------------
+
+None
+
+Notifications impact
+--------------------
+
+None
+
+Other end user impact
+---------------------
+
+None
+
+Performance impact
+------------------
+
+Contention may occur when database synchronization and LCM operations are
+performed simultaneously.
+
+ If you perform an LCM operation on the same VNF instance while periodic
+  database synchronization is taking place on that VNF instance, Tacker
+  responds with "409 conflicts" and the LCM operation is not performed.
+ If database synchronization is performed for the same VNF instance during
+  an LCM operation for a VNF instance, the database synchronization is skipped.
+
+Other deployer impact
+---------------------
+
+None
+
+Developer impact
+----------------
+
+None
+
+Implementation
+==============
+
+Assignee(s)
+-----------
+
+Primary assignee:
+ Masaki Ueno <masaki.ueno.up@hco.ntt.co.jp>
+
+Other contributors:
+ Hideki Matsuda <matsuda.hideki1@fujitsu.com>
+
+ Ayumu Ueha <ueha.ayumu@fujitsu.com>
+
+ Yoshiyuki Katada <katada.yoshiyuk@fujitsu.com>
+
+ Yusuke Niimi <niimi.yusuke@fujitsu.com>
+
+Work Items
+----------
+
+ Support for periodic database synchronization
+
+ Add a new configuration about interval for checking database synchronization
+
+ Add new unit and functional tests
+
+Dependencies
+============
+
+None
+
+Testing
+=======
+
+Unit and functional tests will be added to cover cases required in the
+specification.
+
+Documentation Impact
+====================
+
+Complete configuration guide will be added to explain configuration
+about specify the interval between synchronizations.
+
+References
+==========
+
+None