Merge "Added Congress HA Overview guide"

This commit is contained in:
Jenkins 2016-08-30 19:15:26 +00:00 committed by Gerrit Code Review
commit 9e97f398ab
4 changed files with 325 additions and 79 deletions

View File

@ -14,9 +14,9 @@ cloud operator can declare, monitor, enforce, and audit "policy" in a
heterogeneous cloud environment. Congress gets inputs from a cloud's
various cloud services; for example in OpenStack, Congress fetches
information about VMs from Nova, and network state from Neutron, etc.
Congress then feeds input data from those services into its policy engine
where Congress verifies that the cloud's actual state abides by the cloud
operator's policies. Congress is designed to work with **any policy** and
**any cloud service**.
2. Why is Policy Important
@ -60,7 +60,7 @@ resembles datalog. For more detail about the policy language and data
format see :ref:`Policy <policy>`.
To add a service as an input data source, the cloud operator configures a Congress
"driver," and the driver queries the service. Congress already
has drivers for several types of service, but if a cloud operator
needs to use an unsupported service, she can write a new driver
without much effort and probably contribute the driver to the
@ -88,24 +88,29 @@ Congress is free software and is licensed with Apache.
There are 2 ways to install Congress.
* As part of DevStack. Get Congress running alongside other OpenStack services like Nova
and Neutron, all on a single machine. This is a great way to try out Congress for the
first time.
* Separate install. Get Congress running alongside an existing OpenStack
deployment.
4.1 DevStack install
--------------------
For integrating Congress with DevStack:
1. Download DevStack
.. code-block:: console
$ git clone https://git.openstack.org/openstack-dev/devstack.git
$ cd devstack
2. Configure DevStack to use Congress and any other service you want. To do that, modify
the ``local.conf`` file (inside the DevStack directory). Here is what
our file looks like:
.. code-block:: console
[[local|localrc]]
@ -122,13 +127,15 @@ For integrating congress with DevStack::
enable_service s-proxy s-object s-container s-account
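As an alternative, a minimal ``local.conf`` that pulls Congress in via its DevStack plugin might look like the sketch below; the ``enable_plugin`` line and the password values are assumptions to adapt to your environment.
.. code-block:: text
[[local|localrc]]
ADMIN_PASSWORD=password
DATABASE_PASSWORD=password
RABBIT_PASSWORD=password
SERVICE_PASSWORD=password
SERVICE_TOKEN=password
enable_plugin congress http://git.openstack.org/openstack/congress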
3. Run ``stack.sh``. The default configuration expects the passwords to be 'password'
without the quotes
.. code-block:: console
$ ./stack.sh
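Once ``stack.sh`` finishes, a quick sanity check is to list the Congress datasources with the client that DevStack installs (this assumes the ``openstack`` CLI and python-congressclient are available in your environment):
.. code-block:: console
$ source openrc admin admin
$ openstack congress datasource list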
4.2 Separate install
--------------------
Install the following software, if you haven't already.
* python 2.7: https://www.python.org/download/releases/2.7/
@ -136,28 +143,36 @@ Install the following software, if you haven't already.
* pip: https://pip.pypa.io/en/latest/installing.html
* java: http://java.com (any reasonably current version should work)
On Ubuntu: ``apt-get install default-jre``
* Additionally
.. code-block:: console
$ sudo apt-get install git gcc python-dev libxml2 libxslt1-dev libzip-dev mysql-server python-mysqldb build-essential libssl-dev libffi-dev
Clone Congress
.. code-block:: console
$ git clone https://github.com/openstack/congress.git
$ cd congress
Install requirements
.. code-block:: console
$ sudo pip install .
Install source code
.. code-block:: console
$ sudo python setup.py install
Configure Congress (assume you put config files in /etc/congress)
.. code-block:: console
$ sudo mkdir -p /etc/congress
$ sudo mkdir -p /etc/congress/snapshot
@ -165,21 +180,29 @@ Configure congress::
$ sudo cp etc/policy.json /etc/congress
$ sudo touch /etc/congress/congress.conf
Add drivers to the [DEFAULT] section of /etc/congress/congress.conf:
.. code-block:: text
drivers = congress.datasources.neutronv2_driver.NeutronV2Driver,congress.datasources.glancev2_driver.GlanceV2Driver,congress.datasources.nova_driver.NovaDriver,congress.datasources.keystone_driver.KeystoneDriver,congress.datasources.ceilometer_driver.CeilometerDriver,congress.datasources.cinder_driver.CinderDriver,congress.datasources.swift_driver.SwiftDriver,congress.datasources.plexxi_driver.PlexxiDriver,congress.datasources.vCenter_driver.VCenterDriver,congress.datasources.murano_driver.MuranoDriver,congress.datasources.ironic_driver.IronicDriver
Modify [keystone_authtoken] and [database] according to your environment.
To run Congress with "noauth", add the following line to the [DEFAULT]
section in /etc/congress/congress.conf:
.. code-block:: text
auth_strategy = noauth
You may also want to delete or comment out the [keystone_authtoken] section in
/etc/congress/congress.conf.
A bare-bones congress.conf is as follows (adapt MySQL root password):
.. code-block:: text
[DEFAULT]
drivers = congress.datasources.neutronv2_driver.NeutronV2Driver,congress.datasources.glancev2_driver.GlanceV2Driver,congress.datasources.nova_driver.NovaDriver,congress.datasources.keystone_driver.KeystoneDriver,congress.datasources.ceilometer_driver.CeilometerDriver,congress.datasources.cinder_driver.CinderDriver,congress.datasources.swift_driver.SwiftDriver,congress.datasources.plexxi_driver.PlexxiDriver,congress.datasources.vCenter_driver.VCenterDriver,congress.datasources.murano_driver.MuranoDriver,congress.datasources.ironic_driver.IronicDriver
@ -187,76 +210,88 @@ Configure congress::
[database]
connection = mysql://root:password@127.0.0.1/congress?charset=utf8
For a detailed sample, please see README-congress.conf.txt.
Create database
.. code-block:: console
$ mysql -u root -p
mysql> CREATE DATABASE congress;
mysql> GRANT ALL PRIVILEGES ON congress.* TO 'congress'@'localhost' IDENTIFIED BY 'CONGRESS_DBPASS';
mysql> GRANT ALL PRIVILEGES ON congress.* TO 'congress'@'%' IDENTIFIED BY 'CONGRESS_DBPASS';
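Optionally, you can confirm that the new account can reach the database using the credentials granted above:
.. code-block:: console
$ mysql -u congress -pCONGRESS_DBPASS -e "SHOW DATABASES;"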
Configure congress.conf with the database information.
Push down schema
.. code-block:: console
$ sudo congress-db-manage --config-file /etc/congress/congress.conf upgrade head
Set up Congress accounts
Use your OpenStack RC file to set and export required environment variables:
OS_USERNAME, OS_PASSWORD, OS_PROJECT_NAME, OS_TENANT_NAME, OS_AUTH_URL.
(Adapt parameters according to your environment)
.. code-block:: console
$ ADMIN_ROLE=$(openstack role list | awk "/ admin / { print \$2 }")
$ SERVICE_TENANT=$(openstack project list | awk "/ admin / { print \$2 }")
$ CONGRESS_USER=$(openstack user create --password password --project admin --email "congress@example.com" congress | awk "/ id / {print \$4 }")
$ openstack role add $ADMIN_ROLE --user $CONGRESS_USER --project $SERVICE_TENANT
$ CONGRESS_SERVICE=$(openstack service create congress --name "policy" --description "Congress Service" | awk "/ id / { print \$4 }")
$ openstack endpoint create $CONGRESS_SERVICE --region RegionOne --publicurl http://127.0.0.1:1789/ --adminurl http://127.0.0.1:1789/ --internalurl http://127.0.0.1:1789/
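To double-check that the user, service, and endpoint were created, you can list them; the exact output depends on your deployment:
.. code-block:: console
$ openstack user list | grep congress
$ openstack service list | grep policy
$ openstack endpoint list | grep policy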
Start Congress
The default behavior is to start the Congress API, Policy Engine, and
Datasource in a single node. For HAHT deployment options, please see the
:ref:`HA Overview <ha_overview>` document.
.. code-block:: console
$ sudo /usr/local/bin/congress-server --debug
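If you configured "noauth" above, a quick way to confirm that the server is listening is to query the policy listing on the default port (1789, as used for the endpoint above); with Keystone authentication, use the congress client instead. The path assumes the v1 API:
.. code-block:: console
$ curl http://127.0.0.1:1789/v1/policies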
Configure datasource drivers
First make sure you have the Congress client (project python-congressclient)
installed. Run this command for every service that Congress will poll for
data. Please note that the service name $SERVICE should match the ID of the
datasource driver, e.g. "neutronv2" for Neutron and "glancev2" for Glance;
$OS_USERNAME, $OS_TENANT_NAME, $OS_PASSWORD and $SERVICE_HOST are used to
configure the related datasource driver so that Congress knows how to
talk with the service.
.. code-block:: console
$ openstack congress datasource create $SERVICE "$SERVICE" --config username=$OS_USERNAME --config tenant_name=$OS_TENANT_NAME --config password=$OS_PASSWORD --config auth_url=http://$SERVICE_HOST:5000/v2.0
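For example, registering the Neutron and Nova datasources could look like the following; the credential values shown here are purely illustrative:
.. code-block:: console
$ openstack congress datasource create neutronv2 "neutronv2" --config username=admin --config tenant_name=admin --config password=password --config auth_url=http://127.0.0.1:5000/v2.0
$ openstack congress datasource create nova "nova" --config username=admin --config tenant_name=admin --config password=password --config auth_url=http://127.0.0.1:5000/v2.0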
Install test harness
.. code-block:: console
$ sudo pip install 'tox<1.7'
Run unit tests
.. code-block:: console
$ tox -epy27
Read the HTML documentation
Install python-sphinx and the oslosphinx extension if missing.
.. code-block:: console
$ sudo pip install sphinx
$ sudo pip install oslosphinx
Build the docs
.. code-block:: console
$ make docs
Open doc/html/index.html in a browser
@ -267,42 +302,57 @@ Read the HTML documentation::
Here are the instructions for upgrading to a new release of the
Congress server.
1. Stop the Congress server.
2. Update the Congress git repo
.. code-block:: console
$ cd /path/to/congress
$ git fetch origin
3. Check out the release you are interested in, say Mitaka. Note that this
step will not succeed if you have any uncommitted changes in the repo.
.. code-block:: console
$ git checkout origin/stable/mitaka
If you have changes committed locally that are not merged into the public
repository, you now need to cherry-pick those changes onto the new
branch.
4. Install dependencies
.. code-block:: console
$ sudo pip install .
5. Install source code
.. code-block:: console
$ sudo python setup.py install
6. Migrate the database schema
.. code-block:: console
$ sudo congress-db-manage --config-file /etc/congress/congress.conf upgrade head
7. (optional) Check if the configuration options you are currently using are
still supported and whether there are any new configuration options you
would like to use. To see the current list of configuration options,
use the following command, which will create a sample configuration file
in ``etc/congress.conf.sample`` for you to examine.
.. code-block:: console
$ tox -egenconfig
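A simple way to compare your running configuration against the freshly generated sample (paths assume the defaults used in this guide):
.. code-block:: console
$ diff /etc/congress/congress.conf etc/congress.conf.sample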
8. Restart Congress, e.g.
.. code-block:: console
$ sudo /usr/local/bin/congress-server --debug

View File

@ -4,7 +4,7 @@
=============
HA Deployment
=============
Overview
@ -13,7 +13,8 @@ Overview
This section shows how to deploy Congress with High Availability (HA).
For HA, Congress is divided into two parts. The first part is the API and
PolicyEngine Node, which is replicated in an active-active style. The other
part is the DataSource Node, which is deployed in a warm-standby style. Please
see the :ref:`HA Overview <ha_overview>` for details.
.. code-block:: text
@ -43,10 +44,22 @@ DataSource Node which is deployed with warm-stanby style.
HA for API and Policy Engine Node
---------------------------------
New config settings for selecting the DSE node type:
- N (two or more) PE+API nodes
.. code-block:: console
$ python /usr/local/bin/congress-server --api --policy_engine --node_id=<api_unique_id>
- One DSD node
.. code-block:: console
$ python /usr/local/bin/congress-server --datasources --node_id=<datasource_unique_id>
HA for DataSource Node
----------------------
Nodes on which a DataSourceDriver runs take the warm-standby style. Congress assumes a
cluster manager handles the active-standby cluster. In this document, we describe

doc/source/ha-overview.rst (new file, 182 lines added)
View File

@ -0,0 +1,182 @@
.. include:: aliases.rst
.. _ha_overview:
===========
HA Overview
===========
Some applications require Congress to be highly available. Some
applications require a Congress Policy Engine (PE) to handle a high volume of
queries. This guide describes Congress support for High Availability (HA) and
High Throughput (HT) deployment.
Please see the `OpenStack High Availability Guide`__ for details on how to
install and configure OpenStack for High Availability.
__ http://docs.openstack.org/ha-guide/index.html
HA Types
========
Warm Standby
~~~~~~~~~~~~
Warm Standby is when a software component is installed and available on the
secondary node. The secondary node is up and running. In the case of a
failure on the primary node, the software component is started on the
secondary node. This process is usually automated using a cluster manager.
Data is regularly mirrored to the secondary system using disk-based replication
or shared disk. This generally provides a recovery time of a few minutes.
Active-Active (Load-Balanced)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In this method, both the primary and secondary systems are active and
processing requests in parallel. Data replication happens through software
capabilities and would be bi-directional. This generally provides a recovery
time that is instantaneous.
Congress HAHT
=============
Congress provides Active-Active for the Policy Engine and Warm Standby for
the Datasource Drivers.
Run N instances of the Congress Policy Engine in active-active
configuration, so both the primary and secondary systems are active
and processing requests in parallel.
One Datasource Driver (DSD) per physical datasource, publishing data on
oslo-messaging to all policy engines.
.. code-block:: text
+-------------------------------------+ +--------------+
| Load Balancer (eg. HAProxy) | <----+ Push client |
+----+-------------+-------------+----+ +--------------+
| | |
PE | PE | PE | all+DSDs node
+---------+ +---------+ +---------+ +-----------------+
| +-----+ | | +-----+ | | +-----+ | | +-----+ +-----+ |
| | API | | | | API | | | | API | | | | DSD | | DSD | |
| +-----+ | | +-----+ | | +-----+ | | +-----+ +-----+ |
| +-----+ | | +-----+ | | +-----+ | | +-----+ +-----+ |
| | PE | | | | PE | | | | PE | | | | DSD | | DSD | |
| +-----+ | | +-----+ | | +-----+ | | +-----+ +-----+ |
+---------+ +---------+ +---------+ +--------+--------+
| | | |
| | | |
+--+----------+-------------+--------+--------+
| |
| |
+-------+----+ +------------------------+-----------------+
| Oslo Msg | | DBs (policy, config, push data, exec log)|
+------------+ +------------------------------------------+
- Downtime: < 1s for queries, ~2s for reactive enforcement
- Deployment considerations:
- Cluster manager (e.g. Pacemaker + Corosync) can be used to manage warm
standby
- Does not require global leader election
- Performance considerations:
- Multi-process, multi-node query throughput
- No redundant data-pulling load on datasources
- DSD node separate from PE nodes, allowing high-load DSDs to operate more
smoothly without affecting PE performance.
- PE nodes are symmetric in configuration, making it easy to load balance
evenly.
Details
~~~~~~~
- Datasource Drivers (DSDs):
- One datasource driver per physical datasource
- All DSDs run in a single DSE node (process)
- Push DSDs: optionally persist data in push data DB, so a new snapshot
can be obtained whenever needed
- Warm Standby:
- Only one set of DSDs running at a given time; backup instances ready
to launch
- For pull DSDs, warm standby is most appropriate because warm startup
time is low (seconds) relative to frequency of data pulls
- For push DSDs, warm standby is generally sufficient except for use cases
that demand sub-second latency even during a failover
- Policy Engine (PE):
- Replicate policy engine in active-active configuration.
- Policy synchronized across PE instances via Policy DB
- Every instance subscribes to the same data on oslo-messaging
- Reactive Enforcement:
All PE instances initiate reactive policy actions, but each DSD locally
selects a leader to listen to. The DSD ignores execution requests
initiated by all other PE instances.
- Every PE instance computes the required reactive enforcement actions and
initiates the corresponding execution requests over oslo-messaging
- Each DSD locally picks a PE instance as leader (say the first instance
the DSD hears from in the asymmetric node deployment, or the PE
instance on the same node as the DSD in a symmetric node deployment) and
executes only requests from that PE
- If heartbeat contact is lost with the leader, the DSD selects a new
leader
- Each PE instance is unaware of whether it is a leader
- Node Configurations:
- Congress supports the Two Node-Types (API+PE nodes, all-DSDs) node
configuration because it gives reasonable support for high-load DSDs
while keeping the deployment complexities low.
- Local Leader for Action Execution:
- Local Leader: every PE instance sends action-execution requests, but
each receiving DSD locally picks a "leader" to listen to
- Because there is a single active DSD for a given data source,
it is a natural spot to locally choose a "leader" among the PE instances
sending reactive enforcement action execution requests. Congress
supports the local leader style because it avoids the deployment
complexities associated with global leader election. Furthermore,
because all PE instances perform reactive enforcement and send action
execution requests, the redundancy opens up the possibility for zero
disruption to reactive enforcement when a PE instance fails.
- API:
- Each node has an active API service
- Each API service routes requests for the PE to its associated intranode PE
- Requests for any other service (e.g. get data source status) are routed to
the Datasource and/or Policy Engine, which will be fielded by some active
instance of the service on some node
- Load balancer:
- Layer 7 load balancer (e.g. HAProxy) distributes incoming API calls among
the nodes (each running an API service).
- The load balancer can optionally be configured to use sticky sessions to pin
each API caller to a particular node, which avoids the experience of going
back in time; a sketch is given after this list.
- External components (load balancer, DBs, and oslo messaging bus) can be made
highly available using standard solutions (e.g. clustered LB, Galera MySQL
cluster, HA rabbitMQ)
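As an illustration only (not part of the Congress distribution), a minimal HAProxy configuration for three PE+API nodes with cookie-based sticky sessions might look like the sketch below; the node addresses and the 1789 port are assumptions to adapt to your deployment.
.. code-block:: text
frontend congress-api
    bind *:1789
    default_backend congress-pe
backend congress-pe
    balance roundrobin
    cookie CONGRESS_NODE insert indirect nocache
    server pe1 192.168.0.11:1789 check cookie pe1
    server pe2 192.168.0.12:1789 check cookie pe2
    server pe3 192.168.0.13:1789 check cookie pe3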
Performance Impact
==================
- In a single-node deployment, there is generally no performance impact.
- Increased latency due to network communication required by multi-node
deployment
- Increased reactive enforcement latency if action executions are persistently
logged to facilitate smoother failover
- PE replication can achieve greater query throughput
End User Impact
===============
Different PE instances may be out-of-sync in their data and policies (eventual
consistency). The issue is generally made transparent to the end user by
making each user sticky to a particular PE instance. But if a PE instance
goes down, the end user reaches a different instance and may experience
out-of-sync artifacts.
Installation
============
Please see :ref:`HA Deployment <ha_deployment>` for details.

View File

@ -19,6 +19,7 @@ Contents:
enforcement
api
config
ha-overview
ha-deployment
contributing