.. include:: aliases.rst

.. _enforcement:

==========================
Monitoring and Enforcement
==========================

Congress is given two inputs: the other cloud
services in the datacenter and a policy describing the desired state of those
services. Congress does two things with those inputs: monitoring and
enforcement. *Monitoring* means passively comparing the actual state of the
other cloud services with the desired state (i.e. policy) and flagging
mismatches. *Enforcement* means actively working
to ensure that the actual state of the other cloud services is also a desired
state (i.e. that the other services obey policy).

1. Monitoring
=============
Recall from :ref:`Policy <policy>` that policy violations are represented with the
table *error*. To ask Congress for a list of all policy violations, we
simply ask it for the contents of the *error* table.

For example, recall our policy from :ref:`Policy <policy>`: each Neutron port has at
most one IP address. For that policy, the *error* table has one row for
each Neutron port that has more than one IP address. Each of those rows
specifies the UUID of the port and two different IP addresses. So if we
had the following mapping of Neutron ports to IP addresses:

====================================== ==========
ID                                     IP
====================================== ==========
"66dafde0-a49c-11e3-be40-425861b86ab6" "10.0.0.1"
"66dafde0-a49c-11e3-be40-425861b86ab6" "10.0.0.2"
"73e31d4c-e89b-12d3-a456-426655440000" "10.0.0.3"
"73e31d4c-e89b-12d3-a456-426655440000" "10.0.0.4"
"8caead95-67d5-4f45-b01b-4082cddce425" "10.0.0.5"
====================================== ==========

the *error* table would be something like the one shown below.

====================================== ========== ==========
ID                                     IP 1       IP 2
====================================== ========== ==========
"66dafde0-a49c-11e3-be40-425861b86ab6" "10.0.0.1" "10.0.0.2"
"73e31d4c-e89b-12d3-a456-426655440000" "10.0.0.3" "10.0.0.4"
====================================== ========== ==========

The API would return this table as the following collection of Datalog facts
(encoded as a string)::

    error("66dafde0-a49c-11e3-be40-425861b86ab6", "10.0.0.1", "10.0.0.2")
    error("73e31d4c-e89b-12d3-a456-426655440000", "10.0.0.3", "10.0.0.4")
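
To make the derivation of the *error* table concrete, here is a minimal Python
sketch that computes the same rows from the port-to-IP mapping. It is an
illustration only and does not use the Congress API: the names ``PORT_IPS``,
``error_rows`` and ``as_datalog`` are hypothetical, and the sketch reports each
pair of addresses once (in sorted order), whereas a Datalog rule could also
report the reversed pair.

```python
from collections import defaultdict
from itertools import combinations

# Port-to-IP mapping copied from the example table (hypothetical variable name).
PORT_IPS = [
    ("66dafde0-a49c-11e3-be40-425861b86ab6", "10.0.0.1"),
    ("66dafde0-a49c-11e3-be40-425861b86ab6", "10.0.0.2"),
    ("73e31d4c-e89b-12d3-a456-426655440000", "10.0.0.3"),
    ("73e31d4c-e89b-12d3-a456-426655440000", "10.0.0.4"),
    ("8caead95-67d5-4f45-b01b-4082cddce425", "10.0.0.5"),
]


def error_rows(port_ips):
    """Return one (port, ip1, ip2) row per pair of distinct IPs on a port."""
    ips_by_port = defaultdict(set)
    for port, ip in port_ips:
        ips_by_port[port].add(ip)
    rows = []
    for port in sorted(ips_by_port):
        # Assumption: report each unordered pair once, smaller address first.
        for ip1, ip2 in combinations(sorted(ips_by_port[port]), 2):
            rows.append((port, ip1, ip2))
    return rows


def as_datalog(rows):
    """Encode the rows as a newline-separated string of Datalog facts."""
    return "\n".join('error("%s", "%s", "%s")' % row for row in rows)


if __name__ == "__main__":
    print(as_datalog(error_rows(PORT_IPS)))
```

The port with a single address produces no row, which is why only two facts
appear in the result.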

It is the responsibility of the client to periodically ask the server for the
contents of the error table.


2. Proactive Enforcement
========================
Often we want policy to be enforced, not just monitored. *Proactive
enforcement* is the term we use to mean preventing policy violations before
they occur. Proactive enforcement requires having enforcement points in the
cloud that stop changes before they happen. Cloud services like Nova,
Neutron, and Cinder are good examples of enforcement points. For example,
Nova could refuse to provision a VM that would cause a policy violation,
thereby proactively enforcing policy.

To enable other cloud services like Nova to check if a proposed change in the
cloud state would violate policy, the cloud service can consult Congress
using its :func:`simulate` functionality. The idea for :func:`simulate` is
that we ask Congress to answer a query after having
temporarily made some changes to data and policies. Simulation allows us to
explore the effects of proposed changes. Typically simulation is used to ask:
if I made these changes, would there be any new policy violations?
For example, provisioning a new VM might add rows to several of Nova's tables.
After receiving an API call that requests a new VM be provisioned, Nova could
ask Congress if adding those rows would create any new policy violations.
If new violations arise, Nova could refuse to provision the VM, thereby
proactively enforcing the policy.
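
The decision an enforcement point makes can be sketched in a few lines of
Python. This is an in-memory stand-in for a simulation call, not the Congress
API: ``violations`` and ``admit`` are hypothetical helpers, and the policy is
assumed to be the earlier example that each port has at most one IP address.

```python
def violations(port_ips):
    """Return the ports that have more than one IP address."""
    ips_by_port = {}
    for port, ip in port_ips:
        ips_by_port.setdefault(port, set()).add(ip)
    return {port for port, ips in ips_by_port.items() if len(ips) > 1}


def admit(current_rows, proposed_rows):
    """Return (allowed, new_violations) for adding proposed_rows.

    The change is allowed only if it creates no violation that was not
    already present in the current state.
    """
    before = violations(current_rows)
    after = violations(set(current_rows) | set(proposed_rows))
    new = after - before
    return (not new, new)


if __name__ == "__main__":
    current = {("port-a", "10.0.0.1")}
    print(admit(current, {("port-b", "10.0.0.2")}))
    print(admit(current, {("port-a", "10.0.0.9")}))
```

Comparing the violations before and after the change, rather than checking
only the final state, means that violations that already exist do not block
unrelated requests.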

In this writeup we assume you are using the python-client.

Suppose you want to know the policy violations after making the following
changes.

1. insert a row into the *nova:servers* table with ID uuid1, 2TB of disk,
   and 10GB of memory
2. delete the row from *neutron:security_groups* with the ID "uuid2" and name
   "alice_default_group"

(Here we assume the nova:servers table has columns ID, disk-size, and memory
and that neutron:security_groups has columns ID and name.)

To do a simulation from the command line, you use the following command::

    $ openstack congress policy simulate <policy-name> <query> <change-sequence> <action-policy-name>

* <policy-name>: the name of the policy in which to run the query
* <query>: a string representing the query you would like to run after
  applying the change sequence
* <change-sequence>: a string codifying a sequence of insertions and deletions
  of data and rules. Insertions are denoted by '+' and deletions by '-'
* <action-policy-name>: the name of another policy of type 'action' describing
  the effects of any actions occurring in <change-sequence>. Actions are not
  necessary and are explained later. Without actions, this argument can be
  anything (and will in the future be optional).

For our nova:servers and neutron:security_groups example, we would run the
following command to find all of the policy violations after inserting a row
into nova:servers and then deleting a row out of neutron:security_groups::

    $ openstack congress policy simulate classification \
        'error(x)' \
        'nova:servers+("uuid1", "2TB", "10 GB")
         neutron:security_groups-("uuid2", "alice_default_group")' \
        null

**More examples**

Suppose the table 'p' is a collection of key-value pairs: p(key, value).
Let's begin by creating a policy and adding some key/value pairs for 'p'::

    $ openstack congress policy create alice
    $ openstack congress policy rule create alice 'p(101, 0)'
    $ openstack congress policy rule create alice 'p(202, "abc")'
    $ openstack congress policy rule create alice 'p(302, 9)'

Let's also add a statement that says there's an error if a single key has
multiple values or if any key is assigned the value 9::

    $ openstack congress policy rule create classification \
        'error(x) :- p(x, val1), p(x, val2), not eq(val1, val2)'
    $ openstack congress policy rule create classification 'error(x) :- p(x, 9)'

Each of the following is an example of a simulation query you might want to run.

a) **Basic usage**. Simulate adding the value 5 to key 101 and ask for the contents of p::

    $ openstack congress policy simulate classification 'p(x,y)' 'p+(101, 5)' null
    p(101, 0)
    p(101, 5)
    p(202, "abc")
    p(302, 9)

b) **Error table**. Simulate adding the value 5 to key 101 and ask for the contents of error::

    $ openstack congress policy simulate classification 'error(x)' 'p+(101, 5)' null
    error(101)
    error(302)

c) **Inserts and Deletes**. Simulate adding the value 5 to key 101 and deleting 0 and ask for the contents of error::

    $ openstack congress policy simulate classification 'error(x)' \
        'p+(101, 5) p-(101, 0)' null
    error(302)

d) **Error changes**. Simulate changing the value of key 101 to 9 and query the *change* in the error table::

    $ openstack congress policy simulate classification 'error(x)' \
        'p+(101, 9) p-(101, 0)' null --delta
    error+(101)

f) **Multiple error changes**. Simulate changing 101:9, 202:9, 302:1 and query the *change* in the error table::

    $ openstack congress policy simulate classification 'error(x)' \
        'p+(101, 9) p-(101, 0) p+(202, 9) p-(202, "abc") p+(302, 1) p-(302, 9)' \
        null --delta
    error+(202)
    error+(101)
    error-(302)

g) **Order matters**. Simulate changing 101:9, 202:9, 302:1, and finally 101:15 (in that order). Then query the *change* in the error table::

    $ openstack congress policy simulate classification 'error(x)' \
        'p+(101, 9) p-(101, 0) p+(202, 9) p-(202, "abc") p+(302, 1) p-(302, 9)
         p+(101, 15) p-(101, 9)' null --delta
    error+(202)
    error-(302)

h) **Tracing**. Simulate changing 101:9 and query the *change* in the error table, while asking for a debug trace of the computation::

    $ openstack congress policy simulate classification 'error(x)' \
        'p+(101, 9) p-(101, 0)' null --delta --trace
    error+(101)
    RT : ** Simulate: Querying error(x)
    Clas : Call: error(x)
    Clas : | Call: p(x, 9)
    Clas : | Exit: p(302, 9)
    Clas : Exit: error(302)
    Clas : Redo: error(302)
    Clas : | Redo: p(302, 9)
    Clas : | Fail: p(x, 9)
    Clas : Fail: error(x)
    Clas : Found answer [error(302)]
    RT : Original result of error(x) is [error(302)]
    RT : ** Simulate: Applying sequence [set(101, 9)]
    Action: Call: action(x)
    ...

i) **Changing rules**. Simulate adding 101:5 (which results in 101 having 2 values) and deleting the rule that says each key must have at most 1 value. Then query the error table::

    $ openstack congress policy simulate classification 'error(x)' \
        'p+(101, 5) error-(x) :- p(x, val1), p(x, val2), not eq(val1, val2)' \
        null
    error(302)

The syntax for inserting/deleting rules is a bit awkward since we just affix
a + or - to the head of the rule. Ideally we would affix the +/- to the rule
as a whole. This syntactic sugar will be added in a future release.

There is also currently the limitation that you can only insert/delete rules
from the policy you are querying. And you cannot insert/delete action
description rules.


2.1 Simulation with Actions
---------------------------

The downside to the simulation functionality just described is that the
cloud service wanting to prevent policy violations would need to compute the
proposed changes in terms of the *tables* that Congress uses to represent its
internal state. Ideally a cloud service would have no idea which tables
Congress uses to represent its internals. But even if each cloud service
knew which tables Congress was using, it would still need to convert each API
call into a collection of changes on its internal tables.

For example, an API call for Nova to provision a new VM might change several
tables. An API call to Heat to provision a new app might change tables in
several different cloud services. Translating each API call exposed by a
cloud service into the collection of Congress table changes is sometimes
impractical.

In the key/value examples above, the caller needed to know the current
state of the key/value store in order to accurately describe the changes
she wanted to make. Setting the key 101 to value 9 meant knowing that its
current value was 0 so that during the simulation we could say to delete the
assignment of 101 to 0 and add the assignment of 101 to 9.
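
The bookkeeping the caller has to do by hand can be seen in a small Python
model of the key/value examples. This is a sketch under stated assumptions,
not Congress code: ``errors``, ``apply_changes`` and ``simulate_delta`` are
hypothetical names, and the two error rules are hard-coded rather than
evaluated from Datalog.

```python
def errors(p):
    """Keys in error: more than one value, or assigned the value 9."""
    values = {}
    for key, value in p:
        values.setdefault(key, set()).add(value)
    return {key for key, vals in values.items() if len(vals) > 1 or 9 in vals}


def apply_changes(p, changes):
    """Apply ('+' or '-', key, value) updates, in order, to a copy of p."""
    state = set(p)
    for op, key, value in changes:
        if op == "+":
            state.add((key, value))
        elif op == "-":
            state.discard((key, value))
        else:
            raise ValueError("unknown operation: %r" % (op,))
    return state


def simulate_delta(p, changes):
    """Return (added, removed) error keys, mirroring a --delta query."""
    before = errors(p)
    after = errors(apply_changes(p, changes))
    return after - before, before - after


# Initial facts from the examples: p(101, 0), p(202, "abc"), p(302, 9).
P = {(101, 0), (202, "abc"), (302, 9)}

if __name__ == "__main__":
    # Changing 101 to 9 requires naming the old value 0 explicitly.
    print(simulate_delta(P, [("+", 101, 9), ("-", 101, 0)]))
```

Note that every change of a key's value needs two entries, one to delete the
old row and one to insert the new row, which is exactly the burden described
above.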

It would be preferable if an external cloud service could simply ask Congress
if the API call it is about to execute is permitted by the policy.
To do that, we must tell Congress what each of those actions does in terms of
the cloud-service tables. Each of these *action descriptions* describes which
rows are inserted/deleted from which tables if the action were to be executed
in the current state of the cloud. Those action descriptions are written in
Datalog and are stored in a policy of type 'action'.

Action description policy statements are regular Datalog rules with one main
exception: they use + and - to adorn the table in the head of a rule to indicate
whether they are describing how to *insert* table rows or to *delete* table rows,
respectively.

For example, in the key-value store, we can define an action 'set(key, value)'
that deletes the current value assigned to 'key' and adds 'value' in its place.
To describe this action, we write two things: a declaration to Congress that
*set* is indeed an action using the reserved table name *action* and
rules that describe which table rows *set* inserts and which rows it deletes::

    action("set")
    p+(x,y) :- set(x,y)
    p-(x,oldy) :- set(x,y), p(x,oldy)

Note: Insertion takes precedence over deletion, which means that if a row is
both inserted and deleted by an action, the row will be inserted.

To add these rules to Congress, we create a policy of type 'action' and then
insert the rules into that policy::

    $ openstack congress policy create aliceactions --kind 'action'
    $ openstack congress policy rule create action 'action("set")'
    $ openstack congress policy rule create action 'p+(x,y) :- set(x,y)'
    $ openstack congress policy rule create action 'p-(x,oldy) :- set(x,y), p(x,oldy)'
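
The effect of the *set* action description can be modeled in Python as
follows. This is only a sketch of the semantics stated above (the two rules
plus insertion taking precedence over deletion), not the way Congress
evaluates action policies; ``set_action``, ``apply_action`` and ``run`` are
hypothetical names.

```python
def set_action(p, key, value):
    """Return the (inserts, deletes) implied by set(key, value) in state p."""
    # p+(x,y) :- set(x,y)
    inserts = {(key, value)}
    # p-(x,oldy) :- set(x,y), p(x,oldy)
    deletes = {(k, v) for (k, v) in p if k == key}
    return inserts, deletes


def apply_action(p, key, value):
    """Apply set(key, value) to a copy of p and return the new state."""
    inserts, deletes = set_action(p, key, value)
    # Deletions are applied first so that a row that is both inserted and
    # deleted ends up present (insertion takes precedence).
    return (set(p) - deletes) | inserts


def run(p, actions):
    """Apply a sequence of (key, value) set-actions in order."""
    state = set(p)
    for key, value in actions:
        state = apply_action(state, key, value)
    return state


if __name__ == "__main__":
    P = {(101, 0), (202, "abc"), (302, 9)}
    print(sorted(run(P, [(101, 5)]), key=lambda row: row[0]))
```

Because the delete set is computed from the current state, the caller no
longer needs to know the old value of a key.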

Below we illustrate how to use *set* to simplify the simulation queries
shown previously.

a) **Inserts and Deletes**. Set key 101 to value 5 and ask for the contents of error::

    $ openstack congress policy simulate classification 'error(x)' 'set(101, 5)' null
    error(302)

b) **Multiple error changes**. Simulate changing 101:9, 202:9, 302:1 and query the *change* in the error table::

    $ openstack congress policy simulate classification 'error(x)' \
        'set(101, 9) set(202, 9) set(302, 1)' null --delta
    error+(202)
    error+(101)
    error-(302)

c) **Order matters**. Simulate changing 101:9, 202:9, 302:1, and finally 101:15 (in that order). Then query the *change* in the error table::

    $ openstack congress policy simulate classification 'error(x)' \
        'set(101, 9) set(202, 9) set(302, 1) set(101, 15)' null --delta
    error+(202)
    error-(302)

d) **Mixing actions and state-changes**. Simulate changing 101:9 and adding value 7 for key 202. Then query the *change* in the error table::

    $ openstack congress policy simulate classification 'error(x)' \
        'set(101, 9) p+(202, 7)' null --delta
    error+(202)
    error+(101)


3. Manual Reactive Enforcement
==============================
Not all policies can be enforced proactively on all clouds, which means that sometimes
the cloud will violate policy. Once policy violations happen, Congress can take action
to transition the cloud back into one of the states permitted by policy. We call this
*reactive enforcement*. Currently, to reactively enforce policy,
Congress relies on people to tell it which actions to execute and when to execute them,
hence we call it *manual* reactive enforcement.

Of course, Congress tries to make it easy for people to tell it how to react to policy
violations. People write policy statements
that look almost the same as standard Datalog rules, except the rules use the modal *execute* in
the head. For more information about the Datalog language and how to write these rules,
see :ref:`Policy <policy>`.

Take a simple example that is easy and relatively safe to try out. The policy we want is
that no server should have an ACTIVE status. The policy we write tells Congress
how to react when this policy is violated: it says to ask Nova to execute ``pause()``
every time it sees a server with ACTIVE status::

    execute[nova:servers.pause(x)] :- nova:servers(id=x, status="ACTIVE")

The way this works is that every time Congress gets new data about the state of the cloud,
it figures out whether that new data causes any new rows to be added to the
``nova:servers.pause(x)`` table. (While policy writers know that nova:servers.pause isn't a table
in the usual sense, the Datalog implementation treats it like a normal table and computes
all the rows that belong to it in the usual way.) If there are new rows added to the
``nova:servers.pause(x)`` table, Congress asks Nova to execute ``servers.pause`` for every row
that was newly created. The arguments passed to ``servers.pause`` are the columns in each row.

For example, if two servers have their status set to ACTIVE, Congress receives the following
data (in actuality the data comes in with all the columns set, but here we use column references
for the sake of pedagogy)::

    nova:servers(id="66dafde0-a49c-11e3-be40-425861b86ab6", status="ACTIVE")
    nova:servers(id="73e31d4c-a49c-11e3-be40-425861b86ab6", status="ACTIVE")

Congress will then ask Nova to execute the following commands::

    servers.pause("66dafde0-a49c-11e3-be40-425861b86ab6")
    servers.pause("73e31d4c-a49c-11e3-be40-425861b86ab6")

Congress will not wait for a response from Nova. Nor will it change the status of the two servers that it
asked Nova to pause in its ``nova:servers`` table. Congress will simply execute the pause() actions and
wait for new data to arrive, just like always.
Eventually Nova executes the pause() requests, the status of
those servers changes, and Congress receives another data update::

    nova:servers(id="66dafde0-a49c-11e3-be40-425861b86ab6", status="PAUSED")
    nova:servers(id="73e31d4c-a49c-11e3-be40-425861b86ab6", status="PAUSED")

At this point, Congress updates the status of those servers in its ``nova:servers`` table to PAUSED.
But this time, Congress will find that no new rows were **added** to the ``nova:servers.pause(x)``
table and so will execute no actions. (Two rows were deleted, but Congress ignores deletions.)

In short, Congress executes actions exactly when new rows are inserted into a table augmented
with the *execute* modal.
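
The trigger-on-insert behavior can be sketched in Python as shown below. This
is a simplified model, not Congress internals: ``pause_rows`` and
``ReactiveEnforcer`` are hypothetical names, the server table is assumed to be
a dict from server ID to status, and the callback stands in for the request
sent to Nova.

```python
def pause_rows(servers):
    """Rows of the execute table: IDs of servers whose status is ACTIVE."""
    return {server_id for server_id, status in servers.items() if status == "ACTIVE"}


class ReactiveEnforcer:
    """Fire an action once for each row newly added to the execute table."""

    def __init__(self, execute):
        self._execute = execute  # callback standing in for the call to Nova
        self._rows = set()       # rows currently in the execute table

    def on_update(self, servers):
        """Process a data update and return the rows that triggered actions."""
        rows = pause_rows(servers)
        added = rows - self._rows  # only insertions trigger actions
        self._rows = rows          # deletions update the table but fire nothing
        for server_id in sorted(added):
            self._execute(server_id)
        return added


if __name__ == "__main__":
    enforcer = ReactiveEnforcer(lambda sid: print('servers.pause("%s")' % sid))
    enforcer.on_update({
        "66dafde0-a49c-11e3-be40-425861b86ab6": "ACTIVE",
        "73e31d4c-a49c-11e3-be40-425861b86ab6": "ACTIVE",
    })
    enforcer.on_update({
        "66dafde0-a49c-11e3-be40-425861b86ab6": "PAUSED",
        "73e31d4c-a49c-11e3-be40-425861b86ab6": "PAUSED",
    })
```

The second update removes both rows from the execute table, so no action is
executed, matching the walkthrough above.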