Merge pull request #120 from bodepd/data_model_docs

Improve data model docs.
2013-10-04 14:33:19 -07:00
parent 93b9df1798 341426d6b6
commit 50b385157b
1 changed files with 261 additions and 76 deletions
--- a/data/README.md
+++ b/data/README.md
@@ -1,16 +1,125 @@
-# what do all of these folders mean?
+# Deployment modeling data

-## config.yaml
+This folder contains data that is used to express openstack
+deployment as data.

-config.yaml is full of global variables that are used
-to determine what 'type' of reference architecture you wish
-to deploy.
+Technically, it is implemented as a custom [hiera](http://docs.puppetlabs.com/hiera/1/index.html) [backend](http://docs.puppetlabs.com/hiera/1/custom_backends.html),
+and an [external node classifier](http://docs.puppetlabs.com/guides/external_nodes.html) (more specifically, as a custom node terminus, but for our purposes here, they can be considered the same thing).

-These variables are used by hiera to determine both what
-classes are included, and are also used to drive the hierarchical
-lookup.
+It is critical to understand the following in order to understand this model:

-The variables are:
+ how hiera works
+ the fact this uses a custom hiera backend (so it is not quite the same
+  as standard hiera)
+ what an ENC is
+
+This solution ONLY works with Puppet 3.x or greater and relies on the data-bindings
+system.
+
+Below are the links to the custom ENC and custom hiera backend:
+
+* https://github.com/bodepd/scenario\_node\_terminus
+* https://github.com/bodepd/hiera\_data\_mapper
+
+## why data?
+
+This is intended to replace the stackforge/puppet-openstack class
+as well as other tools that look at composing the core stackforge modules
+into openstack roles.
+
+The puppet-openstack (and other models that I looked at) suffered
+from the following problems:
+
+### The roles were not composable enough.
+
+Multiple reference architectures for openstack (ie: all\_in\_one,
+compute\_controller) should all be composed of the same building
+blocks in order to reduce the total amount of code needed to express
+these deployment scenarios.
+
+For example, an all\_in\_one deployment should be expressed as a
+compute + controller + network\_controller.
+
+The previous model did not support this because of its use of parameterized
+classes and inherent issues regarding duplicate resource definitions, and
+the need for all parameters to be provided in a single declaration of that
+class.
+
+### Data forwarding was too much work
+
+Explicit data-forwarding in the class hierarchies
+(ie: openstack::controller -> openstack::nova::controller -> nova)
+was too much work and too easy to mess up. Adding a single parameter to the
+core model sometimes required adding it to two or three different class interfaces.
+
+In fact, a large percentage of all of the pull requests in the project were to
+add parameters for forwarding to the openstack class.
+
+### Puppet manifests are not introspectable enough
+
+As we move towards the creation of user interfaces that drive the
+configuration of multiple different reference architectures, we need
+a way to inspect the current state of our deployment model to understand
+what input needs to be provided by the end user.
+
+For example:
+
+  The data provided to deploy a 3 role model: (compute/controller/network controller)
+  is different from the data used to deploy a 2 role model (compute/controller)
+
+To make matters even a bit more complicated:
+
+  Each of those models also supports a large number of configurable backends
+  that each require their own specific configurations. Even with a 2 role scenario,
+  you could select ceph, or swift, or file as the glance backend. Each of these
+  selections results in different user configurations.
+
+This issue specifically led to the creation of the model as data (as
+opposed to something more like roles/profiles: http://www.craigdunn.org/2012/05/239/).
+
+Puppet provides a great way to express interfaces for encapsulating system resources,
+but that content is only designed to be consumed by Puppet's internal lexer and parser;
+it is not designed to be introspectable by other tools. In order to support the selection
+of both multiple reference architectures as well as multiple backends, we need to
+be able to programmatically understand the selected classes to provide the user with the
+correct interface.
+
+## what is used to express the data
+
+All of the data used to express the various reference architectures can be found
+in this project's data directory. This section explains the various configuration
+files and directories that contain data.
+
+### Data driven by the custom ENC
+
+When a node checks in with Puppet, the master invokes the scenario node terminus.
+This call is made to provide Puppet with two pieces of information that it needs
+to compile a catalog for that node:
+
+ what classes should be included for that node
+ what top scope parameters should be set that can affect the hiera lookups
+
+In order for this to work with the data model, you need to install the following
+module:
+
+    https://github.com/bodepd/scenario\_node\_terminus
+    (this module is automatically installed if you use the Puppetfile that comes
+    with this project)
+
+And add the following configuration to your puppet.conf file:
+
+  node\_terminus=scenario
+  (this is already configured if you bootstrap with setup.pp)
+
+The following list of data is used to drive that classification process:
+
+#### config.yaml
+
+For the general use case, most of the information in config.yaml
+can be ignored. Most of it is used for provisioning of virtual machine
+instances for the CD part of this work.
+
+There is one very important setting in config.yaml called scenario.

  *scenario* scenario is used to select the specific references architecture
  that you wish to deploy. It is used to select the roles for that scenario
@@ -18,6 +127,32 @@ The variables are:
  nodes/<scenario>/yaml (for deployment of CI). It is also used as a hiera
  lookup and data mapping hierarchy.

+All data set in this file are passed on to Puppet as global variables.
+
+#### global\_hiera\_params
+
+This directory is used to specify the global variables that can be used
+to affect the hierarchical overrides that will be used to determine both
+the classes contained in a scenario's roles as well as the hiera overrides
+for both data mappings and the regular yaml hierarchy.
+
+The selection of the global\_hiera\_params is driven by hiera using the following
+hierarchy:
+
+  - global\_hiera\_params/user.yaml
+  - global\_hiera\_params/scenario/%{scenario}.yaml
+  - global\_hiera\_params/common.yaml
+
+This means that default globals are stored in common.yaml, scenarios can
+provide their own defaults, and users can override whatever settings they
+need to.
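+
+As a minimal sketch of how these layers combine (the override value below is
+hypothetical; only the variable names and their defaults come from this
+document), suppose common.yaml carries the documented defaults:
+
+```yaml
+# global_hiera_params/common.yaml (illustrative contents)
+db_type: mysql
+rpc_type: rabbitmq
+tenant_network_type: gre
+```
+
+and a user wants a different tenant network type:
+
+```yaml
+# global_hiera_params/user.yaml (hypothetical override)
+tenant_network_type: vlan
+```
+
+Because user.yaml is consulted first, tenant\_network\_type would resolve to
+vlan, while db\_type and rpc\_type would still fall through to common.yaml.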
+
+These variables are used by hiera to determine both what
+classes are included as a part of the role lookup, and are also used to drive
+the hierarchical lookups of class parameters.
+
+The current supported variables are:
+
  *db_type* selects the database to use (defaults to mysql)

  *rpc_type* Selects the rpc type to use (defaults to rabbitmq)
@@ -39,14 +174,12 @@ The variables are:

  *tenant_network_type* Type of tenant network to use. (defaults to gre).

-  *password_management* selects the type of password management you wish to use.
-  This is used by the data\_mapper to determine how password values map to services.
-  Default to individual, which means that individual password are used for everything.
-
  *enabled_services* Used to select all of the services that are enabled. This is
  used to determine what endpoints and databases should be configured.

-## scenarios
+All data set in this file are passed on to Puppet as global variables.
+
+### scenarios

 Scenarios are used to describe the reference architecture that you wish to
 deploy.
@@ -60,23 +193,122 @@ file:
  data/scenario/<scenario>.yaml

 This file defines what roles exists for a specific scenario, and which classes
-should be assigned to those roles.
+and classgroups should be assigned to nodes of that role.
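+
+A rough sketch of what such a file might contain is shown here. The key names
+(roles, classes, class\_groups) and every role, class, and class group name are
+assumptions made for illustration; check the scenario\_node\_terminus module for
+the actual format.
+
+```yaml
+# data/scenario/<scenario>.yaml (assumed layout, hypothetical names)
+roles:
+  controller:
+    classes:
+      - example::controller_class
+    class_groups:
+      - example_controller_group
+  compute:
+    class_groups:
+      - example_compute_group
+```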

-## class groups
+### class groups

 Class groups are simply a list of classes that can be referred
 to as a single name. Class groups are used to store combinations
 of classes for reuse.

-## data mappings
+## role\_mappings
+
+role\_mappings is used to map Puppet certnames to roles.
+
+## data that drives the hiera configuration
+
+After Puppet gets a list of classes and top scope parameters from the node
+terminus it begins to compile the catalog. During this process,
+every single class parameter is resolved through Puppet's data-binding system
+that came into existence in Puppet 3.x. Basically:
+
+  for every class
+    let's say foo
+  for every parameter
+    let's say param1
+  the fully qualified variable is looked up in hiera automatically:
+    hiera('foo::param1') for our example
+
+When this lookup is performed, for the data model, our default
+hiera backend is used:
+
+  https://github.com/bodepd/hiera\_data\_mapper
+
+This should be configured in your hiera config:
+
+  /etc/puppet/hiera.yaml
+
+    ---
+    :backends:
+      - data_mapper
+    :hierarchy:
+      - "hostname/%{hostname}"
+      - user
+      - jenkins
+      - user.%{scenario}
+      - user.common
+      - "cinder_backend/%{cinder_backend}"
+      - "glance_backend/%{glance_backend}"
+      - "rpc_type/%{rpc_type}"
+      - "db_type/%{db_type}"
+      - "tenant_network_type/%{tenant_network_type}"
+      - "network_type/%{network_type}"
+      - "network_plugin/%{network_plugin}"
+      - "password_management/%{password_management}"
+      - "scenario/%{scenario}"
+      - grizzly_hack
+      - common
+    :yaml:
+       :datadir: /etc/puppet/data/hiera_data
+    :data_mapper:
+       # this should be contained in a module
+       :datadir: /etc/puppet/data/data_mappings
+
+This entire hiera config is required for all of the default data to be set
+correctly. As the project matures, this default hierarchy may be subject to change.
+
+### data mappings

 Data mappings are used to express the way in which
-global variables from hiera map to individual class parameters.
+global variables map to individual class parameters.

 Previous, this was done with parameter forwarding in parameterized
-classes.
+classes. In fact, this style of parameter forwarding is one of the main
+functions of the previous openstack module.

-## hiera data
+For example, in the openstack::controller class, we implemented the
+parameter verbose which is used to set verbose for all services.
+
+    class openstack::controller(
+      $verbose = false
+    ) {
+
+      class { 'nova': verbose => $verbose }
+      class { 'glance': verbose => $verbose }
+      class { 'keystone': verbose => $verbose }
+      class { 'cinder': verbose => $verbose }
+      class { 'quantum': verbose => $verbose }
+
+    }
+
+This is a pretty concise way to express how a single data value assigns
+multiple class parameters. The problem is that it uses the parameterized
+class declaration syntax to forward this data, meaning that it is hard to
+reuse this code if you want to provide different settings.
+
+The same configuration above can be expressed with the data\_mappings as
+follows:
+
+    verbose:
+      - nova::verbose
+      - glance::verbose
+      - keystone::verbose
+      - cinder::verbose
+      - quantum::verbose
+
+For each of those variables, the data-binding system will call out to hiera when
+the classes are processed (if they are included).
+
+Example:
+
+  Puppet
+    asks hiera: "what is the value of keystone::verbose?"
+  Hiera
+    * consults data mappings (via the hierarchical lookup defined in hiera.yaml)
+    * determines that value maps to verbose
+    * performs a regular YAML lookup in the hiera\_data directory of 'verbose'
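+
+To make the last step concrete, here is a minimal sketch assuming a
+hypothetical hiera\_data/common.yaml (the file contents are illustrative, not
+taken from this project):
+
+```yaml
+# hiera_data/common.yaml (hypothetical contents)
+verbose: false
+```
+
+With the verbose data mapping shown earlier, this single key would supply
+keystone::verbose, and likewise nova::verbose, glance::verbose,
+cinder::verbose, and quantum::verbose, for whichever of those classes are
+included.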
+
+### hiera data

 hiera data is used to express what values are going to be used to
 configure your openstack services.
@@ -85,66 +317,19 @@ hiera data is used to either express global keys (that were mapped to
 class parameters in the data mappings), or fully qualified class parameter
 namespaces.

-## nodes
+NOTE: at the moment, fully qualified variables are ignored from hiera\_data
+if they were defined in the data\_mappings. This is probably a bug (b/c they should
+probably override), but this is how it works at the moment.
+
+## CI/CD specific constructs:
+
+### nodes

 Nodes are currently used to express the nodes that can be built
 in order to test deployments of various scenarios. This is currently
 used for deployments for CI.

+Casual users should be able to ignore this.
+
 We are currently performing research to see if this part of the data
 should be replaces by a HEAT template.
-
-## role\_mappings
-
-role\_mappings is used to map Puppet certnames to roles.
-
-## discovered modeling issues (please ignore, unless you are Dan :) )
-
-### Issue 1
-
-some data values map to multiple combined values:
-
-   ex: mysql\_connection => db\_name, password, host, user, type
-
-##### solutions
-
-1. accept sql\_connection from hiera for each service
-
-This is problematic b/c it will lead to data suplication, and not take advantage of
-reasonable defaults
-
-2. patch the components to accept the parts of the password and not the whole thing
-
-That may not be the only occurrence.
-
-It will have to be done in a backwards compat way
-
-3. allow the value of the lookup to be resolvable as multiple lookups (and not a single one)
-
-### Issue number 2
-
-Some data effects the static values of what needs to be passed to other services
-
-Ex: depending on the rpc\_type, the actual rpc\_backend passed to cinder is differnet.
-
-#### solutions
-
-1. add an extra parameter called rpc\_type to the class interfaces
-
-2. add rpc\_type to the global data that drives configuration, and make it a variable
-that drives the hierarchical configuration
-
-### Issue 3
-
-There is no way to have hiera drive whether or not individual components need to be installed
-
-For now, this will need to be stored as global data that contains a list of the services that
-you want to install
-
-### Issue 4
-
-where do we set assumed defaults?
-
-examples:
-  - cinder simple scheduler
-  - charset for database (can we just set this as a default for the database?)