diff --git a/doc/source/kerberos.rst b/doc/source/kerberos.rst index ae3765638a..d98c731539 100644 --- a/doc/source/kerberos.rst +++ b/doc/source/kerberos.rst @@ -15,9 +15,8 @@ At a Glance :Hosts: * kdc*.openstack.org -:Puppet: - * https://opendev.org/opendev/puppet-kerberos - * :git_file:`modules/openstack_project/manifests/kdc.pp` +:Ansible: + * :git_file:`playbooks/service-kerberos.yaml` :Projects: * http://web.mit.edu/kerberos :Bugs: @@ -30,50 +29,59 @@ At a Glance OpenStack Realm --------------- -OpenStack runs a Kerberos ``Realm`` called ``OPENSTACK.ORG``. -The realm contains a ``Key Distribution Center`` or KDC which is spread -across a master and a slave, as well as an admin server which only runs on the -master. Most of the configuration is in puppet, but initial setup and -the management of user accounts, known as ``principals``, are manual tasks. +OpenStack runs a Kerberos ``Realm`` called ``OPENSTACK.ORG``. The +realm contains a ``Key Distribution Center`` or KDC which is spread +across a primary and a replica, as well as an admin server which only +runs on the primary. + +Most of the configuration is in Ansible, but management of user +accounts, known as ``principals``, is a manual task for +administrators. Realm Creation -------------- -On the first KDC host, the admin needs to run `krb5_newrealm` by hand. Then -admin principals and host principles need to be set up. +Realm creation is exercised by the Ansible roles during testing, but +is not expected to be used in production (because we have an active +realm/database). 
-Set up host principals for slave propagation:: +The general process is: - # execute kadmin.local then run these commands - addprinc -randkey host/kdc03.openstack.org - addprinc -randkey host/kdc04.openstack.org - ktadd host/kdc03.openstack.org - ktadd host/kdc04.openstack.org + * create the new Kerberos database on the primary + * distribute the database ``stash`` file from the primary to + replicas, to allow them to decrypt the database propagated to + them. This is created from a master key kept as a secret. + * create an admin user (password saved in a file on the primary server) + * add host principals for the primary and replica servers + * create keytabs on primary and replica servers (via the admin user), + which allows them to authenticate to each other. + * set up database propagation from primary to replicas with ``kprop`` + (primary-side push) and ``kpropd`` (replica-side listen). -Copy the file `/etc/krb5.keytab` to the second kdc host. - -The puppet config sets up slave propagation scripts and cron jobs to run them. - -You will also need to create a stash file after creating a new realm. Run -`krb5_util stash` on the first kdc host. Copy the file `/etc/krb5kdc/stash` -to all other KDC servers for the krb5-kdc daemons to run. +In a disaster recovery situation, we can provision a fresh realm and +recover principals from dump files (XXX: 2020-03-11 ianw -- dump file +backup to come). .. _addprinc: Adding A User Principal ----------------------- -First, ensure the user has an entry in puppet so they have a unix +First, ensure the user has an entry in Ansible so they have a Unix shell account on our hosts. SSH access is not necessary, but keeping track of usernames and uids with account entries is necessary. -Then, add the user to Kerberos using kadmin (while authenticated as a -kerberos admin) or kadmin.local on the kdc:: +If you are already an admin, you should authenticate with ``kinit +/admin``.
Otherwise you can use the ``kadmin.local`` tool +(instead of ``kadmin``) on the primary server, which bypasses +authentication and writes to the database directly. + +Use ``kadmin`` to add the principal like so:: kadmin: addprinc $USERNAME@OPENSTACK.ORG Where `$USERNAME` is the lower-case username of their unix account in -puppet. `OPENSTACK.ORG` should be capitalized. +Ansible. `OPENSTACK.ORG` should be capitalized. If you are adding an admin principal, use `username/admin@OPENSTACK.ORG`. Admins should additionally have @@ -87,11 +95,11 @@ than a person. There is no difference in their implementation, only in conventions around how they are created and used. Service principals are created without passwords and keytab files are used instead for authentication. The program `k5start` can use keytab -files to automatically obtain kerberos credentials (and AFS if +files to automatically obtain Kerberos credentials (and AFS if needed). Add the service principal to Kerberos using kadmin (while -authenticated as a kerberos admin) or kadmin.local on the kdc:: +authenticated as a Kerberos admin) or kadmin.local on the kdc:: kadmin: addprinc -randkey service/$NAME@OPENSTACK.ORG @@ -105,6 +113,10 @@ Then save the principal's keytab:: .. warning:: Each time ``ktadd`` is run, the key is rotated and previous keytabs are invalidated. +These keytabs are then usually converted to base-64 and stored as +secret variables, and deployed to hosts via Ansible. +``mirror-update`` is probably a good example. + Resetting A User Principal's Password ------------------------------------- @@ -117,12 +129,12 @@ twice as prompted.
If you need to reset your admin principal, use No Service Outage Server Maintenance ------------------------------------ -Should you need perform maintenance on the kerberos server that requires -taking kerberos processes offline you can do this by performing your +Should you need to perform maintenance on the Kerberos server that requires +taking Kerberos processes offline, you can do this by performing your updates on a single server at a time. `kdc03.openstack.org` is our primary server and `kdc04.openstack.org` -is the hot standby. Perform your maintenance on `kdc04.openstack.org` +is the replica. Perform your maintenance on `kdc04.openstack.org` first. Then once that is done we can prepare for taking down the primary. On `kdc03.openstack.org` run:: @@ -132,7 +144,7 @@ You should see:: Database propagation to kdc04.openstack.org: SUCCEEDED -Once this is done the standby server is ready and we can take kdc03 +Once this is done the replica is ready and we can take kdc03 offline. When kdc03 is back online rerun `run-kprop.sh` to ensure everything is working again. diff --git a/inventory/service/group_vars/kerberos-kdc.yaml b/inventory/service/group_vars/kerberos-kdc.yaml new file mode 100644 index 0000000000..33fc460c88 --- /dev/null +++ b/inventory/service/group_vars/kerberos-kdc.yaml @@ -0,0 +1,9 @@ +iptables_extra_public_tcp_ports: + - 88 + - 464 + - 749 + - 754 +iptables_extra_public_udp_ports: + - 88 + - 464 + - 749 diff --git a/playbooks/roles/kerberos-kdc/README.rst b/playbooks/roles/kerberos-kdc/README.rst new file mode 100644 index 0000000000..5791aecbf0 --- /dev/null +++ b/playbooks/roles/kerberos-kdc/README.rst @@ -0,0 +1,27 @@ +Configure a Kerberos KDC server + +All KDC servers (primary and replicas) should be in a common +``kerberos-kdc`` group that defines ``kerberos_kdc_realm`` and +``kerberos_kdc_master_key``. + +The ``kerberos-kdc-primary`` group should have a single primary KDC +host.
It will be configured to replicate its database to hosts in +the ``kerberos-kdc-replica`` group. + +Hosts in the ``kerberos-kdc-replica`` group will be configured to +receive updates from the ``kerberos-kdc-primary`` host. + +The role should be run twice; once limited to the primary group and +then a second time limited to the replica group. + +**Role Variables** + + +.. zuul:rolevar:: kerberos_kdc_realm + + The realm for all KDC servers. + +.. zuul:rolevar:: kerberos_kdc_master_key + + The master key written into the *stash* file for each KDC, which + allows them to decrypt the database. diff --git a/playbooks/roles/kerberos-kdc/files/kadm5.acl b/playbooks/roles/kerberos-kdc/files/kadm5.acl new file mode 100644 index 0000000000..5ad0e1e54f --- /dev/null +++ b/playbooks/roles/kerberos-kdc/files/kadm5.acl @@ -0,0 +1,6 @@ +# This file is the access control list for krb5 administration. +# When this file is edited run /etc/init.d/krb5-admin-server restart to activate +# One common way to set up Kerberos administration is to give any principal +# ending in /admin full administrative rights.
+# To enable this, uncomment the following line: +*/admin * diff --git a/playbooks/roles/kerberos-kdc/files/krb5-kpropd.service b/playbooks/roles/kerberos-kdc/files/krb5-kpropd.service new file mode 100644 index 0000000000..91a9759cee --- /dev/null +++ b/playbooks/roles/kerberos-kdc/files/krb5-kpropd.service @@ -0,0 +1,14 @@ +[Unit] +Description=Kerberos 5 replica KDC update server + +[Service] +ExecReload=/bin/kill -HUP $MAINPID +EnvironmentFile=-/etc/default/krb5-kpropd +ExecStart=/usr/sbin/kpropd -D $DAEMON_ARGS +InaccessibleDirectories=/etc/ssh /etc/ssl/private /root +ReadOnlyDirectories=/ +ReadWriteDirectories=/var/tmp /tmp /var/lib/krb5kdc /var/run /run +CapabilityBoundingSet=CAP_NET_BIND_SERVICE + +[Install] +WantedBy=multi-user.target diff --git a/playbooks/roles/kerberos-kdc/tasks/main.yaml b/playbooks/roles/kerberos-kdc/tasks/main.yaml new file mode 100644 index 0000000000..076c1cede5 --- /dev/null +++ b/playbooks/roles/kerberos-kdc/tasks/main.yaml @@ -0,0 +1,32 @@ +- name: Install packages + package: + name: + - krb5-kdc + state: present + +- name: Ensure directories + file: + path: '{{ item }}' + state: directory + mode: 0755 + owner: root + group: root + loop: + - /etc/krb5kdc + - /var/krb5kdc + +- name: Install KDC config + template: + src: 'kdc.conf.j2' + dest: '/etc/krb5kdc/kdc.conf' + mode: 0644 + owner: root + group: root + +- name: Copy kadm5.acl + copy: + src: kadm5.acl + dest: '/etc/krb5kdc/kadm5.acl' + mode: 0644 + owner: root + group: root diff --git a/playbooks/roles/kerberos-kdc/tasks/primary.yaml b/playbooks/roles/kerberos-kdc/tasks/primary.yaml new file mode 100644 index 0000000000..199f50890e --- /dev/null +++ b/playbooks/roles/kerberos-kdc/tasks/primary.yaml @@ -0,0 +1,94 @@ +- name: Install packages + package: + name: + - krb5-admin-server + state: present + +# Note the following is not really for production, where we already +# have a database setup. It is exercised by testing however.
+- name: Look for primary database + stat: + path: /var/lib/krb5kdc/principal + register: _db_created + +- name: Setup clean primary + when: not _db_created.stat.exists + block: + + - name: Setup primary db + shell: | + yes {{ kerberos_kdc_master_key }} | kdb5_util create -r {{ kerberos_kdc_realm }} -s + + - name: Generate and save admin principal password + copy: + dest: '/etc/krb5kdc/admin.passwd' + content: '{{ lookup("password", "/dev/null chars=ascii_letters,digits length=12") }}' + owner: root + group: root + mode: '0600' + + - name: Setup initial admin principal + shell: | + echo "addprinc -pw $(cat /etc/krb5kdc/admin.passwd) admin/admin@{{ kerberos_kdc_realm }}" | kadmin.local + + # https://web.mit.edu/kerberos/krb5-latest/doc/admin/install_kdc.html + # It is not strictly necessary to have the primary KDC server in + # the Kerberos database, but it can be handy if you want to be + # able to swap the primary KDC with one of the replicas. + - name: Create primary host principal and keytab + shell: + cmd: | + echo "addprinc -randkey host/{{ inventory_hostname }}" | kadmin.local + echo "ktadd host/{{ inventory_hostname }}" | kadmin.local + + - name: Create replica host principals + shell: + cmd: 'echo "addprinc -randkey host/{{ item }}" | kadmin.local' + with_inventory_hostnames: kerberos-kdc-replica + +# The stash file is used to decrypt the on-disk database. Without +# this you are prompted for the master password on daemon start. This +# needs to be distributed to the replicas so they can also open the +# database. +- name: Read and save stash file + slurp: + src: '/etc/krb5kdc/stash' + register: kerberos_kdc_stash_file_contents + +# Export this so replica servers can use this variable to authenticate +# and create keytabs for their host principals, if they need to.
+- name: Read in admin/admin password + slurp: + src: "/etc/krb5kdc/admin.passwd" + register: _admin_password +- name: Export admin password + set_fact: + kerberos_kdc_admin_password: '{{ _admin_password.content | b64decode }}' + +# kprop is what pushes the db to replicas. Set it up to run via cron +# periodically. +- name: Install kprop script + template: + src: 'run-kprop.sh.j2' + dest: '/usr/local/bin/run-kprop.sh' + mode: 0755 + owner: root + group: root + +- name: kprop cron to push db to replicas + cron: + name: kprop + minute: 15 + job: '/usr/local/bin/run-kprop.sh >/dev/null 2>&1' + +- name: start krb5-admin-server + systemd: + state: started + enabled: yes + name: krb5-admin-server + +- name: start krb5-kdc + systemd: + state: started + enabled: yes + name: krb5-kdc diff --git a/playbooks/roles/kerberos-kdc/tasks/replica.yaml b/playbooks/roles/kerberos-kdc/tasks/replica.yaml new file mode 100644 index 0000000000..493b30aba8 --- /dev/null +++ b/playbooks/roles/kerberos-kdc/tasks/replica.yaml @@ -0,0 +1,64 @@ +- name: Install packages + package: + name: + - krb5-kdc + - krb5-kpropd + state: present + +# This is the key to decrypt the database pushed by the primary +- name: Install stash file from primary + shell: + cmd: 'echo "{{ hostvars[groups["kerberos-kdc-primary"][0]]["kerberos_kdc_stash_file_contents"].content }}" | base64 -d > /etc/krb5kdc/stash' + creates: '/etc/krb5kdc/stash' + +- name: Ensure stash file permissions + file: + path: /etc/krb5kdc/stash + owner: root + group: root + mode: '0600' + +# Use the admin user to write out our host keytab +- name: Create host keytab + shell: + cmd: | + echo "ktadd host/{{ inventory_hostname }}" | kadmin -p admin/admin -w '{{ hostvars[groups["kerberos-kdc-primary"][0]]["kerberos_kdc_admin_password"] }}' + creates: '/etc/krb5.keytab' + +# This specifies servers that are allowed to send us updates; +# i.e.
the primary server +- name: Install kpropd ACL + template: + src: 'kpropd.acl.j2' + dest: '/etc/krb5kdc/kpropd.acl' + mode: 0644 + owner: root + group: root + +- name: Install kpropd service + copy: + src: krb5-kpropd.service + dest: /etc/systemd/system/krb5-kpropd.service + mode: 0644 + owner: root + group: root + register: _kpropd_service_installed + +- name: Reload systemd + systemd: + daemon_reload: yes + when: _kpropd_service_installed.changed + +- name: Ensure kpropd running + systemd: + state: started + name: krb5-kpropd + enabled: yes + +# Note we can't start until replicas are distributed; the main +# service-kerberos.yaml playbook handles this. +- name: Ensure krb5-kdc is enabled + systemd: + name: krb5-kdc + enabled: yes + masked: no diff --git a/playbooks/roles/kerberos-kdc/templates/kdc.conf.j2 b/playbooks/roles/kerberos-kdc/templates/kdc.conf.j2 new file mode 100644 index 0000000000..c03c1c1d51 --- /dev/null +++ b/playbooks/roles/kerberos-kdc/templates/kdc.conf.j2 @@ -0,0 +1,16 @@ +[kdcdefaults] + kdc_ports = 750,88 + +[realms] + {{ kerberos_kdc_realm }} = { + database_name = /var/lib/krb5kdc/principal + admin_keytab = FILE:/etc/krb5kdc/kadm5.keytab + acl_file = /etc/krb5kdc/kadm5.acl + key_stash_file = /etc/krb5kdc/stash + kdc_ports = 750,88 + max_life = 10h 0m 0s + max_renewable_life = 7d 0h 0m 0s + master_key_type = aes256-cts + supported_enctypes = aes256-cts:normal + default_principal_flags = +preauth + } diff --git a/playbooks/roles/kerberos-kdc/templates/kpropd.acl.j2 b/playbooks/roles/kerberos-kdc/templates/kpropd.acl.j2 new file mode 100644 index 0000000000..d29224f8e8 --- /dev/null +++ b/playbooks/roles/kerberos-kdc/templates/kpropd.acl.j2 @@ -0,0 +1,3 @@ +{% for kdc in groups["kerberos-kdc-primary"] %} +host/{{ kdc }}@{{ kerberos_kdc_realm }} +{% endfor %} diff --git a/playbooks/roles/kerberos-kdc/templates/run-kprop.sh.j2 b/playbooks/roles/kerberos-kdc/templates/run-kprop.sh.j2 new file mode 100644 index 0000000000..d08dec862e --- 
/dev/null +++ b/playbooks/roles/kerberos-kdc/templates/run-kprop.sh.j2 @@ -0,0 +1,7 @@ +#!/bin/sh +kdclist="{% for s in groups['kerberos-kdc-replica'] %}{{ s }} {% endfor %}" +kdb5_util dump /var/krb5kdc/slave_datatrans +for kdc in $kdclist +do + kprop -f /var/krb5kdc/slave_datatrans $kdc +done diff --git a/playbooks/service-kerberos.yaml b/playbooks/service-kerberos.yaml new file mode 100644 index 0000000000..9ea74ee725 --- /dev/null +++ b/playbooks/service-kerberos.yaml @@ -0,0 +1,47 @@ +# Setting up a fresh realm, as done in CI, is a five step process of: +# +# 1. setup common packages/config +# 2. setup primary; create db, setup kprop pushes, start services. +# 3. configure replica to accept db updates via kpropd +# 4. do a db replication +# 5. start replica daemons now they have a db copy +# +# In production this is largely a no-op just ensuring things are +# running. + +- hosts: "kerberos-kdc:!disabled" + name: "Configure common KDC components" + roles: + - kerberos-client + - kerberos-kdc + +- hosts: "kerberos-kdc-primary:!disabled" + name: "Configure Kerberos Primary" + tasks: + - name: Configure primary KDC + include_role: + name: kerberos-kdc + tasks_from: primary + +- hosts: "kerberos-kdc-replica:!disabled" + name: "Configure Kerberos Replicas" + tasks: + - name: Configure replica KDC + include_role: + name: kerberos-kdc + tasks_from: replica + +- hosts: "kerberos-kdc-primary:!disabled" + name: "Run replication" + tasks: + - name: Run a DB replication + shell: | + /usr/local/bin/run-kprop.sh + +- hosts: "kerberos-kdc-replica:!disabled" + name: "Ensure krb5-kdc running" + tasks: + - name: Start krb5-kdc + systemd: + name: krb5-kdc + state: started diff --git a/playbooks/test-kerberos.yaml b/playbooks/test-kerberos.yaml new file mode 100644 index 0000000000..89c95c30a5 --- /dev/null +++ b/playbooks/test-kerberos.yaml @@ -0,0 +1,7 @@ +- hosts: "kdc-primary.opendev.org" + tasks: + + - name: Run kinit + shell: | + cat /etc/krb5kdc/admin.passwd | kinit 
admin/admin + diff --git a/playbooks/zuul/run-base.yaml b/playbooks/zuul/run-base.yaml index c9a3fbe384..319361fb30 100644 --- a/playbooks/zuul/run-base.yaml +++ b/playbooks/zuul/run-base.yaml @@ -58,6 +58,7 @@ - group_vars/registry.yaml - group_vars/gitea.yaml - group_vars/gitea-lb.yaml + - group_vars/kerberos-kdc.yaml - group_vars/letsencrypt.yaml - group_vars/meetpad.yaml - group_vars/jvb.yaml diff --git a/playbooks/zuul/templates/gate-groups.yaml.j2 b/playbooks/zuul/templates/gate-groups.yaml.j2 index c420cc6c77..2e00ecdc5b 100644 --- a/playbooks/zuul/templates/gate-groups.yaml.j2 +++ b/playbooks/zuul/templates/gate-groups.yaml.j2 @@ -27,3 +27,11 @@ groups: borg-backup: - borg-backup-test01.opendev.org - borg-backup-test02.opendev.org + + kerberos-kdc: + - kdc-primary.opendev.org + - kdc-replica.opendev.org + kerberos-kdc-primary: + - kdc-primary.opendev.org + kerberos-kdc-replica: + - kdc-replica.opendev.org diff --git a/playbooks/zuul/templates/group_vars/kerberos-kdc.yaml.j2 b/playbooks/zuul/templates/group_vars/kerberos-kdc.yaml.j2 new file mode 100644 index 0000000000..4ab1f462bb --- /dev/null +++ b/playbooks/zuul/templates/group_vars/kerberos-kdc.yaml.j2 @@ -0,0 +1,10 @@ +# global server settings +kerberos_kdc_realm: OPENDEV.CI +kerberos_kdc_master_key: masterkey123 + +# client settings +kerberos_realm: OPENDEV.CI +kerberos_admin_server: kdc-primary.opendev.org +kerberos_kdcs: + - kdc-primary.opendev.org + - kdc-replica.opendev.org diff --git a/zuul.d/infra-prod.yaml b/zuul.d/infra-prod.yaml index d5b1698342..f4b4f0fcbc 100644 --- a/zuul.d/infra-prod.yaml +++ b/zuul.d/infra-prod.yaml @@ -593,6 +593,23 @@ - modules/ - manifests/ +- job: + name: infra-prod-service-kerberos + parent: infra-prod-service-base + description: Run Kerberos playbook. 
+ vars: + playbook_name: service-kerberos.yaml + infra_prod_ansible_forks: 1 + required-projects: + - opendev/system-config + files: + - inventory/ + - playbooks/service-kerberos.yaml + - inventory/service/group_vars/kerberos-kdc.yaml + - playbooks/roles/kerberos-kdc/ + - roles/kerberos-client/ + - playbooks/roles/iptables/ + - job: name: infra-prod-remote-puppet-else parent: infra-prod-service-base diff --git a/zuul.d/project.yaml b/zuul.d/project.yaml index c490e26fc4..f953ed1477 100644 --- a/zuul.d/project.yaml +++ b/zuul.d/project.yaml @@ -25,6 +25,7 @@ - name: opendev-buildset-registry - name: system-config-build-image-hound soft: true + - system-config-run-kerberos - system-config-run-lists - system-config-run-nodepool - system-config-run-meetpad: @@ -131,6 +132,7 @@ - name: opendev-buildset-registry - name: system-config-upload-image-hound soft: true + - system-config-run-kerberos - system-config-run-lists - system-config-run-nodepool - system-config-run-meetpad: @@ -253,6 +255,7 @@ soft: true - infra-prod-service-bridge - infra-prod-service-gitea-lb + - infra-prod-service-kerberos - infra-prod-service-nameserver - infra-prod-service-nodepool - infra-prod-service-codesearch: @@ -320,6 +323,7 @@ - infra-prod-service-nameserver - infra-prod-service-etherpad - infra-prod-service-meetpad + - infra-prod-service-kerberos - infra-prod-service-mirror-update - infra-prod-service-mirror - infra-prod-service-static diff --git a/zuul.d/system-config-run.yaml b/zuul.d/system-config-run.yaml index 508a8a0f5a..d3d813636f 100644 --- a/zuul.d/system-config-run.yaml +++ b/zuul.d/system-config-run.yaml @@ -919,3 +919,35 @@ - testinfra/test_refstack.py # If we rebuild the image, we want to run this job as well. 
- docker/refstack/.* + +- job: + name: system-config-run-kerberos + parent: system-config-run + ansible-version: 2.9 + description: | + Run the playbook for kerberos servers + timeout: 3600 + nodeset: + nodes: + - name: bridge.openstack.org + label: ubuntu-bionic + - name: kdc-primary.opendev.org + label: ubuntu-focal + - name: kdc-replica.opendev.org + label: ubuntu-focal + host-vars: + kdc-primary.opendev.org: + host_copy_output: + '/etc/krb5kdc/': logs + '/var/krb5kdc/': logs + kdc-replica.opendev.org: + host_copy_output: + '/etc/krb5kdc/': logs + '/var/krb5kdc/': logs + vars: + run_playbooks: + - playbooks/service-kerberos.yaml + run_test_playbook: playbooks/test-kerberos.yaml + files: + - playbooks/bridge.yaml + - playbooks/roles/kerberos-kdc/
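The ``run-kprop.sh.j2`` template in this change calls ``kprop`` once per replica without checking its exit status, and the cron entry discards all output, so a failed push would go unnoticed. A minimal sketch of a variant that reports failures is below. The function name ``propagate_all`` and the ``KPROP`` override variable are hypothetical and not part of this change.

```shell
#!/bin/sh
# Hypothetical variant of run-kprop.sh; not part of the change above.
# KPROP may be overridden (for example in tests); it defaults to kprop.
KPROP="${KPROP:-kprop}"

# propagate_all DUMPFILE HOST...
# Push DUMPFILE to every HOST; return non-zero if any push failed.
propagate_all() {
    dumpfile="$1"
    shift
    failed=0
    for kdc in "$@"; do
        if ! "$KPROP" -f "$dumpfile" "$kdc"; then
            echo "propagation to $kdc failed" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

As in the template, a caller would first run ``kdb5_util dump /var/krb5kdc/slave_datatrans`` and then pass that file and the replica hostnames to ``propagate_all``. The cron job would need to stop redirecting output to ``/dev/null`` for the messages to be seen.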