Retire repository

The Fuel repositories (in the openstack namespace) and the fuel-ccp
repositories (in the x namespace) are unused and ready to be retired.

This change removes all content from the repository and adds the usual
README file to point out that the repository is retired, following the
process from
https://docs.openstack.org/infra/manual/drivers.html#retiring-a-project

See also
http://lists.openstack.org/pipermail/openstack-discuss/2019-December/011647.html

Depends-On: https://review.opendev.org/699362
Change-Id: I7c7a84b65a9d1efd9393d8f0bdb18778e9add445
Andreas Jaeger 2019-12-18 09:45:27 +01:00
parent 56116510d2
commit eb837bc991
28 changed files with 10 additions and 2050 deletions

.gitignore

@@ -1,64 +0,0 @@
*.py[cod]
# C extensions
*.so
# Packages
*.egg*
*.egg-info
dist
build
eggs
parts
bin
var
sdist
develop-eggs
.installed.cfg
lib
lib64
# Installer logs
pip-log.txt
# Unit test / coverage reports
cover/
.coverage*
!.coveragerc
.tox
nosetests.xml
.testrepository
.venv
# Translations
*.mo
# Mr Developer
.mr.developer.cfg
.project
.pydevproject
# Complexity
output/*.html
output/*/index.html
# Sphinx
doc/build
# pbr generates these
AUTHORS
ChangeLog
# Editors
*~
.*.swp
.*sw?
# Configs
*.conf
# openrc files
openrc-*
# debug info from oslo_debug_helper
debug-*

LICENSE

@@ -1,176 +0,0 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.

README.rst

@@ -0,0 +1,10 @@
This project is no longer maintained.
The contents of this repository are still available in the Git
source code management system. To see the contents of this
repository before it reached its end of life, please check out the
previous commit with "git checkout HEAD^1".
For any further questions, please email
openstack-discuss@lists.openstack.org or join #openstack-dev on
Freenode.
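
As a quick illustration of the README's instruction above, the pre-retirement tree can be recovered locally with plain git. A minimal sketch; the clone URL and repository name (fuel-ccp-galera) are assumptions, since the diff itself does not name the repository:

    # Clone the retired repository (assumed URL/name - substitute the real one)
    git clone https://opendev.org/x/fuel-ccp-galera
    cd fuel-ccp-galera
    # HEAD^1 is the parent of the retirement commit, i.e. the last revision
    # that still contained the full source tree (per the README above)
    git checkout HEAD^1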

@@ -1,7 +0,0 @@
FROM {{ image_spec("base-tools") }}
MAINTAINER {{ maintainer }}
RUN pip install pymysql \
&& useradd --user-group -G microservices mysql
USER mysql

@@ -1,13 +0,0 @@
FROM {{ image_spec("base-tools") }}
MAINTAINER {{ maintainer }}
COPY {{ render('sources.list.debian.j2') }} /etc/apt/sources.list.d/testing.list
COPY sudoers /etc/sudoers.d/haproxy_sudoers
RUN apt-get update \
&& apt-get install -y -t testing haproxy \
&& apt-get clean \
&& chown -R haproxy: /etc/haproxy /var/lib/haproxy \
&& usermod -a -G microservices haproxy
USER haproxy

@@ -1,2 +0,0 @@
# Testing repos
deb {{ url.debian }} testing main

@@ -1 +0,0 @@
haproxy ALL=(root) NOPASSWD: /bin/chown -R haproxy\: /run/haproxy, /bin/mkdir /run/haproxy

@@ -1,6 +0,0 @@
FROM {{ image_spec("base-tools") }}
MAINTAINER {{ maintainer }}
RUN apt-get update \
&& apt-get install -y --force-yes --no-install-recommends mysql-client \
&& apt-get clean

@@ -1,18 +0,0 @@
FROM {{ image_spec("base-tools") }}
MAINTAINER {{ maintainer }}
COPY {{ render('sources.list.debian.j2') }} /etc/apt/sources.list.d/percona.list
COPY {{ render('apt_preferences.debian.j2') }} /etc/apt/preferences
COPY percona_sudoers /etc/sudoers.d/percona_sudoers
RUN apt-key adv --recv-keys --keyserver {{ url.percona.debian.keyserver }} \
{{ url.percona.debian.keyid }} \
&& apt-get update \
&& apt-get install -y --force-yes --no-install-recommends percona-xtradb-cluster-57 jq \
&& pip install --no-cache-dir pymysql \
&& chmod 750 /etc/sudoers.d \
&& chmod 440 /etc/sudoers.d/percona_sudoers \
&& usermod -a -G microservices mysql \
&& chown -R mysql: /etc/mysql
USER mysql

@@ -1,3 +0,0 @@
Package: *
Pin: origin "{{ url.percona.debian.repo | host }}"
Pin-Priority: 500

@@ -1 +0,0 @@
%microservices ALL=(root) NOPASSWD: /bin/chown mysql\:mysql /var/lib/mysql, /bin/chown mysql\:mysql /var/log/ccp/mysql

@@ -1,2 +0,0 @@
# Maria DB repo
deb [arch=amd64,i386] {{ url.percona.debian.repo }} jessie main

@@ -1,13 +0,0 @@
actions:
- name: backup-db
image: mysql-client
dependencies:
- database
parameters:
- key: db
default_value: null
command: /opt/ccp/bin/backup-db.sh
files:
- path: /opt/ccp/bin/backup-db.sh
content: backup-db.sh.j2
perm: "0700"

@@ -1,12 +0,0 @@
#!/bin/bash
set -ex
DB_NAME="{{ action_parameters.db }}"
BACKUP_FILE="/var/ccp/backup/${DB_NAME}/backup-$(date "+%Y%m%d%H%M%S").sql"
mkdir -p "$(dirname ${BACKUP_FILE})"
mysqldump {% if db.tls.enabled %} --ssl-mode REQUIRED {% endif %} -h {{ address(service.database) }} \
-uroot -p{{ db.root_password }} \
--single-transaction --routines --triggers "${DB_NAME}" > "${BACKUP_FILE}"

@@ -1 +0,0 @@
{{ security.tls.ca_cert }}

@@ -1,32 +0,0 @@
configs:
db:
slow_query_log_enabled: false
long_query_time: 1
general_log_enabled: false
max_timeout: 60
tls:
enabled: true
percona:
cluster_name: "k8scluster"
gcache_size: "1G"
sql_mode: null
cluster_size: 3
force_bootstrap:
enabled: false
node: null
port:
cont: 3306
secret_configs:
db:
root_password: "password"
percona:
xtrabackup_password: "password"
monitor_password: "password"
url:
percona:
debian:
repo: "http://repo.percona.com/apt"
keyserver: "hkp://keyserver.ubuntu.com:80"
keyid: "9334A25F8507EFA5"

@@ -1 +0,0 @@
{{ security.tls.dhparam }}

@@ -1,340 +0,0 @@
#!/usr/bin/env python
import BaseHTTPServer
import functools
import json
import logging
import os
import os.path
import socket
import time
import etcd
import pymysql.cursors
# Galera states
JOINING_STATE = 1
DONOR_DESYNCED_STATE = 2
JOINED_STATE = 3
SYNCED_STATE = 4
WAS_JOINED = False
OLD_STATE = 0
LOG_DATEFMT = "%Y-%m-%d %H:%M:%S"
LOG_FORMAT = "%(asctime)s.%(msecs)03d - %(levelname)s - %(message)s"
logging.basicConfig(format=LOG_FORMAT, datefmt=LOG_DATEFMT)
LOG = logging.getLogger(__name__)
LOG.setLevel(logging.DEBUG)
GLOBALS_PATH = "/etc/ccp/globals/globals.json"
GLOBALS_SECRETS_PATH = '/etc/ccp/global-secrets/global-secrets.json'
DATADIR = "/var/lib/mysql"
SST_FLAG = os.path.join(DATADIR, "sst_in_progress")
PID_FILE = os.path.join(DATADIR, "mysqld.pid")
HOSTNAME = socket.getfqdn()
IPADDR = socket.gethostbyname(HOSTNAME)
CA_CERT = '/opt/ccp/etc/tls/ca.pem'
MONITOR_PASSWORD = None
CLUSTER_NAME = None
ETCD_PATH = None
ETCD_HOST = None
ETCD_PORT = None
ETCD_TLS = None
def retry(f):
@functools.wraps(f)
def wrap(*args, **kwargs):
attempts = 4
delay = 1
while attempts > 1:
try:
return f(*args, **kwargs)
except etcd.EtcdException as e:
LOG.warning('Etcd is not ready: %s', str(e))
LOG.warning('Retrying in %d seconds...', delay)
time.sleep(delay)
attempts -= 1
except pymysql.OperationalError as e:
LOG.warning('Mysql is not ready: %s', str(e))
LOG.warning('Retrying in %d seconds...', delay)
time.sleep(delay)
attempts -= 1
return f(*args, **kwargs)
return wrap
def get_etcd_client():
if ETCD_TLS:
protocol = 'https'
ca_cert = CA_CERT
else:
protocol = 'http'
ca_cert = None
return etcd.Client(host=ETCD_HOST,
port=ETCD_PORT,
allow_reconnect=True,
protocol=protocol,
ca_cert=ca_cert,
read_timeout=2)
@retry
def get_mysql_client():
mysql_client = pymysql.connect(host='127.0.0.1',
port=33306,
user='monitor',
password=MONITOR_PASSWORD,
connect_timeout=1,
read_timeout=1,
cursorclass=pymysql.cursors.DictCursor)
return mysql_client
class GaleraChecker(object):
def __init__(self):
self.etcd_client = get_etcd_client()
# Liveness check runs every 10 seconds with 5 seconds timeout (default)
self.ttl = 20
@retry
def fetch_wsrep_data(self):
data = {}
mysql_client = get_mysql_client()
with mysql_client.cursor() as cursor:
sql = "SHOW STATUS LIKE 'wsrep%'"
cursor.execute(sql)
for i in cursor.fetchall():
data[i['Variable_name']] = i['Value']
return data
def check_if_sst_running(self):
return os.path.isfile(SST_FLAG)
def check_if_pidfile_created(self):
return True if os.path.isfile(PID_FILE) else False
def check_if_galera_ready(self):
state = self.fetch_cluster_state()
if state != 'STEADY':
LOG.error("Cluster state is not STEADY")
return False
wsrep_data = self.fetch_wsrep_data()
uuid = self.etcd_get_cluster_uuid()
if wsrep_data["wsrep_local_state_comment"] != "Synced":
LOG.error("wsrep_local_state_comment != 'Synced' - '%s'",
wsrep_data["wsrep_local_state_comment"])
return False
elif wsrep_data["wsrep_evs_state"] != "OPERATIONAL":
LOG.error("wsrep_evs_state != 'OPERATIONAL' - '%s'",
wsrep_data["wsrep_evs_state"])
return False
elif wsrep_data["wsrep_connected"] != "ON":
LOG.error("wsrep_connected != 'ON' - '%s'",
wsrep_data["wsrep_connected"])
return False
elif wsrep_data["wsrep_ready"] != "ON":
LOG.error("wsrep_ready != 'ON' - '%s'",
wsrep_data["wsrep_ready"])
return False
elif wsrep_data["wsrep_cluster_state_uuid"] != uuid:
LOG.error("wsrep_cluster_state_uuid != '%s' - '%s'",
uuid, wsrep_data["wsrep_cluster_state_uuid"])
return False
else:
LOG.info("Galera node is ready")
return True
def check_if_galera_alive(self):
# If cluster is not STEADY, nodes could be in strange positions,
# like SST sync. We should postpone liveness checks 'till bootstrap is
# done
if not self.etcd_check_if_cluster_ready():
LOG.info("Galera cluster status is not 'STEADY', skiping check")
return True
# During SST sync mysql can't accept any requests
if self.check_if_sst_running():
LOG.info("SST sync in progress, skiping check")
return True
if not self.check_if_pidfile_created():
LOG.info("Mysql pid file is not yet created, skiping check")
return True
global WAS_JOINED
global OLD_STATE
wsrep_data = self.fetch_wsrep_data()
# If local uuid is different - we have a split brain.
cluster_uuid = self.etcd_get_cluster_uuid()
mysql_uuid = wsrep_data['wsrep_cluster_state_uuid']
if cluster_uuid != mysql_uuid:
LOG.error("Cluster uuid is differs from local one.")
LOG.debug("Cluster uuid: %s Local uuid: %s",
cluster_uuid, mysql_uuid)
return False
# Node states check.
state = int(wsrep_data['wsrep_local_state'])
state_comment = wsrep_data['wsrep_local_state_comment']
if state == SYNCED_STATE or state == DONOR_DESYNCED_STATE:
WAS_JOINED = True
LOG.info("State OK: %s", state_comment)
self.etcd_register_in_path('nodes')
return True
elif state == JOINED_STATE and WAS_JOINED:
# Node was in the JOINED_STATE in the prev check too. Seems it can't
# start syncing.
if OLD_STATE == JOINED_STATE:
LOG.error("State BAD: %s", state_comment)
LOG.error("Joined, but not syncing")
self._etcd_delete()
return False
else:
LOG.info("State OK: %s", state_comment)
LOG.info("Probably will sync soon")
self.etcd_register_in_path('nodes')
return False
else:
LOG.info("State OK: %s", state_comment)
LOG.info("Just joined")
WAS_JOINED = True
self.etcd_register_in_path('nodes')
return True
OLD_STATE = state
LOG.warning("Unknown state: %s", state_comment)
return True
@retry
def _etcd_delete(self):
key = os.path.join(ETCD_PATH, 'nodes', IPADDR)
self.etcd_client.delete(key, recursive=True, dir=True)
LOG.warning("Deleted node's key '%s'", key)
@retry
def _etcd_set(self, data):
self.etcd_client.set(data[0], data[1], self.ttl)
LOG.info("Set %s with value '%s'", data[0], data[1])
@retry
def _etcd_read(self, path):
key = os.path.join(ETCD_PATH, path)
return self.etcd_client.read(key).value
def etcd_register_in_path(self, path):
key = os.path.join(ETCD_PATH, path, IPADDR)
self._etcd_set((key, time.time()))
def etcd_check_if_cluster_ready(self):
try:
state = self._etcd_read('state')
return True if state == 'STEADY' else False
except etcd.EtcdKeyNotFound:
return False
def etcd_get_cluster_uuid(self):
return self._etcd_read('uuid')
def fetch_cluster_state(self):
return self._etcd_read('state')
class GaleraHttpHandler(BaseHTTPServer.BaseHTTPRequestHandler):
def do_GET(self):
uri = self.path
LOG.debug("Started processing GET '%s' request", uri)
checker = GaleraChecker()
try:
if uri == "/liveness":
success = checker.check_if_galera_alive()
elif uri == "/readiness":
success = checker.check_if_galera_ready()
else:
LOG.error("Only '/liveness' and '/readiness' uri are"
" supported")
success = False
response = 200 if success else 503
self.send_response(response)
self.end_headers()
except Exception as err:
LOG.exception(err)
self.send_response(503)
self.end_headers()
finally:
LOG.debug("Finished processing GET request")
def run_server(port=8080):
server_class = BaseHTTPServer.HTTPServer
handler_class = GaleraHttpHandler
server_address = ('', port)
httpd = server_class(server_address, handler_class)
LOG.info('Starting http server...')
httpd.serve_forever()
def merge_configs(variables, new_config):
for k, v in new_config.items():
if k not in variables:
variables[k] = v
continue
if isinstance(v, dict) and isinstance(variables[k], dict):
merge_configs(variables[k], v)
else:
variables[k] = v
def get_config():
LOG.info("Getting global variables from %s", GLOBALS_PATH)
variables = {}
with open(GLOBALS_PATH) as f:
global_conf = json.load(f)
with open(GLOBALS_SECRETS_PATH) as f:
secrets = json.load(f)
merge_configs(global_conf, secrets)
for key in ['percona', 'etcd', 'namespace', 'cluster_domain']:
variables[key] = global_conf[key]
LOG.debug(variables)
return variables
def set_globals():
config = get_config()
global MONITOR_PASSWORD, CLUSTER_NAME
global ETCD_PATH, ETCD_HOST, ETCD_PORT, ETCD_TLS
CLUSTER_NAME = config['percona']['cluster_name']
MONITOR_PASSWORD = config['percona']['monitor_password']
ETCD_PATH = "/galera/%s" % config['percona']['cluster_name']
ETCD_HOST = "etcd.%s.svc.%s" % (config['namespace'],
config['cluster_domain'])
ETCD_PORT = int(config['etcd']['client_port']['cont'])
ETCD_TLS = config['etcd']['tls']['enabled']
if __name__ == "__main__":
get_config()
set_globals()
run_server()

@@ -1,33 +0,0 @@
global
# No syslog in containers
#log /dev/log local0
stats socket /run/haproxy/admin.sock mode 660 level admin
stats timeout 30s
# Tunes from MOS
tune.bufsize 32768
tune.maxrewrite 1024
# Default SSL material locations
ca-base /etc/ssl/certs
crt-base /etc/ssl/private
ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS
ssl-default-bind-options no-sslv3
defaults
log global
mode tcp
option tcplog
option logasap
option dontlognull
option mysql-check
option tcpka
timeout connect 10s
timeout client 28801s
timeout server 28801s
listen galera-cluster
bind 0.0.0.0:{{ percona.port.cont }}
# We start with a non-working configuration and update it via the admin socket at runtime
server primary 127.0.0.1:11111 check

@@ -1,304 +0,0 @@
#!/usr/bin/env python
import argparse
import functools
import json
import logging
import os
import socket
import subprocess
import sys
import time
import etcd
HOSTNAME = socket.getfqdn()
IPADDR = socket.gethostbyname(HOSTNAME)
BACKEND_NAME = "galera-cluster"
SERVER_NAME = "primary"
GLOBALS_PATH = '/etc/ccp/globals/globals.json'
CA_CERT = '/opt/ccp/etc/tls/ca.pem'
LOG_DATEFMT = "%Y-%m-%d %H:%M:%S"
LOG_FORMAT = "%(asctime)s.%(msecs)03d - %(levelname)s - %(message)s"
logging.basicConfig(format=LOG_FORMAT, datefmt=LOG_DATEFMT)
LOG = logging.getLogger(__name__)
LOG.setLevel(logging.DEBUG)
CONNECTION_ATTEMPTS = None
CONNECTION_DELAY = None
ETCD_PATH = None
ETCD_HOST = None
ETCD_PORT = None
ETCD_TLS = None
# Haproxy constant for health checks
SRV_STATE_RUNNING = 2
SRV_CHK_RES_PASSED = 3
def retry(f):
@functools.wraps(f)
def wrap(*args, **kwargs):
attempts = CONNECTION_ATTEMPTS
delay = CONNECTION_DELAY
while attempts > 1:
try:
return f(*args, **kwargs)
except etcd.EtcdException as e:
LOG.warning('Etcd is not ready: %s', str(e))
LOG.warning('Retrying in %d seconds...', delay)
time.sleep(delay)
attempts -= 1
return f(*args, **kwargs)
return wrap
def get_config():
LOG.info("Getting global variables from %s", GLOBALS_PATH)
variables = {}
with open(GLOBALS_PATH) as f:
global_conf = json.load(f)
for key in ['percona', 'etcd', 'namespace', 'cluster_domain']:
variables[key] = global_conf[key]
LOG.debug(variables)
return variables
def set_globals():
config = get_config()
global CONNECTION_ATTEMPTS, CONNECTION_DELAY
global ETCD_PATH, ETCD_HOST, ETCD_PORT, ETCD_TLS
CONNECTION_ATTEMPTS = config['etcd']['connection_attempts']
CONNECTION_DELAY = config['etcd']['connection_delay']
ETCD_PATH = "/galera/%s" % config['percona']['cluster_name']
ETCD_HOST = "etcd.%s.svc.%s" % (config['namespace'],
config['cluster_domain'])
ETCD_PORT = int(config['etcd']['client_port']['cont'])
ETCD_TLS = config['etcd']['tls']['enabled']
def get_etcd_client():
if ETCD_TLS:
protocol = 'https'
ca_cert = CA_CERT
else:
protocol = 'http'
ca_cert = None
return etcd.Client(host=ETCD_HOST,
port=ETCD_PORT,
allow_reconnect=True,
protocol=protocol,
ca_cert=ca_cert,
read_timeout=2)
def get_socket():
unix_socket = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
unix_socket.settimeout(5)
unix_socket.connect('/var/run/haproxy/admin.sock')
return unix_socket
def run_haproxy():
cmd = ["haproxy", "-f", "/etc/haproxy/haproxy.conf"]
LOG.info("Executing cmd:\n%s", cmd)
proc = subprocess.Popen(cmd)
return proc
def check_haproxy(proc):
ret_code = proc.poll()
if ret_code is not None:
LOG.error("Haproxy was terminated, exit code was: %s",
proc.returncode)
sys.exit(proc.returncode)
@retry
def etcd_set(etcd_client, key, value, ttl, dir=False, append=False, **kwargs):
etcd_client.write(key, value, ttl, dir, append, **kwargs)
LOG.info("Set %s with value '%s'", key, value)
@retry
def etcd_refresh(etcd_client, path, ttl):
key = os.path.join(ETCD_PATH, path)
etcd_client.refresh(key, ttl)
LOG.info("Refreshed %s ttl. New ttl is '%s'", key, ttl)
def send_command(cmd):
LOG.debug("Sending '%s' cmd to haproxy", cmd)
sock = get_socket()
sock.send(cmd + '\n')
file_handle = sock.makefile()
data = file_handle.read().splitlines()
sock.close()
return data
def get_haproxy_status():
state_data = send_command("show servers state galera-cluster")
stat_data = send_command("show stat typed")
# we need to parse string which looks like this:
# 'S.2.1.73.addr.1:CGS:str:10.233.76.104:33306'
for line in stat_data:
if "addr" in line:
ip, port = line.split(':')[-2:]
# It returns a 3-element list, with strings inside.
# We have to do some magic, to make a valid dict out of it.
keys = state_data[1].split(' ')
keys.pop(0)
values = state_data[2].split(' ')
data_dict = dict(zip(keys, values))
data_dict['backend'] = "%s:%s" % (ip, port)
return data_dict
def get_cluster_state(etcd_client):
key = os.path.join(ETCD_PATH, 'state')
try:
state = etcd_client.read(key).value
return state
except etcd.EtcdKeyNotFound:
return None
def wait_for_cluster_to_be_steady(etcd_client, haproxy_proc):
while True:
state = get_cluster_state(etcd_client)
if state != 'STEADY':
check_haproxy(haproxy_proc)
LOG.warning("Cluster is not in the STEADY state, waiting...")
time.sleep(5)
else:
break
def set_server_addr(leader_ip):
cmds = ["set server %s/%s addr %s port 33306" % (
BACKEND_NAME, SERVER_NAME, leader_ip),
"set server %s/%s check-port 33306" % (
BACKEND_NAME, SERVER_NAME)]
for cmd in cmds:
# Bug in haproxy. Sometimes, haproxy can't convert port str to int.
# Will be fixed in 1.7.2
while True:
response = send_command(cmd)
if "problem converting port" in response[0]:
LOG.error("Port convertation failed, trying again...")
time.sleep(1)
else:
LOG.info("Successfuly set backend to %s:33306", leader_ip)
return
def get_leader(etcd_client):
key = os.path.join(ETCD_PATH, 'leader')
try:
leader = etcd_client.read(key).value
except etcd.EtcdKeyNotFound:
leader = None
LOG.info("Current leader is: %s", leader)
return leader
def set_leader(etcd_client, ttl, **kwargs):
key = os.path.join(ETCD_PATH, 'leader')
etcd_set(etcd_client, key, IPADDR, ttl, **kwargs)
def refresh_leader(etcd_client, ttl):
key = os.path.join(ETCD_PATH, 'leader')
etcd_refresh(etcd_client, key, ttl)
def do_we_need_to_reconfigure_haproxy(leader):
haproxy_stat = get_haproxy_status()
haproxy_leader = haproxy_stat['backend']
leader += ":33306"
LOG.debug("Haproxy server is: %s. Current leader is: %s",
haproxy_leader, leader)
return haproxy_leader != leader
def run_daemon(ttl):
LOG.debug("My IP is: %s", IPADDR)
haproxy_proc = run_haproxy()
etcd_client = get_etcd_client()
while True:
wait_for_cluster_to_be_steady(etcd_client, haproxy_proc)
leader = get_leader(etcd_client)
if not leader:
set_leader(etcd_client, ttl, prevExist=False)
leader = IPADDR
elif leader == IPADDR:
refresh_leader(etcd_client, ttl)
if do_we_need_to_reconfigure_haproxy(leader):
LOG.info("Updating haproxy configuration")
set_server_addr(leader)
check_haproxy(haproxy_proc)
LOG.info("Sleeping for 5 sec...")
time.sleep(5)
def run_readiness():
etcd_client = get_etcd_client()
state = get_cluster_state(etcd_client)
if state != 'STEADY':
LOG.error("Cluster is not in the STEADY state")
sys.exit(1)
leader = get_leader(etcd_client)
if not leader:
LOG.error("No leader found")
sys.exit(1)
else:
if do_we_need_to_reconfigure_haproxy(leader):
LOG.error("Haproxy configuration is wrong")
sys.exit(1)
haproxy_stat = get_haproxy_status()
LOG.debug(haproxy_stat)
if (int(haproxy_stat['srv_op_state']) != SRV_STATE_RUNNING and
int(haproxy_stat['srv_check_result']) != SRV_CHK_RES_PASSED):
LOG.error("Current leader is not alive")
sys.exit(1)
LOG.info("Service is ready")
sys.exit(0)
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument('type', choices=['daemon', 'readiness'])
args = parser.parse_args()
get_config()
set_globals()
if args.type == 'daemon':
run_daemon(ttl=20)
elif args.type == 'readiness':
run_readiness()
# vim: set ts=4 sw=4 tw=0 et :

@@ -1,54 +0,0 @@
[mysqld]
bind-address = 0.0.0.0
port = 33306
datadir = /var/lib/mysql
pid-file = /var/lib/mysql/mysqld.pid
log-error = /var/log/ccp/mysql/mysql.log
general_log = {{ '1' if db.general_log_enabled else '0' }}
general_log_file = /var/log/ccp/mysql/general-mysql.log
long_query_time = {{ db.long_query_time }}
slow_query_log = {{ '1' if db.slow_query_log_enabled else '0' }}
slow_query_log_file = /var/log/ccp/mysql/slow-mysql.log
max_connections = 10000
open_files_limit = 102400
skip-name-resolve
character-set-server = utf8
collation-server = utf8_general_ci
binlog_format = ROW
default_storage_engine = InnoDB
innodb_autoinc_lock_mode = 2
innodb_buffer_pool_size = 512M
innodb_file_per_table = 1
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT
innodb_io_capacity = 500
innodb_read_io_threads = 8
innodb_write_io_threads = 8
{% if percona.sql_mode %}
sql_mode = "{{ percona.sql_mode }}"
{% endif -%}
wsrep_slave_threads = 4
wsrep_cluster_address = gcomm://
wsrep_provider = /usr/lib/galera3/libgalera_smm.so
wsrep_cluster_name = {{ percona.cluster_name }}
wsrep_sst_method = xtrabackup-v2
wsrep_sst_auth = "xtrabackup:{{ percona.xtrabackup_password }}"
wsrep_provider_options = "gcache.size={{ percona.gcache_size }};gcache.recover=yes{% if db.tls.enabled %};socket.ssl=yes;socket.ssl_key=/opt/ccp/etc/tls/server-key.pem;socket.ssl_cert=/opt/ccp/etc/tls/server-cert.pem;socket.ssl_ca=/opt/ccp/etc/tls/ca.pem"{% endif %}
{% if db.tls.enabled %}
ssl-ca = /opt/ccp/etc/tls/ca.pem
ssl-cert = /opt/ccp/etc/tls/server-cert.pem
ssl-key = /opt/ccp/etc/tls/server-key.pem
[sst]
encrypt = 4
ssl-ca = /opt/ccp/etc/tls/ca.pem
ssl-cert = /opt/ccp/etc/tls/server-cert.pem
ssl-key = /opt/ccp/etc/tls/server-key.pem
{% endif %}

@@ -1,782 +0,0 @@
#!/usr/bin/env python
import fileinput
import functools
import json
import logging
import os
import os.path
import shutil
import socket
import subprocess
import signal
import six.moves
import sys
import time
import etcd
import pymysql.cursors
HOSTNAME = socket.getfqdn()
IPADDR = socket.gethostbyname(HOSTNAME)
DATADIR = "/var/lib/mysql"
INIT_FILE = os.path.join(DATADIR, 'init.ok')
PID_FILE = os.path.join(DATADIR, "mysqld.pid")
GRASTATE_FILE = os.path.join(DATADIR, 'grastate.dat')
SST_FLAG = os.path.join(DATADIR, "sst_in_progress")
DHPARAM = os.path.join(DATADIR, "dhparams.pem")
GLOBALS_PATH = '/etc/ccp/globals/globals.json'
GLOBALS_SECRETS_PATH = '/etc/ccp/global-secrets/global-secrets.json'
CA_CERT = '/opt/ccp/etc/tls/ca.pem'
LOG_DATEFMT = "%Y-%m-%d %H:%M:%S"
LOG_FORMAT = "%(asctime)s.%(msecs)03d - %(levelname)s - %(message)s"
logging.basicConfig(format=LOG_FORMAT, datefmt=LOG_DATEFMT)
LOG = logging.getLogger(__name__)
LOG.setLevel(logging.DEBUG)
FORCE_BOOTSTRAP = None
FORCE_BOOTSTRAP_NODE = None
EXPECTED_NODES = None
MYSQL_ROOT_PASSWORD = None
CLUSTER_NAME = None
XTRABACKUP_PASSWORD = None
MONITOR_PASSWORD = None
CONNECTION_ATTEMPTS = None
CONNECTION_DELAY = None
ETCD_PATH = None
ETCD_HOST = None
ETCD_PORT = None
ETCD_TLS = None
DHPARAM_CERT = None
class ProcessException(Exception):
def __init__(self, exit_code):
self.exit_code = exit_code
self.msg = "Command exited with code %d" % self.exit_code
super(ProcessException, self).__init__(self.msg)
def retry(f):
@functools.wraps(f)
def wrap(*args, **kwargs):
attempts = CONNECTION_ATTEMPTS
delay = CONNECTION_DELAY
while attempts > 1:
try:
return f(*args, **kwargs)
except etcd.EtcdException as e:
LOG.warning('Etcd is not ready: %s', str(e))
LOG.warning('Retrying in %d seconds...', delay)
time.sleep(delay)
attempts -= 1
return f(*args, **kwargs)
return wrap
def merge_configs(variables, new_config):
for k, v in new_config.items():
if k not in variables:
variables[k] = v
continue
if isinstance(v, dict) and isinstance(variables[k], dict):
merge_configs(variables[k], v)
else:
variables[k] = v
def get_config():
LOG.info("Getting global variables from %s", GLOBALS_PATH)
variables = {}
with open(GLOBALS_PATH) as f:
global_conf = json.load(f)
with open(GLOBALS_SECRETS_PATH) as f:
secrets = json.load(f)
merge_configs(global_conf, secrets)
for key in ['percona', 'db', 'etcd', 'namespace', 'cluster_domain',
'security']:
variables[key] = global_conf[key]
LOG.debug(variables)
return variables
def set_globals():
config = get_config()
global MYSQL_ROOT_PASSWORD, CLUSTER_NAME, XTRABACKUP_PASSWORD
global MONITOR_PASSWORD, CONNECTION_ATTEMPTS, CONNECTION_DELAY
global ETCD_PATH, ETCD_HOST, ETCD_PORT, EXPECTED_NODES
global FORCE_BOOTSTRAP, FORCE_BOOTSTRAP_NODE, ETCD_TLS, DHPARAM_CERT
FORCE_BOOTSTRAP = config['percona']['force_bootstrap']['enabled']
FORCE_BOOTSTRAP_NODE = config['percona']['force_bootstrap']['node']
MYSQL_ROOT_PASSWORD = config['db']['root_password']
CLUSTER_NAME = config['percona']['cluster_name']
XTRABACKUP_PASSWORD = config['percona']['xtrabackup_password']
MONITOR_PASSWORD = config['percona']['monitor_password']
CONNECTION_ATTEMPTS = config['etcd']['connection_attempts']
CONNECTION_DELAY = config['etcd']['connection_delay']
EXPECTED_NODES = int(config['percona']['cluster_size'])
ETCD_PATH = "/galera/%s" % config['percona']['cluster_name']
ETCD_HOST = "etcd.%s.svc.%s" % (config['namespace'],
config['cluster_domain'])
ETCD_PORT = int(config['etcd']['client_port']['cont'])
ETCD_TLS = config['etcd']['tls']['enabled']
DHPARAM_CERT = config['security']['tls']['dhparam']
def get_mysql_client(insecure=False):
password = '' if insecure else MYSQL_ROOT_PASSWORD
return pymysql.connect(unix_socket='/var/run/mysqld/mysqld.sock',
user='root',
password=password,
connect_timeout=1,
read_timeout=1,
cursorclass=pymysql.cursors.DictCursor)
def get_etcd_client():
if ETCD_TLS:
protocol = 'https'
ca_cert = CA_CERT
else:
protocol = 'http'
ca_cert = None
return etcd.Client(host=ETCD_HOST,
port=ETCD_PORT,
allow_reconnect=True,
protocol=protocol,
ca_cert=ca_cert,
read_timeout=2)
def datadir_cleanup(path):
for filename in os.listdir(path):
fullpath = os.path.join(path, filename)
if os.path.isdir(fullpath):
shutil.rmtree(fullpath)
else:
os.remove(fullpath)
def create_dhparam():
if not os.path.isfile(DHPARAM):
with open(DHPARAM, 'w') as f:
f.write(DHPARAM_CERT)
LOG.info("dhparam cert created in %s", DHPARAM)
else:
LOG.info("%s exists, not overriding it", DHPARAM)
def create_init_flag():
if not os.path.isfile(INIT_FILE):
open(INIT_FILE, 'a').close()
LOG.debug("Create init_ok file: %s", INIT_FILE)
else:
LOG.debug("Init file: '%s' already exists", INIT_FILE)
def run_cmd(cmd, check_result=False):
LOG.debug("Executing cmd:\n%s", cmd)
proc = subprocess.Popen(cmd, shell=True)
if check_result:
proc.communicate()
if proc.returncode != 0:
raise ProcessException(proc.returncode)
return proc
def run_mysqld(available_nodes, donors_list, etcd_client, lock):
create_dhparam()
cmd = ("mysqld --user=mysql --wsrep_cluster_name=%s"
" --wsrep_cluster_address=%s"
" --wsrep_sst_method=xtrabackup-v2"
" --wsrep_sst_donor=%s"
" --wsrep_node_address=%s"
" --wsrep_node_name=%s"
" --pxc_strict_mode=PERMISSIVE" %
(six.moves.shlex_quote(CLUSTER_NAME),
"gcomm://%s" % six.moves.shlex_quote(available_nodes),
six.moves.shlex_quote(donors_list),
six.moves.shlex_quote(IPADDR),
six.moves.shlex_quote(IPADDR)))
mysqld_proc = run_cmd(cmd)
wait_for_mysqld_to_start(mysqld_proc, insecure=False)
def sig_handler(signum, frame):
LOG.info("Caught a signal: %d", signum)
etcd_deregister_in_path(etcd_client, 'queue')
etcd_deregister_in_path(etcd_client, 'nodes')
etcd_deregister_in_path(etcd_client, 'seqno')
etcd_delete_if_exists(etcd_client, 'leader', IPADDR)
release_lock(lock)
mysqld_proc.send_signal(signum)
signal.signal(signal.SIGTERM, sig_handler)
return mysqld_proc
def mysql_exec(mysql_client, sql_list):
with mysql_client.cursor() as cursor:
for cmd, args in sql_list:
LOG.debug("Executing mysql cmd: %s\nWith the following args: '%s'",
cmd, args)
cursor.execute(cmd, args)
return cursor.fetchall()
@retry
def fetch_status(etcd_client, path):
key = os.path.join(ETCD_PATH, path)
try:
root = etcd_client.get(key)
except etcd.EtcdKeyNotFound:
LOG.debug("Current nodes in %s is: %s", key, None)
return []
result = [str(child.key).replace(key + "/", '')
for child in root.children
if str(child.key) != key]
LOG.debug("Current nodes in %s is: %s", key, result)
return result
def fetch_wsrep_data():
wsrep_data = {}
mysql_client = get_mysql_client()
data = mysql_exec(mysql_client, [("SHOW STATUS LIKE 'wsrep%'", None)])
for i in data:
wsrep_data[i['Variable_name']] = i['Value']
return wsrep_data
@retry
def get_oldest_node_by_seqno(etcd_client, path):
"""
This function returns the IP addr of the node with the highest seqno.
seqno (sequence number) indicates the number of transactions run through
that node. The node with the highest seqno is the node with the latest data.
"""
key = os.path.join(ETCD_PATH, path)
root = etcd_client.get(key)
# We need to cut etcd path prefix like "/galera/k8scluster/seqno/" to get
# the IP addr of the node.
prefix = key + "/"
result = sorted([(str(child.key).replace(prefix, ''), int(child.value))
for child in root.children])
result.sort(key=lambda x: x[1])
LOG.debug("ALL seqno is %s", result)
LOG.info("Oldest node is %s, am %s", result[-1][0], IPADDR)
return result[-1][0]
@retry
def _etcd_set(etcd_client, path, value, ttl):
key = os.path.join(ETCD_PATH, path)
etcd_client.set(key, value, ttl=ttl)
LOG.info("Set %s with value '%s' and ttl '%s'", key, value, ttl)
def _etcd_read(etcd_client, path):
key = os.path.join(ETCD_PATH, path)
try:
value = etcd_client.read(key).value
return value
except etcd.EtcdKeyNotFound:
return None
def etcd_register_in_path(etcd_client, path, ttl=60):
key = os.path.join(path, IPADDR)
_etcd_set(etcd_client, key, time.time(), ttl)
def etcd_set_seqno(etcd_client, ttl):
seqno = mysql_get_seqno()
key = os.path.join('seqno', IPADDR)
_etcd_set(etcd_client, key, seqno, ttl)
def etcd_delete_if_exists(etcd_client, path, prevValue):
key = os.path.join(ETCD_PATH, path)
try:
etcd_client.delete(key, prevValue=prevValue)
LOG.warning("Deleted key %s, with previous value '%s'", key, prevValue)
except etcd.EtcdKeyNotFound:
LOG.warning("Key %s not exist", key)
except etcd.EtcdCompareFailed:
LOG.debug("Previous of the '%s' is not the '%s'", key, prevValue)
def etcd_deregister_in_path(etcd_client, path):
key = os.path.join(ETCD_PATH, path, IPADDR)
try:
etcd_client.delete(key, recursive=True)
LOG.warning("Deleted key %s", key)
except etcd.EtcdKeyNotFound:
LOG.warning("Key %s not exist", key)
def mysql_get_seqno():
if os.path.isfile(GRASTATE_FILE):
with open(GRASTATE_FILE) as f:
content = f.readlines()
for line in content:
if line.startswith('seqno'):
return line.partition(':')[2].strip()
else:
LOG.warning("Can't find a '%s' file. Setting seqno to '-1'",
GRASTATE_FILE)
return -1
def check_for_stale_seqno(etcd_client):
queue_set = set(fetch_status(etcd_client, 'queue'))
seqno_set = set(fetch_status(etcd_client, 'seqno'))
difference = queue_set - seqno_set
if difference:
LOG.warning("Found stale seqno entries: %s, deleting", difference)
for ip in difference:
key = os.path.join(ETCD_PATH, 'seqno', ip)
try:
etcd_client.delete(key)
LOG.warning("Deleted key %s", key)
except etcd.EtcdKeyNotFound:
LOG.warning("Key %s not exist", key)
else:
LOG.debug("Found seqno set is equals to the queue set: %s = %s",
queue_set, seqno_set)
def check_if_sst_running():
return os.path.isfile(SST_FLAG)
def wait_for_expected_state(etcd_client, ttl):
while True:
status = fetch_status(etcd_client, 'queue')
if len(status) > EXPECTED_NODES:
LOG.debug("Current number of nodes is %s, expected: %s, sleeping",
len(status), EXPECTED_NODES)
time.sleep(10)
elif len(status) < EXPECTED_NODES:
LOG.debug("Current number of nodes is %s, expected: %s, sleeping",
len(status), EXPECTED_NODES)
time.sleep(1)
else:
wait_for_my_turn(etcd_client)
break
def wait_for_my_seqno(etcd_client):
oldest_node = get_oldest_node_by_seqno(etcd_client, 'seqno')
if IPADDR == oldest_node:
LOG.info("It's my turn to join the cluster")
return
else:
time.sleep(5)
def wait_for_my_turn(etcd_client):
check_for_stale_seqno(etcd_client)
LOG.info("Waiting for my turn to join cluster")
if FORCE_BOOTSTRAP:
LOG.warning("Force bootstrap flag was detected, skiping normal"
" bootstrap procedure")
if FORCE_BOOTSTRAP_NODE is None:
LOG.error("Force bootstrap node wasn't set. Can't continue")
sys.exit(1)
LOG.debug("Force bootstrap node is %s", FORCE_BOOTSTRAP_NODE)
my_node_name = os.environ['CCP_NODE_NAME']
if my_node_name == FORCE_BOOTSTRAP_NODE:
LOG.info("This node is the force boostrap one.")
set_safe_to_bootstrap()
return
else:
LOG.info("This node is not the force boostrap one."
" Waiting for the bootstrap one to create a cluster.")
while True:
nodes = fetch_status(etcd_client, 'nodes')
if nodes:
wait_for_my_seqno(etcd_client)
return
else:
time.sleep(5)
else:
wait_for_my_seqno(etcd_client)
def wait_for_sync(mysqld):
while True:
try:
wsrep_data = fetch_wsrep_data()
state = int(wsrep_data['wsrep_local_state'])
if state == 4:
LOG.info("Node synced")
# If sync was done by SST, all files in datadir were lost
create_init_flag()
break
else:
LOG.debug("Waiting node to be synced. Current state is: %s",
wsrep_data['wsrep_local_state_comment'])
time.sleep(5)
except Exception:
if mysqld.poll() is None:
time.sleep(5)
else:
LOG.error('Mysqld was terminated, exit code was: %s',
mysqld.returncode)
sys.exit(mysqld.returncode)
def check_if_im_last(etcd_client):
sleep = 10
queue_status = fetch_status(etcd_client, 'queue')
while True:
nodes_status = fetch_status(etcd_client, 'nodes')
if len(nodes_status) > EXPECTED_NODES:
LOG.info("Looks like we have stale data in etcd, found %s nodes, "
"but expected to find %s, sleeping for %s sec",
len(nodes_status), EXPECTED_NODES, sleep)
time.sleep(sleep)
else:
break
if not queue_status and len(nodes_status) == EXPECTED_NODES:
LOG.info("Looks like this node is the last one")
return True
else:
LOG.info("I'm not the last node")
return False
def create_join_list(status, leader, donor=False):
if IPADDR in status:
status.remove(IPADDR)
if leader in status and donor:
status.remove(leader)
if not status:
if donor:
LOG.info("No available nodes found. Using empty donor list")
return (",")
else:
LOG.info("No available nodes found. Assuming I'm first")
return ("", True)
else:
if donor:
# We need to keep trailing comma at the end
donor_list = "%s," % ','.join(status)
LOG.debug("Donor list is: '%s'", donor_list)
return donor_list
else:
LOG.info("Joining to nodes %s", ','.join(status))
return (','.join(status), False)
def update_uuid(etcd_client):
wsrep_data = fetch_wsrep_data()
uuid = wsrep_data['wsrep_cluster_state_uuid']
_etcd_set(etcd_client, 'uuid', uuid, ttl=None)
def update_cluster_state(etcd_client, state):
_etcd_set(etcd_client, 'state', state, ttl=None)
def wait_for_mysqld(proc):
code = proc.wait()
LOG.info("Process exited with code %d", code)
sys.exit(code)
def wait_for_mysqld_to_start(proc, insecure):
LOG.info("Waiting mysql to start...")
# Sometimes initial mysql start could take some time, especialy with SSL
# enabled. FIXME - replace sleep with some additional checks.
time.sleep(30)
while True:
if check_if_sst_running():
LOG.debug("SST sync detected, waiting...")
time.sleep(30)
else:
LOG.debug("No SST sync detected")
break
for i in range(0, 59):
try:
mysql_client = get_mysql_client(insecure=insecure)
mysql_exec(mysql_client, [("SELECT 1", None)])
return
except Exception:
time.sleep(1)
else:
LOG.error("Mysql boot failed")
raise RuntimeError("Process exited with code: %s" % proc.returncode)
def wait_for_mysqld_to_stop():
"""
Since mysqld starts a wrapper first, we can't check the executed proc's
exit code and be assured that mysqld itself has finished working. We have
to check the whole process group, so we're going to use pgrep for this.
"""
LOG.info("Waiting for mysqld to finish working")
for i in range(0, 29):
proc = run_cmd("pgrep mysqld")
proc.communicate()
if proc.returncode == 0:
time.sleep(1)
else:
LOG.info("Mysqld finished working")
break
else:
LOG.info("Can't kill the mysqld process used for bootstraping")
sys.exit(1)
def mysql_init():
datadir_cleanup(DATADIR)
run_cmd("mysqld --initialize-insecure", check_result=True)
mysqld_proc = run_cmd("mysqld --skip-networking")
wait_for_mysqld_to_start(mysqld_proc, insecure=True)
LOG.info("Mysql is running, setting up the permissions")
sql_list = [("CREATE USER 'root'@'%%' IDENTIFIED BY %s",
MYSQL_ROOT_PASSWORD),
("GRANT ALL ON *.* TO 'root'@'%' WITH GRANT OPTION", None),
("ALTER USER 'root'@'localhost' IDENTIFIED BY %s",
MYSQL_ROOT_PASSWORD),
("CREATE USER 'xtrabackup'@'localhost' IDENTIFIED BY %s",
XTRABACKUP_PASSWORD),
("GRANT RELOAD,PROCESS,LOCK TABLES,REPLICATION CLIENT ON *.*"
" TO 'xtrabackup'@'localhost'", None),
("GRANT REPLICATION CLIENT ON *.* TO monitor@'%%' IDENTIFIED"
" BY %s", MONITOR_PASSWORD),
("DROP DATABASE IF EXISTS test", None),
("FLUSH PRIVILEGES", None)]
try:
mysql_client = get_mysql_client(insecure=True)
mysql_exec(mysql_client, sql_list)
except Exception:
raise
create_init_flag()
# It's safer to kill mysqld via pkill, since mysqld starts a wrapper first
run_cmd("pkill mysqld")
wait_for_mysqld_to_stop()
LOG.info("Mysql bootstraping is done")
def check_cluster(etcd_client):
state = _etcd_read(etcd_client, 'state')
nodes_status = fetch_status(etcd_client, 'nodes')
if not nodes_status and state == 'STEADY':
LOG.warning("Cluster is in the STEADY state, but there no"
" alive nodes detected, running cluster recovery")
update_cluster_state(etcd_client, 'RECOVERY')
def acquire_lock(lock, ttl):
LOG.info("Locking...")
lock.acquire(blocking=True, lock_ttl=ttl)
LOG.info("Successfuly acquired lock")
def release_lock(lock):
lock.release()
LOG.info("Successfuly released lock")
def set_safe_to_bootstrap():
"""
Less wordy way to do an "in-place" edit of the file
"""
for line in fileinput.input(GRASTATE_FILE, inplace=1):
if line.startswith("safe_to_bootstrap"):
line = line.replace("safe_to_bootstrap: 0", "safe_to_bootstrap: 1")
sys.stdout.write(line)
def run_create_queue(etcd_client, lock, ttl):
"""
In this step we're making recovery preparations.
We need to get our seqno from mysql; after that is done, we'll fall into
an endless loop waiting until the other nodes do the same, and after that we
wait for our turn, based on the seqno, to start joining the cluster.
"""
LOG.info("Creating recovery queue")
etcd_register_in_path(etcd_client, 'queue', ttl=None)
etcd_set_seqno(etcd_client, ttl=None)
release_lock(lock)
wait_for_expected_state(etcd_client, ttl)
def run_join_cluster(etcd_client, lock, ttl):
"""
In this step we're ready to join or create a new cluster.
We get the current nodes list; if it's empty, it means we're the first one.
If the seqno queue list is empty and the nodes list equals 3, we assume
that we're the last one. In the one remaining case we're the second one.
If we're the first one, we create the new cluster.
If we're the second one or the last one, we join the existing
cluster.
If the cluster state was RECOVERY we do the same thing, but nodes take turns
not by a first come - first served rule, but by the seqno of their data, so
the first node will be the one with the most recent data.
"""
LOG.info("Joining the cluster")
acquire_lock(lock, ttl)
state = _etcd_read(etcd_client, 'state')
nodes_status = fetch_status(etcd_client, 'nodes')
leader = _etcd_read(etcd_client, 'leader')
available_nodes, first_one = create_join_list(nodes_status, leader)
donors_list = create_join_list(nodes_status, leader, donor=True)
if first_one:
set_safe_to_bootstrap()
# First node shouldn't have a TTL during the cluster bootstrap
ttl = None
mysqld = run_mysqld(available_nodes, donors_list, etcd_client, lock)
wait_for_sync(mysqld)
etcd_register_in_path(etcd_client, 'nodes', ttl)
if state == "RECOVERY":
etcd_deregister_in_path(etcd_client, 'seqno')
etcd_deregister_in_path(etcd_client, 'queue')
last_one = check_if_im_last(etcd_client)
release_lock(lock)
return (first_one, last_one, mysqld)
def run_update_metadata(etcd_client, first_one, last_one):
"""
In this step we update the cluster state and metadata.
If the node was the first one, it changes the state of the cluster to
BUILDING and sets its uuid as the cluster uuid in etcd.
If the node was the last one, it changes the state of the cluster to STEADY.
Please note that if it was a RECOVERY scenario, we don't change the state of
the cluster until it is fully rebuilt.
"""
LOG.info("Update cluster metadata")
state = _etcd_read(etcd_client, 'state')
if first_one:
update_uuid(etcd_client)
if state != 'RECOVERY':
update_cluster_state(etcd_client, 'BUILDING')
if last_one:
update_cluster_state(etcd_client, 'STEADY')
def main(ttl):
if not os.path.isfile(INIT_FILE):
LOG.info("Init file '%s' not found, doing full init", INIT_FILE)
mysql_init()
else:
LOG.info("Init file '%s' found. Skiping mysql bootstrap and run"
" wsrep-recover", INIT_FILE)
run_cmd("mysqld_safe --wsrep-recover", check_result=True)
try:
LOG.debug("My IP is: %s", IPADDR)
etcd_client = get_etcd_client()
lock = etcd.Lock(etcd_client, 'galera')
acquire_lock(lock, ttl)
check_cluster(etcd_client)
state = _etcd_read(etcd_client, 'state')
# Scenario 1: Initial bootstrap
if state is None or state == 'BUILDING':
LOG.info("No running cluster detected - starting bootstrap")
first_one, last_one, mysqld = run_join_cluster(etcd_client, lock,
ttl)
run_update_metadata(etcd_client, first_one, last_one)
LOG.info("Bootsraping is done. Node is ready.")
# Scenario 2: Re-connect
elif state == 'STEADY':
LOG.info("Detected running cluster, re-connecting")
first_one, last_one, mysqld = run_join_cluster(etcd_client, lock,
ttl)
LOG.info("Node joined and ready")
# Scenario 3: Recovery
elif state == 'RECOVERY':
LOG.warning("Cluster is in the RECOVERY state, re-connecting to"
" the node with the oldest data")
run_create_queue(etcd_client, lock, ttl)
first_one, last_one, mysqld = run_join_cluster(etcd_client, lock,
ttl)
run_update_metadata(etcd_client, first_one, last_one)
LOG.info("Recovery is done. Node is ready.")
wait_for_mysqld(mysqld)
except Exception as err:
LOG.exception(err)
raise
finally:
etcd_deregister_in_path(etcd_client, 'queue')
etcd_deregister_in_path(etcd_client, 'nodes')
etcd_deregister_in_path(etcd_client, 'seqno')
etcd_delete_if_exists(etcd_client, 'leader', IPADDR)
release_lock(lock)
if __name__ == "__main__":
get_config()
set_globals()
main(ttl=300)
# vim: set ts=4 sw=4 tw=0 et :

@@ -1 +0,0 @@
{{ security.tls.server_cert }}

@@ -1 +0,0 @@
{{ security.tls.server_key }}

@@ -1,128 +0,0 @@
dsl_version: 0.2.0
service:
name: galera
antiAffinity: local
ports:
- {{ percona.port }}
containers:
- name: galera-checker
image: galera-checker
volumes:
- name: mysql-storage
path: "/var/lib/mysql"
type: host
readOnly: true
daemon:
files:
- galera-checker
# {% if db.tls.enabled %}
- ca.pem
- server-key.pem
- server-cert.pem
# {% endif %}
dependencies:
- etcd
command: "/opt/ccp/bin/galera_checker.py"
- name: galera-haproxy
image: galera-haproxy
probes:
readiness: "/opt/ccp/bin/haproxy_entrypoint.py readiness"
pre:
- name: mkdir-run
command: "sudo /bin/mkdir /run/haproxy"
- name: chown-run
command: "sudo /bin/chown -R haproxy: /run/haproxy"
daemon:
files:
- haproxy-conf
- haproxy_entrypoint
# {% if db.tls.enabled %}
- ca.pem
- server-key.pem
- server-cert.pem
# {% endif %}
dependencies:
- etcd
command: "/opt/ccp/bin/haproxy_entrypoint.py daemon"
- name: galera
image: percona
probes:
readiness:
path: "/readiness"
type: "httpGet"
port: 8080
timeout: 5
scheme: "http"
liveness:
path: "/liveness"
type: "httpGet"
port: 8080
timeout: 30
initialDelay: 60
scheme: "http"
volumes:
- name: mysql-logs
path: "/var/log/ccp/mysql"
type: host
readOnly: false
- name: mysql-storage
path: "/var/lib/mysql"
type: host
readOnly: false
pre:
- name: chown-logs-dir
command: "sudo /bin/chown mysql:mysql /var/log/ccp/mysql"
- name: chown-data-dir
command: "sudo /bin/chown mysql:mysql /var/lib/mysql"
daemon:
files:
- entrypoint
- mycnf
- galera-checker
# {% if db.tls.enabled %}
- ca.pem
- server-key.pem
- server-cert.pem
# {% endif %}
dependencies:
- etcd
command: /opt/ccp/bin/entrypoint.py
files:
entrypoint:
path: /opt/ccp/bin/entrypoint.py
content: percona_entrypoint.py
perm: "0755"
mycnf:
path: /etc/mysql/my.cnf
content: my.cnf.j2
galera-checker:
path: /opt/ccp/bin/galera_checker.py
content: galera_checker.py
perm: "0755"
haproxy-conf:
path: /etc/haproxy/haproxy.conf
content: haproxy.conf.j2
haproxy_entrypoint:
path: /opt/ccp/bin/haproxy_entrypoint.py
content: haproxy_entrypoint.py
perm: "0755"
# {% if db.tls.enabled %}
ca.pem:
path: /opt/ccp/etc/tls/ca.pem
content: ca.pem.j2
perm: "0400"
server-key.pem:
path: /opt/ccp/etc/tls/server-key.pem
content: server-key.pem.j2
perm: "0400"
server-cert.pem:
path: /opt/ccp/etc/tls/server-cert.pem
content: server-cert.pem.j2
perm: "0400"
# Can't use it right now, because of the file creation order
dhparams.pem:
path: /var/lib/mysql/dhparams.pem
content: dhparams.pem.j2
perm: "0400"
# {% endif %}

@@ -1,5 +0,0 @@
#!/bin/bash
set -ex
workdir=$(dirname $0)
yamllint -c $workdir/yamllint.yaml $(find . -not -path '*/\.*' -type f -name '*.yaml')

@@ -1,21 +0,0 @@
extends: default
rules:
braces:
max-spaces-inside: 1
comments:
level: error
comments-indentation:
level: warning
document-end:
present: no
document-start:
level: error
present: no
empty-lines:
max: 1
max-start: 0
max-end: 0
line-length:
level: warning
max: 120

tox.ini

@@ -1,29 +0,0 @@
[tox]
minversion = 1.7
envlist = py35,py34,py27,pypy,pep8,linters
skipsdist = True
[testenv:linters]
deps = yamllint
commands =
{toxinidir}/tools/yamllint.sh
[testenv:pep8]
deps = flake8
commands =
flake8 {posargs}
[testenv:venv]
commands = {posargs}
[testenv:venv3]
basepython = python3
commands = {posargs}
[flake8]
# H102 skipped as it's a non-free project
show-source = True
ignore = H102
builtins = _
exclude=.venv,.git,.tox,dist,doc,*openstack/common*,*lib/python*,*egg,build