puppet-tripleo/manifests/profile
Mike Bayer f483c91fae Force MySQL / MariaDB log_warnings to 1
The default for this MySQL / MariaDB server variable changed
from 1 to 2 between MariaDB 10.1 and 10.2 [1].  As a result,
any database connection that is not gracefully closed produces
the message "Got an error reading communication packets" in the
MySQL server log.  This message is misleading, as it does not
usually refer to any actionable issue; real connectivity issues
are always seen in application logs, and most of these messages
in the server logs are likely to be false positives due to the
behavior of HAProxy.

While applications can reduce the occurrence of this error by
ensuring that database connections are gracefully closed, this
is already the behavior of oslo.db and SQLAlchemy, which maintain
a connection pool that closes out stale connections explicitly
when requests are made.

The majority of these warnings are likely the result of normal HAProxy
operation, where the settings "timeout client" and "timeout server"
are set to 90 minutes, such that any connection older than this
time will be non-gracefully closed by the proxy, generating
the warning.  An idle application server process will not have attended
to connections that are older than the timeout period,
leading to these connections being left for HAProxy to handle;
HAProxy's timeout behavior leading to this message in the logs has been
confirmed in local experimentation.

The application server itself is never exposed to this, because
at the start of work it always recycles any connection that is
older than its own timeout, which defaults to 60 minutes for
applications using oslo.config + oslo.db.  As HAProxy has no
capability to close out these connections using MySQL's protocol,
the messages are unavoidable.
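
The interaction between the two timeouts can be sketched in a few
lines of Python.  This is only an illustrative model, not code from
oslo.db, SQLAlchemy or HAProxy: the function name and the returned
labels are hypothetical, and it assumes a process that opened a
connection and then sat idle, so that connection age and idle time
are the same.

```python
# Illustrative model only; names and labels are hypothetical.
APP_RECYCLE_SECONDS = 60 * 60    # application pool recycle timeout (60 minutes)
PROXY_TIMEOUT_SECONDS = 90 * 60  # HAProxy "timeout client" / "timeout server" (90 minutes)


def connection_fate(idle_seconds):
    """Describe a pooled connection that sat idle for idle_seconds.

    Returns (fate, server_logs_warning, app_sees_error).
    """
    if idle_seconds < APP_RECYCLE_SECONDS:
        # Young enough to be handed back to the application as-is.
        return ("reused", False, False)
    if idle_seconds < PROXY_TIMEOUT_SECONDS:
        # The pool closes it gracefully and opens a fresh connection.
        return ("recycled by pool", False, False)
    # The proxy already dropped it without a MySQL-level goodbye, so the
    # server logged the warning.  The pool still replaces the connection
    # before use because it is older than the recycle timeout, so the
    # application never sees an error.
    return ("dropped by proxy", True, False)


if __name__ == "__main__":
    for minutes in (30, 75, 120):
        print(minutes, connection_fate(minutes * 60))
```

In this model the warning appears only for connections idle past the
proxy timeout, and in no case does the application see an error.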

The message will also occur whenever an OpenStack process is stopped
or killed, for every connection that is pooled in that process.

The correct way to diagnose whether an application is having
connectivity issues is to look in the application server log itself
for error messages and stack traces, which give much more detail about
the context that produced a particular error.  This warning is also
known to occur when an application server is not able to respond
to packets quickly enough, as has been observed with services
such as Cinder where eventlet monkeypatching causes the PyMySQL
client to be blocked; however, when this occurs, there is an
informative stack trace and error message in the application logs
that shows what's going on.

As most occurrences of this particular warning message refer to
normal behavior as designed, the message is not useful; the
log level should be forced to "1" to prevent these messages,
as they are causing confusion in downstream environments.
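
For illustration, the resulting server setting corresponds to a
"log_warnings" entry in the "[mysqld]" section of an option file.
The change itself is applied through the Puppet manifests; the
Python sketch below is not part of it and only renders such a
fragment, with a hypothetical helper name.

```python
import configparser
import io


def render_mysqld_fragment(log_warnings=1):
    """Render an option-file fragment that sets log_warnings.

    Hypothetical helper for illustration; not part of puppet-tripleo.
    """
    parser = configparser.ConfigParser()
    parser["mysqld"] = {"log_warnings": str(log_warnings)}
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    print(render_mysqld_fragment())
```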

[1] https://mariadb.com/kb/en/upgrading-from-mariadb-101-to-mariadb-102/#incompatible-changes-between-101-and-102

Change-Id: I0efb4f77aaceda635c8983d6b7a240171a7accdc
2020-11-19 11:19:49 +00:00
base Force MySQL / MariaDB log_warnings to 1 2020-11-19 11:19:49 +00:00
pacemaker Force MySQL / MariaDB log_warnings to 1 2020-11-19 11:19:49 +00:00