By default Python configures SIGPIPE to be SIG_IGN, which means to
ignore the signal. We don't want that as it causes problems when
journald restarts and our log calls start triggering SIGPIPEs.
Instead, we want to allow the SIGPIPE to kill the process so it can
be restarted by systemd.
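A minimal sketch of the idea, assuming a POSIX system; the helper name `restore_default_sigpipe` is hypothetical and not taken from the os-collect-config source:

```python
import signal


def restore_default_sigpipe():
    """Replace Python's startup SIG_IGN with the OS default action.

    With SIG_DFL, a write to a closed pipe terminates the process
    instead of raising an exception in the writer.
    """
    signal.signal(signal.SIGPIPE, signal.SIG_DFL)


# Must be called from the main thread, early in startup.
restore_default_sigpipe()
```

Once the process dies from SIGPIPE, restarting it is left to the service manager (systemd here).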
Change-Id: I512139b96b2de8b372efc91e8a3fc8d33553405a
Closes-Bug: #1795030
When the os-collect-config process is started on multiple systems at the
same time, the polling intervals can line up to cause performance
problems against the configuration source. To reduce the impact, this
change adds a splay option that lets the operator configure a random
delay before polling, so that systems started together do not keep
polling in synchronization.
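One way such a splay could be computed is sketched below. The function name `splay_delay` and the choice of a uniform distribution are assumptions for illustration, not the actual implementation:

```python
import random


def splay_delay(splay, rng=random):
    """Return a random delay in [0, splay] seconds.

    A splay of 0 (or less) disables the delay entirely.
    """
    if splay <= 0:
        return 0.0
    return rng.uniform(0, splay)


# Each host would sleep for its own random delay before its first poll.
delay = splay_delay(60)
```

Because every host draws its own delay, hosts that start at the same moment begin polling at different offsets.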
Change-Id: I1a8be3345d783da9014eca7ea26da19d57e767c0
Closes-Bug: #1677314
There are two things at play here. First, the logic added in
4cfeb28d12 is no longer needed
because our initial sleep time is very low and increases
gradually up to the max.
Second, I'm proposing that we avoid re-executing unless the config
file actually changes. Not re-executing gives us the option
to optimize os-collect-config for some long-running collectors
(like zaqar websockets). Also, updates to os-collect-config itself
would already be handled by packaging restarts and/or other
deployment system changes anyway.
Change-Id: I04b2752d007089f72af42c88c4249c3e11c8346f
We're currently still using unmaintained oslo-incubator code for
our logging, which is bad. This switches us to oslo.log as
described in [1].
[1] http://docs.openstack.org/developer/oslo.log/migration.html
Change-Id: Ibce86ab4ee24eeb55d0de1b0d5ff4ee4ea6ef66f
This isn't quite right and broke on stable/liberty. Pushing
a revert in case I3c22d77dece399d21ab94783b74990789a1e1481
doesn't actually fix the problem. We should probably merge
whichever passes first.
This reverts commit 69653318f4.
Change-Id: I9304429f25d28ca756e50b1788e149c5bb46b1d6
The old oslo-incubator log module isn't maintained (and doesn't even
exist anymore), so we don't really want to be using it. It appears
this was the only incubator module we were actually using, so this
allows us to remove all of the unmaintained incubator code.
Change-Id: Ib4ad3b231360987a1ef4f95b5b5a8b656232efc4
This patch updates os-collect-config so that the sleep interval
time is shortened if changes are detected. This should decrease
deployment time when using Heat templates which use depends_on
to step through a sequence of software deployment resources.
The sleep time now starts at 1 and doubles
(sleep_time *= 2) until it reaches the default
sleep interval again.
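The doubling schedule can be sketched as a generator. The name `sleep_intervals` is hypothetical, and capping the value at the maximum (rather than overshooting it) is an assumption:

```python
import itertools


def sleep_intervals(max_sleep, initial=1):
    """Yield sleep times that double from `initial` up to `max_sleep`."""
    sleep_time = initial
    while True:
        # Never sleep longer than the configured maximum.
        yield min(sleep_time, max_sleep)
        if sleep_time < max_sleep:
            sleep_time *= 2


# First few sleep times for a 30 second maximum.
schedule = list(itertools.islice(sleep_intervals(30), 7))
```

When a change is detected, the caller would start a fresh generator so that polling becomes frequent again.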
Change-Id: I5cbd0956db2abebec876b15bee72b70ec64d5aef
This adds a new zaqar collector able to read configuration from a
specific zaqar queue.
blueprint software-config-zaqar
Change-Id: Ie38af7b59e7a1aa370ac7760bb7819e37c2165c3
This is required so that a swift-enabled TripleO undercloud can switch
to polling for metadata from a TempURL rather than heat.
Change-Id: I73ac9e01f85e0c72ce7411e2c61c545322f3dccc
Closes-Bug: #1424913
The Oslo libraries have moved all of their code out of the 'oslo'
namespace package into per-library packages. The namespace package was
retained during kilo for backwards compatibility, but will be removed by
the liberty-2 milestone. This change removes the use of the namespace
package, replacing it with the new package names.
The patches in the libraries will be put on hold until application
patches have landed, or L2, whichever comes first. At that point, new
versions of the libraries without namespace packages will be released as
a major version update.
Please merge this patch, or an equivalent, before L2 to avoid problems
with those library releases.
Blueprint: remove-namespace-packages
https://blueprints.launchpad.net/oslo-incubator/+spec/remove-namespace-packages
Change-Id: If51059c31c82d5235e2ae21143911b5561783ca6
This change implements a collector which does an HTTP GET via
python requests to fetch the metadata.
It should work with any GET-able URL, however it is designed to
work with Swift TempURLs.
Swift objects are only eventually consistent, so the Last-Modified
header is checked on each poll, and metadata is not fetched unless its
value is newer than that of the previous successful poll.
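The comparison itself might look like the sketch below, which only covers the header check and makes no HTTP request. The function name `is_newer` is hypothetical, and it assumes both values are standard HTTP date strings:

```python
from email.utils import parsedate_to_datetime


def is_newer(last_modified, previous):
    """Return True if `last_modified` is strictly newer than `previous`.

    `previous` is None when no poll has succeeded yet, in which case
    the metadata should always be fetched.
    """
    if previous is None:
        return True
    return parsedate_to_datetime(last_modified) > parsedate_to_datetime(previous)
```

A collector would skip the fetch whenever `is_newer` returns False, and remember the header value only after a successful fetch.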
This collector will be enabled for OS::Nova::Server
software_config_transport: POLL_TEMP_URL which is available
in the Juno release of Heat. Using POLL_TEMP_URL will result
in no metadata polling load on heat, which has historically been
an issue with tripleo scalability.
Change-Id: I22155c22bdcc3c81a5e945ca5436a8f29f196528
When we detect a failed command we log ERROR but we do not return an
error status. This makes it difficult for programs which may run
os-collect-config to detect whether a run was successful.
This only applies to runs which are performed with the --one-time
argument, as this is a straightforward case.
Change-Id: I168862e8c75c15d1ea405a417908d1284feb7b32
Make changes pretty much all over the code base with respect to
encoding strings and fixing imports to support Python 3.
Change-Id: Id1920129001b8e223474c1b2faf8bd9d527fe7e7
The local collector is not in DEFAULT_COLLECTORS, but should be usable
explicitly. It, however, suffers from a bug where only
DEFAULT_COLLECTORS are allowed through.
Change-Id: Ia42d1acd39638b448e2e2bfa26aff1c7ae415b71
This collector will collect data from the local system, allowing image
builds or simple processes to influence the metadata.
implements bp tripleo-juno-occ-localdatasource
Change-Id: I0e58e8c631ffe8b63e8b4117df2c9ce2f413044f
The configuration will dictate whether or not something is configured.
If it is not, this is a normal state and should not be logged as a
warning.
Change-Id: I479f0aed5837871009bc69fa028f5eb64a060c53
Closes-Bug: #1321551
This reverts commit 6b478e9d90.
We will break anybody who is expecting CFN to be tried in all
circumstances with this. We probably just need to base which collectors
to try on what configuration we have, and not log warnings on
unconfigured collectors.
Change-Id: I4bf7d6f9af9487bf9d2c0942381c0ba68fc03ee9
Previously we were relying on the CFN compatibility API. This makes the
native Heat version the default.
Note that we want to keep full coverage, which is why we are explicitly
adding cfn back in during tests.
Change-Id: I5adedd052827e176e2f39071c719600df62019d7
Closes-Bug: #1321551
In later commits we will use this cache to memoize access to the
authentication details.
Change-Id: I389f78fe1eb176e37c90a1a87a4ba5fde3b33f05
Related-Bug: #1321437
This collector uses keystoneclient and heatclient to poll for the
configured resource metadata.
Changes were required to test_collect to allow collectors which needed
to fake something other than requests.
Change-Id: I3e93fe38b15f71193a4c024b24e6260d6adcf1b3
Before this, the exploded deployments that the cfn collector produced
would not ever be committed, and thus would always appear to have been
changed. This resulted in os-collect-config running the command
endlessly.
This requires some refactoring so that we commit changes to the cache
based on what was actually written, rather than just the static list of
collectors.
Closes-Bug: #1307153
Change-Id: I618ef5d752ed6519e8b7bfc090de03f2f24e73ce
With the new OS::Heat::StructuredDeployment resource, each Metadata
section may have multiple "deployments" in it. With this, we will return
a list with tuples of key and content to write to the cache.
Change-Id: I9f4272b0761e1dfd850bc5a5c6b27a78f126281f
Related-Bug: #1295787
While this is called a "cache", it is important for it to survive. On
reboot, servers may need what was in the cfn config to restore complex
network configurations.
We introduce a new command line option, --backup-cachedir, that will
default to the old path, /var/run/os-collect-config. This will keep
things working for any tools that have been hard coded to use the old
path.
Change-Id: I78b3851b35addfc16913e3cd53c9d0e7eb3d191a
We pass the list of json files containing the collected metadata to
os-refresh-config using the OS_CONFIG_FILES env variable, so it's a
pretty useful piece of information to log.
Change-Id: Id09e3f352e6a5a09e4183c0743a6e99a2783a888
The initial value of 300 seconds was a conservative estimate. However,
the requests and responses are somewhat small, so we can drop the polling
interval significantly and still maintain a high degree of network
scalability. After measuring the responses from the ec2 and cfn servers
with typical workloads, at 30 second intervals 100 servers will generate
around 26kB/s of requests, with about 66kB/s of responses.
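As a rough cross-check of these figures, the aggregate rates can be converted back into per-server, per-poll sizes. This is a back-calculation from the numbers above, not a separate measurement, and it covers the ec2 and cfn traffic combined:

```python
servers = 100
interval_s = 30
request_rate_kb_s = 26   # aggregate, from the measurement above
response_rate_kb_s = 66  # aggregate, from the measurement above

# Data exchanged by one server during one polling cycle.
per_poll_request_kb = request_rate_kb_s * interval_s / servers
per_poll_response_kb = response_rate_kb_s * interval_s / servers

# Assuming the per-poll sizes stay constant, the old 300 second
# interval would produce one tenth of the aggregate rates.
old_request_rate_kb_s = per_poll_request_kb * servers / 300
old_response_rate_kb_s = per_poll_response_kb * servers / 300
```

That works out to roughly 7.8 kB of requests and 19.8 kB of responses per server per poll.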
Change-Id: Iaa99ae405ba7c72ef8afc11c946400a2d0db5206
Without this change, if a user runs os-collect-config --force, it will
lock the user in an infinite loop running the command over and over with
very little chance to cancel. There are no compelling use cases for that
behavior, and it is extremely inconvenient, so making --force imply
--one-time improves usability of os-collect-config for users.
Change-Id: Ia8c9bf0bf97ab9e40e465c947c2f0cbeb981c08e
The use case for --print is an administrator wanting to view the
metadata that os-collect-config sees without running any commands.
Fixes bug #1213195
Change-Id: I0251f2c70574aeaa79997ce822d2a5ffbe08e345
This is a useful debugging and/or system fixer tool for instances where
metadata has not changed but one needs to re-run the configuration.
Fixes bug #1223693
Change-Id: I62b097bafa339fefcf6e03d11636f5ab622fb71a
This will allow tools like os-apply-config to read the list even when
they are run out of band from os-collect-config.
Change-Id: Ic4eaf649e234f4a1367d20c7ec52e93e787a7bb3
The option allows other programs to find the cache directory and files
without having access to OS_CONFIG_FILES.
Change-Id: Iad87efb65ea4db387e94160376c9eaf956fff413
Keep a hash of the config file for os-collect-config and, if it changes
during a failed run, rerun immediately (without sleeping), effectively
causing new nodes to be ready 5 minutes earlier.
The cfn credentials are placed into os-collect-config.conf by
os-apply-config and are not in place the first time os-collect-config is
run, so the first run of os-collect-config results in an error.
o-c-c then sleeps for 5 minutes before running successfully the second
time.
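A minimal sketch of the hash check follows. The helper names and the use of SHA-256 are assumptions; the source does not say which hash is used:

```python
import hashlib


def file_hash(path):
    """Return a hex digest of the file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def config_changed(path, previous_hash):
    """Return True if the file no longer matches `previous_hash`."""
    return file_hash(path) != previous_hash
```

The main loop would record `file_hash(config_path)` before a run and, if the run fails and `config_changed` is true, skip the sleep and retry straight away.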
Fixes bug #1219186
Change-Id: I090de7a3d84e0ea342f1a422646c0c455eb37f4a
On a system with o-c-c installed by pip, the binary generated by PBR
calls __main__() directly, so the code that sets up logging should be
placed there. Otherwise it is bypassed, resulting in missing log messages.
Change-Id: I94ba4f61be9595a6ddee134d806e5f99ae4adf73
__main__ is called directly during tests, but was resetting the
logging environment within it, which prevented tests from capturing
the log events.
Change-Id: If710e11091723144c97c88aab4aa5e6126844d2b
The point of delaying the commit of data to the cache is that we want to
make sure the command succeeds before giving up on the data changes.
This will ensure that we keep trying the command with any given change
to the metadata until it succeeds.
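The ordering described here can be sketched as follows. `run_then_commit` and the `commit` callback are hypothetical names standing in for whatever writes the collected data into the cache:

```python
import subprocess
import sys


def run_then_commit(command, commit):
    """Run `command`; call `commit()` only if it exits with status 0.

    Returns True when the command succeeded and the data was committed.
    """
    returncode = subprocess.call(command)
    if returncode != 0:
        # Leave the cache untouched so the change is retried next time.
        return False
    commit()
    return True


# Usage with a trivial command that always succeeds.
committed = []
ok = run_then_commit(
    [sys.executable, "-c", "raise SystemExit(0)"],
    lambda: committed.append("metadata"),
)
```

Because a failed command leaves the cache as it was, the same metadata still counts as changed on the next poll and the command is tried again.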
Change-Id: Idf3a09686b4bbf0e16a9bc9f3359ee9937fcc627
The default order results in less-dynamic heat_local and ec2 overriding
the more dynamic cfn source. That is the opposite of what is desired.
Change-Id: I7e1feb2e6869b4f076200668dd204219ecc4224e
After we have run a command and committed, re-execute ourselves. This
ensures that we will get any configurations that may have come from
underlying commands. Also re-execute if os-collect-config is sent HUP.
Change-Id: I87b4d8ce44fcbc9458a3a4fbb2445e4c9d0ad4e7
This makes os-collect-config stay resident and prepares it for a more
event-based operation when the Heat API is ready for that via longpoll,
callbacks, or something else.
Change-Id: Ic91f2201d504e9f8e0ada6d34a7d6d94785aec87
Positional arguments now specify which collectors to use. This allows
disabling a collector if it is problematic, and also re-ordering of the
emitted $OS_CONFIG_FILES from the default order if necessary.
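A sketch of positional collector selection with argparse is shown below. The collector names come from these commit messages, but their order in `KNOWN_COLLECTORS`, the program name, and the validation step are assumptions for illustration:

```python
import argparse

# Illustrative list only; the real default set and order may differ.
KNOWN_COLLECTORS = ("heat_local", "ec2", "cfn")


def parse_collectors(argv):
    """Return the collectors to run, in the order given on the command line."""
    parser = argparse.ArgumentParser(prog="collect-sketch")
    parser.add_argument(
        "collectors",
        nargs="*",
        default=list(KNOWN_COLLECTORS),
        help="collectors to use, in order (default: all)",
    )
    args = parser.parse_args(argv)
    unknown = [c for c in args.collectors if c not in KNOWN_COLLECTORS]
    if unknown:
        parser.error("unknown collectors: %s" % ", ".join(unknown))
    return args.collectors
```

For example, `parse_collectors(["cfn", "ec2"])` drops heat_local and puts cfn first, while an empty argument list falls back to the full default list.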
Change-Id: I7e76db991c0b16c529c1cbf9a1ba9beb78e45482