This patch fixes the initial value of the need_job variable by setting
it to False. In some cases, a None value causes errors that are not
easy to detect.
Change-Id: Ic4eac2fbc274fc5dfe9b2f4b796888b96bd78d0c
Story: 2007352
Task: 38932
All user-settable options should be stored in the
devstack/settings file so they are defined when it is
sourced early in devstack, which allows values to be shared
between plugins if required.
This change moves the cyborg settings from
devstack/lib/cyborg to devstack/settings to conform to
the standard plugin interface.
The name of the folder where devstack clones the plugin
is specified in the first argument of the enable_plugin
function invocation in local.conf.
This change updates CYBORG_DIR to respect that
and adds TODOs for other issues that will be addressed in
follow-up patches.
Change-Id: I5b6879e5ddb86659b8c7eb87b8d26cee33ed4754
This adds a method to CyborgObject that allows it to convert itself
to an older version, within a compatibility window. So, if an object
had a revision that added or changed the formatting of an attribute,
the obj_make_compatible() method can fix up a primitive representation
before it is sent to a client expecting the older version.
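The downgrade step described above can be sketched as follows. This is a minimal illustration, not Cyborg's actual implementation: the class name DeviceProfileSketch, the version numbers, and the assumption that version 1.1 added a description field are all hypothetical.

```python
# Hypothetical stand-in for a versioned object; not Cyborg's actual class.
class DeviceProfileSketch:
    VERSION = "1.1"  # assumed: 1.1 added the 'description' field

    def __init__(self, name, description=None):
        self.name = name
        self.description = description

    def obj_to_primitive(self, target_version=None):
        """Serialize to a dict, downgrading it if an older version is asked."""
        primitive = {"name": self.name, "description": self.description}
        if target_version is not None:
            self.obj_make_compatible(primitive, target_version)
        return {"version": target_version or self.VERSION, "data": primitive}

    def obj_make_compatible(self, primitive, target_version):
        """Remove attributes that the target version does not know about."""
        target = tuple(int(part) for part in target_version.split("."))
        if target < (1, 1):
            primitive.pop("description", None)


profile = DeviceProfileSketch("fpga-dp", description="example profile")
old = profile.obj_to_primitive(target_version="1.0")
new = profile.obj_to_primitive()
```

A client that only understands the assumed version 1.0 receives old, which has no description key, while new keeps the full representation.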
Partial-Implements: blueprint add-description-field-to-device-profiles
Change-Id: I196629059bc32165f161fe9c071a339d63d71c10
This change alters the fake driver to include the hostname
in the deployable list so that each host in a multi node deployment
will have a unique placement RP name.
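As a rough sketch of the idea, a name derived from the hostname stays unique across hosts. The function name and the "_FakeDevice" suffix are assumptions made for illustration, not the fake driver's real naming scheme.

```python
import socket


def fake_deployable_name(hostname=None):
    """Build a per-host deployable name (naming scheme is assumed)."""
    hostname = hostname or socket.gethostname()
    return "%s_FakeDevice" % hostname


# Two hosts in a multi-node deployment yield two distinct names.
names = {fake_deployable_name(h) for h in ("compute-1", "compute-2")}
```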
Change-Id: Ib0e202cac8af5ef7c5028c22dc0654911eb730f5
No good solution has been found yet.
This reverts commit 08af6012710fc9bbbec0f7936abd701d4eccf7db.
The history of the timeout has three stages:
1. After we enabled py37, random timeouts occurred as follows:
https://review.opendev.org/#/c/679406/ On Sep 19 11:30 AM
https://review.opendev.org/#/c/688239/ On Oct 12 3:31 PM
https://review.opendev.org/#/c/688231/ On Oct 28 11:38 PM,
Oct 29 9:11 AM, Oct 29 12:05 PM
https://review.opendev.org/#/c/685542/ On Nov 14 12:31 AM,
Nov 14 7:42 AM
https://review.opendev.org/#/c/691872/ On Oct 29 11:55 PM
https://review.opendev.org/#/c/690509/ On Nov 15 4:33 PM
2. After https://review.opendev.org/#/c/688593/ the timeout disappeared
for a while. That patch was merged on Nov 19 5:05 PM.
It set a wrong Python path env.
3. After another patch, https://review.opendev.org/#/c/696397, the
timeout came back.
The intention of that patch was to suppress a confusing pep8 message,
but it also unintentionally fixed the Python path env.
The thread pool library loops continuously to check whether there are
new jobs.
The unit test framework behaves differently in py37 than in py36: if
a thread pool is running, the test run never returns.
You can try a simple test as follows in this file:
def job1(t1=0):
    print(time.time())
    print("sleep %s second" % t1)
    time.sleep(t1)
    print(time.time())
    return "Hello, world"

class TestExtARQObject(base.DbTestCase):
    def test_foo(self):
        print("Test Foo")
    def test_bar(self):
        print("Test bar")
    def test_apply_patch_fpga_arq_monitor_job(self):
        works = utils.ThreadWorks()
        job = works.spawn(job1, 1)
        return job
Change-Id: I398db324563ecdb6e8fe0abb86fd02c1336b467f
Many exceptions are defined in such a way that they will not render properly
when stringified. This is because instead of _msg_fmt, they used msg_fmt
or message in the class definition.
This fixes those and adds a test which I used to find all the offenders.
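The rendering problem and the kind of check used to find offenders can be sketched as below. The base class here is a simplified stand-in written for illustration, and find_offenders is a hypothetical helper, not the test added by this change.

```python
class SketchException(Exception):
    """Simplified base class: renders _msg_fmt with the given kwargs."""
    _msg_fmt = "An unknown exception occurred."

    def __init__(self, message=None, **kwargs):
        if message is None:
            message = self._msg_fmt % kwargs
        super().__init__(message)


class GoodNotFound(SketchException):
    _msg_fmt = "Device %(uuid)s could not be found."


class BadNotFound(SketchException):
    # Wrong attribute name: the base class never reads it.
    msg_fmt = "Device %(uuid)s could not be found."


def find_offenders(base):
    """Return names of subclasses that define msg_fmt or message directly."""
    offenders = []
    stack = list(base.__subclasses__())
    while stack:
        cls = stack.pop()
        stack.extend(cls.__subclasses__())
        if "msg_fmt" in vars(cls) or "message" in vars(cls):
            offenders.append(cls.__name__)
    return sorted(offenders)
```

In this sketch, stringifying BadNotFound falls back to the generic base message, which is the symptom described above, and find_offenders(SketchException) reports it.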
Closes task: #38817
Change-Id: I085ef5b0197b76b7b53639610f62b615fb538983
Before this change, when the agent called the conductor to report_data(),
if the parent provider was not found by hostname, we would log an error,
and then continue to create the "child" provider with no parent. We should
never do this if we are supposed to have a parent. Cleanup from this
situation is also messy.
This makes us raise PlacementResourceProviderNotFound() in that case,
which aborts the report and thus does not create the provider incorrectly.
It also makes the agent catch that exception and moves the log message
to the agent where the actual problem is (i.e. likely misconfiguration).
The exception used here is actually defined incorrectly, having a message
class variable instead of _msg_fmt, which caused it to not render properly.
This fixes that along the way and adds tests for the new conductor and
agent behaviors.
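The new control flow can be sketched roughly as follows. The exception name comes from the description above; everything else (the function names, the dict standing in for placement, the log wording) is assumed for illustration, and the RPC layer between agent and conductor is left out.

```python
import logging

LOG = logging.getLogger(__name__)


class PlacementResourceProviderNotFound(Exception):
    # Simplified stand-in for the Cyborg exception of the same name.
    def __init__(self, resource_provider):
        super().__init__(
            "Placement resource provider not found: %s" % resource_provider)


def conductor_report_data(providers, hostname, child_name):
    """Create a child provider only when the parent exists (sketch)."""
    parent = providers.get(hostname)
    if parent is None:
        # Abort the report instead of creating an orphaned child provider.
        raise PlacementResourceProviderNotFound(resource_provider=hostname)
    parent.setdefault("children", []).append(child_name)
    return child_name


def agent_update_usage(providers, hostname, child_name):
    """Agent side: catch the error and log it where the cause likely is."""
    try:
        return conductor_report_data(providers, hostname, child_name)
    except PlacementResourceProviderNotFound as exc:
        LOG.error("%s. Check the hostname configured for this agent.", exc)
        return None
```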
Closes task: 38813
Closes task: 38814
Change-Id: Ied8ee91592eb0b4675f9c155e30a6c3a7df9b597
Because we are considering using multiprocessing, this submits only
part of the UTs.
The other UTs are TBD.
This also fixes some bugs found while writing the UTs.
It also fixes the following Py3 compatibility errors:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/mock/mock.py", line 1330, in patched
    return func(*args, **keywargs)
  File "/opt/stack/cyborg/cyborg/tests/unit/objects/test_ext_arq_job.py", line 293, in test_job_monitor
    objects.ext_arq.ExtARQ.job_monitor(self.context, works_generator, extarqs)
  File "/opt/stack/cyborg/cyborg/common/utils.py", line 432, in _impl
    LOG.error(msg, e.message)
AttributeError: 'AttributeError' object has no attribute 'message'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/mock/mock.py", line 1330, in patched
    return func(*args, **keywargs)
  File "/opt/stack/cyborg/cyborg/tests/unit/objects/test_ext_arq_job.py", line 293, in test_job_monitor
    objects.ext_arq.ExtARQ.job_monitor(self.context, works_generator, extarqs)
  File "/opt/stack/cyborg/cyborg/common/utils.py", line 430, in _impl
    output = method(self, *args, **kwargs)
  File "/opt/stack/cyborg/cyborg/objects/extarq/ext_arq_job.py", line 154, in job_monitor
    for _, (exc, tb), _, err in works_generator:
  File "/opt/stack/cyborg/cyborg/common/utils.py", line 333, in future_iterator
    f.exception_info(), f._result, f._state, e.message)
AttributeError: 'Future' object has no attribute 'exception_info'
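Both errors above come from Python 2 era APIs: exceptions no longer have a message attribute, and the standard library Future has no exception_info() method. A minimal sketch of Python 3 equivalents follows; the future_iterator here is a simplified illustration with an assumed tuple layout, not the function in cyborg/common/utils.py.

```python
from concurrent.futures import ThreadPoolExecutor


def failing_job():
    raise ValueError("boom")


def future_iterator(futures):
    """Yield (exception, traceback, succeeded, error_text) per future."""
    for f in futures:
        exc = f.exception()  # Python 3 API; blocks until the future is done
        tb = exc.__traceback__ if exc is not None else None
        err = str(exc) if exc is not None else None  # replaces exc.message
        yield exc, tb, exc is None, err


with ThreadPoolExecutor(max_workers=1) as pool:
    results = list(future_iterator([pool.submit(failing_job),
                                    pool.submit(len, "ok")]))
```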
Change-Id: Ibbcaac060b365cf3366e3b408a385d613b006dff
The py37 job has been reporting timeout errors recently.
Please see [1] [2] [3].
At first, the error was suspected to be caused
by patch [4].
Therefore, Feng Shaohe's patch [5] reverted that change,
and the py37 timeout then disappeared.
But in fact, the problem was only hidden.
After that setting was removed, the py37 job
was actually running in a Python 3.6 environment
(the community CI default version is 3.6); please see [6]
for the detailed reasons.
Therefore, this patch exposes the hidden py37 timeout problem.
At the same time, I found the method
test_apply_patch_fpga_arq_monitor_job, which I think is the cause of
the timeout. I found this method by troubleshooting the tox -epy37 log.
After commenting out this method, tox -epy37 runs
normally and the timeout no longer occurs.
If you want to test this, please ensure that you have a local
Python 3.7 environment, not 3.6, and execute rm -rf .tox/.
Then execute tox -epy37.
Therefore, the best way forward is to comment out this method and
restore the py37 job at the same time.
If someone discovers the underlying cause and a solution, this method
can be restored; please refer to [7].
What went wrong in this method?
Deep in the call chain of this method, ThreadWorks from
the thread pool is used, which under Python 3.7 blocks
the execution of unit tests. For the specific reasons, please see
[8] [9].
References:
[1]. https://review.opendev.org/#/c/702578/
[2]. https://review.opendev.org/#/c/703049/
[3]. https://review.opendev.org/#/c/703253/
[4]. https://review.opendev.org/#/c/696397/
[5]. https://review.opendev.org/#/c/706911/
[6]. http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2020-02-12.log.html#t2020-02-12T16:46:18
[7]. deed9c822e
[8]. https://review.opendev.org/#/c/707045/5//COMMIT_MSG
[9]. c61dd8c376/cyborg/objects/extarq/ext_arq_job.py (L41)
Change-Id: I09db889fe665c6246ec9503af92c909e7d0da24f
This is part of a series of exception optimizations.
In fact, we only need the ResourceNotFound exception
to cover the NotFound exceptions.
More UTs for the control path, such as
get, list, create, and delete, will be added in the future.
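The idea of one generic exception covering the per-resource NotFound cases can be sketched like this. The message format and the get_by_uuid helper are assumptions for illustration, not Cyborg's actual definitions.

```python
class ResourceNotFound(Exception):
    # Simplified sketch; the message format is an assumption.
    _msg_fmt = "%(resource)s not found: %(msg)s"

    def __init__(self, resource, msg):
        super().__init__(self._msg_fmt % {"resource": resource, "msg": msg})
        self.resource = resource


def get_by_uuid(table, resource, uuid):
    """Look up a row, raising the generic exception for any resource type."""
    try:
        return table[uuid]
    except KeyError:
        raise ResourceNotFound(resource=resource, msg="with uuid=%s" % uuid)
```

With this shape, callers catch a single exception type and read the resource attribute when they need to know which kind of object was missing.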
Change-Id: I740eb28184b434583b58f10d2bf3e5e4621c43d4
Story: 2007045
Task: 38318
This patch adds some UTs for ExtArq:
1. get
2. update
3. create
4. list
5. delete
Change-Id: I8f0d15d8c34f1eb77366d6021e465fcebd1be406
Story: 2007091
Task: 38133
Attributes and deployables are stored in separate
tables, so we should remove the attributes from the
deployable object.
Change-Id: I1be185a6bce2ae90eca244b21b207a22e5a92044
Story: 2007182
Task: 38303
This patch adds some UTs for deployable:
1. get
2. update
3. create
4. list
5. delete
Change-Id: I39b3f02e898b67e4d4eb686b5a6cf9065c6280de
Story: 2007091
Task: 38141
This patch adds some UTs for device:
1. get
2. update
3. create
4. list
5. delete
Change-Id: Id66ba6f1442f87a0f8fb9644e45e147cc77a4f5e
Story: 2007091
Task: 38121
This patch adds some UTs for attach handle:
1. get
2. update
3. create
4. list
5. delete
6. allocate
Change-Id: I5e683c99d1e08ed6a166a110a87b665cdbc5bde3
Story: 2007091
Task: 38161
The specs directory in Cyborg is not kept up to date, and the
Cyborg specifications are published at https://specs.openstack.org/openstack/cyborg-specs/,
so this removes the directory from Cyborg to reduce maintenance costs.
Change-Id: Iebcbf2ebd6da3bc51e85c62f18c547909026c2f0
This is based on discussion with the Nova community. See:
https://review.opendev.org/#/c/692707/6/nova/objects/external_event.py@36
Each event has a unique tag, i.e. the ARQ UUID, and the
bind status for that ARQ.
Each ARQ has its own state. However, the bind status sent to Nova
should be 'completed' or 'failed'. The logic to do that conversion
should not be in nova_client.py, to keep it free of ARQ state details.
So it has been added in get_arq_bind_status() in ext_arq_job.py.
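A minimal sketch of that conversion is shown below. The ARQ state names, the event name, and the build_event helper are assumptions for illustration; the sketch also assumes only terminal bind states reach the conversion, so anything other than a successful bind is reported as failed.

```python
# Hypothetical ARQ state names, for illustration only.
ARQ_BOUND = "Bound"
ARQ_BIND_FAILED = "BindFailed"


def get_arq_bind_status(arq_state):
    """Collapse an internal ARQ state into the status reported to Nova."""
    return "completed" if arq_state == ARQ_BOUND else "failed"


def build_event(arq_uuid, arq_state, instance_uuid):
    """Build one event; the client code sees no ARQ state details."""
    return {
        "name": "accelerator-request-bound",  # assumed event name
        "server_uuid": instance_uuid,
        "tag": arq_uuid,  # unique tag: the ARQ UUID
        "status": get_arq_bind_status(arq_state),
    }
```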
Change-Id: Iddbf9a77196fc42ac82ad1f6d88a4b0732852463
Currently, if filters is None, as in the call
dbapi.device_profile_list_by_filters(self.context, filters=None),
a TypeError is raised because None is not iterable.
The main error info:
Traceback (most recent call last):
  File "/home/my_work/code/cyborg/cyborg/db/sqlalchemy/api.py", line 558, in device_profile_list_by_filters
    filters, exact_match_filter_names)
  File "/home/my_work/code/cyborg/cyborg/db/sqlalchemy/api.py", line 223, in _exact_filter
    if key not in filters:
TypeError: argument of type 'NoneType' is not iterable
This patch adds initial validation of the filters.
Change-Id: Icf711dc3621fb8d2e5b022ab1d1ce02b0885b055