Merge sysinv_fpga_agent with sysinv_agent
Merge the sysinv-fpga-agent service into sysinv-agent in order to reduce
overall OS overhead.

The calls to "wait_for_n3000_reset()" and "wait_for_host_uuid()" in the
previous fpga-agent-manager are replaced by checks in agent-manager that
ensure the FPGA devices are reset and host_uuid is available. Also, the
content of the "fpga_pci_update()" and "report_fpga_inventory()" methods
is inserted directly in the body of the "agent_audit()" method.

Test Plan:

On AIO-DX env (CentOS):
<sysinv-fpga-agent tests>
PASS: Check FPGA pod and its resources.
PASS: Check FPGA pod and its resources after lock/unlock.
PASS: Check FPGA pod and its resources after the system reboot.
PASS: Verify image upload with non-functional image with
      retimer-included
PASS: Verify retimer_a_version and retimer_b_version after applying
      BMC image with re-timer and bmc
PASS: Verify firmware update for BMC and retimer image with
      retimer-include=False
PASS: Verify apply BMC image without re-timer first and then BMC image
      with re-timer, only latest image is kept in
      device-image-state-list
PASS: Test accelerator configuration is persistent after lock/unlock.
PASS: Test to verify that the accelerator configuration is persistent
      after a graceful reboot.
<sysinv-agent tests>
PASS: Verify alarms raised by PTP feature
PASS: Verify the configuration and run of single ptp-instance
PASS: Verify the configuration and run of single phc2sys
PASS: Verify PTP CLI commands

On AIO-SX env (Debian):
PASS: Check FPGA pod and its resources.
PASS: Check FPGA pod and its resources after lock/unlock.
PASS: Check FPGA pod and its resources after system reboot.
PASS: Check if FPGA device can be detected, configured.
PASS: Test accelerator configuration is persistent after lock/unlock.
PASS: Test to verify that the accelerator configuration is persistent
      after graceful reboot.

Story: 2010087
Task: 45628

Signed-off-by: Davi Frossard <dbarrosf@windriver.com>
Change-Id: I83edd261898498344001ca90bb53a5f65e66728c
parent af35377f56
commit 6d4e2681a0
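The commit message describes replacing the blocking "wait_for_n3000_reset()" and "wait_for_host_uuid()" calls with checks made inside the periodic audit. The sketch below illustrates that general pattern only; it is not the code from this change. The class name, the marker-file path argument, the `fpga_reported` flag and the `sent` list are all hypothetical, and whether the real audit reports FPGA inventory once or on every pass is not shown in this excerpt.

```python
import os


class AgentManagerSketch:
    """Illustrative stand-in for the merged agent manager (not the real class)."""

    def __init__(self, reset_flag_path):
        # Hypothetical marker file meaning "N3000 reset has completed".
        self._reset_flag_path = reset_flag_path
        self.host_uuid = None        # filled in once the host UUID is known
        self.fpga_reported = False   # True after FPGA inventory was sent once
        self.sent = []               # records what would have gone out over RPC

    def _fpga_ready(self):
        # Non-blocking check standing in for the old wait_for_n3000_reset()
        # and wait_for_host_uuid() loops.
        return (self.host_uuid is not None
                and os.path.exists(self._reset_flag_path))

    def agent_audit(self):
        """Run one audit pass; return True if FPGA inventory was reported."""
        if self.fpga_reported or not self._fpga_ready():
            return False  # nothing to do yet; retry on the next periodic pass
        # In the real agent, the inlined fpga_pci_update() and
        # report_fpga_inventory() logic would run at this point.
        self.sent.append(("fpga_inventory", self.host_uuid))
        self.fpga_reported = True
        return True
```

With this shape, the audit thread never sleeps waiting for the FPGA: a pass that finds the preconditions unmet simply returns, and a later pass picks the work up once both conditions hold.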
@@ -25,9 +25,6 @@ cgts-client
# sysinv-agent
sysinv-agent

# sysinv-fpga-agent
sysinv-fpga-agent

# sysinv
sysinv

@@ -5,7 +5,6 @@ sysinv/cert-mon
sysinv/cert-alarm
sysinv/cgts-client
sysinv/sysinv-agent
sysinv/sysinv-fpga-agent
sysinv/sysinv
config-gate
tsconfig
@@ -8,5 +8,4 @@ controllerconfig
#storageconfig
sysinv
sysinv-agent
sysinv-fpga-agent
workerconfig-standalone
@@ -1,6 +1,5 @@
controllerconfig
sysinv/cgts-client
sysinv/sysinv-fpga-agent
sysinv/sysinv
tsconfig
config-gate
6
sysinv/sysinv-fpga-agent/.gitignore
vendored
@@ -1,6 +0,0 @@
!.distro
.distro/centos7/rpmbuild/RPMS
.distro/centos7/rpmbuild/SRPMS
.distro/centos7/rpmbuild/BUILD
.distro/centos7/rpmbuild/BUILDROOT
.distro/centos7/rpmbuild/SOURCES/sysinv-fpga-agent*tar.gz
@@ -1,202 +0,0 @@

                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.

      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.

      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.

      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.

      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.

      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).

      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.

      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."

      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.

   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.

   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.

   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:

      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and

      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and

      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and

      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.

      You may add Your own copyright statement to Your modifications and
      may provide additional or different license terms and conditions
      for use, reproduction, or distribution of Your modifications, or
      for any such Derivative Works as a whole, provided Your use,
      reproduction, and distribution of the Work otherwise complies with
      the conditions stated in this License.

   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.

   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.

   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE. You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.

   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.

   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.

   END OF TERMS AND CONDITIONS

   APPENDIX: How to apply the Apache License to your work.

      To apply the Apache License to your work, attach the following
      boilerplate notice, with the fields enclosed by brackets "[]"
      replaced with your own identifying information. (Don't include
      the brackets!)  The text should be enclosed in the appropriate
      comment syntax for the file format. We also recommend that a
      file or class name and description of purpose be included on the
      same "printed page" as the copyright notice for easier
      identification within third-party archives.

   Copyright [yyyy] [name of copyright owner]

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.
@@ -1,12 +0,0 @@
Metadata-Version: 1.1
Name: sysinv-fpga-agent
Version: 1.0
Summary: StarlingX FPGA Agent Package
Home-page:
Author: Windriver
Author-email: info@windriver.com
License: Apache-2.0

Description: StarlingX FPGA Agent Package

Platform: UNKNOWN
@@ -1,4 +0,0 @@
SRC_DIR="."
COPY_LIST_TO_TAR="LICENSE sysinv-fpga-agent sysinv-fpga-agent.conf"
EXCLUDE_LIST_FROM_TAR="centos opensuse"
TIS_PATCH_VER=PKG_GITREVCOUNT
@@ -1,52 +0,0 @@
Summary: StarlingX FPGA Agent Package
Name: sysinv-fpga-agent
Version: 1.0
Release: %{tis_patch_ver}%{?_tis_dist}
License: Apache-2.0
Group: base
Packager: Wind River <info@windriver.com>
URL: unknown
Source0: %{name}-%{version}.tar.gz

BuildRequires: systemd-devel

%description
StarlingX FPGA Agent Package

%define local_etc_initd /etc/init.d/
%define local_etc_pmond /etc/pmon.d/

%define debug_package %{nil}

%prep
%setup

%build

%install
# compute init scripts
install -d -m 755 %{buildroot}%{local_etc_initd}
install -p -D -m 755 sysinv-fpga-agent %{buildroot}%{local_etc_initd}/sysinv-fpga-agent

install -d -m 755 %{buildroot}%{local_etc_pmond}
install -p -D -m 644 sysinv-fpga-agent.conf %{buildroot}%{local_etc_pmond}/sysinv-fpga-agent.conf
install -p -D -m 644 sysinv-fpga-agent.service %{buildroot}%{_unitdir}/sysinv-fpga-agent.service
install -p -D -m 644 sysinv-conf-watcher.service %{buildroot}%{_unitdir}/sysinv-conf-watcher.service
install -p -D -m 644 sysinv-conf-watcher.path %{buildroot}%{_unitdir}/sysinv-conf-watcher.path

%post
/usr/bin/systemctl enable sysinv-fpga-agent.service >/dev/null 2>&1
/usr/bin/systemctl enable sysinv-conf-watcher.service >/dev/null 2>&1
/usr/bin/systemctl enable sysinv-conf-watcher.path >/dev/null 2>&1

%clean
rm -rf $RPM_BUILD_ROOT

%files
%defattr(-,root,root,-)
%doc LICENSE
%{local_etc_initd}/sysinv-fpga-agent
%{local_etc_pmond}/sysinv-fpga-agent.conf
%{_unitdir}/sysinv-fpga-agent.service
%{_unitdir}/sysinv-conf-watcher.service
%{_unitdir}/sysinv-conf-watcher.path
@@ -1,5 +0,0 @@
sysinv-fpga-agent (1.0-1) unstable; urgency=medium

  * Initial release.

 -- Charles Short <charles.short@windriver.com>  Tue, 24 Aug 2021 14:37:44 -0400
@@ -1,13 +0,0 @@
Source: sysinv-fpga-agent
Section: admin
Priority: optional
Maintainer: StarlingX Developers <starlingx-discuss@lists.starlingx.io>
Build-Depends: debhelper-compat (= 13)
Standards-Version: 4.5.1
Rules-Requires-Root: no

Package: sysinv-fpga-agent
Architecture: any
Depends: ${shlibs:Depends}, ${misc:Depends}, mtce-pmon
Description: StarlingX FPGA agent monitor
 Startup Scripts for StarlingX FPGA agent monitor
@@ -1,43 +0,0 @@
Format: https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/
Upstream-Name: sysinv-fpga-agent
Source: https://opendev.org/starlingx/config

Files: *
Copyright:
 (c) 2013-2021 Wind River Systems, Inc
 (c) Others (See individual files for more details)
License: Apache-2
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
 You may obtain a copy of the License at
 .
 https://www.apache.org/licenses/LICENSE-2.0
 .
 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
 .
 On Debian-based systems the full text of the Apache version 2.0 license
 can be found in `/usr/share/common-licenses/Apache-2.0'.

# If you want to use GPL v2 or later for the /debian/* files use
# the following clauses, or change it to suit. Delete these two lines
Files: debian/*
Copyright: 2021 Wind River Systems, Inc
License: Apache-2
 Licensed under the Apache License, Version 2.0 (the "License");
 you may not use this file except in compliance with the License.
 You may obtain a copy of the License at
 .
 https://www.apache.org/licenses/LICENSE-2.0
 .
 Unless required by applicable law or agreed to in writing, software
 distributed under the License is distributed on an "AS IS" BASIS,
 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
 .
 On Debian-based systems the full text of the Apache version 2.0 license
 can be found in `/usr/share/common-licenses/Apache-2.0'.
@@ -1,21 +0,0 @@
#!/usr/bin/make -f
#export DH_VERBOSE = 1

ROOT := $(CURDIR)/debian/tmp
PMONDIR := ${ROOT}/usr/share/starlingx/pmon.d

%:
	dh $@

override_dh_install:
	install -p -D -m 755 sysinv-fpga-agent ${ROOT}/etc/init.d/sysinv-fpga-agent
	install -p -D -m 644 sysinv-fpga-agent.conf ${PMONDIR}/sysinv-fpga-agent.conf
	dh_install

override_dh_usrlocal:
# do nothing

override_dh_installsystemd:
	dh_installsystemd --name=sysinv-fpga-agent sysinv-fpga-agent.service
	dh_installsystemd --name=sysinv-conf-watcher sysinv-conf-watcher.service
	dh_installsystemd --name=sysinv-conf-watcher sysinv-conf-watcher.path
@@ -1 +0,0 @@
3.0 (quilt)
@@ -1,5 +0,0 @@
[Path]
PathChanged=/etc/sysinv/sysinv.conf

[Install]
WantedBy=multi-user.target
@@ -1,11 +0,0 @@
[Unit]
Description=StarlingX conf watcher
After=sysinv-fpga-agent.service
Before=pmon.service

[Service]
Type=oneshot
ExecStart=/usr/bin/systemctl restart sysinv-fpga-agent.service

[Install]
WantedBy=multi-user.target
@@ -1,2 +0,0 @@
etc/init.d
var/local/share/applications
@@ -1,2 +0,0 @@
etc/init.d/sysinv-fpga-agent
usr/share/starlingx/pmon.d/sysinv-fpga-agent.conf
@@ -1,15 +0,0 @@
[Unit]
Description=StarlingX FPGA Agent
After=nfscommon.service sw-patch.service
After=network-online.target systemd-udev-settle.service sysinv-agent.service
Before=pmon.service

[Service]
Type=forking
RemainAfterExit=yes
ExecStart=/etc/init.d/sysinv-fpga-agent start
ExecStop=/etc/init.d/sysinv-fpga-agent stop
PIDFile=/var/run/sysinv-fpga-agent.pid

[Install]
WantedBy=multi-user.target
@@ -1,14 +0,0 @@
---
debname: sysinv-fpga-agent
debver: 1.0
src_path: null
src_files:
  - sysinv-conf-watcher.service
  - sysinv-fpga-agent
  - sysinv-conf-watcher.path
  - sysinv-fpga-agent.conf
  - sysinv-fpga-agent.service
  - LICENSE
revision:
  dist: $STX_DIST
  PKG_GITREVCOUNT: true
@@ -1,4 +0,0 @@
-------------------------------------------------------------------
Mon May 25 13:47:02 CST 2020 - chris.friesen@windriver.com

- 1.0 Initial Commit
@@ -1 +0,0 @@
setBadness('script-without-shebang', 2)
@@ -1,64 +0,0 @@
Summary: StarlingX FPGA Agent Package
Name: sysinv-fpga-agent
Version: 1.0.0
Release: %{tis_patch_ver}%{?_tis_dist}
License: Apache-2.0
Group: Development/Tools/Other
URL: https://opendev.org/starlingx/config
Source0: %{name}-%{version}.tar.gz

BuildRequires: systemd-devel

Requires: python-django
Requires: python-oslo.messaging
Requires: python-retrying

BuildArch: noarch

%description
StarlingX FPGA Agent Package

%define local_etc_initd /etc/init.d/
%define local_etc_pmond /etc/pmon.d/

%define debug_package %{nil}

%prep
%setup

%build

%install
# compute init scripts
install -d -m 755 %{buildroot}%{local_etc_initd}
install -p -D -m 755 sysinv-fpga-agent %{buildroot}%{local_etc_initd}/sysinv-fpga-agent

install -d -m 755 %{buildroot}%{local_etc_pmond}
install -p -D -m 644 sysinv-fpga-agent.conf %{buildroot}%{local_etc_pmond}/sysinv-fpga-agent.conf
install -p -D -m 644 sysinv-fpga-agent.service %{buildroot}%{_unitdir}/sysinv-fpga-agent.service

%clean
rm -rf $RPM_BUILD_ROOT

%pre
%service_add_pre sysinv-fpga-agent.service sysinv-fpga-agent.target

%post
%service_add_post sysinv-fpga-agent.service sysinv-fpga-agent.target

%preun
%service_del_preun sysinv-fpga-agent.service sysinv-fpga-agent.target

%postun
%service_del_postun sysinv-fpga-agent.service sysinv-fpga-agent.target


%files
%defattr(-,root,root,-)
%doc LICENSE
%dir %{local_etc_pmond}
%{local_etc_initd}/sysinv-fpga-agent
%config %{local_etc_pmond}/sysinv-fpga-agent.conf
%{_unitdir}/sysinv-fpga-agent.service

%changelog
@@ -1,5 +0,0 @@
[Path]
PathChanged=/etc/sysinv/sysinv.conf

[Install]
WantedBy=multi-user.target
@@ -1,11 +0,0 @@
[Unit]
Description=StarlingX conf watcher
After=sysinv-fpga-agent.service
Before=pmon.service

[Service]
Type=oneshot
ExecStart=/usr/bin/systemctl restart sysinv-fpga-agent.service

[Install]
WantedBy=multi-user.target
@@ -1,120 +0,0 @@
#! /bin/sh
#
# Copyright (c) 2020 Wind River Systems, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#

#
# chkconfig: 2345 76 25
#
### BEGIN INIT INFO
# Provides: sysinv-fpga-agent
# Default-Start: 3 5
# Required-Start:
# Required-Stop:
# Default-Stop: 0 1 2 6
# Short-Description: Daemon to handle FPGA device updates
### END INIT INFO

. /etc/init.d/functions
. /etc/build.info


DAEMON_NAME="sysinv-fpga-agent"
SYSINVFPGAAGENT="/usr/bin/${DAEMON_NAME}"
SYSINV_CONF_DIR="/etc/sysinv"
SYSINV_CONF_FILE="${SYSINV_CONF_DIR}/sysinv.conf"
DELAY_SEC=20

daemon_pidfile="/var/run/${DAEMON_NAME}.pid"

if [ ! -e "${SYSINVFPGAAGENT}" ] ; then
    logger "$0: ${SYSINVFPGAAGENT} is missing"
    exit 1
fi

RETVAL=0

PATH=/sbin:/usr/sbin:/bin:/usr/bin:/usr/local/bin
export PATH

case "$1" in
    start)
        # Check for installation failure
        if [ -f /etc/platform/installation_failed ] ; then
            logger "$0: /etc/platform/installation_failed flag is set. Aborting."
            exit 1
        fi

        if [ -e ${daemon_pidfile} ] ; then
            echo "Killing existing process before starting new"
            pid=`cat ${daemon_pidfile}`
            kill -TERM $pid
            rm -f ${daemon_pidfile}
        fi

        # Assume that sysinv-agent will ensure that the sysinv.conf file is available.
        echo -n "Waiting for sysinv config file"
        while [ ! -e ${SYSINV_CONF_FILE} ]
        do
            sleep 1
        done

        echo -n "Starting sysinv-fpga-agent: "
        /bin/sh -c "${SYSINVFPGAAGENT}"' >> /dev/null 2>&1 & echo $!' > ${daemon_pidfile}
        RETVAL=$?
        if [ $RETVAL -eq 0 ] ; then
            echo "OK"
            touch /var/lock/subsys/${DAEMON_NAME}
        else
            echo "FAIL"
        fi
        ;;

    stop)
        echo -n "Stopping sysinv-fpga-agent: "
        if [ -e ${daemon_pidfile} ] ; then
            pid=`cat ${daemon_pidfile}`
            kill -TERM $pid
            rm -f ${daemon_pidfile}
            rm -f /var/lock/subsys/${DAEMON_NAME}
            echo "OK"
        else
            echo "FAIL"
        fi
        ;;

    restart)
        $0 stop
        sleep 1
        $0 start
        ;;

    status)
        if [ -e ${daemon_pidfile} ] ; then
            pid=`cat ${daemon_pidfile}`
            ps -p $pid | grep -v "PID TTY" >> /dev/null 2>&1
            if [ $? -eq 0 ] ; then
                echo "sysinv-fpga-agent is running"
                RETVAL=0
            else
                echo "sysinv-fpga-agent is not running"
                RETVAL=1
            fi
        else
            echo "sysinv-fpga-agent is not running ; no pidfile"
            RETVAL=1
        fi
        ;;

    condrestart)
        [ -f /var/lock/subsys/$DAEMON_NAME ] && $0 restart
        ;;

    *)
        echo "usage: $0 { start | stop | status | restart | condrestart | status }"
        ;;
esac

exit $RETVAL
@@ -1,9 +0,0 @@
[process]
process = sysinv-fpga-agent
pidfile = /var/run/sysinv-fpga-agent.pid
service = sysinv-fpga-agent
style = lsb ; ocf or lsb
severity = major ; minor, major, critical
restarts = 3 ; restarts before error assertion
interval = 5 ; number of seconds to wait between restarts
debounce = 20 ; number of seconds to wait before degrade clear
@@ -1,15 +0,0 @@
[Unit]
Description=StarlingX FPGA Agent
After=nfscommon.service sw-patch.service
After=network-online.target systemd-udev-settle.service sysinv-agent.service
Before=pmon.service

[Service]
Type=forking
RemainAfterExit=yes
ExecStart=/etc/init.d/sysinv-fpga-agent start
ExecStop=/etc/init.d/sysinv-fpga-agent stop
PIDFile=/var/run/sysinv-fpga-agent.pid

[Install]
WantedBy=multi-user.target
@@ -112,7 +112,6 @@ install -d -m 755 %{buildroot}%{stx_app_plugind}

#install -p -D -m 755 %{buildroot}/usr/bin/sysinv-api %{buildroot}/usr/bin/sysinv-api
#install -p -D -m 755 %{buildroot}/usr/bin/sysinv-agent %{buildroot}/usr/bin/sysinv-agent
#install -p -D -m 755 %{buildroot}/usr/bin/sysinv-fpga-agent %{buildroot}/usr/bin/sysinv-fpga-agent
#install -p -D -m 755 %{buildroot}/usr/bin/sysinv-conductor %{buildroot}/usr/bin/sysinv-conductor

install -d -m 755 %{buildroot}%{local_bindir}
@@ -153,7 +152,6 @@ rm -rf $RPM_BUILD_ROOT
%{_unitdir}/sysinv-conductor.service

%{_bindir}/sysinv-agent
%{_bindir}/sysinv-fpga-agent
%{_bindir}/sysinv-api
%{_bindir}/sysinv-conductor
%{_bindir}/sysinv-dbsync
@@ -21,7 +21,6 @@ usr/bin/sysinv-api
usr/bin/sysinv-conductor
usr/bin/sysinv-dbsync
usr/bin/sysinv-dnsmasq-lease-update
usr/bin/sysinv-fpga-agent
usr/bin/sysinv-helm
usr/bin/sysinv-puppet
usr/bin/sysinv-reset-n3000-fpgas
@@ -110,7 +110,6 @@ install -m 644 -p -D scripts/sysinv-conductor.service %{buildroot}%{_unitdir}/sy

#install -p -D -m 755 %%{buildroot}/usr/bin/sysinv-api %%{buildroot}/usr/bin/sysinv-api
#install -p -D -m 755 %%{buildroot}/usr/bin/sysinv-agent %%{buildroot}/usr/bin/sysinv-agent
#install -p -D -m 755 %%{buildroot}/usr/bin/sysinv-fpga-agent %%{buildroot}/usr/bin/sysinv-fpga-agent
#install -p -D -m 755 %%{buildroot}/usr/bin/sysinv-conductor %%{buildroot}/usr/bin/sysinv-conductor

install -d -m 755 %{buildroot}%{local_bindir}
@@ -166,7 +165,6 @@ rm -rf $RPM_BUILD_ROOT
%{_unitdir}/sysinv-conductor.service

%{_bindir}/sysinv-agent
%{_bindir}/sysinv-fpga-agent
%{_bindir}/sysinv-api
%{_bindir}/sysinv-conductor
%{_bindir}/sysinv-dbsync
@@ -29,7 +29,6 @@ packages =
console_scripts =
    sysinv-api = sysinv.cmd.api:main
    sysinv-agent = sysinv.cmd.agent:main
    sysinv-fpga-agent = sysinv.cmd.fpga_agent:main
    sysinv-dbsync = sysinv.cmd.dbsync:main
    sysinv-conductor = sysinv.cmd.conductor:main
    sysinv-rootwrap = oslo_rootwrap.cmd:main
514
sysinv/sysinv/sysinv/sysinv/agent/fpga.py
Normal file
@@ -0,0 +1,514 @@
# vim: tabstop=4 shiftwidth=4 softtabstop=4
# coding=utf-8

# Copyright 2013 Hewlett-Packard Development Company, L.P.
# Copyright 2013 International Business Machines Corporation
# All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
#
# Copyright (c) 2020-2022 Wind River Systems, Inc.
#


""" Perform activity related to FPGA devices on a single host.

On start, collect and post FPGA inventory to conductor.

Commands (from conductors) are received via RPC calls.

"""

from __future__ import print_function
import errno
from eventlet.green import subprocess
from glob import glob
import six

import os
import shlex

from oslo_log import log
from six.moves.urllib.request import urlretrieve

from sysinv.agent import pci
from sysinv.common import fpga_constants
from sysinv.common import constants as cconstants
from sysinv.common import device as dconstants
from sysinv.common import exception
from sysinv.common import utils
from sysinv.conductor import rpcapi as conductor_rpcapi

import tsconfig.tsconfig as tsc

LOG = log.getLogger(__name__)

# This is the location where we cache the device image file while
# writing it to the hardware.
DEVICE_IMAGE_CACHE_DIR = "/usr/local/share/applications/sysinv"

SYSFS_DEVICE_PATH = "/sys/bus/pci/devices/"
FME_PATH = "/fpga/intel-fpga-dev.*/intel-fpga-fme.*/"
SPI_PATH = "spi-altera.*.auto/spi_master/spi*/spi*.*/"

# These are relative to FME_PATH
BITSTREAM_ID_PATH = "bitstream_id"

# These are relative to SPI_PATH
ROOT_HASH_PATH = "ifpga_sec_mgr/ifpga_sec*/security/sr_root_hash"
CANCELLED_CSKS_PATH = "ifpga_sec_mgr/ifpga_sec*/security/sr_canceled_csks"
IMAGE_LOAD_PATH = "fpga_flash_ctrl/fpga_image_load"
|
||||
BMC_FW_VER_PATH = "bmcfw_flash_ctrl/bmcfw_version"
|
||||
BMC_BUILD_VER_PATH = "max10_version"
|
||||
RETIMER_A_VER_PATH = "pkvl/pkvl_a_version"
|
||||
RETIMER_B_VER_PATH = "pkvl/pkvl_b_version"
|
||||
|
||||
# Length of the retimer version in database
|
||||
RETIMER_VERSION_LENGTH = 32
|
||||
|
||||
|
||||
class FpgaOperator(object):
|
||||
'''Class to encapsulate FPGA operations for System Inventory'''
|
||||
|
||||
def __init__(self):
|
||||
pass
|
||||
|
||||
def ensure_device_image_cache_exists(self):
|
||||
# Make sure the image cache directory exists, create it if needed.
|
||||
try:
|
||||
os.mkdir(DEVICE_IMAGE_CACHE_DIR, 0o755)
|
||||
except OSError as exc:
|
||||
if exc.errno != errno.EEXIST:
|
||||
msg = ("Unable to create device image cache directory %s!"
|
||||
% DEVICE_IMAGE_CACHE_DIR)
|
||||
LOG.exception(msg)
|
||||
raise exception.SysinvException(msg)
|
||||
|
||||
def get_http_port(self):
|
||||
# Get the http_port from /etc/platform/platform.conf.
|
||||
prefix = "http_port="
|
||||
http_port = cconstants.SERVICE_PARAM_HTTP_PORT_HTTP_DEFAULT
|
||||
if os.path.isfile(tsc.PLATFORM_CONF_FILE):
|
||||
with open(tsc.PLATFORM_CONF_FILE, 'r') as platform_file:
|
||||
for line in platform_file:
|
||||
line = line.strip()
|
||||
if line.startswith(prefix):
|
||||
port = line[len(prefix):]
|
||||
if utils.is_int_like(port):
|
||||
LOG.info("Agent found %s%s" % (prefix, port))
|
||||
http_port = port
|
||||
break
|
||||
else:
|
||||
LOG.info("http_port entry: %s in platform.conf "
|
||||
"is not an integer" % port)
|
||||
return http_port
|
||||
|
||||
def fetch_device_image(self, filename):
|
||||
# Pull the image from the controller.
|
||||
http_port = self.get_http_port()
|
||||
url = "http://controller:{}/device_images/{}".format(http_port, filename)
|
||||
local_path = DEVICE_IMAGE_CACHE_DIR + "/" + filename
|
||||
try:
|
||||
imagefile, headers = urlretrieve(url, local_path)
|
||||
except IOError:
|
||||
msg = ("Unable to retrieve device image from %s!" % url)
|
||||
LOG.exception(msg)
|
||||
raise exception.SysinvException(msg)
|
||||
return local_path
|
||||
|
||||
def cleanup_container(self):
|
||||
# Delete the container if it exists.
|
||||
cmd = 'ctr -n=k8s.io container list image=="%s"' % fpga_constants.OPAE_IMG
|
||||
items = subprocess.check_output(shlex.split(cmd), # pylint: disable=not-callable
|
||||
stderr=subprocess.STDOUT,
|
||||
universal_newlines=True)
|
||||
for line in items.splitlines():
|
||||
if fpga_constants.OPAE_IMG in line:
|
||||
cmd = 'ctr -n=k8s.io container rm n3000-opae'
|
||||
subprocess.check_output(shlex.split(cmd), # pylint: disable=not-callable
|
||||
stderr=subprocess.STDOUT,
|
||||
universal_newlines=True)
|
||||
LOG.info('Deleted stale container n3000-opae')
|
||||
break
|
||||
|
||||
def set_cgroup_cpuset(self):
|
||||
# Set CPU affinity by updating the cpuset.cpus
|
||||
platform_cpulist = '0'
|
||||
cpuset_path = '/sys/fs/cgroup/cpuset/platform/'
|
||||
cpuset_file = os.path.join(cpuset_path, 'cpuset.cpus')
|
||||
if not os.path.exists(cpuset_path):
|
||||
os.makedirs(cpuset_path)
|
||||
with open('/etc/platform/worker_reserved.conf', 'r') as infile:
|
||||
for line in infile:
|
||||
if "PLATFORM_CPU_LIST" in line:
|
||||
val = line.split("=")
|
||||
platform_cpulist = val[1].strip('\n')[1:-1].strip('"')
|
||||
with open(cpuset_file, 'w') as fd:
|
||||
LOG.info("Writing %s to file %s" % (platform_cpulist, cpuset_file))
|
||||
fd.write(platform_cpulist)
|
||||
|
||||
def write_device_image_n3000(self, filename, pci_addr):
|
||||
# Write the firmware image to the FPGA at the specified PCI address.
|
||||
# We're assuming that the image update tools will catch the scenario
|
||||
# where the image is not compatible with the device.
|
||||
|
||||
# If the container exists, the host probably rebooted during
|
||||
# a device update. Delete the container.
|
||||
self.cleanup_container()
|
||||
|
||||
# Set cpu affinity for the container
|
||||
self.set_cgroup_cpuset()
|
||||
|
||||
try:
|
||||
# Build up the command to perform the firmware update.
|
||||
# Note the hack to work around OPAE tool locale issues
|
||||
cmd = ("ctr -n=k8s.io run --rm --privileged " +
|
||||
"--env LC_ALL=en_US.UTF-8 --env LANG=en_US.UTF-8 " +
|
||||
"--cgroup platform " +
|
||||
"--mount type=bind,src=" + DEVICE_IMAGE_CACHE_DIR +
|
||||
",dst=/mnt/images,options=rbind:ro " + fpga_constants.OPAE_IMG +
|
||||
" n3000-opae fpgasupdate -y --log-level debug /mnt/images/" +
|
||||
filename + " " + pci_addr)
|
||||
|
||||
# Issue the command to perform the firmware update.
|
||||
subprocess.check_output(shlex.split(cmd), # pylint: disable=not-callable
|
||||
stderr=subprocess.STDOUT)
|
||||
# TODO: switch to subprocess.Popen, parse the output and send
|
||||
# progress updates.
|
||||
except subprocess.CalledProcessError as exc:
|
||||
# Check the return code, send completion info to sysinv-conductor.
|
||||
msg = ("Failed to update device image %s for device %s, "
|
||||
"return code is %d, command output: %s." %
|
||||
(filename, pci_addr, exc.returncode,
|
||||
exc.output.decode('utf-8')))
|
||||
LOG.error(msg)
|
||||
LOG.error("Check for intel-max10 kernel logs.")
|
||||
raise exception.SysinvException(msg)
|
||||
|
||||
def read_n3000_sysfs_file(self, pattern):
|
||||
# Read a sysfs file related to the N3000.
|
||||
# The result should be an empty string if the file doesn't exist,
|
||||
# or a single line of text if it does.
|
||||
|
||||
# Convert the pattern to a list of matching filenames
|
||||
filenames = glob(pattern)
|
||||
|
||||
# If there are no matching files, return an empty string.
|
||||
if len(filenames) == 0:
|
||||
return ""
|
||||
|
||||
# If there's more than one filename, complain.
|
||||
if len(filenames) > 1:
|
||||
LOG.warn("Pattern %s gave %s matching filenames, using the first." %
|
||||
(pattern, len(filenames)))
|
||||
|
||||
filename = filenames[0]
|
||||
infile = open(filename)
|
||||
try:
|
||||
line = infile.readline()
|
||||
return line.strip()
|
||||
except Exception:
|
||||
LOG.exception("Unable to read file %s" % filename)
|
||||
finally:
|
||||
infile.close()
|
||||
return ""
|
||||
|
||||
def get_n3000_root_hash(self, pci_addr):
|
||||
# Query sysfs for the root key of the N3000 at the specified PCI address
|
||||
root_key_pattern = (SYSFS_DEVICE_PATH + pci_addr + FME_PATH +
|
||||
SPI_PATH + ROOT_HASH_PATH)
|
||||
root_key = self.read_n3000_sysfs_file(root_key_pattern)
|
||||
# If the root key hasn't been programmed, return an empty string.
|
||||
if root_key == "hash not programmed":
|
||||
root_key = ""
|
||||
return root_key
|
||||
|
||||
def get_n3000_revoked_keys(self, pci_addr):
|
||||
# Query sysfs for revoked keys of the N3000 at the specified PCI address
|
||||
revoked_key_pattern = (SYSFS_DEVICE_PATH + pci_addr + FME_PATH +
|
||||
SPI_PATH + CANCELLED_CSKS_PATH)
|
||||
revoked_keys = self.read_n3000_sysfs_file(revoked_key_pattern)
|
||||
return revoked_keys
|
||||
|
||||
def get_n3000_bitstream_id(self, pci_addr):
|
||||
# Query sysfs for bitstream ID of the N3000 at the specified PCI address
|
||||
bitstream_id_pattern = (SYSFS_DEVICE_PATH + pci_addr + FME_PATH +
|
||||
BITSTREAM_ID_PATH)
|
||||
bitstream_id = self.read_n3000_sysfs_file(bitstream_id_pattern)
|
||||
return bitstream_id
|
||||
|
||||
def get_n3000_boot_page(self, pci_addr):
|
||||
# Query sysfs for boot page of the N3000 at the specified PCI address
|
||||
image_load_pattern = (SYSFS_DEVICE_PATH + pci_addr + FME_PATH +
|
||||
SPI_PATH + IMAGE_LOAD_PATH)
|
||||
image_load = self.read_n3000_sysfs_file(image_load_pattern)
|
||||
if image_load == "0":
|
||||
return "factory"
|
||||
elif image_load == "1":
|
||||
return "user"
|
||||
else:
|
||||
LOG.warn("Reading image load gave unexpected result: %s" % image_load)
|
||||
return ""
|
||||
|
||||
def get_n3000_bmc_version(self, pci_addr, path):
|
||||
version_pattern = (SYSFS_DEVICE_PATH + pci_addr + FME_PATH +
|
||||
SPI_PATH + path)
|
||||
version = self.read_n3000_sysfs_file(version_pattern)
|
||||
|
||||
# If we couldn't read the file, return an empty string.
|
||||
if version == "":
|
||||
return ""
|
||||
|
||||
# We're expecting a 32-bit value, possibly with "0x" in front.
|
||||
try:
|
||||
vint = int(version, 16)
|
||||
except ValueError:
|
||||
return ""
|
||||
|
||||
if vint >= 1 << 32:
|
||||
LOG.warn("String (%s) read from file %s doesn't match the "
|
||||
"expected pattern" % (version, version_pattern))
|
||||
return ""
|
||||
# There's probably a better way than this.
|
||||
# We want to match the version that Intel's "fpgainfo" tool reports.
|
||||
return ("%s.%s.%s.%s" % (chr(vint >> 24), str(vint >> 16 & 0xff),
|
||||
str(vint >> 8 & 0xff), str(vint & 0xff)))
|
||||
|
||||
def get_n3000_bmc_fw_version(self, pci_addr):
|
||||
return self.get_n3000_bmc_version(pci_addr, BMC_FW_VER_PATH)
|
||||
|
||||
def get_n3000_bmc_build_version(self, pci_addr):
|
||||
return self.get_n3000_bmc_version(pci_addr, BMC_BUILD_VER_PATH)
|
||||
|
||||
def get_n3000_retimer_version(self, pci_addr, path):
|
||||
version_pattern = (SYSFS_DEVICE_PATH + pci_addr + FME_PATH +
|
||||
SPI_PATH + path)
|
||||
version = self.read_n3000_sysfs_file(version_pattern)
|
||||
if len(version) > RETIMER_VERSION_LENGTH:
|
||||
LOG.warn("Retimer version string (%s) read from file %s is "
|
||||
"unexpectedly long. Truncating it." %
|
||||
(version, version_pattern))
|
||||
version = version[:RETIMER_VERSION_LENGTH]
|
||||
return version
|
||||
|
||||
def get_n3000_retimer_a_version(self, pci_addr):
|
||||
return self.get_n3000_retimer_version(pci_addr, RETIMER_A_VER_PATH)
|
||||
|
||||
def get_n3000_retimer_b_version(self, pci_addr):
|
||||
return self.get_n3000_retimer_version(pci_addr, RETIMER_B_VER_PATH)
|
||||
|
||||
def get_n3000_devices(self):
|
||||
# First get the PCI addresses of each supported FPGA device
|
||||
cmd = ["lspci", "-Dm", "-d " + fpga_constants.N3000_VENDOR + ":" +
|
||||
fpga_constants.N3000_DEVICE]
|
||||
|
||||
try:
|
||||
output = subprocess.check_output( # pylint: disable=not-callable
|
||||
cmd, stderr=subprocess.STDOUT, universal_newlines=True)
|
||||
except subprocess.CalledProcessError as exc:
|
||||
msg = ("Failed to get pci devices with vendor %s and device %s, "
|
||||
"return code is %d, command output: %s." %
|
||||
(fpga_constants.N3000_VENDOR, fpga_constants.N3000_DEVICE, exc.returncode,
|
||||
exc.output))
|
||||
LOG.warn(msg)
|
||||
raise exception.SysinvException(msg)
|
||||
|
||||
# Parse the output of the lspci command and grab the PCI address
|
||||
fpga_addrs = []
|
||||
for line in output.splitlines():
|
||||
line = shlex.split(line.strip())
|
||||
fpga_addrs.append(line[0])
|
||||
return fpga_addrs
|
||||
|
||||
def get_n3000_pci_info(self):
|
||||
""" Query PCI information about N3000 PCI devices.
|
||||
|
||||
This needs to exactly mirror what sysinv-agent does as far as PCI
|
||||
updates. We could potentially modify sysinv-agent to do the PCI
|
||||
updates when triggered by an RPC cast, but we don't need to rescan
|
||||
all PCI devices, just the N3000 devices.
|
||||
"""
|
||||
pci_devs = []
|
||||
pci_device_list = []
|
||||
try:
|
||||
pci_operator = pci.PCIOperator()
|
||||
# We want to get updated info for the FPGA itself and any "virtual"
|
||||
# PCI devices implemented by the FPGA. This loop isn't very
|
||||
# efficient, but so far it's only a small number of devices.
|
||||
pci_devices = []
|
||||
for device in fpga_constants.N3000_DEVICES:
|
||||
pci_devices.extend(pci_operator.pci_devices_get(
|
||||
vendor=fpga_constants.N3000_VENDOR, device=device))
|
||||
for pci_dev in pci_devices:
|
||||
pci_dev_array = pci_operator.pci_get_device_attrs(
|
||||
pci_dev.pciaddr)
|
||||
for dev in pci_dev_array:
|
||||
pci_devs.append(pci.PCIDevice(pci_dev, **dev))
|
||||
|
||||
is_fpga_n3000_reset = \
|
||||
os.path.exists(fpga_constants.N3000_RESET_FLAG)
|
||||
|
||||
for dev in pci_devs:
|
||||
pci_dev_dict = {'name': dev.name,
|
||||
'pciaddr': dev.pci.pciaddr,
|
||||
'pclass_id': dev.pclass_id,
|
||||
'pvendor_id': dev.pvendor_id,
|
||||
'pdevice_id': dev.pdevice_id,
|
||||
'pclass': dev.pci.pclass,
|
||||
'pvendor': dev.pci.pvendor,
|
||||
'pdevice': dev.pci.pdevice,
|
||||
'prevision': dev.pci.prevision,
|
||||
'psvendor': dev.pci.psvendor,
|
||||
'psdevice': dev.pci.psdevice,
|
||||
'numa_node': dev.numa_node,
|
||||
'sriov_totalvfs': dev.sriov_totalvfs,
|
||||
'sriov_numvfs': dev.sriov_numvfs,
|
||||
'sriov_vfs_pci_address': dev.sriov_vfs_pci_address,
|
||||
'sriov_vf_driver': dev.sriov_vf_driver,
|
||||
'sriov_vf_pdevice_id': dev.sriov_vf_pdevice_id,
|
||||
'driver': dev.driver,
|
||||
'enabled': dev.enabled,
|
||||
'extra_info': dev.extra_info,
|
||||
'fpga_n3000_reset': is_fpga_n3000_reset}
|
||||
LOG.debug('Sysinv FPGA Agent dev {}'.format(pci_dev_dict))
|
||||
pci_device_list.append(pci_dev_dict)
|
||||
except Exception:
|
||||
LOG.exception("Unable to query FPGA pci information, "
|
||||
"sysinv DB will be stale")
|
||||
|
||||
return pci_device_list
|
||||
|
||||
def watchdog_action(self, action):
|
||||
if action not in ["stop", "start"]:
|
||||
LOG.warn("watchdog_action called with invalid action: %s", action)
|
||||
return
|
||||
try:
|
||||
# Build up the command to perform the action.
|
||||
cmd = ["systemctl", action, "hostw"]
|
||||
|
||||
# Issue the command to stop/start the watchdog
|
||||
subprocess.check_output( # pylint: disable=not-callable
|
||||
cmd, stderr=subprocess.STDOUT,
|
||||
universal_newlines=True)
|
||||
except subprocess.CalledProcessError as exc:
|
||||
msg = ("Failed to %s hostw service, "
|
||||
"return code is %d, command output: %s." %
|
||||
(action, exc.returncode, exc.output))
|
||||
LOG.warn(msg)
|
||||
|
||||
def stop_watchdog(self):
|
||||
self.watchdog_action("stop")
|
||||
|
||||
def start_watchdog(self):
|
||||
self.watchdog_action("start")
|
||||
|
||||
def get_fpga_info(self):
|
||||
# For now we only support the N3000; eventually we may need to support
|
||||
# other FPGA devices.
|
||||
|
||||
# Get a list of N3000 FPGA device addresses.
|
||||
fpga_addrs = self.get_n3000_devices()
|
||||
|
||||
# Next, get additional information for devices in the list.
|
||||
fpgainfo_list = []
|
||||
for addr in fpga_addrs:
|
||||
# Store information for this FPGA
|
||||
fpgainfo = {'pciaddr': addr}
|
||||
fpgainfo['bmc_build_version'] = self.get_n3000_bmc_build_version(addr)
|
||||
fpgainfo['bmc_fw_version'] = self.get_n3000_bmc_fw_version(addr)
|
||||
fpgainfo['retimer_a_version'] = self.get_n3000_retimer_a_version(addr)
|
||||
fpgainfo['retimer_b_version'] = self.get_n3000_retimer_b_version(addr)
|
||||
fpgainfo['boot_page'] = self.get_n3000_boot_page(addr)
|
||||
fpgainfo['bitstream_id'] = self.get_n3000_bitstream_id(addr)
|
||||
fpgainfo['root_key'] = self.get_n3000_root_hash(addr)
|
||||
fpgainfo['revoked_key_ids'] = self.get_n3000_revoked_keys(addr)
|
||||
|
||||
# TODO: Also retrieve the information about which NICs are on
|
||||
# the FPGA device.
|
||||
|
||||
fpgainfo_list.append(fpgainfo)
|
||||
|
||||
return fpgainfo_list
|
||||
|
||||
def device_update_image(self, context, host_uuid, pci_addr, filename, transaction_id,
|
||||
retimer_included):
|
||||
"""Write the device image to the device at the specified address.
|
||||
|
||||
transaction_id is the transaction ID as specified by sysinv-conductor.
|
||||
|
||||
This must send back either success or failure to sysinv-conductor
|
||||
via an RPC cast. The transaction ID is sent back to allow sysinv-conductor
|
||||
to locate the transaction in the DB.
|
||||
|
||||
TODO: could get fancier with an image cache and delete based on LRU.
|
||||
"""
|
||||
|
||||
rpcapi = conductor_rpcapi.ConductorAPI(
|
||||
topic=conductor_rpcapi.MANAGER_TOPIC)
|
||||
|
||||
try:
|
||||
LOG.info("ensure device image cache exists")
|
||||
self.ensure_device_image_cache_exists()
|
||||
|
||||
# Pull the image from the controller via HTTP
|
||||
LOG.info("fetch device image %s" % filename)
|
||||
local_path = self.fetch_device_image(filename)
|
||||
|
||||
# TODO: check CSK used to sign image, ensure it hasn't been cancelled
|
||||
# TODO: check root key used to sign image, ensure it matches root key of hardware
|
||||
# Note: may want to check these in the sysinv API too.
|
||||
|
||||
try:
|
||||
LOG.info("setting transaction id %s as in progress" % transaction_id)
|
||||
rpcapi.device_update_image_status(
|
||||
context, host_uuid, transaction_id,
|
||||
dconstants.DEVICE_IMAGE_UPDATE_IN_PROGRESS)
|
||||
|
||||
# Disable the watchdog service to prevent a reboot on things
|
||||
# like critical process death. We don't want to reboot while
|
||||
# flashing the FPGA.
|
||||
self.stop_watchdog()
|
||||
|
||||
# Write the image to the specified PCI device.
|
||||
# TODO: when we support more than just N3000, we'll need to
|
||||
# pick the appropriate low-level write function based on the
|
||||
# hardware type.
|
||||
LOG.info("writing device image %s to device %s" % (filename, pci_addr))
|
||||
self.write_device_image_n3000(filename, pci_addr)
|
||||
|
||||
# If we get an exception trying to send the status update
|
||||
# there's not much we can do.
|
||||
try:
|
||||
LOG.info("setting transaction id %s as complete" % transaction_id)
|
||||
rpcapi.device_update_image_status(
|
||||
context, host_uuid, transaction_id,
|
||||
dconstants.DEVICE_IMAGE_UPDATE_COMPLETED)
|
||||
except Exception:
|
||||
LOG.exception("Unable to send fpga update image status "
|
||||
"completion message for transaction %s."
|
||||
% transaction_id)
|
||||
finally:
|
||||
# Delete the image file.
|
||||
os.remove(local_path)
|
||||
# start the watchdog service again
|
||||
self.start_watchdog()
|
||||
# If device image contains c827 retimer firmware, set the retimer flag
|
||||
if retimer_included:
|
||||
utils.touch(fpga_constants.N3000_RETIMER_FLAG)
|
||||
|
||||
except exception.SysinvException as exc:
|
||||
LOG.info("setting transaction id %s as failed" % transaction_id)
|
||||
rpcapi.device_update_image_status(context, host_uuid,
|
||||
transaction_id,
|
||||
dconstants.DEVICE_IMAGE_UPDATE_FAILED,
|
||||
six.text_type(exc))
|
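The version decoding in `get_n3000_bmc_version()` above can be illustrated in isolation. This is a minimal sketch rather than the agent's code: `decode_bmc_version` is a hypothetical standalone helper, the sample value `0x44020013` is made up for illustration, and the negative-value guard is an addition that the method in the diff does not have.

```python
def decode_bmc_version(raw):
    """Decode a 32-bit hex string into 'X.a.b.c' (hypothetical helper).

    The top byte is rendered as a character and the remaining three
    bytes as decimal numbers, mirroring the format string in the diff.
    """
    try:
        # int() with base 16 accepts an optional "0x" prefix.
        vint = int(raw, 16)
    except ValueError:
        return ""

    # Reject values that do not fit in 32 bits. The check for negative
    # values is an assumption added here, not part of the original method.
    if vint < 0 or vint >= 1 << 32:
        return ""

    return "%s.%d.%d.%d" % (chr(vint >> 24),
                            (vint >> 16) & 0xff,
                            (vint >> 8) & 0xff,
                            vint & 0xff)


if __name__ == "__main__":
    print(decode_bmc_version("0x44020013"))
```

With the sample input the top byte `0x44` is the character `D`, so the helper yields a string of the form `D.2.0.19`, which is the style of version that the comment in the diff says Intel's `fpgainfo` tool reports.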
@ -56,12 +56,13 @@ from sysinv.agent import pv
|
||||
from sysinv.agent import lvg
|
||||
from sysinv.agent import pci
|
||||
from sysinv.agent import node
|
||||
from sysinv.agent import fpga
|
||||
from sysinv.agent.lldp import plugin as lldp_plugin
|
||||
from sysinv.common import fpga_constants
|
||||
from sysinv.common import constants
|
||||
from sysinv.common import exception
|
||||
from sysinv.common import service
|
||||
from sysinv.common import utils
|
||||
from sysinv.fpga_agent import constants as fpga_constants
|
||||
from sysinv.objects import base as objects_base
|
||||
from sysinv.puppet import common as puppet
|
||||
from sysinv.conductor import rpcapi as conductor_rpcapi
|
||||
@ -159,9 +160,11 @@ class AgentManager(service.PeriodicService):
|
||||
super(AgentManager, self).__init__(host, topic, serializer=serializer)
|
||||
|
||||
self._report_to_conductor_iplatform_avail_flag = False
|
||||
self._report_to_conductor_fpga_info = True
|
||||
self._ipci_operator = pci.PCIOperator()
|
||||
self._inode_operator = node.NodeOperator()
|
||||
self._idisk_operator = disk.DiskOperator()
|
||||
self._ifpga_operator = fpga.FpgaOperator()
|
||||
self._ipv_operator = pv.PVOperator()
|
||||
self._ipartition_operator = partition.PartitionOperator()
|
||||
self._ilvg_operator = lvg.LVGOperator()
|
||||
@ -1367,6 +1370,50 @@ class AgentManager(service.PeriodicService):
|
||||
|
||||
self._create_host_filesystems(rpcapi, icontext)
|
||||
|
||||
# Collect FPGA PCI data for this host.
|
||||
# We know that the PCI address of the N3000 can change the first time
|
||||
# we reset it after boot, so we need to gather the new PCI device
|
||||
# information and send it to sysinv-conductor.
|
||||
# This needs to exactly mirror what sysinv-agent does as far as PCI
|
||||
# updates. We could potentially modify sysinv-agent to do the PCI
|
||||
# updates when triggered by an RPC cast, but we don't need to rescan
|
||||
# all PCI devices, just the N3000 devices.
|
||||
if os.path.exists(fpga_constants.N3000_RESET_FLAG) and \
|
||||
self._report_to_conductor_fpga_info:
|
||||
LOG.info("Found n3000 reset flag, continuing.")
|
||||
LOG.info("Updating N3000 PCI info.")
|
||||
pci_device_list = self._ifpga_operator.get_n3000_pci_info()
|
||||
try:
|
||||
if pci_device_list:
|
||||
LOG.info("reporting N3000 PCI devices for host %s: %s" %
|
||||
(self._ihost_uuid, pci_device_list))
|
||||
|
||||
# Don't ask conductor to cleanup stale entries while worker
|
||||
# manifest is not complete. For N3000 device, it could get rid
|
||||
# of a valid entry with a different PCI address but restored
|
||||
# from previous database backup
|
||||
cleanup_stale = \
|
||||
os.path.exists(tsc.VOLATILE_WORKER_CONFIG_COMPLETE)
|
||||
rpcapi.pci_device_update_by_host(icontext,
|
||||
self._ihost_uuid,
|
||||
pci_device_list,
|
||||
cleanup_stale)
|
||||
except Exception:
|
||||
LOG.exception("Exception updating n3000 PCI devices, "
|
||||
"this will likely cause problems.")
|
||||
pass
|
||||
|
||||
# Collect FPGA data for this host.
|
||||
fpgainfo_list = self._ifpga_operator.get_fpga_info()
|
||||
LOG.info("reporting FPGA inventory for host %s: %s" %
|
||||
(self._ihost_uuid, fpgainfo_list))
|
||||
try:
|
||||
rpcapi.fpga_device_update_by_host(icontext, self._ihost_uuid, fpgainfo_list)
|
||||
self._report_to_conductor_fpga_info = False
|
||||
except exception.SysinvException:
|
||||
LOG.exception("Exception updating fpga devices.")
|
||||
pass
|
||||
|
||||
# Notify conductor of inventory completion after necessary
|
||||
# inventory reports have been sent to conductor.
|
||||
# This is as defined by _conditions_for_inventory_complete_met().
|
||||
@ -1919,6 +1966,27 @@ class AgentManager(service.PeriodicService):
|
||||
|
||||
return
|
||||
|
||||
def device_update_image(self, context, host_uuid, pci_addr, filename, transaction_id,
|
||||
retimer_included):
|
||||
"""Write the device image to the device at the specified address.
|
||||
|
||||
transaction_id is the transaction ID as specified by sysinv-conductor.
|
||||
|
||||
This must send back either success or failure to sysinv-conductor
|
||||
via an RPC cast. The transaction ID is sent back to allow sysinv-conductor
|
||||
to locate the transaction in the DB.
|
||||
|
||||
TODO: could get fancier with an image cache and delete based on LRU.
|
||||
"""
|
||||
LOG.debug("AgentManager.device_update_image: %s" % pci_addr)
|
||||
if self._ihost_uuid and self._ihost_uuid == host_uuid:
|
||||
self._ifpga_operator.device_update_image(context,
|
||||
host_uuid,
|
||||
pci_addr,
|
||||
filename,
|
||||
transaction_id,
|
||||
retimer_included)
|
||||
|
||||
def execute_command(self, context, host_uuid, command):
|
||||
"""Execute a command on behalf of sysinv-conductor
|
||||
|
||||
|
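The FPGA block added to the audit path above replaces the old blocking waits with a check that runs on every audit pass and reports once. A rough sketch of that pattern follows. It assumes the whole report is gated on the reset flag file, and `OneShotFpgaReporter`, `reset_flag_path` and `report` are hypothetical names that do not appear in the commit.

```python
import os


class OneShotFpgaReporter(object):
    """Report FPGA inventory once, after a reset flag file appears."""

    def __init__(self, reset_flag_path, report):
        self._reset_flag_path = reset_flag_path
        self._report = report      # callable that sends the inventory
        self._pending = True       # plays the role of the boolean flag

    def audit(self):
        """Called periodically; returns True only when a report was sent."""
        if not self._pending:
            return False
        if not os.path.exists(self._reset_flag_path):
            # Devices not reset yet: skip now, retry on the next audit.
            return False
        try:
            self._report()
        except Exception:
            # Leave _pending set so the next audit retries the report.
            return False
        self._pending = False
        return True
```

The design trade-off is that nothing blocks inside the agent: if the flag file is missing, the audit simply moves on and the report is attempted again on the next pass.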
@ -20,10 +20,10 @@ from eventlet.green import subprocess
|
||||
from glob import glob
|
||||
from oslo_log import log
|
||||
|
||||
from sysinv.common import fpga_constants
|
||||
from sysinv.common import utils
|
||||
from sysinv.common import exception
|
||||
from sysinv.fpga_agent.manager import get_n3000_devices
|
||||
from sysinv.fpga_agent import constants
|
||||
from sysinv.agent import fpga
|
||||
|
||||
# Volatile flag file so we only reset the N3000s once after bootup.
|
||||
LOG = log.getLogger(__name__)
|
||||
@ -42,48 +42,48 @@ EEPROM_UPDATE_SUCCESS = '0x1111'
|
||||
|
||||
|
||||
def n3000_img_accessible():
|
||||
cmd = 'ctr -n=k8s.io image list name=="%s"' % constants.OPAE_IMG
|
||||
cmd = 'ctr -n=k8s.io image list name=="%s"' % fpga_constants.OPAE_IMG
|
||||
items = subprocess.check_output(shlex.split(cmd), # pylint: disable=not-callable
|
||||
stderr=subprocess.STDOUT,
|
||||
universal_newlines=True)
|
||||
for line in items.splitlines():
|
||||
if constants.OPAE_IMG in line:
|
||||
LOG.info('%s image found' % constants.OPAE_IMG)
|
||||
return constants.OPAE_IMG
|
||||
if fpga_constants.OPAE_IMG in line:
|
||||
LOG.info('%s image found' % fpga_constants.OPAE_IMG)
|
||||
return fpga_constants.OPAE_IMG
|
||||
LOG.info('%s image not found, check older image' %
|
||||
constants.OPAE_IMG)
|
||||
fpga_constants.OPAE_IMG)
|
||||
# During upgrade, check if the previous version is available
|
||||
cmd = 'ctr -n=k8s.io image list name=="%s"' % constants.OPAE_IMG_PREV
|
||||
cmd = 'ctr -n=k8s.io image list name=="%s"' % fpga_constants.OPAE_IMG_PREV
|
||||
items = subprocess.check_output(shlex.split(cmd), # pylint: disable=not-callable
|
||||
stderr=subprocess.STDOUT,
|
||||
universal_newlines=True)
|
||||
for line in items.splitlines():
|
||||
if constants.OPAE_IMG_PREV in line:
|
||||
LOG.info('%s image found' % constants.OPAE_IMG_PREV)
|
||||
return constants.OPAE_IMG_PREV
|
||||
if fpga_constants.OPAE_IMG_PREV in line:
|
||||
LOG.info('%s image found' % fpga_constants.OPAE_IMG_PREV)
|
||||
return fpga_constants.OPAE_IMG_PREV
|
||||
LOG.info('%s image not found, try image pull from controller' %
|
||||
constants.OPAE_IMG_PREV)
|
||||
fpga_constants.OPAE_IMG_PREV)
|
||||
|
||||
# n3000 image not found in containerd, get it from the controller
|
||||
try:
|
||||
subprocess.check_output(["crictl", "pull", constants.OPAE_IMG]) # pylint: disable=not-callable
|
||||
LOG.info("Image %s imported by containerd" % constants.OPAE_IMG)
|
||||
return constants.OPAE_IMG
|
||||
subprocess.check_output(["crictl", "pull", fpga_constants.OPAE_IMG]) # pylint: disable=not-callable
|
||||
LOG.info("Image %s imported by containerd" % fpga_constants.OPAE_IMG)
|
||||
return fpga_constants.OPAE_IMG
|
||||
except subprocess.CalledProcessError as exc:
|
||||
msg = ("Failed to pull image %s, "
|
||||
"return code is %d, command output: %s." %
|
||||
(constants.OPAE_IMG, exc.returncode, exc.output))
|
||||
(fpga_constants.OPAE_IMG, exc.returncode, exc.output))
|
||||
LOG.info(msg)
|
||||
# During upgrade the current version is not available,
|
||||
# try pulling the previous version
|
||||
try:
|
||||
subprocess.check_output(["crictl", "pull", constants.OPAE_IMG_PREV]) # pylint: disable=not-callable
|
||||
LOG.info("Image %s imported by containerd" % constants.OPAE_IMG_PREV)
|
||||
return constants.OPAE_IMG_PREV
|
||||
subprocess.check_output(["crictl", "pull", fpga_constants.OPAE_IMG_PREV]) # pylint: disable=not-callable
|
||||
LOG.info("Image %s imported by containerd" % fpga_constants.OPAE_IMG_PREV)
|
||||
return fpga_constants.OPAE_IMG_PREV
|
||||
except subprocess.CalledProcessError as exc:
|
||||
msg = ("Failed to pull image %s, "
|
||||
"return code is %d, command output: %s." %
|
||||
(constants.OPAE_IMG_PREV, exc.returncode, exc.output))
|
||||
(fpga_constants.OPAE_IMG_PREV, exc.returncode, exc.output))
|
||||
LOG.info(msg)
|
||||
return None
|
||||
|
||||
@ -155,12 +155,12 @@ def update_device_n3000_retimer(pci_addr):
|
||||
|
||||
|
||||
def reset_n3000_fpgas():
|
||||
if not os.path.exists(constants.N3000_RESET_FLAG):
|
||||
if not os.path.exists(fpga_constants.N3000_RESET_FLAG):
|
||||
# Reset all N3000 FPGAs on the system.
|
||||
# TODO: make this run in parallel if there are multiple devices.
|
||||
LOG.info("Resetting N3000 FPGAs.")
|
||||
got_exception = False
|
||||
fpga_addrs = get_n3000_devices()
|
||||
fpga_addrs = fpga.FpgaOperator().get_n3000_devices()
|
||||
opae_img = n3000_img_accessible()
|
||||
if opae_img is None:
|
||||
LOG.info("n3000 opae image is not ready, exit...")
|
||||
@ -172,9 +172,9 @@ def reset_n3000_fpgas():
|
||||
except Exception:
|
||||
got_exception = True
|
||||
|
||||
if not got_exception and os.path.exists(constants.N3000_RETIMER_FLAG):
|
||||
if not got_exception and os.path.exists(fpga_constants.N3000_RETIMER_FLAG):
|
||||
# The retimer included flag is set, execute additional steps
|
||||
fpga_addrs = get_n3000_devices()
|
||||
fpga_addrs = fpga.FpgaOperator().get_n3000_devices()
|
||||
for fpga_addr in fpga_addrs:
|
||||
try:
|
||||
LOG.info("Updating retimer")
|
||||
@ -186,9 +186,9 @@ def reset_n3000_fpgas():
|
||||
|
||||
LOG.info("Done resetting N3000 FPGAs.")
|
||||
if not got_exception:
|
||||
utils.touch(constants.N3000_RESET_FLAG)
|
||||
if os.path.exists(constants.N3000_RETIMER_FLAG):
|
||||
os.remove(constants.N3000_RETIMER_FLAG)
|
||||
utils.touch(fpga_constants.N3000_RESET_FLAG)
|
||||
if os.path.exists(fpga_constants.N3000_RETIMER_FLAG):
|
||||
os.remove(fpga_constants.N3000_RETIMER_FLAG)
|
||||
return True
|
||||
else:
|
||||
return False
|
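The lookup order in `n3000_img_accessible()` above (list the current image, list the previous image, pull the current image, pull the previous image) can be summarised with a small sketch. The `is_present` and `pull` callables are hypothetical stand-ins for the `ctr` and `crictl` invocations, and `first_available_image` is not a function in the commit.

```python
def first_available_image(candidates, is_present, pull):
    """Return the first usable image name, or None if none can be obtained.

    candidates: image names in preference order (current, then previous).
    is_present: callable(name) -> bool, True if already in the local store.
    pull:       callable(name) -> bool, True if the pull succeeded.
    """
    # First pass: prefer anything that is already available locally.
    for image in candidates:
        if is_present(image):
            return image

    # Second pass: try pulling, again in preference order.
    for image in candidates:
        if pull(image):
            return image

    return None


if __name__ == "__main__":
    local = {"opae:prev"}
    print(first_available_image(["opae:new", "opae:prev"],
                                lambda name: name in local,
                                lambda name: False))
```

Checking both local copies before any pull means that during an upgrade a host with only the previous image cached does not need the registry on the controller to be reachable.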
@ -281,3 +281,16 @@ class AgentAPI(sysinv.openstack.common.rpc.proxy.RpcProxy):
|
||||
return self.call(context,
|
||||
self.make_msg('update_host_lvm',
|
||||
host_uuid=host_uuid))
|
||||
|
||||
# handle firmware updates on FPGA devices
|
||||
def host_device_update_image(self, context, host_uuid, hostname, pci_addr,
|
||||
filename, transaction_id, retimer_included):
|
||||
LOG.info("sending device_update_image to host %s" % hostname)
|
||||
topic = '%s.%s' % (self.topic, hostname)
|
||||
return self.cast(context,
|
||||
self.make_msg('device_update_image',
|
||||
host_uuid=host_uuid,
|
||||
pci_addr=pci_addr, filename=filename,
|
||||
transaction_id=transaction_id,
|
||||
retimer_included=retimer_included),
|
||||
topic=topic)
|
||||
|
@ -23,7 +23,7 @@ from sysinv.common import constants
|
||||
from sysinv.common import device as dconstants
|
||||
from sysinv.common import exception
|
||||
from sysinv.common import utils as cutils
|
||||
from sysinv.fpga_agent import constants as fpga_constants
|
||||
from sysinv.common import fpga_constants
|
||||
from sysinv import objects
|
||||
|
||||
LOG = log.getLogger(__name__)
|
||||
|
@ -1,49 +0,0 @@
|
||||
#!/usr/bin/env python
|
||||
# -*- encoding: utf-8 -*-
|
||||
#
|
||||
# vim: tabstop=4 shiftwidth=4 softtabstop=4
|
||||
#
|
||||
# Copyright 2013 Hewlett-Packard Development Company, L.P.
|
||||
# All Rights Reserved.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License"); you may
|
||||
# not use this file except in compliance with the License. You may obtain
|
||||
# a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
|
||||
# Copyright (c) 2020 Wind River Systems, Inc.
|
||||
|
||||
"""
|
||||
The System Inventory FPGA Agent Service
|
||||
"""
|
||||
|
||||
import sys
|
||||
|
||||
from oslo_config import cfg
|
||||
|
||||
from sysinv.openstack.common import service
|
||||
|
||||
from sysinv.common import service as sysinv_service
|
||||
from sysinv.fpga_agent import manager
|
||||
from sysinv import sanity_coverage
|
||||
|
||||
CONF = cfg.CONF
|
||||
|
||||
|
||||
def main():
|
||||
if sanity_coverage.flag_file_exists():
|
||||
sanity_coverage.start()
|
||||
# Parse config file and command line options, then start logging
|
||||
sysinv_service.prepare_service(sys.argv)
|
||||
|
||||
# beware: connection is based upon host and MANAGER_TOPIC
|
||||
mgr = manager.FpgaAgentManager(CONF.host, manager.MANAGER_TOPIC)
|
||||
launcher = service.launch(mgr)
|
||||
launcher.wait()
|
@@ -16,7 +16,7 @@
|
||||
|
||||
from oslo_config import cfg
|
||||
from oslo_log import log as logging
|
||||
from sysinv.fpga_agent.reset_n3000_fpgas import reset_n3000_fpgas
|
||||
from sysinv.agent.reset_n3000_fpgas import reset_n3000_fpgas
|
||||
|
||||
LOG = logging.getLogger(__name__)
|
||||
CONF = cfg.CONF
|
||||
|
@@ -1,5 +1,5 @@
|
||||
#
|
||||
# Copyright (c) 2020-2021 Wind River Systems, Inc.
|
||||
# Copyright (c) 2020-2022 Wind River Systems, Inc.
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
@@ -84,6 +84,7 @@ from sysinv.api.controllers.v1 import kube_app as kube_api
|
||||
from sysinv.api.controllers.v1 import mtce_api
|
||||
from sysinv.api.controllers.v1 import utils
|
||||
from sysinv.api.controllers.v1 import vim_api
|
||||
from sysinv.common import fpga_constants
|
||||
from sysinv.common import constants
|
||||
from sysinv.common import ceph as cceph
|
||||
from sysinv.common import dc_api
|
||||
@@ -106,8 +107,6 @@ from sysinv.conductor import openstack
|
||||
from sysinv.conductor import docker_registry
|
||||
from sysinv.conductor import keystone_listener
|
||||
from sysinv.db import api as dbapi
|
||||
from sysinv.fpga_agent import rpcapi as fpga_agent_rpcapi
|
||||
from sysinv.fpga_agent import constants as fpga_constants
|
||||
from sysinv import objects
|
||||
from sysinv.objects import base as objects_base
|
||||
from sysinv.objects import kube_app as kubeapp_obj
|
||||
@@ -14825,10 +14824,10 @@ class ConductorManager(service.PeriodicService):
|
||||
filename = cutils.format_image_filename(device_image)
|
||||
LOG.info("sending rpc req to update image for host %s, pciaddr: %s, filename: %s, id: %s" %
|
||||
(host.hostname, pci_device.pciaddr, filename, device_image_state.id))
|
||||
fpga_rpcapi = fpga_agent_rpcapi.AgentAPI()
|
||||
fpga_rpcapi = agent_rpcapi.AgentAPI()
|
||||
fpga_rpcapi.host_device_update_image(
|
||||
context, host.hostname, pci_device.pciaddr, filename, device_image_state.id,
|
||||
device_image.retimer_included)
|
||||
context, host_uuid, host.hostname, pci_device.pciaddr, filename,
|
||||
device_image_state.id, device_image.retimer_included)
|
||||
# We've kicked off a device image update, so exit the function.
|
||||
return
|
||||
LOG.info("no more device images to process")
|
||||
|
@@ -1,11 +0,0 @@
|
||||
#
|
||||
# Copyright (c) 2020 Wind River Systems, Inc.
|
||||
#
|
||||
# SPDX-License-Identifier: Apache-2.0
|
||||
#
|
||||
|
||||
# vim: tabstop=4 shiftwidth=4 softtabstop=4
|
||||
# coding=utf-8
|
||||
|
||||
# All Rights Reserved.
|
||||
#
|
@@ -1,697 +0,0 @@
|
||||
# vim: tabstop=4 shiftwidth=4 softtabstop=4
|
||||
# coding=utf-8
|
||||
|
||||
# Copyright 2013 Hewlett-Packard Development Company, L.P.
|
||||
# Copyright 2013 International Business Machines Corporation
|
||||
# All Rights Reserved.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License"); you may
|
||||
# not use this file except in compliance with the License. You may obtain
|
||||
# a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
#
|
||||
# Copyright (c) 2020-2022 Wind River Systems, Inc.
|
||||
#
|
||||
|
||||
|
||||
""" Perform activity related to FPGA devices on a single host.
|
||||
|
||||
A single instance of :py:class:`sysinv.agent.manager.FpgaAgentManager` is
|
||||
created within the *sysinv-fpga-agent* process, and is responsible for
|
||||
performing all actions for this host related to FPGA devices.
|
||||
|
||||
On start, collect and post FPGA inventory to conductor.
|
||||
|
||||
Commands (from conductors) are received via RPC calls.
|
||||
|
||||
"""
|
||||
|
||||
from __future__ import print_function
|
||||
import errno
|
||||
from eventlet.green import subprocess
|
||||
from glob import glob
|
||||
import six
|
||||
|
||||
import os
|
||||
import shlex
|
||||
import time
|
||||
|
||||
from oslo_config import cfg
|
||||
from oslo_log import log
|
||||
from oslo_utils import uuidutils
|
||||
from six.moves.urllib.request import urlretrieve
|
||||
|
||||
from sysinv.agent import pci
|
||||
from sysinv.common import constants as cconstants
|
||||
from sysinv.common import device as dconstants
|
||||
from sysinv.common import exception
|
||||
from sysinv.common import service
|
||||
from sysinv.common import utils
|
||||
from sysinv.conductor import rpcapi as conductor_rpcapi
|
||||
from sysinv.fpga_agent import constants
|
||||
from sysinv.objects import base as objects_base
|
||||
from sysinv.openstack.common import context as ctx
|
||||
|
||||
import tsconfig.tsconfig as tsc
|
||||
|
||||
MANAGER_TOPIC = 'sysinv.fpga_agent_manager'
|
||||
|
||||
LOG = log.getLogger(__name__)
|
||||
|
||||
agent_opts = [
|
||||
cfg.StrOpt('api_url',
|
||||
default=None,
|
||||
help=('Url of SysInv API service. If not set SysInv can '
|
||||
'get current value from Keystone service catalog.')),
|
||||
cfg.IntOpt('audit_interval',
|
||||
default=60,
|
||||
help='Maximum time since the last check-in of an agent'),
|
||||
]
|
||||
|
||||
CONF = cfg.CONF
|
||||
CONF.register_opts(agent_opts, 'fpga_agent')
|
||||
|
||||
# This is the location where we cache the device image file while
|
||||
# writing it to the hardware.
|
||||
DEVICE_IMAGE_CACHE_ROOT_DIR = "/var" if utils.is_debian() else "/usr"
|
||||
DEVICE_IMAGE_CACHE_DIR = DEVICE_IMAGE_CACHE_ROOT_DIR + \
|
||||
"/local/share/applications/sysinv"
|
||||
|
||||
SYSFS_DEVICE_PATH = "/sys/bus/pci/devices/"
|
||||
FME_PATH = "/fpga/intel-fpga-dev.*/intel-fpga-fme.*/"
|
||||
SPI_PATH = "spi-altera.*.auto/spi_master/spi*/spi*.*/"
|
||||
|
||||
# These are relative to FME_PATH
|
||||
BITSTREAM_ID_PATH = "bitstream_id"
|
||||
|
||||
# These are relative to SPI_PATH
|
||||
ROOT_HASH_PATH = "ifpga_sec_mgr/ifpga_sec*/security/sr_root_hash"
|
||||
CANCELLED_CSKS_PATH = "ifpga_sec_mgr/ifpga_sec*/security/sr_canceled_csks"
|
||||
IMAGE_LOAD_PATH = "fpga_flash_ctrl/fpga_image_load"
|
||||
BMC_FW_VER_PATH = "bmcfw_flash_ctrl/bmcfw_version"
|
||||
BMC_BUILD_VER_PATH = "max10_version"
|
||||
RETIMER_A_VER_PATH = "pkvl/pkvl_a_version"
|
||||
RETIMER_B_VER_PATH = "pkvl/pkvl_b_version"
|
||||
|
||||
# Length of the retimer version in database
|
||||
RETIMER_VERSION_LENGTH = 32
|
||||
|
||||
|
||||
def wait_for_n3000_reset():
|
||||
LOG.info("Waiting for n3000 reset flag.")
|
||||
timeout = 0
|
||||
while not os.path.exists(constants.N3000_RESET_FLAG):
|
||||
if timeout > constants.N3000_RESET_TIMEOUT:
|
||||
msg = ("Timeout waiting for n3000 reset flag")
|
||||
LOG.info(msg)
|
||||
return
|
||||
time.sleep(1)
|
||||
timeout += 1
|
||||
LOG.info("Found n3000 reset flag, continuing.")
|
||||
|
||||
|
||||
def ensure_device_image_cache_exists():
|
||||
# Make sure the image cache directory exists, create it if needed.
|
||||
try:
|
||||
os.mkdir(DEVICE_IMAGE_CACHE_DIR, 0o755)
|
||||
except OSError as exc:
|
||||
if exc.errno != errno.EEXIST:
|
||||
msg = ("Unable to create device image cache directory %s!"
|
||||
% DEVICE_IMAGE_CACHE_DIR)
|
||||
LOG.exception(msg)
|
||||
raise exception.SysinvException(msg)
|
||||
|
||||
|
||||
def get_http_port():
|
||||
# Get the http_port from /etc/platform/platform.conf.
|
||||
prefix = "http_port="
|
||||
http_port = cconstants.SERVICE_PARAM_HTTP_PORT_HTTP_DEFAULT
|
||||
if os.path.isfile(tsc.PLATFORM_CONF_FILE):
|
||||
with open(tsc.PLATFORM_CONF_FILE, 'r') as platform_file:
|
||||
for line in platform_file:
|
||||
line = line.strip()
|
||||
if line.startswith(prefix):
|
||||
port = line[len(prefix):]
|
||||
if utils.is_int_like(port):
|
||||
LOG.info("Agent found %s%s" % (prefix, port))
|
||||
http_port = port
|
||||
break
|
||||
else:
|
||||
LOG.info("http_port entry: %s in platform.conf "
|
||||
"is not an integer" % port)
|
||||
return http_port
|
||||
|
||||
|
||||
def fetch_device_image(filename):
|
||||
# Pull the image from the controller.
|
||||
http_port = get_http_port()
|
||||
url = "http://controller:{}/device_images/{}".format(http_port, filename)
|
||||
local_path = DEVICE_IMAGE_CACHE_DIR + "/" + filename
|
||||
try:
|
||||
imagefile, headers = urlretrieve(url, local_path)
|
||||
except IOError:
|
||||
msg = ("Unable to retrieve device image from %s!" % url)
|
||||
LOG.exception(msg)
|
||||
raise exception.SysinvException(msg)
|
||||
return local_path
|
||||
|
||||
|
||||
def cleanup_container():
|
||||
# Delete container if exists
|
||||
cmd = 'ctr -n=k8s.io container list image=="%s"' % constants.OPAE_IMG
|
||||
items = subprocess.check_output(shlex.split(cmd), # pylint: disable=not-callable
|
||||
stderr=subprocess.STDOUT,
|
||||
universal_newlines=True)
|
||||
for line in items.splitlines():
|
||||
if constants.OPAE_IMG in line:
|
||||
cmd = 'ctr -n=k8s.io container rm n3000-opae'
|
||||
subprocess.check_output(shlex.split(cmd), # pylint: disable=not-callable
|
||||
stderr=subprocess.STDOUT,
|
||||
universal_newlines=True)
|
||||
LOG.info('Deleted stale container n3000-opae')
|
||||
break
|
||||
|
||||
|
||||
def set_cgroup_cpuset():
|
||||
# Set CPU affinity by updating the cpuset.cpus
|
||||
platform_cpulist = '0'
|
||||
cpuset_path = '/sys/fs/cgroup/cpuset/platform/'
|
||||
cpuset_file = os.path.join(cpuset_path, 'cpuset.cpus')
|
||||
if not os.path.exists(cpuset_path):
|
||||
os.makedirs(cpuset_path)
|
||||
with open('/etc/platform/worker_reserved.conf', 'r') as infile:
|
||||
for line in infile:
|
||||
if "PLATFORM_CPU_LIST" in line:
|
||||
val = line.split("=")
|
||||
platform_cpulist = val[1].strip('\n')[1:-1].strip('"')
|
||||
with open(cpuset_file, 'w') as fd:
|
||||
LOG.info("Writing %s to file %s" % (platform_cpulist, cpuset_file))
|
||||
fd.write(platform_cpulist)
|
||||
|
||||
|
||||
def write_device_image_n3000(filename, pci_addr):
|
||||
# Write the firmware image to the FPGA at the specified PCI address.
|
||||
# We're assuming that the image update tools will catch the scenario
|
||||
# where the image is not compatible with the device.
|
||||
|
||||
# If the container exists, the host probably rebooted during
|
||||
# a device update. Delete the container.
|
||||
cleanup_container()
|
||||
|
||||
# Set cpu affinity for the container
|
||||
set_cgroup_cpuset()
|
||||
|
||||
try:
|
||||
# Build up the command to perform the firmware update.
|
||||
# Note the hack to work around OPAE tool locale issues
|
||||
cmd = ("ctr -n=k8s.io run --rm --privileged " +
|
||||
"--env LC_ALL=en_US.UTF-8 --env LANG=en_US.UTF-8 " +
|
||||
"--cgroup platform " +
|
||||
"--mount type=bind,src=" + DEVICE_IMAGE_CACHE_DIR +
|
||||
",dst=/mnt/images,options=rbind:ro " + constants.OPAE_IMG +
|
||||
" n3000-opae fpgasupdate -y --log-level debug /mnt/images/" +
|
||||
filename + " " + pci_addr)
|
||||
|
||||
# Issue the command to perform the firmware update.
|
||||
subprocess.check_output(shlex.split(cmd), # pylint: disable=not-callable
|
||||
stderr=subprocess.STDOUT)
|
||||
# TODO: switch to subprocess.Popen, parse the output and send
|
||||
# progress updates.
|
||||
except subprocess.CalledProcessError as exc:
|
||||
# Check the return code, send completion info to sysinv-conductor.
|
||||
msg = ("Failed to update device image %s for device %s, "
|
||||
"return code is %d, command output: %s." %
|
||||
(filename, pci_addr, exc.returncode,
|
||||
exc.output.decode('utf-8')))
|
||||
LOG.error(msg)
|
||||
LOG.error("Check for intel-max10 kernel logs.")
|
||||
raise exception.SysinvException(msg)
|
||||
|
||||
|
||||
def read_n3000_sysfs_file(pattern):
|
||||
# Read a sysfs file related to the N3000.
|
||||
# The result should be an empty string if the file doesn't exist,
|
||||
# or a single line of text if it does.
|
||||
|
||||
# Convert the pattern to a list of matching filenames
|
||||
filenames = glob(pattern)
|
||||
|
||||
# If there are no matching files, return an empty string.
|
||||
if len(filenames) == 0:
|
||||
return ""
|
||||
|
||||
# If there's more than one filename, complain.
|
||||
if len(filenames) > 1:
|
||||
LOG.warn("Pattern %s gave %s matching filenames, using the first." %
|
||||
(pattern, len(filenames)))
|
||||
|
||||
filename = filenames[0]
|
||||
infile = open(filename)
|
||||
try:
|
||||
line = infile.readline()
|
||||
return line.strip()
|
||||
except Exception:
|
||||
LOG.exception("Unable to read file %s" % filename)
|
||||
finally:
|
||||
infile.close()
|
||||
return ""
|
||||
|
||||
|
||||
def get_n3000_root_hash(pci_addr):
|
||||
# Query sysfs for the root key of the N3000 at the specified PCI address
|
||||
root_key_pattern = (SYSFS_DEVICE_PATH + pci_addr + FME_PATH +
|
||||
SPI_PATH + ROOT_HASH_PATH)
|
||||
root_key = read_n3000_sysfs_file(root_key_pattern)
|
||||
# If the root key hasn't been programmed, return an empty string.
|
||||
if root_key == "hash not programmed":
|
||||
root_key = ""
|
||||
return root_key
|
||||
|
||||
|
||||
def get_n3000_revoked_keys(pci_addr):
|
||||
# Query sysfs for revoked keys of the N3000 at the specified PCI address
|
||||
revoked_key_pattern = (SYSFS_DEVICE_PATH + pci_addr + FME_PATH +
|
||||
SPI_PATH + CANCELLED_CSKS_PATH)
|
||||
revoked_keys = read_n3000_sysfs_file(revoked_key_pattern)
|
||||
return revoked_keys
|
||||
|
||||
|
||||
def get_n3000_bitstream_id(pci_addr):
|
||||
# Query sysfs for bitstream ID of the N3000 at the specified PCI address
|
||||
bitstream_id_pattern = (SYSFS_DEVICE_PATH + pci_addr + FME_PATH +
|
||||
BITSTREAM_ID_PATH)
|
||||
bitstream_id = read_n3000_sysfs_file(bitstream_id_pattern)
|
||||
return bitstream_id
|
||||
|
||||
|
||||
def get_n3000_boot_page(pci_addr):
|
||||
# Query sysfs for boot page of the N3000 at the specified PCI address
|
||||
image_load_pattern = (SYSFS_DEVICE_PATH + pci_addr + FME_PATH +
|
||||
SPI_PATH + IMAGE_LOAD_PATH)
|
||||
image_load = read_n3000_sysfs_file(image_load_pattern)
|
||||
if image_load == "0":
|
||||
return "factory"
|
||||
elif image_load == "1":
|
||||
return "user"
|
||||
else:
|
||||
LOG.warn("Reading image load gave unexpected result: %s" % image_load)
|
||||
return ""
|
||||
|
||||
|
||||
def get_n3000_bmc_version(pci_addr, path):
|
||||
version_pattern = (SYSFS_DEVICE_PATH + pci_addr + FME_PATH +
|
||||
SPI_PATH + path)
|
||||
version = read_n3000_sysfs_file(version_pattern)
|
||||
|
||||
# If we couldn't read the file, return an empty string.
|
||||
if version == "":
|
||||
return ""
|
||||
|
||||
# We're expecting a 32-bit value, possibly with "0x" in front.
|
||||
try:
|
||||
vint = int(version, 16)
|
||||
except ValueError:
|
||||
return ""
|
||||
|
||||
if vint >= 1 << 32:
|
||||
LOG.warn("String (%s) read from file %s doesn't match the "
|
||||
"expected pattern" % (version, version_pattern))
|
||||
return ""
|
||||
# There's probably a better way than this.
|
||||
# We want to match the version that Intel's "fpgainfo" tool reports.
|
||||
return ("%s.%s.%s.%s" % (chr(vint >> 24), str(vint >> 16 & 0xff),
|
||||
str(vint >> 8 & 0xff), str(vint & 0xff)))
|
||||
|
||||
|
||||
def get_n3000_bmc_fw_version(pci_addr):
|
||||
return get_n3000_bmc_version(pci_addr, BMC_FW_VER_PATH)
|
||||
|
||||
|
||||
def get_n3000_bmc_build_version(pci_addr):
|
||||
return get_n3000_bmc_version(pci_addr, BMC_BUILD_VER_PATH)
|
||||
|
||||
|
||||
def get_n3000_retimer_version(pci_addr, path):
|
||||
version_pattern = (SYSFS_DEVICE_PATH + pci_addr + FME_PATH +
|
||||
SPI_PATH + path)
|
||||
version = read_n3000_sysfs_file(version_pattern)
|
||||
if len(version) > RETIMER_VERSION_LENGTH:
|
||||
LOG.warn("Retimer version string (%s) read from file %s is "
|
||||
"unexpectedly long. It is truncating." %
|
||||
(version, version_pattern))
|
||||
version = version[:RETIMER_VERSION_LENGTH]
|
||||
return version
|
||||
|
||||
|
||||
def get_n3000_retimer_a_version(pci_addr):
|
||||
return get_n3000_retimer_version(pci_addr, RETIMER_A_VER_PATH)
|
||||
|
||||
|
||||
def get_n3000_retimer_b_version(pci_addr):
|
||||
return get_n3000_retimer_version(pci_addr, RETIMER_B_VER_PATH)
|
||||
|
||||
|
||||
def get_n3000_devices():
|
||||
# First get the PCI addresses of each supported FPGA device
|
||||
cmd = ["lspci", "-Dm", "-d " + constants.N3000_VENDOR + ":" +
|
||||
constants.N3000_DEVICE]
|
||||
|
||||
try:
|
||||
output = subprocess.check_output( # pylint: disable=not-callable
|
||||
cmd, stderr=subprocess.STDOUT, universal_newlines=True)
|
||||
except subprocess.CalledProcessError as exc:
|
||||
msg = ("Failed to get pci devices with vendor %s and device %s, "
|
||||
"return code is %d, command output: %s." %
|
||||
(constants.N3000_VENDOR, constants.N3000_DEVICE, exc.returncode, exc.output))
|
||||
LOG.warn(msg)
|
||||
raise exception.SysinvException(msg)
|
||||
|
||||
# Parse the output of the lspci command and grab the PCI address
|
||||
fpga_addrs = []
|
||||
for line in output.splitlines():
|
||||
line = shlex.split(line.strip())
|
||||
fpga_addrs.append(line[0])
|
||||
return fpga_addrs
|
||||
|
||||
|
||||
def get_n3000_pci_info():
|
||||
""" Query PCI information about N3000 PCI devices.
|
||||
|
||||
This needs to exactly mirror what sysinv-agent does as far as PCI
|
||||
updates. We could potentially modify sysinv-agent to do the PCI
|
||||
updates when triggered by an RPC cast, but we don't need to rescan
|
||||
all PCI devices, just the N3000 devices.
|
||||
"""
|
||||
pci_devs = []
|
||||
pci_device_list = []
|
||||
try:
|
||||
pci_operator = pci.PCIOperator()
|
||||
# We want to get updated info for the FPGA itself and any "virtual"
|
||||
# PCI devices implemented by the FPGA. This loop isn't very
|
||||
# efficient, but so far it's only a small number of devices.
|
||||
pci_devices = []
|
||||
for device in constants.N3000_DEVICES:
|
||||
pci_devices.extend(pci_operator.pci_devices_get(
|
||||
vendor=constants.N3000_VENDOR, device=device))
|
||||
for pci_dev in pci_devices:
|
||||
pci_dev_array = pci_operator.pci_get_device_attrs(
|
||||
pci_dev.pciaddr)
|
||||
for dev in pci_dev_array:
|
||||
pci_devs.append(pci.PCIDevice(pci_dev, **dev))
|
||||
|
||||
is_fpga_n3000_reset = \
|
||||
os.path.exists(constants.N3000_RESET_FLAG)
|
||||
|
||||
for dev in pci_devs:
|
||||
pci_dev_dict = {'name': dev.name,
|
||||
'pciaddr': dev.pci.pciaddr,
|
||||
'pclass_id': dev.pclass_id,
|
||||
'pvendor_id': dev.pvendor_id,
|
||||
'pdevice_id': dev.pdevice_id,
|
||||
'pclass': dev.pci.pclass,
|
||||
'pvendor': dev.pci.pvendor,
|
||||
'pdevice': dev.pci.pdevice,
|
||||
'prevision': dev.pci.prevision,
|
||||
'psvendor': dev.pci.psvendor,
|
||||
'psdevice': dev.pci.psdevice,
|
||||
'numa_node': dev.numa_node,
|
||||
'sriov_totalvfs': dev.sriov_totalvfs,
|
||||
'sriov_numvfs': dev.sriov_numvfs,
|
||||
'sriov_vfs_pci_address': dev.sriov_vfs_pci_address,
|
||||
'sriov_vf_driver': dev.sriov_vf_driver,
|
||||
'sriov_vf_pdevice_id': dev.sriov_vf_pdevice_id,
|
||||
'driver': dev.driver,
|
||||
'enabled': dev.enabled,
|
||||
'extra_info': dev.extra_info,
|
||||
'fpga_n3000_reset': is_fpga_n3000_reset}
|
||||
LOG.debug('Sysinv FPGA Agent dev {}'.format(pci_dev_dict))
|
||||
pci_device_list.append(pci_dev_dict)
|
||||
except Exception:
|
||||
LOG.exception("Unable to query FPGA pci information, "
|
||||
"sysinv DB will be stale")
|
||||
|
||||
return pci_device_list
|
||||
|
||||
|
||||
def watchdog_action(action):
|
||||
if action not in ["stop", "start"]:
|
||||
LOG.warn("watchdog_action called with invalid action: %s", action)
|
||||
return
|
||||
try:
|
||||
# Build up the command to perform the action.
|
||||
cmd = ["systemctl", action, "hostw"]
|
||||
|
||||
# Issue the command to stop/start the watchdog
|
||||
subprocess.check_output( # pylint: disable=not-callable
|
||||
cmd, stderr=subprocess.STDOUT,
|
||||
universal_newlines=True)
|
||||
except subprocess.CalledProcessError as exc:
|
||||
msg = ("Failed to %s hostw service, "
|
||||
"return code is %d, command output: %s." %
|
||||
(action, exc.returncode, exc.output))
|
||||
LOG.warn(msg)
|
||||
|
||||
|
||||
def stop_watchdog():
|
||||
watchdog_action("stop")
|
||||
|
||||
|
||||
def start_watchdog():
|
||||
watchdog_action("start")
|
||||
|
||||
|
||||
class FpgaAgentManager(service.PeriodicService):
|
||||
"""Sysinv FPGA Agent service main class."""
|
||||
|
||||
RPC_API_VERSION = '1.0'
|
||||
|
||||
def __init__(self, host, topic):
|
||||
serializer = objects_base.SysinvObjectSerializer()
|
||||
super(FpgaAgentManager, self).__init__(host, topic, serializer=serializer)
|
||||
|
||||
self.host_uuid = None
|
||||
|
||||
def start(self):
|
||||
super(FpgaAgentManager, self).start()
|
||||
|
||||
if os.path.isfile('/etc/sysinv/sysinv.conf'):
|
||||
LOG.info('sysinv-fpga-agent started')
|
||||
else:
|
||||
LOG.info('No config file for sysinv-fpga-agent found.')
|
||||
raise exception.ConfigNotFound(message="Unable to find sysinv config file!")
|
||||
|
||||
# Wait for puppet to finish resetting n3000 devices
|
||||
wait_for_n3000_reset()
|
||||
# Wait around until someone else updates the platform.conf file
|
||||
# with our host UUID.
|
||||
self.wait_for_host_uuid()
|
||||
|
||||
context = ctx.get_admin_context()
|
||||
|
||||
# Collect updated PCI device information for N3000 FPGAs
|
||||
# and send it to sysinv-conductor
|
||||
self.fpga_pci_update(context)
|
||||
|
||||
# Collect FPGA inventory and report to conductor.
|
||||
self.report_fpga_inventory(context)
|
||||
|
||||
def periodic_tasks(self, context, raise_on_error=False):
|
||||
""" Periodic tasks are run at pre-specified intervals. """
|
||||
return self.run_periodic_tasks(context, raise_on_error=raise_on_error)
|
||||
|
||||
def wait_for_host_uuid(self):
|
||||
# Get our host UUID from /etc/platform/platform.conf. Note that the
|
||||
# file can exist before the UUID is written to it.
|
||||
prefix = "UUID="
|
||||
while self.host_uuid is None:
|
||||
if os.path.isfile(tsc.PLATFORM_CONF_FILE):
|
||||
with open(tsc.PLATFORM_CONF_FILE, 'r') as platform_file:
|
||||
for line in platform_file:
|
||||
line = line.strip()
|
||||
if not line.startswith(prefix):
|
||||
continue
|
||||
uuid = line[len(prefix):]
|
||||
if uuidutils.is_uuid_like(uuid):
|
||||
self.host_uuid = uuid
|
||||
LOG.info("Agent found host UUID: %s" % uuid)
|
||||
break
|
||||
else:
|
||||
LOG.info("UUID entry: %s in platform.conf "
|
||||
"isn't uuid-like" % uuid)
|
||||
|
||||
time.sleep(5)
|
||||
|
||||
def report_fpga_inventory(self, context):
|
||||
"""Collect FPGA data for this host.
|
||||
|
||||
This method allows host FPGA data to be collected.
|
||||
|
||||
:param: context: an admin context
|
||||
:returns: nothing
|
||||
"""
|
||||
|
||||
host_uuid = self.host_uuid
|
||||
|
||||
rpcapi = conductor_rpcapi.ConductorAPI(
|
||||
topic=conductor_rpcapi.MANAGER_TOPIC)
|
||||
|
||||
fpgainfo_list = self.get_fpga_info()
|
||||
|
||||
LOG.info("reporting FPGA inventory for host %s: %s" %
|
||||
(host_uuid, fpgainfo_list))
|
||||
try:
|
||||
rpcapi.fpga_device_update_by_host(context, host_uuid, fpgainfo_list)
|
||||
except exception.SysinvException:
|
||||
LOG.exception("Exception updating fpga devices.")
|
||||
pass
|
||||
|
||||
def get_fpga_info(self):
|
||||
# For now we only support the N3000, eventually we may need to support
|
||||
# other FPGA devices.
|
||||
|
||||
# Get a list of N3000 FPGA device addresses.
|
||||
fpga_addrs = get_n3000_devices()
|
||||
|
||||
# Next, get additional information for devices in the list.
|
||||
fpgainfo_list = []
|
||||
for addr in fpga_addrs:
|
||||
# Store information for this FPGA
|
||||
fpgainfo = {'pciaddr': addr}
|
||||
fpgainfo['bmc_build_version'] = get_n3000_bmc_build_version(addr)
|
||||
fpgainfo['bmc_fw_version'] = get_n3000_bmc_fw_version(addr)
|
||||
fpgainfo['retimer_a_version'] = get_n3000_retimer_a_version(addr)
|
||||
fpgainfo['retimer_b_version'] = get_n3000_retimer_b_version(addr)
|
||||
fpgainfo['boot_page'] = get_n3000_boot_page(addr)
|
||||
fpgainfo['bitstream_id'] = get_n3000_bitstream_id(addr)
|
||||
fpgainfo['root_key'] = get_n3000_root_hash(addr)
|
||||
fpgainfo['revoked_key_ids'] = get_n3000_revoked_keys(addr)
|
||||
|
||||
# TODO: Also retrieve the information about which NICs are on
|
||||
# the FPGA device.
|
||||
|
||||
fpgainfo_list.append(fpgainfo)
|
||||
|
||||
return fpgainfo_list
|
||||
|
||||
def fpga_pci_update(self, context):
|
||||
"""Collect FPGA PCI data for this host.
|
||||
|
||||
We know that the PCI address of the N3000 can change the first time
|
||||
we reset it after boot, so we need to gather the new PCI device
|
||||
information and send it to sysinv-conductor.
|
||||
|
||||
This needs to exactly mirror what sysinv-agent does as far as PCI
|
||||
updates. We could potentially modify sysinv-agent to do the PCI
|
||||
updates when triggered by an RPC cast, but we don't need to rescan
|
||||
all PCI devices, just the N3000 devices.
|
||||
|
||||
:param: context: an admin context
|
||||
:returns: nothing
|
||||
"""
|
||||
|
||||
LOG.info("Updating N3000 PCI info.")
|
||||
pci_device_list = get_n3000_pci_info()
|
||||
|
||||
rpcapi = conductor_rpcapi.ConductorAPI(
|
||||
topic=conductor_rpcapi.MANAGER_TOPIC)
|
||||
|
||||
host_uuid = self.host_uuid
|
||||
try:
|
||||
if pci_device_list:
|
||||
LOG.info("reporting N3000 PCI devices for host %s: %s" %
|
||||
(host_uuid, pci_device_list))
|
||||
|
||||
# Don't ask conductor to cleanup stale entries while worker
|
||||
# manifest is not complete. For N3000 device, it could get rid
|
||||
# of a valid entry with a different PCI address but restored
|
||||
# from previous database backup
|
||||
cleanup_stale = \
|
||||
os.path.exists(tsc.VOLATILE_WORKER_CONFIG_COMPLETE)
|
||||
rpcapi.pci_device_update_by_host(context,
|
||||
host_uuid,
|
||||
pci_device_list,
|
||||
cleanup_stale)
|
||||
except Exception:
|
||||
LOG.exception("Exception updating n3000 PCI devices, "
|
||||
"this will likely cause problems.")
|
||||
pass
|
||||
|
||||
def device_update_image(self, context, pci_addr, filename, transaction_id,
|
||||
retimer_included):
|
||||
"""Write the device image to the device at the specified address.
|
||||
|
||||
Transaction is the transaction ID as specified by sysinv-conductor.
|
||||
|
||||
This must send back either success or failure to sysinv-conductor
|
||||
via an RPC cast. The transaction ID is sent back to allow sysinv-conductor
|
||||
to locate the transaction in the DB.
|
||||
|
||||
TODO: could get fancier with an image cache and delete based on LRU.
|
||||
"""
|
||||
|
||||
rpcapi = conductor_rpcapi.ConductorAPI(
|
||||
topic=conductor_rpcapi.MANAGER_TOPIC)
|
||||
|
||||
try:
|
||||
LOG.info("ensure device image cache exists")
|
||||
ensure_device_image_cache_exists()
|
||||
|
||||
# Pull the image from the controller via HTTP
|
||||
LOG.info("fetch device image %s" % filename)
|
||||
local_path = fetch_device_image(filename)
|
||||
|
||||
# TODO: check CSK used to sign image, ensure it hasn't been cancelled
|
||||
# TODO: check root key used to sign image, ensure it matches root key of hardware
|
||||
# Note: may want to check these in the sysinv API too.
|
||||
|
||||
try:
|
||||
LOG.info("setting transaction id %s as in progress" % transaction_id)
|
||||
rpcapi.device_update_image_status(
|
||||
context, self.host_uuid, transaction_id,
|
||||
dconstants.DEVICE_IMAGE_UPDATE_IN_PROGRESS)
|
||||
|
||||
# Disable the watchdog service to prevent a reboot on things
|
||||
# like critical process death. We don't want to reboot while
|
||||
# flashing the FPGA.
|
||||
stop_watchdog()
|
||||
|
||||
# Write the image to the specified PCI device.
|
||||
# TODO: when we support more than just N3000, we'll need to
|
||||
# pick the appropriate low-level write function based on the
|
||||
# hardware type.
|
||||
LOG.info("writing device image %s to device %s" % (filename, pci_addr))
|
||||
write_device_image_n3000(filename, pci_addr)
|
||||
|
||||
# If we get an exception trying to send the status update
|
||||
# there's not much we can do.
|
||||
try:
|
||||
LOG.info("setting transaction id %s as complete" % transaction_id)
|
||||
rpcapi.device_update_image_status(
|
||||
context, self.host_uuid, transaction_id,
|
||||
dconstants.DEVICE_IMAGE_UPDATE_COMPLETED)
|
||||
except Exception:
|
||||
LOG.exception("Unable to send fpga update image status "
|
||||
"completion message for transaction %s."
|
||||
% transaction_id)
|
||||
finally:
|
||||
# Delete the image file.
|
||||
os.remove(local_path)
|
||||
# start the watchdog service again
|
||||
start_watchdog()
|
||||
# If device image contains c827 retimer firmware, set the retimer flag
|
||||
if retimer_included:
|
||||
utils.touch(constants.N3000_RETIMER_FLAG)
|
||||
|
||||
except exception.SysinvException as exc:
|
||||
LOG.info("setting transaction id %s as failed" % transaction_id)
|
||||
rpcapi.device_update_image_status(context, self.host_uuid,
|
||||
transaction_id,
|
||||
dconstants.DEVICE_IMAGE_UPDATE_FAILED,
|
||||
six.text_type(exc))
|
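The removed `get_n3000_bmc_version` helper in the file above unpacks a 32-bit sysfs value into a dotted version string: the top byte is rendered as a character and the remaining three bytes as decimal numbers. Here is a self-contained sketch of just that decoding step, with a hypothetical function name and an invented sample value. The negative-value check is an addition in this sketch and is not in the removed code.

```python
def decode_bmc_version(raw):
    """Decode a hex string such as '0x44010103' into 'D.1.1.3'."""
    if raw == "":
        return ""
    try:
        vint = int(raw, 16)  # base 16 accepts an optional '0x' prefix
    except ValueError:
        return ""
    # Only values that fit in 32 bits are valid (negative check added here).
    if not 0 <= vint < (1 << 32):
        return ""
    return "%s.%d.%d.%d" % (chr(vint >> 24), (vint >> 16) & 0xff,
                            (vint >> 8) & 0xff, vint & 0xff)


print(decode_bmc_version("0x44010103"))  # prints D.1.1.3
```

An unreadable or malformed value yields an empty string, matching the removed helper's convention of reporting "unknown" as `""`.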
@@ -1,63 +0,0 @@
|
||||
# vim: tabstop=4 shiftwidth=4 softtabstop=4
|
||||
# coding=utf-8
|
||||
|
||||
# Copyright 2013 Hewlett-Packard Development Company, L.P.
|
||||
# All Rights Reserved.
|
||||
#
|
||||
# Licensed under the Apache License, Version 2.0 (the "License"); you may
|
||||
# not use this file except in compliance with the License. You may obtain
|
||||
# a copy of the License at
|
||||
#
|
||||
# http://www.apache.org/licenses/LICENSE-2.0
|
||||
#
|
||||
# Unless required by applicable law or agreed to in writing, software
|
||||
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||
# License for the specific language governing permissions and limitations
|
||||
# under the License.
|
||||
#
|
||||
# Copyright (c) 2020 Wind River Systems, Inc.
|
||||
#
|
||||
|
||||
"""
|
||||
Client side of the agent RPC API.
|
||||
"""
|
||||
|
||||
from oslo_log import log
|
||||
from sysinv.objects import base as objects_base
|
||||
import sysinv.openstack.common.rpc.proxy
|
||||
|
||||
LOG = log.getLogger(__name__)
|
||||
|
||||
MANAGER_TOPIC = 'sysinv.fpga_agent_manager'
|
||||
|
||||
|
||||
class AgentAPI(sysinv.openstack.common.rpc.proxy.RpcProxy):
|
||||
"""Client side of the agent RPC API.
|
||||
|
||||
API version history:
|
||||
|
||||
1.0 - Initial version.
|
||||
"""
|
||||
|
||||
RPC_API_VERSION = '1.0'
|
||||
|
||||
def __init__(self, topic=None):
|
||||
if topic is None:
|
||||
topic = MANAGER_TOPIC
|
||||
|
||||
super(AgentAPI, self).__init__(
|
||||
topic=topic,
|
||||
serializer=objects_base.SysinvObjectSerializer(),
|
||||
default_version=self.RPC_API_VERSION)
|
||||
|
||||
def host_device_update_image(self, context, hostname, pci_addr,
|
||||
filename, transaction_id, retimer_included):
|
||||
LOG.info("sending device_update_image to host %s" % hostname)
|
||||
topic = '%s.%s' % (self.topic, hostname)
|
||||
return self.cast(context,
|
||||
self.make_msg('device_update_image',
|
||||
pci_addr=pci_addr, filename=filename,
|
||||
transaction_id=transaction_id,
|
||||
retimer_included=retimer_included),
|
||||
topic=topic)
|
@@ -27,7 +27,7 @@ from sysinv.agent.pci import PCIOperator
|
||||
from sysinv.agent.pci import PCI
|
||||
from sysinv.agent.manager import AgentManager
|
||||
from sysinv.tests import base
|
||||
from sysinv.fpga_agent import constants as fpga_constants
|
||||
from sysinv.common import fpga_constants
|
||||
import tsconfig.tsconfig as tsc
|
||||
|
||||
FAKE_LSPCI_OUTPUT = {
|
||||
|
@@ -16,7 +16,7 @@ from six.moves import http_client
|
||||
|
||||
from sysinv.common import constants
|
||||
from sysinv.common import device as dconstants
|
||||
from sysinv.fpga_agent import constants as fpga_constants
|
||||
from sysinv.common import fpga_constants
|
||||
from sysinv.tests.api import base
|
||||
from sysinv.tests.db import base as dbbase
|
||||
from sysinv.tests.db import utils as dbutils
|
||||
|