sysinv FPGA agent initial commit

This creates a new sysinv FPGA agent.  On startup it will perform
an initial scan for supported FPGA devices and report the current
hardware status to sysinv-conductor via RPC.

It also provides basic support for flashing the specified device
images to the FPGA device using Intel-supplied tools running
in a Docker container.

Initially only the Intel N3000 FPGA is supported.

Story: 2006740
Task:  39927
Change-Id: Id8a6510a2d8cd072737a98c5d909f94dbf10a763
Depends-On: I63cfa7698285a1a43f1e9e4b98e9a536fc3dc682
Chris Friesen 2020-03-30 15:32:45 -06:00
parent e3943e9a8b
commit 152604297d
28 changed files with 1606 additions and 7 deletions

View File

@ -20,6 +20,9 @@ cgts-client
# sysinv-agent
sysinv-agent
# sysinv-fpga-agent
sysinv-fpga-agent
# sysinv
sysinv

View File

@ -3,6 +3,7 @@ controllerconfig
storageconfig
sysinv/cgts-client
sysinv/sysinv-agent
sysinv/sysinv-fpga-agent
sysinv/sysinv
config-gate
tsconfig

6
sysinv/sysinv-fpga-agent/.gitignore vendored Normal file
View File

@ -0,0 +1,6 @@
!.distro
.distro/centos7/rpmbuild/RPMS
.distro/centos7/rpmbuild/SRPMS
.distro/centos7/rpmbuild/BUILD
.distro/centos7/rpmbuild/BUILDROOT
.distro/centos7/rpmbuild/SOURCES/sysinv-fpga-agent*tar.gz

View File

@ -0,0 +1,202 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

View File

@ -0,0 +1,12 @@
Metadata-Version: 1.1
Name: sysinv-fpga-agent
Version: 1.0
Summary: StarlingX FPGA Agent Package
Home-page:
Author: Windriver
Author-email: info@windriver.com
License: Apache-2.0
Description: StarlingX FPGA Agent Package
Platform: UNKNOWN

View File

@ -0,0 +1,4 @@
SRC_DIR="."
COPY_LIST_TO_TAR="LICENSE sysinv-fpga-agent sysinv-fpga-agent.conf"
EXCLUDE_LIST_FROM_TAR="centos opensuse"
TIS_PATCH_VER=PKG_GITREVCOUNT

View File

@ -0,0 +1,46 @@
Summary: StarlingX FPGA Agent Package
Name: sysinv-fpga-agent
Version: 1.0
Release: %{tis_patch_ver}%{?_tis_dist}
License: Apache-2.0
Group: base
Packager: Wind River <info@windriver.com>
URL: unknown
Source0: %{name}-%{version}.tar.gz
BuildRequires: systemd-devel
%description
StarlingX FPGA Agent Package
%define local_etc_initd /etc/init.d/
%define local_etc_pmond /etc/pmon.d/
%define debug_package %{nil}
%prep
%setup
%build
%install
# compute init scripts
install -d -m 755 %{buildroot}%{local_etc_initd}
install -p -D -m 755 sysinv-fpga-agent %{buildroot}%{local_etc_initd}/sysinv-fpga-agent
install -d -m 755 %{buildroot}%{local_etc_pmond}
install -p -D -m 644 sysinv-fpga-agent.conf %{buildroot}%{local_etc_pmond}/sysinv-fpga-agent.conf
install -p -D -m 644 sysinv-fpga-agent.service %{buildroot}%{_unitdir}/sysinv-fpga-agent.service
%post
/usr/bin/systemctl enable sysinv-fpga-agent.service >/dev/null 2>&1
%clean
rm -rf $RPM_BUILD_ROOT
%files
%defattr(-,root,root,-)
%doc LICENSE
%{local_etc_initd}/sysinv-fpga-agent
%{local_etc_pmond}/sysinv-fpga-agent.conf
%{_unitdir}/sysinv-fpga-agent.service

View File

@ -0,0 +1,4 @@
-------------------------------------------------------------------
Mon May 25 13:47:02 CST 2020 - chris.friesen@windriver.com
- 1.0 Initial Commit

View File

@ -0,0 +1 @@
setBadness('script-without-shebang', 2)

View File

@ -0,0 +1,64 @@
Summary: StarlingX FPGA Agent Package
Name: sysinv-fpga-agent
Version: 1.0.0
Release: %{tis_patch_ver}%{?_tis_dist}
License: Apache-2.0
Group: Development/Tools/Other
URL: https://opendev.org/starlingx/config
Source0: %{name}-%{version}.tar.gz
BuildRequires: systemd-devel
Requires: python-django
Requires: python-oslo.messaging
Requires: python-retrying
BuildArch: noarch
%description
StarlingX FPGA Agent Package
%define local_etc_initd /etc/init.d/
%define local_etc_pmond /etc/pmon.d/
%define debug_package %{nil}
%prep
%setup
%build
%install
# compute init scripts
install -d -m 755 %{buildroot}%{local_etc_initd}
install -p -D -m 755 sysinv-fpga-agent %{buildroot}%{local_etc_initd}/sysinv-fpga-agent
install -d -m 755 %{buildroot}%{local_etc_pmond}
install -p -D -m 644 sysinv-fpga-agent.conf %{buildroot}%{local_etc_pmond}/sysinv-fpga-agent.conf
install -p -D -m 644 sysinv-fpga-agent.service %{buildroot}%{_unitdir}/sysinv-fpga-agent.service
%clean
rm -rf $RPM_BUILD_ROOT
%pre
%service_add_pre sysinv-fpga-agent.service sysinv-fpga-agent.target
%post
%service_add_post sysinv-fpga-agent.service sysinv-fpga-agent.target
%preun
%service_del_preun sysinv-fpga-agent.service sysinv-fpga-agent.target
%postun
%service_del_postun sysinv-fpga-agent.service sysinv-fpga-agent.target
%files
%defattr(-,root,root,-)
%doc LICENSE
%dir %{local_etc_pmond}
%{local_etc_initd}/sysinv-fpga-agent
%config %{local_etc_pmond}/sysinv-fpga-agent.conf
%{_unitdir}/sysinv-fpga-agent.service
%changelog

View File

@ -0,0 +1,120 @@
#! /bin/sh
#
# Copyright (c) 2020 Wind River Systems, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
#
# chkconfig: 2345 76 25
#
### BEGIN INIT INFO
# Provides: sysinv-fpga-agent
# Default-Start: 3 5
# Required-Start:
# Required-Stop:
# Default-Stop: 0 1 2 6
# Short-Description: Daemon to handle FPGA device updates
### END INIT INFO
. /etc/init.d/functions
. /etc/build.info
DAEMON_NAME="sysinv-fpga-agent"
SYSINVFPGAAGENT="/usr/bin/${DAEMON_NAME}"
SYSINV_CONF_DIR="/etc/sysinv"
SYSINV_CONF_FILE="${SYSINV_CONF_DIR}/sysinv.conf"
DELAY_SEC=20
daemon_pidfile="/var/run/${DAEMON_NAME}.pid"
if [ ! -e "${SYSINVFPGAAGENT}" ] ; then
logger "$0: ${SYSINVFPGAAGENT} is missing"
exit 1
fi
RETVAL=0
PATH=/sbin:/usr/sbin:/bin:/usr/bin:/usr/local/bin
export PATH
case "$1" in
start)
# Check for installation failure
if [ -f /etc/platform/installation_failed ] ; then
logger "$0: /etc/platform/installation_failed flag is set. Aborting."
exit 1
fi
if [ -e ${daemon_pidfile} ] ; then
echo "Killing existing process before starting new"
pid=`cat ${daemon_pidfile}`
kill -TERM $pid
rm -f ${daemon_pidfile}
fi
# Assume that sysinv-agent will ensure that the sysinv.conf file is available.
echo -n "Waiting for sysinv config file"
while [ ! -e ${SYSINV_CONF_FILE} ]
do
sleep 1
done
echo -n "Starting sysinv-fpga-agent: "
/bin/sh -c "${SYSINVFPGAAGENT}"' >> /dev/null 2>&1 & echo $!' > ${daemon_pidfile}
RETVAL=$?
if [ $RETVAL -eq 0 ] ; then
echo "OK"
touch /var/lock/subsys/${DAEMON_NAME}
else
echo "FAIL"
fi
;;
stop)
echo -n "Stopping sysinv-fpga-agent: "
if [ -e ${daemon_pidfile} ] ; then
pid=`cat ${daemon_pidfile}`
kill -TERM $pid
rm -f ${daemon_pidfile}
rm -f /var/lock/subsys/${DAEMON_NAME}
echo "OK"
else
echo "FAIL"
fi
;;
restart)
$0 stop
sleep 1
$0 start
;;
status)
if [ -e ${daemon_pidfile} ] ; then
pid=`cat ${daemon_pidfile}`
ps -p $pid | grep -v "PID TTY" >> /dev/null 2>&1
if [ $? -eq 0 ] ; then
echo "sysinv-fpga-agent is running"
RETVAL=0
else
echo "sysinv-fpga-agent is not running"
RETVAL=1
fi
else
echo "sysinv-fpga-agent is not running ; no pidfile"
RETVAL=1
fi
;;
condrestart)
[ -f /var/lock/subsys/$DAEMON_NAME ] && $0 restart
;;
*)
echo "usage: $0 { start | stop | status | restart | condrestart }"
;;
esac
exit $RETVAL

View File

@ -0,0 +1,9 @@
[process]
process = sysinv-fpga-agent
pidfile = /var/run/sysinv-fpga-agent.pid
service = sysinv-fpga-agent
style = lsb ; ocf or lsb
severity = major ; minor, major, critical
restarts = 3 ; restarts before error assertion
interval = 5 ; number of seconds to wait between restarts
debounce = 20 ; number of seconds to wait before degrade clear

View File

@ -0,0 +1,15 @@
[Unit]
Description=StarlingX FPGA Agent
After=nfscommon.service sw-patch.service
After=network-online.target systemd-udev-settle.service sysinv-agent.service
Before=pmon.service
[Service]
Type=forking
RemainAfterExit=yes
ExecStart=/etc/init.d/sysinv-fpga-agent start
ExecStop=/etc/init.d/sysinv-fpga-agent stop
PIDFile=/var/run/sysinv-fpga-agent.pid
[Install]
WantedBy=multi-user.target

View File

@ -108,6 +108,7 @@ install -m 644 -p -D scripts/sysinv-conductor.service %{buildroot}%{_unitdir}/sy
#install -p -D -m 755 %{buildroot}/usr/bin/sysinv-api %{buildroot}/usr/bin/sysinv-api
#install -p -D -m 755 %{buildroot}/usr/bin/sysinv-agent %{buildroot}/usr/bin/sysinv-agent
#install -p -D -m 755 %{buildroot}/usr/bin/sysinv-fpga-agent %{buildroot}/usr/bin/sysinv-fpga-agent
#install -p -D -m 755 %{buildroot}/usr/bin/sysinv-conductor %{buildroot}/usr/bin/sysinv-conductor
install -d -m 755 %{buildroot}%{local_bindir}
@ -145,6 +146,7 @@ rm -rf $RPM_BUILD_ROOT
%{_unitdir}/sysinv-conductor.service
%{_bindir}/sysinv-agent
%{_bindir}/sysinv-fpga-agent
%{_bindir}/sysinv-api
%{_bindir}/sysinv-conductor
%{_bindir}/sysinv-dbsync

View File

@ -109,6 +109,7 @@ install -m 644 -p -D scripts/sysinv-conductor.service %{buildroot}%{_unitdir}/sy
#install -p -D -m 755 %%{buildroot}/usr/bin/sysinv-api %%{buildroot}/usr/bin/sysinv-api
#install -p -D -m 755 %%{buildroot}/usr/bin/sysinv-agent %%{buildroot}/usr/bin/sysinv-agent
#install -p -D -m 755 %%{buildroot}/usr/bin/sysinv-fpga-agent %%{buildroot}/usr/bin/sysinv-fpga-agent
#install -p -D -m 755 %%{buildroot}/usr/bin/sysinv-conductor %%{buildroot}/usr/bin/sysinv-conductor
install -d -m 755 %{buildroot}%{local_bindir}
@ -164,6 +165,7 @@ rm -rf $RPM_BUILD_ROOT
%{_unitdir}/sysinv-conductor.service
%{_bindir}/sysinv-agent
%{_bindir}/sysinv-fpga-agent
%{_bindir}/sysinv-api
%{_bindir}/sysinv-conductor
%{_bindir}/sysinv-dbsync

View File

@ -29,6 +29,7 @@ packages =
console_scripts =
sysinv-api = sysinv.cmd.api:main
sysinv-agent = sysinv.cmd.agent:main
sysinv-fpga-agent = sysinv.cmd.fpga_agent:main
sysinv-dbsync = sysinv.cmd.dbsync:main
sysinv-conductor = sysinv.cmd.conductor:main
sysinv-rootwrap = sysinv.openstack.common.rootwrap.cmd:main

View File

@ -173,8 +173,12 @@ class PCIOperator(object):
def format_lspci_output(self, device):
# hack for now
# NOTE: this does not properly handle the case where we have both
# "-r" and "-p" optional info in the lspci output.
if device[prevision].strip() == device[pvendor].strip():
# no revision info
# no revision info reported, device[prevision] now stores the
# psvendor, and device[psvendor] now stores the psdevice. We
# need to put things where they should be.
device.append(device[psvendor])
device[psvendor] = device[prevision]
device[prevision] = "0"
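
Note on the hunk above: the field shuffle can be tried in isolation with the minimal sketch below. It is not the code from the sysinv PCI module. The index values (pvendor = 2 through psdevice = 6), the helper name fix_missing_revision, and the sample lspci fields are all assumptions made for illustration; the real index constants are defined elsewhere in the module and are not shown in this diff.

```python
# Hypothetical field indices; the real constants are not visible in this diff.
pvendor, pdevice, prevision, psvendor, psdevice = 2, 3, 4, 5, 6


def fix_missing_revision(device):
    """Return a copy of `device` with a placeholder revision of "0"
    when lspci reported no revision field.

    Without a revision, every later field sits one slot too early:
    the revision slot holds the subsystem vendor and the subsystem
    vendor slot holds the subsystem device.
    """
    device = list(device)  # work on a copy so the caller's list is untouched
    if device[prevision].strip() == device[pvendor].strip():
        # Move the subsystem device to a new last slot, then shift the
        # subsystem vendor into place and fill in a default revision.
        device.append(device[psvendor])
        device[psvendor] = device[prevision]
        device[prevision] = "0"
    return device


# Illustrative input: six fields, revision missing.
sample = ['0000:0b:00.0', 'Processing accelerators',
          'Intel Corporation', 'Device 0b30',
          'Intel Corporation', 'Device 0000']
fixed = fix_missing_revision(sample)
```

As the NOTE in the diff says, this heuristic only covers the missing-revision case; it does not handle output that carries both the "-r" and "-p" optional fields.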

View File

@ -0,0 +1,49 @@
#!/usr/bin/env python
# -*- encoding: utf-8 -*-
#
# vim: tabstop=4 shiftwidth=4 softtabstop=4
#
# Copyright 2013 Hewlett-Packard Development Company, L.P.
# All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
# Copyright (c) 2020 Wind River Systems, Inc.
"""
The System Inventory FPGA Agent Service
"""
import sys
from oslo_config import cfg
from sysinv.openstack.common import service
from sysinv.common import service as sysinv_service
from sysinv.fpga_agent import manager
from sysinv import sanity_coverage
CONF = cfg.CONF
def main():
if sanity_coverage.flag_file_exists():
sanity_coverage.start()
# Parse config file and command line options, then start logging
sysinv_service.prepare_service(sys.argv)
# beware: connection is based upon host and MANAGER_TOPIC
mgr = manager.FpgaAgentManager(CONF.host, manager.MANAGER_TOPIC)
launcher = service.launch(mgr)
launcher.wait()

View File

@ -18,8 +18,10 @@ BITSTREAM_TYPE_KEY_REVOCATION = 'key-revocation'
# Device Image Status
DEVICE_IMAGE_UPDATE_PENDING = 'pending'
DEVICE_IMAGE_UPDATE_IN_PROGRESS = 'in-progress'
DEVICE_IMAGE_UPDATE_IN_PROGRESS_ABORTED = 'in-progress-aborted'
DEVICE_IMAGE_UPDATE_COMPLETED = 'completed'
DEVICE_IMAGE_UPDATE_FAILED = 'failed'
DEVICE_IMAGE_UPDATE_NULL = ''
# Device Image Action
APPLY_ACTION = 'apply'
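
Note on the new status values: the conductor changes later in this commit use them to drive the host-level device_image_update field. The sketch below restates those transitions as two pure functions so they can be read at a glance. The function names next_host_status and abort_host_status are hypothetical; the real logic lives in ConductorManager and also touches the database and RPC.

```python
# Status values as defined in the constants hunk above.
DEVICE_IMAGE_UPDATE_PENDING = 'pending'
DEVICE_IMAGE_UPDATE_IN_PROGRESS = 'in-progress'
DEVICE_IMAGE_UPDATE_IN_PROGRESS_ABORTED = 'in-progress-aborted'
DEVICE_IMAGE_UPDATE_NULL = ''


def next_host_status(current, has_pending_images):
    """Host status after the conductor looks for the next pending image.

    Hypothetical helper mirroring host_device_image_update_next.
    """
    if (current == DEVICE_IMAGE_UPDATE_IN_PROGRESS_ABORTED
            and has_pending_images):
        # Aborted with work left over: park the host back in "pending".
        return DEVICE_IMAGE_UPDATE_PENDING
    if has_pending_images:
        # An update is dispatched to the agent; host status is unchanged.
        return current
    # Nothing left to do: clear the "currently updating" flag.
    return DEVICE_IMAGE_UPDATE_NULL


def abort_host_status(current):
    """Host status after an abort request.

    Hypothetical helper mirroring host_device_image_update_abort:
    only an in-progress update is marked aborted.
    """
    if current == DEVICE_IMAGE_UPDATE_IN_PROGRESS:
        return DEVICE_IMAGE_UPDATE_IN_PROGRESS_ABORTED
    return current
```

Note that an abort does not interrupt the image write already running on the agent; it only stops further images from being dispatched once that write reports back.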

View File

@ -102,6 +102,7 @@ from sysinv.conductor import openstack
from sysinv.conductor import docker_registry
from sysinv.conductor import keystone_listener
from sysinv.db import api as dbapi
from sysinv.fpga_agent import rpcapi as fpga_agent_rpcapi
from sysinv import objects
from sysinv.objects import base as objects_base
from sysinv.objects import kube_app as kubeapp_obj
@ -11665,17 +11666,93 @@ class ConductorManager(service.PeriodicService):
service_affecting=False)
self.fm_api.set_fault(fault)
def host_device_image_update(self, context, host_uuid):
"""Update the device image on this host"""
def host_device_image_update_next(self, context, host_uuid):
# Find the first device on this host that needs updating,
# and trigger an update of it.
try:
host = self.dbapi.ihost_get(host_uuid)
except exception.ServerNotFound:
# This really shouldn't happen.
LOG.exception("Unable to update device images, invalid host_uuid %s" % host_uuid)
return
host_obj = objects.host.get_by_uuid(context, host_uuid)
LOG.info("Updating device image on %s" % host_obj.hostname)
device_image_states = self.dbapi.device_image_state_get_all(
host_id=host.id,
status=dconstants.DEVICE_IMAGE_UPDATE_PENDING)
# At this point we expect host.device_image_update to be either
# "in-progress" or "in-progress-aborted".
# If we've aborted the device update operation and there are device
# image updates left to do on this host, then set the host status
# back to "pending" and return. If there are no device image updates
# left, then fall through to setting the host status to null below.
if (host.device_image_update == dconstants.DEVICE_IMAGE_UPDATE_IN_PROGRESS_ABORTED and
device_image_states):
host.device_image_update = dconstants.DEVICE_IMAGE_UPDATE_PENDING
host.save(context)
return
# TODO: the code below needs to be updated to order the device images for a given
# device. For the N3000 we want to apply any root-key image first, then
# any key-revocation images, then any functional images.
for device_image_state in device_image_states:
# get the PCI device for the pending device image update
pci_device = objects.pci_device.get_by_uuid(context, device_image_state.pcidevice_id)
# figure out the filename for the device image
device_image = objects.device_image.get_by_uuid(context, device_image_state.image_id)
filename = cutils.format_image_filename(device_image)
LOG.info("sending rpc req to update image for host %s, pciaddr: %s, filename: %s, id: %s" %
(host.hostname, pci_device.pciaddr, filename, device_image_state.id))
fpga_rpcapi = fpga_agent_rpcapi.AgentAPI()
fpga_rpcapi.host_device_update_image(
context, host.hostname, pci_device.pciaddr, filename, device_image_state.id)
# We've kicked off a device image update, so exit the function.
return
LOG.info("no more device images to process")
# TODO: what should host.device_image_update be set to if one or more
# of the device image updates failed?
# Getting here should mean that we're done processing so we can
# clear the "this host is currently updating device images" flag.
host.device_image_update = dconstants.DEVICE_IMAGE_UPDATE_NULL
host.save(context)
def host_device_image_update(self, context, host_uuid):
"""Update any applied device images for devices on this host"""
host = objects.host.get_by_uuid(context, host_uuid)
LOG.info("Updating device image on %s" % host.hostname)
# Set any previously "failed" updates back to "pending" to retry them.
device_image_states = self.dbapi.device_image_state_get_all(
host_id=host.id,
status=dconstants.DEVICE_IMAGE_UPDATE_FAILED)
for device_image_state in device_image_states:
device_image_state.status = dconstants.DEVICE_IMAGE_UPDATE_PENDING
device_image_state.update_start_time = None
device_image_state.save(context)
# Update the host status.
host.device_image_update = dconstants.DEVICE_IMAGE_UPDATE_IN_PROGRESS
host.save()
# Find the first device on this host that needs updating,
# and trigger an update of it.
self.host_device_image_update_next(context, host_uuid)
def host_device_image_update_abort(self, context, host_uuid):
"""Abort device image update on this host"""
host_obj = objects.host.get_by_uuid(context, host_uuid)
LOG.info("Aborting device image update on %s" % host_obj.hostname)
host = objects.host.get_by_uuid(context, host_uuid)
LOG.info("Aborting device image update on %s" % host.hostname)
# If the host status is currently pending or blank or already aborted
# then just leave it as-is.
if host.device_image_update == dconstants.DEVICE_IMAGE_UPDATE_IN_PROGRESS:
host.device_image_update = dconstants.DEVICE_IMAGE_UPDATE_IN_PROGRESS_ABORTED
host.save(context)
@periodic_task.periodic_task(spacing=CONF.conductor.audit_interval)
def _audit_device_image_update(self, context):
@ -11713,3 +11790,126 @@ class ConductorManager(service.PeriodicService):
entity_instance_id = "%s=%s" % (fm_constants.FM_ENTITY_TYPE_SYSTEM, system_uuid)
self.fm_api.clear_fault(fm_constants.FM_ALARM_ID_DEVICE_IMAGE_UPDATE_IN_PROGRESS,
entity_instance_id)
def fpga_device_update_by_host(self, context,
host_uuid, fpga_device_dict_array):
"""Create or update FPGA devices for an ihost with the supplied data.
This method creates records for FPGA devices on the ihost, or updates
the reported fields of records that already exist.
:param context: an admin context
:param host_uuid: host uuid
:param fpga_device_dict_array: initial values for device objects
:returns: either returns nothing or raises exception
"""
LOG.info("Entering fpga_device_update_by_host %s %s" %
(host_uuid, fpga_device_dict_array))
# str.strip() returns a new string, so the result must be assigned.
host_uuid = host_uuid.strip()
try:
host = self.dbapi.ihost_get(host_uuid)
except exception.ServerNotFound:
# This really shouldn't happen.
LOG.exception("Invalid host_uuid %s" % host_uuid)
return
for fpga_dev in fpga_device_dict_array:
LOG.info("Processing dev %s" % fpga_dev)
try:
dev_found = None
try:
dev = self.dbapi.fpga_device_get(fpga_dev['pciaddr'],
hostid=host['id'])
dev_found = dev
except Exception:
LOG.info("Attempting to create new device "
"%s on host %s" % (fpga_dev, host['id']))
# Look up the PCI device in the DB, we need the id.
try:
pci_dev = self.dbapi.pci_device_get(
fpga_dev['pciaddr'], hostid=host['id'])
fpga_dev['pci_id'] = pci_dev.id
except Exception as ex:
LOG.info("Unable to find pci device entry for "
"address %s on host id %s, can't create "
"fpga_device entry, ex: %s" %
(fpga_dev['pciaddr'], host['id'], str(ex)))
return
# Save the FPGA device to the DB.
try:
dev = self.dbapi.fpga_device_create(host['id'],
fpga_dev)
except Exception as ex:
LOG.info("Unable to create fpga_device entry for "
"address %s on host id %s, ex: %s" %
(fpga_dev['pciaddr'], host['id'], str(ex)))
return
# If the device existed already, update some of the fields
if dev_found:
try:
attr = {
'bmc_build_version': fpga_dev['bmc_build_version'],
'bmc_fw_version': fpga_dev['bmc_fw_version'],
'root_key': fpga_dev['root_key'],
'revoked_key_ids': fpga_dev['revoked_key_ids'],
'boot_page': fpga_dev['boot_page'],
'bitstream_id': fpga_dev['bitstream_id'],
}
LOG.info("attr: %s" % attr)
dev = self.dbapi.fpga_device_update(dev['uuid'], attr)
except Exception as ex:
LOG.exception("Failed to update fpga fields for "
"address %s on host id %s, ex: %s" %
(dev['pciaddr'], host['id'], str(ex)))
pass
except exception.NodeNotFound:
raise exception.SysinvException(_(
"Invalid host_uuid: host not found: %s") %
host_uuid)
except Exception:
pass
def device_update_image_status(self, context, host_uuid, transaction_id,
status, progress=None, err=None):
"""Update the status of an image-update operation.
This is a status update from the agent on the node regarding a
previously-triggered firmware update operation.
:param context: an admin context
:param host_uuid: the uuid of the host calling this function
:param transaction_id: uuid to allow us to find the transaction
:param status: status of the operation
:param progress: optional progress value if status is in-progress
:param err: error string (only set if status is failure)
:returns: either returns nothing or raises exception
"""
LOG.info("device_update_image_status: transaction_id: %s, status: %s, "
"progress: %s, err: %s" %
(transaction_id, status, progress, err))
# Save the status of the completed device image update in the db.
# The status should be one of dconstants.DEVICE_IMAGE_UPDATE_*
device_image_state = objects.device_image_state.get_by_uuid(
context, transaction_id)
device_image_state.status = status
if status == dconstants.DEVICE_IMAGE_UPDATE_IN_PROGRESS:
device_image_state.update_start_time = timeutils.utcnow()
device_image_state.save()
# If the device image update completed, someone will need to reboot
# the host for it to take effect.
if status == dconstants.DEVICE_IMAGE_UPDATE_COMPLETED:
host = objects.host.get_by_uuid(context, host_uuid)
host.reboot_needed = True
host.save()
if status in [dconstants.DEVICE_IMAGE_UPDATE_COMPLETED,
dconstants.DEVICE_IMAGE_UPDATE_FAILED]:
# Find the next device on the same host that needs updating,
# and trigger an update of it.
self.host_device_image_update_next(context, host_uuid)
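
Note on the ordering TODO in host_device_image_update_next: one possible way to apply root-key images first, then key-revocation images, then functional images is a stable sort on a priority table, as sketched below. This is not part of the commit. The Image tuple, the order_images helper, and the 'root-key' and 'functional' type strings are assumptions; only 'key-revocation' appears as a constant in this diff.

```python
from collections import namedtuple

# Assumed bitstream type names; only 'key-revocation' is confirmed by the diff.
BITSTREAM_PRIORITY = {
    'root-key': 0,
    'key-revocation': 1,
    'functional': 2,
}

# Hypothetical stand-in for a device image DB object.
Image = namedtuple('Image', ['uuid', 'bitstream_type'])


def order_images(images):
    """Return images sorted root-key first, functional last.

    sorted() is stable, so images of the same type keep their original
    relative order. Unknown types sort after all known ones.
    """
    unknown = len(BITSTREAM_PRIORITY)
    return sorted(
        images,
        key=lambda img: BITSTREAM_PRIORITY.get(img.bitstream_type, unknown))


pending = [
    Image('a', 'functional'),
    Image('b', 'key-revocation'),
    Image('c', 'root-key'),
    Image('d', 'key-revocation'),
]
ordered = order_images(pending)
```

Because the conductor dispatches one image at a time and waits for the agent to report back, the sort would need to run on every call, or the order would need to be stored with the pending records.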

View File

@ -1947,3 +1947,44 @@ class ConductorAPI(sysinv.openstack.common.rpc.proxy.RpcProxy):
"""
return self.cast(context, self.make_msg('host_device_image_update_abort',
host_uuid=host_uuid))
def fpga_device_update_by_host(self, context, host_uuid,
fpga_device_dict_array):
"""
Asynchronously, update information on FPGA device.
This will check whether the current state of the device matches the
expected state, and if it doesn't then an alarm will be raised.
:param context:
:param host_uuid: The host_uuid for the caller.
:param fpga_device_dict_array: An array of device information.
:return:
"""
return self.cast(context,
self.make_msg('fpga_device_update_by_host',
host_uuid=host_uuid,
fpga_device_dict_array=fpga_device_dict_array))
def device_update_image_status(self, context, host_uuid, transaction_id,
status, progress=None, err=None):
"""
Asynchronously update the status of a firmware update operation.
This is used to report progress and final success/failure of an FPGA image write
operation. The transaction ID maps to a unique identifier in the sysinv DB so
we don't need to report host_uuid or device PCI address.
:param context:
:param host_uuid: The host_uuid for the host that is reporting the status.
:param transaction_id: The transaction ID representing this image-update operation.
:param status: The status of the image-update operation.
:param progress: Optional progress indicator.
:param err: Optional error message.
:return:
"""
return self.cast(context,
self.make_msg('device_update_image_status',
host_uuid=host_uuid,
transaction_id=transaction_id,
status=status,
progress=progress,
err=err))


@ -3479,6 +3479,49 @@ class Connection(object):
:param from_state: The state of the 'from' load.
"""
@abc.abstractmethod
def fpga_device_create(self, hostid, values):
"""Create a new FPGA device for a host.
:param hostid: The id, uuid or database object of the host to which
the device belongs.
:param values: A dict containing several items used to identify
and track the device. For example:
{
'uuid': uuidutils.generate_uuid(),
'pciaddr': '0000:0b:01.0',
'pvendor_id': '8086',
'pdevice_id': '0b30',
...etc...
}
:returns: An FPGA device
"""
@abc.abstractmethod
def fpga_device_get(self, deviceid, hostid=None):
"""Return an FPGA device
:param deviceid: The id or uuid of an FPGA device.
:param hostid: The id or uuid of a host.
:returns: An FPGA device
"""
@abc.abstractmethod
def fpga_device_update(self, deviceid, values, hostid=None):
"""Update properties of an FPGA device.
:param deviceid: The id or uuid of an FPGA device.
:param values: Dict of values to update.
For example:
{
'boot_page': 'user',
'bitstream_id': '0x23000410010309',
}
:param hostid: The id or uuid of the host to which the FPGA
device belongs.
:returns: An FPGA device
"""
@abc.abstractmethod
def pci_device_create(self, hostid, values):
"""Create a new pci device for a host.


@ -0,0 +1,11 @@
#
# Copyright (c) 2020 Wind River Systems, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
# vim: tabstop=4 shiftwidth=4 softtabstop=4
# coding=utf-8
# All Rights Reserved.
#


@ -0,0 +1,499 @@
# vim: tabstop=4 shiftwidth=4 softtabstop=4
# coding=utf-8
# Copyright 2013 Hewlett-Packard Development Company, L.P.
# Copyright 2013 International Business Machines Corporation
# All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
#
# Copyright (c) 2020 Wind River Systems, Inc.
#
""" Perform activity related to FPGA devices on a single host.
A single instance of :py:class:`sysinv.fpga_agent.manager.FpgaAgentManager` is
created within the *sysinv-fpga-agent* process, and is responsible for
performing all actions for this host related to FPGA devices.
On start, collect FPGA inventory and post it to the conductor.
Commands (from the conductor) are received via RPC calls.
"""
from __future__ import print_function
import errno
from eventlet.green import subprocess
from glob import glob
import os
import shlex
import shutil
import time
import tsconfig.tsconfig as tsc
import urllib
from oslo_config import cfg
from oslo_log import log
from oslo_utils import uuidutils
from sysinv.common import device as dconstants
from sysinv.common import exception
from sysinv.common import service
from sysinv.conductor import rpcapi as conductor_rpcapi
from sysinv.objects import base as objects_base
from sysinv.openstack.common import context as ctx
MANAGER_TOPIC = 'sysinv.fpga_agent_manager'
LOG = log.getLogger(__name__)
agent_opts = [
cfg.StrOpt('api_url',
default=None,
help=('URL of SysInv API service. If not set, SysInv can '
'get the current value from the Keystone service catalog.')),
cfg.IntOpt('audit_interval',
default=60,
help='Maximum time since the last check-in of an agent'),
]
CONF = cfg.CONF
CONF.register_opts(agent_opts, 'fpga_agent')
# Currently we only support the following FPGA. In the future we may need to
# expand this to a list of devices, each with their own special set of
# device-specific information.
FPGA_VENDOR = "8086"
FPGA_DEVICE = "0b30"
# TODO: Make this specified in the config file.
# This is the docker image containing the OPAE tools to access the FPGA device.
OPAE_IMG = "registry.local:9001/docker.io/starlingx/n3000-opae:stx.4.0-v1.0.0"
# This is the location where we cache the device image file while
# writing it to the hardware.
DEVICE_IMAGE_CACHE_DIR = "/usr/local/share/applications/sysinv"
SYSFS_DEVICE_PATH = "/sys/bus/pci/devices/"
FME_PATH = "/fpga/intel-fpga-dev.*/intel-fpga-fme.*/"
SPI_PATH = "spi-altera.*.auto/spi_master/spi*/spi*.*/"
# These are relative to FME_PATH
BITSTREAM_ID_PATH = "bitstream_id"
# These are relative to SPI_PATH
ROOT_HASH_PATH = "ifpga_sec_mgr/ifpga_sec*/security/sr_root_hash"
CANCELLED_CSKS_PATH = "ifpga_sec_mgr/ifpga_sec*/security/sr_canceled_csks"
IMAGE_LOAD_PATH = "fpga_flash_ctrl/fpga_image_load"
BMC_FW_VER_PATH = "bmcfw_flash_ctrl/bmcfw_version"
BMC_BUILD_VER_PATH = "max10_version"
def ensure_device_image_cache_exists():
# Make sure the image cache directory exists, create it if needed.
try:
os.mkdir(DEVICE_IMAGE_CACHE_DIR, 0o755)
except OSError as exc:
if exc.errno != errno.EEXIST:
msg = ("Unable to create device image cache directory %s!"
% DEVICE_IMAGE_CACHE_DIR)
LOG.exception(msg)
raise exception.SysinvException(msg)
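The pattern used by `ensure_device_image_cache_exists()` is "create the directory and treat `EEXIST` as success". A minimal standalone sketch of that idea follows; `ensure_dir` is an illustrative name, and a temporary directory is used here instead of the real cache path so the example can run anywhere.

```python
import errno
import os
import tempfile


def ensure_dir(path, mode=0o755):
    """Create path if it is missing; an already existing path is not an error."""
    try:
        os.mkdir(path, mode)
    except OSError as exc:
        # Only EEXIST is tolerated; anything else (EACCES, ENOENT, ...) is
        # a real failure and is re-raised to the caller.
        if exc.errno != errno.EEXIST:
            raise


cache_dir = os.path.join(tempfile.mkdtemp(), "sysinv")
ensure_dir(cache_dir)
ensure_dir(cache_dir)  # second call hits EEXIST and is silently accepted
```

Attempting the `mkdir` and inspecting `errno` avoids the race between checking for the directory and creating it.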
def fetch_device_image(filename):
# Pull the image from the controller.
url = "http://controller:8080/device_images/" + filename
local_path = DEVICE_IMAGE_CACHE_DIR + "/" + filename
try:
imagefile, headers = urllib.urlretrieve(url, local_path)
except IOError:
msg = ("Unable to retrieve device image from %s!" % url)
LOG.exception(msg)
raise exception.SysinvException(msg)
return local_path
def fetch_device_image_local(filename):
# This is a hack since we only support AIO for now. Just copy the device
# image file into the well-known device image cache directory.
local_path = DEVICE_IMAGE_CACHE_DIR + "/" + filename
image_file_path = os.path.join(dconstants.DEVICE_IMAGE_PATH, filename)
try:
shutil.copyfile(image_file_path, local_path)
except (shutil.Error, IOError):
msg = ("Unable to retrieve device image from %s!" % image_file_path)
LOG.exception(msg)
raise exception.SysinvException(msg)
return local_path
def write_device_image_n3000(filename, pci_addr):
# Write the firmware image to the FPGA at the specified PCI address.
# We're assuming that the image update tools will catch the scenario
# where the image is not compatible with the device.
try:
# Build up the command to perform the firmware update.
# Note the hack to work around OPAE tool locale issues
cmd = ("docker run -t --privileged -e LC_ALL=en_US.UTF-8 "
"-e LANG=en_US.UTF-8 -v " + DEVICE_IMAGE_CACHE_DIR +
":" + "/mnt/images " + OPAE_IMG +
" fpgasupdate -y --log-level debug /mnt/images/" +
filename + " " + pci_addr)
# Issue the command to perform the firmware update.
subprocess.check_output(shlex.split(cmd),
stderr=subprocess.STDOUT)
# TODO: switch to subprocess.Popen, parse the output and send
# progress updates.
except subprocess.CalledProcessError as exc:
# Check the return code, send completion info to sysinv-conductor.
# "docker run" return code will be:
# 125 if the error is with Docker daemon itself
# 126 if the contained command cannot be invoked
# 127 if the contained command cannot be found
# Exit code of contained command otherwise
msg = ("Failed to update device image %s for device %s, "
"return code is %d, command output: %s." %
(filename, pci_addr, exc.returncode,
exc.output.decode('utf-8')))
LOG.error(msg)
LOG.error("Check the kernel logs for intel-max10 messages.")
raise exception.SysinvException(msg)
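`write_device_image_n3000()` builds the command as one string and then tokenizes it with `shlex.split()`, which would split a filename containing whitespace into several arguments. One possible alternative, sketched here under the assumption that the same flags are wanted, is to build the argument list directly. `build_fpgasupdate_cmd` and the sample filenames are hypothetical; the flags and constants are copied from the function above.

```python
import shlex

OPAE_IMG = "registry.local:9001/docker.io/starlingx/n3000-opae:stx.4.0-v1.0.0"
DEVICE_IMAGE_CACHE_DIR = "/usr/local/share/applications/sysinv"


def build_fpgasupdate_cmd(filename, pci_addr):
    """Return the docker/fpgasupdate command as an argv list."""
    return [
        "docker", "run", "-t", "--privileged",
        "-e", "LC_ALL=en_US.UTF-8",
        "-e", "LANG=en_US.UTF-8",
        "-v", DEVICE_IMAGE_CACHE_DIR + ":/mnt/images",
        OPAE_IMG,
        "fpgasupdate", "-y", "--log-level", "debug",
        "/mnt/images/" + filename, pci_addr,
    ]


# For a simple filename the list matches what shlex.split() produces
# from the string-based command.
legacy = shlex.split(
    "docker run -t --privileged -e LC_ALL=en_US.UTF-8 "
    "-e LANG=en_US.UTF-8 -v " + DEVICE_IMAGE_CACHE_DIR +
    ":" + "/mnt/images " + OPAE_IMG +
    " fpgasupdate -y --log-level debug /mnt/images/" +
    "n3000.bin" + " " + "0000:0c:00.0")
argv = build_fpgasupdate_cmd("n3000.bin", "0000:0c:00.0")

# A filename with a space stays a single argument in the list form.
argv_space = build_fpgasupdate_cmd("my image.bin", "0000:0c:00.0")
```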
def read_n3000_sysfs_file(pattern):
    # Read a sysfs file related to the N3000.
    # The result is an empty string if the file doesn't exist or can't be
    # read, or a single line of text otherwise.
    # Convert the pattern to a list of matching filenames.
    filenames = glob(pattern)
    # If there are no matching files, return an empty string.
    if len(filenames) == 0:
        return ""
    # If there's more than one filename, complain.
    if len(filenames) > 1:
        LOG.warn("Pattern %s gave %s matching filenames, using the first." %
                 (pattern, len(filenames)))
    filename = filenames[0]
    try:
        # Open the file inside the try block so that a failure to open it
        # is logged and reported as an empty string, like a failed read.
        with open(filename) as infile:
            return infile.readline().strip()
    except Exception:
        LOG.exception("Unable to read file %s" % filename)
    return ""
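The glob-then-read-first-line approach can be exercised without real hardware by pointing it at a temporary directory. The sketch below is a simplified variant, not the agent's function: `read_first_line` is an illustrative name, the matches are sorted so that "the first" is deterministic (the original takes whatever `glob` returns first), and the directory layout is a made-up miniature of the sysfs tree.

```python
import glob
import os
import tempfile


def read_first_line(pattern):
    """Return the stripped first line of the first file matching pattern."""
    filenames = sorted(glob.glob(pattern))
    if not filenames:
        return ""
    try:
        with open(filenames[0]) as infile:
            return infile.readline().strip()
    except (IOError, OSError):
        return ""


# Build a tiny fake tree: <root>/intel-fpga-fme.0/bitstream_id
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "intel-fpga-fme.0"))
with open(os.path.join(root, "intel-fpga-fme.0", "bitstream_id"), "w") as f:
    f.write("0x23000410010309\n")

value = read_first_line(os.path.join(root, "intel-fpga-fme.*", "bitstream_id"))
missing = read_first_line(os.path.join(root, "intel-fpga-fme.*", "no_such_file"))
```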
def get_n3000_root_hash(pci_addr):
# Query sysfs for the root key of the N3000 at the specified PCI address
root_key_pattern = (SYSFS_DEVICE_PATH + pci_addr + FME_PATH +
SPI_PATH + ROOT_HASH_PATH)
root_key = read_n3000_sysfs_file(root_key_pattern)
# If the root key hasn't been programmed, return an empty string.
if root_key == "hash not programmed":
root_key = ""
return root_key
def get_n3000_revoked_keys(pci_addr):
# Query sysfs for revoked keys of the N3000 at the specified PCI address
revoked_key_pattern = (SYSFS_DEVICE_PATH + pci_addr + FME_PATH +
SPI_PATH + CANCELLED_CSKS_PATH)
revoked_keys = read_n3000_sysfs_file(revoked_key_pattern)
return revoked_keys
def get_n3000_bitstream_id(pci_addr):
# Query sysfs for bitstream ID of the N3000 at the specified PCI address
bitstream_id_pattern = (SYSFS_DEVICE_PATH + pci_addr + FME_PATH +
BITSTREAM_ID_PATH)
bitstream_id = read_n3000_sysfs_file(bitstream_id_pattern)
return bitstream_id
def get_n3000_boot_page(pci_addr):
# Query sysfs for boot page of the N3000 at the specified PCI address
image_load_pattern = (SYSFS_DEVICE_PATH + pci_addr + FME_PATH +
SPI_PATH + IMAGE_LOAD_PATH)
image_load = read_n3000_sysfs_file(image_load_pattern)
if image_load == "0":
return "factory"
elif image_load == "1":
return "user"
else:
LOG.warn("Reading image load gave unexpected result: %s" % image_load)
return ""
def get_n3000_bmc_version(pci_addr, path):
version_pattern = (SYSFS_DEVICE_PATH + pci_addr + FME_PATH +
SPI_PATH + path)
version = read_n3000_sysfs_file(version_pattern)
# If we couldn't read the file, return an empty string.
if version == "":
return ""
# We're expecting a 32-bit value, possibly with "0x" in front.
try:
vint = int(version, 16)
except ValueError:
return ""
if vint >= 1 << 32:
LOG.warn("String (%s) read from file %s doesn't match the "
"expected pattern" % (version, version_pattern))
return ""
# There's probably a better way than this.
# We want to match the version that Intel's "fpgainfo" tool reports.
return ("%s.%s.%s.%s" % (chr(vint >> 24), str(vint >> 16 & 0xff),
str(vint >> 8 & 0xff), str(vint & 0xff)))
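As a worked example of the decoding in `get_n3000_bmc_version()`: the top byte of the 32-bit value is rendered as an ASCII character and the remaining three bytes as decimal numbers. The raw values below are assumptions constructed to decode to the version strings used in the unit tests of this change; they were not read from real hardware. `decode_bmc_version` is an illustrative standalone name.

```python
def decode_bmc_version(raw):
    """Decode a 32-bit hex string into the 'X.a.b.c' form."""
    try:
        vint = int(raw, 16)  # accepts an optional "0x" prefix
    except ValueError:
        return ""
    if vint >= 1 << 32:
        return ""
    return "%s.%d.%d.%d" % (
        chr(vint >> 24),       # top byte as a character, e.g. 0x44 -> 'D'
        (vint >> 16) & 0xff,   # next three bytes as decimal numbers
        (vint >> 8) & 0xff,
        vint & 0xff,
    )


build_version = decode_bmc_version("0x44020006")
fw_version = decode_bmc_version("44020015")
```

Here `0x44` is the character `D` and `0x15` is decimal 21, so the two values decode to `D.2.0.6` and `D.2.0.21`.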
def get_n3000_bmc_fw_version(pci_addr):
return get_n3000_bmc_version(pci_addr, BMC_FW_VER_PATH)
def get_n3000_bmc_build_version(pci_addr):
return get_n3000_bmc_version(pci_addr, BMC_BUILD_VER_PATH)
def watchdog_action(action):
if action not in ["stop", "start"]:
LOG.warn("watchdog_action called with invalid action: %s", action)
return
try:
# Build up the command to perform the action.
cmd = ["systemctl", action, "hostw"]
# Issue the command to stop/start the watchdog
subprocess.check_output(cmd, stderr=subprocess.STDOUT)
except subprocess.CalledProcessError as exc:
msg = ("Failed to %s hostw service, "
"return code is %d, command output: %s." %
(action, exc.returncode, exc.output))
LOG.warn(msg)
def stop_watchdog():
watchdog_action("stop")
def start_watchdog():
watchdog_action("start")
class FpgaAgentManager(service.PeriodicService):
"""Sysinv FPGA Agent service main class."""
RPC_API_VERSION = '1.0'
def __init__(self, host, topic):
serializer = objects_base.SysinvObjectSerializer()
super(FpgaAgentManager, self).__init__(host, topic, serializer=serializer)
self.host_uuid = None
def start(self):
super(FpgaAgentManager, self).start()
if os.path.isfile('/etc/sysinv/sysinv.conf'):
LOG.info('sysinv-fpga-agent started')
else:
LOG.info('No config file for sysinv-fpga-agent found.')
raise exception.ConfigNotFound(message="Unable to find sysinv config file!")
# Wait around until someone else updates the platform.conf file
# with our host UUID.
self.wait_for_host_uuid()
# Collect FPGA inventory and report to conductor at startup.
context = ctx.get_admin_context()
self.report_fpga_inventory(context)
def periodic_tasks(self, context, raise_on_error=False):
""" Periodic tasks are run at pre-specified intervals. """
return self.run_periodic_tasks(context, raise_on_error=raise_on_error)
def wait_for_host_uuid(self):
# Get our host UUID from /etc/platform/platform.conf. Note that the
# file can exist before the UUID is written to it.
prefix = "UUID="
while self.host_uuid is None:
if os.path.isfile(tsc.PLATFORM_CONF_FILE):
with open(tsc.PLATFORM_CONF_FILE, 'r') as platform_file:
for line in platform_file:
line = line.strip()
if not line.startswith(prefix):
continue
uuid = line[len(prefix):]
if uuidutils.is_uuid_like(uuid):
self.host_uuid = uuid
LOG.info("Agent found host UUID: %s" % uuid)
break
else:
LOG.info("UUID entry: %s in platform.conf "
"isn't uuid-like" % uuid)
time.sleep(5)
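The line-parsing part of `wait_for_host_uuid()` can be sketched in isolation as follows. Two assumptions apply: the standard library `uuid.UUID` constructor is used as a stand-in for `uuidutils.is_uuid_like` (the two may not accept exactly the same strings), and the sample file content is made up for illustration. `find_host_uuid` is a hypothetical helper name.

```python
import uuid


def find_host_uuid(lines, prefix="UUID="):
    """Return the first UUID-like value found on a 'UUID=' line, or None."""
    for line in lines:
        line = line.strip()
        if not line.startswith(prefix):
            continue
        candidate = line[len(prefix):]
        try:
            uuid.UUID(candidate)
        except ValueError:
            # Entry present but not UUID-like; keep scanning.
            continue
        return candidate
    return None


# Made-up platform.conf content for illustration only.
sample = [
    "nodetype=worker\n",
    "UUID=not-a-uuid\n",
    "UUID=1be26c0b-03f2-4d2e-ae87-c02d7f33c123\n",
]
found = find_host_uuid(sample)
not_found = find_host_uuid(["nodetype=worker\n", "UUID=\n"])
```

A `None` result corresponds to the case where the agent sleeps and re-reads the file.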
def report_fpga_inventory(self, context):
"""Collect FPGA data for this host.
Scan for supported FPGA devices and report the results to
sysinv-conductor via RPC.
:param context: an admin context
:returns: nothing
"""
host_uuid = self.host_uuid
rpcapi = conductor_rpcapi.ConductorAPI(
topic=conductor_rpcapi.MANAGER_TOPIC)
fpgainfo_list = self.fpga_scan()
try:
LOG.info("reporting FPGA inventory for host %s: %s" %
(host_uuid, fpgainfo_list))
rpcapi.fpga_device_update_by_host(context, host_uuid, fpgainfo_list)
except exception.SysinvException:
LOG.exception("Exception updating fpga devices.")
def fpga_scan(self):
# First get the PCI addresses of each supported FPGA device
cmd = ["lspci", "-Dm", "-d", FPGA_VENDOR + ":" + FPGA_DEVICE]
try:
output = subprocess.check_output(cmd, stderr=subprocess.STDOUT)
except subprocess.CalledProcessError as exc:
msg = ("Failed to get pci devices with vendor %s and device %s, "
"return code is %d, command output: %s." %
(FPGA_VENDOR, FPGA_DEVICE, exc.returncode, exc.output))
LOG.warn(msg)
raise exception.SysinvException(msg)
# Parse the output of the lspci command and grab the PCI address
fpga_addrs = []
for line in output.splitlines():
line = shlex.split(line.strip())
fpga_addrs.append(line[0])
fpgainfo_list = []
# Next, break down the PCI address into parts and use that to call the
# FPGA tools to get additional information
for addr in fpga_addrs:
# Store information for this FPGA
fpgainfo = {'pciaddr': addr}
fpgainfo['bmc_build_version'] = get_n3000_bmc_build_version(addr)
fpgainfo['bmc_fw_version'] = get_n3000_bmc_fw_version(addr)
fpgainfo['boot_page'] = get_n3000_boot_page(addr)
fpgainfo['bitstream_id'] = get_n3000_bitstream_id(addr)
fpgainfo['root_key'] = get_n3000_root_hash(addr)
fpgainfo['revoked_key_ids'] = get_n3000_revoked_keys(addr)
# TODO: Also retrieve the information about which NICs are on
# the FPGA device.
fpgainfo_list.append(fpgainfo)
return fpgainfo_list
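The address-extraction step of `fpga_scan()` relies on the machine-readable `lspci -Dm` format, in which the first field is the full PCI address and the remaining fields are quoted strings. A minimal sketch of that parsing is shown below. The sample line is hand-written to follow that format and is not captured from a real system; `parse_pci_addrs` is an illustrative name, and the guard against blank lines is an addition that the original loop does not have.

```python
import shlex

# Hand-written sample in "lspci -Dm" layout:
#   slot "class" "vendor" "device" -rNN "subsystem vendor" "subsystem device"
sample_output = (
    '0000:0c:00.0 "Processing accelerators" "Intel Corporation" '
    '"Device 0b30" -r00 "Intel Corporation" "Device 0000"\n'
)


def parse_pci_addrs(output):
    """Return the PCI address (first field) of every non-empty line."""
    addrs = []
    for line in output.splitlines():
        fields = shlex.split(line.strip())
        if fields:  # skip blank lines instead of raising IndexError
            addrs.append(fields[0])
    return addrs


addrs = parse_pci_addrs(sample_output)
```

`shlex.split` is what keeps quoted fields such as the class name together as one token.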
    def device_update_image(self, context, pci_addr, filename, transaction_id):
        """Write the device image to the device at the specified address.
        transaction_id is the transaction ID as specified by sysinv-conductor.
        This must send back either success or failure to sysinv-conductor
        via an RPC cast. The transaction ID is sent back to allow sysinv-conductor
        to locate the transaction in the DB.
        TODO: could get fancier with an image cache and delete based on LRU.
        """
        rpcapi = conductor_rpcapi.ConductorAPI(
            topic=conductor_rpcapi.MANAGER_TOPIC)
        try:
            LOG.info("ensure device image cache exists")
            ensure_device_image_cache_exists()
            # Pull the image from the controller.
            LOG.info("fetch device image %s" % filename)
            # For now, we only need to support AIO nodes, so just copy the
            # file from where we know sysinv-conductor put it.
            local_path = fetch_device_image_local(filename)
            # TODO: when we need to support standalone workers, we'll need to
            # pull in the image file via HTTP.
            # local_path = fetch_device_image(filename)
            # TODO: check CSK used to sign image, ensure it hasn't been cancelled
            # TODO: check root key used to sign image, ensure it matches root key of hardware
            # Note: may want to check these in the sysinv API too.
            try:
                LOG.info("setting transaction id %s as in progress" % transaction_id)
                rpcapi.device_update_image_status(
                    context, self.host_uuid, transaction_id,
                    dconstants.DEVICE_IMAGE_UPDATE_IN_PROGRESS)
                # Disable the watchdog service to prevent a reboot on things
                # like critical process death. We don't want to reboot while
                # flashing the FPGA.
                stop_watchdog()
                # Write the image to the specified PCI device.
                # TODO: when we support more than just N3000, we'll need to
                # pick the appropriate low-level write function based on the
                # hardware type.
                LOG.info("writing device image %s to device %s" % (filename, pci_addr))
                write_device_image_n3000(filename, pci_addr)
                # If we get an exception trying to send the status update
                # there's not much we can do.
                try:
                    LOG.info("setting transaction id %s as complete" % transaction_id)
                    rpcapi.device_update_image_status(
                        context, self.host_uuid, transaction_id,
                        dconstants.DEVICE_IMAGE_UPDATE_COMPLETED)
                except Exception:
                    LOG.exception("Unable to send fpga update image status "
                                  "completion message for transaction %s."
                                  % transaction_id)
            finally:
                # Delete the image file.
                os.remove(local_path)
                # Start the watchdog service again.
                start_watchdog()
        except exception.SysinvException as exc:
            LOG.info("setting transaction id %s as failed" % transaction_id)
            # Pass the message as "err"; passed positionally it would land
            # in the "progress" parameter of the RPC API.
            rpcapi.device_update_image_status(context, self.host_uuid,
                                              transaction_id,
                                              dconstants.DEVICE_IMAGE_UPDATE_FAILED,
                                              err=exc.message)
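The nesting in `device_update_image()` guarantees that cleanup runs after the image has been fetched, whether or not the flash succeeds, and that a failure is reported only after cleanup. The sketch below reproduces just that control flow with injected callables. Everything in it is hypothetical scaffolding: `FlashError` stands in for `exception.SysinvException`, the status strings stand in for the `dconstants` values, and `update_image` is not a sysinv function.

```python
class FlashError(Exception):
    """Stand-in for exception.SysinvException."""


def update_image(report, fetch, flash, cleanup):
    try:
        path = fetch()
        try:
            report("in-progress")
            flash(path)
            try:
                report("completed")
            except Exception:
                pass  # nothing more can be done if the final report fails
        finally:
            # Always runs once the image was fetched: remove the cached
            # file and restart the watchdog in the real agent.
            cleanup(path)
    except FlashError as exc:
        report("failed", err=str(exc))


events = []


def report(status, err=None):
    events.append((status, err))


def failing_flash(path):
    raise FlashError("write failed")


update_image(report,
             lambda: "/tmp/image.bin",
             failing_flash,
             lambda path: events.append(("cleanup", path)))
```

After the call, `events` holds the in-progress report, then the cleanup, then the failure report carrying the error text.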


@ -0,0 +1,62 @@
# vim: tabstop=4 shiftwidth=4 softtabstop=4
# coding=utf-8
# Copyright 2013 Hewlett-Packard Development Company, L.P.
# All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
#
# Copyright (c) 2020 Wind River Systems, Inc.
#
"""
Client side of the agent RPC API.
"""
from oslo_log import log
from sysinv.objects import base as objects_base
import sysinv.openstack.common.rpc.proxy
LOG = log.getLogger(__name__)
MANAGER_TOPIC = 'sysinv.fpga_agent_manager'
class AgentAPI(sysinv.openstack.common.rpc.proxy.RpcProxy):
"""Client side of the agent RPC API.
API version history:
1.0 - Initial version.
"""
RPC_API_VERSION = '1.0'
def __init__(self, topic=None):
if topic is None:
topic = MANAGER_TOPIC
super(AgentAPI, self).__init__(
topic=topic,
serializer=objects_base.SysinvObjectSerializer(),
default_version=self.RPC_API_VERSION)
def host_device_update_image(self, context, hostname, pci_addr,
filename, transaction_id):
LOG.info("sending device_update_image to host %s" % hostname)
topic = '%s.%s' % (self.topic, hostname)
return self.cast(context,
self.make_msg('device_update_image',
pci_addr=pci_addr, filename=filename,
transaction_id=transaction_id),
topic=topic)


@ -4,6 +4,10 @@
# SPDX-License-Identifier: Apache-2.0
#
# vim: tabstop=4 shiftwidth=4 softtabstop=4
# coding=utf-8
#
from sysinv.db import api as db_api
from sysinv.objects import base
from sysinv.objects import utils


@ -27,6 +27,7 @@ import os.path
import uuid
from sysinv.common import constants
from sysinv.common import device as dconstants
from sysinv.common import exception
from sysinv.common import kubernetes
from sysinv.common import utils as cutils
@ -1524,6 +1525,147 @@ class ManagerTestCase(base.DbTestCase):
self.assertEqual(updated_port['node_id'], 3)
def test_fpga_device_update_by_host(self):
# Create compute-0 node
config_uuid = str(uuid.uuid4())
ihost = self._create_test_ihost(
personality=constants.WORKER,
hostname='compute-0',
uuid=str(uuid.uuid4()),
config_status=None,
config_applied=config_uuid,
config_target=config_uuid,
invprovision=constants.PROVISIONED,
administrative=constants.ADMIN_UNLOCKED,
operational=constants.OPERATIONAL_ENABLED,
availability=constants.AVAILABILITY_ONLINE,
)
host_uuid = ihost['uuid']
host_id = ihost['id']
PCI_DEV_1 = {'uuid': str(uuid.uuid4()),
'name': 'pci_dev_1',
'pciaddr': '0000:0b:01.0',
'pclass_id': '060100',
'pvendor_id': '8086',
'pdevice_id': '0443',
'enabled': True}
PCI_DEV_2 = {'uuid': str(uuid.uuid4()),
'name': 'pci_dev_2',
'pciaddr': '0000:0c:01.0',
'pclass_id': '012000',
'pvendor_id': '8086',
'pdevice_id': '0b30',
'enabled': True}
pci_device_dict_array = [PCI_DEV_1, PCI_DEV_2]
# create new PCI dev
self.service.pci_device_update_by_host(self.context, host_uuid, pci_device_dict_array)
dev = self.dbapi.pci_device_get(PCI_DEV_1['pciaddr'], host_id)
for key in PCI_DEV_1:
self.assertEqual(dev[key], PCI_DEV_1[key])
dev = self.dbapi.pci_device_get(PCI_DEV_2['pciaddr'], host_id)
for key in PCI_DEV_2:
self.assertEqual(dev[key], PCI_DEV_2[key])
FPGA_DEV_1 = {
'pciaddr': PCI_DEV_1['pciaddr'],
'bmc_build_version': 'D.2.0.6',
'bmc_fw_version': 'D.2.0.21',
'boot_page': 'user',
'bitstream_id': '0x2383A62A010504',
'root_key': '0x2973c55fc739e8181b16b9b51b786a39c0860159df8fb94652b0fbca87223bc7',
'revoked_key_ids': '2,10,50-51',
}
fpga_device_dict_array = [FPGA_DEV_1]
# Create new FPGA device.
self.service.fpga_device_update_by_host(self.context, host_uuid,
fpga_device_dict_array)
dev = self.dbapi.fpga_device_get(FPGA_DEV_1['pciaddr'], host_id)
for key in FPGA_DEV_1:
self.assertEqual(dev[key], FPGA_DEV_1[key])
# Update existing FPGA device.
fpga_dev_dict_update = {
'pciaddr': FPGA_DEV_1['pciaddr'],
'bmc_build_version': 'D.2.0.7',
'bmc_fw_version': 'D.2.0.22',
'boot_page': 'factory',
'bitstream_id': '0x2383A62A010504',
'root_key': '',
'revoked_key_ids': '',
}
fpga_dev_dict_update_array = [fpga_dev_dict_update]
self.service.fpga_device_update_by_host(self.context, host_uuid,
fpga_dev_dict_update_array)
dev = self.dbapi.fpga_device_get(FPGA_DEV_1['pciaddr'], host_id)
for key in fpga_dev_dict_update:
self.assertEqual(dev[key], fpga_dev_dict_update[key])
def test_device_update_image_status(self):
mock_host_device_image_update_next = mock.MagicMock()
p = mock.patch(
'sysinv.conductor.manager.ConductorManager.host_device_image_update_next',
mock_host_device_image_update_next)
p.start()
self.addCleanup(p.stop)
# Create compute-0 node
ihost = self._create_test_ihost(
personality=constants.WORKER,
hostname='compute-0',
uuid=str(uuid.uuid4()),
)
host_uuid = ihost.uuid
host_id = ihost.id
# Make sure we start with this set to false.
self.dbapi.ihost_update(host_uuid, {'reboot_needed': False})
DEV_IMG_STATE = {
'host_id': host_id,
'pcidevice_id': 5,
'image_id': 11,
'status': '',
}
device_image_state = self.dbapi.device_image_state_create(
DEV_IMG_STATE)
for key in DEV_IMG_STATE:
self.assertEqual(device_image_state[key], DEV_IMG_STATE[key])
# set status to "in-progress"
self.service.device_update_image_status(self.context,
host_uuid, device_image_state.uuid,
dconstants.DEVICE_IMAGE_UPDATE_IN_PROGRESS)
mock_host_device_image_update_next.assert_not_called()
device_image_state = self.dbapi.device_image_state_get(
device_image_state.id)
self.assertEqual(device_image_state.status,
dconstants.DEVICE_IMAGE_UPDATE_IN_PROGRESS)
ihost = self.dbapi.ihost_get(host_id)
self.assertEqual(ihost.reboot_needed, False)
# set status to "completed"
self.service.device_update_image_status(self.context,
host_uuid, device_image_state.uuid,
dconstants.DEVICE_IMAGE_UPDATE_COMPLETED)
mock_host_device_image_update_next.assert_called_with(
self.context, host_uuid)
device_image_state = self.dbapi.device_image_state_get(
device_image_state.id)
self.assertEqual(device_image_state.status,
dconstants.DEVICE_IMAGE_UPDATE_COMPLETED)
ihost = self.dbapi.ihost_get(host_id)
self.assertEqual(ihost.reboot_needed, True)
class ManagerTestCaseInternal(base.BaseHostTestCase):


@ -1951,3 +1951,53 @@ class TestMigrations(BaseMigrationTestCase, WalkVersionsMixin):
for col, coltype in memorys_cols.items():
self.assertTrue(isinstance(memorys.c[col].type,
getattr(sqlalchemy.types, coltype)))
def _check_104(self, engine, data):
# 104_fpga_devices.py
# Assert data types for new columns in table "pci_devices"
pci_devices = db_utils.get_table(engine, 'pci_devices')
pci_devices_cols = {
'status': 'String',
'needs_firmware_update': 'Boolean',
}
for col, coltype in pci_devices_cols.items():
self.assertTrue(isinstance(pci_devices.c[col].type,
getattr(sqlalchemy.types, coltype)))
# Assert data types for new table "fpga_devices"
fpga_devices = db_utils.get_table(engine, 'fpga_devices')
fpga_devices_cols = {
'created_at': 'DateTime',
'updated_at': 'DateTime',
'deleted_at': 'DateTime',
'id': 'Integer',
'uuid': 'String',
'host_id': 'Integer',
'pci_id': 'Integer',
'pciaddr': 'String',
'bmc_build_version': 'String',
'bmc_fw_version': 'String',
'root_key': 'String',
'revoked_key_ids': 'String',
'boot_page': 'String',
'bitstream_id': 'String',
}
for col, coltype in fpga_devices_cols.items():
self.assertTrue(isinstance(fpga_devices.c[col].type,
getattr(sqlalchemy.types, coltype)))
# Assert data types for new table "fpga_ports"
fpga_ports = db_utils.get_table(engine, 'fpga_ports')
fpga_ports_cols = {
'created_at': 'DateTime',
'updated_at': 'DateTime',
'deleted_at': 'DateTime',
'id': 'Integer',
'uuid': 'String',
'port_id': 'Integer',
'fpga_id': 'Integer',
}
for col, coltype in fpga_ports_cols.items():
self.assertTrue(isinstance(fpga_ports.c[col].type,
getattr(sqlalchemy.types, coltype)))