Just quick follow-up change for the edits raised in the final review of the train priorites change set[0]. [0]: https://review.opendev.org/#/c/660145 Change-Id: Iba585aa4e68c80868a02b299765dc070320c076d
8.4 KiB
Train Project Priorities
This is a list of priorities the Ironic team is prioritizing for Train development, in order of relative size and dependency addressing. Note that this is not our complete backlog for the cycle, we still hope to review and land non-priority items.
The primary contact(s) listed are responsible for tracking the status of that work and herding cats to help get that work done. They are not the only contributor(s) to this work, and not necessarily doing most of the coding! They are expected to be available on IRC and the ML for questions, and report status on the whiteboard for the weekly IRC sync-up. The number of primary contacts is typically limited to 2-3 individuals to simplify communication. We expect at least one of them to have core privileges to simplify getting changes in.
Goals
Priority | Primary Contacts |
---|---|
Deploy Steps | mgoddard, rloo |
Faster Deployments | TheJulia, dtantsur, stendulker |
Bare metal program | TheJulia, janders, hodgepodge |
Replacing WSME | mkrai, dtantsur, kaifeng, rpittau |
Redfish Virtual Media | etingof, rpittau |
Node retirement/quarantine | arne_wiebalck, rpittau |
Software RAID | arne_wiebalck, TheJulia |
Inter-Project Goals
State callbacks to nova | arne_wiebalck, tssurya |
SmartNIC Support | TheJulia, mkrai, moshele |
Community Goals
Goal | Primary Contacts |
---|---|
IPv6 support | TheJulia, derekh, dtantsur |
Single document generation | kaifeng, rpittau |
Details
Deploy Steps
As a general theme of work for the Train Cycle, the Ironic project community wishes to break the monolithic deployment step into multiple deployment steps which will further enable operators to easily create more complex declaritive deployments. This work also includes the the ability to trigger steps through the agent, something that is not presently possible today.
Faster Deployments
A repeating theme from the operator community is for ways to speed up the overall deployment time and enable faster time to deployment.
Ironic has completed some work in this area with the Fast Track Deployments, But there are other areas that can and should be explored by the community.
Bare metal program
The most powerful thing the Ironic community can do this cycle is not actually in code, but in documentation. The recently created Bare Metal SIG is working on creation of a white paper as part of the Bare Metal logo program, and needs our help for stand-alone use cases.
Replacing WSME
Most long time contributors are aware of the headaches that WSME has brought the community, along with the fact that many projects have migrated away from it.
In order to move us to something which is supported by a broader community, the consensus from the Train Project Teams Gathering, was to move ironic towards using Flask. We'll start with re-working a single endpoint and hopefully move through the rest of the API in a rapid fashion.
Redfish Virtual Media
One of the most powerful features we can offer to operators with distributed and edge ironic nodes is to offer booting the ramdisk via a generic Redfish Virtual Media boot interface. This will enable greater comptability and once completed in-gate testing of virtual media relate scenarios in CI.
More information can be found in story 1526753.
This work is also a logical step towards the L3 based deployments specification.
Node retirement/quarantine
Larger operators with Ironic have found themselves approaching a quandry of "What is the proper way to retire a machine from ironic?". A nearly identical topic has arisen from the Telecom world seeking to represent the state of the node more accurately with-in ironic as to represent if a machine is in a fault state or questionable state under-going investigation.
With that being said, we feel that we need to extend our states and state machine to better fit these overall themes.
Software RAID
Software RAID support has been long desired after by larger operators to help manage COTS server hardware where Hardware RAID controllers are undesirable or prohibitive. Work started during the Stein cycle to support this functionality, and work continues! You can learn more in the Software RAID specification.
State callbacks to nova
One of the headaches and performance issues for larger operators is the nature of power synchronization when nova is in use, as nova performs a large number of API calls to update its database with node power state. At larger scales, this is inefficient and results in the power state nova having on record from being out of state from ironic as the source of truth.
Conversely, nova presently assumes that it is always authorative in regards to power states. This work will allow ironic to inform nova of the new power state such that nova does not attempt to reset the power state.
While this is largely an effort in the nova project, we need to be aware and attempt to support this work to move foward. The nova-spec document can be found in review as change 636132.
SmartNIC Support
Smartnics complicates ironic as the NIC needs to be programmed with the power in a state such that the configuration on the NIC can be changed.
While the work in Ironic was completed this past cycle ahead of expectations. Work is on-going in Neutron this cycle to merge the functionality to make this available to users.
The story can be found at story 2003346.
IPv6 support
The Technical Committee is presently finalizing a goal for the Train cycle for projects to support and test IPv6-only deployments.
More information can be found in change 657174.
Single document generation
A goal from the Technical Committee is for each project to support the generation of a single PDF document for the whole of the documentation tree.
More information on this community goal can be in governance pdf doc generation goal documentation.