Update patch set 13

Patch Set 13:

Hi Krzysztof Klimonda, thanks for your thoughtful comments. Yes, the upgrade process is hard if we want to rebuild a node (master or worker) and drain it before the rebuild. It's even harder if the cluster is private, because then the Magnum control plane cannot reach the k8s API to do the drain. In other words, we're trying to do one thing that needs work from two sides:

1) trigger the upgrade from the Magnum control plane
2) drain the node from inside the cluster

So the solution could be:

a) do all the work on the Magnum control plane side
b) do all the work inside the cluster
c) a mix of both

Let's analyze the feasibility of each:

1) Obviously we cannot do all the work on the control plane side: the cluster is private, so there is no way to drain it from there.
2) Can we do all the work from inside the cluster? Possibly, but not easily. Assume an agent running in the cluster detects whether the cluster template has changed. If it has, the agent calls the Nova API or Heat API to rebuild the node. That is not reasonable either: because the cluster is managed as a Heat stack, it doesn't make sense to update a resource of the stack directly without talking to the Heat API.
3) Mixed. That's the approach I think is workable so far, but it does need some testing. One thing we need to verify is whether the pre-create or pre-update hook works for the image change action of the Nova instance. If it does, then we can have an agent running inside the cluster (or extend the existing Magnum auto healer) that periodically polls for the hook, drains the node if a hook is active, and then clears the hook after the drain.
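
To make the agent loop in 3) a bit more concrete, here is a rough Python sketch. It is only an illustration under assumptions: `list_hooked_nodes`, `drain_node` and `clear_hook` are hypothetical callables (not existing Magnum or Heat APIs) standing in for whatever we end up using to find nodes paused on a hook, drain them through the k8s API, and clear the hook in Heat.

```python
import time
from typing import Callable, Iterable, List, Optional


def drain_on_hook(
    list_hooked_nodes: Callable[[], Iterable[str]],  # hypothetical: nodes paused on a hook
    drain_node: Callable[[str], None],               # hypothetical: drain via the k8s API
    clear_hook: Callable[[str], None],               # hypothetical: clear the hook in Heat
    interval: float = 30.0,
    max_polls: Optional[int] = None,                 # None means poll forever
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Poll for pending hooks; drain each hooked node, then clear its hook."""
    handled: List[str] = []
    polls = 0
    while max_polls is None or polls < max_polls:
        for node in list_hooked_nodes():
            try:
                drain_node(node)
            except Exception:
                # Drain failed: leave the hook in place so the rebuild stays
                # paused and the node is retried on the next poll.
                continue
            # Only clear the hook once the drain has succeeded.
            clear_hook(node)
            handled.append(node)
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval)
    return handled
```

The point of the ordering is that the hook is cleared only after a successful drain, so a failed drain keeps the rebuild blocked instead of letting it proceed on a node that still runs workloads.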

Those are my thoughts so far; comments are welcome. Cheers.

Patch-set: 13
Gerrit User 6484 2021-04-27 23:11:16 +00:00 committed by Gerrit Code Review
parent c805042526
commit f2baec8058
