stx-puppet

Author	SHA1	Message	Date
Leonardo Fagundes Luz Serrano	277529f9f9	tox/zuul: Set puppetlabs_spec_helper version to 6.0.3 Temporary fix for zuul failing puppet lint gem install TestPlan: PASS Zuul Closes-Bug: 2039880 Change-Id: Id5557933551c226516148ec74559305320e4597f Signed-off-by: Leonardo Fagundes Luz Serrano <Leonardo.FagundesLuzSerrano@windriver.com>	2023-10-19 18:56:46 +00:00
Zuul	57bd4e0e95	Merge "Update kubelet system overrides on unlock"	2023-10-12 21:42:28 +00:00
Ramesh Kumar Sivanandam	82ca22f5b6	Update correct iptable config values in /etc/sysctl.d/k8s.conf The /etc/sysctl.d/k8s.conf file is missing the below iptable config values which causes the error in kubeadm init - "/proc/sys/net/ipv6/conf/default/forwarding was not set to 1" during optimized BnR operation. net.ipv4.ip_forward = 1 net.ipv4.conf.default.rp_filter = 0 net.ipv4.conf.all.rp_filter = 0 net.ipv6.conf.all.forwarding = 1 Recent changes in the below review modified the way Kubernetes is restored. It exposes the incorrect kernel parameters in stx-puppet. https://review.opendev.org/c/starlingx/ansible-playbooks/+/890370 This change updates the correct iptable configuration values in the file /etc/sysctl.d/k8s.conf during bootstrap which fixes the optimized BnR operation failure. These settings are intended to exactly align with the settings already being configured by the bringup-kubemaster task in the ansible-playbooks. Test Plan: PASS: Fresh install ISO as AIO-SX. Verify that /etc/sysctl.d/k8s.conf has the correct configuration values. PASS: Performed optimized BnR on IPv4 enabled AIO-SX. PASS: Performed optimized BnR on IPv6 enabled AIO-SX. Closes-Bug: 2038545 Change-Id: I585117190b2372cfd7c978eff9bd9ff6da61a88f Signed-off-by: Ramesh Kumar Sivanandam <rameshkumar.sivanandam@windriver.com>	2023-10-11 14:43:33 -04:00
Zuul	e49328c20a	Merge "Add k8s cfg file to the OAM firewall script"	2023-10-10 21:06:06 +00:00
Andre Kantek	58581b88e9	Add k8s cfg file to the OAM firewall script In the change https://review.opendev.org/c/starlingx/stx-puppet/+/897467 the OAM firewall was not updated to pass the k8s config file as argument to calico_firewall_apply_policy.sh. This caused an error that prevented the global network policy from being created, making the OAM interface block all traffic, except for the failsafed ones. This change corrects that. Test Plan [PASS] In AIO-DX remove the current OAM GNP and execute lock/unlock on one of the controllers, verify the OAM GNP is recreated. [PASS] In AIO-DX remove the current OAM GNP and force the runtime execution by creating the file /etc/platform/.platform_firewall_config_required and observe the request to recreate the OAM GNP Closes-Bug: 2038550 Change-Id: Ica03dbf6ffd9f6f592fa53efa40293191203377a Signed-off-by: Andre Kantek <andrefernandozanella.kantek@windriver.com>	2023-10-10 17:07:40 -03:00
Gleb Aronsky	f426f5c67a	Update kubelet system overrides on unlock Add logic to the platform::kubernetes::configuration method to generate the kubelet's systemd override file. This change ensures the file is generated every time a host is unlocked. This facilitates delivery of systemd service changes via patches to existing installs. This change is needed by bug 2027810 to ensure that the orphan volume cleanup script is executed as part of the systemd ExecStartPre kubelet service override. This change is an update of this reverted commit: https://review.opendev.org/c/starlingx/stx-puppet/+/895364 Test Plan: Pass: - Update the kube-stx-override.conf.erb file - Lock the AIO-SX host - Unlock the AIO-SX host - Verify that kube-stx-override.conf has been updated - Verify AIO-SX fresh install - Verify Standard Duplex lock/unlock and verify that kube-stx-override.conf has been updated - Verify Standard Duplex Install Partial-Bug: 2027810 Change-Id: I4e47bce634c21396acb2e5f1540cac0be3ed34ec Signed-off-by: Gleb Aronsky <gleb.aronsky@windriver.com>	2023-10-10 12:57:56 -07:00
Zuul	2a96b5b482	Merge "Update network interfaces during upgrade bootstrap execution"	2023-10-10 18:03:22 +00:00
Gleb Aronsky	3dcd90d6eb	Add missing default value for a kubernetes puppet definition Set default value for 'onlyif' in the puppet definition 'platform::kubernetes::mask_stop_service' if no conditional criteria are set. This fixes a regression where the mask_stop_kubelet method calls mask_stop_service with no value set for the onlyif parameter. Test Plan: - PASS: Upgrade Kubelet Closes-Bug: 2038858 Change-Id: I06e4cf42dbe710a78443dffb83c972c73f9789bf Signed-off-by: Gleb Aronsky <gleb.aronsky@windriver.com>	2023-10-10 15:25:16 +00:00
Lucas Ratusznei Fonseca	521f9294db	Update network interfaces during upgrade bootstrap execution There are two issues addressed by this change: 1. The change introduced by https://review.opendev.org/c/starlingx/stx-puppet/+/895726 prevents the interfaces from being activated during upgrade bootstrap. Such behaviour causes issues when specific network configurations are needed for the upgrade to finish. This change reverts it, but making sure that the sysinv lock issue is properly taken care of. 2. Given an interface that is already activated and configured with ip addresses via the 'ip' command, if an attempt is made to bring it up via the 'ifup' command, the command fails and the default route (if present in the config file) is not properly configured. To prevent this from happening, this change improves the logic in the function that brings the interface down, so that the interface is guaranteed to be in down state and have no IP addresses. Test plan Systems - AIO-SX IPv4 - AIO-SX IPv6 Test IP addresses (examples) +---+-----------------+-----------------+ \| # \| IPv4 (/24 mask) \| IPv6 (/64 mask) \| +---+-----------------+-----------------+ \| 1 \| 10.20.1.1 \| fd00::1:1 \| \| 2 \| 10.20.1.2 \| fd00::1:2 \| \| 3 \| 10.20.1.3 \| fd00::1:3 \| \| 4 \| 10.20.1.4 \| fd00::1:4 \| \| 5 \| 10.20.2.1 \| fd01::1:1 \| +---+-----------------+-----------------+ Test scenarios The initial setup for each test must follow what is described in each corresponding scenario below. For it to be valid, the configuration in the kernel must be in sync with the sysinv database. 1. Standard ethernet interface as OAM - oam0 > Type: regular ethernet > Underlying interface: 'if0' for reference > Static IP: #2 > Gateway IP: #1 2. 
VLAN interface as OAM - oam0 > Type: VLAN ID 100 > Interface name: 'vlan100' for reference > Underlying interface: 'if0' for reference > Static IP: #2 > Gateway IP: #1 Actions - Edit ifcfg-<interface>: > Manually edit /etc/network/interfaces.d/ifcfg-<interface> to change MTU value (for example, from 1500 to 1502), this will cause the script to detect the difference and trigger an interface update. - Erase ifcfg-<interface>: > Manually remove /etc/network/interfaces.d/ifcfg-<interface> file from the filesystem. - ifdown <interface>: > Run command 'ifdown <interface>' to cause the interface to be deactivated by ifupdown. - Set link up <interface>: > Run command 'ip link set up dev <interface>' to put interface's link to UP state. - Create VLAN <vlan-name> on <iface>: > Create VLAN interface and set its link to UP through the commands 'ip link add link <iface> name <vlan-name> type vlan id <vlan-id>' and 'ip link set up dev <vlan-name>'. - Add IP <address> to <interface>: > Run command 'ip address add <address> dev <interface>' to add an IP address to the interface. - Add route to <interface> via <address>: > Run command 'ip route add default via <address> dev <interface>' to add a default route to the interface. - Modify MTU: > Modify the MTU of the interface via sysinv (example: 'system host-if-modify controller-0 oam0 -m 1502') [ Test Case 1 - Direct script call tests ] For the tests, changes to the OAM interface will be made to check if its parameters (link state, IP address, default route) are correctly restored by the apply_network_config.sh script. The changes here are made manually to the files and not through sysinv. Test procedure 1. Apply initial setup. 2. Apply actions. 3. Run /usr/local/bin/apply_network_config.sh as root. 4. Check that interface state, IP address and default route in kernel match the ones in the sysinv database. 
Tests For scenario #1 PASS For if0, edit ifcfg PASS For if0, erase ifcfg PASS For if0, ifdown, erase ifcfg PASS For if0, ifdown, edit ifcfg, set link up, add address IP#2 PASS For if0, ifdown, edit ifcfg, set link up, add address IP#5 PASS For if0, ifdown, edit ifcfg, set link up, add address IP#2, add route via IP#4 For scenario #2 PASS For vlan100, edit ifcfg PASS For vlan100, erase ifcfg PASS For vlan100, ifdown, erase ifcfg PASS For vlan100, ifdown, edit ifcfg, create VLAN, add address IP#2 PASS For vlan100, ifdown, edit ifcfg, create VLAN, add address IP#5 PASS For vlan100, ifdown, edit ifcfg, create VLAN, add address IP#2, add route via IP#4 PASS For if0, edit ifcfg PASS For if0, erase ifcfg [ Test Case 2 - Indirect script call on lock/unlock ] Test Procedure 1. Apply initial setup. 2. Lock host. 3. Apply actions. 4. Unlock host. 5. Check that interface state, IP address and default route in kernel match the ones in the sysinv database. Tests For scenario #1 PASS For if0, modify MTU PASS For if0, erase ifcfg (to simulate first unlock when ifcfg-* files don't exist) For scenario #2 PASS For vlan100, modify MTU PASS For if0, modify MTU PASS For if0 and vlan100, erase ifcfg [ Test Case 3 - Indirect script call on upgrade ] Tests PASS Perform upgrade on VirtualBox PASS Perform upgrade on a physical lab ----------------------------------------------------------------------- Closes-Bug: #2036451 Signed-off-by: Lucas Ratusznei Fonseca <lucas.ratuszneifonseca@windriver.com> Change-Id: Ibda7c744e9be26b0bbcbd1520ffe15825ad1f60f	2023-10-09 16:41:00 -03:00
Zuul	d9f373a3f9	Merge "Gracefully stop the isolcpu and kubelet service"	2023-10-06 20:08:36 +00:00
Zuul	bad07c4e2f	Merge "Remove worker remote firewall scripts"	2023-10-06 17:30:49 +00:00
Andre Kantek	eebf90d20e	Remove worker remote firewall scripts The implementation for worker firewall avoided using local kubectl commands. This required access to the keyring for remote ansible ad-hoc commands and leaves the /opt/platform/.config mounted on the worker. Use kubectl command with /etc/kubernetes/kubelet.conf instead, so we can refrain from mounting /opt/platform/.config Since all firewall data is generated in the host's hieradata file, the worker node needs to be able to access the calico firewall resources. To achieve that a ClusterRole and ClusterRoleBinding are added, via the controller node, allowing access to only the necessary resources. Test Plan: [PASS] Install a standard setup and validate the worker node firewall configuration [PASS] Execute a DOR test in the cluster and check if the worker nodes install the firewall GNP and HE [PASS] Execute worker node lock/unlock and check if the worker nodes install the firewall GNP and HE Closes-Bug: 2038550 Change-Id: Icf31b513427120fe81c53be21b8d8a81a8e323f8 Signed-off-by: Andre Kantek <andrefernandozanella.kantek@windriver.com>	2023-10-06 08:12:27 -03:00
Andre Mauricio Zelak	179fef6fef	Fix task ordering to prevent race condition on ptp config gen Changed the task ordering for the ptpinstance runtime manifest. The previous ordering could result in a race condition where ts2phc was being restarted before the clock-conf.conf file is updated. This would result in a PTP instance that is out-of-tolerance, skewed from the primary clock. Changed the order between 'platform::ptpinstance' and 'platform::ptpinstance::nic_clock', and moved the directory "${ptp_conf_dir}/ptpinstance" creation to nic_clock. The nic_clock class will first use it to store the clock-conf.conf file. Test Plan: PASS: Using a multiple PTP instances configuration and run the manifest with system ptp-instance-apply. Ensure that ts2phc starts after clock_nic class, the ts2phc is handling all the interfaces configured and the clock is not skewed. PASS: Host lock and unlock and check the puppet manifest log Closes-bug: 2038383 Change-Id: I97e5a6bf536f05720fcaea860a83e53454e83ab6 Signed-off-by: Andre Mauricio Zelak <andre.zelak@windriver.com>	2023-10-05 19:45:41 -03:00
Zuul	f050063889	Merge "Update deprecated K8S API references of v1beta2 to v1beta3"	2023-09-29 22:07:15 +00:00
Zuul	a86f226666	Merge "Revert "Update dnsmasq conf file for host-record support""	2023-09-29 20:23:41 +00:00
Joseph V	1fbc6cdca4	Revert "Update dnsmasq conf file for host-record support" This reverts commit 6c418b0441460ed018e76cfda0b13c97e240e000. Reason for revert: Partial-Bug LP: https://bugs.launchpad.net/starlingx/+bug/2037734 Closes-Bug: 2037734 Story: 2010835 Task: 48724 Change-Id: Ic9779078349fdbfa6e48a9e90bad8f0794b6ddd9	2023-09-29 19:00:24 +00:00
Zuul	a34c2ff255	Merge "Add DNS host records to /etc/hosts"	2023-09-28 15:57:35 +00:00
Gleb Aronsky	9f2bc83059	Gracefully stop the isolcpu and kubelet service This change ensures that the isolcpu_plugin service is stopped and masked prior to masking and stopping the kubelet service. Additionally, on startup, the kubelet service is unmasked and started prior to unmasking isolcpu_plugin. This change is intended to avoid any race conditions that can occur because of the dependency on kubelet by isolcpu_plugin, resulting in numerous restarts of both services and leading to failed node upgrades. Test Plan: - PASS: Upgrade kubelet AIO-SX - PASS: Upgrade kubelet on a STANDARD installation Closes-Bug: 2036985 Change-Id: Ifb2b512c3953d2a1f7efdba289a31d5a9315cae4 Signed-off-by: Gleb Aronsky <gleb.aronsky@windriver.com>	2023-09-28 14:23:09 +00:00
Zuul	fbadea9885	Merge "Update dnsmasq conf file for host-record support"	2023-09-27 13:34:36 +00:00
Saba Touheed Mujawar	b65a954562	Update deprecated K8S API references of v1beta2 to v1beta3 Updating just a comment that has apiVersion: kubeadm.k8s.io/v1beta2 to v1beta3. Test Plan: PASS: k8s upgrade from 1.22.5 to next all available versions. Story: 2010878 Task: 48836 Change-Id: Id06a0e73212ddeb5b1817a6f7de286ca37a78538 Signed-off-by: Saba Touheed Mujawar <sabatouheed.mujawar@windriver.com>	2023-09-27 08:50:29 -04:00
Zuul	cd13fc3c4b	Merge "Add logic to recreate routes when interfaces are brought down/up"	2023-09-25 15:32:02 +00:00
Joseph Vazhappilly	23ea7423db	Add DNS host records to /etc/hosts The controller needs to add user configured DNS host records to /etc/hosts file, before initial unlock when dnsmasq service is not running. Other personalities like worker hosts and storage hosts do not require this change. Test Plan: PASS: Successful build PASS: Successful bootstrap, initial unlock of controller-0 PASS: Verify after controller-0 unlock, lock and unlock controller-1 PASS: Verify duplex controller host-swact with dns host records PASS: Verify host records in /etc/hosts when dnsmasq is down PASS: Verify host records absent in /etc/hosts when dnsmasq is up Story: 2010835 Task: 48725 Change-Id: I3f11084c7458f89288ba4bef18c78e60ff55b74e Signed-off-by: Joseph Vazhappilly <joseph.vazhappillypaily@windriver.com>	2023-09-25 05:42:19 -04:00
Lucas Ratusznei Fonseca	99ee3f190e	Add logic to recreate routes when interfaces are brought down/up When the apply_network_config.sh script detects changes in an interface's config file, it brings the link down and up again. This causes any associated routes to be only automatically deleted from the kernel and not restored. This commit adds logic to restore any affected routes when the associated interface is brought down/up. Test plan Setup: System - AIO-SX IPv4 Interfaces - data0: ETH static ip 10.10.10.3/24 - data0.100: VLAN static IP 10.10.11.3/24 - data1: ETH static ip 10.10.20.3/24 - data1.200: VLAN static IP 10.10.21.3/24 Pre-existing routes - rt01: 10.20.1.0 -> 10.10.10.1 via data0 - rt02: 10.20.2.0 -> 10.10.10.1 via data0 - rt03: 10.20.3.0 -> 10.10.11.1 via data0.100 - rt04: 10.20.4.0 -> 10.10.11.1 via data0.100 - rt05: 10.20.5.0 -> 10.10.20.1 via data1 - rt06: 10.20.6.0 -> 10.10.20.1 via data1 - rt07: 10.20.7.0 -> 10.10.21.1 via data1.200 - rt08: 10.20.8.0 -> 10.10.21.1 via data1.200 Routes to be added - rt09: 10.20.9.0 -> 10.10.10.1 via data0 - rt10: 10.20.10.0 -> 10.10.11.1 via data0.100 - rt11: 10.20.11.0 -> 10.10.20.1 via data1 - rt12: 10.20.12.0 -> 10.10.21.1 via data1.200 Actions - Change interface: manually edit /etc/network/interfaces.d/ifcfg-* to change MTU value, this will cause the script to detect the difference and trigger an interface update. - Add route: manually edit /var/run/network-scripts.puppet/routes to add a new route, this will cause the script to detect the difference and trigger an update in the routes. - Remove route: manually edit /var/run/network-scripts.puppet/routes to remove a route. Procedure for direct tests 1. Perform action 2. Run /usr/local/bin/apply_network_config.sh as root 3. 
Validate that routes in /var/run/network-scripts.puppet/routes and in the kernel match Direct tests: PASS Change data0.100 PASS Change data0 PASS Change both data0.100 and data1.200 PASS Change both data0 and data1 PASS Add route rt09 PASS Remove route rt01 PASS Change data0.100 and add route rt10 PASS Change data0.100 and remove route rt03 PASS Change data0 and add routes rt09 and rt10 PASS Change data0 and remove routes rt01 and rt03 PASS Change data0 and data1, add routes rt09, rt10, rt11, rt12 PASS Change data0 and data1, remove routes rt01, rt03, rt05, rt07 PASS Repeat all the previous tests in an equivalent IPv6 setup Indirect tests: PASS Lock system, change MTU of data0 via 'system host-if-modify', unlock system PASS Lock system, erase /etc/network/interfaces.d/ifcfg-* and /etc/network/routes, unlock system Closes-Bug: #2036667 Signed-off-by: Lucas Ratusznei Fonseca <lucas.ratuszneifonseca@windriver.com> Change-Id: If7f301ff797a6644aaa3ff7f73de530d680c6b3b	2023-09-22 11:57:12 -03:00
Zuul	3cdbbcc3b4	Merge "Add crashdump template and parameter handling"	2023-09-21 16:50:33 +00:00
Enzo Candotti	f6b33579d9	Add crashdump template and parameter handling This commit aims to handle the new platform crashdump service parameters in order to create a crash-dump-manager configuration file. This file will be located in /etc/default/crash-dump-manager, and will be used as an EnvironmentFile for the crashDumpMgr service, in order to set the new parameters. Test Plan: PASS: TC1 add new parameters in the service-parameter table with valid value --> parameter added in service-parameter table --> crash-dump-manager updated with new values PASS: TC2 modify new parameters in the service-parameter table with valid value --> service-parameter table updated with new value --> crash-dump-manager updated with new value PASS: TC3 delete new parameters from the service-parameter table --> parameter deleted from service-parameter table --> crash-dump-manager updated with default value PASS: TC4 add/modify new parameters in the service-parameter table with no value --> msg: "The service parameter value is mandatory" --> parameter not added/modified in service-parameter table --> crash-dump-manager not updated PASS: TC5 a) add/modify max_files in the service-parameter table with wrong format --> msg: "Parameter 'max_files' must be an integer value." --> parameter not added/modified in service-parameter table b) add/modify the other parameters in the service-parameter table with wrong format --> msg: "Parameter <value> must be written in human readable format" --> parameter not added/modified in service-parameter table PASS: TC6 reboot host with new parameters configured --> service-parameter table keeps new parameters value --> crash-dump-manager keeps configured values PASS: ISO installation Story: 2010893 Task: 48766 Signed-off-by: Enzo Candotti <enzo.candotti@windriver.com> Change-Id: Ia02e73462802c7831331bed6dec98a6b1cb37020	2023-09-21 16:06:54 +00:00
Zuul	ffd09b26de	Merge "When executing upgrade bootstrap for AIO-SX only update /etc/network/"	2023-09-20 15:18:04 +00:00
Andre Kantek	4a816dfa6c	When executing upgrade bootstrap for AIO-SX only update /etc/network/ During upgrade bootstrap, the runtime execution of the puppet class platform::network::runtime created a lock timeout with sysinv-agent to allow the interface configuration in the kernel. The lock exists to allow sysinv-agent to collect interface information for the system inventory. The optimized upgrade feature created this runtime execution to fill the contents in /etc/network/interfaces.d/ and /etc/network/routes to be available during the network bringup phase that happens after the system unlock in an earlier step than the regular puppet manifest execution. But as part of apply_network_config.sh execution it also brings up the interfaces, accessing the script protected section and causing lock timeout. This change uses the file /var/run/.network_upgrade_bootstrap to indicate an upgrade bootstrap is under way to just populate the /etc/network files and to not activate the interfaces. The interface activation is not needed as the bootstrap is still under way and the minimal network configuration is already provided prior to the bootstrap. After unlocking the files in /etc/network will provide a faster network availability as systemd's network service is one of the first to be executed during boot. 
Test Plan: [PASS] in 21.12 add an assorted network configuration with - vlan, bonded, and ethernet interfaces (the vlan interface on top of both base interfaces) - configure the bond interface with static address - add routes on all platform and data interfaces [PASS] execute lock/unlock in 21.12 to verify config is applied [PASS] execute AIO-SX upgrade to 22.12, at the bootstrap end check files in /etc/network/ [PASS] finish upgrade and after unlock verify the network access and address, interfaces, and route creation in the kernel Closes-Bug: 2036451 Change-Id: Ib04f72298252a52a8a05cf644671106ad6530e5f Signed-off-by: Andre Kantek <andrefernandozanella.kantek@windriver.com>	2023-09-19 18:34:53 +00:00
Zuul	8abb5796bc	Merge "Etcd service status: check for certs error"	2023-09-15 21:14:50 +00:00
Zuul	bfbf807a8f	Merge "Revert "Update kubelet system overrides on unlock""	2023-09-15 17:03:06 +00:00
Bruce Jones	22f341e5af	Revert "Update kubelet system overrides on unlock" This reverts commit 7d870177c6c01f5b47a33ef1cd7ce92c3c58694f. Reason for revert: blocking testing due to lock/unlock failures Change-Id: Idc746dfc1076ef2f45a9790c7fecf4bb848162e2	2023-09-15 16:40:12 +00:00
Zuul	2fde8b0872	Merge "In DC setup update firewall with the routes"	2023-09-14 23:33:10 +00:00
kaustubh.dhokte	3ffe8b7e1e	Etcd service status: check for certs error The script /etc/init.d/etcd is used by the service manager for management of the etcd service. The call '/etc/init.d/etcd status' uses etcdctl health API to determine if the service is running fine or not. In the event that etcd certs are replaced with new ones but the service has not yet been restarted to use new ones, the status call will fail even though the service is running fine and the service manager will treat the service as failed. 'sm-audit' (which is run periodically) uses '/etc/init.d/etcd status' call to determine and maintain the service health. Service manager receiving false service status may introduce a lot of bugs. One such scenario is that 'sm' ignores the 'service restart' call if it thinks service is disabled. This leads to etcd not being restarted with new certs during upgrade activate and not being reachable to the kube-apiserver (which may have started using new client certs). This change modifies '/etc/init.d/etcd status' call to not just rely on etcd health api to determine if the etcd service is running and checks for the existence of etcd runtime information in case the health api fails with the 'bad certificate' error. Test Plan: PASS: Replace old certs with new certs at /etc/etcd/ and do not restart the service. Check that the '/etc/init.d/etcd status' is 'running'. PASS: Replace old certs with new certs at /etc/etcd/ and restart the service. Check that the '/etc/init.d/etcd status' is 'running'. Closes-Bug: 2033942 Change-Id: Id30a262ca1bde6d8acb85de10882ca9bd4b59bdd Signed-off-by: kaustubh.dhokte <kaustubh.dhokte@windriver.com>	2023-09-14 22:59:10 +00:00
Andre Kantek	ed3a1d65a2	In DC setup update firewall with the routes The tests in DC labs intended to bring up up to 1000 subclouds showed problems to create all routes when it was necessary to add the firewall update in the same puppet manifest apply that handled platform::network::routes::runtime, generating an ever growing queue that generated timeout during subcloud creation by DC manager. This change adds the firewall classes to be run from inside platform::network::routes::runtime to allow both the route configuration and the firewall update. Test Plan: [PASS] In a SysCtrl, add/remove a single route (via CLI/sysinv-API) if in a controller, verify route and firewall update [PASS] In a SysCtrl, add/remove a single route (via CLI/sysinv-API) if in a worker, verify only route update is executed [PASS] In a SysCtrl, add/remove 50 routes (via CLI/sysinv-API) using the parallel bash command: seq 1 50\|parallel --jobs 25 --eta '\ system host-route-add 1 mgmt0 51.{}.{}.0 24 192.168.0.1 1 && \ system host-route-add 2 mgmt0 51.{}.{}.0 24 192.168.0.1 1;' [PASS] In a SysCtrl, add a subcloud AIO-SX and validate route and firewall update [PASS] Install a subcloud and check the SysCtrl network is installed as a route and in the mgmt firewall Closes-Bug: 2033919 Change-Id: Ic5bf9bc84b8c583a39d8c91b72caef4d84240123 Signed-off-by: Andre Kantek <andrefernandozanella.kantek@windriver.com>	2023-09-13 17:10:08 -03:00
Zuul	8722928985	Merge "Set default value for private_dc_ip_address"	2023-09-12 18:06:01 +00:00
Zuul	75b0b669f0	Merge "Remove rabbitmq dependencies from sysinv puppet"	2023-09-12 14:51:47 +00:00
Steven Webster	34431bb0f7	Set default value for private_dc_ip_address It was found that when applying a patch containing the code for enabling the admin network on distributed cloud systems, a problem could occur after unlock because of an undefined value for 'private_dc_ip_address'. This is because the value of this puppet parameter is only set in the system.yaml hieradata, which is not written on normal host lock/unlock. The value is generated however, when a user assigns an interface to the admin network, or performs another action that leads to the system hieradata being generated. If the user has not assigned an interface to the admin network, or if the user has no interest in using the admin network, this commit ensures that the value of the private_dc_ip_address is that of the private_ip_address (ie. mgmt). Setting the private_dc_ip_address to the private_ip_address as default ensures that every component using the haproxy::params behaves as it did pre-patch. Testing: Success path: - Install, lock / unlock AIO-SX system (stx 8.0 + patch) - Install, lock / unlock an AIO-DX DC subcloud (stx 8.0 + patch) - Update a DC subcloud to use the admin network after the system has been patched. Regression: - Install, lock / unlock AIO-SX system (dev build) - Install, lock / unlock an AIO-DX DC subcloud (dev build) - Install a DC subcloud using the mgmt. network (dev build) - Install a DC subcloud using the admin network (dev build) Story: 2010319 Task: 46911 Signed-off-by: Steven Webster <steven.webster@windriver.com> Change-Id: Ice16a4a3a2d9faa461c1c0f98d1ea0b0c7ce751a	2023-09-12 09:36:32 -04:00
Zuul	94f80ab772	Merge "Revert "Revert Patch of puppet-manifest-apply.sh""	2023-09-12 00:58:16 +00:00
Zuul	545f251d3e	Merge "Update haproxy config to include keystone request retry."	2023-09-11 19:30:49 +00:00
Lucas Borges	e44ff4ecfe	Revert "Revert Patch of puppet-manifest-apply.sh" This reverts commit a1784deca9d30848f05d2ca53e66cf832d54b0da. Reason for revert: The whitelist was created to ignore the warning only on VMs; however, the ignored warning was also seen on a real server. This needs to be tested more extensively on different types of servers. Story: 2010757 Task: 48644 Change-Id: I979b4269d0e8f68b5ea0c8471b14e666a437730d Signed-off-by: Lucas Borges <lucas.borges@windriver.com>	2023-09-11 15:21:03 +00:00
Zuul	68404747e7	Merge "Fix puppet network.pp to load vfio-pci driver with sriov enabled"	2023-09-11 13:06:58 +00:00
Lucas Borges	ded6c7a3e0	Update puppet bucket cache dir Every puppet apply generates some caches of files in /var/cache/puppet/clientbucket, but the script is cleaning a different directory, /var/lib/puppet/clientbucket. This updates the cache path. The vardir setting in puppet.conf, which sets the cache dir, was removed in https://review.opendev.org/c/starlingx/integ/+/830542. Without that setting, puppet stores the cache under /var/cache/puppet/clientbucket. Test Plan: PASS: Ensure the cache directory is removed after the puppet apply (AIO-SX) Closes-Bug: 2034932 Change-Id: If564d00bc09030d0543669a11a646fcf502bf65b Signed-off-by: Lucas Borges <lucas.borges@windriver.com>	2023-09-08 17:38:59 +00:00
Bezerra Filho, Moacir	a3ceb3ea34	Fix puppet network.pp to load vfio-pci driver with sriov enabled - Modified manifests/network.pp to enable sriov when the driver is vfio-pci Test plan: 01. PASSED - system host-lock controller-0 02. PASSED - Configure any NIC that has status UP (check with the command: ip addr) 03. PASSED - system host-if-modify -m 1500 -n sriov0 -c pci-sriov \ -N 63 --vf-driver=vfio controller-0 <interface_name> 04. PASSED - system host-unlock controller-0 05. PASSED - system application-upload <sriov-fec-operator-version.tgz> 06. PASSED - system application-apply sriov-fec-operator 07. PASSED - cat /sys/module/vfio_pci/parameters/enable_sriov eq. "Y" 08. PASSED - cat /sys/module/vfio_pci/parameters/disable_idle_d3 eq. "Y" (In this example I'm using ACC200 but it can be ACC100 or N3000) 09. PASSED - kubectl apply -f sriov-fec-config-acc200-vfio-pci.yaml 10. PASSED - kubectl apply -f acc200.yml 11. PASSED - kubectl exec acc200 -it -- bash 12. PASSED - echo $PCIDEVICE_INTEL_COM_INTEL_FEC_ACC200 13. PASSED - Run the script below in the container CPU_SET=$(taskset -pc 1 \| sed -e 's/pid 1.s current \ affinity list://g') for d in $(lspci -d 8086:57c1 \| cut -d' ' -f1) ; do /opt/sysroot/usr/local/bin/dpdk-test-bbdev \ --vfio-vf-token=02bddbbf-bbb0-4d79-886b-91bad3fbb510 \ -l $CPU_SET -a $d -- -c validation \ -v /opt/sysroot/usr/local/app/test-bbdev/ldpc_dec_default.data done 14. PASSED - DPDK Validation with testpmd was performed Closes-bug: #2033103 Signed-off-by: Bezerra Filho, Moacir <Moacir.BezerraFilho@windriver.com> Change-Id: I880551fafbb351c370acedeaf43f9de595fea0af	2023-09-08 16:47:47 +00:00
Bezerra Filho, Moacir	86c4ab043b	Update haproxy config to include keystone request retry. - Add keyword retry_on in haproxy::backend - Add values retry_on in keystone.pp - Modified keystone_http_connect_timeout 10 to 15 in api.pp, api_proxy.pp, certalarm.pp and certmon.pp This workaround solves: - DC Scale \| RR Patch Orchestration fails as it cannot retrieve patches for subcloud after the apply - DC Patch - Parallel patch orchestration fails to establish connection to MGMT interface of subclouds - Patch orchestration fail due to transient keystone errors Test plan: 1. (PASSED) Patch Creation: - Construct a "reboot required" RR patch that encompasses the specified changes. - Generate an "in-service test" NRR patch. 2. (PASSED) Initial Setup: - Commission a DC system with over 500 subclouds. - Assert that the patch encompassing the fix is applied successfully on the DC. 3. (PASSED) Strategy Creation and RR Patch Deployment (Max 250 Subclouds): - Created a RR patch strategy with max_parallel_subclouds set to 250 - Checked that the RR patch strategy is applied to all subclouds successfully. - Repeat this process on 250 more subclouds - Checked that the patch strategy is applied to all subclouds successfully. 4. (PASSED) Strategy Alteration and NRR Patch Deployment (Max 500 Subclouds): - Eliminate the existing patch strategy. - Initiate a NRR patch strategy, adjusting the max_parallel_subclouds parameter to 500. - Checked that the "in-service test" NRR patch is successfully applied across all subclouds and that no linked issues arise. Closes-Bug: #2025646 Change-Id: I95e9c8f3cd904d7f637da2ea69a83fd7fa5f03a1 Signed-off-by: Bezerra Filho, Moacir <Moacir.BezerraFilho@windriver.com>	2023-09-08 13:13:07 +00:00
Zuul	47eccd0e9f	Merge "Update kubelet system overrides on unlock"	2023-09-08 05:11:59 +00:00
Gleb Aronsky	7d870177c6	Update kubelet system overrides on unlock Move generation of kubelet's systemd override file, kube-stx-override.conf, from platform::kubernetes::master::init to platform::kubernetes::configuration so that the file will be generated on every host unlock. This facilitates delivery of systemd service changes via patches to existing installs. This change is needed by bug 2027810 to ensure that the orphan volume cleanup script is executed as part of the systemd ExecStartPre kubelet service override. Test Plan: Pass: - Update the kube-stx-override.conf.erb file - Lock the host - Unlock the host - Verify that kube-stx-override.conf has been updated - Verify AIO-SX - Verify Standard config Partial-Bug: 2027810 Change-Id: I3b496abc807bf75716d28079c62ef4700dcd3244 Signed-off-by: Gleb Aronsky <gleb.aronsky@windriver.com>	2023-09-06 13:50:27 -07:00
Joseph Vazhappilly	6c418b0441	Update dnsmasq conf file for host-record support Add option 'conf-file' in dnsmasq conf file to use additional config file in persistent store. This additional conf file is used to support host-record option of dnsmasq. Users can add A, AAAA and PTR records to the DNS with TTL support. Test Plan: PASS: Successful build PASS: Successful bootstrap and unlock PASS: Verify ping hostnames by adding host-record in new conf file Story: 2010835 Task: 48724 Change-Id: I583d1cf783a32a3dc6f2d1c9786287a7f64809a3 Signed-off-by: Joseph Vazhappilly <joseph.vazhappillypaily@windriver.com>	2023-09-05 05:06:40 -04:00
Samuel Toledo	3d5b46834a	Remove rabbitmq dependencies from sysinv puppet Continuing the efforts from [1], this review consists in removing all dependencies related to amqp classes as well as initializations for rabbitmq variables. This removal can be done because sysinv does not use rabbitmq. Test plan PASS - Perform fresh install and bootstrap in an AIO-SX successfully PASS - Perform fresh install and bootstrap in an AIO-DX successfully PASS - Run any system command successfully (system host-list, system application-list, etc) Story: 2010802 Task: 48578 [1] - https://storyboard.openstack.org/#!/story/2010802 Change-Id: I5da60b97ac8808d95d5b76ade065ea521e62e251 Signed-off-by: Samuel Toledo <samuel.presatoledo@windriver.com>	2023-08-31 19:43:12 +00:00
Luis Marquitti	a1784deca9	Revert Patch of puppet-manifest-apply.sh Script puppet-manifest-apply.sh was patched until the warnings in running the bootstrap, aio unlock and runtime manifests were resolved on Debian: https://review.opendev.org/c/starlingx/stx-puppet/+/844121 In addition to the patch reversal, a whitelist was created for virtual environments, where any warnings during the development process can be added. The whitelist was implemented with an initial warning ("Could not retrieve fact ipaddress") that only affects virtual environments. Test Plan: PASS: Build & Install PASS: AIO-SX & AIO-DX Successful Bootstrap PASS: AIO-SX & AIO-DX Successful Unlock PASS: Verified that warnings added to the whitelist don't affect manifest execution in virtual environments. Story: 2010757 Task: 48644 Change-Id: Id67facc82bed7e069efb5f52e0775a69da355de0 Signed-off-by: Luis Marquitti <luis.eduardoangelinimarquitti@windriver.com>	2023-08-31 09:46:35 -03:00
Zuul	cdefacfea6	Merge "Replace puppet template with shell script"	2023-08-29 17:28:09 +00:00
Heron Vieira	0c43be4cd1	Replace puppet template with shell script Based on the goal of the story 2010802, this review replaces the puppet template that calls the manage-partitions script with a shell script, saving time by not rendering the puppet template, just executing the shell script. Tests showed a time reduction of around 10%. Test plan PASS: AIO-SX fresh install, bootstrap and initial unlock. PASS: AIO-DX fresh install, bootstrap and initial unlock for all nodes. PASS: Standard (2+2) fresh install, bootstrap and initial unlock for all nodes. PASS: AIO-SX lock and unlock after install. PASS: AIO-DX lock and unlock after install. PASS: Standard (2+2) lock and unlock after install. PASS: Standard (2+2):Test manage-partitions script with modify, delete and create operations on all hosts. PASS: SX:Test manage-partitions script with modify, delete and create operations. Story: 2010802 Task: 48595 Signed-off-by: Heron Vieira <heron.vieira@windriver.com> Change-Id: I95847762b08d49d0fe8cf144691321489ea5b2c9	2023-08-29 14:48:44 +00:00

1813 Commits