starlingx/ha - ha - OpenDev: Free Software Needs Free Tools

Author	SHA1	Message	Date
marvin	0d5e7e5409	Removing unused flag disable_worker_services The disable_worker_services file was originally created to prevent the (bare metal) nova-compute services from running on a newly upgraded controller in an AIO-DX configuration. This situation no longer exists because the bare metal nova-compute services do not exist after transitioning to containers. This flag is no longer needed. Removing all references to the disable_worker_services file. Change-Id: Ic9555a36890f613f440e97f9090b22ff5ec8fd82 Partial-Bug: #1838432 Signed-off-by: marvin <weifei.yu@intel.com>	2019-11-02 08:34:56 +08:00
Zuul	df4d161b04	Merge "Build layering, add layer build config file"	2019-10-30 12:19:09 +00:00
Zuul	f69bba0b34	Merge "openSUSE: Runtime Dependencies"	2019-10-23 03:39:15 +00:00
Zuul	bb2cafa72c	Merge "openSUSE: Open Build Service Artifacts"	2019-10-23 03:35:40 +00:00
Scott Little	fa609c5ee7	Build layering, add layer build config file Story: 2006166 Task: 37121 Change-Id: I57708587d763b2f87b78eec1878b17a68e2a36c8 Signed-off-by: Scott Little <scott.little@windriver.com>	2019-10-21 10:53:26 +08:00
Zuul	ba726d9f3e	Merge "Change shebang to help rpm runtime dependency detector."	2019-10-18 13:36:17 +00:00
Al Bailey	6260cb0b74	Turn off devstack as a zuul job devstack is failing, most likely because StarlingX uses postgres, and postgres was dropped in devstack by: `cf1c847191` I am not removing the devstack job declaration, or the devstack files because in the future StarlingX could convert from postgres to another DB backend, at which point we might want to revisit using devstack. Change-Id: I3adec4669d9181d71421f43905f86bf2e7e211c2 Partial-Bug: 1848557 Signed-off-by: Al Bailey <Al.Bailey@windriver.com>	2019-10-17 16:08:14 -05:00
Zuul	d2e00bf56b	Merge "Remove not needed shebangs in sm_client"	2019-10-16 14:11:28 +00:00
Abraham Arce	03b3f990ea	openSUSE: Runtime Dependencies Resolve runtime dependencies for the following service manager components: - sm-client - sm-tools - sm-api High availability OBS workspace has been moved to xe1gyq home project [0], adding repository Cloud_StarlingX_2.0_openSUSE_Leap_15.1 [1] in order to: - allow all successful packages to appear under xe1gyq repository [2]. - automatically include other flock dependencies (e.g. mtce-devel). Refer to the following OBS workspaces to verify all service management packages have built successfully under repository Cloud_StarlingX_2.0_openSUSE_Leap_15.1: - https://build.opensuse.org/package/show/home:xe1gyq/sm-db - https://build.opensuse.org/package/show/home:xe1gyq/sm-common - https://build.opensuse.org/package/show/home:xe1gyq/sm - https://build.opensuse.org/package/show/home:xe1gyq/sm-client - https://build.opensuse.org/package/show/home:xe1gyq/sm-tools - https://build.opensuse.org/package/show/home:xe1gyq/sm-api [0] https://build.opensuse.org/project/show/home:xe1gyq [1] https://build.opensuse.org/repositories/home:xe1gyq [2] https://download.opensuse.org/repositories/home:/xe1gyq/ Depends-On: https://review.opendev.org/#/c/679686 Story: 2006684 Task: 36968 Task: 36969 Task: 36970 Change-Id: I0a21652fff83b5da8acdfb0191df87165b88389e Signed-off-by: Abraham Arce <abraham.arce.moreno@intel.com>	2019-10-09 10:05:54 -05:00
Abraham Arce	f38de3f45f	openSUSE: Open Build Service Artifacts OBS is a generic system to build and distribute binary packages from sources [0], StarlingX OBS Project: - Cloud:StarlingX:2.0 [1] Build Service Management uses Open Build Service (OBS) with the following base artifacts under Service Management repository: - Specfiles - Changelogs - Rpmlintrcs The following components are included and successfully building (with their source OBS repository): - sm [2] - sm-common [3] - sm-db [4] - sm-api [5] - sm-client [6] - sm-tools [7] The following considerations are taken for Gerrit files: - Added %changelog directive to all specfiles The following considerations are taken for OBS _service files: - Added parameter "extract" to get spec, changes and rpmlintrc files. - All component version standardized to 1.0.0 [0] openbuildservice.org [1] https://build.opensuse.org/project/show/Cloud:StarlingX:2.0 [2] https://build.opensuse.org/package/show/home:xe1gyq:branches:Cloud:StarlingX:2.0/sm [3] https://build.opensuse.org/package/show/home:xe1gyq:branches:Cloud:StarlingX:2.0/sm-common [4] https://build.opensuse.org/package/show/home:xe1gyq:branches:Cloud:StarlingX:2.0/sm-db [5] https://build.opensuse.org/package/show/home:xe1gyq:branches:Cloud:StarlingX:2.0/sm-api [6] https://build.opensuse.org/package/show/home:xe1gyq:branches:Cloud:StarlingX:2.0/sm-client [7] https://build.opensuse.org/package/show/home:xe1gyq:branches:Cloud:StarlingX:2.0/sm-tools Story: 2006508 Task: 36495 Task: 36496 Task: 36497 Task: 36498 Task: 36534 Task: 36794 Change-Id: I06a7e132de4892b846d99977ff1bfc5bf240ade4 Co-authored-by: Erich Cordoba <erich.cordoba.malibran@intel.com> Signed-off-by: Abraham Arce <abraham.arce.moreno@intel.com>	2019-10-09 10:05:20 -05:00
Bin Qian	fc0828238f	Bug1845393 remove interface recovering state In the case of a switch recycle, the connected nic will go down and up but the communication will restore after the switch is up and running. This could take a few seconds (much longer than anticipated). This holds off the i/f state update to the peer. Also remove the batching interface failover state change. This is already handled in the failover fsm fail_pending state. Change-Id: Ia810927dbbc4b3821f7915e6a42bceeac43d9e46 Closes-Bug: 1845393 Signed-off-by: Bin Qian <bin.qian@windriver.com>	2019-10-07 09:04:08 -04:00
Erich Cordoba	cc401099d7	Change shebang to help rpm runtime dependency detector. Not having this change causes a linter error in opensuse. Change-Id: I52830fa64bdb5f1b5bb00c4052f3c047be728bb3 Signed-off-by: Erich Cordoba <erich.cordoba.malibran@intel.com>	2019-10-04 10:25:58 -05:00
Erich Cordoba	c8735e882a	Remove version from sm folder The sm component had the 1.0.0 version in the folder name, this change removes that version and updates the centos_pkg_dirs. Story: 2006623 Task: 36827 Depends-On: https://review.opendev.org/#/c/685128/ Change-Id: I6725d1f961c2a82275da5fabbff8e89a8dd6f245 Signed-off-by: Erich Cordoba <erich.cordoba.malibran@intel.com>	2019-09-26 14:11:31 -05:00
Erich Cordoba	54a16057ff	Remove version from sm-db folder The sm-db component had the 1.0.0 version in the folder name, this change removes that version and updates the centos_pkg_dirs. Story: 2006623 Task: 36829 Depends-On: https://review.opendev.org/#/c/685127 Change-Id: Ia6025337529f4f48a89c175bb524548d81bc993f Signed-off-by: Erich Cordoba <erich.cordoba.malibran@intel.com>	2019-09-26 14:08:15 -05:00
Erich Cordoba	44f220a3b8	Remove version from sm-common folder The sm-common component had the 1.0.0 version in the folder name, this change removes that version and updates the centos_pkg_dirs. Story: 2006623 Task: 36828 Change-Id: I0e998a3e2482bc06f3a91f9494a3e5d21faa28e7 Signed-off-by: Erich Cordoba <erich.cordoba.malibran@intel.com>	2019-09-26 12:00:43 -05:00
Zuul	1ec7911bfe	Merge "Fix LSB headers in sm"	2019-09-20 16:43:47 +00:00
Erich Cordoba	326ffc96f4	Fix LSB headers in sm The opensuse build system reported two linter issues regarding the LSB scripts in sm. The issues are: - For `sm`: Has `Should-Start` but no `Should-Stop`. - For `sm-shutdown`: LSB header not found. To fix these issues, the `Should-Stop` line was added in `sm` and the LSB header was added in the `sm.shutdown` script. In `sm.shutdown` the `Default-Start` and `Default-Stop` were set to the same values as in `sm`. `sm.shutdown` does nothing on the start stage so this change won't affect any functionality. Story: 2006508 Task: 36648 Change-Id: I4fac67a0a1c1abd82e47a3293aeae3036ee9722b Signed-off-by: Erich Cordoba <erich.cordoba.malibran@intel.com>	2019-09-19 18:17:20 -05:00
Zuul	f1bc749aa4	Merge "Ensure in AIO-SX, i/f down does not block node going enabled"	2019-09-19 18:01:55 +00:00
Bin Qian	2966c89c1c	Ensure in AIO-SX, i/f down does not block node going enabled AIO-SX by design does not have a peer, so it never needs to communicate with a potential peer before determining its role. For AIO-SX, even if all network interfaces are down, the node should still go enabled based on the situation of the node. Closes-Bug: 1844427 Change-Id: Iafe0a8209cdbd3f83514c07041856cf6b6824f9c Signed-off-by: Bin Qian <bin.qian@windriver.com>	2019-09-19 13:59:00 +00:00
Erich Cordoba	59321f0438	Remove not needed shebangs in sm_client The linters in the Opensuse build service are failing because sm_client has unneeded python shebangs in the code. This is because a python source code file that is not intended to be executed shouldn't include this shebang. Also, the linter fails as `/usr/bin/env python` is used causing that the dependency discovery tool fails. It is safe to use `/usr/bin/python` as currently we don't provide any other python version. Story: 2006508 Task: 36647 Change-Id: If3f83b9562414c3392515828a3c716a5bc23015d Signed-off-by: Erich Cordoba <erich.cordoba.malibran@intel.com>	2019-09-16 08:34:39 -05:00
Zuul	b5f7d066f3	Merge "Modify memory leak in some abnormal cases by adding free function."	2019-09-11 21:10:30 +00:00
Zuul	193e7c390a	Merge "Fix format-truncation warnings in sm"	2019-09-11 21:07:59 +00:00
Erich Cordoba	c8691c93d8	Fix format-truncation warnings in sm Building sm is not possible in opensuse because the code produces format-truncation warnings and the opensuse build system enforces the -Werror flag. The solution is to define the proper string lengths. - SM_INTERFACE_NAME_MAX_CHAR was set to IFNAMSIZ. - SM_SERVICE_ACTION_PLUGIN_EXIT_CODE_MAX_CHAR increased to 32. - SM_SERVICE_HEARTBEAT_ADDRESS_MAX_CHAR decreased to 108. These changes were updated in the database schema as well. Story: 2006523 Task: 36551 Change-Id: Icce1d912c147fc6caaf06cc93de3cddadbcb0720 Signed-off-by: Erich Cordoba <erich.cordoba.malibran@intel.com>	2019-09-11 12:34:54 -05:00
Zuul	ebba79607c	Merge "Extend timeout for the kubectl cmds in dbmon"	2019-09-10 13:07:35 +00:00
Zuul	986825c832	Merge "Fix IPv6 standby controller boot loop"	2019-09-10 01:49:48 +00:00
Bin Qian	7f52df37bd	Fix IPv6 standby controller boot loop IPv6 multicast should be sent to the interface that the socket binds to. Closes-Bug: 1842949 Change-Id: I14b6c5193c67a0ddd69e31d1044219c4e9fd6b94 Signed-off-by: Bin Qian <bin.qian@windriver.com>	2019-09-09 13:15:59 +00:00
Bin Qian	cfd3686d8e	Extend timeout for the kubectl cmds in dbmon In AIO-DX, during a swact, dbmon sees kubectl commands respond more slowly than expected. dbmon reports an error when a kubectl command does not respond within 5 seconds, and that 5-second timeout is too short. Extend the timeout to 10 seconds to avoid reporting unnecessary errors. Change-Id: Ie07c84e0a53c00ac78970bf6b06e6cf0b19479e1 Closes-Bug: 1837919 Signed-off-by: Bin Qian <bin.qian@windriver.com>	2019-09-05 15:34:28 -04:00
Scott Little	7775f12ddd	Config file changes to add 'stx-ocf-scripts ' after relocation from 'stx-upstream' Story: 2006166 Task: 35687 Depends-On: I665dc7fabbfffc798ad57843eb74dca16e7647a3 Change-Id: Icadc524bca7b6027caebf3923fce8260e17d9ef1 Signed-off-by: Scott Little <scott.little@windriver.com> Depends-On: I363681d077eb5724982ca0e9d7d4fa17ac7298dd Depends-On: I814d35ca3e55fbfb9e0a462f3f05ff2db6a9cca5	2019-09-04 15:59:21 -04:00
Don Penney	e544061f67	Update barbican OCF scripts to enhance logging This commit updates the barbican OCF scripts to address logging issues: - barbican-api is updated to set permissions on the logfile to restrict access - barbican-keystone-listener and barbican-worker are updated to log via syslog Depends-On: I31b29bb8ffff28cd329b383704b88cf73199bcec Change-Id: I814d35ca3e55fbfb9e0a462f3f05ff2db6a9cca5 Partial-Bug: 1836632 Signed-off-by: Don Penney <don.penney@windriver.com>	2019-09-04 15:43:48 -04:00
Bin Qian	9de51a38bc	Fix dbmon warning mariadb secret name changed to mariadb-dbadmin-password, update the ocf script accordingly. Depends-On: I777895497300cc605762db002958a778cd204e49 Change-Id: I31b29bb8ffff28cd329b383704b88cf73199bcec Closes-Bug: 1826891 Signed-off-by: Bin Qian <bin.qian@windriver.com>	2019-09-04 15:42:59 -04:00
Chris Friesen	80e84d3b0d	Add dbmon timeouts to handle swact scenario It turns out that when swacting we can end up with kubernetes going down for a while, causing kubectl commands to hang. Accordingly, let's add some timeouts to critical commands to limit how long they can hang for. Depends-On: I8d91dc13cb9a9adb7f7a7a95faadad4339ddb466 Change-Id: I777895497300cc605762db002958a778cd204e49 Story: 2004712 Task: 30410 Signed-off-by: Chris Friesen <chris.friesen@windriver.com>	2019-09-04 15:42:41 -04:00
Chris Friesen	2f44c61d11	Add dbmon ocf script for containerized mariadb on AIO-DX On a two-node system the openstack-helm chart for mariadb has issues. If you run with a single replica then your failover times are very long due to internal timeouts in kubernetes which prevent accessing the backing volume on the newly-active node. At the same time, you can't run a "garbd" pod the way we do on a full lab configuration because there is no third node to run it on. The only viable option we've found is to trigger something to explicitly tell the mariadb pod on the active node to bootstrap a new primary cluster if it loses quorum due to the other mariadb pod going away unexpectedly. Accordingly, this commit creates a new "dbmon" OCF script which behaves basically as follows: start -- return $OCF_SUCCESS stop -- return $OCF_NOT_RUNNING standby -- return $OCF_SUCCESS or $OCF_NOT_RUNNING depending on whether mariadb on this node is a member of the primary cluster active -- if mariadb on this node is not a member of the primary cluster then tell it to bootstrap a new primary cluster. Then check again and return $OCF_SUCCESS or $OCF_NOT_RUNNING depending on whether mariadb on this node is a member of the primary cluster monitor -- If mariadb on this node is a member of the primary cluster then return $OCF_RUNNING_MASTER on the active controller and $OCF_SUCCESS on the standby controller. If mariadb is not a member of the primary cluster return $OCF_NOT_RUNNING. There are a few complicating factors. If openstack application or mariadb chart not installed then treat it like being a member of the primary cluster. If the mariadb pod is still initializing treat it like not being a member of the primary cluster. If we're in a standard lab (with garbd running on a compute node) then don't actually tell mariadb to bootstrap a new primary cluster but just report whether it's a member of the primary cluster or not. Story: 2004712 Task: 30410 Depends-On: I2667d56a71b7d3881c03b6a5c1e5ed61d4f0b902 Change-Id: I8d91dc13cb9a9adb7f7a7a95faadad4339ddb466 Signed-off-by: Chris Friesen <chris.friesen@windriver.com>	2019-09-04 15:42:18 -04:00
Alex Kozyrev	0e9618e96c	OCF scripts to manage Barbican processes as an HA resource. Create OCF scripts for controlling Barbican processes lifecycle. There are three Barbican processes that need to be managed: barbican-api, barbican-keystone-listener and barbican-worker. Depends-On: I63a6fd3d112a98449ea22524bb2a83b5db8ce6d1 Change-Id: I2667d56a71b7d3881c03b6a5c1e5ed61d4f0b902 Story: 2003108 Task: 27700 Signed-off-by: Alex Kozyrev <alex.kozyrev@windriver.com>	2019-09-04 15:41:59 -04:00
Scott Little	490a99a667	Add OCF file for cinder-backup As part of switching to the upstream implementation of cinder B&R, we need an OCF script to manage the cinder-backup process. Depends-On: I6bec51c7401339f4c71f9558d73389d0c793093d Change-Id: I63a6fd3d112a98449ea22524bb2a83b5db8ce6d1 Story: 2003715 Task: 26375 Signed-off-by: Scott Little <scott.little@windriver.com>	2019-09-04 15:41:36 -04:00
Scott Little	8d887a37ae	Move StarlingX OCF scripts into a stand alone package. The following upstream projects did not have OCF scripts and these were created for StarlingX: aodh-api aodh-evaluator aodh-listener aodh-notifier ceilometer-agent-notification heat-api heat-api-cfn heat-api-cloudwatch ironic-api ironic-conductor magnum-api magnum-conductor murano-api murano-engine nova-conductor nova-placement-api nova-serialproxy panko-api Move these out of stx/git.openstack-ras and place them into a separate package within the openstack/stx-upstream repo. Depends-On: I080b6e893d5f6ccff04951879eed71e8ccbe0b52 Change-Id: I6bec51c7401339f4c71f9558d73389d0c793093d Story: 2003715 Task: 26375 Signed-off-by: Scott Little <scott.little@windriver.com>	2019-09-04 15:41:10 -04:00
Zuul	e9fa094a82	Merge "Fix memory leak bugs in sm-provision."	2019-08-29 19:29:48 +00:00
ZhangQig	8cbee33774	Fix memory leak bugs in sm-provision. Also, add a free call in sm_service_group_member_deprovision() and sm_service_deprovision(). Change-Id: If6009ce9df3b2a133610e7ce74f5006ecfc99803 Closes-Bug: #1837975 Signed-off-by: ZhangQing <zqhsh527@163.com>	2019-08-29 00:50:06 +00:00
Zuul	3d052b977c	Merge "Enhance timer system to avoid double deregister"	2019-08-22 13:35:06 +00:00
Andreas Jaeger	13e42caf7b	Use Zuul templates Use templates instead of individual jobs so that these can be changed in one place. Depends-On: https://review.opendev.org/677606 Change-Id: Ic70832ed4e4fba3343381f7ead611085c0849994	2019-08-21 12:54:55 +00:00
Al Bailey	d7dc7b1eaa	Fixing failing devstack zuul job The glance devstack plugin is not working for us, and is not needed for our devstack to work, so updating the zuul job to use the "min" devstack version that is used by other repos such as 'fault' and avoid setting up the glance devstack plugin altogether. Change-Id: Id16671961e10962530d2eaff28387b4b206e0a3b Partial-Bug: 1840292 Signed-off-by: Al Bailey <Al.Bailey@windriver.com>	2019-08-15 13:45:26 -05:00
Bin Qian	66e0404217	Enhance timer system to avoid double deregister The bug reported was because the dbmon service audit timer was overwritten accidentally, therefore no audit was performed so the dbmon service was not actually being audited. The major change is to enhance the timer system to use a globally unique timer id (not reused) to ensure a timer is not double deregistered by 2 different mechanisms (disarm/deregister). Change the timer id to a 64 bit integer to ensure the id never overflows. The above change eliminates the double-deregistration issue, which could accidentally deregister a new timer that reuses the same id. Also some cleaning to get rid of cases that could double deregister a timer (although it is no longer harmful as the above mentioned change is in place) Change-Id: I2603870d2eb2749d78456e406095ae543353963f Closes-Bug: 1837724 Signed-off-by: Bin Qian <bin.qian@windriver.com>	2019-08-13 13:53:39 +00:00
Zuul	2d04c4e428	Merge "dcdbsync for containerized openstack services - SM"	2019-08-12 16:24:09 +00:00
Andy Ning	b64d3a384b	dcdbsync for containerized openstack services - SM This update added the dcdbsync service for containerized openstack services into SM. Note that this second dcdbsync instance is also running on platform (not containerized) Story: 2004766 Task: 36099 Change-Id: If406127d26d6230771c0d44105da3a08facf3277 Signed-off-by: Andy Ning <andy.ning@windriver.com>	2019-08-06 16:36:01 -04:00
Kristine Bujold	4ef138fcf1	Collapse the glance filesystem into platform The filesystem /opt/cgcs is removed and the “helm_charts” and “keystone” folders now reside under /opt/platform. ls /opt/platform/ armada config helm nfv puppet sysinv ls /opt/cgcs/ helm_charts keystone Resources related to cgcs-drbd and /opt/cgcs are removed from puppet. SMS is no longer monitoring these resources. Tested in AIO-SX, AIO-DX and Standard hardware labs. Depends-On: https://review.opendev.org/674360 Partial-Bug: 1830142 Change-Id: I4be7a877efb89bb9e5c2b067bdc7e4259f2b0c0c Signed-off-by: Kristine Bujold <kristine.bujold@windriver.com>	2019-08-02 13:35:54 -04:00
YeHuiSheng	5696e1e56d	Modify memory leak in some abnormal cases by adding free function. Change-Id: I0076635d9c19d39637bf56a3404a4e4ad1f9506f Closes-Bug: #1837976 Signed-off-by: YeHuiSheng <hsye@fiberhome.com>	2019-07-26 15:20:58 +08:00
Stefan Dinescu	95367fd675	Increase SM timeout for ceph-mon Note: this only affects AIO-DX setups as that is the only kind of setup where ceph-mon is managed by SM In some edge-cases, during a swact, ceph-mon may take too long to be stopped on the active controller resulting in a failed swact. This change increases the timeout to account for those edge cases. Change-Id: I3ace73650e4fe9aafc84c82e2ffe048f2039305e Partial-bug: 1836075 Signed-off-by: Stefan Dinescu <stefan.dinescu@windriver.com> v2.0.0.rc0	2019-07-25 14:59:03 +03:00
Zuul	bb8d962771	Merge "Fix the error links for ha docs"	2019-07-23 13:57:33 +00:00
Zuul	026d5fd730	Merge "Add mtc-agent service dependency to fm-mgr"	2019-07-18 16:34:39 +00:00
Bin Qian	a729bbabc6	Add mtc-agent service dependency to fm-mgr Add mtc-agent service dependency to fm-mgr to ensure mtc-agent shuts down before fm-mgr does. An issue was found where, in rare cases, a swact occurs while mtc-agent is trying to clear an alarm after fm-mgr has been disabled, so the clear-alarm message is lost. The alarm therefore could not be cleared. Closes-bug 1829289 Change-Id: I39196d5f3ce764a14b4d1e0fb1a4f3344ddd6a1a Signed-off-by: Bin Qian <bin.qian@windriver.com>	2019-07-12 13:00:13 -04:00
Mingyuan Qi	85b0ec621b	Add floating ip for ironic network This commit adds ironic-ip service to sm_db for ironic floating ip. Story: 2004760 Task: 35689 Change-Id: I45039427cc5c96fd0639cf086d7e431244c4e1d9 Signed-off-by: Mingyuan Qi <mingyuan.qi@intel.com>	2019-07-09 10:08:55 +08:00
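
Commit 66e0404217 in the log above describes timer ids that are globally unique and never reused, so that a second deregister of an old id cannot remove a newer timer. A minimal sketch of that idea follows, assuming a hypothetical `TimerRegistry` class; none of these names come from the sm sources, and Python's unbounded integers stand in for the 64 bit id the commit mentions.

```python
import itertools


class TimerRegistry:
    """Hypothetical registry for illustration only (not the sm implementation)."""

    def __init__(self):
        # Ids come from a monotonically increasing counter and are never
        # recycled. The commit uses a 64 bit integer for this purpose;
        # Python integers do not overflow, so no width is needed here.
        self._next_id = itertools.count(1)
        self._timers = {}

    def register(self, name, callback):
        """Store a timer and return its fresh, never-reused id."""
        timer_id = next(self._next_id)
        self._timers[timer_id] = (name, callback)
        return timer_id

    def deregister(self, timer_id):
        """Remove a timer; return False if the id is stale or unknown."""
        return self._timers.pop(timer_id, None) is not None

    def active_names(self):
        return sorted(name for name, _ in self._timers.values())


registry = TimerRegistry()
audit_id = registry.register("dbmon-audit", lambda: None)
first_hit = registry.deregister(audit_id)    # first path, e.g. disarm
new_id = registry.register("other-audit", lambda: None)
stale_hit = registry.deregister(audit_id)    # second path with the old id
```

Because `new_id` can never equal `audit_id`, the second deregister finds nothing to remove and the newer timer stays registered. With recycled ids, the same call could have removed it.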


250 Commits