Add debug guide

Added new guide to developer resources and contributor pages.
Alphabetized the list of guides on the developer resources page.

Closes-Bug: 1873500
Change-Id: I2344a437ef85c1352ac9f6a5da0a8ac64ff78f1a
Signed-off-by: MCamp859 <maryx.camp@intel.com>
		| @@ -33,6 +33,7 @@ Additional StarlingX-specific resources are listed below. | |||||||
|  |  | ||||||
|    development_process |    development_process | ||||||
|    ../developer_resources/code-submission-guide |    ../developer_resources/code-submission-guide | ||||||
|  |    ../developer_resources/debug_issues | ||||||
|  |  | ||||||
| -------------------- | -------------------- | ||||||
| Additional resources | Additional resources | ||||||
|   | |||||||
							
								
								
									
doc/source/developer_resources/debug_issues.rst (131 lines, normal file)
									
								
							| @@ -0,0 +1,131 @@ | |||||||
|  | ====================== | ||||||
|  | Debug StarlingX Issues | ||||||
|  | ====================== | ||||||
|  |  | ||||||
|  | This guide contains some basic steps for debugging issues on StarlingX. | ||||||
|  |  | ||||||
|  | .. contents:: | ||||||
|  |    :local: | ||||||
|  |    :depth: 1 | ||||||
|  |  | ||||||
|  | ---------------- | ||||||
|  | Record the issue | ||||||
|  | ---------------- | ||||||
|  |  | ||||||
|  | Record information about the issue so it can be reproduced during debugging. The | ||||||
|  | items below describe some issue characteristics to capture. | ||||||
|  |  | ||||||
|  | *   Deployment issue type, such as bootstrap failure, provisioning failure, or | ||||||
|  |     functional failure. | ||||||
|  |  | ||||||
|  | *   Check the StarlingX version with the command: | ||||||
|  |     :: | ||||||
|  |  | ||||||
|  |       cat /etc/build.info | ||||||
|  |  | ||||||
|  |  | ||||||
|  | *   Check the StarlingX deployment configuration (for example, Simplex, Duplex, | ||||||
|  |     or Multi-node) by viewing the platform configuration file: | ||||||
|  |     :: | ||||||
|  |  | ||||||
|  |       cat /etc/platform/platform.conf | ||||||
|  |  | ||||||
|  | *   Server type, such as bare metal server(s) or VMs. | ||||||
|  |  | ||||||
|  | *   Hardware device types and characteristics, such as NICs, PCI cards, number | ||||||
|  |     of hard disks, and RAM size. | ||||||
|  |  | ||||||
|  | *   Other aspects of the issue, including the steps to reproduce it, the | ||||||
|  |     expected results, and the actual results. | ||||||
|  |  | ||||||
|  | *   Whether the issue can be reproduced consistently or only occasionally. | ||||||
|  |  | ||||||
|  | *   Gather log files and configuration files using the ``collect`` command. | ||||||
|  |  | ||||||
|  |  | ||||||
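The version and configuration checks above can be scripted when preparing a
report. The following is a minimal sketch and not part of the StarlingX
tooling: ``conf_value`` and ``summarize_issue_env`` are hypothetical helper
names, and the keys ``SW_VERSION``, ``BUILD_ID``, ``system_mode``,
``system_type``, and ``nodetype`` are assumptions about what the two files
contain on a typical installation.

```shell
#!/bin/sh
# Hypothetical helpers; not shipped with StarlingX.

# Print the value of a KEY=VALUE entry from a file.
#   $1 = file path, $2 = key name
conf_value() {
    sed -n "s/^$2=//p" "$1" | head -n 1
}

# Print the fields that are usually worth pasting into a bug report.
# The key names are assumptions; a key missing from a file prints an
# empty value instead of failing.
summarize_issue_env() {
    build_info=${1:-/etc/build.info}
    platform_conf=${2:-/etc/platform/platform.conf}
    for key in SW_VERSION BUILD_ID; do
        echo "$key=$(conf_value "$build_info" "$key")"
    done
    for key in system_mode system_type nodetype; do
        echo "$key=$(conf_value "$platform_conf" "$key")"
    done
}
```

Calling ``summarize_issue_env`` with no arguments reads the two default paths
named in this guide. Passing two file paths instead makes it possible to run
the helper against files copied from another system.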
|  | --------------------- | ||||||
|  | Check status and logs | ||||||
|  | --------------------- | ||||||
|  |  | ||||||
|  | *   Log in to the active controller. | ||||||
|  |  | ||||||
|  | *   Check services using the ``sm-dump`` command: | ||||||
|  |     :: | ||||||
|  |  | ||||||
|  |       sudo sm-dump | ||||||
|  |  | ||||||
|  | *   Check services using the ``systemctl`` command. | ||||||
|  |  | ||||||
|  | *   Apply the platform environment for ``sysadmin`` using: | ||||||
|  |     :: | ||||||
|  |  | ||||||
|  |       source /etc/platform/openrc | ||||||
|  |  | ||||||
|  | *   Check alarms from Fault-Manager using: | ||||||
|  |     :: | ||||||
|  |  | ||||||
|  |       fm alarm-list --uuid | ||||||
|  |       fm alarm-show <uuid> | ||||||
|  |  | ||||||
|  | *   Search for errors in ``/var/log``. | ||||||
|  |  | ||||||
|  |     *   You **must** check ``/var/log/sysinv.log`` for errors. | ||||||
|  |     *   You can get hints from ``sysinv.log`` for many deployment failures. | ||||||
|  |     *   Look into other log files based on the functional area. | ||||||
|  |  | ||||||
|  | *   If a functional area log file includes errors, check the associated | ||||||
|  |     configuration file, which is typically located under the ``/etc/`` | ||||||
|  |     directory. | ||||||
|  |  | ||||||
|  | *   You may need to enable the ``debug`` option in that configuration file to | ||||||
|  |     get more detailed log output. | ||||||
|  |  | ||||||
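The log search described above can be narrowed to the most recent error lines.
This is a minimal sketch: ``recent_errors`` is a hypothetical helper name, and
matching the case-insensitive string ``error`` is an assumption about how
failures appear in the logs, so adjust the pattern to the log you are reading.

```shell
#!/bin/sh
# Hypothetical helper; not shipped with StarlingX.

# Show the last N lines of a log file that mention "error".
#   $1 = log file (default: /var/log/sysinv.log)
#   $2 = maximum number of matching lines to print (default: 20)
recent_errors() {
    log_file=${1:-/var/log/sysinv.log}
    max_lines=${2:-20}
    # -n prefixes each match with its line number in the log file,
    # -i makes the match case-insensitive.
    grep -n -i 'error' "$log_file" | tail -n "$max_lines"
}
```

For example, ``recent_errors /var/log/sysinv.log 50`` prints up to 50 matches.
Reading files under ``/var/log`` may require elevated privileges on some
systems.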
|  | ---------------- | ||||||
|  | Debug and triage | ||||||
|  | ---------------- | ||||||
|  |  | ||||||
|  | *   Check the Kubernetes status of nodes, pods/jobs, endpoints, services, | ||||||
|  |     secrets, and configmaps. | ||||||
|  |  | ||||||
|  | *   Check the two major namespaces: ``kube-system`` and ``openstack``. | ||||||
|  |  | ||||||
|  | *   If issues occur inside containerized components, enter the affected | ||||||
|  |     container using the ``kubectl exec`` command. | ||||||
|  |  | ||||||
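The Kubernetes checks above can be run in one pass per namespace. This is a
minimal sketch: ``k8s_overview`` and the ``KUBECTL`` variable are hypothetical
names introduced for illustration, and the sketch assumes ``kubectl`` is
already configured to reach the cluster.

```shell
#!/bin/sh
# Hypothetical helper; not shipped with StarlingX.

# Command used to talk to the cluster. Override it to use a different
# binary, or to do a dry run.
KUBECTL=${KUBECTL:-kubectl}

# List the namespaced resource kinds named in this guide.
#   $1 = namespace, for example kube-system or openstack
k8s_overview() {
    ns=$1
    for kind in pods jobs endpoints services secrets configmaps; do
        echo "== $kind in namespace $ns =="
        "$KUBECTL" get "$kind" -n "$ns"
    done
}
```

Run ``k8s_overview kube-system`` and ``k8s_overview openstack`` to cover both
namespaces. Nodes are not namespaced, so check them separately with
``kubectl get nodes``. To enter a container, a command of the form
``kubectl exec -it <pod> -n <namespace> -- /bin/sh`` is typical, assuming the
container image provides a shell.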
|  | --------------- | ||||||
|  | Implement fixes | ||||||
|  | --------------- | ||||||
|  |  | ||||||
|  | *   You can try to resolve the issue by manually making some online | ||||||
|  |     changes without rebooting Linux or even re-deploying StarlingX. For | ||||||
|  |     example, you can modify system config files or the StarlingX | ||||||
|  |     config/database. You can make the changes and restart the corresponding | ||||||
|  |     services using the ``systemctl`` command or the StarlingX ``sm`` (service | ||||||
|  |     management) command. | ||||||
|  |  | ||||||
|  | *   If the fixes must be applied to specific nodes (controller, worker, storage), | ||||||
|  |     you can temporarily **lock** the node, make changes using StarlingX | ||||||
|  |     commands, and then **unlock** the node so the changes take effect. | ||||||
|  |  | ||||||
|  | *   If the changes must be made in C/C++/Go code, you can: | ||||||
|  |  | ||||||
|  |     *   Make the changes in your *development workspace* with the StarlingX | ||||||
|  |         codebase. | ||||||
|  |     *   Build the related packages using ``build-pkgs <package_name>``. | ||||||
|  |     *   Create and apply the patch using the :doc:`starlingx_patching` guide. | ||||||
|  |     *   Restart the services using the ``systemctl`` command or the StarlingX | ||||||
|  |         ``sm`` (service management) command. | ||||||
|  |  | ||||||
|  | -------------------- | ||||||
|  | Additional resources | ||||||
|  | -------------------- | ||||||
|  |  | ||||||
|  | *   Review the `StarlingX Discuss list <http://lists.starlingx.io/pipermail/starlingx-discuss/>`_ | ||||||
|  |     for similar questions and workarounds from the community. | ||||||
|  |  | ||||||
|  | *   Check the `StarlingX Launchpad <https://launchpad.net/starlingx>`_ for | ||||||
|  |     similar issues and potential workarounds. | ||||||
|  |  | ||||||
|  | *   Open a new `StarlingX Launchpad <https://launchpad.net/starlingx>`_ item to | ||||||
|  |     report a bug. | ||||||
|  |  | ||||||
|  |  | ||||||
| @@ -10,16 +10,17 @@ Developer Resources | |||||||
|  |  | ||||||
|    build_guide |    build_guide | ||||||
|    Layered_Build |    Layered_Build | ||||||
|  |    backup_restore | ||||||
|  |    build_docker_image | ||||||
|    code-submission-guide |    code-submission-guide | ||||||
|  |    debug_issues | ||||||
|  |    stx_tsn_in_kata | ||||||
|  |    mirror_repo | ||||||
|  |    move_to_new_openstack_version_in_starlingx | ||||||
|    navigate_source_code |    navigate_source_code | ||||||
|  |    Project Specifications <https://docs.starlingx.io/specs/> | ||||||
|    architecture_docs |    architecture_docs | ||||||
|    starlingx_patching |    starlingx_patching | ||||||
|    build_docker_image |  | ||||||
|    move_to_new_openstack_version_in_starlingx |  | ||||||
|    mirror_repo |  | ||||||
|    backup_restore |  | ||||||
|    Project Specifications <https://docs.starlingx.io/specs/> |  | ||||||
|    stx_ipv6_deployment |    stx_ipv6_deployment | ||||||
|    stx_tsn_in_kata |  | ||||||
|  |  | ||||||
|  |  | ||||||
|   | |||||||