.. raw:: pdf

   PageBreak

HowTo Notes
===========

.. index:: HowTo: Create an XFS disk partition

.. _create-the-XFS-partition:

HowTo: Create an XFS disk partition
-----------------------------------

In most cases, Fuel creates the XFS partition for you. If for some reason you
need to create it yourself, use this procedure:

.. note:: Replace ``/dev/sdb`` with the appropriate block device you wish to
   configure.
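
If you are not sure which block device to use, ``lsblk`` gives a quick
overview of the disks and any existing partitions (a generic Linux check,
not part of the original procedure)::

  lsblk -o NAME,SIZE,TYPE,MOUNTPOINT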

1. Create the partition itself::

     fdisk /dev/sdb
       n (for new)
       p (for partition)
       <enter> (to accept the defaults)
       <enter> (to accept the defaults)
       w (to save changes)

2. Initialize the XFS partition::

     mkfs.xfs -i size=1024 -f /dev/sdb1

3. For a standard Swift install, all data drives are mounted directly under
   ``/srv/node``, so first create the mount point::

     mkdir -p /srv/node/sdb1

4. Finally, add the new partition to ``/etc/fstab`` so it mounts
   automatically, then mount all current partitions::

     echo "/dev/sdb1 /srv/node/sdb1 xfs noatime,nodiratime,nobarrier,logbufs=8 0 0" >> /etc/fstab
     mount -a
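
As a quick sanity check (not part of the original procedure), you can confirm
that the new partition is mounted with the expected filesystem and inode size;
the paths below assume the ``/dev/sdb1`` example used above::

  df -hT /srv/node/sdb1          # filesystem type should be reported as xfs
  mount | grep /srv/node/sdb1    # should show the noatime,nodiratime options
  xfs_info /srv/node/sdb1        # prints the XFS geometry, including isize=1024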

.. index:: HowTo: Redeploy a node from scratch

.. _Redeploy_node_from_scratch:

HowTo: Redeploy a node from scratch
-----------------------------------

Compute and Cinder nodes can be redeployed in both multinode and multinode HA
configurations. However, controllers cannot be redeployed without completely
redeploying the environment. To redeploy a Compute or Cinder node, follow
these steps:

1. Remove the node from your environment in the Fuel UI.
2. Deploy Changes.
3. Wait for the host to become available as an unallocated node.
4. Add the node to the environment with the same role as before.
5. Deploy Changes.

.. _Enable_Disable_Galera_autorebuild:

.. index:: HowTo: Galera Cluster Autorebuild

HowTo: Enable/Disable Galera Cluster Autorebuild Mechanism
----------------------------------------------------------

By default, Fuel reassembles the Galera cluster automatically, without the
need for any user interaction.

To disable the `autorebuild feature`, run::

  crm_attribute -t crm_config --name mysqlprimaryinit --delete

To re-enable the `autorebuild feature`, run::

  crm_attribute -t crm_config --name mysqlprimaryinit --update done
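
To check whether the attribute is currently set, and thus whether autorebuild
is enabled, you can query it (a supplementary check, not part of the original
procedure)::

  crm_attribute -t crm_config --name mysqlprimaryinit --query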

.. index:: HowTo: Troubleshoot Corosync/Pacemaker

How To Troubleshoot Corosync/Pacemaker
--------------------------------------

Pacemaker and Corosync come with several CLI utilities that can help you
troubleshoot and understand what is going on.

crm - Cluster Resource Manager
++++++++++++++++++++++++++++++

This is the main Pacemaker utility; it shows you the state of the Pacemaker
cluster. The commands most often used to check whether your cluster is
consistent are:

**crm status**

This command shows the main information about the Pacemaker cluster and the
state of the resources it manages::

  crm(live)# status
  ============
  Last updated: Tue May 14 15:13:47 2013
  Last change: Mon May 13 18:36:56 2013 via cibadmin on fuel-controller-01
  Stack: openais
  Current DC: fuel-controller-01 - partition with quorum
  Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
  5 Nodes configured, 5 expected votes
  3 Resources configured.
  ============

  Online: [ fuel-controller-01 fuel-controller-02 fuel-controller-03
            fuel-controller-04 fuel-controller-05 ]

  p_quantum-plugin-openvswitch-agent (ocf::pacemaker:quantum-agent-ovs): Started fuel-controller-01
  p_quantum-dhcp-agent (ocf::pacemaker:quantum-agent-dhcp): Started fuel-controller-01
  p_quantum-l3-agent (ocf::pacemaker:quantum-agent-l3): Started fuel-controller-01
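
The same summary is also available without entering the interactive ``crm``
shell, which is handy for quick checks or scripting; both forms below are
standard Pacemaker commands::

  crm status
  crm_mon -1   # print the cluster status once and exit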

**crm(live)# resource**

Here you can enter resource-specific commands::

  crm(live)resource# status

  p_quantum-plugin-openvswitch-agent (ocf::pacemaker:quantum-agent-ovs) Started
  p_quantum-dhcp-agent (ocf::pacemaker:quantum-agent-dhcp) Started
  p_quantum-l3-agent (ocf::pacemaker:quantum-agent-l3) Started

**crm(live)resource# start|restart|stop|cleanup <resource_name>**

These commands allow you to start, restart, stop, and clean up the named
resource, respectively.
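
For example, to restart the L3 agent resource shown in the listing above, you
could run the following from the resource submenu (the resource name is taken
from the sample output; adjust it to your own cluster)::

  crm(live)resource# restart p_quantum-l3-agent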

**cleanup**

The Pacemaker ``cleanup`` command resets a resource's state on a node if the
resource is currently in a failed state, or if some unexpected operation
(such as a side effect of a SysVInit operation on the resource) has left it
in an inconsistent state. After the cleanup, Pacemaker manages the resource
by itself, deciding which node will run it.

Example::

  3 Nodes configured, 3 expected votes
  3 Resources configured.
  ============

  3 Nodes configured, 3 expected votes
  16 Resources configured.

  Online: [ controller-01 controller-02 controller-03 ]

  vip__management_old (ocf::heartbeat:IPaddr2): Started controller-01
  vip__public_old (ocf::heartbeat:IPaddr2): Started controller-02
  Clone Set: clone_p_haproxy [p_haproxy]
      Started: [ controller-01 controller-02 controller-03 ]
  Clone Set: clone_p_mysql [p_mysql]
      Started: [ controller-01 controller-02 controller-03 ]
  Clone Set: clone_p_quantum-openvswitch-agent [p_quantum-openvswitch-agent]
      Started: [ controller-01 controller-02 controller-03 ]
  Clone Set: clone_p_quantum-metadata-agent [p_quantum-metadata-agent]
      Started: [ controller-01 controller-02 controller-03 ]
  p_quantum-dhcp-agent (ocf::mirantis:quantum-agent-dhcp): Started controller-01
  p_quantum-l3-agent (ocf::mirantis:quantum-agent-l3): Started controller-03

In this case, residual OpenStack agent processes had been started by
Pacemaker during a network failure and the resulting cluster partitioning.
After connectivity was restored, Pacemaker saw these duplicate resources
running on different nodes. You can let it clean up this situation
automatically or, if you do not want to wait, clean them up manually.
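
For example, to clean up one of the duplicated agents by hand (the resource
name is taken from the listing above), you could run::

  crm resource cleanup p_quantum-dhcp-agent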

.. seealso::

   crm interactive help and documentation resources for Pacemaker
   (e.g. http://doc.opensuse.org/products/draft/SLE-HA/SLE-ha-guide_sd_draft/cha.ha.manual_config.html).

In some network scenarios the cluster can be split into several parts, with
``crm status`` on each part showing something like this::

  On ctrl1
  ============
  ….
  Online: [ ctrl1 ]

  On ctrl2
  ============
  ….
  Online: [ ctrl2 ]

  On ctrl3
  ============
  ….
  Online: [ ctrl3 ]

You can troubleshoot this by checking Corosync connectivity between the
nodes. There are several points to verify:

1) Multicast should be enabled in the network, and the IP address configured
   as the multicast address should not be filtered. The mcast port, a single
   UDP port, should be accepted on the management network among all
   controllers.

2) Corosync should start after the network interfaces are activated.

3) `bindnetaddr` should be located in the management network, or at least in
   the same multicast-reachable segment.
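
To verify the first point, you can test multicast reachability between the
controllers, for example with the ``omping`` utility if it is installed (the
multicast address and port below are placeholders; use the values from your
own ``corosync.conf``)::

  # run this on every controller at the same time
  omping -m 239.1.1.2 -p 5405 ctrl1 ctrl2 ctrl3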

You can check the current multicast group membership in the output of
``ip maddr show``:

.. code-block:: none
   :emphasize-lines: 1,8

   5: br-mgmt
       link  33:33:00:00:00:01
       link  01:00:5e:00:00:01
       link  33:33:ff:a3:e2:57
       link  01:00:5e:01:01:02
       link  01:00:5e:00:00:12
       inet  224.0.0.18
       inet  239.1.1.2
       inet  224.0.0.1
       inet6 ff02::1:ffa3:e257
       inet6 ff02::1
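
For reference, the multicast address, port, and `bindnetaddr` mentioned above
are set in the ``totem``/``interface`` section of
``/etc/corosync/corosync.conf``. The sketch below only illustrates where they
live; the addresses are placeholders chosen to match the sample outputs in
this section, not values taken from a live Fuel deployment::

  totem {
    version: 2
    interface {
      ringnumber: 0
      bindnetaddr: 10.107.0.0    # must be in the management network
      mcastaddr:   239.1.1.2     # must not be filtered anywhere on the path
      mcastport:   5405          # single UDP port open among all controllers
    }
  }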

**corosync-objctl**

This command is used to get and set runtime Corosync configuration values,
including the status of the Corosync redundant ring members::

  runtime.totem.pg.mrp.srp.members.134245130.ip=r(0) ip(10.107.0.8)
  runtime.totem.pg.mrp.srp.members.134245130.join_count=1
  ...
  runtime.totem.pg.mrp.srp.members.201353994.ip=r(0) ip(10.107.0.12)
  runtime.totem.pg.mrp.srp.members.201353994.join_count=1
  runtime.totem.pg.mrp.srp.members.201353994.status=joined

If the IP of a node is 127.0.0.1, it means that Corosync started when only
the loopback interface was available and bound to it.

If there is only one IP in the members list, there is a Corosync
connectivity issue, because the node does not see the other ones. The same
holds when the members list is incomplete.
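
To print only the member entries shown above instead of the full runtime
tree, you can pass the object prefix to ``corosync-objctl`` (this assumes
Corosync 1.x, where the utility is available)::

  corosync-objctl runtime.totem.pg.mrp.srp.members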

.. index:: HowTo: Smoke Test HA

How To Smoke Test HA
--------------------

To test whether Neutron HA is working, simply shut down the node hosting the
Neutron agents, either gracefully or with a hard reset.
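
A graceful shutdown is an ordinary ``shutdown -h now`` on that controller; to
simulate a hard failure you can, for example, trigger an immediate reboot
through the sysrq facility (a generic Linux technique, assuming sysrq is
enabled on the node)::

  echo b > /proc/sysrq-trigger    # immediate reboot, no clean shutdown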

You should see the agents start on one of the remaining controllers::

  # crm status

  Online: [ fuel-controller-02 fuel-controller-03 fuel-controller-04 fuel-controller-05 ]
  OFFLINE: [ fuel-controller-01 ]

  p_quantum-plugin-openvswitch-agent (ocf::pacemaker:quantum-agent-ovs): Started fuel-controller-02
  p_quantum-dhcp-agent (ocf::pacemaker:quantum-agent-dhcp): Started fuel-controller-02
  p_quantum-l3-agent (ocf::pacemaker:quantum-agent-l3): Started fuel-controller-02

and see the corresponding Neutron interfaces on the new Neutron node::

  # ip link show

  11: tap7b4ded0e-cb: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc
  12: qr-829736b7-34: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc
  13: qg-814b8c84-8f: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc

You can also check the output of ``ovs-vsctl show`` to see that all the
corresponding tunnels, bridges, and interfaces are created and connected
properly::

  ce754a73-a1c4-4099-b51b-8b839f10291c
      Bridge br-mgmt
          Port br-mgmt
              Interface br-mgmt
                  type: internal
          Port "eth1"
              Interface "eth1"
      Bridge br-ex
          Port br-ex
              Interface br-ex
                  type: internal
          Port "eth0"
              Interface "eth0"
          Port "qg-814b8c84-8f"
              Interface "qg-814b8c84-8f"
                  type: internal
      Bridge br-int
          Port patch-tun
              Interface patch-tun
                  type: patch
                  options: {peer=patch-int}
          Port br-int
              Interface br-int
                  type: internal
          Port "tap7b4ded0e-cb"
              tag: 1
              Interface "tap7b4ded0e-cb"
                  type: internal
          Port "qr-829736b7-34"
              tag: 1
              Interface "qr-829736b7-34"
                  type: internal
      Bridge br-tun
          Port "gre-1"
              Interface "gre-1"
                  type: gre
                  options: {in_key=flow, out_key=flow, remote_ip="10.107.0.8"}
          Port "gre-2"
              Interface "gre-2"
                  type: gre
                  options: {in_key=flow, out_key=flow, remote_ip="10.107.0.5"}
          Port patch-int
              Interface patch-int
                  type: patch
                  options: {peer=patch-tun}
          Port "gre-3"
              Interface "gre-3"
                  type: gre
                  options: {in_key=flow, out_key=flow, remote_ip="10.107.0.6"}
          Port "gre-4"
              Interface "gre-4"
                  type: gre
                  options: {in_key=flow, out_key=flow, remote_ip="10.107.0.7"}
          Port br-tun
              Interface br-tun
                  type: internal
      ovs_version: "1.4.0+build0"
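
If you only want to confirm that the GRE tunnels to the other nodes exist,
listing the ports of the tunnel bridge shown above is a quicker check::

  ovs-vsctl list-ports br-tun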