==========================
2020-01-29 - Albert Braden
==========================

Here are the scaling issues I've encountered recently at Synopsys, in reverse chronological order:

Thursday 12/19/2019: ``openstack server list --all-projects`` does not return all VMs.
--------------------------------------------------------------------------------------

In ``/etc/nova/nova.conf`` we have the default: ``# max_limit = 1000``

The recordset cleanup script depends on correct output from ``openstack server list --all-projects``.

Fix: Increased ``max_limit`` to 2000.

The recordset cleanup script will run ``openstack server list --all-projects | wc -l``, compare the output to ``max_limit``, and refuse to run if ``max_limit`` is too low. If this happens, increase ``max_limit`` so that it is greater than the number of VMs in the cluster.

As time permits we need to look into paging results: https://docs.openstack.org/api-guide/compute/paginated_collections.html

Friday 12/13/2019: ARP table got full on pod2 controllers
---------------------------------------------------------

https://www.cyberciti.biz/faq/centos-redhat-debian-linux-neighbor-table-overflow/

Fix: Increase sysctl values:

.. code-block:: console

    --- a/roles/openstack/controller/neutron/tasks/main.yml
    +++ b/roles/openstack/controller/neutron/tasks/main.yml
    @@ -243,6 +243,9 @@
       with_items:
         - { name: 'net.bridge.bridge-nf-call-iptables', value: '1' }
         - { name: 'net.bridge.bridge-nf-call-ip6tables', value: '1' }
    +    - { name: 'net.ipv4.neigh.default.gc_thresh3', value: '4096' }
    +    - { name: 'net.ipv4.neigh.default.gc_thresh2', value: '2048' }
    +    - { name: 'net.ipv4.neigh.default.gc_thresh1', value: '1024' }

12/10/2019: RPC workers were overloaded
---------------------------------------

http://lists.openstack.org/pipermail/openstack-discuss/2019-December/011465.html

Fix: Increase the number of RPC workers. Modify ``/etc/neutron/neutron.conf`` on the controllers:

.. code-block:: console

    148c148
    < #rpc_workers = 1
    ---
    > rpc_workers = 8

October 2019: Rootwrap
----------------------

Neutron was timing out because rootwrap was taking too long to spawn.

Fix: Run the rootwrap daemon.

Add this line to ``/etc/neutron/neutron.conf`` on the controllers:

``root_helper_daemon = "sudo /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf"``

Add this line to ``/etc/sudoers.d/neutron_sudoers`` on the controllers:

``neutron ALL = (root) NOPASSWD: /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf``
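
A note on the 12/19/2019 entry: the check that makes the recordset cleanup script refuse to run could look roughly like the sketch below. This is not the actual script. The function names ``read_max_limit`` and ``safe_to_run`` are hypothetical, the ``[api]`` section name is an assumption about where ``max_limit`` lives in ``nova.conf``, and the VM count is assumed to have been obtained separately (for example from the ``openstack server list --all-projects`` output, minus any table border and header lines).

.. code-block:: python

    import configparser

    # Default noted in the 12/19/2019 entry, used when max_limit is commented out.
    DEFAULT_MAX_LIMIT = 1000


    def read_max_limit(conf_text, section="api"):
        """Return max_limit from nova.conf text, or the default when it is unset.

        The section name is an assumption; adjust it to match the deployment.
        """
        # strict=False tolerates duplicate keys; interpolation=None leaves
        # any '%' characters in other options untouched.
        parser = configparser.ConfigParser(strict=False, interpolation=None)
        parser.read_string(conf_text)
        return parser.getint(section, "max_limit", fallback=DEFAULT_MAX_LIMIT)


    def safe_to_run(vm_count, max_limit):
        """True only if the server listing cannot have been cut off at max_limit."""
        # A count equal to max_limit may already be a truncated listing,
        # so the limit has to be strictly greater than the VM count.
        return vm_count < max_limit


    if __name__ == "__main__":
        limit = read_max_limit("[api]\nmax_limit = 2000\n")
        print(safe_to_run(1500, limit))

The strict comparison reflects the advice in that entry that ``max_limit`` be greater than the number of VMs in the cluster. A listing that returns exactly ``max_limit`` rows cannot be distinguished from a truncated one.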