By default, when instances are created they are spread across the cluster: the first instance lands on the first compute node, the second on the second compute node, and so on. This ensures that all nodes are utilized evenly.
Suppose H1 and H2 each have 8 CPUs.
Suppose you have two flavors: `my-4cpu-flavor` (4 vCPUs) and `my-6cpu-flavor` (6 vCPUs).

In a default OpenStack setup (spreading):

| Action | Host | vCPUs | Free vCPUs in cluster |
|---|---|---|---|
| Add hypervisor H1 | H1 | 8 | 8 |
| Add hypervisor H2 | H2 | 8 | 16 |
| Add instance: `my-4cpu-flavor` | H1 | 4 | 12 |
| Add instance: `my-4cpu-flavor` | H2 | 4 | 8 |
| Add instance: `my-6cpu-flavor` | host not found | 6 | 8 |
As you can see, the dashboard shows 8 vCPUs remaining, but because of spreading no single host has the 6 free vCPUs the flavor needs, so the new instance cannot be spawned.
I have seen the same problem on clusters with 50+ nodes: the dashboard shows 200 vCPUs free, but because instances are spread evenly across all nodes, no single node has enough free CPU for a flavor that needs 6 vCPUs, and the launch fails.
This also makes capacity planning tricky, because you have to keep adding new machines to the cluster just to be able to launch instances with a higher vCPU count.
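The arithmetic behind this can be illustrated with a toy simulation (this is not nova code, just a sketch of the placement check): total free vCPUs say nothing about whether any single host can fit a flavor.

```python
# Toy illustration: why "free vCPUs" in the dashboard does not
# guarantee that a flavor will fit anywhere in the cluster.
def can_schedule(free_per_host, flavor_vcpus):
    """An instance fits only if at least one host has enough free vCPUs."""
    return any(free >= flavor_vcpus for free in free_per_host)

# Spreading: the two 4-vCPU instances landed on different hosts.
spread = [4, 4]                  # free vCPUs on H1, H2
print(sum(spread))               # -> 8 free in total...
print(can_schedule(spread, 6))   # -> False: no host has 6 free

# Packing: both instances landed on H1 instead.
packed = [0, 8]
print(sum(packed))               # -> 8, same total free
print(can_schedule(packed, 6))   # -> True: H2 has 8 free
```

The totals are identical in both layouts; only the distribution decides whether the 6-vCPU flavor is schedulable.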
To change this behavior, you need to edit your nova.conf and enable this setting:
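The exact option name depends on your nova release, but with the filter scheduler the usual approach is to make the RAM/CPU weighers negative, so hosts with *less* free capacity score higher and the scheduler stacks instances instead of spreading them. A sketch, assuming the `[filter_scheduler]` section of nova.conf:

```ini
[filter_scheduler]
# Negative weights invert the default spreading behavior:
# hosts with less free RAM/CPU are preferred, so the scheduler
# fills one host before moving on to the next (stacking).
ram_weight_multiplier = -1.0
cpu_weight_multiplier = -1.0
```

Check the scheduler documentation for your release, as older releases keep these options in `[DEFAULT]` and may not have the CPU weigher.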
With this setting, the nova scheduler fills the first available node as much as possible before moving on to the next one. In the example above, both `my-4cpu-flavor` instances land on H1, and the `my-6cpu-flavor` instance then fits on H2.
For customers, you can point them to this document, where they can set up anti-affinity rules to ensure that their instances do not land on the same node.
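With the standard OpenStack CLI, anti-affinity is expressed through server groups: members of a group with the `anti-affinity` policy are scheduled onto different compute nodes. A sketch (group, image, and flavor names here are placeholders):

```shell
# Create a server group with an anti-affinity policy.
openstack server group create --policy anti-affinity my-ha-group

# Launch an instance into the group; each member is placed on a
# different compute node. Replace image/flavor names with your own.
openstack server create --image my-image --flavor my-4cpu-flavor \
    --hint group=$(openstack server group show my-ha-group -f value -c id) \
    web-1
```

Note that with a strict anti-affinity policy, launches fail once there are more group members than compute nodes, so this pairs naturally with the capacity-planning point above.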