OpenStack Private Cloud Architecture
This architecture deploys a high-availability OpenStack 2024.2 private cloud with a 3-node control plane, spine-leaf networking, and Ceph storage.
Network Configuration
Configure four logically separated networks with VLANs and MTU 9000 for tunnel and storage traffic.
| Network | Purpose | VLAN/Subnet | MTU |
|---|---|---|---|
| Management | API, DB, Message Queue, SSH | VLAN 10 / 172.29.236.0/22 | 1500 |
| Tunnel/Overlay | VXLAN/Geneve (Nova/Neutron) | VLAN 20 / 172.29.240.0/22 | 9000 |
| Storage | Ceph replication, iSCSI | VLAN 30 / 172.29.244.0/22 | 9000 |
| Provider/External | Tenant external access, Floating IPs | VLAN 40 / Public Range | 1500 |
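As a quick sanity check on the addressing plan, the following minimal Python sketch confirms that the three internal subnets from the table do not overlap and reports how many usable addresses each /22 provides. The dictionary keys are illustrative names, and the provider range is left out because it is site-specific.

```python
import ipaddress
from itertools import combinations

# Subnets copied from the network table; the provider/external range is
# site-specific and therefore not modeled here.
NETWORKS = {
    "management": ipaddress.ip_network("172.29.236.0/22"),
    "tunnel": ipaddress.ip_network("172.29.240.0/22"),
    "storage": ipaddress.ip_network("172.29.244.0/22"),
}


def overlapping_pairs(networks):
    """Return the names of every pair of networks whose ranges overlap."""
    return [
        (name_a, name_b)
        for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


def usable_hosts(network):
    """Usable addresses, excluding the network and broadcast addresses."""
    return network.num_addresses - 2


if __name__ == "__main__":
    print("overlapping pairs:", overlapping_pairs(NETWORKS))
    for name, net in NETWORKS.items():
        print(f"{name}: {net} -> {usable_hosts(net)} usable addresses")
```

Each /22 yields 1022 usable addresses, which leaves headroom for controllers, compute nodes, and storage nodes on every network.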
Hardware Requirements:
- Use spine-leaf topology with 2x 25 GbE bonded (LACP) uplinks per compute node.
- Enable MTU 9000 on all switches for Tunnel and Storage networks.
- Configure MLAG or VPC for switch-level redundancy.
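The reason for jumbo frames on the tunnel network is encapsulation overhead: every tenant packet carries extra outer headers. The sketch below estimates the largest tenant MTU for a given underlay MTU. The overhead values (50 bytes for VXLAN, 58 bytes for Geneve as used by OVN) are the commonly cited figures for an IPv4 underlay and are assumptions here; an IPv6 underlay adds another 20 bytes.

```python
# Assumed encapsulation overhead in bytes for an IPv4 underlay.
# VXLAN:  20 (IP) + 8 (UDP) + 8 (VXLAN) + 14 (inner Ethernet) = 50
# Geneve: 20 (IP) + 38 (UDP, Geneve header, OVN option, inner Ethernet) = 58
OVERHEAD_BYTES = {"vxlan": 50, "geneve": 58}


def tenant_mtu(underlay_mtu, encapsulation):
    """Largest MTU a tenant network can use without fragmenting the underlay."""
    return underlay_mtu - OVERHEAD_BYTES[encapsulation]


if __name__ == "__main__":
    for encap in OVERHEAD_BYTES:
        print(encap, "on 9000:", tenant_mtu(9000, encap))
        print(encap, "on 1500:", tenant_mtu(1500, encap))
```

With a 9000-byte underlay, tenant networks can still carry full 1500-byte frames with room to spare, whereas a 1500-byte underlay would force tenant MTUs below 1500.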
Control Plane Design
Deploy three controller nodes behind an HAProxy VIP managed by Keepalived.
                    +-----------------------+
                    |     HAProxy (VIP)     |
                    |     + Keepalived      |
                    +-----------+-----------+
                                |
        +-----------------------+-----------------------+
        |                       |                       |
+-------+-------+       +-------+-------+       +-------+-------+
| ctrl-01       |       | ctrl-02       |       | ctrl-03       |
| Keystone      |       | Keystone      |       | Keystone      |
| Nova-API      |       | Nova-API      |       | Nova-API      |
| Neutron-API   |       | Neutron-API   |       | Neutron-API   |
| Glance-API    |       | Glance-API    |       | Glance-API    |
| Cinder-API    |       | Cinder-API    |       | Cinder-API    |
| MariaDB       |       | MariaDB       |       | MariaDB       |
| RabbitMQ      |       | RabbitMQ      |       | RabbitMQ      |
+---------------+       +---------------+       +---------------+
Database (MariaDB Galera)
- Configure a 3-node Galera Cluster with synchronous replication.
- Set `wsrep_cluster_address` and `wsrep_node_address` in `/etc/mysql/mariadb.conf.d/50-galera.cnf`.
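To illustrate how the two settings relate, here is a minimal sketch that renders the Galera fragment for one controller: `wsrep_cluster_address` lists every member, while `wsrep_node_address` is the node's own address. The controller IP addresses, the cluster name, and the helper `render_galera_cnf` are hypothetical. A real deployment also needs `wsrep_provider`, whose library path is distribution-specific and is therefore left out.

```python
# Hypothetical management-network addresses for the three controllers.
CONTROLLERS = {
    "ctrl-01": "172.29.236.11",
    "ctrl-02": "172.29.236.12",
    "ctrl-03": "172.29.236.13",
}


def render_galera_cnf(node, controllers, cluster_name="openstack"):
    """Render a minimal Galera config fragment for one cluster member."""
    peers = ",".join(controllers.values())
    lines = [
        "[galera]",
        "wsrep_on = ON",
        f"wsrep_cluster_name = {cluster_name}",
        # Same value on every node: the full member list.
        f"wsrep_cluster_address = gcomm://{peers}",
        # Node-specific values.
        f"wsrep_node_name = {node}",
        f"wsrep_node_address = {controllers[node]}",
        "binlog_format = ROW",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_galera_cnf("ctrl-02", CONTROLLERS))
```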
Message Queue (RabbitMQ)
- Cluster three nodes and use quorum queues (replicated across the cluster) for `nova` and `neutron`.
# /etc/rabbitmq/rabbitmq.conf
cluster_formation.peer_discovery_backend = classic_config
cluster_formation.classic_config.nodes.1 = rabbit@ctrl-01
cluster_formation.classic_config.nodes.2 = rabbit@ctrl-02
cluster_formation.classic_config.nodes.3 = rabbit@ctrl-03
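Both the Galera cluster and RabbitMQ quorum queues depend on a majority of members being reachable, which is why the control plane uses three nodes. The sketch below shows the arithmetic; the function names are illustrative.

```python
def quorum_size(nodes):
    """Smallest number of members that forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes):
    """How many members can fail while a majority remains available."""
    return nodes - quorum_size(nodes)


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(f"{n} nodes: quorum={quorum_size(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

Three nodes tolerate one failure. A fourth node adds no extra tolerance, so clusters are normally grown in odd steps.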
API Load Balancing
- Terminate TLS at HAProxy and perform active TCP/HTTP health checks on ports `:8774` (Nova) and `:9696` (Neutron).
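The health-check setup can be sketched by generating one HAProxy backend stanza per API. This is an illustration under stated assumptions, not a complete HAProxy configuration: the controller addresses, the backend names, the check path `/`, and the `inter`/`rise`/`fall` timings are all hypothetical, and the backends are assumed to speak plain HTTP because TLS ends at the proxy.

```python
# Hypothetical management-network addresses for the three controllers.
CONTROLLERS = {
    "ctrl-01": "172.29.236.11",
    "ctrl-02": "172.29.236.12",
    "ctrl-03": "172.29.236.13",
}


def render_backend(name, port, controllers, check_path="/"):
    """Render an HAProxy backend with an active HTTP health check per server."""
    lines = [
        f"backend {name}",
        "    mode http",
        "    balance roundrobin",
        f"    option httpchk GET {check_path}",
    ]
    for host, ip in controllers.items():
        # Probe every 2 s; mark up after 2 successes, down after 5 failures.
        lines.append(f"    server {host} {ip}:{port} check inter 2000 rise 2 fall 5")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_backend("nova_api", 8774, CONTROLLERS))
    print(render_backend("neutron_api", 9696, CONTROLLERS))
```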
Compute Node Design
Install nova-compute, ovn-controller, openvswitch, libvirtd, qemu-kvm, ceph-common, and telegraf.
Hardware Specifications:
- CPU: 2x Intel Xeon Scalable or AMD EPYC (64+ cores total).
- RAM: 512 GB – 1 TB DDR5 ECC.
- Boot Disk: 2x 480 GB NVMe SSD (RAID 1).
- NIC: 2x 25 GbE (Bonded LACP).
Nova Configuration
# /etc/nova/nova.conf
[DEFAULT]
cpu_allocation_ratio = 4.0
ram_allocation_ratio = 1.0
reserved_host_memory_mb = 8192
reserved_host_cpus = 4
compute_driver = libvirt.LibvirtDriver
# Networking is provided by Neutron (ML2/OVN); no network_api_class is set.

[libvirt]
# Use hardware-accelerated KVM; "qemu" falls back to software emulation.
virt_type = kvm

[vnc]
server_listen = 0.0.0.0
server_proxyclient_address = <management_ip>
Storage Architecture
Deploy Ceph with 3 MON/MGR nodes and 5+ OSD nodes using NVMe for hot data and HDD for archival.
Storage Tiers (CRUSH Rules)
# Define CRUSH rules for NVMe (Fast) and HDD (Bulk)
# The last argument is the device class; confirm how the NVMe OSDs are
# classified (ssd or nvme) with `ceph osd crush class ls` before creating rules.
ceph osd crush rule create-replicated fast-rule default host ssd
ceph osd pool create fast-volumes 128 128 replicated fast-rule
ceph osd crush rule create-replicated bulk-rule default host hdd
ceph osd pool create bulk-volumes 128 128 replicated bulk-rule
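# Map to Cinder Volume Types
# /etc/cinder/cinder.conf
[DEFAULT]
enabled_backends = ceph-fast,ceph-bulk

[ceph-fast]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
volume_backend_name = ceph-fast
rbd_pool = fast-volumes

[ceph-bulk]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
volume_backend_name = ceph-bulk
rbd_pool = bulk-volumes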
# Create volume type in Horizon or CLI
openstack volume type create fast-ssd --property volume_backend_name=ceph-fast
openstack volume type create bulk-hdd --property volume_backend_name=ceph-bulk
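The pool commands above pass 128 as both `pg_num` and `pgp_num`. As a rough cross-check, the sketch below applies the traditional Ceph rule of thumb of about 100 placement groups per OSD divided by the replica count, rounded up to a power of two. The target of 100 and the function name are assumptions for illustration. The result is a cluster-wide total to be shared among pools, and recent Ceph releases can manage this automatically with the PG autoscaler.

```python
import math


def suggested_total_pgs(osd_count, replica_count=3, target_pgs_per_osd=100):
    """Rule-of-thumb total PG count, rounded up to the next power of two."""
    raw = math.ceil(osd_count * target_pgs_per_osd / replica_count)
    # (raw - 1).bit_length() gives the exponent of the next power of two >= raw.
    return 1 << (raw - 1).bit_length()


if __name__ == "__main__":
    for osds in (3, 12, 40, 200):
        print(f"{osds} OSDs -> about {suggested_total_pgs(osds)} PGs in total")
```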
Security Architecture
- Enforce TLS 1.3 at HAProxy and internal mTLS between services.
- Use Keystone with an LDAP/AD identity backend or OIDC federation for authentication.
- Implement security groups as OVN ACLs and isolate projects with OVN logical networks.
- Store encryption keys and secrets in Barbican.
Monitoring and Operations
- Deploy Prometheus, Grafana, Alertmanager, and Loki for metrics, visualization, alerting, and logs.
- Monitor critical metrics: `nova_hypervisor_vcpus_used`, `neutron_agent_state`, `ceph_osd_op_latency`, `rabbitmq_queue_messages_ready`, and `haproxy_frontend_queue_len`.
Capacity Planning
- vCPUs: (Physical Cores × Allocation Ratio) - Reserved Host CPUs
- RAM: (Physical RAM × Allocation Ratio) - Reserved Host RAM
- Storage: (Total OSD Capacity / Replication Factor) × 0.85
- Instances/Host: MIN(vCPU Available / Flavor vCPU, RAM Available / Flavor RAM)
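The formulas above can be worked through with a small calculator. This is a planning sketch only: the `Host` class, the example host (64 cores, 512 GB), the 4 vCPU / 16 GB flavor, and the 300 TB raw capacity are hypothetical inputs, and the defaults mirror the allocation ratios and reservations from the Nova configuration. Note that Nova's Placement service applies the reservation before the ratio, so its reported capacity can be slightly lower than this estimate.

```python
from dataclasses import dataclass


@dataclass
class Host:
    physical_cores: int
    physical_ram_gb: float
    cpu_allocation_ratio: float = 4.0
    ram_allocation_ratio: float = 1.0
    reserved_cpus: int = 4
    reserved_ram_gb: float = 8.0  # 8192 MB


def available_vcpus(host):
    """(Physical Cores x Allocation Ratio) - Reserved Host CPUs."""
    return host.physical_cores * host.cpu_allocation_ratio - host.reserved_cpus


def available_ram_gb(host):
    """(Physical RAM x Allocation Ratio) - Reserved Host RAM."""
    return host.physical_ram_gb * host.ram_allocation_ratio - host.reserved_ram_gb


def usable_storage_tb(total_osd_tb, replication_factor=3, fill_ceiling=0.85):
    """(Total OSD Capacity / Replication Factor) x 0.85."""
    return total_osd_tb / replication_factor * fill_ceiling


def instances_per_host(host, flavor_vcpus, flavor_ram_gb):
    """Whole instances that fit, limited by the scarcer of vCPU and RAM."""
    by_cpu = available_vcpus(host) / flavor_vcpus
    by_ram = available_ram_gb(host) / flavor_ram_gb
    return int(min(by_cpu, by_ram))


if __name__ == "__main__":
    host = Host(physical_cores=64, physical_ram_gb=512)
    print("vCPUs available:", available_vcpus(host))
    print("RAM available (GB):", available_ram_gb(host))
    print("Instances per host:", instances_per_host(host, 4, 16))
    print("Usable storage (TB):", usable_storage_tb(300))
```

For this example host, RAM is the limiting resource (31 instances by RAM versus 63 by vCPU), which is typical when RAM is not overcommitted.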
Deployment Tools
- Use Kolla-Ansible for containerized deployments and rapid upgrades.