P24 – High Availability with Ceph on Proxmox VE 9: Failover Test
Step-by-Step Ceph HA Configuration on Proxmox VE 9
High Availability (HA) combined with Ceph storage is one of the most powerful features in Proxmox VE 9. In this tutorial, we demonstrate how to build a resilient Proxmox cluster using Ceph distributed storage and perform a real failover test when a node goes down.
You will learn how to configure a 3-node Proxmox cluster, install Ceph, create OSDs and pools, move VM disks to Ceph storage, configure HA resource rules, and simulate node failure.
If you manage production workloads, enterprise infrastructure, or serious home labs, understanding Proxmox HA with Ceph is essential for minimizing downtime and ensuring data integrity.
3.1️⃣ Preparation
Before configuring High Availability with Ceph, proper preparation is required.
🔧 Lab Infrastructure
Prepare 3 Proxmox nodes.
Each node has 3 disks:
• 1 disk → OS (Proxmox VE)
• 2 disks → Ceph OSD
Configuration:
• pve01: disks 2 and 3 (30 GB), IP 192.168.16.200
• pve02: disks 2 and 3 (40 GB), IP 192.168.16.201
• pve03: disks 2 and 3 (45 GB), IP 192.168.16.202
🔹 Step 1 — Set Disk Serial (If Using Proxmox VM)
If the lab nodes are themselves VMs running on a Proxmox host (nested setup), set the disk serials manually; physical disks already ship with a serial.
nano /etc/pve/qemu-server/102.conf   # add serial=DISK05 and serial=DISK06 to the two OSD disk entries
nano /etc/pve/qemu-server/103.conf   # add serial=DISK03 and serial=DISK04
nano /etc/pve/qemu-server/104.conf   # add serial=DISK01 and serial=DISK02
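Inside the config file, `serial=` is an option appended to each disk definition, not a standalone line. A sketch of the relevant entries in 102.conf (the bus, storage name, and size are assumptions based on this lab):

```
scsi1: local-lvm:vm-102-disk-1,size=30G,serial=DISK05
scsi2: local-lvm:vm-102-disk-2,size=30G,serial=DISK06
```

After editing, stop and start the VM so QEMU picks up the new serials.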
🔹 Step 2 — Prepare Windows 10 VM on PVE01
PVE01 node contains Windows 10 VM for HA testing.
🔹 Step 3 — Ensure Same Time on All Nodes
Cluster nodes must have synchronized time.
timedatectl status
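To confirm synchronization quickly on each node, the following sketch checks the NTP state and, if needed, enables chrony (the package choice is an assumption; Proxmox VE ships chrony by default):

```shell
# Run on each of pve01..pve03
timedatectl status | grep -E 'Local time|System clock synchronized'
# If "System clock synchronized: no", enable the time service:
# apt install -y chrony && systemctl enable --now chrony
```

Unsynchronized clocks can cause Ceph MON clock-skew warnings and corosync instability, so fix this before continuing.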
🔹 Step 4 — Verify Disks Before Creating Ceph OSD
List disks carefully to avoid accidental deletion:
lsblk
fdisk -l
Always double-check disk identity before creating OSD.
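A useful sketch for matching disks to the serials set earlier (the device names `/dev/sdb` and `/dev/sdc` are assumptions; verify them on your own nodes):

```shell
# List block devices with size and serial so OSD disks can be identified unambiguously
lsblk -o NAME,SIZE,TYPE,SERIAL
# Cross-check the candidate OSD disks before touching anything
fdisk -l /dev/sdb /dev/sdc
```

The SERIAL column should show the values (DISK01..DISK06) configured in Step 1, which removes any guesswork about which disk is the OS disk.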
3.2️⃣ Install Ceph
🔵 Step 1 — Create 3-Node Cluster
On Pve01:
pvecm create tsf
Add pve01's IP address and hostname to the /etc/hosts file on pve02 and pve03 so the join target resolves:
192.168.16.200 pve01zfs.tsf.id.vn pve01zfs
On pve02 and pve03:
pvecm add pve01zfs.tsf.id.vn
Cluster must be fully healthy before proceeding.
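To verify cluster health before moving on, this sketch checks quorum from any node:

```shell
# Expect "Quorate: Yes" and all three nodes listed
pvecm status
pvecm nodes
```

Do not install Ceph until all three nodes appear and the cluster is quorate; joining or repairing nodes afterwards is much harder.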
🔵 Step 2 — Install Ceph
In GUI:
Datacenter → Ceph → Install Ceph
Repeat installation on the remaining two PVE nodes.
All nodes must run the same Ceph version.
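The GUI wizard can also be replicated from the CLI; a sketch (the repository choice is an assumption, pick the one matching your subscription):

```shell
# Install Ceph packages on this node
pveceph install --repository no-subscription
# After installing on all three nodes, confirm identical versions everywhere:
ceph --version
```

Mixed Ceph versions across nodes can cause subtle peering problems, so compare the version string on all three nodes before creating any MON or OSD.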
🔵 Step 3 — Create Ceph MON
Add a MON (Monitor) on each node.
Add a MGR (Manager).
Ceph MON ensures cluster state consistency.
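The CLI equivalent of the GUI actions, run on each node that should host a monitor or manager (a sketch; three MONs tolerate the loss of one node while keeping quorum):

```shell
pveceph mon create
pveceph mgr create
# Verify that all monitors joined the quorum
ceph quorum_status --format json-pretty
```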
🔵 Step 4 — Create Ceph OSD
Create OSD on each node using prepared disks.
Each node contributes storage to the Ceph distributed system.
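From the CLI, one OSD is created per prepared disk (a sketch; the device names are assumptions, use the serials verified earlier to pick the right ones):

```shell
# Run on each node, once per OSD disk
pveceph osd create /dev/sdb
pveceph osd create /dev/sdc
# Confirm all six OSDs across the three nodes are "up" and "in"
ceph osd tree
```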
🔵 Step 5 — Create Ceph Pool
Create the pool once on one node only.
The pool will automatically be available cluster-wide.
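A CLI sketch for the pool step (the pool name `ceph-vm` is an assumption; `--add_storages` also registers the pool as a Proxmox storage on every node):

```shell
pveceph pool create ceph-vm --add_storages
# Inspect replication settings; the defaults are size 3 / min_size 2
ceph osd pool ls detail
```

With size 3 every block is stored on all three nodes, which is what makes the failover in section 3.4 possible without any replication jobs.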
3.3️⃣ Create Ceph HA Configuration
Now we integrate Ceph storage with Proxmox HA.
🔹 Step 1 — Move VM Disk to Ceph Storage
Move Windows VM disk to Ceph pool.
👉 Important Notes:
• If you leave "Delete source" unchecked, the old disk copy remains on the original storage and keeps consuming capacity.
• The VM can remain powered ON during the move (online disk move is supported).
Ceph shared storage allows VM to run on any node without disk replication delay.
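The GUI "Move disk" action has a CLI equivalent; a sketch (the VM ID 100, disk slot `scsi0`, and storage name `ceph-vm` are assumptions from this lab):

```shell
# Move the system disk onto the Ceph pool; --delete 1 removes the source copy
qm disk move 100 scsi0 ceph-vm --delete 1
```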
🔹 Step 2 — Add VM to HA Manager
Add HA resource.
Add HA preference rule.
HA resource: select the Windows 10 VM to protect.
Priority:
• pve01 = 3
• pve02 = 2
• pve03 = 1
This ensures VM prefers running on pve01 but can failover to other nodes automatically.
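The resource registration can also be done from the CLI; a sketch (VM ID 100 is an assumption, and node priorities are set in the GUI under Datacenter → HA, where a higher number means a more preferred node):

```shell
# Register the VM with the HA manager and request it to be running
ha-manager add vm:100 --state started
# Check that the resource is tracked and on which node it runs
ha-manager status
```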
3.4️⃣ Simulate HA Failover Test
Now perform real failover testing.
Power off pve01 abruptly to simulate a real node failure.
When pve01 is offline:
→ HA manager detects node failure
→ VM automatically starts on pve02 (based on priority rule)
→ Ceph ensures disk availability across cluster
→ No data loss
Once pve01 returns:
→ VM can migrate back depending on HA policy
This demonstrates true High Availability using shared distributed storage.
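While pve01 is down, the failover can be observed from any surviving node; a sketch:

```shell
# The VM should appear as started on pve02 once fencing completes
ha-manager status
# Ceph reports a degraded but available cluster (one host down, data still served)
ceph -s
```

Note that HA recovery is not instantaneous: the failed node must first be fenced (typically a minute or two) before the VM is restarted elsewhere.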
🔐 Why Proxmox HA with Ceph Is Powerful
Using Ceph instead of ZFS replication provides:
• True shared storage
• Immediate failover (no snapshot promotion required)
• No dependency on scheduled replication
• Real-time distributed data consistency
• Higher availability in production environments
Ceph distributes data across multiple nodes and replicates blocks automatically, ensuring redundancy and integrity.
🚀 Final Thoughts
Proxmox VE 9 combined with Ceph storage delivers enterprise-grade High Availability without expensive licensing costs. By configuring a proper 3-node cluster, installing Ceph MON and OSD correctly, and setting HA priority rules, you can build a fully resilient virtualization environment.
This architecture is ideal for:
Enterprise infrastructure
Critical application hosting
Virtualized production workloads
Advanced home labs
IT professionals preparing for real-world deployment
Mastering Proxmox HA with Ceph significantly enhances your virtualization expertise and prepares you for advanced infrastructure management.