🚀 Proxmox – P25 Ceph HA Cluster: Replace Failed Node on Proxmox (Full Demo)
🔎 Introduction
In this tutorial, we demonstrate how to replace a failed node in a Proxmox Ceph High Availability (HA) cluster step by step. When a Proxmox node crashes or becomes unreachable, your Ceph cluster may enter a degraded state. However, thanks to Ceph replication and HA mechanisms, your virtual machines can continue running without downtime — if the cluster is properly configured.
This guide shows you how to:
Safely remove a dead Proxmox node
Cleanly remove MON and OSD services from Ceph
Update the CRUSH map properly
Add a replacement node into the cluster
Reinstall Ceph services on the new node
Rebalance data automatically
Restore HA functionality
This full demo is ideal for IT professionals managing production environments and home lab enthusiasts learning Proxmox VE 9 with Ceph HA.
🧪 5. Simulate a Dead Node
Due to limited lab equipment, the VMs used for this demo run slowly, so the focus here is on explaining each replacement step clearly.
In real-world production environments, physical Proxmox servers will perform significantly faster.
⚠️ 5.1 Symptoms
When a node fails (example: pve01zfs):
Multiple OSDs appear down
Ceph reports OSDs as down/out
If replication factor is sufficient (e.g. 3), VMs continue running on remaining nodes
Cluster health becomes degraded
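Before removing anything, it helps to confirm which OSDs the dead node owned. The sketch below filters the down OSDs out of `ceph osd tree` output with `awk`; the `sample` variable is an illustrative stand-in for a live cluster (the hostnames and IDs are this lab's values), and on a real node you would pipe the actual command instead.

```shell
# Illustrative `ceph osd tree` output after pve01zfs fails; on a live
# node, pipe the real command instead:
#   ceph osd tree | awk '$4 ~ /^osd\./ && $5 == "down" {print $4}'
sample='ID  CLASS  WEIGHT   TYPE NAME          STATUS
-1         0.05878  root default
-3         0.01959      host pve01zfs
 0    hdd  0.00980          osd.0         down
 1    hdd  0.00980          osd.1         down
-5         0.01959      host pve02zfs
 2    hdd  0.00980          osd.2         up'
# Keep only rows whose 4th field is an OSD name and whose status is "down"
down_osds=$(printf '%s\n' "$sample" | awk '$4 ~ /^osd\./ && $5 == "down" {print $4}')
echo "$down_osds"
```

The resulting list (here `osd.0` and `osd.1`) is exactly the set of OSDs you will clean up in Step 3.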
🛠 5.2 Troubleshooting Procedure
🔹 Step 1: Delete the Dead Node
Remove the dead node from the cluster (run this on a surviving node):
pvecm delnode pve01zfs
Delete leftover configuration files:
rm -rf /etc/pve/nodes/pve01zfs
🔹 Step 2: Remove MON pve01zfs from Ceph
The MON daemon on the dead node is already offline, so it only needs to be removed from the monitor map:
ceph mon remove pve01zfs
This completely removes the MON service from the Ceph cluster.
🔹 Step 3: Delete the OSDs on pve01zfs
Identify OSD IDs:
ceph osd tree
Example:
osd.0
osd.1
Mark OSD as down:
ceph osd down osd.0
ceph osd down osd.1
Mark OSD as out:
ceph osd out osd.0
ceph osd out osd.1
Remove from CRUSH map:
ceph osd crush remove osd.0
ceph osd crush remove osd.1
Remove authentication:
ceph auth del osd.0
ceph auth del osd.1
Remove OSD completely:
ceph osd rm osd.0
ceph osd rm osd.1
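The five commands above repeat once per OSD, so they can be collapsed into a single loop. This is a dry-run sketch: it only prints the commands, and you remove the leading `echo` to execute them for real. The IDs 0 and 1 are this lab's values from `ceph osd tree`; substitute your own.

```shell
# Dry run: prints each removal command in the correct order instead of
# executing it. Remove the leading "echo" to run for real; OSD IDs 0
# and 1 are this lab's values from `ceph osd tree`.
osd_plan=$(
  for id in 0 1; do
    echo ceph osd down "osd.${id}"
    echo ceph osd out "osd.${id}"
    echo ceph osd crush remove "osd.${id}"
    echo ceph auth del "osd.${id}"
    echo ceph osd rm "osd.${id}"
  done
)
printf '%s\n' "$osd_plan"
```

Reviewing the printed plan before executing it is a cheap safeguard against removing the wrong OSD on a production cluster.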
🔹 Step 4: Remove Host from CRUSH Map
Remove the dead host's bucket from the CRUSH map with a single command:
ceph osd crush remove pve01zfs
If unsure about the hostname:
ceph osd tree
Then restart Ceph services on remaining nodes.
Ceph will redistribute data to remaining OSDs.
Rebalancing speed depends on disk performance and network bandwidth.
(Lab environment will be slower.)
🆕 Step 5: Prepare Replacement Node (pve04zfs)
Edit the disk serial configuration of the replacement VM (VM 105 in this lab):
nano /etc/pve/qemu-server/105.conf
serial=DISK07
serial=DISK08
Disable the enterprise repository.
Set an IP address in the same subnet as pve02 and pve03.
Update hosts file:
nano /etc/hosts
192.168.16.201 pve02zfs.tsf.id.vn pve02zfs
192.168.16.202 pve03zfs.tsf.id.vn pve03zfs
Check disks:
lsblk
ls -l /dev/disk/by-id/
🔗 Step 6: Join pve04 into Cluster
On the new node (pve04), join the cluster through an existing member:
pvecm add pve02zfs.tsf.id.vn
💾 Step 7: Install Ceph on New Node (pve04)
From GUI (Node pve04):
Ceph → Install Ceph
Select same Ceph version
Reboot if required
Then add services:
➤ Add MON + MGR
Ceph → Monitor → Add
Ceph → Manager → Add
➤ Add OSD
Ceph → OSD → Create OSD
Select /dev/sdb or another empty disk; repeat for each disk as needed.
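For reference, the same three additions can also be made from the shell with the `pveceph` wrapper that ships with Proxmox VE. This is a hedged dry-run sketch: it only prints the commands, and `/dev/sdb` is this lab's empty disk, so check `lsblk` and substitute your own device before running them (without the `echo` prefix) on pve04.

```shell
# Dry run: prints the pveceph commands instead of executing them.
# Drop the "echo" prefix and run on pve04 itself; /dev/sdb is this
# lab's empty disk, so substitute your own device from lsblk.
pveceph_plan=$(
  echo pveceph mon create          # add a MON on this node
  echo pveceph mgr create          # add a MGR on this node
  echo pveceph osd create /dev/sdb # create an OSD on the empty disk
)
printf '%s\n' "$pveceph_plan"
```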
⚖️ Step 8: Rebalance Ceph
When the new node joins, Ceph automatically rebalances data.
Check cluster status:
ceph -s
Healthy state:
HEALTH_OK
Note:
In small lab environments, you may see:
slow IO warnings
BlueStore slow operations
Data redistribution takes time depending on disk speed.
The amount of degraded data will gradually decrease until all placement groups return to active+clean.
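Recovery progress can be followed by re-running `ceph -s` (or `watch -n 10 ceph -s`). The sketch below shows one way to count the clean PG groups in that output; the `sample` variable is an illustrative stand-in for a live cluster's status section.

```shell
# Illustrative `pgs` section from `ceph -s` during a rebalance; on a
# live node run the real pipeline instead:
#   ceph -s | grep -c 'active+clean$'
sample='  pgs:     8.462% pgs degraded
             57 active+clean
             8  active+recovery_wait+degraded'
# Count lines that end in exactly "active+clean" (i.e. fully clean PGs)
clean_lines=$(printf '%s\n' "$sample" | grep -c 'active+clean$')
echo "$clean_lines"
```

When the degraded percentage disappears and every PG line reads active+clean, the cluster is back to HEALTH_OK.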
🏷 Step 9: Add New Node to HA Group
Navigate:
Datacenter → HA → Groups → Select Group → Add pve04
Now HA can use the new node for failover operations.
✅ Final Thoughts
Replacing a failed node in a Proxmox Ceph HA cluster requires proper order:
Remove node from cluster
Clean MON & OSD services
Update CRUSH map
Add replacement node
Reinstall Ceph
Allow automatic rebalancing
Reconfigure HA
By following best practices, you can maintain data integrity, minimize downtime, and ensure business continuity in both production and lab environments.
This tutorial demonstrates how Ceph replication and Proxmox HA work together to provide true high availability infrastructure.