P13 – Setup HA Two-Node Cluster in Proxmox VE – Node Failure Disaster Simulation
🚀 Proxmox VE P13 – How to Setup High Availability (2-Node Cluster + NAS) | Failover Test
High Availability (HA) is one of the most powerful features in Proxmox VE, allowing virtual machines to automatically restart on another node if a failure occurs.
In this tutorial, we guide you through the complete process of setting up Proxmox HA on a two-node cluster using NAS shared storage.
You will learn how to:
Configure HA in a 2-node Proxmox cluster
Use NAS (NFS) as shared storage
Set quorum votes correctly
Add a VM to an HA group
Simulate a real hardware failure
Test automatic failover behavior
By the end of this guide, you will have a fully functional HA environment capable of handling node failures automatically.
🧪 Lab Environment
PVE01: 192.168.11.200 (main)
PVE02: 192.168.11.201 (backup)
Cluster name: TSF
NAS TSF: 192.168.11.30:5001
VM Windows10 on PVE01
I/ HA Configuration – 2 Nodes
Step 0: Mount the NFS Storage
Both nodes must use shared storage.
In this lab, we use:
NAS Synology with shared NFS folder
Storage mounted on both PVE01 and PVE02
Alternative shared storage options:
SMB
OneDrive
Other centralized storage
Video guide to mount NFS:
https://youtu.be/oXagwrTRzM8
Step 1: Move the VM Disk to the NFS Storage
Move the Windows10 VM disk from local storage to the mounted NFS storage.
This ensures the VM can run on either node.
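On the command line, the move can be done with `qm`; this is a sketch assuming the Windows10 VM has ID 100, its system disk is scsi0, and the NFS storage was added with the ID nas-tsf (all three are assumptions — check `qm config <vmid>` and your storage list first):

```shell
# Move the VM disk from local storage to the shared NFS storage,
# deleting the old local copy once the move succeeds
qm disk move 100 scsi0 nas-tsf --delete 1

# Verify the disk now lives on the shared storage
qm config 100 | grep scsi0
```

The same operation is available in the GUI under the VM's Hardware tab (Disk Action → Move Storage).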
Step 2: Set the Vote Count for PVE02
Initially, each node has one vote (shown under Cluster Information). With two votes total, quorum requires both nodes online, so if PVE01 fails the surviving node falls below quorum and HA cannot restart the VM. In a 2-node cluster, votes must therefore be adjusted manually; giving PVE02 a second vote lets it keep quorum (2 of 3 votes) on its own. Clusters with 3 or more nodes maintain quorum automatically.
Open the shell on PVE01 and run:
ls /etc/pve
Backup and edit corosync configuration:
cd /etc/pve
cp corosync.conf corosync.new.conf
nano corosync.new.conf
Edit:
config_version: 3 (increment the existing value by 1)
quorum_votes: 2 (inside the node entry for PVE02)
Save the file:
Ctrl + O → Enter
Exit: Ctrl + X
Replace original file:
mv corosync.conf corosync.bak.conf
mv corosync.new.conf corosync.conf
Check the vote count in Cluster Information after the change takes effect.
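For reference, the edited corosync.conf should end up looking roughly like this. The node names, IDs, and addresses match this lab; everything elided is left untouched, config_version lives in the totem section, and quorum_votes for pve02 is the value raised to 2:

```
totem {
  cluster_name: TSF
  config_version: 3
  ...
}

nodelist {
  node {
    name: pve01
    nodeid: 1
    quorum_votes: 1
    ring0_addr: 192.168.11.200
  }
  node {
    name: pve02
    nodeid: 2
    quorum_votes: 2
    ring0_addr: 192.168.11.201
  }
}
```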
Step 3: Create the HA Resource
Add VM resource:
ha-manager add vm:100
Assign the VM to an HA group or node as needed (via the GUI under Datacenter → HA).
Remove VM from HA (optional):
ha-manager remove vm:100
Restart HA service:
systemctl restart pve-ha-crm
At this stage, HA is enabled.
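To verify, `ha-manager` can report the current HA state from any node:

```shell
# Show the HA stack state: quorum, the current master, the LRM status
# of each node, and every managed service (vm:100 should be listed)
ha-manager status

# Show the HA resource configuration
ha-manager config
```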
II/ Simulate a PVE01 Failure Disaster
Now we simulate a real hardware failure.
Power off PVE01 abruptly (simulating a hardware failure on the main node).
Log in to PVE02 (backup) and monitor the cluster.
After approximately 3–5 minutes:
VM Windows10 automatically moves to PVE02
VM automatically starts
This demonstrates Proxmox HA failover capability.
Step 1: Recover the Physical Server PVE01
After repairing hardware, reconnect and power on Server PVE01.
If HA node priority was configured:
PVE01 priority = 2
PVE02 lower priority
VM will automatically migrate back to PVE01 once it becomes available.
This behavior ensures the workload runs on the preferred primary node.
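The priority setup described above can be sketched with an HA group on the command line; the group name ha-main is an example, and vm:100 assumes the Windows10 VM has ID 100:

```shell
# Create an HA group preferring PVE01 (a higher number = higher priority)
ha-manager groupadd ha-main --nodes "pve01:2,pve02:1"

# Attach the already-added HA resource to the group
ha-manager set vm:100 --group ha-main
```

If you would rather the VM stay on PVE02 after a failover instead of moving back automatically, create the group with `--nofailback 1`.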
Step 2: Restart System Services on PVE01 (main)
Restart the following services:
pve-cluster
pve-ha-crm
pve-ha-lrm
Ensure the cluster and HA manager services are fully operational.
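A sketch of the restart and verification commands, run on PVE01:

```shell
# Restart the cluster filesystem and both HA services
systemctl restart pve-cluster pve-ha-crm pve-ha-lrm

# Confirm all three are active again
systemctl --no-pager status pve-cluster pve-ha-crm pve-ha-lrm

# Confirm the node rejoined the cluster with quorum
pvecm status
```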
Step 3: Migrate the Windows10 VM Back to PVE01
If both nodes have equal priority (both set to 1):
Manual migration can be performed during low-traffic hours.
This ensures minimal service disruption.
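Because the VM is HA-managed, the migration should be requested through the HA stack rather than directly (vm:100 again assumes VM ID 100):

```shell
# Ask the HA manager to move the resource back to PVE01
ha-manager migrate vm:100 pve01

# For a VM that is NOT under HA, the equivalent direct command would be:
# qm migrate 100 pve01 --online
```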
Step 4: Restart System Services on PVE02 (backup)
Restart the following services:
pve-cluster
pve-ha-crm
pve-ha-lrm
Both nodes should now be fully synchronized.
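To confirm synchronization, these checks can be run on either node:

```shell
pvecm status        # quorum information: expected vs. total votes
pvecm nodes         # both PVE01 and PVE02 should be listed as members
ha-manager status   # both nodes' LRM entries should report a healthy state
```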
III/ If PVE01 Cannot Be Repaired – Replace with PVE03
In case the main node cannot be recovered:
Step 1: Shut Down All VMs on PVE02
Ensure no running workloads before cluster modification.
Step 2: Remove the Cluster Configuration
Remove the existing cluster configuration from the surviving node.
Step 3: Update the NFS Share
On the NAS:
Add an IP permission entry for PVE03
Or configure PVE03 to use the same IP as the original PVE01
Either way, ensure NFS access is identical to before.
Step 4: Add the NAS NFS Storage on PVE03
Mount NFS storage on new node.
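From the PVE03 shell this can be sketched as follows; the storage ID nas-tsf and the export path /volume1/proxmox are assumptions, so substitute your actual Synology share path:

```shell
# Check that PVE03 can see the NAS exports at all
showmount -e 192.168.11.30

# Add the NFS storage (storage ID and export path are examples)
pvesm add nfs nas-tsf --server 192.168.11.30 \
    --export /volume1/proxmox --content images

# The new storage should report "active"
pvesm status
```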
Step 5: Create a New Cluster with PVE02 and PVE03
Recreate cluster.
Create HA group.
Set number of votes.
Cluster is restored with new hardware.
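The cluster recreation can be sketched as follows (run each command on the node indicated; the address is from this lab):

```shell
# On PVE02: create a fresh cluster, reusing the name TSF
pvecm create TSF

# On PVE03: join the new cluster via PVE02's address
pvecm add 192.168.11.201
```

Afterwards, re-add the HA resource and adjust the quorum votes exactly as in Steps 2 and 3 of the first section.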
🔐 Best Practices for 2-Node HA
✔ Always use shared storage
✔ Configure vote manually for 2-node cluster
✔ Set priority for primary node
✔ Test failover before production deployment
✔ Monitor HA status regularly
For production:
Consider adding QDevice for better quorum stability
Use dedicated network for corosync
Monitor HA logs for anomalies
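Proxmox ships a helper for registering a QDevice; the witness can be any small always-on Linux host, and its IP below is a placeholder:

```shell
# On both cluster nodes: install the qdevice client
apt install corosync-qdevice

# On the witness host: install the qnetd daemon
# apt install corosync-qnetd

# Then, from one cluster node, register the witness
pvecm qdevice setup <QDEVICE-IP>
```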
🎯 Conclusion
Setting up Proxmox HA in a 2-node cluster with NAS storage provides enterprise-level resilience even in small infrastructure environments.
In this guide, you have:
Configured HA
Adjusted quorum votes
Added VM resource
Simulated real disaster failover
Tested recovery process
Rebuilt cluster after hardware replacement
Understanding HA failover behavior is critical for any system administrator managing virtualization infrastructure.
This tutorial not only teaches configuration steps but also demonstrates real-world disaster recovery scenarios.
See also related articles:
P21 – How to Schedule Automatic Shutdown and Startup of VMs in Proxmox VE
P15 – Backup and Restore VM in Proxmox VE
P14 – How to Remove Cluster Group Safely on Proxmox