Monitoring Proxmox Disks with PRTG | Disk Health & Failure Alerts

PRTG - Proxmox Disk Monitoring | Physical Disk Health & Alerts

In this tutorial, you’ll learn how to monitor physical disk health in Proxmox using PRTG Network Monitor.
This step-by-step guide explains how to collect SMART disk data and configure disk health sensors.
Disk failures often happen silently before causing serious downtime.
PRTG helps IT administrators detect disk issues early with real-time monitoring and alerts.
In this video, we demonstrate how to monitor physical disks on Proxmox hosts effectively.
You’ll also learn how to configure alert notifications for disk warnings and critical states.
This tutorial is designed for system administrators and virtualization engineers.
Follow this guide to protect your Proxmox infrastructure from unexpected disk failures.

Step 1: Create a disk check script

Create the script folder (the SSH Script Advanced sensor looks for its scripts here):
mkdir -p /var/prtg/scriptsxml

Create the script file:
nano /var/prtg/scriptsxml/prtg_check_all_storage.sh

Paste in the following contents:

#!/bin/bash

BASELINE_FILE="/var/prtg/storage_baseline.txt"
STATUS=0
MSG="All disks OK"

################################
# Detect current disks
################################
CURRENT_DISKS=$(lsblk -ndo NAME,TYPE | awk '$2=="disk"{print $1}')
CURRENT_COUNT=$(echo "$CURRENT_DISKS" | wc -w)

################################
# BASELINE DISK CHECK (ALWAYS)
################################
if [ -f "$BASELINE_FILE" ]; then
    MISSING=""

    while IFS="|" read -r WWN DEV MODEL; do
        if [ "$WWN" = "NO_WWN" ]; then
            lsblk -ndo NAME | grep -qw "$DEV" || \
            MISSING="$MISSING $MODEL($DEV)"
        else
            lsblk -P -o WWN | grep -qw "WWN=\"$WWN\"" || \
            MISSING="$MISSING $MODEL($DEV,WWN:$WWN)"
        fi
    done < "$BASELINE_FILE"

    if [ -n "$MISSING" ]; then
        STATUS=2
        MSG="Disk missing:$MISSING"
    fi
else
    STATUS=1
    MSG="Baseline not initialized"
fi

################################
# ZFS HEALTH CHECK (OPTIONAL)
################################
if command -v zpool >/dev/null 2>&1; then
    if zpool list >/dev/null 2>&1; then
        ZFS_BAD=$(zpool list -H -o name,health | awk '$2!="ONLINE"{print $1"("$2")"}')
        if [ -n "$ZFS_BAD" ]; then
            STATUS=2
            MSG="ZFS issue: $ZFS_BAD"
        fi
    fi
fi

################################
# SMART CHECK
################################
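BAD_SMART=""
if command -v smartctl >/dev/null 2>&1; then
    for d in $CURRENT_DISKS; do
        smartctl -H "/dev/$d" >/dev/null 2>&1
        RC=$?
        # smartctl's exit status is a bitmask; bit 3 (value 8) means
        # the SMART health check reported the disk as failing.
        # A plain value of 2 only means the device could not be opened.
        if (( RC & 8 )); then
            BAD_SMART="$BAD_SMART /dev/$d"
        fi
    done
fi

if [ -n "$BAD_SMART" ]; then
    STATUS=2
    MSG="SMART failure:$BAD_SMART"
fi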

################################
# PRTG XML OUTPUT
################################
cat <<EOF
<prtg>
  <result>
    <channel>Storage Health</channel>
    <value>$STATUS</value>
    <LimitMaxError>1</LimitMaxError>
    <LimitMode>1</LimitMode>
  </result>

  <result>
    <channel>Disk Count</channel>
    <value>$CURRENT_COUNT</value>
  </result>

  <text>$MSG</text>
</prtg>
EOF

exit 0

Make the script executable, then run it once to check the output:

chmod +x /var/prtg/scriptsxml/prtg_check_all_storage.sh
/var/prtg/scriptsxml/prtg_check_all_storage.sh
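
The SMART check in the script depends on the exit status of smartctl, which is a bitmask rather than a simple pass/fail code. The sketch below decodes it, following the EXIT STATUS section of the smartctl man page. The function name and the short labels are illustrative and not part of the original script.

```shell
#!/bin/bash
# Decode the bitmask that smartctl returns as its exit status.
# Labels are hypothetical short names; bit meanings follow the man page.
decode_smartctl_status() {
    local rc=$1 out=""
    (( rc & 1 ))   && out+="cmdline-error "
    (( rc & 2 ))   && out+="device-open-failed "
    (( rc & 4 ))   && out+="smart-command-failed "
    (( rc & 8 ))   && out+="disk-failing "
    (( rc & 16 ))  && out+="prefail-below-threshold "
    (( rc & 32 ))  && out+="attributes-below-threshold-in-past "
    (( rc & 64 ))  && out+="error-log-has-entries "
    (( rc & 128 )) && out+="selftest-log-has-errors "
    out=${out% }                 # drop the trailing space
    echo "${out:-ok}"
}

# Usage on a Proxmox host (needs smartmontools and root; /dev/sda is an example):
#   smartctl -H /dev/sda >/dev/null 2>&1; decode_smartctl_status $?
```

Only bit 3 (value 8) means the health check itself reported a failing disk. The other bits are worth a look when a disk behaves oddly but the sensor stays green.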

Step 2: In PRTG, add the SSH credentials for root (password or private key) to the Proxmox device.

Step 3: Create the baseline while the system is healthy.
The baseline only needs to be created once.
Run the following manually on the Proxmox host:

lsblk -P -o NAME,TYPE,WWN,MODEL | \
awk '/TYPE="disk"/{
  for(i=1;i<=NF;i++){
    if($i ~ /^NAME=/){gsub(/NAME=|"/,"",$i); name=$i}
    if($i ~ /^WWN=/){gsub(/WWN=|"/,"",$i); wwn=$i}
    if($i ~ /^MODEL=/){
      # MODEL may contain spaces, so join fields up to the closing quote
      sub(/^MODEL="/,"",$i)
      model=$i
      for(j=i+1;j<=NF && model !~ /"$/;j++) model=model" "$j
      sub(/"$/,"",model)   # also strips the quote from one-word models
    }
  }
  if(wwn=="") wwn="NO_WWN"
  print wwn "|" name "|" model
}' > /var/prtg/storage_baseline.txt
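
Before relying on the baseline, it helps to confirm that every physical disk made it into the file. The helper below is a minimal sketch that prints each `WWN|DEV|MODEL` record in readable form. The function name `show_baseline` is hypothetical, and the default path is the one used in this guide.

```shell
#!/bin/bash
# Print each baseline record (WWN|DEV|MODEL) as one readable line.
# The path argument defaults to the baseline file used in this guide.
show_baseline() {
    local file=${1:-/var/prtg/storage_baseline.txt}
    local wwn dev model
    [ -f "$file" ] || { echo "no baseline at $file" >&2; return 1; }
    while IFS="|" read -r wwn dev model; do
        [ -n "$dev" ] || continue        # skip blank or malformed lines
        printf '%s\twwn=%s\tmodel=%s\n' "$dev" "$wwn" "$model"
    done < "$file"
}

# Usage on the Proxmox host:
#   show_baseline
```

Disks listed with `wwn=NO_WWN` are matched by device name only, so they are the ones to watch if device names change after a reboot.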

It does not matter whether you create the baseline before or after adding the sensor.

What matters is that you create it while all disks are present.

On ZFS nodes, ZFS keeps track of its own member disks, so the pool health check:

• Does not need the baseline

• Is not affected by disk name changes

The script reads the pool health with:

zpool list -H -o name,health

and treats the result as follows:

• ZFS pool ONLINE → 🟢

• DEGRADED / FAULTED (any state other than ONLINE) → 🔴
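
The pool check can be tried on its own, outside the sensor. The sketch below applies the same filter as the script to `name health` lines read from stdin. The function name `unhealthy_pools` is illustrative only.

```shell
#!/bin/bash
# Read "name<TAB>health" lines (the output format of
# `zpool list -H -o name,health`) from stdin and print every pool
# whose health is anything other than ONLINE.
unhealthy_pools() {
    awk '$2 != "ONLINE" { printf "%s(%s)\n", $1, $2 }'
}

# Usage on a ZFS node:
#   zpool list -H -o name,health | unhealthy_pools
```

Empty output means every pool is ONLINE.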

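Step 4: Add the sensor

📊 Attach to PRTG

• Sensor type: SSH Script Advanced (the script returns PRTG XML)

• Script: prtg_check_all_storage.sh

• Result handling (Storage Health channel, upper error limit 1):

o 0 → OK

o 1 → baseline not initialized (shown in the sensor message)

o 2 → Down (missing disk, ZFS pool not ONLINE, or SMART failure)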
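
The Storage Health channel sets `LimitMaxError` to 1 with `LimitMode` enabled, so the sensor goes Down only when the value is above 1. The sketch below mimics that comparison to show which values trip the limit. It is a simplified model of the behavior, not PRTG code, and the function name is hypothetical.

```shell
#!/bin/bash
# Simplified model of an upper error limit:
# state_for_value VALUE MAX_ERROR prints "Down" when VALUE is above
# MAX_ERROR and "Up" otherwise.
state_for_value() {
    local value=$1 max_error=$2
    if (( value > max_error )); then
        echo "Down"
    else
        echo "Up"
    fi
}

# Walk through the three values the script can return.
for v in 0 1 2; do
    echo "Storage Health=$v -> $(state_for_value "$v" 1)"
done
```

Under this model, a value of 1 (baseline not initialized) does not trip the error limit, so that condition is visible only in the sensor message.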

This script works on Proxmox nodes with ext-based or ZFS storage, regardless of the number of disks.

Demo: removing one disk to see how the sensor reacts.
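
During a test like this, the result can also be checked from the shell without waiting for the next PRTG scan. The helper below is a minimal sketch that pulls a channel value out of the script's XML. It assumes the layout the script prints (a `<channel>` line followed by a `<value>` line), and `channel_value` is a hypothetical name.

```shell
#!/bin/bash
# Extract the <value> of a named channel from PRTG XML read on stdin.
# Assumes each <channel> line is followed by its <value> line.
channel_value() {
    local name=$1
    awk -v name="$name" '
        index($0, "<channel>" name "</channel>") { found = 1; next }
        found && match($0, /<value>[^<]*<\/value>/) {
            # <value> is 7 characters, </value> is 8
            print substr($0, RSTART + 7, RLENGTH - 15)
            exit
        }'
}

# Usage on the Proxmox host:
#   /var/prtg/scriptsxml/prtg_check_all_storage.sh | channel_value "Storage Health"
```

A result of 2 after pulling a disk indicates that the script noticed the change.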

Rerun the baseline after:

• Replacing a disk

• Adding a new disk

• Replacing the USB enclosure
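
Since rerunning the baseline overwrites the old file, it can be useful to keep a copy first so the previous disk list stays available for comparison. This step is not part of the original procedure. The sketch below is one possible way to do it, and `backup_baseline` is a hypothetical helper name.

```shell
#!/bin/bash
# Keep a timestamped copy of the current baseline before regenerating it.
# Prints the path of the copy; does nothing if no baseline exists yet.
backup_baseline() {
    local file=${1:-/var/prtg/storage_baseline.txt}
    [ -f "$file" ] || return 0           # nothing to back up yet
    local copy
    copy="$file.$(date +%Y%m%d-%H%M%S).bak"
    cp -p -- "$file" "$copy" && echo "$copy"
}

# Usage on the Proxmox host, right before rerunning the baseline command:
#   backup_baseline
```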


Copyright and design by TSF Services @2025