Goal
The idea is to raise an alert in Zabbix when the backup of a particular VM fails, with the alert shown on the VM’s own host rather than on the Veeam server host.
This can be achieved in two parts: a PowerShell script on the Veeam server and a template on the VM hosts in Zabbix.
Veeam server
I’ve created (read: Frankensteined from various Stack Overflow posts) the following PowerShell script on the Veeam B&R server. Note that this server currently only has one backup job, so I’ve simply hardcoded the job name.
This script loops through all the VMs in the backup job and uses the Zabbix sender to push a status code (0, 1 or 2) to each host in Zabbix. It’s important that the host names in Zabbix match the VM names in Veeam, because the VM name is passed to the sender as the host name.
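    # Load the Veeam cmdlets; 3>$null hides the module's warning stream.
    Import-Module -Name Veeam.Backup.PowerShell 3>$null
    Connect-VBRServer

    # Only one job on this server, so the name is hardcoded.
    $Job    = Get-VBRJob -Name "Backup job 1"
    $VMList = Get-VBRJobObject -Job $Job

    # The last session is the same for every VM, so fetch it once.
    $Session = $Job.FindLastSession()
    $Tasks   = $Session.GetTaskSessions()

    foreach ($VMName in $VMList.Name) {
        # Status code per VM: 1 = Failed, 2 = Warning, 0 = anything else.
        if ($null -ne ($Tasks | Where-Object { $_.Name -eq $VMName -and $_.Status -eq "Failed" })) {
            $code = 1
        } elseif ($null -ne ($Tasks | Where-Object { $_.Name -eq $VMName -and $_.Status -eq "Warning" })) {
            $code = 2
        } else {
            $code = 0
        }
        # Push the code to the Zabbix host with the same name as the VM.
        C:\zabbix\zabbix_sender.exe -z ZABBIX_SERVER -s $VMName -k backup.status -o $code -v
    }

    Disconnect-VBRServer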
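To try the status mapping without a Veeam server at hand, the same logic can be pulled into a small function and fed made-up data. This is only a sketch: the function name `Get-BackupStatusCode` and the sample VM names are hypothetical, and the sample objects merely imitate the `Name` and `Status` properties that the script reads from the task sessions.

```powershell
# Hypothetical helper: maps one VM's task statuses to the codes used in this post.
function Get-BackupStatusCode {
    param(
        [object[]]$Tasks,   # objects exposing Name and Status properties
        [string]$VMName
    )
    # Collect the statuses of this VM's tasks as plain strings.
    $statuses = @($Tasks | Where-Object { $_.Name -eq $VMName } | ForEach-Object { "$($_.Status)" })
    if ($statuses -contains 'Failed')  { return 1 }
    if ($statuses -contains 'Warning') { return 2 }
    return 0
}

# Made-up sample data standing in for $Session.GetTaskSessions().
$sample = @(
    [pscustomobject]@{ Name = 'web01'; Status = 'Success' }
    [pscustomobject]@{ Name = 'db01';  Status = 'Failed'  }
    [pscustomobject]@{ Name = 'app01'; Status = 'Warning' }
)

foreach ($vm in 'web01', 'db01', 'app01') {
    "{0} -> {1}" -f $vm, (Get-BackupStatusCode -Tasks $sample -VMName $vm)
}
# web01 -> 0
# db01 -> 1
# app01 -> 2
```

Note that a VM with no task in the session also ends up with code 0, just like in the script above. The sketch only exercises the mapping; the script that Veeam runs is still the full one.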
Veeam calls the status script automatically at the end of the backup job; it is configured as the job’s post-job script in the job’s advanced settings.
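As a rough illustration, the post-job script field could hold a command line along these lines. The script path and file name are hypothetical, and this assumes the field accepts a full command line rather than only a path to a script file.

```powershell
# Hypothetical path; point -File at wherever the status script is saved.
powershell.exe -NoProfile -ExecutionPolicy Bypass -File "C:\Scripts\Send-BackupStatus.ps1"
```

Running the same line by hand on the Veeam server is a quick way to check that the script works under the intended account before relying on the job to call it.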
Zabbix
The Zabbix end is pretty straightforward: create a new template with the trapper item ‘backup.status’ that the PowerShell script pushes the status code to. Add some value mappings and a couple of triggers and we’re done. The template can be found here.
First, a value mapping. There are only 3 status codes, so it’s a rather short one:
= 0 => Success
= 1 => Failed
= 2 => Warning
The item itself is simply a Zabbix trapper with the key ‘backup.status’ and the type ‘Numeric (unsigned)’, with the value mapping linked.
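Once the template is linked to a host, the trapper item can be checked by pushing a value by hand from the Veeam server. A minimal sketch, reusing the sender path and the ZABBIX_SERVER placeholder from the script; the host name `web01` is made up and has to be replaced with a host that actually carries the item.

```powershell
# Manual test: send status code 2 (Warning) to the hypothetical host "web01".
C:\zabbix\zabbix_sender.exe -z ZABBIX_SERVER -s "web01" -k backup.status -o 2 -vv
```

If the sender reports the value as processed, it should show up under the host’s latest data with the mapped text next to it.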
Triggers are quite simple as well: a ‘Warning’ trigger if the last backup.status value is ‘2’, an ‘Average’ trigger if it is ‘1’, and another ‘Average’ trigger if there has been no data for more than 24 hours (meaning the backup hasn’t run at all and the script wasn’t triggered).
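For reference, the three triggers could be expressed roughly as follows. This is a sketch rather than an export of the actual template: the template name ‘Template Veeam Backup Status’ is hypothetical, and the syntax shown is the one used by Zabbix 5.4 and later.

```
Warning:  last(/Template Veeam Backup Status/backup.status)=2
Average:  last(/Template Veeam Backup Status/backup.status)=1
Average:  nodata(/Template Veeam Backup Status/backup.status,24h)=1
```

On older Zabbix versions the same checks use the curly-brace form, for example {Template Veeam Backup Status:backup.status.nodata(24h)}=1.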
This template has to be applied to all the hosts in the backup job and we’re all done.