This is the first part of a two-part series.

Even in its basic version, Icinga 2 already ships with many of the tools needed to monitor a Windows server. But what if you want to monitor something for which no plugin exists?

In this case you have to write a plugin yourself.

Basic structure

Icinga works with status codes from 0 to 3.

  • 0 means that everything is OK (shown in green in the monitoring).
  • 1 indicates a warning. There are deviations and you should take a look (shown in yellow in icingaweb2).
  • 2 means that the status is critical. You have to act immediately (shown in red in the monitoring).
  • 3 means the status is unknown, i.e. it cannot be determined whether something is wrong or not (shown in purple in the monitoring).

These numbers are not arbitrary: they are the exit codes that the plugin returns to the shell.

If you want to write your own plugin, you have to decide which situation maps to which exit code.
It is important that you handle every situation.

You don't have to implement all exit codes, e.g. you can skip the warning level and always output critical on error. If you are sure that there is no undefined state, you can even omit the unknown code (3).
However, do not make this a habit.
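To make the convention concrete, here is a minimal sketch of how a status line and an exit code belong together. The helper names Get-StateLabel and Format-PluginOutput are made up for this illustration and are not part of the plugin developed below.

```powershell
# Hypothetical helpers for illustration only.
# Maps a plugin exit code to the label usually printed in front of the message.
function Get-StateLabel {
    param([int]$Code)
    switch ($Code) {
        0       { 'OK' }
        1       { 'WARNING' }
        2       { 'CRITICAL' }
        default { 'UNKNOWN' }   # 3 and anything unexpected
    }
}

# Builds the single status line that the monitoring displays.
function Format-PluginOutput {
    param([int]$Code, [string]$Message)
    '{0}: {1}' -f (Get-StateLabel -Code $Code), $Message
}

# In a real plugin the last two steps would then be:
#   Write-Host (Format-PluginOutput -Code 2 -Message 'Could not find the process')
#   exit 2
```

The text is for humans, the exit code is for Icinga; both should always agree.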

Our scenario

On terminal servers there is rarely only one user logged in, and if the local system administrator was lazy, the permission structure is so chaotic that everyone has access to something, somehow.

In our scenario we assume that only a certain user is allowed to start a certain program and the program must not be started more than once (e.g. because a database synchronization is performed).

Our plugin

Start by defining your return states so that Windows and outside readers of the code can make sense of the numbers.

$returnStateOK = 0
$returnStateWarning = 1
$returnStateCritical = 2
$returnStateUnknown = 3

Since we do not want to use the plugin only once, it makes sense that it receives an input from the monitoring:

# name of process
# critical processcount
# desired user

$args[0] denotes the first element of the argument array that is passed to the script, $args[1] the second, and so on.

Since we may also want to check the process regardless of which user runs it, we make the third parameter ("desired user") optional. All others are mandatory.

Our plugin call should look like this:
Plugin name <input1: process name> <input2: process count> [optional input3: user]
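One possible way to read and validate these inputs is sketched below. The function Read-PluginArgs and its property names are assumptions made for this example; the plugin in this article simply works with $param1, $param2 and $param3.

```powershell
# Hypothetical helper: turns the raw argument array into named values.
# Returns $null when the mandatory inputs are missing or invalid, so the
# caller can report UNKNOWN and exit with code 3.
function Read-PluginArgs {
    param([string[]]$Arguments)

    # Process name and process count are mandatory.
    if ($null -eq $Arguments -or $Arguments.Count -lt 2) {
        return $null
    }

    # The process count must be a whole number.
    $count = 0
    if (-not [int]::TryParse($Arguments[1], [ref]$count)) {
        return $null
    }

    # The desired user is optional.
    $user = $null
    if ($Arguments.Count -ge 3) {
        $user = $Arguments[2]
    }

    [pscustomobject]@{
        ProcessName   = $Arguments[0]
        ExpectedCount = $count
        User          = $user
    }
}

# Possible use at the top of a plugin script:
#   $p = Read-PluginArgs -Arguments $args
#   if ($null -eq $p) { Write-Host 'UNKNOWN: missing or invalid arguments'; exit 3 }
```

Validating up front means a typo in the service definition shows up as UNKNOWN instead of as a misleading CRITICAL.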

Critical cases

The simple case (the process is not running)

Regardless of whether we include a user or not, we always need to check whether the process is running at all. To do this, we count the number of processes with the given name.

If the process is not found, PowerShell reports an error and the plugin may not continue. Therefore we use the parameter "-ErrorAction SilentlyContinue".

We do this as follows:

$countProcs = Get-Process -Name $param1 -IncludeUserName -ErrorAction SilentlyContinue | Measure-Object | Select-Object -ExpandProperty Count

If the count is zero, the process is obviously not running. In this case, we return "critical" (exit code 2).

if ($countProcs -eq 0) {
    Write-Host "CRITICAL: Could not find any process named $param1"
    exit 2
}

The process is not run by the desired user

If the executing user is to be checked (param3 is set), we have to count all processes run by this user and all processes run by anyone else.

We make the distinction with the filters "Where-Object UserName -Like $param3*" and "Where-Object UserName -Notlike $param3*". The two results are stored in separate variables.

$userProcs = Get-Process -Name $param1 -IncludeUserName -ErrorAction SilentlyContinue | Where-Object UserName -Like "$param3*" | Measure-Object | Select-Object -ExpandProperty Count
$userProcsFalse = Get-Process -Name $param1 -IncludeUserName -ErrorAction SilentlyContinue | Where-Object UserName -NotLike "$param3*" | Select-Object -ExpandProperty UserName
$countingFalse = $userProcsFalse | Measure-Object | Select-Object -ExpandProperty Count

We need the first variable to check whether the right number of processes is run by the right user.
We need the second variable to output a critical error that names the wrong users.
With countingFalse we also count the number of processes run by the wrong user.
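The effect of the two filters can be tried out without any real processes. The following sketch uses made-up sample objects in place of the Get-Process output; the names sync, CORP\svc_sync and CORP\alice are invented. It assumes that UserName is reported in the form DOMAIN\user, so the desired user has to be given with its domain prefix for the pattern to match.

```powershell
# Made-up sample data standing in for "Get-Process -IncludeUserName" output.
$sample = @(
    [pscustomobject]@{ Name = 'sync'; UserName = 'CORP\svc_sync' }
    [pscustomobject]@{ Name = 'sync'; UserName = 'CORP\alice' }
)

# Hypothetical desired user, including the domain prefix.
$wanted = 'CORP\svc_sync'

# Same idea as in the plugin: one list for the right user, one for everyone else.
# @( ) makes sure we get an array even if only one element matches.
$right = @($sample | Where-Object { $_.UserName -like "$wanted*" })
$wrong = @($sample | Where-Object { $_.UserName -notlike "$wanted*" })

# Names of the users who should not be running the program.
$wrongNames = @($wrong | Select-Object -ExpandProperty UserName)
```

Note that the trailing wildcard also matches longer names that merely start with the desired user name.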

With this we have everything to output the critical message:

# -gt instead of -ge: a count of 0 means that no wrong user was found
elseif ($countingFalse -gt 0) {
    Write-Host "CRITICAL: Found $countProcs $param1 process running. $countingFalse User which shouldn't run this program: $userProcsFalse"
    exit 2
}

Too many processes are executed

This should be the last of the critical checks in the if chain (the order matters). It is a simple comparison.

elseif ($countProcs -gt $param2) {
    Write-Host "CRITICAL: Found more than $param2 process of $param1"
    exit 2
}

Warning case

So far we have only ever returned critical. What about warnings?

Basically, there is only one case here: fewer processes are run by the right user than specified.

elseif ($countProcs -lt $param2) {
    Write-Host "WARNING: Found less than $param2 process of $param1"
    exit 1
}

OK case

Theoretically, every remaining case should now be OK. However, we are in the computer world: the value could be NaN or something else unexpected, so check explicitly once more.

if ($countProcs -eq $param2) {
    Write-Host "OK: Found $param2 process running of $param1. All procs are run by user $param3"
    exit 0
}

Calls without the "desired user" input are handled in the same way, but without checking the third parameter.

Unknown case

Since programmers can always forget something and we might not have caught every case, we return an Unknown status for all uncaught branches. Here an implicit else branch is indeed appropriate.

    Write-Host "UNKNOWN: Something went wrong"
    exit 3
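To see how the order of the checks plays out, the whole chain can be condensed into one function that only returns the state code. This is a sketch for experimenting, not the plugin itself: the function name Get-ProcessCheckState and its parameters are made up, and the counts are passed in directly instead of being read from Get-Process.

```powershell
# Hypothetical condensed version of the decision chain described above.
# Returns the exit code instead of calling "exit", so it can be tried out
# in an interactive session.
function Get-ProcessCheckState {
    param(
        [int]$Total,         # processes found with the given name
        [int]$Expected,      # process count passed to the plugin
        [int]$WrongUser = 0  # processes run by someone other than the desired user
    )

    if ($Total -eq 0)              { return 2 }  # process is not running
    elseif ($WrongUser -gt 0)      { return 2 }  # run by the wrong user
    elseif ($Total -gt $Expected)  { return 2 }  # too many processes
    elseif ($Total -lt $Expected)  { return 1 }  # fewer than expected
    elseif ($Total -eq $Expected)  { return 0 }  # exactly as expected

    return 3                                     # nothing matched: unknown
}

# Examples:
#   Get-ProcessCheckState -Total 1 -Expected 1                 # 0
#   Get-ProcessCheckState -Total 1 -Expected 2                 # 1
#   Get-ProcessCheckState -Total 2 -Expected 2 -WrongUser 1    # 2
```

Because every branch returns, the final line is only reached when none of the conditions applied, which mirrors the implicit else branch of the plugin.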

You can find the full code on GitHub.