Monitoring-plugins-keepalived
Nagios plugin
Vendor: SUSE Linux Products GmbH
Developer: lrupp
License: BSD-4-Clause
Web: monitoring-plugins-keepalived
About
check_keepalived informs you about the current state of your keepalived cluster.
At the moment, there are the following ways to get the needed information:
- via SNMP (running keepalived with the --snmp option)
- via notify script configuration
- via dbus (currently not covered by this script)
Using the SNMP option
Configuration / prerequisites
To get information via SNMP, the following prerequisites must be met:
- keepalived has to be compiled with SNMP support (the openSUSE package is)
- you need a running SNMP server with agentx enabled
- keepalived needs to connect successfully to agentx (see logs)
Setting up a basic SNMP server is straightforward: install the net-snmp package on your machine and use a configuration file (/etc/snmp/snmpd.conf) similar to this one:
    syslocation Rack A Row B Unit 32
    syscontact SUSE-IT <my@email.com>
    rocommunity public 127.0.0.1
    master agentx
Starting the daemon via systemctl start snmpd.service should now give you an SNMP daemon listening on localhost.
The only thing left to do is to enable SNMP support in keepalived. Simply add the '-x' or '--snmp' option in /etc/sysconfig/keepalived and restart your keepalived daemon.
To verify that everything works as expected, you can use snmpwalk:
    ~> snmpwalk -v2c -cpublic localhost KEEPALIVED-MIB::vrrpInstanceState
    KEEPALIVED-MIB::vrrpInstanceState.1 = INTEGER: master(2)
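When scripting around this output, the state name has to be pulled out of the snmpwalk line. The following is a minimal sketch of that step, not the code check_keepalived itself uses; the function name snmp_state_name is made up for illustration, and it assumes the output format shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of check_keepalived): extract the state
# name from one line of snmpwalk output such as
#   KEEPALIVED-MIB::vrrpInstanceState.1 = INTEGER: master(2)
snmp_state_name() {
    # Keep the word after "INTEGER: ", drop the "(N)" suffix, then uppercase.
    # Lines that do not match produce no output at all.
    printf '%s\n' "$1" \
        | sed -n 's/.*INTEGER: *\([a-zA-Z]*\)(.*/\1/p' \
        | tr '[:lower:]' '[:upper:]'
}

# Example call with the line from the snmpwalk output above.
snmp_state_name 'KEEPALIVED-MIB::vrrpInstanceState.1 = INTEGER: master(2)'
```

An empty result (for example after an SNMP timeout) can then be treated as "state unknown" by the caller.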
If this works as expected, you are ready to use check_keepalived with the SNMP backend:
    /usr/lib/nagios/plugins/check_keepalived -s
    OK: localhost is in wanted state (state is: MASTER)
Using the 'notify' script option
First, in your configuration file /etc/keepalived/keepalived.conf, insert the following line in the vrrp_instance section:
    notify /usr/bin/keepalived_notify_monitoring.sh
After a restart of your keepalived daemon, the script above is called on every state change, together with some information that is printed into a state file. If you want to execute additional scripts, please add them in /etc/keepalived/keepalived_notify_monitoring.conf.
The check_keepalived script sends SIGUSR2 to the running keepalived process so that the current state is dumped to the state file (so be warned if you use additional notify scripts), then parses that file and prints the result.
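The signalling step can be sketched in a few lines of shell. This is an illustration under assumptions, not the plugin's actual code: request_state_dump is a hypothetical name, and the sketch only delivers the signal; it does not parse the state file, whose format is not described here.

```shell
#!/bin/sh
# Hypothetical helper (not part of check_keepalived): send SIGUSR2 to the
# process whose PID is stored in the given pidfile.
# Returns 0 if the signal was delivered, non-zero otherwise.
request_state_dump() {
    pidfile=$1
    # The pidfile must exist and be readable.
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    # Reject empty or non-numeric content before calling kill.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    kill -s USR2 "$pid" 2>/dev/null
}

# Example (commented out so that nothing is signalled by accident):
# request_state_dump /var/run/keepalived.pid
```

Sending a signal to another user's process normally requires matching user IDs or root privileges, so a check along these lines has to run with sufficient rights.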
As the script does not know about the wanted state of the local keepalived, the options '-M' (master) or '-S' (slave) are required in this case.
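Once the actual state is known, comparing it with the wanted state is simple. A minimal sketch follows, assuming the standard Nagios plugin exit codes (0 for OK, 2 for CRITICAL, 3 for UNKNOWN); the function name check_state and the message texts are illustrative and not taken from check_keepalived.

```shell
#!/bin/sh
# Hypothetical helper (not part of check_keepalived): compare the wanted
# state ($1) with the actual state ($2) and return a Nagios exit code.
check_state() {
    wanted=$1
    actual=$2
    if [ -z "$actual" ]; then
        echo "UNKNOWN: could not determine keepalived state"
        return 3
    fi
    if [ "$actual" = "$wanted" ]; then
        echo "OK: state is $actual"
        return 0
    fi
    echo "CRITICAL: state is $actual, expected $wanted"
    return 2
}

# Example: a machine that is expected to be MASTER and is MASTER.
check_state MASTER MASTER
```

The return value can be passed straight to exit, so the monitoring system sees the matching service state.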
Options
    -F keepalived_statefile : path to the status file (default: /tmp/keepalived.stats)
    -p keepalived_pidfile   : path to the pidfile of the keepalived process (default: /var/run/keepalived.pid)
    -S                      : expect the machine to run in SLAVE state
    -M                      : expect the machine to run in MASTER state
    -h                      : print this usage
    -V                      : print version information
    -s                      : use SNMP
    -H <hostname>           : SNMP host to query
    -v <snmp_version>       : SNMP version (currently only 2c is supported, which is therefore the default)
    -c <snmp_community>     : SNMP v2 community string
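For readers who want to wrap or re-implement the option handling, here is a rough sketch of how such an option list could be parsed with the POSIX getopts builtin. It is not the plugin's real parser: parse_options and all variable names are assumptions, and -h, -V and -v are left out for brevity.

```shell
#!/bin/sh
# Hypothetical parser (not the one shipped with check_keepalived).
# Results are stored in global variables, as POSIX sh has no "local".
parse_options() {
    statefile=/tmp/keepalived.stats      # default taken from the option list
    pidfile=/var/run/keepalived.pid      # default taken from the option list
    wanted=
    use_snmp=no
    snmp_host=
    snmp_community=
    OPTIND=1                             # allow repeated calls
    while getopts 'F:p:SMsH:c:' opt; do
        case $opt in
            F) statefile=$OPTARG ;;
            p) pidfile=$OPTARG ;;
            S) wanted=SLAVE ;;
            M) wanted=MASTER ;;
            s) use_snmp=yes ;;
            H) snmp_host=$OPTARG ;;
            c) snmp_community=$OPTARG ;;
            *) return 1 ;;               # unknown option or missing argument
        esac
    done
}

# Example: SNMP mode against a remote host (address and community are made up).
parse_options -s -H 192.0.2.10 -c secret
echo "snmp=$use_snmp host=$snmp_host community=$snmp_community"
```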
Detailed options description
-M or -S is required if you run the script via the notify-script option (in which case, you might also want to set the -F and -p options).
With -s, the script uses SNMP to gather the relevant data instead. In this mode the script itself compares the "wanted" state with the real state, but you can optionally also use the -M or -S options to require that keepalived runs as master or slave, respectively.
If your SNMP server is configured to require a different community string, pass it with the -c option. In SNMP mode it should also be possible to execute this script on a different machine than the one running your keepalived service: just use the -H option to define the hostname or address where your SNMP server is running. Note that the example snmpd.conf above only accepts queries from 127.0.0.1, so remote queries need a matching rocommunity entry.
Check definition
If you run the script locally (via NRPE or similar), you could have something like:
command[check_keepalived]=/usr/lib/nagios/plugins/check_keepalived -s
In this case, the command definition on your Nagios/Icinga/Naemon/... machine could look like:
    define command {
        command_name check_keepalived
        command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
    }
As you can run the script remotely via the SNMP functionality, your command definition on your Nagios/Icinga/Naemon/... machine could also look like:
    define command {
        command_name check_keepalived
        command_line $USER1$/check_keepalived -s -H $HOSTADDRESS$ -c$USER20$
    }
Anyway, here is a simple template for a service definition:
    define service{
        use                 generic-service
        host_name           my_host
        service_description keepalived
        display_name        keepalived
        check_command       check_keepalived
    }