Nagios monitors your network in real time and raises alarms, but if you want a history of the values returned by plug-ins, you have to use an add-on. You can find a few of those on NagiosExchange.
The easiest add-on to install and configure is, as far as I'm concerned, Nagiosgraph. No need for a database: all values are stored in RRD files, which makes it easy to create graphs. It can extract data either from the plug-ins' perfdata or from their output.
Its configuration is way easier than NagiosGrapher's. NagiosGrapher has prettier graphs, and can automagically generate the proper Nagios configuration when new services are added. But it needs a Perl service running permanently in the background, exotic Perl packages, and a heavier configuration. And automatically changing Nagios' configuration is a risky task.
But if you're not afraid, I guess you will be more satisfied with NagiosGrapher. If you're as lazy as I am, keep on reading to see how to run Nagiosgraph.
Prerequisites
- Apache
- CGI
- Perl
- Nagios
- RRDTool
Installation
Grab the latest version of Nagiosgraph from SourceForge. The following tutorial is based on version 0.8, running with Nagios 2.3.1.
Untar and unzip the archive in the Nagios home directory (generally /usr/local/nagios/), in a directory named nagiosgraph for example.
Configuration
Simply follow the instructions in the INSTALL file.
Pay attention to where performance data is activated in Nagios. It must be enabled not only globally in nagios.cfg (process_performance_data = 1), but also for each service you wish to collect data from (process_perf_data 1). I found a way to activate it automatically for each and every service, by creating a generic service template from which all actual services inherit (see below).
The "heartbeat" parameter of nagiosgraph.conf is important too: it is passed to RRDTool as the maximum delay allowed between two data collections, for all counters. Nagiosgraph's author assumes a 5-minute check interval, hence the default of 600 (600 seconds = 2 x 5 minutes). My default check interval is 15 minutes, so my "heartbeat" should be 1800 (2 x 15 minutes). In fact, I increased it to 3000 so I won't lose any data after a Nagios restart: as you have certainly noticed, Nagios can take up to 15 minutes to schedule its first check after a restart.
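The arithmetic behind these values can be written down as a small sketch. Python is used here purely for illustration, and the function name heartbeat_seconds is made up for this example; it is not part of Nagiosgraph.

```python
def heartbeat_seconds(check_interval_minutes, factor=2):
    """Return a heartbeat of `factor` check intervals, expressed in seconds."""
    return factor * check_interval_minutes * 60


# Check interval assumed by Nagiosgraph's default configuration: 5 minutes.
print(heartbeat_seconds(5))    # 600
# Check interval used in this article: 15 minutes.
print(heartbeat_seconds(15))   # 1800
# The value actually configured here (3000) is deliberately larger than 1800.
print(3000 / 60)               # 50.0 minutes
```

Anything at or above the computed value satisfies the "twice the check interval" rule; the extra margin only makes RRDTool more tolerant of late samples.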
nagiosgraph.conf
Here is my configuration file:
# File: $Id: nagiosgraph.conf,v 1.6 2005/10/08 05:55:08 sauber Exp $
# Author: (c) Soren Dossing, 2005
# License: OSI Artistic License
# http://www.opensource.org/licenses/artistic-license.php
# Debug levels
# 0 = None
# 1 = Critical
# 2 = Error
# 3 = Warn
# 4 = Info
# 5 = Debug
debug = 2
# Location of debug log file
logfile = /usr/local/nagios/nagiosgraph/log/nagiosgraph.log
# Directory to store rrd database files
rrddir = /usr/local/nagios/nagiosgraph/rrd
# File containing regular expressions to identify service and perf data
mapfile = /usr/local/nagios/nagiosgraph/map
# Color scheme for graphs. Choose a number between 1 and 8.
colorscheme = 1
# Heartbeat. In seconds, twice the size of servicecheck intervals
heartbeat = 3000
# Location of performance data log file. Comment out if not used.
perflog = /usr/local/nagios/var/perfdata.log
The "map" file
The file named "map" tells Nagiosgraph what data to collect and how to store it.
This is an excerpt from my configuration files, showing the use of perfdata or output as data sources, with NSClient, and with Linux classic checks.
If the NSClient link is dead, you can still find it on NagiosExchange.
Each Nagios service definition is followed by the corresponding map file definition. The generic template has no map file entry.

Nagios service definition (generic template):
# Generic service definition template
define service{
name generic-service
register 0
check_period 24x7
max_check_attempts 3
normal_check_interval 15
retry_check_interval 5
active_checks_enabled 1
passive_checks_enabled 0
parallelize_check 1
obsess_over_service 0
check_freshness 0
event_handler_enabled 0
flap_detection_enabled 0
process_perf_data 1
retain_status_information 1
retain_nonstatus_information 1
notification_interval 60
notification_period 24x7
notification_options w,u,c,r
notifications_enabled 1
}

Nagios service definition (NSClient CPU load):
define service{
use generic-service
name Server-Cpu
register 0
service_description Charge CPU
contact_groups nt-admins
check_command nsclient_cpuload!5,50,80
}
Corresponding map file definition:
# Service type: nsclient CPU
# check command: check_nt -H Address -v CPULOAD -l5,50,80
# output: CPU Load 9% (5 min average)
# perfdata: '5 min avg Load'=9%;70;80;0;100
/perfdata:.*5 min avg Load'=(\d+)%;.*/
and push @s, [ ntload,
[ 'avg05min', GAUGE, $1 ] ];

Nagios service definition (NSClient memory usage):
define service{
use generic-service
name Server-Mem
register 0
service_description Occupation memoire
contact_groups nt-admins
check_command nsclient_memuse!70!80
}
Corresponding map file definition:
# Service type: nsclient Memory
# check command: check_nt -H Address -v MEMUSAGE
# output: Memory usage: total:4195.81 Mb - used: 1987.96 Mb (47%) - free: 2207.85 Mb (53%)
# perfdata: 'Memory usage'=1987.96Mb;2937.07;3356.65;0.00;4195.81
/output:Memory usage:.* - used:.*\((\d+)%\) - free:.*/
and push @s, [ ramuse,
[ 'percent', GAUGE, $1 ] ];

Nagios service definition (Linux load via NRPE):
define service{
use generic-service
name nrpe-check-load
register 0
service_description Charge CPU
normal_check_interval 5
retry_check_interval 1
contact_groups linux-admins
check_command check_nrpe!check_load
}
Corresponding map file definition:
# Service type: linux remote load
# check command: check_nrpe -H Address -c check_load
# output: OK - load average: 1.69, 1.07, 0.83
# perfdata: load1=1.690000;3.000000;5.000000;0.000000 load5=1.070000;3.000000;5.000000;0.000000 load15=0.830000;3.000000;5.000000;0.000000
/output:.*load average: (\d+\.\d+), (\d+\.\d+), (\d+\.\d+)/
and push @s, [ linuxload,
[ 'avg01min', GAUGE, $1 ],
[ 'avg05min', GAUGE, $2 ],
[ 'avg15min', GAUGE, $3 ] ];
This configuration will generate RRD files for all hosts using these services. To write your own map file, you can:
- Run tail -f /usr/local/nagios/var/perfdata.log to watch live perfdata output and work out the appropriate Perl regexp
- Set the debug level to 5 in nagiosgraph.conf to check that collection goes well
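A pattern can also be tried against a sample line before it goes into the map file. The sketch below is an illustration in Python rather than Perl, on the assumption that these particular patterns behave the same in both regex engines. The patterns and sample strings are taken from the map excerpt and its comments, and the helper name captured_values is made up for this example.

```python
import re

# Patterns copied from the map file excerpt, without the Perl /.../ delimiters.
PATTERNS = {
    "ntload": r"perfdata:.*5 min avg Load'=(\d+)%;.*",
    "ramuse": r"output:Memory usage:.* - used:.*\((\d+)%\) - free:.*",
    "linuxload": r"output:.*load average: (\d+\.\d+), (\d+\.\d+), (\d+\.\d+)",
}

# Sample lines built from the "output" and "perfdata" comments of the excerpt.
SAMPLES = {
    "ntload": "perfdata:'5 min avg Load'=9%;70;80;0;100",
    "ramuse": ("output:Memory usage: total:4195.81 Mb - used: 1987.96 Mb (47%)"
               " - free: 2207.85 Mb (53%)"),
    "linuxload": "output:OK - load average: 1.69, 1.07, 0.83",
}


def captured_values(name):
    """Return the captured groups ($1, $2, ... in Perl), or None if no match."""
    match = re.search(PATTERNS[name], SAMPLES[name])
    return match.groups() if match else None


for name in PATTERNS:
    print(name, captured_values(name))
```

A result of None means the pattern does not match the sample, which is the usual reason an RRD file never gets created.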
Now that your map file is ready, you just need a link in Nagios to the URL displaying the graphs.
Nagios link to graphs
Nagiosgraph's INSTALL file mentions an icon to be displayed in Nagios. It is not part of the package, so you have to provide one yourself; copy it to the Nagios directory "share/images/logos".
To ease configuration, I created a hostgroup in Nagios containing all the hosts that use, say, service Server-Cpu:
define hostgroup{
hostgroup_name x-nsclient
alias Pour affichage icone graphe
members SERVEUR1,SERVEUR2,SERVEUR3
}
The graph icon is inserted through the Nagios configuration file serviceextinfo.cfg:
define serviceextinfo {
service_description Charge CPU
hostgroup x-nsclient
notes Graph
notes_url /nagiosgraph/show.cgi?host=$HOSTNAME$&service=$SERVICEDESC$&db=ntload,avg05min&rrdopts=%2Dl%200%20%2Du%20100
icon_image graph.gif
icon_image_alt View graphs
}
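The rrdopts value in notes_url is URL-encoded, which makes it hard to read. A small sketch in Python (standard library only, not part of Nagiosgraph) decodes the query string of the URL above:

```python
from urllib.parse import parse_qs, urlsplit

# notes_url copied from the serviceextinfo definition above.
NOTES_URL = ("/nagiosgraph/show.cgi?host=$HOSTNAME$&service=$SERVICEDESC$"
             "&db=ntload,avg05min&rrdopts=%2Dl%200%20%2Du%20100")

# Split off the query string and decode each parameter.
query = parse_qs(urlsplit(NOTES_URL).query)
rrdopts = query["rrdopts"][0]

print(rrdopts)         # -l 0 -u 100
print(query["db"][0])  # ntload,avg05min
```

-l and -u are the short forms of rrdtool graph's lower-limit and upper-limit options, so this link asks for a y-axis running from 0 to 100, which suits a percentage such as CPU load. The db parameter names the ntload database and its avg05min data source, matching the map file entry.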
Now you can restart Nagios. It should work (cross your fingers).
See also
If you're interested in using NagiosGraph in a Windows environment, you can visit this site.