Smokeping Project Page
Smokeping is a tool for tracking the uptime and latency of network paths
over time. It produces nice graphical output. NETS uses it to track
long-term uptime statistics of the network. Smokeping is based upon RRDTool;
both are written by Tobi Oetiker. The NETS installation can be found here.
The FRGP installation can be found here.
Installation
Smokeping is included in Debian archives, so installation on the NETS and
FRGP statistics servers is as simple as apt-get install smokeping.
In addition, a small patch needs to be applied. Because NETS keeps much more
history than the default configuration (described below in Configuration),
a small change is needed to speed up graph rendering. This change will eventually
show up in the released versions of Smokeping.
diff /usr/share/perl5/smokeping/Smokeping.pm ~mitchell/Smokeping.pm-patched
601c601
< ("dummy", '--start', -$start,
---
> ("dummy", '--start', -$start,'--end','-'.int($start / $cfg->{Presentation}{detail}{width}),
807a808
> '--end','-'.int($start / $cfg->{Presentation}{detail}{width}),
Configuration
The Smokeping configuration file is /etc/smokeping/config. Most of the configuration
can be left at its default settings. This is a rundown of what does change.
- Consider changing the datadir to a filesystem larger than /var.
- Update the owner, contact and cgiurl to point to NETS/FRGP as appropriate.
- Update the 'to' parameter for Alerts.
- Update the 'Database' to keep more data. Since NETS will be post-processing the data
in order to generate accurate uptime/downtime/unknown numbers in the future, the RRD
files have been configured to keep more data. This is described in more detail below.
- Update the Targets section to include the actual hosts to monitor. We specify hosts
by IP address so that DNS problems are not recorded as outages. It also ensures
that we don't end up pinging different interfaces in the future by accident.
Database Configuration
Some time was spent researching how to save a lot of data while at the same time keeping
the graph rendering quick. This resulted in a somewhat 'optimal' RRD configuration. A
single RRD file keeps a set of time-ordered data. The file as a whole has a base resolution,
and the various archives (termed RRAs) inside the file keep copies of the
data at various multiples of the base resolution.
In our design, the base resolution is set to 60 seconds. We run test pings once a minute
and keep the data. Additionally, Smokeping is configured to keep all of the data for
five years. If this were all that was kept, then drawing long-term graphs would require
the program to read the entire data file to generate the graph. It would read
over 2 million data points and then draw a graph which is only 600 pixels wide. To
ease this job, the RRD file is configured to keep additional RRA sections which are matched
to the timespan and size of the graphs we want to produce. For example, a graph showing
the last 10 days in 600 pixels of width includes (10 days * 24 hours * 60 minutes) = 14,400
minutes of data. 14,400 minutes / 600 pixels = 24 minutes / pixel. To accommodate this
graph we add an RRA which keeps one data sample every 24 minutes. A similar
calculation is done for each time span we want to graph. The resulting RRD configuration
is shown below.
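The calculation above can be sketched in a few lines of Python. This is only an
illustration of the arithmetic, not part of Smokeping; the function name
steps_per_pixel is made up for this example, and the 600 pixel width and
60 second step are the values quoted in the text.

```python
STEP_SECONDS = 60      # base resolution of the RRD file
GRAPH_WIDTH_PX = 600   # graph width quoted in the text


def steps_per_pixel(span_days, step_seconds=STEP_SECONDS, width_px=GRAPH_WIDTH_PX):
    """Return how many base-resolution steps fall under one pixel of the graph."""
    span_steps = span_days * 24 * 60 * 60 / step_seconds
    return span_steps / width_px


# Graph spans taken from the comments in the Database section.
for label, days in [("10 day", 10), ("400 day", 400), ("5 year", 5 * 365)]:
    print(f"{label} graph: {steps_per_pixel(days):g} steps per pixel")
```

With a 60 second step, one step is one minute, so the results (24, 960 and 4380)
are the multipliers that appear in the corresponding RRA lines.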
The patch applied above is to ensure that the appropriate RRA is used to generate the
graphs. By default, Smokeping generates graphs from some point in the past until the
current second. Since the RRA for the 5 year graph only gets a new data point about
once every three days, most of the time the primary 1-minute resolution RRA has more
complete data and Smokeping tries to use it instead. The patch pushes the 'end' time
of the graph back far enough to ensure that the low-resolution (and fast) RRA will
be complete.
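As a rough illustration of what the patch computes, the sketch below mirrors the
Perl expression int($start / width) in Python. It assumes that $start holds the
graph span in seconds and that the detail width is 600 pixels; the function name
end_offset_seconds is hypothetical and does not exist in Smokeping.

```python
WIDTH_PX = 600          # assumed value of the Presentation detail width
DAY = 24 * 60 * 60      # seconds per day


def end_offset_seconds(span_seconds, width_px):
    """Seconds by which the graph end is moved into the past.

    Mirrors int($start / width) from the patch; span_seconds stands in
    for Perl's $start (assumed to be the graph span in seconds).
    """
    return int(span_seconds / width_px)


for label, span in [("10 day", 10 * DAY), ("5 year", 5 * 365 * DAY)]:
    offset = end_offset_seconds(span, WIDTH_PX)
    # The patch passes this value, prefixed with '-', as the --end argument.
    print(f"{label} graph: --end -{offset} ({offset / 60:g} minutes back)")
```

Under these assumptions the end time moves back by one pixel's worth of data,
which is 24 minutes for the 10 day graph and 4380 minutes for the 5 year graph,
the same interval as one row of the matching RRA.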
*** Database ***
step = 60
pings = 20
# Try keeping data for five years at high res
AVERAGE 0.5 1 2628000
# RRA for 3 day graph
AVERAGE 0.5 3 1200
MIN 0.5 3 1200
MAX 0.5 3 1200
# RRA for 10 day graph
AVERAGE 0.5 24 1200
MIN 0.5 24 1200
MAX 0.5 24 1200
# RRA for 400 day graph
AVERAGE 0.5 960 1200
MIN 0.5 960 1200
MAX 0.5 960 1200
# RRA for 5 year graph
AVERAGE 0.5 4380 1200
MIN 0.5 4380 1200
MAX 0.5 4380 1200
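To check how much history each archive holds, multiply its steps per row by its
row count and by the 60 second step. The Python sketch below does this for the
AVERAGE lines above; the labels and the function name coverage_days are
made up for this example.

```python
STEP_SECONDS = 60  # 'step' from the Database section

# (label, steps per row, rows) copied from the AVERAGE lines above
RRAS = [
    ("high res", 1, 2628000),
    ("3 day graph", 3, 1200),
    ("10 day graph", 24, 1200),
    ("400 day graph", 960, 1200),
    ("5 year graph", 4380, 1200),
]


def coverage_days(steps, rows, step_seconds=STEP_SECONDS):
    """Return the timespan, in days, that one RRA can hold."""
    return steps * rows * step_seconds / 86400


for label, steps, rows in RRAS:
    print(f"{label}: {coverage_days(steps, rows):g} days")
```

By this arithmetic the high-resolution archive holds 1825 days (five years), and
the 1200-row archives hold 2.5, 20, 800 and 3650 days respectively.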
Caveats and Bugs
The current version of Smokeping is written such that the cgi-bin script will re-create the RRD
files if they are not present. Unfortunately, they are created with the wrong ownership and
the smokeping daemon can't update them. If this happens, stop the smokeping daemon with
/etc/init.d/smokeping stop, delete the RRD files, and restart the daemon with
/etc/init.d/smokeping start. Hopefully the daemon will then create the files with the right
ownership. Alternatively, try running chown on the existing files. This should rarely be
an issue since new targets are not added very often.
Address comments or questions about this Web page to the Network Engineering &
Telecommunications Section at nets-www@ncar.ucar.edu.
The NETS is part of the Computational & Information Systems Laboratory of the
National Center for Atmospheric Research, which is sponsored by the National
Science Foundation and managed by the University Corporation for Atmospheric Research.