source: Extensions/Distributed-Poller.md
path: blob/master/doc/

# Distributed Poller

LibreNMS can distribute the polling of devices to other machines.

These machines can be in a different physical location, which
minimizes network latency for devices that are a considerable distance
away or behind NAT firewalls.

Devices can be grouped together into a `poller_group` to pin them
to a single poller or a group of designated pollers.

All pollers need to share their RRD folder, for example via NFS or a
combination of NFS and rrdcached.

All pollers must be able to access the central memcached instance
to communicate with each other.

# Requirements

These requirements are in addition to the normal requirements for a full LibreNMS install.

- rrdtool version 1.4 or above
- python-memcached package
- a memcached install
- a rrdcached install

By default, all hosts are shared and have `poller_group = 0`. To
pin a device to a poller, set its `poller_group` to a value greater than 0 and set the
same value in the poller's config with
`$config['distributed_poller_group']`. You can also specify a
comma-separated string of poller groups in
`$config['distributed_poller_group']`; the poller will then poll
devices from any of the groups listed. If new devices are added from
the poller, they will be assigned to the first poller group in the list
unless a group is specified when adding the device.

A standard configuration for a distributed poller would look like:

```php
// Distributed Poller-Settings
$config['distributed_poller'] = true;
// optional: defaults to hostname
//$config['distributed_poller_name'] = 'custom';
$config['distributed_poller_group'] = 0;
$config['distributed_poller_memcached_host'] = 'example.net';
$config['distributed_poller_memcached_port'] = 11211;
```

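
As described above, a poller can serve more than one poller group. A minimal sketch, assuming poller groups 2 and 3 have already been created and assigned to devices (the group IDs and host name here are placeholders, not values you must use):

```php
// Poll devices from poller groups 2 and 3 (illustrative group IDs).
// Devices added from this poller are assigned to group 2, the first in
// the list, unless a group is specified when adding the device.
$config['distributed_poller'] = true;
$config['distributed_poller_group'] = '2,3';
$config['distributed_poller_memcached_host'] = 'example.net';
$config['distributed_poller_memcached_port'] = 11211;
```
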
## Example Setup

Below is an example setup based on a real deployment which, at the time
of writing, covers over 2,500 devices and 50,000 ports. The setup
runs within an OpenStack environment with some commodity hardware
for remote pollers. Here's a diagram of how you can scale LibreNMS
out:



## Architecture

How you set up the distribution is entirely up to you. You can choose
to host the majority of the required services on a single virtual
machine or server, with a separate poller to query the devices
being monitored, or go all the way to a dedicated server for
each of the individual roles. Below are notes on what you need to
consider, both at the software layer and for connectivity.

### Web / API Layer

This is typically Apache, but we have setup guides for both Nginx and
Lighttpd, which should work perfectly fine. There is nothing unique
about the role this service provides, except that if you are adding
devices from this layer then the web service will need to be able to
connect to the end device via SNMP and perform an ICMP test.

It is advisable to run RRDCached within this setup so that you don't
need to share the RRD folder via a remote file share such as NFS. The
web service can then generate RRD graphs via RRDCached. If RRDCached
isn't an option, you can mount the RRD directory to read the RRD
files directly.

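
The two options above might translate into the web server's `config.php` roughly as follows. This is only a sketch: the host and port are placeholders, and the `rrd_dir` setting and mount path are assumptions about a typical install rather than values taken from this setup.

```php
// Option 1: generate graphs through RRDCached (host:port is a placeholder).
$config['rrdcached'] = "example.com:42217";

// Option 2 (assumption): no RRDCached, so read the RRD files directly from
// a locally mounted copy of the shared RRD directory, e.g. an NFS mount.
//$config['rrd_dir'] = "/opt/librenms/rrd";
```
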
### Database Server

MySQL / MariaDB - At the moment these are the only database servers
that are supported. Work is being done to ensure MySQL strict mode is
also supported, but this should still be considered incomplete,
and strict mode should therefore be disabled.

The pollers, web and API layers should all be able to access the
database server directly.

### RRD Storage

Central storage should be provided so all RRD files can be read from
and written to in one location. As suggested above, it's recommended
that RRDCached is configured and used.

For this example, we are running RRDCached to allow all pollers and
web/API servers to read from and write to the RRD files, with the RRD directory
also exported via NFS for simple access and maintenance.

### Memcache

Memcache is required so that the distributed pollers can
register with a central location and record which devices have been
polled. Memcache can run on any of the servers as long as it is
accessible by all pollers.

### Pollers

Pollers can be installed and run from anywhere. The only requirements are:

- They can access the Memcache instance
- They can create RRD files via some method such as a shared
  filesystem or RRDTool >=1.5.5
- They can access the MySQL server

You can either assign pollers to groups and set a poller group
against certain devices, which means that those devices will only
be processed by certain pollers (the default poller group is 0), or you can
assign all pollers to the default poller group so that they process any
and all devices.

This makes it possible to have a single poller behind a NAT
firewall monitor internal devices and report back to your central
system. You can then monitor those devices from the Web UI
as normal.

Another benefit is that you can provide N+x pollers. For example, if
you know that you require three pollers to process all devices within
300 seconds, then adding a fourth poller means that should any
single poller fail, the remaining three will still complete polling in
time. You could also use this to take a poller out of service for
maintenance, e.g. OS updates and software updates.

It is strongly advisable to either run a central recursive DNS server
such as pdns-recursor and have all of your pollers use it, or install
a recursive DNS server on each poller. The volume of DNS requests on
large installs can be significant and will slow polling down enough to
cause issues with a large number of devices.

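
To illustrate the NAT scenario described in this section, a remote poller could be pinned to its own poller group so that only it polls the devices behind the firewall. A minimal sketch, assuming those devices have been assigned to a hypothetical poller group 1 and that the poller can reach the central Memcache, RRDCached and database hosts (the poller name and all host names are placeholders):

```php
// Remote poller behind the NAT firewall (group 1 is a hypothetical ID).
$config['distributed_poller'] = true;
// Hypothetical name; if omitted, the hostname is used.
$config['distributed_poller_name'] = 'remote-site-poller';
$config['distributed_poller_group'] = 1;
$config['distributed_poller_memcached_host'] = "example.com";
$config['distributed_poller_memcached_port'] = 11211;
$config['rrdcached'] = "example.com:42217";
```

With a layout like this, pollers left in the default group 0 would not be expected to pick up the devices assigned to group 1.
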
### Discovery

It's not necessary to run discovery services on all pollers. In fact,
you should only run one discovery process per poller group. Designate
a single poller to run discovery (or a separate server if required).

### Config sample

The following config is taken from a live setup which consists of a
Web server, DB server, RRDCached server and 3 pollers.

Web Server:

Running Apache and an install of LibreNMS in /opt/librenms

- config.php

```php
$config['distributed_poller'] = true;
$config['rrdcached'] = "example.com:42217";
```

Database Server:

Running Memcache and MariaDB

- Memcache

Ubuntu (/etc/memcached.conf)

```conf
-d
-m 64
-p 11211
-u memcache
-l ip.ip.ip.ip
```

RRDCached Server:

Running RRDCached

- RRDCached

Ubuntu (/etc/default/rrdcached)

```conf
OPTS="-l 0:42217"
OPTS="$OPTS -j /var/lib/rrdcached/journal/ -F"
OPTS="$OPTS -b /opt/librenms/rrd -B"
OPTS="$OPTS -w 1800 -z 900"
```

Ubuntu (/etc/default/rrdcached) - RRDCached 1.5.5 and above.

```conf
OPTS="-l 0:42217"
OPTS="$OPTS -R -j /var/lib/rrdcached/journal/ -F"
OPTS="$OPTS -b /opt/librenms/rrd -B"
OPTS="$OPTS -w 1800 -z 900"
```

Poller 1:

Running an install of LibreNMS in /opt/librenms

`config.php`

```php
$config['distributed_poller_name'] = php_uname('n');
$config['distributed_poller_group'] = '0';
$config['distributed_poller_memcached_host'] = "example.com";
$config['distributed_poller_memcached_port'] = 11211;
$config['distributed_poller'] = true;
$config['rrdcached'] = "example.com:42217";
$config['update'] = 0;
```

`/etc/cron.d/librenms`

Runs discovery and polling for group 0, `daily.sh` to deal with
notifications and DB cleanup, and alerts.

```conf
33 */6 * * * librenms /opt/librenms/cronic /opt/librenms/discovery-wrapper.py 1
*/5 * * * * librenms /opt/librenms/discovery.php -h new >> /dev/null 2>&1
*/5 * * * * librenms /opt/librenms/cronic /opt/librenms/poller-wrapper.py 16
15 0 * * * librenms /opt/librenms/daily.sh >> /dev/null 2>&1
* * * * * librenms /opt/librenms/alerts.php >> /dev/null 2>&1
```

Poller 2:

Running an install of LibreNMS in /opt/librenms

`config.php`

```php
$config['distributed_poller_name'] = php_uname('n');
$config['distributed_poller_group'] = '0';
$config['distributed_poller_memcached_host'] = "example.com";
$config['distributed_poller_memcached_port'] = 11211;
$config['distributed_poller'] = true;
$config['rrdcached'] = "example.com:42217";
$config['update'] = 0;
```

`/etc/cron.d/librenms`

Runs billing as well as polling for group 0.

```conf
*/5 * * * * librenms /opt/librenms/poller-wrapper.py 16 >> /opt/librenms/logs/wrapper.log
*/5 * * * * librenms /opt/librenms/poll-billing.php >> /dev/null 2>&1
01 * * * * librenms /opt/librenms/billing-calculate.php >> /dev/null 2>&1
15 0 * * * librenms /opt/librenms/daily.sh >> /dev/null 2>&1
```

Poller 3:

Running an install of LibreNMS in /opt/librenms

`config.php`

```php
$config['distributed_poller_name'] = php_uname('n');
$config['distributed_poller_group'] = '2,3';
$config['distributed_poller_memcached_host'] = "example.com";
$config['distributed_poller_memcached_port'] = 11211;
$config['distributed_poller'] = true;
$config['rrdcached'] = "example.com:42217";
$config['update'] = 0;
```

`/etc/cron.d/librenms`

Runs discovery and polling for groups 2 and 3.

```conf
33 */6 * * * librenms /opt/librenms/cronic /opt/librenms/discovery-wrapper.py 1
*/5 * * * * librenms /opt/librenms/discovery.php -h new >> /dev/null 2>&1
*/5 * * * * librenms /opt/librenms/cronic /opt/librenms/poller-wrapper.py 16
15 0 * * * librenms /opt/librenms/daily.sh >> /dev/null 2>&1
```