14 Commits

Author SHA1 Message Date
9f591c0aa2 Update Documentation (#12411)
* Update Documentation

Most work in Graylog and SNMP
Fixed many code tags, spelling and wording.
Added SNMP PowerShell example.

* Update SNMP-Configuration-Examples.md

Fix TOC brackets
2020-12-30 15:38:14 +01:00
d23ed0dd6e Update to incorporate new locking mechanisms (#12388) 2020-12-12 18:02:11 +01:00
1e4702fa4f Support multiple daily process locking backends with distributed polling (#11896)
* Implement locks in the file cache

* Replace custom locks

* implement restore lock
Used when re-hydrating

* remove legacy use statements

* Add class descriptions

* Fix style

* Default to database cache driver

* missed cache_locks table
prevent chicken-egg issue

* style fixes

* Remove custom file lock implementation

* missed items from file cache

* Update schema definition
hmm, other schema noise must be from manual modification as this is generated from a freshly migrated db.

* require predis, it is pure PHP, so no harm in adding

* and set predis as the default client
2020-10-07 07:36:35 -05:00
41ed0537b4 Fix midnight poller data loss (#11582)
* Handle more signals

* Flush buffers before exiting process
This ensures log messages aren't lost

* Restart process before jobs have finished
If there is a very long-running job, it can cause a service restart to
take over 5 minutes.

We tweak the order of things to make sure that running processes
continue, but nothing more is scheduled.

The worst case impact is that a polling/discovery job gets
scheduled twice, but this should not be a big issue - this should
only occur at most once per day.

* Remove python 3.8 feature

* Ensure that processes from the previous invocation are reaped

* Correct typos

* Attach subprocess descriptors to /dev/null

Occasionally, PHP would throw a fit and crash when its stdout went
away. To avoid this, we attach stdout to devnull.

This means we lose the output of daily.sh, but this is already recorded
in $LOGDIR/daily.log

* Don't immediately schedule long running jobs

To avoid the situation where a maintenance reload or a SIGHUP happens
and a second long-running job is immediately started, we wait
(`last_[poll/discovery]_timetaken` * 1.25) seconds before scheduling
any jobs.

* Add `psutil` to requirements

* Add support for "systemctl reload" to the unit files

* Add a fallback for systems that don't have psutil

* Reduce CPU load when psutil is not installed

* Don't avoid double polling by extending the timeout

This shouldn't happen due to locks

* Remove fallback option

* Remove extra variable

* Fix issue introduced during rebase

* Fix issue introduced when fixing issue introduced during rebase

* Make psutil optional
2020-09-29 23:50:40 -05:00
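
The "Attach subprocess descriptors to /dev/null" change in 41ed0537b4 above can be illustrated with a minimal Python sketch. This is not the code from the commit; the helper name `run_detached` and the child command are hypothetical, and the only technique shown is pointing a child's stdin, stdout and stderr at the null device so the child never writes to a descriptor that has gone away.

```python
import subprocess
import sys


def run_detached(command):
    """Run `command` with all standard descriptors attached to the null device.

    Hypothetical helper for illustration only. Returns the child's exit code.
    """
    completed = subprocess.run(
        command,
        stdin=subprocess.DEVNULL,   # child reads EOF immediately
        stdout=subprocess.DEVNULL,  # child output is discarded
        stderr=subprocess.DEVNULL,  # child errors are discarded
        check=False,                # report the exit code instead of raising
    )
    return completed.returncode


if __name__ == "__main__":
    # Hypothetical child process: prints to stdout, which is thrown away.
    exit_code = run_detached([sys.executable, "-c", "print('discarded')"])
    print(exit_code)
```

Because the output is discarded, anything the child needs to report has to be logged by the child itself, which matches the commit's note that daily.sh output is already recorded in $LOGDIR/daily.log.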
bffa46f34a Update the options for distributed_poller (#11655)
Added $config[] and enabled the remi repository for redis.
2020-05-22 01:16:56 +02:00
a9f6c935a4 Removed reference to deprecated poller-service.py (#11598)
* Removed reference to deprecated poller-service.py

* Update Dispatcher-Service.md

Co-authored-by: Tony Murray <murraytony@gmail.com>
2020-05-12 15:39:44 -05:00
cdb6a74dc8 implement watchdog to librenms-service (#11353)
* add watchdog to librenms-service to check log file
add Redis timeout to librenms-service

* updated docs

* fixed logfile_watchdog() indentation in service.py

* indentation fix

* code climate patch

* updated default redis timeout if alerting frequency is 0
2020-03-31 23:10:45 -05:00
299ae36cd1 Update Dispatcher-Service.md (#11297)
Added a few lines about tuning the number of worker threads. Hope I got it right.
2020-03-19 06:41:54 -05:00
b460644c23 Updated Spelling (#10922)
Memecached -> Memcached
2019-12-10 09:03:56 +01:00
38febff1ec Add memcached to DS-docs (#10715)
* Add memcached to DS-docs

As per request from @murrant in Discord, here is a small update on the Dispatcher service still needing a central memcached

* Update Dispatcher-Service.md

* Update Dispatcher-Service.md
2019-10-20 23:15:25 +00:00
74724a4618 Add redis sentinel support to dispatcher service (#10598)
* Add redis sentinel support to dispatcher service

* Update docs for redis sentinel support

* Don't re-raise python exception in service
2019-10-01 06:51:07 +00:00
92837e5c2b Dispatcher Service: Documentation Typo (#10620) 2019-09-23 06:26:53 -05:00
e4c9153d16 more documentation clean up (#10577)
* fix a few bare URLs

* make mdl happy

* make Weathermap.md as mdl happy as possible

* make Varnish.md as mdl happy as possible

* make Two-Factor-Auth.md mdl happy

* touch one header for Syslog.md, but little can be done about the rest

* make Sub-Directory.md as mdl happy as possible

* make SNMP-Trap-Handler.md lint happy

* make SNMP-Proxy.md mdl happy

* make Smokeping.md as mdl happy as possible

* make Services.md mdl happy

* make RRDTune.md mdl happy

* cleanup RRDCached.md as much as possible

* make RRDCached-Security.md mdl happy

* make Rancid.md as mdl happy as possible

* make Proxmox.md mdl happy

* make Plugin-System.md as mdl happy as possible

* make PeeringDB.md mdl happy

* make Oxidized.md more lint happy

* make Network-Map.md mdl happy

* make MIB-based-polling.md as mdl happy as possible

* make Metric-Storage.md mdl happy

* make IRC-Bot.md as mdl happy as possible

* make IRC-Bot-Extensions.md as mdl happy as possible

* make

* make Graylog.md mdl happy

* make Gateone.md mdl happy

* make Fast-Ping-Check.md mdl happy

* make Distributed-Poller.md as mdl happy as possible

* make Dispatcher-Service.md as mdl happy as possible

* make Device-Groups.md mdl happy

* make Dell-OpenManage.md mdl happy

* make Dashboard.md mdl happy

* make Customizing-the-Web-UI.md as mdl happy as possible

* make Component.md mdl happy

* make Billing-Module.md mdl happy

* make Auto-Discovery.md mostly mdl happy

* make Authentication.md as mdl happy as possible

* tidy up a few lines in Applications.md

* make Agent-Setup.md as mdl happy as possible

* make metrics/OpenTSDB.md mdl happy

* spelling fix
2019-09-09 12:48:35 +02:00
604a200891 Python dispatcher service v2 (#10050)
* Refactor LibreNMS service
add ping

* services ported
remove legacy stats collection

* alerting

* implement unique queues

* update discovery queue manager

* remove message

* more cleanup

* Don't shuffle queue

* clean up imports

* don't try to discover ping only devices

* Fix for discovery not running timer

* Update docs a bit and add some additional config options.
Intentionally undocumented.

* Wait until the device is marked up by the poller before discovering

* Handle losing connection to db gracefully

* Attempt to release master after 5 db failures

* Sleep to give other nodes a chance to acquire

* Update docs and rename the doc to Dispatcher Service to more accurately reflect its function.

* add local notification
2019-05-20 11:35:47 -05:00