Use dpinger defaults for send_interval and time_period

Use gcc by default
Add explicit cast for assignment of alarm_hold_periods
2024-05-19 06:50:01 +00:00 · 2023-11-08 10:40:52 -08:00 · 2023-11-08 10:22:58 -08:00 · 2023-11-08 10:08:07 -08:00 · 2023-11-08 08:29:46 -08:00 · 2023-11-08 08:07:39 -08:00
14 changed files with 1215 additions and 167 deletions
--- a/13
+++ b/13
@@ -1,15 +1,15 @@
-Copyright (c) 2015-2016, Denny Page
+Copyright (c) 2015-2022, Denny Page
 All rights reserved.

 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions are met:

-* Redistributions of source code must retain the above copyright notice, this
-  list of conditions and the following disclaimer.
+1. Redistributions of source code must retain the above copyright notice, this
+   list of conditions and the following disclaimer.

-* Redistributions in binary form must reproduce the above copyright notice,
-  this list of conditions and the following disclaimer in the documentation
-  and/or other materials provided with the distribution.
+2. Redistributions in binary form must reproduce the above copyright notice,
+   this list of conditions and the following disclaimer in the documentation
+   and/or other materials provided with the distribution.

 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
 AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
@@ -21,4 +21,3 @@ SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
 CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
 OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-
--- a/8
+++ b/8
@@ -1,5 +1,9 @@
-WARNINGS=-Wall -Wextra -Wformat=2
+CC=gcc
+WARNINGS=-Wall -Wextra -Wformat=2 -Wno-unused-result

-CFLAGS=${WARNINGS} -pthread -g
+#CC=clang
+#WARNINGS=-Weverything -Wno-unsafe-buffer-usage -Wno-cast-function-type-strict -Wno-padded -Wno-disabled-macro-expansion -Wno-reserved-id-macro
+
+CFLAGS=${WARNINGS} -pthread -g -O2

 all: dpinger
--- a/NOTES.md
+++ b/NOTES.md
@@ -0,0 +1,13 @@
+<b>Loss accuracy</b>
+
+In general, dpinger works a bit differently than other latency monitors. Rather than a "probe" that fires off and processes a handful of echo request/replies all at once, dpinger maintains a rolling array of echo requests spaced on the send interval. In other words, instead of waking up every second and sending 4 echo requests at once, dpinger sends an echo request every 250 milliseconds. When dpinger receives an echo reply, the time difference between the request packet and reply packet (latency) is recorded. There is nothing that times out an echo request/reply and records it as permanently lost.
+
+When the alert check is made, or a report is generated, dpinger goes through the array and examines each echo request. If a reply has been received, it is used as part of the overall latency calculation. If a reply has not yet been received, the amount of time since the request is compared against the loss interval. If it is greater than the loss interval, the request/reply is counted as lost in the current report. However, the concept of the request/reply being lost is not a permanent decision. In subsequent reports, if the missing reply has been received, its latency will be used instead of being counted as lost.
+
+It's important to keep in mind that latency and loss are reported as averages across the entire request set. The default time period for dpinger is 60 seconds, with an echo request being sent every 500 milliseconds. This means that the latency and loss will be reported as averages across 116-120 samples. The alert check runs every second by default, so each time it runs, the 2 oldest entries in the set have been replaced by the 2 newest ones.
+
+Note that if you want accurate loss reporting, it is important that the number of samples be sufficient. In order to achieve 1% loss resolution, you need more than 100 samples in the set. The calculation for loss resolution is:
+
+  100 / ((time_period - loss_interval) / send_interval)
+
+The default settings for dpinger report loss with a resolution of approximately 0.86%.
--- a/dpinger.c
+++ b/dpinger.c
@@ -1,6 +1,6 @@

 //
-// Copyright (c) 2015-2016, Denny Page
+// Copyright (c) 2015-2023, Denny Page
 // All rights reserved.
 //
 // Redistribution and use in source and binary forms, with or without
@@ -27,6 +27,11 @@
 // SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
 //

+
+// Silly that this is required for accept4 on Linux
+#define _GNU_SOURCE
+
+
 #include <stdio.h>
 #include <errno.h>
 #include <string.h>
@@ -39,10 +44,11 @@
 #include <signal.h>

 #include <netdb.h>
-#include <net/if.h>
 #include <sys/socket.h>
+#include <net/if.h>
 #include <sys/un.h>
 #include <sys/stat.h>
+#include <sys/file.h>
 #include <netinet/in.h>
 #include <netinet/ip.h>
 #include <netinet/ip_icmp.h>
@@ -52,17 +58,6 @@
 #include <pthread.h>
 #include <syslog.h>

-// TODO:
-//
-// After December 31st, 2016, review use of fcntl() for setting non blocking
-// and close on exec. It would be preferable to use accept4(), SOCK_CLOEXEC
-// and SOCK_NONBLOCK. These are currently avoided to allow use on older
-// systems such as FreeBSD 9.3, Linux 2.6.26.
-// For Linux accept4() currently requires defining _GNU_SOURCE which we would
-// like to avoid.
-// For FreeBSD, these definitions were introduced with FreeBSD 10.0 and are
-// not present in 9.3 which is supported through 2016.
-

 // Who we are
 static const char *             progname;
@@ -74,16 +69,17 @@ static const char *             pidfile_name = NULL;
 // Flags
 static unsigned int             flag_rewind = 0;
 static unsigned int             flag_syslog = 0;
+static unsigned int             flag_priority = 0;

 // String representation of target
 #define ADDR_STR_MAX            (INET6_ADDRSTRLEN + IF_NAMESIZE + 1)
 static char                     dest_str[ADDR_STR_MAX];

 // Time period over which we are averaging results in ms
-static unsigned long            time_period_msec = 30000;
+static unsigned long            time_period_msec = 60000;

 // Interval between sends in ms
-static unsigned long            send_interval_msec = 250;
+static unsigned long            send_interval_msec = 500;

 // Interval before a sequence is initially treated as lost
 // Input from command line in ms and used in us
@@ -108,8 +104,9 @@ static unsigned long            loss_alarm_threshold_percent = 0;
 static char *                   alert_cmd = NULL;
 static size_t                   alert_cmd_offset;

-// Number of periods to wait to declare an alarm as cleared
-#define ALARM_DECAY_PERIODS     10
+// Interval before an alarm is cleared (hold time)
+static unsigned long            alarm_hold_msec = 0;
+#define DEFAULT_HOLD_PERIODS    10

 // Report file
 static const char *             report_name = NULL;
@@ -191,33 +188,14 @@ static uint16_t                 echo_id;
 static uint16_t                 next_sequence = 0;
 static uint16_t                 sequence_limit;

-
-//
-// Termination handler
-//
-static void
-term_handler(void)
-{
-    // NB: This function may be simultaneously invoked by multiple threads
-    if (usocket_name)
-    {
-        (void) unlink(usocket_name);
-    }
-    if (pidfile_name)
-    {
-        (void) unlink(pidfile_name);
-    }
-    exit(0);
-}
+// Receive thread ready
+static unsigned int             recv_ready = 0;


 //
 // Log for abnormal events
 //
-#ifdef __GNUC__
-static void logger(const char * format, ...) __attribute__ ((format (printf, 1, 2)));
-#endif
-
+__attribute__ ((format (printf, 1, 2)))
 static void
 logger(
    const char *                format,
@@ -238,6 +216,28 @@ logger(
 }


+//
+// Termination handler
+//
+__attribute__ ((noreturn))
+static void
+term_handler(
+    int                         signum)
+{
+    // NB: This function may be simultaneously invoked by multiple threads
+    if (usocket_name)
+    {
+        (void) unlink(usocket_name);
+    }
+    if (pidfile_name)
+    {
+        (void) unlink(pidfile_name);
+    }
+    logger("exiting on signal %d\n", signum);
+    exit(0);
+}
+
+
 //
 // Compute checksum for ICMP
 //
@@ -288,7 +288,7 @@ llsqrt(
        }
    }

-    return s;
+    return (unsigned long) s;
 }


@@ -300,7 +300,7 @@ ts_elapsed_usec(
    const struct timespec *     old,
    const struct timespec *     new)
 {
-    unsigned long               r_usec;
+    long                        r_usec;

    // Note that we are using monotonic clock and time cannot run backwards
    if (new->tv_nsec >= old->tv_nsec)
@@ -312,18 +312,21 @@ ts_elapsed_usec(
        r_usec = (new->tv_sec - old->tv_sec - 1) * 1000000 + (1000000000 + new->tv_nsec - old->tv_nsec) / 1000;
    }

-    return r_usec;
+    return (unsigned long) r_usec;
 }


 //
 // Send thread
 //
+__attribute__ ((noreturn))
 static void *
 send_thread(
+    __attribute__ ((unused))
    void *                      arg)
 {
    struct timespec             sleeptime;
+    ssize_t                     len;
    int                         r;

    // Set up our echo request packet
@@ -332,18 +335,23 @@ send_thread(
    echo_request->code = 0;
    echo_request->id = echo_id;

+    // Give the recv thread a moment to initialize
+    sleeptime.tv_sec = 0;
+    sleeptime.tv_nsec = 10000; // 10us
+    do {
+        r = nanosleep(&sleeptime, NULL);
+        if (r == -1)
+        {
+            logger("nanosleep error in send thread waiting for recv thread: %d\n", errno);
+        }
+    } while (recv_ready == 0);
+
    // Set up the timespec for nanosleep
    sleeptime.tv_sec = send_interval_msec / 1000;
    sleeptime.tv_nsec = (send_interval_msec % 1000) * 1000000;

    while (1)
    {
-        r = nanosleep(&sleeptime, NULL);
-        if (r == -1)
-        {
-            logger("nanosleep error in send thread: %d\n", errno);
-        }
-
        // Set sequence number and checksum
        echo_request->sequence = htons(next_sequence);
        echo_request->cksum = 0;
@@ -351,43 +359,51 @@ send_thread(

        array[next_slot].status = PACKET_STATUS_EMPTY;
        sched_yield();
-        clock_gettime(CLOCK_MONOTONIC, &array[next_slot].time_sent);

-        r = sendto(send_sock, echo_request, echo_request_len, 0, (struct sockaddr *) &dest_addr, dest_addr_len);
-        if (r == -1)
+        clock_gettime(CLOCK_MONOTONIC, &array[next_slot].time_sent);
+        array[next_slot].status = PACKET_STATUS_SENT;
+        len = sendto(send_sock, echo_request, echo_request_len, 0, (struct sockaddr *) &dest_addr, dest_addr_len);
+        if (len == -1)
        {
            logger("%s%s: sendto error: %d\n", identifier, dest_str, errno);
        }
-        array[next_slot].status = PACKET_STATUS_SENT;

        next_slot = (next_slot + 1) % array_size;
        next_sequence = (next_sequence + 1) % sequence_limit;
-    }

-    // notreached
-    return arg;
+        r = nanosleep(&sleeptime, NULL);
+        if (r == -1)
+        {
+            logger("nanosleep error in send thread: %d\n", errno);
+        }
+    }
 }


 //
 // Receive thread
 //
+__attribute__ ((noreturn))
 static void *
 recv_thread(
+    __attribute__ ((unused))
    void *                      arg)
 {
    struct sockaddr_storage     src_addr;
    socklen_t                   src_addr_len;
-    size_t                      len;
+    ssize_t                     len;
    icmphdr_t *                 icmp;
    struct timespec             now;
    unsigned int                array_slot;

+    // Thread startup complete
+    recv_ready = 1;
+
    while (1)
    {
        src_addr_len = sizeof(src_addr);
        len = recvfrom(recv_sock, echo_reply, echo_reply_len, 0, (struct sockaddr *) &src_addr, &src_addr_len);
-        if (len == (unsigned int) -1)
+        if (len == -1)
        {
            logger("%s%s: recvfrom error: %d\n", identifier, dest_str, errno);
            continue;
@@ -400,7 +416,7 @@ recv_thread(
            size_t              ip_len;

            // With IPv4, we get the entire IP packet
-            if (len < sizeof(struct ip))
+            if (len < (ssize_t) sizeof(struct ip))
            {
                logger("%s%s: received packet too small for IP header\n", identifier, dest_str);
                continue;
@@ -418,7 +434,7 @@ recv_thread(
        }

        // This should never happen
-        if (len < sizeof(icmphdr_t))
+        if (len < (ssize_t) sizeof(icmphdr_t))
        {
            logger("%s%s: received packet too small for ICMP header\n", identifier, dest_str);
            continue;
@@ -440,9 +456,6 @@ recv_thread(
        array[array_slot].latency_usec = ts_elapsed_usec(&array[array_slot].time_sent, &now);
        array[array_slot].status = PACKET_STATUS_RECEIVED;
    }
-
-    // notreached
-    return arg;
 }


@@ -474,7 +487,7 @@ report(
            packets_received++;
            latency_usec = array[slot].latency_usec;
            total_latency_usec += latency_usec;
-            total_latency_usec2 += latency_usec * latency_usec;
+            total_latency_usec2 += (unsigned long long) latency_usec * latency_usec;
        }
        else if (array[slot].status == PACKET_STATUS_SENT &&
                 ts_elapsed_usec(&array[slot].time_sent, &now) > loss_interval_usec)
@@ -492,7 +505,7 @@ report(

        // stddev = sqrt((sum(rtt^2) / packets) - (sum(rtt) / packets)^2)
        *average_latency_usec = avg;
-        *latency_deviation = llsqrt(avg2 - (avg * avg));
+        *latency_deviation = llsqrt(avg2 - ((unsigned long long) avg * avg));
    }
    else
    {
@@ -514,8 +527,10 @@ report(
 //
 // Report thread
 //
+__attribute__ ((noreturn))
 static void *
 report_thread(
+    __attribute__ ((unused))
    void *                      arg)
 {
    char                        buf[OUTPUT_MAX];
@@ -523,7 +538,8 @@ report_thread(
    unsigned long               average_latency_usec;
    unsigned long               latency_deviation;
    unsigned long               average_loss_percent;
-    int                         len;
+    ssize_t                     len;
+    ssize_t                     rs;
    int                         r;

    // Set up the timespec for nanosleep
@@ -546,39 +562,39 @@ report_thread(
            logger("error formatting output in report thread\n");
        }

-        r = write(report_fd, buf, len);
-        if (r == -1)
+        rs = write(report_fd, buf, (size_t) len);
+        if (rs == -1)
        {
            logger("write error in report thread: %d\n", errno);
        }
-        else if (r != len)
+        else if (rs != len)
        {
-            logger("short write in report thread: %d/%d\n", r, len);
+            logger("short write in report thread: %zd/%zd\n", rs, len);
        }

        if (flag_rewind)
        {
-            ftruncate(report_fd, len);
-            lseek(report_fd, SEEK_SET, 0);
+            (void) ftruncate(report_fd, len);
+            (void) lseek(report_fd, 0, SEEK_SET);
        }
    }
-
-    // notreached
-    return arg;
 }


 //
 // Alert thread
 //
+__attribute__ ((noreturn))
 static void *
 alert_thread(
+    __attribute__ ((unused))
    void *                      arg)
 {
    struct timespec             sleeptime;
    unsigned long               average_latency_usec;
    unsigned long               latency_deviation;
    unsigned long               average_loss_percent;
+    unsigned int                alarm_hold_periods;
    unsigned int                latency_alarm_decay = 0;
    unsigned int                loss_alarm_decay = 0;
    unsigned int                alert = 0;
@@ -589,6 +605,9 @@ alert_thread(
    sleeptime.tv_sec = alert_interval_msec / 1000;
    sleeptime.tv_nsec = (alert_interval_msec % 1000) * 1000000;

+    // Set number of alarm hold periods
+    alarm_hold_periods = (unsigned int) ((alarm_hold_msec + alert_interval_msec - 1) / alert_interval_msec);
+
    while (1)
    {
        r = nanosleep(&sleeptime, NULL);
@@ -608,7 +627,7 @@ alert_thread(
                    alert = 1;
                }

-                latency_alarm_decay = ALARM_DECAY_PERIODS;
+                latency_alarm_decay = alarm_hold_periods;
            }
            else if (latency_alarm_decay)
            {
@@ -629,7 +648,7 @@ alert_thread(
                    alert = 1;
                }

-                loss_alarm_decay = ALARM_DECAY_PERIODS;
+                loss_alarm_decay = alarm_hold_periods;
            }
            else if (loss_alarm_decay)
            {
@@ -666,16 +685,15 @@ alert_thread(
            }
        }
    }
-
-    // notreached
-    return arg;
 }

 //
 // Unix socket thread
 //
+__attribute__ ((noreturn))
 static void *
 usocket_thread(
+    __attribute__ ((unused))
    void *                      arg)
 {
    char                        buf[OUTPUT_MAX];
@@ -683,14 +701,20 @@ usocket_thread(
    unsigned long               latency_deviation;
    unsigned long               average_loss_percent;
    int                         sock_fd;
-    int                         len;
+    ssize_t                     len;
+    ssize_t                     rs;
    int                         r;

    while (1)
    {
+#if defined(DISABLE_ACCEPT4)
+        // Legacy
        sock_fd = accept(usocket_fd, NULL, NULL);
        (void) fcntl(sock_fd, F_SETFL, FD_CLOEXEC);
        (void) fcntl(sock_fd, F_SETFL, fcntl(sock_fd, F_GETFL, 0) | O_NONBLOCK);
+#else
+        sock_fd = accept4(usocket_fd, NULL, NULL, SOCK_NONBLOCK | SOCK_CLOEXEC);
+#endif

        report(&average_latency_usec, &latency_deviation, &average_loss_percent);

@@ -700,14 +724,14 @@ usocket_thread(
            logger("error formatting output in usocket thread\n");
        }

-        r = write(sock_fd, buf, len);
-        if (r == -1)
+        rs = write(sock_fd, buf, (size_t) len);
+        if (rs == -1)
        {
            logger("write error in usocket thread: %d\n", errno);
        }
-        else if (r != len)
+        else if (rs != len)
        {
-            logger("short write in usocket thread: %d/%d\n", r, len);
+            logger("short write in usocket thread: %zd/%zd\n", rs, len);
        }

        r = close(sock_fd);
@@ -716,9 +740,6 @@ usocket_thread(
            logger("close error in usocket thread: %d\n", errno);
        }
    }
-
-    // notreached
-    return arg;
 }


@@ -731,10 +752,10 @@ get_time_arg_msec(
    const char *                arg,
    unsigned long *             value)
 {
-    unsigned long               t;
+    long                        t;
    char *                      suffix;

-    t = strtoul(arg, &suffix, 10);
+    t = strtol(arg, &suffix, 10);
    if (*suffix == 'm')
    {
        // Milliseconds
@@ -747,13 +768,13 @@ get_time_arg_msec(
        suffix++;
    }

-    // Garbage in the number?
-    if (*suffix != 0)
+    // Invalid specification?
+    if (t < 0 || *suffix != 0)
    {
        return 1;
    }

-    *value = t;
+    *value = (unsigned long) t;
    return 0;
 }

@@ -766,22 +787,22 @@ get_percent_arg(
    const char *                arg,
    unsigned long *             value)
 {
-    unsigned long               t;
+    long                        t;
    char *                      suffix;

-    t = strtoul(arg, &suffix, 10);
+    t = strtol(arg, &suffix, 10);
    if (*suffix == '%')
    {
        suffix++;
    }

-    // Garbage in the number?
-    if (*suffix != 0 || t > 100)
+    // Invalid specification?
+    if (t < 0 || t > 100 || *suffix != 0)
    {
        return 1;
    }

-    *value = t;
+    *value = (unsigned long) t;
    return 0;
 }

@@ -794,10 +815,10 @@ get_length_arg(
    const char *                arg,
    unsigned long *             value)
 {
-    unsigned long               t;
+    long                        t;
    char *                      suffix;

-    t = strtoul(arg, &suffix, 10);
+    t = strtol(arg, &suffix, 10);
    if (*suffix == 'b')
    {
        // Bytes
@@ -810,13 +831,13 @@ get_length_arg(
        suffix++;
    }

-    // Garbage in the number?
-    if (*suffix != 0)
+    // Invalid specification?
+    if (t < 0 || *suffix != 0)
    {
        return 1;
    }

-    *value = t;
+    *value = (unsigned long) t;
    return 0;
 }

@@ -827,22 +848,25 @@ get_length_arg(
 static void
 usage(void)
 {
+    fprintf(stderr, "Dpinger version 3.3\n\n");
    fprintf(stderr, "Usage:\n");
-    fprintf(stderr, "  %s [-f] [-R] [-S] [-B bind_addr] [-s send_interval] [-l loss_interval] [-t time_period] [-r report_interval] [-d data_length] [-o output_file] [-A alert_interval] [-D latency_alarm] [-L loss_alarm] [-C alert_cmd] [-i identifier] [-u usocket] [-p pidfile] dest_addr\n\n", progname);
+    fprintf(stderr, "  %s [-f] [-R] [-S] [-P] [-B bind_addr] [-s send_interval] [-l loss_interval] [-t time_period] [-r report_interval] [-d data_length] [-o output_file] [-A alert_interval] [-D latency_alarm] [-L loss_alarm] [-H hold_interval] [-C alert_cmd] [-i identifier] [-u usocket] [-p pidfile] dest_addr\n\n", progname);
    fprintf(stderr, "  options:\n");
    fprintf(stderr, "    -f run in foreground\n");
    fprintf(stderr, "    -R rewind output file between reports\n");
    fprintf(stderr, "    -S log warnings via syslog\n");
+    fprintf(stderr, "    -P priority scheduling for receive thread (requires root)\n");
    fprintf(stderr, "    -B bind (source) address\n");
-    fprintf(stderr, "    -s time interval between echo requests (default 250ms)\n");
-    fprintf(stderr, "    -l time interval before packets are treated as lost (default 5x send interval)\n");
-    fprintf(stderr, "    -t time period over which results are averaged (default 30s)\n");
+    fprintf(stderr, "    -s time interval between echo requests (default 500ms)\n");
+    fprintf(stderr, "    -l time interval before packets are treated as lost (default 4x send interval)\n");
+    fprintf(stderr, "    -t time period over which results are averaged (default 60s)\n");
    fprintf(stderr, "    -r time interval between reports (default 1s)\n");
    fprintf(stderr, "    -d data length (default 0)\n");
    fprintf(stderr, "    -o output file for reports (default stdout)\n");
    fprintf(stderr, "    -A time interval between alerts (default 1s)\n");
    fprintf(stderr, "    -D time threshold for latency alarm (default none)\n");
    fprintf(stderr, "    -L percent threshold for loss alarm (default none)\n");
+    fprintf(stderr, "    -H time interval to hold an alarm before clearing it (default 10x alert interval)\n");
    fprintf(stderr, "    -C optional command to be invoked via system() for alerts\n");
    fprintf(stderr, "    -i identifier text to include in output\n");
    fprintf(stderr, "    -u unix socket name for polling\n");
@@ -854,29 +878,28 @@ usage(void)
    fprintf(stderr, "    the output format is \"latency_avg latency_stddev loss_pct\"\n");
    fprintf(stderr, "    latency values are output in microseconds\n");
    fprintf(stderr, "    loss percentage is reported in whole numbers of 0-100\n");
-    fprintf(stderr, "    resolution of loss calculation is: 100 * send_interval / (time_period - loss_interval)\n\n");
-    fprintf(stderr, "    the alert_cmd is invoked as \"alert_cmd dest_addr alarm_flag latency_avg loss_avg\"\n");
+    fprintf(stderr, "    resolution of loss calculation is: 100 / ((time_period - loss_interval) / send_interval)\n\n");
+    fprintf(stderr, "    the alert_cmd is invoked as \"alert_cmd dest_addr alarm_flag latency_avg latency_stddev loss_pct\"\n");
    fprintf(stderr, "    alarm_flag is set to 1 if either latency or loss is in alarm state\n");
-    fprintf(stderr, "    alarm_flag will return to 0 when both have have cleared alarm state\n\n");
+    fprintf(stderr, "    alarm_flag will return to 0 when both have cleared alarm state\n");
+    fprintf(stderr, "    alarm hold time begins when the source of the alarm returns to normal\n\n");
 }


 //
 // Fatal error
 //
+__attribute__ ((noreturn, format (printf, 1, 2)))
 static void
 fatal(
    const char *                format,
    ...)
 {
-    if (format)
-    {
-        va_list                 args;
+    va_list                 args;

-        va_start(args, format);
-        vfprintf(stderr, format, args);
-        va_end(args);
-    }
+    va_start(args, format);
+    vfprintf(stderr, format, args);
+    va_end(args);

    exit(EXIT_FAILURE);
 }
@@ -900,7 +923,7 @@ parse_args(

    progname = argv[0];

-    while((opt = getopt(argc, argv, "fRSB:s:l:t:r:d:o:A:D:L:C:i:u:p:")) != -1)
+    while((opt = getopt(argc, argv, "fRSPB:s:l:t:r:d:o:A:D:L:H:C:i:u:p:")) != -1)
    {
        switch (opt)
        {
@@ -916,6 +939,10 @@ parse_args(
            flag_syslog = 1;
            break;

+        case 'P':
+            flag_priority = 1;
+            break;
+
        case 'B':
            bind_arg = optarg;
            break;
@@ -974,7 +1001,7 @@ parse_args(

        case 'D':
            r = get_time_arg_msec(optarg, &latency_alarm_threshold_msec);
-            if (r || latency_alarm_threshold_msec == 0)
+            if (r)
            {
                fatal("invalid latency alarm threshold %s\n", optarg);
            }
@@ -983,12 +1010,20 @@ parse_args(

        case 'L':
            r = get_percent_arg(optarg, &loss_alarm_threshold_percent);
-            if (r || loss_alarm_threshold_percent == 0)
+            if (r)
            {
                fatal("invalid loss alarm threshold %s\n", optarg);
            }
            break;

+        case 'H':
+            r = get_time_arg_msec(optarg, &alarm_hold_msec);
+            if (r)
+            {
+                fatal("invalid alarm hold interval %s\n", optarg);
+            }
+            break;
+
        case 'C':
            alert_cmd_offset = strlen(optarg);
            alert_cmd = malloc(alert_cmd_offset + OUTPUT_MAX);
@@ -1003,7 +1038,7 @@ parse_args(
            len = strlen(optarg);
            if (len >= sizeof(identifier) - 1)
            {
-                fatal("identifier argument too large (max %u bytes)\n", sizeof(identifier) - 1);
+                fatal("identifier argument too large (max %u bytes)\n", (unsigned) sizeof(identifier) - 1);
            }
            // optarg with a space appended
            memcpy(identifier, optarg, len);
@@ -1021,7 +1056,7 @@ parse_args(

        default:
            usage();
-            fatal(NULL);
+            exit(EXIT_FAILURE);
        }
    }

@@ -1029,7 +1064,7 @@ parse_args(
    if (argc != optind + 1)
    {
        usage();
-        fatal(NULL);
+        exit(EXIT_FAILURE);
    }
    dest_arg = argv[optind];

@@ -1039,17 +1074,17 @@ parse_args(
        fatal("no activity enabled\n");
    }

-    // Ensure we have something to average over
-    if (time_period_msec < send_interval_msec)
+    // Ensure there is a minimum of one resolved slot at all times
+    if (time_period_msec <= send_interval_msec * 2 + loss_interval_msec)
    {
-        fatal("time period cannot be less than send interval\n");
+        fatal("the time period must be greater than twice the send interval plus the loss interval\n");
    }

    // Ensure we don't have sequence space issues. This really should only be hit by
    // complete accident. Even a ratio of 16384:1 would be excessive.
    if (time_period_msec / send_interval_msec > 65536)
    {
-        fatal("ratio of time period to send interval cannot exceed 65536:1\n");
+        fatal("the ratio of time period to send interval cannot exceed 65536:1\n");
    }

    // Check destination address
@@ -1105,14 +1140,14 @@ parse_args(
        {
            if (echo_data_len > IPV4_ICMP_DATA_MAX)
            {
-                fatal("data length too large for IPv4 - maximum is %u bytes\n", IPV4_ICMP_DATA_MAX);
+                fatal("data length too large for IPv4 - maximum is %u bytes\n", (unsigned) IPV4_ICMP_DATA_MAX);
            }
        }
        else
        {
            if (echo_data_len > IPV6_ICMP_DATA_MAX)
            {
-                fatal("data length too large for IPv6 - maximum is %u bytes\n", IPV6_ICMP_DATA_MAX);
+                fatal("data length too large for IPv6 - maximum is %u bytes\n", (unsigned) IPV6_ICMP_DATA_MAX);
            }
        }

@@ -1130,10 +1165,14 @@ main(
    char                        *argv[])
 {
    char                        bind_str[ADDR_STR_MAX] = "(none)";
+    char                        pidbuf[64];
    int                         pidfile_fd = -1;
+    pid_t                       pid;
    pthread_t                   thread;
    struct                      sigaction act;
    int                         buflen = PACKET_BUFLEN;
+    ssize_t                     len;
+    ssize_t                     rs;
    int                         r;

    // Handle command line args
@@ -1176,8 +1215,68 @@ main(
    }

    // Drop privileges
-    r = setgid(getgid());
-    r = setuid(getuid());
+    (void) setgid(getgid());
+    (void) setuid(getuid());
+
+    // Create pid file
+    if (pidfile_name)
+    {
+        pidfile_fd = open(pidfile_name, O_WRONLY | O_CREAT | O_EXCL | O_CLOEXEC, 0644);
+        if (pidfile_fd != -1)
+        {
+            // Lock the pid file
+            r = flock(pidfile_fd, LOCK_EX | LOCK_NB);
+            if (r == -1)
+            {
+                perror("flock");
+                fatal("error locking pid file\n");
+            }
+        }
+        else
+        {
+            // Pid file already exists?
+            pidfile_fd = open(pidfile_name, O_RDWR | O_CREAT | O_CLOEXEC, 0644);
+            if (pidfile_fd == -1)
+            {
+                perror("open");
+                fatal("cannot create/open pid file %s\n", pidfile_name);
+            }
+
+            // Lock the pid file
+            r = flock(pidfile_fd, LOCK_EX | LOCK_NB);
+            if (r == -1)
+            {
+                fatal("pid file %s is in use by another process\n", pidfile_name);
+            }
+
+            // Check for existing pid
+            rs = read(pidfile_fd, pidbuf, sizeof(pidbuf) - 1);
+            if (rs > 0)
+            {
+                pidbuf[rs] = 0;
+
+                pid = (pid_t) strtol(pidbuf, NULL, 10);
+                if (pid > 0)
+                {
+                    // Is the pid still alive?
+                    r = kill(pid, 0);
+                    if (r == 0)
+                    {
+                        fatal("pid file %s is in use by process %u\n", pidfile_name, (unsigned int) pid);
+                    }
+                }
+            }
+
+            // Reset the pid file
+            (void) lseek(pidfile_fd, 0, 0);
+            r = ftruncate(pidfile_fd, 0);
+            if (r == -1)
+            {
+                perror("ftruncate");
+                fatal("cannot write pid file %s\n", pidfile_name);
+            }
+        }
+    }

    // Create report file
    if (report_name)
@@ -1211,7 +1310,6 @@ main(
            fatal("cannot create unix domain socket\n");
        }
        (void) fcntl(usocket_fd, F_SETFL, FD_CLOEXEC);
-
        (void) unlink(usocket_name);

        memset(&uaddr, 0, sizeof(uaddr));
@@ -1239,31 +1337,20 @@ main(
        }
    }

-    // Create pid file
-    if (pidfile_name)
-    {
-        pidfile_fd = open(pidfile_name, O_WRONLY | O_CREAT | O_TRUNC | O_CLOEXEC, 0644);
-        if (pidfile_fd == -1)
-        {
-            perror("open");
-            fatal("cannot open/create pid file %s\n", pidfile_name);
-        }
-    }
-
    // End of general errors from command line options

    // Self background
    if (foreground == 0)
    {
-        r = fork();
+        pid = fork();

-        if (r == -1)
+        if (pid == -1)
        {
            perror("fork");
            fatal("cannot background\n");
        }

-        if (r)
+        if (pid)
        {
            _exit(EXIT_SUCCESS);
        }
@@ -1274,23 +1361,20 @@ main(
    // Termination handler
    memset(&act, 0, sizeof(act));
    act.sa_handler = (void (*)(int)) term_handler;
-    sigaction(SIGTERM, &act, NULL);
-    sigaction(SIGINT, &act, NULL);
+    (void) sigaction(SIGTERM, &act, NULL);
+    (void) sigaction(SIGINT, &act, NULL);

    // Write pid file
    if (pidfile_fd != -1)
    {
-        char                    buf[64];
-        int                     len;
-
-        len = snprintf(buf, sizeof(buf), "%u\n", (unsigned) getpid());
-        if (len < 0 || (size_t) len > sizeof(buf))
+        len = snprintf(pidbuf, sizeof(pidbuf), "%u\n", (unsigned) getpid());
+        if (len < 0 || (size_t) len >= sizeof(pidbuf))
        {
            fatal("error formatting pidfile\n");
        }

-        r = write(pidfile_fd, buf, len);
-        if (r == -1)
+        rs = write(pidfile_fd, pidbuf, (size_t) len);
+        if (rs == -1)
        {
            perror("write");
            fatal("error writing pidfile\n");
@@ -1323,7 +1407,7 @@ main(
    // Set the default loss interval
    if (loss_interval_msec == 0)
    {
-        loss_interval_msec = send_interval_msec * 5;
+        loss_interval_msec = send_interval_msec * 4;
    }
    loss_interval_usec = loss_interval_msec * 1000;

@@ -1334,6 +1418,12 @@ main(
        fatal("getnameinfo of destination address failed\n");
    }

+    // Default alarm hold if not explicitly set
+    if (alarm_hold_msec == 0)
+    {
+        alarm_hold_msec = alert_interval_msec * DEFAULT_HOLD_PERIODS;
+    }
+
    if (bind_addr_len)
    {
        r = getnameinfo((struct sockaddr *) &bind_addr, bind_addr_len, bind_str, sizeof(bind_str), NULL, 0, NI_NUMERICHOST);
@@ -1343,13 +1433,13 @@ main(
        }
    }

-    logger("send_interval %lums  loss_interval %lums  time_period %lums  report_interval %lums  data_len %lu  alert_interval %lums  latency_alarm %lums  loss_alarm %lu%%  dest_addr %s  bind_addr %s  identifier \"%s\"\n",
+    logger("send_interval %lums  loss_interval %lums  time_period %lums  report_interval %lums  data_len %lu  alert_interval %lums  latency_alarm %lums  loss_alarm %lu%%  alarm_hold %lums  dest_addr %s  bind_addr %s  identifier \"%s\"\n",
           send_interval_msec, loss_interval_msec, time_period_msec, report_interval_msec, echo_data_len,
-           alert_interval_msec, latency_alarm_threshold_msec, loss_alarm_threshold_percent,
+           alert_interval_msec, latency_alarm_threshold_msec, loss_alarm_threshold_percent, alarm_hold_msec,
           dest_str, bind_str, identifier);

    // Set my echo id
-    echo_id = htons(getpid());
+    echo_id = htons((uint16_t) getpid());

    // Set the limit for sequence number to ensure a multiple of array size
    sequence_limit = (uint16_t) array_size;
@@ -1366,6 +1456,27 @@ main(
        fatal("cannot create recv thread\n");
    }

+    // Set priority on recv thread if requested
+    if (flag_priority)
+    {
+        struct sched_param          thread_sched_param;
+
+        r = sched_get_priority_min(SCHED_RR);
+        if (r == -1)
+        {
+            perror("sched_get_priority_min");
+            fatal("cannot determine minimum scheduling priority for SCHED_RR\n");
+        }
+        thread_sched_param.sched_priority = r;
+
+        r = pthread_setschedparam(thread, SCHED_RR, &thread_sched_param);
+        if (r != 0)
+        {
+            perror("pthread_setschedparam");
+            fatal("cannot set receive thread priority\n");
+        }
+    }
+
    // Create send thread
    r = pthread_create(&thread, NULL, &send_thread, NULL);
    if (r != 0)
--- a/influx/README.md
+++ b/influx/README.md
@@ -0,0 +1,19 @@
+Examples for dpinger logging/monitoring with InfluxDB and Grafana
+
+<br>
+
+Files:
+
+    dpinger_influx_logger
+
+Python script for logging dpinger data in InfluxDB
+
+
+    dpinger_start.sh
+
+Sample start script for dpinger influx logging
+
+
+    dpinger_grafana_dashboard.json
+
+Example Grafana dashboard for monitoring dpinger data
--- a/influx/dpinger_grafana_dashboard.json
+++ b/influx/dpinger_grafana_dashboard.json
@@ -0,0 +1,456 @@
+{
+  "annotations": {
+    "list": [
+      {
+        "builtIn": 1,
+        "datasource": {
+          "type": "datasource",
+          "uid": "grafana"
+        },
+        "enable": true,
+        "hide": true,
+        "iconColor": "rgba(0, 211, 255, 1)",
+        "name": "Annotations & Alerts",
+        "target": {
+          "limit": 100,
+          "matchAny": false,
+          "tags": [],
+          "type": "dashboard"
+        },
+        "type": "dashboard"
+      }
+    ]
+  },
+  "editable": true,
+  "fiscalYearStartMonth": 0,
+  "graphTooltip": 0,
+  "id": 3,
+  "iteration": 1652309379625,
+  "links": [],
+  "liveNow": false,
+  "panels": [
+    {
+      "datasource": {
+        "uid": "$source"
+      },
+      "fieldConfig": {
+        "defaults": {
+          "color": {
+            "mode": "palette-classic"
+          },
+          "custom": {
+            "axisLabel": "",
+            "axisPlacement": "auto",
+            "barAlignment": 0,
+            "drawStyle": "line",
+            "fillOpacity": 10,
+            "gradientMode": "none",
+            "hideFrom": {
+              "legend": false,
+              "tooltip": false,
+              "viz": false
+            },
+            "lineInterpolation": "linear",
+            "lineWidth": 1,
+            "pointSize": 5,
+            "scaleDistribution": {
+              "type": "linear"
+            },
+            "showPoints": "never",
+            "spanNulls": false,
+            "stacking": {
+              "group": "A",
+              "mode": "none"
+            },
+            "thresholdsStyle": {
+              "mode": "off"
+            }
+          },
+          "mappings": [],
+          "min": 0,
+          "thresholds": {
+            "mode": "absolute",
+            "steps": [
+              {
+                "color": "green",
+                "value": null
+              },
+              {
+                "color": "red",
+                "value": 80
+              }
+            ]
+          },
+          "unit": "ms"
+        },
+        "overrides": [
+          {
+            "matcher": {
+              "id": "byName",
+              "options": "loss"
+            },
+            "properties": [
+              {
+                "id": "unit",
+                "value": "percent"
+              }
+            ]
+          },
+          {
+            "matcher": {
+              "id": "byName",
+              "options": "loss"
+            },
+            "properties": [
+              {
+                "id": "color",
+                "value": {
+                  "fixedColor": "#e00000",
+                  "mode": "fixed"
+                }
+              },
+              {
+                "id": "custom.fillOpacity",
+                "value": 100
+              },
+              {
+                "id": "custom.lineWidth",
+                "value": 0
+              },
+              {
+                "id": "unit",
+                "value": "percent"
+              },
+              {
+                "id": "max",
+                "value": 100
+              }
+            ]
+          }
+        ]
+      },
+      "gridPos": {
+        "h": 19,
+        "w": 24,
+        "x": 0,
+        "y": 0
+      },
+      "id": 2,
+      "options": {
+        "legend": {
+          "calcs": [
+            "mean",
+            "lastNotNull",
+            "max",
+            "min"
+          ],
+          "displayMode": "table",
+          "placement": "bottom"
+        },
+        "tooltip": {
+          "mode": "multi",
+          "sort": "none"
+        }
+      },
+      "pluginVersion": "8.3.5",
+      "targets": [
+        {
+          "alias": "latency",
+          "groupBy": [
+            {
+              "params": [
+                "$intervals"
+              ],
+              "type": "time"
+            },
+            {
+              "params": [
+                "null"
+              ],
+              "type": "fill"
+            }
+          ],
+          "measurement": "dpinger",
+          "orderByTime": "ASC",
+          "policy": "default",
+          "query": "SELECT mean(\"latency\") FROM \"wan\" WHERE $timeFilter GROUP BY time($__interval) fill(null)",
+          "queryType": "randomWalk",
+          "rawQuery": false,
+          "refId": "A",
+          "resultFormat": "time_series",
+          "select": [
+            [
+              {
+                "params": [
+                  "latency"
+                ],
+                "type": "field"
+              },
+              {
+                "params": [],
+                "type": "mean"
+              }
+            ]
+          ],
+          "tags": [
+            {
+              "key": "name",
+              "operator": "=~",
+              "value": "/^$name$/"
+            }
+          ]
+        },
+        {
+          "alias": "stddev",
+          "groupBy": [
+            {
+              "params": [
+                "$intervals"
+              ],
+              "type": "time"
+            },
+            {
+              "params": [
+                "null"
+              ],
+              "type": "fill"
+            }
+          ],
+          "measurement": "dpinger",
+          "orderByTime": "ASC",
+          "policy": "default",
+          "queryType": "randomWalk",
+          "refId": "B",
+          "resultFormat": "time_series",
+          "select": [
+            [
+              {
+                "params": [
+                  "stddev"
+                ],
+                "type": "field"
+              },
+              {
+                "params": [],
+                "type": "mean"
+              }
+            ]
+          ],
+          "tags": [
+            {
+              "key": "name",
+              "operator": "=~",
+              "value": "/^$name$/"
+            }
+          ]
+        },
+        {
+          "alias": "loss",
+          "groupBy": [
+            {
+              "params": [
+                "$intervals"
+              ],
+              "type": "time"
+            },
+            {
+              "params": [
+                "null"
+              ],
+              "type": "fill"
+            }
+          ],
+          "measurement": "dpinger",
+          "orderByTime": "ASC",
+          "policy": "default",
+          "queryType": "randomWalk",
+          "refId": "C",
+          "resultFormat": "time_series",
+          "select": [
+            [
+              {
+                "params": [
+                  "loss"
+                ],
+                "type": "field"
+              },
+              {
+                "params": [],
+                "type": "mean"
+              }
+            ]
+          ],
+          "tags": [
+            {
+              "key": "name",
+              "operator": "=~",
+              "value": "/^$name$/"
+            }
+          ]
+        }
+      ],
+      "title": "$name - ${intervals} intervals",
+      "transformations": [],
+      "type": "timeseries"
+    }
+  ],
+  "refresh": "1m",
+  "schemaVersion": 36,
+  "style": "dark",
+  "tags": [],
+  "templating": {
+    "list": [
+      {
+        "current": {
+          "selected": false,
+          "text": "dpinger",
+          "value": "dpinger"
+        },
+        "hide": 0,
+        "includeAll": false,
+        "label": "Source",
+        "multi": false,
+        "name": "source",
+        "options": [],
+        "query": "influxdb",
+        "queryValue": "",
+        "refresh": 1,
+        "regex": "",
+        "skipUrlSync": false,
+        "type": "datasource"
+      },
+      {
+        "current": {
+          "selected": false,
+          "text": "wan",
+          "value": "wan"
+        },
+        "datasource": {
+          "type": "influxdb",
+          "uid": "$source"
+        },
+        "definition": "SHOW TAG VALUES  WITH KEY = \"name\"",
+        "hide": 0,
+        "includeAll": false,
+        "label": "Name",
+        "multi": false,
+        "name": "name",
+        "options": [],
+        "query": "SHOW TAG VALUES  WITH KEY = \"name\"",
+        "refresh": 1,
+        "regex": "",
+        "skipUrlSync": false,
+        "sort": 0,
+        "tagValuesQuery": "",
+        "tagsQuery": "",
+        "type": "query",
+        "useTags": false
+      },
+      {
+        "auto": true,
+        "auto_count": 500,
+        "auto_min": "10s",
+        "current": {
+          "selected": false,
+          "text": "auto",
+          "value": "$__auto_interval_intervals"
+        },
+        "hide": 0,
+        "label": "Intervals",
+        "name": "intervals",
+        "options": [
+          {
+            "selected": true,
+            "text": "auto",
+            "value": "$__auto_interval_intervals"
+          },
+          {
+            "selected": false,
+            "text": "10s",
+            "value": "10s"
+          },
+          {
+            "selected": false,
+            "text": "30s",
+            "value": "30s"
+          },
+          {
+            "selected": false,
+            "text": "1m",
+            "value": "1m"
+          },
+          {
+            "selected": false,
+            "text": "2m",
+            "value": "2m"
+          },
+          {
+            "selected": false,
+            "text": "5m",
+            "value": "5m"
+          },
+          {
+            "selected": false,
+            "text": "10m",
+            "value": "10m"
+          },
+          {
+            "selected": false,
+            "text": "15m",
+            "value": "15m"
+          },
+          {
+            "selected": false,
+            "text": "30m",
+            "value": "30m"
+          },
+          {
+            "selected": false,
+            "text": "1h",
+            "value": "1h"
+          },
+          {
+            "selected": false,
+            "text": "6h",
+            "value": "6h"
+          },
+          {
+            "selected": false,
+            "text": "12h",
+            "value": "12h"
+          },
+          {
+            "selected": false,
+            "text": "1d",
+            "value": "1d"
+          },
+          {
+            "selected": false,
+            "text": "7d",
+            "value": "7d"
+          }
+        ],
+        "query": "10s,30s,1m,2m,5m,10m,15m,30m,1h,6h,12h,1d,7d",
+        "queryValue": "",
+        "refresh": 2,
+        "skipUrlSync": false,
+        "type": "interval"
+      }
+    ]
+  },
+  "time": {
+    "from": "now-24h",
+    "to": "now"
+  },
+  "timepicker": {
+    "refresh_intervals": [
+      "1m",
+      "5m"
+    ]
+  },
+  "timezone": "",
+  "title": "WAN Latency",
+  "uid": "ThwrgHYMk",
+  "version": 46,
+  "weekStart": ""
+}
--- a/influx/dpinger_influx_logger
+++ b/influx/dpinger_influx_logger
@@ -0,0 +1,70 @@
+#!/usr/bin/python
+
+dpinger_path = "/usr/local/bin/dpinger"
+
+import os
+import sys
+import signal
+import requests
+from subprocess import Popen, PIPE
+from requests import post
+
+# Handle SIGINT
+def signal_handler(signal, frame):
+    try:
+        dpinger.kill()
+    except:
+        pass
+    sys.exit(0)
+
+signal.signal(signal.SIGINT, signal_handler)
+
+# Handle command line args
+progname = sys.argv.pop(0)
+if (len(sys.argv) < 5):
+    print('Usage: {0} influx_url influx_db host name target [additional dpinger options]'.format(progname))
+    print('  influx_url  URL of the Influx server')
+    print('  influx_db   name of the Influx database')
+    print('  host        value of "host" tag (example: output of hostname command)')
+    print('  name        value of "name" tag (example: a circuit name such as "wan")')
+    print('  target      IP address to monitor (also the value of the "target" tag)')
+    sys.exit(1)
+influx_url = sys.argv.pop(0)
+influx_db = sys.argv.pop(0)
+host = sys.argv.pop(0)
+name = sys.argv.pop(0)
+target = sys.argv.pop(0)
+
+influx_user = os.getenv('INFLUX_USER')
+influx_pass = os.getenv('INFLUX_PASS')
+
+# Set up dpinger command
+cmd = [dpinger_path, "-f"]
+cmd.extend(sys.argv)
+cmd.extend(["-r", "10s", target])
+
+# Set up formats
+url = '{0}/write?db={1}'.format(influx_url, influx_db)
+datafmt = "dpinger,host={0},name={1},target={2} latency={{0:.3f}},stddev={{1:.3f}},loss={{2}}i".format(host, name, target)
+
+# Start up dpinger
+try:
+    dpinger = Popen(cmd, stdout=PIPE, text=True, bufsize=0)
+except:
+    print("failed to start dpinger")
+    sys.exit(1)
+
+# Start the show
+while True:
+    line = dpinger.stdout.readline()
+    if (len(line) == 0):
+        print("dpinger exited")
+        sys.exit(1)
+
+    [latency, stddev, loss] = line.split()
+    data = datafmt.format(float(latency) / 1000, float(stddev) / 1000, loss)
+    #print(data)
+    try:
+        post(url = url, auth = (influx_user, influx_pass), data = data)
+    except:
+        print("post failed")
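The record that the logger posts can be reproduced in isolation. The sketch below mirrors the script's two-stage format string: tags are filled in first, and the doubled braces survive as placeholders for the per-sample values. The function name `format_point` and the sample values are illustrative and not part of the patch; the microsecond input unit is an assumption based on the division by 1000 here and the `latency_us` naming in the rrd scripts.

```python
def format_point(host, name, target, latency_us, stddev_us, loss):
    """Build one InfluxDB line-protocol record the way the logger script does."""
    # Stage 1: fix the tags; doubled braces remain as placeholders for stage 2
    datafmt = (
        "dpinger,host={0},name={1},target={2} "
        "latency={{0:.3f}},stddev={{1:.3f}},loss={{2}}i"
    ).format(host, name, target)
    # Stage 2: convert the (assumed) microsecond readings to milliseconds
    return datafmt.format(float(latency_us) / 1000, float(stddev_us) / 1000, loss)


if __name__ == "__main__":
    # Hypothetical report line "12345 678 0" split into its three fields
    print(format_point("gw1", "wan", "8.8.8.8", "12345", "678", "0"))
```

The trailing `i` marks `loss` as an integer field in line protocol, while `latency` and `stddev` are written as floats.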
--- a/influx/dpinger_start.sh
+++ b/influx/dpinger_start.sh
@@ -0,0 +1,7 @@
+#!/bin/sh
+
+INFLUX_URL="http://myinfluxhost:8086"
+export INFLUX_USER="dpinger"
+export INFLUX_PASS="myinfluxpass"
+
+exec /usr/local/dpinger_influx_logger $INFLUX_URL dpinger `hostname` wan 8.8.8.8
--- a/rrd/README.md
+++ b/rrd/README.md
@@ -0,0 +1,25 @@
+Example scripts for creating RRD graphs with dpinger
+
+<br>
+
+Files and Usage:
+
+    dpinger_rrd_create <name>
+
+Create the initial rrd file.
+
+    dpinger_rrd_update <name> <target> <additional dpinger options>
+
+Daemon updater script. Runs dpinger and feeds the rrd file.
+
+    dpinger_rrd_gencgi <rrdname> <pngname>
+
+Generate a cgi script that displays graphs.
+
+    dpinger_rrd_graph <rrdname> <pngname>
+
+Generate png files for use with static html.
+
+    sample.html
+
+Sample static html to display graphs.
--- a/rrd/dpinger_rrd_create
+++ b/rrd/dpinger_rrd_create
@@ -0,0 +1,30 @@
+#!/bin/sh
+
+if [ $# -ne 1 ]
+then
+    echo "usage: $0 name"
+    exit 1
+fi
+name="$1"
+
+rrdfile="${name}.rrd"
+echo "Creating rrd file ${rrdfile}"
+
+
+# Time duration method doesn't work in all versions of rrdtool
+#rrdtool create "${rrdfile}" --step 1m \
+#    DS:latency:GAUGE:5m:0:U \
+#    DS:stddev:GAUGE:5m:0:U \
+#    DS:loss:GAUGE:5m:0:100 \
+#    RRA:AVERAGE:0.5:1m:15d \
+#    RRA:AVERAGE:0.5:5m:90d \
+#    RRA:AVERAGE:0.5:1h:3y
+
+# This method works in all versions
+rrdtool create "${rrdfile}" --step 60 \
+    DS:latency:GAUGE:300:0:U \
+    DS:stddev:GAUGE:300:0:U \
+    DS:loss:GAUGE:300:0:100 \
+    RRA:AVERAGE:0.5:1:21600 \
+    RRA:AVERAGE:0.5:5:25920 \
+    RRA:AVERAGE:0.5:60:26352
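The numeric RRA arguments in `dpinger_rrd_create` can be checked against the commented-out duration form (`1m:15d`, `5m:90d`, `1h:3y`) with a little arithmetic. This is a minimal sketch; the names `STEP_SECONDS`, `RRAS`, and `retention_days` are illustrative and do not appear in the scripts.

```python
STEP_SECONDS = 60  # matches "--step 60"

# (steps per row, rows) for each RRA:AVERAGE line in dpinger_rrd_create
RRAS = [(1, 21600), (5, 25920), (60, 26352)]


def retention_days(steps_per_row, rows, step_seconds=STEP_SECONDS):
    """Days of history one RRA holds: rows * steps_per_row * step / 86400."""
    return rows * steps_per_row * step_seconds / 86400


if __name__ == "__main__":
    for steps_per_row, rows in RRAS:
        resolution = steps_per_row * STEP_SECONDS
        print(resolution, "s resolution:", retention_days(steps_per_row, rows), "days")
```

The three archives work out to 15, 90, and 1098 days; the last is 3 x 366 days, roughly the three years of the duration form.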
--- a/rrd/dpinger_rrd_gencgi
+++ b/rrd/dpinger_rrd_gencgi
@@ -0,0 +1,140 @@
+#!/bin/sh
+
+if [ $# -ne 2 ]
+then
+    echo "usage: $0 rrdname pngname"
+    exit 1
+fi
+rrdname="${1}"
+pngname="${2}"
+
+# Prefixes for rrd and png files. Note that if the prefix is a directory, it must include the trailing slash.
+# If no value is set, the files are located in the current directory when the cgi script runs
+rrdprefix=
+pngprefix=/tmp/
+
+
+# Graph dimensions
+#graph_height=240
+#graph_width=720
+graph_height=280
+graph_width=840
+
+# Preferred font
+font="DejaVuSansMono"
+
+# Latency breakpoints in milliseconds
+latency_s0=20
+latency_s1=40
+latency_s2=80
+latency_s3=160
+latency_s4=320
+
+# Latency colors
+latency_c0="dddddd"
+latency_c1="ddbbbb"
+latency_c2="d4aaaa"
+latency_c3="cc9999"
+latency_c4="c38888"
+latency_c5="bb7777"
+
+# Standard deviation color & opacity
+stddev_c="55333355"
+
+# Loss color
+loss_c="ee0000"
+
+
+gen_graph()
+{
+    png=$1
+    rrd=$2
+    start=$3
+    end=$4
+    step=$5
+    description=$6
+
+    echo "<RRD::GRAPH \"${png}\""
+        echo "--lazy"
+        echo "--start \"${start}\" --end \"${end}\" --step \"${step}\""
+        echo "--height ${graph_height} --width ${graph_width}"
+        echo "--title \"Average Latency and Packet Loss - ${description}\""
+        echo "--disable-rrdtool-tag"
+        echo "--color BACK#ffffff"
+        echo "--font DEFAULT:9:\"${font}\""
+        echo "--font AXIS:8:\"${font}\""
+
+        echo "DEF:latency_us=\"${rrd}\":latency:AVERAGE:step=\"${step}\""
+        echo "CDEF:latency=latency_us,1000,/"
+        echo "CDEF:latency_s0=latency,${latency_s0},MIN"
+        echo "CDEF:latency_s1=latency,${latency_s1},MIN"
+        echo "CDEF:latency_s2=latency,${latency_s2},MIN"
+        echo "CDEF:latency_s3=latency,${latency_s3},MIN"
+        echo "CDEF:latency_s4=latency,${latency_s4},MIN"
+        echo "VDEF:latency_min=latency,MINIMUM"
+        echo "VDEF:latency_max=latency,MAXIMUM"
+        echo "VDEF:latency_avg=latency,AVERAGE"
+        echo "VDEF:latency_last=latency,LAST"
+
+        echo "DEF:stddev_us=\"${rrd}\":stddev:AVERAGE:step=\"${step}\""
+        echo "CDEF:stddev=stddev_us,1000,/"
+        echo "VDEF:stddev_min=stddev,MINIMUM"
+        echo "VDEF:stddev_max=stddev,MAXIMUM"
+        echo "VDEF:stddev_avg=stddev,AVERAGE"
+        echo "VDEF:stddev_last=stddev,LAST"
+
+        echo "DEF:loss=\"${rrd}\":loss:AVERAGE:step=\"${step}\""
+        echo "CDEF:loss_neg=loss,-1,*"
+        echo "VDEF:loss_min=loss,MINIMUM"
+        echo "VDEF:loss_max=loss,MAXIMUM"
+        echo "VDEF:loss_avg=loss,AVERAGE"
+        echo "VDEF:loss_last=loss,LAST"
+
+        echo "COMMENT:\"                         Min              Max              Avg             Last\n\""
+
+        echo "COMMENT:\"     \""
+        echo "AREA:latency#${latency_c5}"
+        echo "AREA:latency_s4#${latency_c4}"
+        echo "AREA:latency_s3#${latency_c3}"
+        echo "AREA:latency_s2#${latency_c2}"
+        echo "AREA:latency_s1#${latency_c1}"
+        echo "AREA:latency_s0#${latency_c0}"
+        echo "LINE1:latency#000000:\"Latency  \""
+        echo "GPRINT:\"latency_min:%8.3lf ms\t\""
+        echo "GPRINT:\"latency_max:%8.3lf ms\t\""
+        echo "GPRINT:\"latency_avg:%8.3lf ms\t\""
+        echo "GPRINT:\"latency_last:%8.3lf ms\n\""
+
+        echo "COMMENT:\"     \""
+        echo "LINE1:stddev#${stddev_c}:\"Stddev   \""
+        echo "GPRINT:\"stddev_min:%8.3lf ms\t\""
+        echo "GPRINT:\"stddev_max:%8.3lf ms\t\""
+        echo "GPRINT:\"stddev_avg:%8.3lf ms\t\""
+        echo "GPRINT:\"stddev_last:%8.3lf ms\n\""
+
+        echo "COMMENT:\"     \""
+        echo "AREA:loss_neg#${loss_c}:\"Loss         \""
+        echo "GPRINT:\"loss_min:%4.1lf %%\t\t\""
+        echo "GPRINT:\"loss_max:%4.1lf %%\t\t\""
+        echo "GPRINT:\"loss_avg:%4.1lf %%\t\t\""
+        echo "GPRINT:\"loss_last:%4.1lf %%\n\""
+        echo "COMMENT:\" \n\""
+        echo "GPRINT:\"latency_last:Ending at %H\\:%M on %B %d, %Y\\r:strftime\""
+
+        echo ">"
+        echo "<p>"
+}
+
+(
+    echo "#!/usr/bin/rrdcgi"
+    echo "<html> <head> <title>Latency Statistics for ${rrdname}</title> </head> <body>"
+
+    gen_graph "${pngprefix}${pngname}-1.png" "${rrdprefix}${rrdname}.rrd" "now-8h" "now" "60" "Last 8 hours - 1 minute intervals"
+    gen_graph "${pngprefix}${pngname}-2.png" "${rrdprefix}${rrdname}.rrd" "now-36h" "now" "300" "Last 36 hours - 5 minute intervals"
+    gen_graph "${pngprefix}${pngname}-3.png" "${rrdprefix}${rrdname}.rrd" "now-8d" "now" "1800" "Last 8 days - 30 minute intervals"
+    gen_graph "${pngprefix}${pngname}-4.png" "${rrdprefix}${rrdname}.rrd" "now-60d" "now" "14400" "Last 60 days - 4 hour intervals"
+    gen_graph "${pngprefix}${pngname}-5.png" "${rrdprefix}${rrdname}.rrd" "now-1y" "now" "86400" "Last 1 year - 1 day intervals"
+    gen_graph "${pngprefix}${pngname}-6.png" "${rrdprefix}${rrdname}.rrd" "now-4y" "now" "86400" "Last 4 years - 1 day intervals"
+
+    echo "</body> </html>"
+) > "${pngname}.cgi"
--- a/rrd/dpinger_rrd_graph
+++ b/rrd/dpinger_rrd_graph
@@ -0,0 +1,131 @@
+#!/bin/sh
+
+if [ $# -ne 2 ]
+then
+    echo "usage: $0 rrdname pngname"
+    exit 1
+fi
+rrdname="${1}"
+pngname="${2}"
+
+# Prefixes for rrd and png files. Note that if the prefix is a directory, it must include the trailing slash.
+# If no value is set, the files are located in the current directory
+rrdprefix=
+pngprefix=/tmp/
+
+# Graph dimensions
+graph_height=240
+graph_width=720
+#graph_height=280
+#graph_width=840
+
+# Preferred font
+font="DejaVuSansMono"
+
+# Latency breakpoints in milliseconds
+latency_s0=20
+latency_s1=40
+latency_s2=80
+latency_s3=160
+latency_s4=320
+
+# Latency colors
+latency_c0="dddddd"
+latency_c1="ddbbbb"
+latency_c2="d4aaaa"
+latency_c3="cc9999"
+latency_c4="c38888"
+latency_c5="bb7777"
+
+# Standard deviation color & opacity
+stddev_c="55333355"
+
+# Loss color
+loss_c="ee0000"
+
+
+gen_graph()
+{
+    png=$1
+    rrd=$2
+    start=$3
+    end=$4
+    step=$5
+    description=$6
+
+    rrdtool graph "${png}" \
+        --lazy \
+        --start "${start}" --end "${end}" --step "${step}" \
+        --height "${graph_height}" --width "${graph_width}" \
+        --title "Average Latency and Packet Loss - ${description}" \
+        --disable-rrdtool-tag \
+        --color BACK#ffffff \
+        --font DEFAULT:9:"${font}" \
+        --font AXIS:8:"${font}" \
+        \
+        DEF:latency_us="${rrd}":latency:AVERAGE:step="${step}" \
+        CDEF:latency=latency_us,1000,/ \
+        CDEF:latency_s0=latency,${latency_s0},MIN \
+        CDEF:latency_s1=latency,${latency_s1},MIN \
+        CDEF:latency_s2=latency,${latency_s2},MIN \
+        CDEF:latency_s3=latency,${latency_s3},MIN \
+        CDEF:latency_s4=latency,${latency_s4},MIN \
+        VDEF:latency_min=latency,MINIMUM \
+        VDEF:latency_max=latency,MAXIMUM \
+        VDEF:latency_avg=latency,AVERAGE \
+        VDEF:latency_last=latency,LAST \
+        \
+        DEF:stddev_us="${rrd}":stddev:AVERAGE:step="${step}" \
+        CDEF:stddev=stddev_us,1000,/ \
+        VDEF:stddev_min=stddev,MINIMUM \
+        VDEF:stddev_max=stddev,MAXIMUM \
+        VDEF:stddev_avg=stddev,AVERAGE \
+        VDEF:stddev_last=stddev,LAST \
+        \
+        DEF:loss="${rrd}":loss:AVERAGE:step="${step}" \
+        CDEF:loss_neg=loss,-1,* \
+        VDEF:loss_min=loss,MINIMUM \
+        VDEF:loss_max=loss,MAXIMUM \
+        VDEF:loss_avg=loss,AVERAGE \
+        VDEF:loss_last=loss,LAST \
+        \
+        COMMENT:"                         Min              Max              Avg             Last\n" \
+        \
+        COMMENT:"     " \
+        AREA:latency#${latency_c5} \
+        AREA:latency_s4#${latency_c4} \
+        AREA:latency_s3#${latency_c3} \
+        AREA:latency_s2#${latency_c2} \
+        AREA:latency_s1#${latency_c1} \
+        AREA:latency_s0#${latency_c0} \
+        LINE1:latency#000000:"Latency  " \
+        GPRINT:"latency_min:%8.3lf ms\t" \
+        GPRINT:"latency_max:%8.3lf ms\t" \
+        GPRINT:"latency_avg:%8.3lf ms\t" \
+        GPRINT:"latency_last:%8.3lf ms\n" \
+        \
+        COMMENT:"     " \
+        LINE1:stddev#${stddev_c}:"Stddev   " \
+        GPRINT:"stddev_min:%8.3lf ms\t" \
+        GPRINT:"stddev_max:%8.3lf ms\t" \
+        GPRINT:"stddev_avg:%8.3lf ms\t" \
+        GPRINT:"stddev_last:%8.3lf ms\n" \
+        \
+        COMMENT:"     " \
+        AREA:loss_neg#${loss_c}:"Loss         " \
+        GPRINT:"loss_min:%4.1lf %%\t\t" \
+        GPRINT:"loss_max:%4.1lf %%\t\t" \
+        GPRINT:"loss_avg:%4.1lf %%\t\t" \
+        GPRINT:"loss_last:%4.1lf %%\n" \
+        \
+        COMMENT:" \n" \
+        GPRINT:"latency_last:Ending at %H\:%M on %B %d, %Y\r:strftime"
+}
+
+
+gen_graph "${pngprefix}${pngname}-1.png" "${rrdprefix}${rrdname}.rrd" "now-8h" "now" "60" "Last 8 hours - 1 minute intervals"
+gen_graph "${pngprefix}${pngname}-2.png" "${rrdprefix}${rrdname}.rrd" "now-36h" "now" "300" "Last 36 hours - 5 minute intervals"
+gen_graph "${pngprefix}${pngname}-3.png" "${rrdprefix}${rrdname}.rrd" "now-8d" "now" "1800" "Last 8 days - 30 minute intervals"
+gen_graph "${pngprefix}${pngname}-4.png" "${rrdprefix}${rrdname}.rrd" "now-60d" "now" "14400" "Last 60 days - 4 hour intervals"
+gen_graph "${pngprefix}${pngname}-5.png" "${rrdprefix}${rrdname}.rrd" "now-1y" "now" "86400" "Last 1 year - 1 day intervals"
+gen_graph "${pngprefix}${pngname}-6.png" "${rrdprefix}${rrdname}.rrd" "now-4y" "now" "86400" "Last 4 years - 1 day intervals"
--- a/rrd/dpinger_rrd_update
+++ b/rrd/dpinger_rrd_update
@@ -0,0 +1,27 @@
+#!/bin/sh
+
+if [ $# -lt 2 ]
+then
+    echo "usage: $0 rrdname targetip [dpinger options]"
+    exit 1
+fi
+name="$1"
+targetip="$2"
+shift 2
+options=$*
+
+# Where the dpinger executable is located
+dpinger=/usr/local/bin/dpinger
+
+
+rrdfile="${name}.rrd"
+if [ \! -w "${rrdfile}" ]
+then
+    echo "$0: file \"${rrdfile}\" does not exist or is not writable"
+    exit 1
+fi
+
+${dpinger} -f ${options} -s 500m -t 60s -r 60s ${targetip} |
+while read -r latency stddev loss; do
+    rrdtool update "${rrdfile}" -t latency:stddev:loss "N:$latency:$stddev:$loss"
+done
--- a/rrd/sample.html
+++ b/rrd/sample.html
@@ -0,0 +1,16 @@
+<html>
+    <head><title>Latency Statistics for WAN</title></head>
+    <body>
+        <img src="/tmp/wan-1.png" alt="wan-1">
+        <p>
+        <img src="/tmp/wan-2.png" alt="wan-2">
+        <p>
+        <img src="/tmp/wan-3.png" alt="wan-3">
+        <p>
+        <img src="/tmp/wan-4.png" alt="wan-4">
+        <p>
+        <img src="/tmp/wan-5.png" alt="wan-5">
+        <p>
+        <img src="/tmp/wan-6.png" alt="wan-6">
+    </body>
+</html>
Author	SHA1	Message	Date
dennypage	664f5c7aa6	Use dpinger defaults for send_interval and time_period	2023-11-08 10:40:52 -08:00
Denny Page	0e963753e1	Use gcc by default	2023-11-08 10:22:58 -08:00
Denny Page	e3cb41889e	Add explicit cast for assignment of alarm_hold_periods	2023-11-08 10:08:07 -08:00
dennypage	9c31ea4380	Update default parameters and correct loss calculation Update parameters to reflect existing defaults and correct the loss resolution calculation when using subsecond send intervals.	2023-11-08 08:29:46 -08:00
Denny Page	fff9b65eb5	Correct usage note regarding loss resolution with subsecond send intervals	2023-11-08 08:07:39 -08:00
Denny Page	47f1a778b9	Correct usage note regarding parameters to the alert_cmd	2023-11-06 15:40:46 -08:00
Denny Page	ce7d88bddf	Update version to 3.3	2023-01-18 19:17:16 -08:00
Denny Page	67b8ba1f6d	Add option to explicitly control the hold time for alarms.	2023-01-18 18:24:20 -08:00
Denny Page	c845c582b4	Add examples for dpinger logging/monitoring with InfluxDB and Grafana	2022-05-14 14:50:40 -07:00
dennypage	fbc7e8f87f	Update copyright year	2022-03-01 08:21:24 -08:00
Denny Page	efc17c7204	Log signal number on exit	2022-02-28 10:07:30 -08:00
Denny Page	bc00923f62	Update text formatting to match current GitHub format. No change to actual license.	2020-06-07 13:29:36 -07:00
Denny Page	bf18a6e2a8	Update copyright	2020-06-07 13:23:33 -07:00
Denny Page	cee7ac9da0	Add a version number to usage output	2017-12-08 21:37:23 -08:00
Denny Page	2b032751e5	Enhance pid file support to detect running processes	2017-09-29 16:04:13 -07:00
Denny Page	84ee15b155	Clean up loss accuracy description	2017-09-29 15:13:48 -07:00
Denny Page	e10c51ad95	Move check for zero intervals back to caller. Prior commit broke disable of report interval.	2017-09-29 00:23:16 -07:00
Denny Page	579ae3d66b	Detect (and reject) negative numbers in paramaters	2017-09-28 15:53:20 -07:00
Denny Page	64e644e7be	Don't wait for send interval before sending first echo request	2017-09-28 14:20:21 -07:00
Denny Page	a18d82ab6e	Update copyright	2017-09-28 14:00:40 -07:00
Denny Page	34b0bb924e	Use accept4()	2017-09-28 13:04:16 -07:00
dennypage	4173834bbe	Create NOTES.md	2017-09-27 13:36:52 -07:00
dennypage	2a8eaa0c8f	Merge pull request #23 from joemiller/openbsd problem: cannot build on openbsd	2017-08-22 16:57:09 -07:00
joe miller	edb883498d	problem: cannot build on openbsd solution: include socket.h before if.h since if.h relies on types defined in socket.h	2017-03-15 08:16:00 -07:00
Denny Page	c276feb339	Confirm that the rrd file is writable before starting	2016-03-01 22:14:36 -08:00
Denny Page	ef21655e77	Change title	2016-03-01 22:14:13 -08:00
Denny Page	2d2d21892a	Change the default time period and send interval from 30s/250m to 60s/500m Check time period vs send interval and loss interval to ensure there is always one resolved slot	2016-03-01 22:11:03 -08:00
Denny Page	a24c0cd0d0	Add option to set receive thread scheduling class Don't call fatal with a NULL format	2016-03-01 21:58:10 -08:00
Denny Page	fdbd4a1d96	Add a safety cast for 32 bit systems	2016-02-27 21:29:33 -08:00
Denny Page	6796fa0752	Fix integer overflow on 32 bit	2016-02-27 19:34:47 -08:00
Denny Page	24022ac098	Fix race condition with ultra low latency links	2016-02-25 20:09:30 -08:00
Denny Page	42c84e965e	Time duration doesn't work in older versions of rrdtool	2016-02-21 14:38:53 -08:00
Denny Page	f94f6dcd47	Separate rrd and png names	2016-02-17 20:15:33 -08:00
dennypage	99208a60fe	Update README.md	2016-02-13 21:06:56 -08:00
Denny Page	eb67d76de0	Merge branch 'master' of https://github.com/dennypage/dpinger	2016-02-13 21:03:46 -08:00
Denny Page	0b68c438f2	Output to name.cgi	2016-02-13 21:03:13 -08:00
dennypage	daac746074	Update README.md	2016-02-13 20:46:43 -08:00
dennypage	5348af36c6	Create README.md	2016-02-13 20:44:42 -08:00
Denny Page	b0fc95f618	Add RRD sample scripts	2016-02-13 19:42:41 -08:00
Denny Page	e9ffd0b43e	Add void casts for discarded return values	2016-02-03 10:26:31 -08:00
Denny Page	9e8968adce	Add -Wno-unused-result to fix spurious gcc warning Default to optimized build	2016-02-03 10:11:48 -08:00
Denny Page	a8b44bedac	Fix warnings from clang -Weverything	2016-02-02 22:28:21 -08:00
Denny Page	1d34caccbd	Add explicit gcc/clang warning examples	2016-02-02 22:13:39 -08:00