diff --git a/README.md b/README.md index 820557a..c42838b 100644 --- a/README.md +++ b/README.md @@ -163,6 +163,8 @@ For instance, to create 4 parallel sockets of sFlow and one of NetFlow V5, you c $ ./goflow2 -listen 'sflow://:6343?count=4,nfl://:2055' ``` +More information about workers and resource usage is avaialble on the [Performance page](/docs/performance.md). + ### Docker You can also run directly with a container: diff --git a/docs/performance.md b/docs/performance.md new file mode 100644 index 0000000..2f1baef --- /dev/null +++ b/docs/performance.md @@ -0,0 +1,75 @@ +# Performance + +When setting up GoFlow2 for the first time, it is difficult to estimate the settings and resources required. +This software has been tested with hundreds of thousands of flows per second on common hardware but the default settings may not be optimal everywhere. + +It is important to understand the pattern of your flows. +Some environments have predictable trends, for instance a regional ISP will likely have a peak of traffic at 20:00 local time, +whereas a hosting provider may have large bursts of traffic due to a DDoS attack. + +We need to consider the following: + +* R: The rate of packets (controlled by sampling and traffic) +* C: The decoding capacity of a worker (dependent on CPU) +* L: The allowed latency (dependent on buffer size) + +In a typical environment, capacity matches or exceeds the rate (C >= R). +When the rate goes above the capacity (eg: bursts), packets waiting to be processed pile up. +Latency increases as long as the rate exceeds the capacity. It remains stable if the rate equals the capacity. +It can only lower when there is extra capacity (C-R). + +A buffer too large can cause "buffer bloat" where latency is too high for normal operations (eg: DDoS detection being delayed), +whereas a short buffer (or no buffer for real-time) may drop information during an temporary increase. + +The listen URI can be customized to meet an environment requirements. +GoFlow2 will work better in an environment with guaranteed resources. + +## Life of a packet + +When a packet is received by the collectors' machine, the kernel will send the packet towards a socket. +The socket is buffered. On Linux, the buffersize is a global configuration setting: `rmem_max`. + +If the buffer is full, new packets will be discarded and increasing the count of +UDP errors. + +A first level of load-balancing can be done by having multiple sockets listening +on the same port. +On Linux, this is done with `SO_REUSEPORT` and `SO_REUSEADDRESS` options. +In GoFlow2 you can set the `count` option to define the number of sockets. +Each socket will put the packet in a queue to be decoded. + +The number of `workers` should ideally match the number of CPUs available. + +`Blocking` mode forces GoFlow2 to operate in real-time instead of buffered. A packet is only decoded if +a worker is available and storage depends on the kernel UDP buffer. + +In buffered mode, the size of the queue is set by `queue_size`, much larger than the UDP buffer. + +The URI below summarizes the options: + +``` +$ goflow2 -listen flow://:6343/?count=4&workers=16&blocking=false&queue_size=1000000 + ^ ^ ^ ^ + ┃ ┃ ┃ ┗ In buffered mode, the amount of packets stored in memory + ┃ ┃ ┗ Real-time mode + ┃ ┗ Decoding workers + ┗ Open sockets listening +``` + +## Note on resources guarantees + +GoFlow2 works better on guaranteed fixed resources. +It requires the operator to scope for a worst case scenario in terms of latency. + +RAM usage is dependent on the `queue_size` (unless using blocking mode). +By default, this may exceed the host memory if rate is above capacity and result in an `OoM` crash. +As UDP packets can be a maximum of 9000 bytes, as a result, a 2GB RAM machine can only buffer 222222 packets if there no overhead. + +Kubernetes is an example of allowing flexible resources for processes. + +In a Pod `resources`, the `request` and `limits` can be set for CPU. Extra CPU can be used by other applications if colocated. +Make sure they are the same for RAM since if GoFlow2 is killed, data could be lost, unless you are confident other applications +will not require extra RAM during peaks. + +Furthermore, `HorizontalPodScalers` can be used to create additional GoFlow2 instances and route the packets when a metric crosses a threshold. +This is not recommended with NetFlow/IPFIX without having a shared template system due to cold-starts.