Commit Graph

609 Commits

Author SHA1 Message Date
Simon Sundberg
e7201c9eab pping: Update TODO based on map cleaning improvements
Also add description of three potential issues introduced by the
changes to cleanup process.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2022-03-10 09:59:21 +01:00
Simon Sundberg
f22025f716 pping: More aggressive map cleanup
Add conditions that allow removing old flow and timestamp entries
sooner.

For the flow map, add conditions that allow unopened flows and ICMP
flows to be removed earlier than open TCP flows (currently both set to
30 sec instead of 300 sec).

For timestamp entries, allow them to be removed if their age is more
than TIMESTAMP_RTT_LIFETIME (currently 8) times the flow's sRTT.
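
The timestamp-entry condition can be sketched roughly as follows. This is
a minimal illustration, not ePPing's actual code: the function name, the
nanosecond units, and keeping entries while no sRTT is known (sRTT == 0)
are assumptions. Only TIMESTAMP_RTT_LIFETIME and its value of 8 come from
the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

#define TIMESTAMP_RTT_LIFETIME 8 /* value stated in the commit message */

/* Hypothetical helper: true if a timestamp entry is old enough, relative
 * to the flow's smoothed RTT, to be removed early. */
static bool ts_entry_expired_by_rtt(uint64_t now_ns, uint64_t entry_ts_ns,
                                    uint64_t srtt_ns)
{
    /* Assumption: without an sRTT estimate this rule does not apply. */
    if (srtt_ns == 0 || now_ns < entry_ts_ns)
        return false;

    uint64_t age_ns = now_ns - entry_ts_ns;

    return age_ns > TIMESTAMP_RTT_LIFETIME * srtt_ns;
}
```

With an sRTT of 100 ns, an entry aged 900 ns would be removable, while one
aged exactly 800 ns would not.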

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2022-03-10 09:54:52 +01:00
Simon Sundberg
72404b6767 pping: Add map cleanup debug info
Add some debug info to the periodical map cleanup process. Push debug
information through the events perf buffer by using newly added
map_clean_event.

The old user space map cleanup process had some simple debug
information that was lost when transitioning to using bpf_iter
instead. Therefore, add back similar (but more extensive) debug
information but now collected from the BPF-side. In addition to stats
on entries deleted by the cleanup process, also include stats on
entries deleted by ePPing itself due to matching (for timestamp
entries) or detecting FIN/RST (for flow entries).

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2022-03-10 09:45:22 +01:00
Simon Sundberg
be0921d116 pping: Use BPF iterators to do map gc
To improve the performance of the map cleanup, switch from the
user-space loop to using BPF iterators. With BPF iterators, a BPF
program can be run on each element in the map, and can thus be done in
kernel-space. This should hopefully also avoid the issue the previous
userspace loop had with resetting in case an element was removed by
the BPF programs during the cleanup.

Due to removal of userspace logic for map cleanup, no longer provide
any debug information about how many entries there are in each map and
how many of them were removed by the garbage collection. This will be
added back in the next commit.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2022-03-10 09:45:15 +01:00
Simon Sundberg
d4c33259c7 pping: Improve program cleanup on shutdown
Explicitly stop the thread which performs periodical map cleanup on
shutdown, before attempting to free up any resources it might use.

For the cleanup order to make more sense, also setup the perf-buffer
before setting up the periodical map cleaning, so that the thread can
be stopped as the first part of the cleanup. This also matches better
with the next couple of commits where map cleaning debug information
will be pushed through the perf-buffer.

Finally, move the addition of the signalhandler earlier in the code to
eliminate a window where it was possible to terminate the program
without relevant cleanup code having a chance to run.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2022-03-09 19:20:51 +01:00
Simon Sundberg
6885771b14 pping: Allocate cleanup-args on main's stack
Allocate the clean_args on the stack of the main function rather than
the stack of setup_periodical_map_cleaning. The clean_args is used by
periodical_map_cleanup from a different thread, so allocating it on
the stack of setup_periodical_map_cleaning, which returns directly
after the thread is created, invites errors where later function calls
may overwrite the arguments, causing unpredictable behavior.
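
The lifetime rule behind this change can be sketched as below. This is
a simplified illustration rather than the pping code: the struct fields
and all function names here are hypothetical, and the thread does a
single dummy pass instead of a periodic loop. The point is that the
arguments live in the frame of a function that outlives the thread.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the real cleanup arguments. */
struct clean_args {
    int interval_ms;
    int runs; /* written by the thread, read after join */
};

static void *cleanup_thread(void *arg)
{
    struct clean_args *args = arg;

    args->runs = 1; /* pretend one cleanup pass was done */
    return NULL;
}

/* The caller owns *args and must keep it alive until the thread ends. */
static int setup_cleanup(pthread_t *tid, struct clean_args *args)
{
    return pthread_create(tid, NULL, cleanup_thread, args);
}

/* Plays the role of main(): args stay valid until after the join. */
static int run_cleanup_once(void)
{
    struct clean_args args = { .interval_ms = 1000, .runs = 0 };
    pthread_t tid;

    if (setup_cleanup(&tid, &args) != 0)
        return -1;
    pthread_join(tid, NULL);
    return args.runs;
}
```

Had `args` been a local variable of `setup_cleanup`, the thread would be
handed a pointer into a stack frame that is gone as soon as the function
returns.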

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2022-03-09 14:54:20 +01:00
Toke Høiland-Jørgensen
27861e384e Merge pull request #39 from chenhengqi/update-btf-example
Update btf example
2022-03-09 12:00:39 +01:00
Hengqi Chen
e307581009 BTF-playground: Handle error and free resources explicitly
Signed-off-by: Hengqi Chen <chenhengqi@outlook.com>
2022-03-09 09:04:38 +08:00
Hengqi Chen
c9398145ac Ignore lib/libbpf-install/
Signed-off-by: Hengqi Chen <chenhengqi@outlook.com>
2022-03-06 21:54:01 +08:00
Hengqi Chen
8301679918 configure: Fix mixed indentions
Signed-off-by: Hengqi Chen <chenhengqi@outlook.com>
2022-03-06 21:52:48 +08:00
Jesper Dangaard Brouer
ce714625ce AF_XDP-interaction: Fix function tx_pkt
If there were no more transmit slots, then we umem free the
packet, but we continued sending it anyhow.

In the places where tx_pkt() is currently used, this never happened.
Still, fix the bug.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-02-21 19:03:49 +01:00
Toke Høiland-Jørgensen
dd24bce55c Merge pull request #32 from simosund/pping_improve_capabilities
PPing core improvements
2022-02-10 18:20:18 +01:00
Simon Sundberg
b8215e70b2 pping: Add section on potential issues to TODO-list
Collect potential issues under a new section in the TODO list. These
are issues I generally don't think are that severe, but may still be
useful to note down and keep in mind.

Move the section on potential concurrency issues from README to the
new section in the TODO-list.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2022-02-10 16:17:10 +01:00
Simon Sundberg
2647429081 pping: Add warnings for failing to create map entry
Send a warning notifying the user that PPing failed to create a
flow/timestamp entry due to the corresponding map being full. To avoid
sending a warning for every packet, only emit warnings every
WARN_MAP_FULL_INTERVAL (which is currently hard-coded to 1s).
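
A rate limit of this kind can be sketched as follows. It is a minimal
sketch under assumptions: the helper name, the nanosecond timestamps, and
the use of 0 to mean "never warned" are hypothetical. Only the
WARN_MAP_FULL_INTERVAL name and its 1 s value come from the commit
message.

```c
#include <stdbool.h>
#include <stdint.h>

#define WARN_MAP_FULL_INTERVAL 1000000000ULL /* 1 s, expressed in ns */

/* Hypothetical helper: returns true if a warning should be emitted now,
 * and if so records the time of this warning in *last_warn_ns. */
static bool should_warn_map_full(uint64_t now_ns, uint64_t *last_warn_ns)
{
    if (*last_warn_ns != 0 &&
        now_ns - *last_warn_ns < WARN_MAP_FULL_INTERVAL)
        return false; /* warned too recently, stay quiet */

    *last_warn_ns = now_ns;
    return true;
}
```

Each map would keep its own `last_warn_ns`, so a full flow map does not
suppress warnings about a full timestamp map.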

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2022-02-10 16:17:10 +01:00
Simon Sundberg
54886c92d9 pping: Refactor event handling code
Refactor code for how events are handled in the user space
application. Preparation for adding an additional event type which
should not be handled by the normal functions for printing RTT and
flow events.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2022-02-10 16:17:10 +01:00
Simon Sundberg
32bdf11a96 pping: Only consider flow opened on reply
Delay sending a flow open message until a reply has been seen for
the flow. Likewise, only emit a flow closing event if the flow has
first been opened (that is, a reply has been seen).

This introduces potential (but unlikely) concurrency issues for flow
opening/closing messages which are further described in the README.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2022-02-10 16:17:10 +01:00
Simon Sundberg
8a8f538759 pping: Do both timestamping and matching on ingress and egress
Perform both timestamping and matching on both ingress and egress
hooks. This makes it more similar to Kathie's pping, allowing the tool
to capture RTTs in both directions when deployed on just a single
interface.

Like Kathie's pping, by default filter out RTTs for packets going to
the local machine (will only include local processing delays). This
behavior can be disabled by passing the -l/--include-local option.

As packets that are timestamped on ingress and matched on egress will
include the local machine's processing delay, add the "match_on_egress"
member to the JSON output that can be used to differentiate between
RTTs that include the local processing delay, and those which don't.

Finally, report the source and destination addresses from the perspective
of the reply packet, rather than the timestamped packet, to be
consistent with Kathie's pping.

Overall, refactor large parts of pping_kern to allow both timestamping
and matching, as well as updating both the flow and reverse flow and
handle flow-events related to them, in one go. Also update README to
reflect changes.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2022-02-10 16:16:24 +01:00
Simon Sundberg
928a4144a9 pping: Add RTT-based sampling
Add an option (-R, --rtt-rate) to adapt the sampling rate based on the
RTT of the flow. The sampling interval will be C * RTT, where C is a
configurable constant (ex 1.0 to get one sample every RTT), and RTT
is either the current minimum (default) or smoothed RTT of the
flow (chosen via the -t or --rtt-type option).

The smoothed RTT (sRTT) is updated for each calculated RTT, and is
calculated in a similar manner to srtt in the kernel's TCP stack. The
sRTT is a moving average of all RTTs, and is calculated according to
the formula:

  srtt = 7/8 * prev_srtt + 1/8 * rtt
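
In integer arithmetic the formula is commonly written with shifts. The
sketch below is one such form, not necessarily the exact code used in
pping: the function name and seeding the average with the first sample
(when prev_srtt is 0) are assumptions.

```c
#include <stdint.h>

/* srtt = 7/8 * prev_srtt + 1/8 * rtt, using shifts instead of division. */
static uint64_t update_srtt(uint64_t prev_srtt, uint64_t rtt)
{
    if (prev_srtt == 0)
        return rtt; /* assumption: first sample seeds the average */

    return prev_srtt - (prev_srtt >> 3) + (rtt >> 3);
}
```

For example, a previous sRTT of 800 and a new RTT of 1600 give
800 - 100 + 200 = 900.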

To allow the user to pass a non-integer C (ex 0.1 to get 10 RTT
samples for every RTT-period), fixed-point arithmetic has been used
in the eBPF programs (due to lack of support for floats). The maximum
value for C has been limited to 10000 in order for it to be unlikely
that the C * RTT calculation will overflow (with C = 10000, overflow
will only occur if RTT > 28 seconds).
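
The overflow bound can be checked with a small fixed-point sketch. The
format below (unsigned 64-bit values with 16 fractional bits, RTT in
nanoseconds) is an assumption chosen for illustration, and all names are
hypothetical. Under that assumption the limit for C = 10000 comes out at
roughly 28.1 seconds, which is consistent with the figure above.

```c
#include <stdint.h>

#define FIXPOINT_SHIFT 16 /* assumed number of fractional bits */

/* Convert a non-negative double to the assumed fixed-point format. */
static uint64_t double_to_fixpoint(double x)
{
    return (uint64_t)(x * (double)(1ULL << FIXPOINT_SHIFT));
}

/* Sampling interval C * RTT, with C in fixed point and RTT in ns. */
static uint64_t sample_interval_ns(uint64_t c_fixed, uint64_t rtt_ns)
{
    return (c_fixed * rtt_ns) >> FIXPOINT_SHIFT;
}

/* Largest RTT for which c_fixed * rtt_ns still fits in 64 bits. */
static uint64_t max_rtt_without_overflow_ns(uint64_t c_fixed)
{
    if (c_fixed == 0)
        return UINT64_MAX;
    return UINT64_MAX / c_fixed;
}
```

With C = 0.5 and an RTT of 1 ms, `sample_interval_ns` yields 500000 ns.
Fractions that are not exactly representable, such as 0.1, are truncated
slightly by the conversion.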

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2022-02-10 16:11:21 +01:00
Simon Sundberg
c79c4e8571 pping: Eliminate flow creation/deletion concurrency issue
Only push flow events for opening/closing flows if the
creation/deletion of the flow-state was successful (as indicated by
the bpf_map_*_elem() return value). This should avoid outputting
several flow creation/deletion messages in case multiple instances are
trying to create/delete a flow concurrently, as could theoretically
occur previously.

Also set the last_timestamp value before creating a new flow, to avoid
a race condition where the userspace cleanup might incorrectly
determine that a flow is old before the last_timestamp value can be
set. Explicitly skip the rate-limit for the first packet of a new flow
to avoid it failing the rate-limit. This also fixes an issue where the
first packet of a new flow would previously fail the rate-limit if the
rate-limit was higher than the current uptime (CLOCK_MONOTONIC).

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2022-02-10 16:08:23 +01:00
Simon Sundberg
1cadbe0ae7 pping: Make parsed protocols configurable
Add command-line flags for each protocol that pping should attempt to
parse and report RTTs for (currently -T/--tcp and -C/--icmp). If no
protocol is specified assume TCP. To clarify this, output a message
before start on how ePPing has been configured (stating output format,
tracked protocols and which interface to run on).

Additionally, as the ppviz format was only designed for TCP it does
not have any field for which protocol an entry belongs to. Therefore,
emit a warning in case the user selects the ppviz format with anything
other than TCP.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2022-02-10 16:02:24 +01:00
Simon Sundberg
bd6ded5c21 pping: Add support for ICMP echo messages
Allow pping to passively monitor RTT for ICMP echo request/reply
flows. Use the echo identifier as ports, and echo sequence as packet
identifier.

Additionally, add protocol to standard output format in order to be
able to distinguish between TCP and ICMP flows.

The ppviz format does not include protocol, making it impossible to
distinguish between TCP and ICMP traffic. A warning for using the ppviz
format together with ICMP traffic will be added in the future.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2022-02-03 16:04:14 +01:00
Simon Sundberg
af5e660d8e pping: Only match TSecr in ACKs
The echoed TCP timestamp (TSecr) is only valid if the ACK flag is
set. So make sure to only attempt to match on ACK packets.
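
The check itself is small. A sketch, assuming the flags are available as
the single flags byte of the TCP header (where ACK is bit 0x10); the
macro and function names are hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

#define TCP_ACK_BIT 0x10 /* ACK bit in the TCP header flags byte */

/* TSecr should only be used for matching when the ACK flag is set. */
static bool tsecr_is_valid(uint8_t tcp_flags)
{
    return (tcp_flags & TCP_ACK_BIT) != 0;
}
```

A plain SYN (0x02) would be skipped for matching, while a SYN-ACK (0x12)
would be considered.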

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2022-02-02 10:14:44 +01:00
Jesper Dangaard Brouer
c14f52a4d4 Merge pull request #38 from xdp-project/vestas06_tc_qdisc
TC policy example of overriding netstack TXQ
2022-02-01 15:35:35 +01:00
Jesper Dangaard Brouer
91432fe471 tc-policy: Update README with info on inspecting loaded BPF-progs
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-02-01 15:30:17 +01:00
Jesper Dangaard Brouer
71d1479d1d tc-policy: Add more advanced bpftrace script
Digging into the return value of netdev_pick_tx().
Want to be able to debug the case where a socket
selects another queue_id.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-02-01 14:49:47 +01:00
Jesper Dangaard Brouer
ba2c114db0 tc-policy: Take into account locally generated traffic
The BPF-prog "not_txq_zero" also needed to take into account
that skb->queue_mapping usually isn't set for locally
generated traffic.

I worry that sockets can set another queue id that could
override our (BPF) choice in netdev_pick_tx().
See sk_tx_queue_set() and sk_tx_queue_get().

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-02-01 13:28:20 +01:00
Jesper Dangaard Brouer
a3af1bb99c tc-policy: Implement new BPF section that disallow using TXQ zero
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-02-01 13:11:31 +01:00
Jesper Dangaard Brouer
741205e13c tc-policy: Mention XPS need to be disabled
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-02-01 12:40:14 +01:00
Jesper Dangaard Brouer
f33789e4c6 tc-policy: rename xps_setup.sh to xps_setup_ash.sh
This version of the XPS script has been modified to
work with the ash shell, as bash was not available on
the Yocto target host.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-02-01 12:33:49 +01:00
Jesper Dangaard Brouer
cf5ecd6999 tc-policy: Adjust README for GitHub rendering
The bpftrace oneliner was getting rendered wrong in GitHub html view.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-02-01 12:16:19 +01:00
Jesper Dangaard Brouer
e72d309789 tc-policy: Monitor TXQ usage with bpftrace script
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-02-01 11:56:32 +01:00
Toke Høiland-Jørgensen
007c0b6883 Merge pull request #37 from simosund/pping_graceful_shutdown
pping: Add graceful shutdown on SIGTERM
2022-02-01 00:12:26 +01:00
Jesper Dangaard Brouer
520d2e6109 tc-policy: Add README documentation
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-01-31 21:40:18 +01:00
Simon Sundberg
de5d65030c pping: Add graceful shutdown on SIGTERM
Also intercept SIGTERM (in addition to the previously intercepted
SIGINT) and perform graceful shutdown.
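
A common way to do this is to register one handler for both signals and
let it clear a flag that the main loop polls. The sketch below follows
that pattern. It is not taken from pping, and the names `keep_running`,
`handle_shutdown_signal` and `install_shutdown_handlers` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Polled by the main loop; cleared when a shutdown signal arrives. */
static volatile sig_atomic_t keep_running = 1;

static void handle_shutdown_signal(int sig)
{
    (void)sig;
    keep_running = 0; /* only async-signal-safe work in the handler */
}

/* Install the same handler for SIGINT and SIGTERM. Returns 0 on success. */
static int install_shutdown_handlers(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handle_shutdown_signal;
    sigemptyset(&sa.sa_mask);

    if (sigaction(SIGINT, &sa, NULL) != 0)
        return -1;
    return sigaction(SIGTERM, &sa, NULL);
}
```

The main loop would then run while `keep_running` is non-zero and perform
its cleanup after the loop exits.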

Perhaps it also makes sense to perform graceful shutdown on some
additional signals, like SIGHUP and SIGQUIT?

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2022-01-31 18:47:08 +01:00
Jesper Dangaard Brouer
1bc9b91e3f tc-policy: Verbose info when flushing TC-BPF programs
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-01-31 18:29:03 +01:00
Jesper Dangaard Brouer
c45bdfddfa tc-policy: No need to call bpf_tc_query after bpf_tc_attach
The attach_egress.prog_id has already been provided
after calling bpf_tc_attach.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-01-31 18:25:32 +01:00
Jesper Dangaard Brouer
a00ed96ed6 tc-policy: Be more verbose, but add --quiet option
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-01-31 17:29:52 +01:00
Jesper Dangaard Brouer
78515fdd2e tc-policy: Now --unload only remove our own TC-BPF prog
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-01-31 16:43:01 +01:00
Jesper Dangaard Brouer
665f26e06b tc-policy: Make more clear that destroying hook kill all progs
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-01-31 15:55:06 +01:00
Jesper Dangaard Brouer
9fba4b12ac tc-policy: Fix issue of too many TC-BPF filter prog attached multi
This seems to be a commonly occurring issue with the tc cmdline,
and the C-code has inherited the issue in the API.

Trying to replace a TC-BPF prog often results in appending a new prog
(as a new tc filter instance).

Be careful to set both handle and prio and the replace flag.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-01-31 15:20:38 +01:00
Jesper Dangaard Brouer
5917dbff3f tc-policy: Add teardown --unload functionality
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-01-31 14:59:51 +01:00
Jesper Dangaard Brouer
17c3aa7661 tc-policy: Attach to TC egress hook as libbpf C-code
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-01-31 13:37:14 +01:00
Jesper Dangaard Brouer
9e5ab8f4fb tc-policy: Use bpftool skeleton header generate feature
The reason for going this route is that this allows us to
create a user binary that contains the BPF object file.
Thus, we can avoid having to load the BPF file from
a specific location or having to be in same dir as file.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-01-31 12:39:31 +01:00
Jesper Dangaard Brouer
dcdceea90c tc-policy: Add C-code boilerplate tc_txq_policy.c
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-01-31 12:10:45 +01:00
Jesper Dangaard Brouer
b1efd37ed5 tc-policy: Adapt XPS script to work with ash on Yocto
The Yocto build this is intended for doesn't have /bin/bash,
so adapt the script.

The external program "getopt" is not available.

The 'sort' tool is also different, as it comes from busybox.
Adapt the cmdline options for 'sort'.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-01-31 10:11:40 +01:00
Jesper Dangaard Brouer
fb84a226b0 tc-policy: Manual setup steps via tc cmdline
The Yocto build has a problem with loading this via tc:

 # tc filter replace dev eth1 egress prio 0xC000 handle 1 bpf da obj tc_txq_policy_kern.o
 Continuing without mounted eBPF fs. Too old kernel?
 mkdir (null)/globals failed: No such file or directory
 Unable to load program

This can be worked around by mounting the BPF file system manually:

 # mount -t bpf bpf /sys/fs/bpf/

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-01-31 09:17:09 +01:00
Jesper Dangaard Brouer
ae52ac2140 tc-policy: Add script for disabling XPS
Taken from xdp-cpumap-tc git repo:
 https://github.com/xdp-project/xdp-cpumap-tc/blob/master/bin/xps_setup.sh

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-01-31 09:12:35 +01:00
Jesper Dangaard Brouer
8fbccb90ab tc-policy: Simple approach static map to TXq 4
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-01-28 17:25:08 +01:00
Jesper Dangaard Brouer
ceb3fd80e7 tc-policy: Add Makefile
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-01-28 16:37:06 +01:00
Jesper Dangaard Brouer
e1b18d7b12 AF_XDP-interaction: Fix another warning in complete_tx
warning: ‘return’ with no value, in function returning non-void

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2022-01-21 12:09:30 +01:00