Implement a basic mechanism for parsing arguments from userspace and
passing them to a global config variable in the BPF programs.
This also changes the basic use of the program from:
$./pping interface
to:
$./pping -i interface
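A minimal sketch of how the -i option could be parsed with
getopt_long(). The pping_args struct, the parse_arguments name and the
--interface long form are assumptions made for illustration, not
necessarily what pping.c uses:

```c
#include <getopt.h>
#include <net/if.h>
#include <string.h>

/* Hypothetical container for parsed options (not the real config struct). */
struct pping_args {
	char ifname[IF_NAMESIZE];
};

/* Returns 0 on success, -1 on an unknown option or a missing or
 * too long interface name. */
static int parse_arguments(int argc, char *argv[], struct pping_args *args)
{
	static const struct option long_options[] = {
		{ "interface", required_argument, NULL, 'i' }, /* assumed long form */
		{ NULL, 0, NULL, 0 }
	};
	int opt;

	args->ifname[0] = '\0';
	optind = 1; /* restart option scanning */
	opterr = 0; /* let the caller report errors */

	while ((opt = getopt_long(argc, argv, "i:", long_options, NULL)) != -1) {
		switch (opt) {
		case 'i':
			if (strlen(optarg) >= IF_NAMESIZE)
				return -1;
			strcpy(args->ifname, optarg);
			break;
		default:
			return -1;
		}
	}

	/* A bare positional interface name (the old usage) is not accepted. */
	return args->ifname[0] != '\0' ? 0 : -1;
}
```

With this sketch "./pping -i eth0" fills in ifname, while the old
"./pping eth0" form is rejected.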
Also, revert to using the memset solution for the map_ipv4_to_ipv6
function to avoid the ipv4_prefix constant being stored in the .rodata
section. This makes it easier to set the value for the global config
variable from userspace, as the only thing left in the .rodata section
is the config struct.
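For reference, a userspace sketch of the memset approach. It builds an
IPv4-mapped IPv6 address (::ffff:a.b.c.d) without a constant prefix
array. The BPF version would likely use compiler builtins instead of
the libc calls assumed here:

```c
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* ipv4 is expected in network byte order, as found in the IP header. */
static void map_ipv4_to_ipv6(struct in6_addr *ipv6, uint32_t ipv4)
{
	memset(&ipv6->s6_addr[0], 0x00, 10);  /* 80 zero bits */
	memset(&ipv6->s6_addr[10], 0xff, 2);  /* 16 one bits */
	memcpy(&ipv6->s6_addr[12], &ipv4, sizeof(ipv4)); /* the IPv4 address */
}
```

Because the prefix is written with memset calls rather than copied from
a constant array, no prefix constant needs to be emitted into .rodata.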
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
This example demonstrates how to write a simple eBPF Qdisc classifier
that classifies flows depending on their destination TCP port. The
example script, runner.sh shows how you can use the eBPF Qdisc
classifier and implement the same functionality using u32. The script
creates two network namespaces called Left and Right, representing two
different hosts. The script then illustrates the classifiers in action
using iperf3 by starting clients on the Left namespace that connect to
iperf3 servers on the Right namespace. The Qdisc classifiers give TCP
ports 8080 and 8081 a high rate limit, while TCP port 8082 represents
all other traffic capped at 20 Mbps.
Signed-off-by: Frey Alfredsson <freysteinn@freysteinn.com>
Add a check to assure the verifier that opt_size is positive in case
it has been read in from the stack. Also enable (uncomment) the
flow-state cleanup in the XDP program, as the added check avoids
verifier rejection.
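A userspace sketch of where such a check sits in a TCP option walk. The
function name, the loop bound and the length-based interface are
assumptions for illustration. The BPF code works on packet pointers
instead:

```c
#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_TCP_OPTIONS 10 /* assumed loop bound for this sketch */

/* Returns 0 and fills tsval/tsecr (host byte order) if a TCP
 * timestamp option is found, otherwise -1. */
static int parse_tcp_ts(const uint8_t *opts, size_t opts_len,
			uint32_t *tsval, uint32_t *tsecr)
{
	size_t off = 0;

	for (int i = 0; i < MAX_TCP_OPTIONS; i++) {
		uint8_t opt, opt_size;

		if (off + 1 > opts_len)
			return -1;
		opt = opts[off];
		if (opt == 0) /* end of option list */
			return -1;
		if (opt == 1) { /* NOP is a single byte */
			off++;
			continue;
		}
		if (off + 2 > opts_len)
			return -1;
		opt_size = opts[off + 1];
		if (opt_size < 2) /* must cover the kind and length bytes */
			return -1;
		if (opt == 8 && opt_size == 10) { /* timestamp option */
			if (off + 10 > opts_len)
				return -1;
			memcpy(tsval, &opts[off + 2], 4);
			memcpy(tsecr, &opts[off + 6], 4);
			*tsval = ntohl(*tsval);
			*tsecr = ntohl(*tsecr);
			return 0;
		}
		off += opt_size; /* always advances thanks to the check */
	}
	return -1;
}
```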
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
The verifier might have rejected the XDP program due to opt_size being
loaded from memory, see
https://blog.path.net/ebpf-xdp-and-network-security. Add a check of
opt_size to attempt to convince the verifier that it's not a negative
value or anything else crazy. This instead leads to the verifier
thinking the program is too large (over 1M instructions).
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
As reported in the xdp-tutorial (where this code is from), there were a
couple of sizeof checks in parsing_helpers.h that were using the pointer
size instead of the size of the struct being pointed to.
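To illustrate the class of bug with a hypothetical header type (not the
actual code from parsing_helpers.h):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 20-byte header, larger than a pointer on common ABIs. */
struct example_hdr {
	uint8_t bytes[20];
};

/* Returns true if a whole example_hdr fits between pos and data_end. */
static bool header_fits(const void *pos, const void *data_end)
{
	const struct example_hdr *hdr = pos;
	size_t avail = (size_t)((const uint8_t *)data_end -
				(const uint8_t *)pos);

	/* sizeof(*hdr) is the struct size (20 bytes here). Writing
	 * sizeof(hdr) would only check for the pointer size, typically
	 * 8 bytes, and accept truncated headers. */
	return avail >= sizeof(*hdr);
}
```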
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Add parsing of TCP FIN/RST to determine if connection is being closed,
and if so delete state directly from BPF programs.
Only enabled on the tc program, as the verifier is unhappy about it on
the XDP side for some reason.
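A sketch of the flag test on a raw TCP header. The helper name is made
up here, and the caller is assumed to have already verified that at
least 20 bytes of header are available:

```c
#include <stdbool.h>
#include <stdint.h>

#define TCP_FLAGS_OFFSET 13 /* byte 13 of the TCP header holds the flags */
#define TCP_FLAG_FIN 0x01
#define TCP_FLAG_RST 0x04

/* Returns true if the segment signals that the connection is closing. */
static bool tcp_is_closing(const uint8_t *tcp_hdr)
{
	return (tcp_hdr[TCP_FLAGS_OFFSET] & (TCP_FLAG_FIN | TCP_FLAG_RST)) != 0;
}
```

When this returns true, the per-flow state can be deleted right away
instead of waiting for a later cleanup.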
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Remove some commented-out code. Also use bpf_object__unpin_maps
instead of manually unpinning the ts_start map. Additionally, change
map_ipv4_to_ipv6 to use a clearer implementation (that now also works
for tc due to always using libbpf to load the program).
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Load and pin the tc-bpf program in pping.c using libbpf, and only
attach the pinned program using iproute. That way, pping can use
features that are not supported by the old iproute loader, even if
iproute does not have libbpf support.
To support this change, extend bpf_egress_loader with option to load
pinned program. Additionally, remove configure script and parts of
Makefile that are no longer needed. Furthermore, remove multiple
definitions of ts_start map, and place singular definition in
pping_helpers.h which is included by both BPF programs.
Also, some minor fixes based on Toke's review.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Update SAMPLING_DESIGN.md, partly based on discussions during the
meeting with Red Hat on 2021-03-01.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Add a per-flow rate limit, limiting how often new timestamp entries
can be created. As part of this, add per-flow state keeping track
of when last timestamp was created and last seen identifier for each
flow.
Additionally, remove the timestamp entry as soon as the RTT is
calculated, as the last seen identifier is used to find the first
unique value instead. Furthermore, remove the packet_timestamp struct
and only use __u64 as timestamp, as the used member is no longer needed.
This initial commit lacks cleanup of flow-state, user-configuration of
rate limit, mechanism to handle bursts etc.
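The per-flow decision can be sketched as follows. The struct layout,
function name and nanosecond units are assumptions for illustration and
may differ from the BPF code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-flow state, assumed to start out zeroed. */
struct flow_state {
	uint64_t last_timestamp; /* when the last entry was created (ns) */
	uint32_t last_id;        /* last seen packet identifier */
};

/* Returns true if a new timestamp entry should be created for a packet
 * carrying identifier id at time now_ns. */
static bool should_create_timestamp(struct flow_state *fs, uint32_t id,
				    uint64_t now_ns, uint64_t rate_limit_ns)
{
	if (fs->last_id == id) /* not the first packet with this identifier */
		return false;
	fs->last_id = id;

	if (now_ns - fs->last_timestamp < rate_limit_ns) /* too soon */
		return false;

	fs->last_timestamp = now_ns;
	return true;
}
```

Note that the identifier is remembered even when the rate limit blocks
the entry, so a later packet with the same identifier is still skipped.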
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Add a document outlining my thoughts for how to implement
sampling. Intended both as a basis for discussion, as well as being a
form of documentation.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Rewrite/regroup/reorder some points for the General pping
section. Also add some new points, add some additional comments to
existing points, and check off "Skip pure ACKs" as completed.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
The link to the original pping utility was easy to miss, and we didn't
credit Kathie with its implementation. That was clearly an oversight, so
let's fix that.
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Change how initialization of pctx is done in the tc and xdp
programs. Also, rename len to pkt_len in parsing_context.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Refactor parsing_context to have a len member instead of
data_end_end. Also, refactor parse_tcp_identifier to take pointers
directly to the ports instead of the flow_address structs.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Add a parsing_context struct to keep track of data, data_end and the
currently parsed position, as well as handling the difference between
data_end for XDP and TC through a data_end_end pointer.
Use the parsing_context struct to detect pure TCP ACKs, and avoid
creating identifiers for them on egress (to avoid creating timestamp
entries). This solves the issue of calculating RTTs in improper contexts.
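A simplified sketch of pure ACK detection. The struct below only mirrors
the idea of parsing_context (its members are assumptions), and data_end
is assumed to mark the end of the TCP segment with no trailing padding:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for parsing_context. */
struct parsing_context {
	const uint8_t *data;     /* start of packet */
	const uint8_t *data_end; /* end of packet */
	const uint8_t *pos;      /* current position, here the TCP header */
};

#define TCP_FLAG_FIN 0x01
#define TCP_FLAG_SYN 0x02
#define TCP_FLAG_RST 0x04
#define TCP_FLAG_ACK 0x10

/* A pure ACK has ACK set, no SYN/FIN/RST, and carries no payload. */
static bool is_pure_ack(const struct parsing_context *ctx)
{
	size_t remaining = (size_t)(ctx->data_end - ctx->pos);
	size_t hdr_len;
	uint8_t flags;

	if (remaining < 20) /* minimum TCP header */
		return false;

	hdr_len = (size_t)(ctx->pos[12] >> 4) * 4; /* data offset in bytes */
	if (hdr_len < 20 || hdr_len > remaining)
		return false;

	flags = ctx->pos[13];
	if (!(flags & TCP_FLAG_ACK) ||
	    (flags & (TCP_FLAG_SYN | TCP_FLAG_FIN | TCP_FLAG_RST)))
		return false;

	return remaining == hdr_len; /* nothing after the header */
}
```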
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
It is a bit strange that we have this header file in this repo, but
it will likely be very useful later.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
This is included by linux/if_link.h. Thus, we need it here if the
distro doesn't provide this include file.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Remove the saddr and daddr parameters from parse_packet_identifier, and
use the is_egress parameter to perform the saddr/daddr swap inside the
function. Also, minor style fixes.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
The bpf_helper_defs.h is used by the (libbpf-provided) bpf/bpf_helpers.h.
Thus, it doesn't belong under the headers/ directory.
Remove file: headers/bpf/bpf_helper_defs.h
Fixes: f0fce8f62b ("Update kernel headers and libbpf version")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Refactor TC and XDP programs to reuse common logic for parsing
packets. Add functions for parsing packets for an identifier to
pping_helpers.h which both TC and XDP parts use. Also make it easier
to extend pping with support for new protocols, as only new parsing
functions have to be added and inserted into a single place.
Also add reserved members to end of structs in pping.h to indicate
padding.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
When some BPF example uses libxdp, these can be re-added, along
with a description of why projects need to include these files.
Files removed:
headers/xdp/libxdp.h
headers/xdp/prog_dispatcher.h
headers/xdp/xdp_helpers.h
Fixes: 4513664ca3 ("Initial import with encap-forward example")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
It is fairly natural to include <errno.h> in BPF programs, as some
helpers and hooks use these errno defines (like lsm-nobpf/). Again we
discover that when compiling with the clang option[1] "-target bpf", the
OS distro's header files get confused, as __x86_64__ isn't defined by
clang, which in this case (on Fedora) causes an include of
<gnu/stubs-32.h>.
The error looks like this:
$ make
CLANG lsm-nobpf-kern.o
In file included from lsm-nobpf-kern.c:6:
In file included from /usr/include/errno.h:25:
In file included from /usr/include/features.h:474:
/usr/include/gnu/stubs.h:7:11: fatal error: 'gnu/stubs-32.h' file not found
# include <gnu/stubs-32.h>
^~~~~~~~~~~~~~~~
This patch adds a compile test to configure script to help people
realize why compiling is failing on their systems.
[1] https://www.kernel.org/doc/html/latest/bpf/bpf_devel_QA.html#q-clang-flag-for-target-bpf
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Move some members in network_tuple and rtt_event around to avoid holes.
Also remove some unnecessary parentheses before the & operator, and add
local definitions of AF_INET and AF_INET6.
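As a generic illustration of such holes (these structs and member names
are made up, not the actual pping.h definitions):

```c
#include <stddef.h>
#include <stdint.h>

/* On common 64-bit ABIs the compiler inserts 6 bytes of padding after
 * sport (to align rtt) and 6 bytes of tail padding after dport. */
struct event_with_holes {
	uint16_t sport;
	uint64_t rtt;
	uint16_t dport;
};

/* Largest member first, small members grouped together, and the
 * remaining padding made explicit with a reserved member. */
struct event_reordered {
	uint64_t rtt;
	uint16_t sport;
	uint16_t dport;
	uint32_t reserved;
};
```

On x86_64 the first layout takes 24 bytes and the second 16. A tool such
as pahole can be used to inspect the layout of the real structs.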
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Several changes to add IPv6 support:
- Change structs in pping.h
  - Replace ipv4_flow with network_tuple
  - Rename ts_key to packet_id
  - Rename ts_timestamp to packet_timestamp
- Add map_ipv4_to_ipv6 in pping_helpers.h
  - Also remove obsolete fill_ipv4_flow
- Rewrite pping_kern*
  - Parse either IPv4 or IPv6 header (depending on proto)
  - Use map_ipv4_to_ipv6 to store IPv4 address in network_tuple
- Support printout of IPv6 addresses in pping.c
  - Add function format_ip_address as wrapper over inet_ntop
  - Change handle_rtt_event to first format IP-address strings in
    local buffers, then perform single printout
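A sketch of what a format_ip_address wrapper over inet_ntop could look
like. The signature and error convention are assumptions, with IPv4
addresses assumed to be stored in IPv4-mapped form (last four bytes of
the in6_addr):

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Formats addr into buf according to address family af.
 * Returns 0 on success or a negative errno value. */
static int format_ip_address(int af, const struct in6_addr *addr,
			     char *buf, size_t size)
{
	const void *src;

	if (af == AF_INET)
		src = &addr->s6_addr[12]; /* embedded IPv4 address */
	else if (af == AF_INET6)
		src = addr;
	else
		return -EINVAL;

	return inet_ntop(af, src, buf, (socklen_t)size) ? 0 : -errno;
}
```

Formatting source and destination into separate local buffers
(INET6_ADDRSTRLEN bytes each) then allows a single printf call.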
While some steps have been taken to be more general towards different
types of packet identifiers (not just the currently supported TCP
timestamps), significant refactoring of pping_kern* will still be
required. In addition, pping_kern_xdp and pping_kern_tc have large
sections of very similar code that can be refactored into functions.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Create the /sys/fs/bpf/tc folder if it does not exist. Also check if
pping is run as root, otherwise inform user that it must run as root.
Libbpf will attempt to create the /sys/fs/bpf/tc/globals directory
when pinning the map; however, it will not do so recursively (so it
will fail if /sys/fs/bpf/tc does not exist). So as a temporary solution,
attempt to create /sys/fs/bpf/tc (however, if /sys/fs/bpf is not
mounted this will still fail).
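Roughly, the two checks could look like this. The helper names and the
0700 mode are assumptions for this sketch:

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Creates a single directory level. An already existing path counts as
 * success. Returns 0 or a negative errno value. Not recursive: fails
 * with -ENOENT if the parent is missing. */
static int ensure_dir(const char *path)
{
	if (mkdir(path, 0700) == 0 || errno == EEXIST)
		return 0;
	return -errno;
}

/* True if the effective user is root. */
static bool running_as_root(void)
{
	return geteuid() == 0;
}
```

A caller would check running_as_root() first and then call
ensure_dir("/sys/fs/bpf/tc") before pinning the map.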
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Refactor tc_bpf_load and tc_bpf_clear to use a common run_program
function which does the fork+execv.
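A minimal sketch of such a helper. The signature and return convention
are assumptions, not necessarily those of the function in pping.c:

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the program at path with the given argv and waits for it.
 * Returns the exit status of the child (127 if execv failed), or a
 * negative value if the fork/wait failed or the child was killed. */
static int run_program(const char *path, char *const argv[])
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -errno;

	if (pid == 0) {
		execv(path, argv);
		_exit(127); /* only reached if execv failed */
	}

	if (waitpid(pid, &status, 0) < 0)
		return -errno;

	if (WIFEXITED(status))
		return WEXITSTATUS(status);
	return -1; /* terminated by a signal */
}
```

Unlike system(), no shell is involved, so the arguments are passed on
to the program exactly as given.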
Enclose compound statement defines in parentheses.
Remove the CLOCK_MONOTONIC argument from callers of the parameterless
function get_time_ns().
Also fix some weird spacing in pping_helpers.h, and fix some
formatting issues, using clang-format with the kernel source tree
.clang-format on the whole tree.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Perform various fixes and tweaks:
- Rename several defines to make them more informative
- Remove unrolling of loop in BPF programs
- Reuse defines for program sections between userspace and kernel
space programs
- Perform fork+exec to run bpf_egress_loader script instead of
system()
- Add comment to copied scripts indicating I've modified them
- Add pping.h and pping_helpers.h as dependencies in Makefile
Also, add a brief description of what PPing is and how it works to
README
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Split the print statements for RTTs into two parts to avoid inet_ntoa
overwriting one of the IP addresses (causing both source and
destination address to appear the same). Also flip the order of
source and destination to be the same as Pollere's pping.
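The underlying problem is that inet_ntoa() returns a pointer to a
static buffer that the next call overwrites. A sketch of one way around
it (the helper name and output layout are hypothetical) is to copy the
first result before converting the second address:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>

/* Writes "first second" into out. Returns 0 on success, -1 if out is
 * too small. */
static int format_addr_pair(struct in_addr first, struct in_addr second,
			    char *out, size_t size)
{
	char first_str[INET_ADDRSTRLEN];
	int n;

	/* Copy before the second inet_ntoa() call reuses the buffer. */
	snprintf(first_str, sizeof(first_str), "%s", inet_ntoa(first));

	n = snprintf(out, size, "%s %s", first_str, inet_ntoa(second));
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Passing both inet_ntoa() results directly to one printf call would
print the same address twice.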
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Copy setup from traffic-pacing-edt to use BTF-defined map if configure
detects that iproute2 has libbpf support, otherwise fall back on
bpf_elf_map. Also fix a minor bug with setting default value for SEC
in bpf_egress_loader.sh.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>