Commit Graph

640 Commits

Author SHA1 Message Date
f11ff90d61 ktrace-CO-RE: Add trivial BPF-prog
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2021-05-31 18:58:01 +02:00
7e0f0b0338 Merge pull request #19 from vincentbernat/fix/debian-glibc-install
configure: tell which package to install on Debian for header files
2021-05-31 17:24:43 +02:00
887d8bcef7 Merge pull request #18 from vincentbernat/fix/no-emacs
configure: remove requirements on M4 and Emacs
2021-05-31 17:24:12 +02:00
eb8e77fbe9 configure: tell which package to install on Debian for header files
Signed-off-by: Vincent Bernat <vincent@bernat.ch>
2021-05-31 12:23:29 +02:00
321293e3f5 configure: remove requirements on M4 and Emacs
They are not used.

Signed-off-by: Vincent Bernat <vincent@bernat.ch>
2021-05-31 12:22:14 +02:00
d2c3b9d8bc ktrace-CO-RE: Add README
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2021-05-26 22:21:45 +02:00
7ff771f665 Merge pull request #13 from simosund/pping_Add_Sampling
Add sampling to pping
2021-04-23 14:51:37 +02:00
20c6dbec4c pping: Remove pinning of maps
When both BPF programs are kept in the same file, no longer need to
pin the maps in order to share them between the programs.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-04-23 14:16:52 +02:00
9cc6b1eaab pping: Update documentation
Update documentation to reflect the current state of pping (after
merging pping_kern_tc and pping_kern_xdp into a single file).

Also add another point to the TODO list that has been discussed at a
previous meeting.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-04-22 18:06:09 +02:00
b7eae0846a pping: Reduce number of IPV6 extensions parsed
Reduce IPV6_EXT_MAX_CHAIN to 3 to avoid hitting the verifier limit of
processing 1 million instructions. This results in fewer loops in
parsing_helpers.h/skip_ip6hdrnext which simplifies the verifier
analysis. IPv6 extension headers do not appear to be that common, so
this is unlikely to cause a considerable limitation.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-04-22 17:51:49 +02:00
93b6c0eafa pping: Major refactor and add -f and -c options
Merge the pping_kern_tc.c, pping_kern_xdp.c and pping_helpers.h into
the single file pping_kern.c. Do not change any of the BPF code,
except renaming the map ts_start to packet_ts.

To handle both BPF programs kept in single ELF-file, change loading
mechanism to extract and attach both tc and XDP programs from it. Also
refactor main-method into several smaller functions to reduce its
size.

Finally, added the --force (-f) and --cleanup-interval (-c) options to
the argument parsing, and improved the parsing of the
--rate-limit (-r) option.

NOTE: The verifier rejects the program in its current state as too
large (over 1 million instructions). Setting the TCP_MAX_OPTIONS in
pping_kern.c to 5 (or less) solves this. Unsure at the moment what
causes the verifier to think the program is so large, as the code in
pping_kern.c is identical to the one from the three files it was
merged from.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-04-15 14:13:54 +02:00
f26f03a8ce pping: Verify opt_size is valid when parsing TCP options
Add a check that opt_size is at least 2 in
pping_helpers.h/parse_tcp_ts, otherwise terminate the loop
unsuccessfully. Only check the lower bound of opt_size, the upper
bound will be checked in the first step of the next loop iteration
anyway.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-30 19:34:48 +02:00
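The lower-bound check described in the commit above can be illustrated with a minimal userspace sketch. It is not the pping code: `find_tcp_timestamp`, `MAX_TCP_OPTIONS` and the offset-based bounds checks are assumptions made for illustration, and the timestamp values are copied out in network byte order without conversion.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_TCP_OPTIONS 10 /* assumed loop bound, not taken from pping */

/*
 * Hypothetical helper: scan a TCP options area for the timestamp option
 * (kind 8, length 10). Returns 0 and fills tsval/tsecr (still in network
 * byte order) on success, -1 otherwise.
 */
int find_tcp_timestamp(const uint8_t *opts, size_t len,
		       uint32_t *tsval, uint32_t *tsecr)
{
	size_t pos = 0;
	int i;

	for (i = 0; i < MAX_TCP_OPTIONS; i++) {
		uint8_t opt, opt_size;

		/* Upper bound: the option kind byte must be inside the buffer */
		if (pos + 1 > len)
			return -1;
		opt = opts[pos];
		if (opt == 0) /* End of option list */
			return -1;
		if (opt == 1) { /* NOP is a single byte without a length */
			pos++;
			continue;
		}

		/* The length byte must be inside the buffer as well */
		if (pos + 2 > len)
			return -1;
		opt_size = opts[pos + 1];

		/* Lower bound: a size below 2 would stall or rewind the scan */
		if (opt_size < 2)
			return -1;

		if (opt == 8 && opt_size == 10) {
			/* Check against the constant 10 rather than opt_size */
			if (pos + 10 > len)
				return -1;
			memcpy(tsval, opts + pos + 2, 4);
			memcpy(tsecr, opts + pos + 6, 4);
			return 0;
		}

		/* An oversized opt_size is caught at the top of the next pass */
		pos += opt_size;
	}
	return -1;
}
```

An option with a declared size of 0 or 1 makes the function return -1 immediately instead of re-reading the same position until the loop bound is reached.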
48c25735ac pping: Declare opt_size volatile
Declare opt_size in pping_helpers.h/parse_tcp_ts volatile to ensure
compiler always reads it from stack as u8, which avoids confusing the
verifier into thinking it might have a negative value.

Old solution of having &=0x3f before adding opt_size to pos could
potentially cause weird behavior if a packet with an invalid TCP
option size arrived (for example, if opt_size was 64 it would be
interpreted as 0, and the loop would simply check the same position
again on each iteration). Simply changing the check to 0xff was not
possible because the compiler would optimize that away (as it knows
that to have no effect on a u8).

Also change check that TCP timestamp is not outside of boundaries from
pos+opt_size to pos+10. Before declaring opt_size as volatile, the
compiler automatically did this transformation, but now it has to be
done explicitly. If this conversion is not done, the verifier will
reject the program because, due to its goldfish memory, it isn't sure
that opt_size has to be 10 at this point.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-29 20:13:33 +02:00
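The wrap-around that the commit above attributes to the old `&= 0x3f` mask can be checked with a few lines of C. This is a minimal userspace sketch; `advance_masked` is a hypothetical name and not a function from pping.

```c
#include <stdint.h>

/*
 * Hypothetical helper: advance pos by opt_size after masking the size to
 * its lower 6 bits, as the old "&= 0x3f" approach did.
 */
uint32_t advance_masked(uint32_t pos, uint8_t opt_size)
{
	return pos + (opt_size & 0x3f);
}
```

Since 64 is 0x40, masking it with 0x3f yields 0, so an option that claims a size of 64 leaves the position unchanged and the same byte is parsed again on the next iteration.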
9168b47cca pping: Refactor init_rodata
Refactor init_rodata to search for the first map with ".rodata" in its
name. Should be more robust than previous solution which first tried
to construct the name for the rodata map, and then find the map by
name.

Also remove some commented-out code that was not used.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-29 14:55:41 +02:00
9ec2381559 pping: Minor documentation fixes
Mainly fix some incorrect words and a couple of clumsy sentences in
the README.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-26 17:54:42 +01:00
0597b5536f pping: Update documentation
Update the README, the pping diagram (eBPF_pping_design.png) and TODO
to be more up to date with the current implementation.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-26 16:57:48 +01:00
6cd22850af bpf-link-hang: Add bpftrace invocation to show hang in synchronize_rcu_tasks
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
2021-03-23 13:07:36 +01:00
25acd99b58 Add example to show hang on bpf_link close
Minimal example to show that the close() operation on a bpf_link can hang
indefinitely if the kernel is loaded (for example by traffic on an
interface with an XDP program loaded).

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
2021-03-23 00:09:44 +01:00
cc8a16650b pping: Add userspace configuration
Implement basic mechanic for parsing arguments from userspace and
passing them to a global config variable in the BPF programs.

This also changes the basic use of the program from:
    $./pping interface
to:
    $./pping -i interface

Also, revert to using the memset solution for the map_ipv4_to_ipv6
function to avoid the ipv4_prefix constant being stored in the .rodata
section. This makes it easier to set the value for the global config
variable from userspace, as the only thing left in the .rodata section
is the config struct.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-22 12:23:27 +01:00
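One way to build an IPv4-mapped IPv6 address (::ffff:a.b.c.d) with memset, so that no prefix constant needs to be stored, is sketched below. This is an assumption about the general technique rather than the actual pping function: `map_ipv4_to_ipv6_sketch` and its signature are hypothetical, and a plain 16-byte array stands in for `struct in6_addr`.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical sketch: write the IPv4-mapped IPv6 form of ipv4_be (an IPv4
 * address in network byte order) into the 16-byte buffer ipv6.
 * The layout is 10 zero bytes, 2 bytes of 0xff, then the 4 address bytes.
 */
void map_ipv4_to_ipv6_sketch(uint32_t ipv4_be, uint8_t ipv6[16])
{
	memset(&ipv6[0], 0x00, 10);
	memset(&ipv6[10], 0xff, 2);
	memcpy(&ipv6[12], &ipv4_be, sizeof(ipv4_be));
}
```

Because the prefix bytes are produced by the two memset calls, the function does not reference a constant array for the prefix.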
ea4d171cb0 Merge pull request #15 from freysteinn/master
Added an eBPF Qdisc classifier example
2021-03-17 11:04:44 +01:00
9337d2ff8a Adding a basic TC eBPF Qdisc classifier example
This example demonstrates how to write a simple eBPF Qdisc classifier
that classifies flows depending on their destination TCP port. The
example script, runner.sh shows how you can use the eBPF Qdisc
classifier and implement the same functionality using u32. The script
creates two network namespaces called Left and Right, representing two
different hosts. The script then illustrates the classifiers in action
using iperf3 by starting clients on the Left namespace that connect to
iperf3 servers on the Right namespace. The Qdisc classifiers give TCP
ports 8080 and 8081 a high rate limit, while TCP port 8082 represents
all other traffic capped at 20 Mbps.

Signed-off-by: Frey Alfredsson <freysteinn@freysteinn.com>
2021-03-17 02:26:58 +01:00
e04284a3ee pping: Add check to satisfy verifier
Add a check to assure the verifier that opt_size is positive in case
it's been read in from stack. Also enable (uncomment) the flow-state
cleanup from the XDP program as the added check avoids verifier
rejection.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-15 18:23:23 +01:00
2b64355b2e pping: Attempt to be nice to verifier...
Verifier might have rejected XDP program due to opt_size being loaded
from memory, see
https://blog.path.net/ebpf-xdp-and-network-security. Add check of
opt_size to attempt to convince verifier that it's not a negative
value or anything else crazy. Leads to verifier instead thinking the
program is too large (over 1m instructions).

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-15 11:52:17 +01:00
a10dc23e4a parsing_helpers: Fix sizeof checks
As reported in the xdp-tutorial (where this code is from), there were a
couple of sizeof checks in parsing_helpers.h that were using the pointer
size instead of the size of the struct being pointed to.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
2021-03-09 23:09:13 +01:00
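The class of bug fixed in the commit above can be reproduced in a small userspace sketch. The struct and function names here are hypothetical and only serve to contrast `sizeof(hdr)` (size of the pointer) with `sizeof(*hdr)` (size of the struct being pointed to).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 40-byte header, used only for illustration */
struct demo_hdr {
	uint8_t bytes[40];
};

/* Buggy: sizeof(hdr) is the pointer size, so too few bytes are checked */
int has_room_buggy(const uint8_t *pos, size_t remaining)
{
	const struct demo_hdr *hdr = (const struct demo_hdr *)pos;

	return sizeof(hdr) <= remaining;
}

/* Fixed: sizeof(*hdr) is the size of the struct being pointed to */
int has_room_fixed(const uint8_t *pos, size_t remaining)
{
	const struct demo_hdr *hdr = (const struct demo_hdr *)pos;

	return sizeof(*hdr) <= remaining;
}
```

With only 16 bytes remaining, the buggy variant reports that the 40-byte header fits, while the fixed variant correctly rejects it.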
ad69dc4fb6 pping: Add instant cleanup of flow state
Add parsing of TCP FIN/RST to determine if connection is being closed,
and if so delete state directly from BPF programs.

Only enabled on tc-program, as verifier is unhappy about it on XDP
side for some reason.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-09 19:58:42 +01:00
0b2107f5c4 pping: Add periodic cleanup of flow state
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-09 17:50:27 +01:00
5204cd1e3c pping: Minor cleanup/refactor
Remove some commented-out code. Also use bpf_object__unpin_maps
instead of manually unpinning the ts_start map. Additionally, change
map_ipv4_to_ipv6 to use clearer implementation (that now also works
for tc due to always using libbpf to load program).

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-09 12:21:39 +01:00
3bd3333c69 pping: Update and linebreak TODO
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-02 17:49:36 +01:00
1446e6edec pping: Load tc-bpf program with libbpf
Load and pin the tc-bpf program in pping.c using libbpf, and only
attach the pinned program using iproute. That way, can use features
that are not supported by the old iproute loader, even if iproute does
not have libbpf support.

To support this change, extend bpf_egress_loader with option to load
pinned program. Additionally, remove configure script and parts of
Makefile that are no longer needed. Furthermore, remove multiple
definitions of ts_start map, and place singular definition in
pping_helpers.h which is included by both BPF programs.

Also, some minor fixes based on Toke's review.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-02 17:40:51 +01:00
1282bce7d8 pping: Update sampling design document
Update SAMPLING_DESIGN.md, partly based on discussions during the
meeting with Red Hat on 2021-03-01.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-01 20:09:21 +01:00
cecd6b54f2 pping: Initial rate limit implementation
Add a per-flow rate limit, limiting how often new timestamp entries
can be created. As part of this, add per-flow state keeping track
of when last timestamp was created and last seen identifier for each
flow.

Additionally, remove timestamp entry as soon as RTT is
calculated, as last seen identifier is used to find first unique value
instead. Furthermore, remove packet_timestamp struct and only use
__u64 as timestamp, as used member is no longer needed.

This initial commit lacks cleanup of flow-state, user-configuration of
rate limit, mechanism to handle bursts etc.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-01 18:16:48 +01:00
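A per-flow rate limit of the kind described in the commit above could look roughly like the following userspace sketch. The struct layout, the names `flow_state_sketch` and `should_timestamp`, and the order of the two checks are assumptions for illustration and are not taken from the pping source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-flow state: when the last timestamp entry was created
 * and the last identifier seen for the flow. */
struct flow_state_sketch {
	uint64_t last_timestamp; /* nanoseconds */
	uint32_t last_id;
};

/*
 * Decide whether a new timestamp entry may be created for a packet with
 * identifier id observed at time now (nanoseconds).
 */
bool should_timestamp(struct flow_state_sketch *fs, uint32_t id,
		      uint64_t now, uint64_t rate_limit_ns)
{
	/* Only the first packet carrying a new identifier is a candidate */
	if (id == fs->last_id)
		return false;
	fs->last_id = id;

	/* Rate limit: too soon since the last entry for this flow */
	if (now - fs->last_timestamp < rate_limit_ns)
		return false;

	fs->last_timestamp = now;
	return true;
}
```

In this sketch the last seen identifier is updated even when the rate limit rejects the packet, so a later packet with the same identifier is not mistaken for the first occurrence of that value.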
6e5136092d pping: Update sampling design document
Add sections on per-flow state, graceful degradation and some
implementation considerations.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-26 12:38:53 +01:00
ae1d89c7c9 pping: Add document about sampling design
Add a document outlining my thoughts for how to implement
sampling. Intended both as a basis for discussion, as well as being a
form of documentation.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-25 19:00:41 +01:00
a9c276cb54 pping: Update TODO-list
Rewrite/regroup/reorder some points for the General pping
section. Also add some new points, add some additional comments to
existing points, and check in the "Skip pure ACKs" as completed.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-25 16:33:26 +01:00
97fdefa90d pping: Make the link to Kathie's original pping utility clearer
The link to the original pping utility was easy to miss, and we didn't
credit Kathie with its implementation. That was clearly an oversight, so
let's fix that.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
2021-02-25 11:18:44 +01:00
7f1868ac5c Merge pull request #10 from simosund/pping_IPv6
pping: Add IPv6 support
2021-02-16 14:02:39 +01:00
a2c6b0618b pping: Use designated initialization for parsing_context
Change how initialization of pctx is done in tc and xdp
programs. Also, rename len to pkt_len in parsing_context.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-16 13:16:12 +01:00
7fe1d282ae pping: Minor refactor of parsing_context
Refactor parsing_context to have a len member instead of
data_end_end. Also, refactor parse_tcp_identifier to take pointers
directly to the ports instead of the flow_address structs.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-16 12:34:19 +01:00
641080a8a6 Merge pull request #12 from xdp-project/refactor01-include-dir.public
Refactor include directories and doc repo via README files
2021-02-15 17:25:09 +01:00
a25992973d Adjustments to README based on Toke's review
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2021-02-15 17:18:38 +01:00
219e962832 pping: Avoid timestamping pure TCP ACKs
Add a parsing_context struct to keep track of data, data_end and
currently parsed position, as well as handling the difference between
data_end for XDP and TC through data_end_end pointer.

Use parsing_context struct to detect pure TCP ACKs, and avoid creating
identifier for them on egress (to avoid creating timestamp
entries). This solves issue of calculating RTTs in improper contexts.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-12 18:31:30 +01:00
b5fd346589 Move jhash out of headers/linux into include/
It is a bit strange we have this header file in this repo, but
it will likely be very useful later.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2021-02-12 18:06:51 +01:00
d5031bfc92 headers/linux: add netlink.h from kernel source v5.11-rc7
This is included by linux/if_link.h.  Thus, we need it here if the
distro doesn't provide this include file.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2021-02-12 18:03:48 +01:00
10abd546ca headers/linux: update if_link.h and if_xdp.h from kernel v5.11-rc7
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2021-02-12 18:01:02 +01:00
27765e8449 headers/linux: update bpf.h from kernel source v5.11-rc7
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2021-02-12 17:55:17 +01:00
3a92b67a53 headers/linux: Add missing bpf_common.h
The include file linux/bpf_common.h was missing.  This is used/included
via linux/bpf.h.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2021-02-12 17:40:09 +01:00
7aee417036 Add README for headers/linux/ directory
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2021-02-12 17:22:12 +01:00
502663f354 pping: Update TODO.md
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-12 12:09:37 +01:00
397b44cff7 pping: Refactor parse_packet_identifier
Remove the saddr and daddr parameters from parse_packet_identifier, and
use the is_egress parameter to perform the saddr/daddr swap inside the
function. Also, minor style fixes.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-12 11:40:43 +01:00
0264295d67 Add toplevel README describing project
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
2021-02-09 19:32:41 +01:00