Commit Graph

57 Commits

Author SHA1 Message Date
Simon Sundberg
32fc35f527 pping: Update README with info on output formats
Update README, mainly adding a new section with brief descriptions and
some examples of the output formats.

Also, update the files and maps list to reflect recent changes (BPF
programs can now push flow-events, and the map rtt_events has been
renamed to just events).

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-06-23 15:02:26 +02:00
Simon Sundberg
d85329f728 pping: Refactor output code
Simplify the three output functions by breaking them up into smaller
helper functions. Also introduce the pping_event union, which can hold
either an rtt_event or flow_event.
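
As a rough sketch of how such a union could be laid out and dispatched on: the tag values, field names and the event_name helper below are assumptions for illustration, not the actual definitions in pping.h.

```c
#include <stdint.h>

/* Illustrative event tags and fields; the real pping.h layout may differ. */
#define EVENT_TYPE_FLOW 1
#define EVENT_TYPE_RTT 2

struct rtt_event {
	uint64_t event_type; /* shared first member identifies the event */
	uint64_t timestamp;
	uint64_t rtt;
	uint64_t min_rtt;
};

struct flow_event {
	uint64_t event_type;
	uint64_t timestamp;
	uint8_t opening; /* 1 = flow opened, 0 = flow closed */
};

union pping_event {
	uint64_t event_type;
	struct rtt_event rtt_event;
	struct flow_event flow_event;
};

/* Hypothetical helper: dispatch on the shared tag, as an output function might. */
static const char *event_name(const union pping_event *e)
{
	switch (e->event_type) {
	case EVENT_TYPE_RTT:
		return "rtt";
	case EVENT_TYPE_FLOW:
		return e->flow_event.opening ? "flow-open" : "flow-close";
	default:
		return "unknown";
	}
}
```

Keeping the tag as the first member of every variant lets a single handler read it before deciding which member to use.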

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-06-23 15:02:26 +02:00
Simon Sundberg
1975367a3a pping: Add end-of-flow message from userspace map cleanup
Make the flow_timeout function call the current output function to
simulate a flow-closing event. Also some other minor cleanup/fixes.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-06-23 15:02:26 +02:00
Simon Sundberg
543f75c9d8 pping: Add support for "flow events"
Add "flow events" (flow opening or closing so far) which will trigger
a printout of a message.

Note: The ppviz format will only print out the traditional rtt events
as the format does not include opening/closing messages.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-06-23 15:02:26 +02:00
Simon Sundberg
399c9dc935 pping: Refactor json code and format
Use a JSON-writer library from iproute instead of a complicated printf
statement. Also output timestamp, rtt and min_rtt as integers in
nanoseconds, rather than floats in seconds.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-06-23 15:02:26 +02:00
Simon Sundberg
148d4a26f3 pping: Change order of format_ip_address parameters
Change order of parameters for format_ip_address to follow the
convention of the printf functions where buffer is placed first,
instead of the conventions of the inet_ntop functions where buffer is
placed last.
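
A minimal sketch of the buffer-first convention. The exact signature, the return values and the assumption that IPv4 addresses are stored IPv4-mapped are illustrative, not taken from pping.c.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical signature: the output buffer and its size come first, as
 * with snprintf(buf, size, ...); inet_ntop() takes them last instead.
 * Returns 0 on success, or a negative errno value on failure.
 */
static int format_ip_address(char *buf, size_t size, int af,
			     const struct in6_addr *addr)
{
	if (af == AF_INET) {
		/* Assumes the IPv4 address is stored IPv4-mapped
		 * (::ffff:a.b.c.d), so it occupies the last 4 bytes. */
		return inet_ntop(af, &addr->s6_addr[12], buf, size) ? 0 : -errno;
	}
	if (af == AF_INET6)
		return inet_ntop(af, addr, buf, size) ? 0 : -errno;
	return -EINVAL;
}
```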

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-06-23 15:02:25 +02:00
Simon Sundberg
f96cfb7d7c pping: Track nr sent/received packets and bytes
Add per-flow tracking of number of packets and bytes
sent/received. Add these to the JSON output format.

Also update README regarding concurrency issue when updating these
statistics.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-06-23 15:02:25 +02:00
Simon Sundberg
fb454cd716 pping: Update README with info on concurrency issues
Also, remove comments about concurrency issues from code in
pping_kern.c as it is now documented in README.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-06-23 15:02:25 +02:00
Simon Sundberg
d92109b3c8 pping: Replace -j and -m options with -F/--format
The format option can take the values "standard" (default), "json" and
"ppviz" (new name for "machine-friendly").

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-06-23 15:02:25 +02:00
Simon Sundberg
3011bbb0b8 pping: Add "machine friendly" format
Add Kathie's "machine friendly" as an optional output format when
passing '-m' or '--machine-friendly' to pping. This format can be used
together with Kathie's ppviz tool to visualize the output.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-06-23 15:02:25 +02:00
Simon Sundberg
0ed39800d0 pping: Add JSON output format
Add the option to output in JSON format by passing '-j' or '--json' to
pping. Include the protocol in the JSON format, and fix so kernel-side
actually stores the protocol in the flow_address struct.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-06-23 15:02:25 +02:00
Simon Sundberg
b4a810b09b pping: Add timestamp and min-RTT to output
To add timestamp to output, push the timestamp when packet was
processed from kernel as part of the rtt-event. Also keep track of
minimum encountered RTT for each flow in kernel, and also push that as
part of the RTT-event.
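
A simplified userspace sketch of the min-RTT bookkeeping described above. The struct fields and the fill_rtt_event helper are assumptions for illustration; the real code runs in the BPF programs and keeps this state in a map.

```c
#include <stdint.h>

/* Hypothetical per-flow state; field names are illustrative. */
struct flow_state {
	uint64_t min_rtt; /* nanoseconds, 0 means no sample yet */
};

/* Hypothetical event layout carrying the two new fields. */
struct rtt_event {
	uint64_t timestamp; /* when the packet was processed, in ns */
	uint64_t rtt;
	uint64_t min_rtt;
};

/* Compute the RTT, update the per-flow minimum and fill in the event. */
static void fill_rtt_event(struct flow_state *flow, uint64_t now,
			   uint64_t sent_ts, struct rtt_event *ev)
{
	uint64_t rtt = now - sent_ts;

	if (flow->min_rtt == 0 || rtt < flow->min_rtt)
		flow->min_rtt = rtt;

	ev->timestamp = now;
	ev->rtt = rtt;
	ev->min_rtt = flow->min_rtt;
}
```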

Additionally, avoid pushing RTT messages at all if no flow-state
information can be found (e.g. due to being deleted from the egress side),
as no valid min-RTT can then be given. Furthermore, no longer delete
flow-information once seeing the FIN-flag on egress in order to keep
useful flow-state around for RTT-messages longer. Due to the
FIN-handshake process, it is sufficient if the ingress program deletes
the flow-state upon seeing FIN. However, still delete flow-state from
either ingress or egress upon seeing RST flag, as RST does not have a
handshake process allowing for delayed deletion.
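
The deletion rule above can be condensed into a small predicate. This is only a sketch of the rule as stated in this message; the function name and parameters are hypothetical.

```c
#include <stdbool.h>

/*
 * Decide whether a program should delete the flow state for a packet.
 * RST: delete from either side. FIN: only the ingress program deletes,
 * so the state stays around for RTT messages during the FIN handshake.
 */
static bool should_delete_flow(bool is_egress, bool fin, bool rst)
{
	if (rst)
		return true;
	return fin && !is_egress;
}
```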

While minimum RTT could also be tracked from the userspace process,
userspace is not aware of when the flow is closed so would have to add
additional logic to keep track of minimum RTT for each flow and
periodically clean them up. Furthermore, keeping RTT statistics in the
flow-state map is useful for implementing future features, such as an
RTT-based sampling interval. It would also be useful in case pping is
changed to no longer have a long-running userspace process printing
out all the calculated RTTs, but instead simply occasionally looks up
the RTT from the flow-state map.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-06-23 15:02:25 +02:00
Simon Sundberg
20c6dbec4c pping: Remove pinning of maps
When both BPF programs are kept in the same file, no longer need to
pin the maps in order to share them between the programs.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-04-23 14:16:52 +02:00
Simon Sundberg
9cc6b1eaab pping: Update documentation
Update documentation to reflect the current state of pping (after
merging pping_kern_tc and pping_kern_xdp into a single file).

Also add another point to the TODO list that has been discussed at a
previous meeting.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-04-22 18:06:09 +02:00
Simon Sundberg
b7eae0846a pping: Reduce number of IPV6 extensions parsed
Reduce IPV6_EXT_MAX_CHAIN to 3 to avoid hitting the verifier limit of
processing 1 million instructions. This results in fewer loops in
parsing_helpers.h/skip_ip6hdrnext which simplifies the verifier
analysis. IPv6 extension headers do not appear to be that common, so
this is unlikely to cause a considerable limitation.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-04-22 17:51:49 +02:00
Simon Sundberg
93b6c0eafa pping: Major refactor and add -f and -c options
Merge the pping_kern_tc.c, pping_kern_xdp.c and pping_helpers.h into
the single file pping_kern.c. Do not change any of the BPF code,
except renaming the map ts_start to packet_ts.

To handle both BPF programs kept in single ELF-file, change loading
mechanism to extract and attach both tc and XDP programs from it. Also
refactor main-method into several smaller functions to reduce its
size.

Finally, added the --force (-f) and --cleanup-interval (-c) options to
the argument parsing, and improved the parsing of the
--rate-limit (-r) option.

NOTE: The verifier rejects the program in its current state as too
large (over 1 million instructions). Setting the TCP_MAX_OPTIONS in
pping_kern.c to 5 (or less) solves this. Unsure at the moment what
causes the verifier to think the program is so large, as the code in
pping_kern.c is identical to the one from the three files it was
merged from.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-04-15 14:13:54 +02:00
Simon Sundberg
f26f03a8ce pping: Verify opt_size is valid when parsing TCP options
Add a check that opt_size is at least 2 in
pping_helpers.h/parse_tcp_ts, otherwise terminate the loop
unsuccessfully. Only check the lower bound of opt_size, the upper
bound will be checked in the first step of the next loop iteration
anyways.
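
A userspace sketch of such an option-parsing loop, using offsets into a byte array rather than packet pointers. The loop bound, the read_be32 helper and the signature are assumptions for illustration; only the opt_size lower-bound check mirrors what this commit describes.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_TCP_OPTIONS 10 /* assumed loop bound for this sketch */

/* Hypothetical helper: read a big-endian 32-bit value. */
static uint32_t read_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Walk the TCP options in opts[0..len) and extract the timestamp option
 * (kind 8, length 10). Returns 0 on success, -1 otherwise.
 */
static int parse_tcp_ts(const uint8_t *opts, size_t len,
			uint32_t *tsval, uint32_t *tsecr)
{
	size_t pos = 0;

	for (int i = 0; i < MAX_TCP_OPTIONS; i++) {
		if (pos + 1 > len) /* also catches a too large previous opt_size */
			return -1;
		uint8_t kind = opts[pos];
		if (kind == 0) /* end of option list */
			return -1;
		if (kind == 1) { /* NOP is a single byte */
			pos++;
			continue;
		}
		if (pos + 2 > len)
			return -1;
		uint8_t opt_size = opts[pos + 1];
		if (opt_size < 2) /* would never advance past kind + length */
			return -1;
		if (kind == 8 && opt_size == 10) {
			if (pos + 10 > len)
				return -1;
			*tsval = read_be32(&opts[pos + 2]);
			*tsecr = read_be32(&opts[pos + 6]);
			return 0;
		}
		pos += opt_size;
	}
	return -1;
}
```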

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-30 19:34:48 +02:00
Simon Sundberg
48c25735ac pping: Declare opt_size volatile
Declare opt_size in pping_helpers.h/parse_tcp_ts volatile to ensure
compiler always reads it from stack as u8, which avoids confusing the
verifier into thinking it might have a negative value.

Old solution of having &=0x3f before adding opt_size to pos could
potentially cause weird behavior if a packet with an invalid TCP
option size arrived (for example, if opt_size was 64 it would be
interpreted as 0, and the loop would simply check the same position
again on each iteration). Simply changing the check to 0xff was not
possible because the compiler would optimize that away (as it knows
that to have no effect on a u8).
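
The masking problem can be shown with plain arithmetic. The helper name below is hypothetical and only illustrates the old &=0x3f step.

```c
#include <stdint.h>

/*
 * Old approach, sketched: mask opt_size to 6 bits before advancing.
 * An opt_size of 64 masks to 0, so the position does not move.
 */
static uint32_t advance_masked(uint32_t pos, uint8_t opt_size)
{
	return pos + (opt_size & 0x3f);
}
```

With this helper, advance_masked(20, 10) moves to 30, while advance_masked(20, 64) stays at 20, which is the repeated-check behavior described above.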

Also change check that TCP timestamp is not outside of boundaries from
pos+opt_size to pos+10. Before declaring opt_size as volatile compiler
automatically did this transformation, but now have to explicitly do
this. If this conversion is not done, the verifier will reject the
program because, with its goldfish memory, it isn't sure that opt_size
has to be 10 at this point.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-29 20:13:33 +02:00
Simon Sundberg
9168b47cca pping: Refactor init_rodata
Refactor init_rodata to search for the first map with ".rodata" in its
name. Should be more robust than previous solution which first tried
to construct the name for the rodata map, and then find the map by
name.

Also remove some commented-out code that was not used.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-29 14:55:41 +02:00
Simon Sundberg
9ec2381559 pping: Minor documentation fixes
Mainly fix some incorrect words and a couple of clumsy sentences in
the README.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-26 17:54:42 +01:00
Simon Sundberg
0597b5536f pping: Update documentation
Update the README, the pping diagram (eBPF_pping_design.png) and TODO
to be more up to date with the current implementation.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-26 16:57:48 +01:00
Simon Sundberg
cc8a16650b pping: Add userspace configuration
Implement basic mechanism for parsing arguments from userspace and
passing them to a global config variable in the BPF programs.

This also changes the basic use of the program from:
    $./pping interface
to:
    $./pping -i interface

Also, revert to using the memset solution for the map_ipv4_to_ipv6
function to avoid the ipv4_prefix constant being stored in the .rodata
section. This makes it easier to set the value for the global config
variable from userspace, as the only thing left in the .rodata section
is the config struct.
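
A sketch of what a memset-based mapping to an IPv4-mapped IPv6 address (::ffff:a.b.c.d) can look like. The parameter order and types are assumptions; the point is that no prefix constant is needed, so nothing extra ends up in .rodata.

```c
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/*
 * Store an IPv4 address (network byte order) as an IPv4-mapped IPv6
 * address: 10 zero bytes, 2 bytes of 0xff, then the 4 IPv4 bytes.
 */
static void map_ipv4_to_ipv6(struct in6_addr *ipv6, uint32_t ipv4)
{
	memset(&ipv6->s6_addr[0], 0x00, 10);
	memset(&ipv6->s6_addr[10], 0xff, 2);
	memcpy(&ipv6->s6_addr[12], &ipv4, sizeof(ipv4));
}
```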

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-22 12:23:27 +01:00
Simon Sundberg
e04284a3ee pping: Add check to satisfy verifier
Add a check to assure the verifier that opt_size is positive in case
it's been read in from the stack. Also enable (uncomment) the flow-state
cleanup from the XDP program as the added check avoids verifier
rejection.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-15 18:23:23 +01:00
Simon Sundberg
2b64355b2e pping: Attempt to be nice to verifier...
Verifier might have rejected XDP program due to opt_size being loaded
from memory, see
https://blog.path.net/ebpf-xdp-and-network-security. Add check of
opt_size to attempt to convince verifier that it's not a negative
value or anything else crazy. Leads to verifier instead thinking the
program is too large (over 1m instructions).

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-15 11:52:17 +01:00
Simon Sundberg
ad69dc4fb6 pping: Add instant cleanup of flow state
Add parsing of TCP FIN/RST to determine if connection is being closed,
and if so delete state directly from BPF programs.

Only enabled on the tc-program, as the verifier is unhappy about it on
the XDP side for some reason.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-09 19:58:42 +01:00
Simon Sundberg
0b2107f5c4 pping: Add periodic cleanup of flow state
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-09 17:50:27 +01:00
Simon Sundberg
5204cd1e3c pping: Minor cleanup/refactor
Remove some commented-out code. Also use bpf_object__unpin_maps
instead of manually unpinning the ts_start map. Additionally, change
map_ipv4_to_ipv6 to use clearer implementation (that now also works
for tc due to always using libbpf to load program).

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-09 12:21:39 +01:00
Simon Sundberg
3bd3333c69 pping: Update and linebreak TODO
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-02 17:49:36 +01:00
Simon Sundberg
1446e6edec pping: Load tc-bpf program with libbpf
Load and pin the tc-bpf program in pping.c using libbpf, and only
attach the pinned program using iproute. That way, can use features
that are not supported by the old iproute loader, even if iproute does
not have libbpf support.

To support this change, extend bpf_egress_loader with option to load
pinned program. Additionally, remove configure script and parts of
Makefile that are no longer needed. Furthermore, remove multiple
definitions of ts_start map, and place singular definition in
pping_helpers.h which is included by both BPF programs.

Also, some minor fixes based on Toke's review.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-02 17:40:51 +01:00
Simon Sundberg
1282bce7d8 pping: Update sampling design document
Update SAMPLING_DESIGN.md, partly based on discussions during the
meeting with Red Hat on 2021-03-01.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-01 20:09:21 +01:00
Simon Sundberg
cecd6b54f2 pping: Initial rate limit implementation
Add a per-flow rate limit, limiting how often new timestamp entries
can be created. As part of this, add per-flow state keeping track
of when last timestamp was created and last seen identifier for each
flow.
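
A simplified sketch of such a per-flow rate limit. The field names, the should_timestamp helper and the way the limit is passed in are assumptions for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-flow state for this sketch. */
struct flow_state {
	uint64_t last_timestamp; /* ns, when a timestamp entry was last created */
	uint32_t last_id;        /* last seen identifier (e.g. TCP TSval) */
};

/* Returns true if a new timestamp entry should be created for this packet. */
static bool should_timestamp(struct flow_state *flow, uint32_t id,
			     uint64_t now, uint64_t rate_limit_ns)
{
	if (id == flow->last_id)
		return false; /* only the first packet with each identifier counts */
	flow->last_id = id;

	if (now - flow->last_timestamp < rate_limit_ns)
		return false; /* too soon since the last timestamp entry */

	flow->last_timestamp = now;
	return true;
}
```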

Additionally, remove timestamp entry as soon as RTT is
calculated, as last seen identifier is used to find first unique value
instead. Furthermore, remove packet_timestamp struct and only use
__u64 as timestamp, as used member is no longer needed.

This initial commit lacks cleanup of flow-state, user-configuration of
rate limit, mechanism to handle bursts etc.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-03-01 18:16:48 +01:00
Simon Sundberg
6e5136092d pping: Update sampling design document
Add sections on per-flow state, graceful degradation and some
implementation considerations.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-26 12:38:53 +01:00
Simon Sundberg
ae1d89c7c9 pping: Add document about sampling design
Add a document outlining my thoughts for how to implement
sampling. Intended both as a basis for discussion, as well as being a
form of documentation.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-25 19:00:41 +01:00
Simon Sundberg
a9c276cb54 pping: Update TODO-list
Rewrite/regroup/reorder some points for the General pping
section. Also add some new points, add some additional comments to
existing points, and check off "Skip pure ACKs" as completed.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-25 16:33:26 +01:00
Toke Høiland-Jørgensen
97fdefa90d pping: Make the link to Kathie's original pping utility clearer
The link to the original pping utility was easy to miss, and we didn't
credit Kathie with its implementation. That was clearly an oversight, so
let's fix that.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
2021-02-25 11:18:44 +01:00
Simon Sundberg
a2c6b0618b pping: Use designated initialization for parsing_context
Change how initialization of pctx is done in tc and xdp
programs. Also, rename len to pkt_len in parsing_context.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-16 13:16:12 +01:00
Simon Sundberg
7fe1d282ae pping: Minor refactor of parsing_context
Refactor parsing_context to have a len member instead of
data_end_end. Also, refactor parse_tcp_identifier to take pointers
directly to the ports instead of the flow_address structs.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-16 12:34:19 +01:00
Simon Sundberg
219e962832 pping: Avoid timestamping pure TCP ACKs
Add a parsing_context struct to keep track of data, data_end and
currently parsed position, as well as handling the difference between
data_end for XDP and TC through data_end_end pointer.

Use parsing_context struct to detect pure TCP ACKs, and avoid creating
identifier for them on egress (to avoid creating timestamp
entries). This solves the issue of calculating RTTs in improper contexts.
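
One way to express the pure-ACK test, as a sketch using byte offsets. Treating "no payload and no SYN" as a pure ACK is an assumption here, and the helper name and parameters are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define TCP_SYN_FLAG 0x02 /* SYN bit in the TCP flags byte */

/*
 * pkt_len:     total packet length in bytes
 * tcp_hdr_off: offset of the TCP header from the start of the packet
 * doff:        TCP data offset field (header length in 32-bit words)
 * tcp_flags:   TCP flags byte
 */
static bool is_pure_ack(uint32_t pkt_len, uint32_t tcp_hdr_off,
			uint8_t doff, uint8_t tcp_flags)
{
	uint32_t payload_off = tcp_hdr_off + (uint32_t)doff * 4;
	bool no_payload = payload_off >= pkt_len;

	return no_payload && !(tcp_flags & TCP_SYN_FLAG);
}
```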

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-12 18:31:30 +01:00
Simon Sundberg
502663f354 pping: Update TODO.md
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-12 12:09:37 +01:00
Simon Sundberg
397b44cff7 pping: Refactor parse_packet_identifier
Remove the saddr and daddr parameters from parse_packet_identifier, and
use the is_egress parameter to perform the saddr/daddr swap inside the
function. Also, minor style fixes.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-12 11:40:43 +01:00
Simon Sundberg
3268ba87bb pping: Refactor TC and XDP programs
Refactor TC and XDP programs to reuse common logic for parsing
packets. Add functions for parsing packets for an identifier to
pping_helpers.h which both TC and XDP parts use. Also make it easier
to extend pping with support for new protocols, as only new parsing
functions have to be added and inserted into a single place.

Also add reserved members to end of structs in pping.h to indicate
padding.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-09 18:09:30 +01:00
Simon Sundberg
eafdf87d80 pping: Fix struct alignment issues
Move some members in network_tuple and rtt_event around to avoid holes.
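
To illustrate the kind of hole this refers to, here is a sketch with made-up structs (not the actual network_tuple or rtt_event definitions). Ordering members from largest to smallest and making the tail padding explicit removes the compiler-inserted gaps.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: small members between 64-bit members leave holes. */
struct with_holes {
	uint16_t proto;     /* typically followed by 6 bytes of padding */
	uint64_t timestamp;
	uint8_t ipv;        /* typically followed by 7 bytes of padding */
	uint64_t rtt;
};

/* Same members, largest first, with explicit reserved tail padding. */
struct reordered {
	uint64_t timestamp;
	uint64_t rtt;
	uint16_t proto;
	uint8_t ipv;
	uint8_t reserved[5];
};

_Static_assert(sizeof(struct reordered) == 24,
	       "reordered struct should have no implicit padding");
```

On common 64-bit ABIs the first struct occupies 32 bytes, against 24 for the reordered one.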

Also remove some unnecessary parentheses before & operator, and add
local definitions of AF_INET and AF_INET6.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-09 13:00:28 +01:00
Simon Sundberg
670df84bd9 pping: Add IPv6 support
Several changes to add IPv6 support:
- Change structs in pping.h
  - replace ipv4_flow with network_tuple
  - rename ts_key to packet_id
  - rename ts_timestamp to packet_timestamp
- Add map_ipv4_to_ipv6 in pping_helpers.h
  - Also remove obsolete fill_ipv4_flow
- Rewrite pping_kern*
  - parse either IPv4 or IPv6 header (depending on proto)
  - Use map_ipv4_to_ipv6 to store IPv4 address in network_tuple
- Support printout of IPv6 addresses in pping.c
  - Add function format_ip_address as wrapper over inet_ntop
  - Change handle_rtt_event to first format IP-address strings in
    local buffers, then perform single printout

While some steps have been taken to be more general towards different
types of packet identifiers (not just the currently supported TCP
timestamps), significant refactorization of pping_kern* will still be
required. Also, pping_kern_xdp and pping_kern_tc have large
sections of very similar code that can be refactored into functions.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-08 20:28:46 +01:00
Simon Sundberg
1bb5a44152 pping: Create pin-folder and check if root
Create the /sys/fs/bpf/tc folder if it does not exist. Also check if
pping is run as root, otherwise inform user that it must run as root.

Libbpf will attempt to create the /sys/fs/bpf/tc/globals directory
when pinning the map, however it will not do so recursively (so will
fail if /sys/fs/bpf/tc does not exist). So as a temporary solution,
attempt to create /sys/fs/bpf/tc (however, if /sys/fs/bpf is not
mounted this will still fail).

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-04 19:56:49 +01:00
Simon Sundberg
c777287af2 pping: Minor refactor and whitespace fixes
Refactor tc_bpf_load and tc_bpf_clear to use a common run_program
function which does the fork+execv.
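
A minimal sketch of a fork+execv helper in this spirit. The signature and the return-value convention are assumptions, not the actual run_program in pping.c.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run the program at `path` with argument vector `argv` and wait for it.
 * Returns the child's exit status (0-255), or a negative errno value if
 * the child could not be created, waited for, or was killed by a signal.
 */
static int run_program(const char *path, char *const argv[])
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -errno;

	if (pid == 0) {
		execv(path, argv);
		_exit(127); /* only reached if execv itself failed */
	}

	if (waitpid(pid, &status, 0) < 0)
		return -errno;

	if (WIFEXITED(status))
		return WEXITSTATUS(status);

	return -EINTR; /* terminated by a signal */
}
```

Unlike system(), this does not pass the command through a shell, so arguments are not subject to shell interpretation.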

Enclose compound statement defines in parentheses.

Removed argument CLOCK_MONOTONIC from callers to parameterless
function get_time_ns().

Also fix some weird spacing in pping_helpers.h, and fix some
formatting issues, using clang-format with the kernel source tree
.clang-format on the whole tree.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-04 19:52:40 +01:00
Simon Sundberg
7410d5cc2c pping: Various minor fixes
Perform various fixes and tweaks:
- Rename several defines to make them more informative
- Remove unrolling of loop in BPF programs
- Reuse defines for program sections between userspace and kernel
  space programs
- Perform fork+exec to run bpf_egress_loader script instead of
  system()
- Add comment to copied scripts indicating I've modified them
- Add pping.h and pping_helpers.h as dependencies in Makefile

Also, add a brief description of what PPing is and how it works to
README

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-04 19:48:01 +01:00
Simon Sundberg
71c6458712 pping: Fix incorrect printout of IP-address
Split the print statements for RTTs into two parts to avoid inet_ntoa
overwriting one of the IP-addresses (causing both source and
destination address to appear the same). Also flip the order of
source and destination to be the same as Pollere's pping.
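
The underlying problem is that inet_ntoa may return a pointer to a single static buffer, so two calls within one print statement can yield the same string. A sketch of the split-print approach follows; the helper name and output layout are made up for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>

/*
 * Format two IPv4 addresses into `out`, separated by a space. Each
 * inet_ntoa() result is consumed before the next call can overwrite it.
 * Returns the number of characters written, or -1 if `out` is too small.
 */
static int format_addr_pair(char *out, size_t size,
			    struct in_addr first, struct in_addr second)
{
	int n, m;

	n = snprintf(out, size, "%s", inet_ntoa(first));
	if (n < 0 || (size_t)n >= size)
		return -1;

	m = snprintf(out + n, size - n, " %s", inet_ntoa(second));
	if (m < 0 || (size_t)m >= size - n)
		return -1;

	return n + m;
}
```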

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-04 19:47:15 +01:00
Simon Sundberg
8b42ba1e22 pping: TC-BPF use BTF map if iproute has libbpf
Copy setup from traffic-pacing-edt to use BTF-defined map if configure
detects that iproute2 has libbpf support, otherwise fall back on
bpf_elf_map. Also fix a minor bug with setting default value for SEC
in bpf_egress_loader.sh.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-04 19:45:59 +01:00
Simon Sundberg
b920d72fe0 pping: Let libbpf pin map and clean up TC and map at end
Switch order so XDP program loads first, so the ts_start map is
automatically pinned by libbpf (solves issue with tc not preserving
the name of the map).

Unload the TC-BPF program (or rather remove the entire clsact qdisc
it is attached to) using bpf_egress_loader script once program
exits. Also unpin ts_start map on program shutdown.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-04 19:44:33 +01:00
Simon Sundberg
337126306b pping: Switch to BTF-defined maps for XDP program
Make loader use libbpf's existing functionality for reusing pinned
maps. The name of the map is not kept by tc, so cannot get fd of map by
name. Use fd of first encountered map as temporary workaround.

Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
2021-02-04 19:44:06 +01:00