Previously the program would only print out an error message if the
cleanup of a map failed, and then keep running. Each time the
periodical cleanup failed the error message would be repeated, but no
further action taken. Change this behavior so that the program instead
terminates the cleanup thread and aborts the rest of the program.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Due to a kernel bug for XDP programs loaded via libxdp that use global
functions (see https://lore.kernel.org/bpf/8735gkwy8h.fsf@toke.dk/t/),
XDP mode only works on relatively recent kernels where the bug is
patched (or kernels where the patch has been backported). As many
users may not have such a recent kernel they will only see a confusing
verifier error like the following:
Starting ePPing in standard mode tracking TCP on test123
libbpf: elf: skipping unrecognized data section(7) xdp_metadata
libbpf: elf: skipping unrecognized data section(7) xdp_metadata
libbpf: prog 'pping_xdp_ingress': BPF program load failed: Invalid argument
libbpf: prog 'pping_xdp_ingress': -- BEGIN PROG LOAD LOG --
Func#1 is safe for any args that match its prototype
Validating pping_xdp_ingress() func#0...
0: R1=ctx(id=0,off=0,imm=0) R10=fp0
; int pping_xdp_ingress(struct xdp_md *ctx)
0: (bf) r6 = r1 ; R1=ctx(id=0,off=0,imm=0) R6_w=ctx(id=0,off=0,imm=0)
1: (b7) r7 = 0 ; R7_w=invP0
; struct packet_info p_info = { 0 };
2: (7b) *(u64 *)(r10 -8) = r7 ; R7_w=invP0 R10=fp0 fp-8_w=00000000
3: (7b) *(u64 *)(r10 -16) = r7 ; R7_w=invP0 R10=fp0 fp-16_w=00000000
4: (7b) *(u64 *)(r10 -24) = r7 ; R7_w=invP0 R10=fp0 fp-24_w=00000000
5: (7b) *(u64 *)(r10 -32) = r7 ; R7_w=invP0 R10=fp0 fp-32_w=00000000
6: (7b) *(u64 *)(r10 -40) = r7 ; R7_w=invP0 R10=fp0 fp-40_w=00000000
7: (7b) *(u64 *)(r10 -48) = r7 ; R7_w=invP0 R10=fp0 fp-48_w=00000000
8: (7b) *(u64 *)(r10 -56) = r7 ; R7_w=invP0 R10=fp0 fp-56_w=00000000
9: (7b) *(u64 *)(r10 -64) = r7 ; R7_w=invP0 R10=fp0 fp-64_w=00000000
10: (7b) *(u64 *)(r10 -72) = r7 ; R7_w=invP0 R10=fp0 fp-72_w=00000000
11: (7b) *(u64 *)(r10 -80) = r7 ; R7_w=invP0 R10=fp0 fp-80_w=00000000
12: (7b) *(u64 *)(r10 -88) = r7 ; R7_w=invP0 R10=fp0 fp-88_w=00000000
13: (7b) *(u64 *)(r10 -96) = r7 ; R7_w=invP0 R10=fp0 fp-96_w=00000000
14: (7b) *(u64 *)(r10 -104) = r7 ; R7_w=invP0 R10=fp0 fp-104_w=00000000
15: (7b) *(u64 *)(r10 -112) = r7 ; R7_w=invP0 R10=fp0 fp-112_w=00000000
16: (7b) *(u64 *)(r10 -120) = r7 ; R7_w=invP0 R10=fp0 fp-120_w=00000000
17: (7b) *(u64 *)(r10 -128) = r7 ; R7_w=invP0 R10=fp0 fp-128_w=00000000
18: (bf) r2 = r10 ; R2_w=fp0 R10=fp0
;
19: (07) r2 += -128 ; R2=fp-128
; if (parse_packet_identifer_xdp(ctx, &p_info) < 0)
20: (85) call pc+13
R1 type=ctx expected=fp
Caller passes invalid args into func#1
processed 206542 insns (limit 1000000) max_states_per_insn 32 total_states 13238 peak_states 792 mark_read 40
-- END PROG LOAD LOG --
libbpf: failed to load program 'pping_xdp_ingress'
libbpf: failed to load object 'pping_kern.o'
Failed attaching ingress BPF program on interface test123: Invalid argument
Failed loading and attaching BPF programs in pping_kern.o
To help users that run into this issue when loading the program in
generic or unspecified mode, add a small hint suggesting to
upgrade the kernel or use the tc ingress mode instead in case
attaching the XDP program fails.
However, if loaded in native mode, give the suggestion to try
loading in generic mode instead. While libbpf and libxdp already add
some messages hinting at this, this hint clarifies how to do this with
ePPing (using the --xdp-mode argument).
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Add an option to let the user configure which mode to load the XDP
program in (unspecified, native or generic).
Set the default mode to native (was unspecified previously) as that is
what the user most likely wants to use (generic or unspecified falling
back on generic will likely have worse performance).
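A minimal sketch of how such an option argument could be mapped to a mode, with native as the default. The enum and function names are hypothetical and chosen for this illustration only; the real loader would presumably translate to the mode type provided by libxdp.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the attach modes used in this sketch. */
enum xdp_mode_sketch {
	MODE_UNSPEC,
	MODE_NATIVE,
	MODE_GENERIC,
	MODE_INVALID,
};

/* Default used when the option is not given (native, as described above). */
#define DEFAULT_XDP_MODE MODE_NATIVE

/* Translate the option argument into a mode. A NULL argument means the
 * option was not passed at all. */
enum xdp_mode_sketch parse_xdp_mode(const char *arg)
{
	if (arg == NULL)
		return DEFAULT_XDP_MODE;
	if (strcmp(arg, "unspecified") == 0)
		return MODE_UNSPEC;
	if (strcmp(arg, "native") == 0)
		return MODE_NATIVE;
	if (strcmp(arg, "generic") == 0)
		return MODE_GENERIC;
	return MODE_INVALID;
}
```

Returning a dedicated invalid value lets the caller print a usage error instead of silently falling back to some mode.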
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Using the XDP ingress hook requires a newer kernel (needs Toke's patch
fixing the verification of global functions for BPF_PROG_TYPE_EXT
programs) than tc mode, and will likely perform worse than tc if
running in generic mode (due to no driver support for
XDP). Furthermore, even when XDP works and has driver support, its
performance benefit over tc is likely small as the packets are always
passed on to the network stack regardless (not creating a fast-path
that bypasses the network stack). Therefore, use the tc ingress hook
as default instead, and only use XDP if explicitly required by the
user (-I/--ingress-hook xdp).
This partly addresses issue #49, as ePPing should no longer by default
get the confusing error message from failing verification if the
kernel lacks Toke's verifier patch.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Define the BPF program names in the user space component. The strings
corresponding to the BPF program names were previously inserted in several
places, including in multiple string comparisons, which is error prone
and could lead to subtle errors if the program names are changed and
not updated correctly in all places. With the program name strings
defined, they only have to be changed in a single place.
Currently only the names of the ingress programs occur in multiple
places, but also define the name for the egress program to be
consistent.
Note that even after this change one has to sync the defined values
with the actual program names declared in the pping_kern.c
file. Ideally, these would all be defined in a single place, but I am not
aware of a convenient way to make that happen (cannot use the defined
strings as function names as they are not identifiers, and if defined
as identifiers instead it would not be possible to use them as
strings).
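As a rough illustration of the idea, the names could be defined once and reused in every comparison. Only pping_xdp_ingress appears in the verifier log elsewhere in this history; the two tc names and the helper function are assumptions made for this sketch.

```c
#include <stdbool.h>
#include <string.h>

/* Must be kept in sync by hand with the function names in pping_kern.c.
 * The two tc names below are assumed for the purpose of this sketch. */
#define PROG_INGRESS_XDP "pping_xdp_ingress"
#define PROG_INGRESS_TC "pping_tc_ingress"
#define PROG_EGRESS_TC "pping_tc_egress"

/* True if name refers to one of the two alternative ingress programs. */
bool is_ingress_prog(const char *name)
{
	return strcmp(name, PROG_INGRESS_XDP) == 0 ||
	       strcmp(name, PROG_INGRESS_TC) == 0;
}
```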
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
The userspace loader would only check if the tc clsact was created
when the egress program was loaded. Thus, if the ingress program
created the clsact, the egress program would not have to create it,
and ePPing would falsely believe it did not create a
clsact and fail to remove it on shutdown even if --force was used. Fix
this by checking if either ingress or egress created the clsact.
This bug was introduced as a sneaky side effect of commit
78b45bde56 (pping: Use libxdp to load
and attach XDP program). Before this commit the egress program (for
which there is only a tc alternative) would be loaded first, and thus
it was sufficient to check if it created the clsact. When switching to
libxdp however, the ingress program (specifically the XDP program) had
to be loaded first, and thus the order of loading the ingress and egress
programs was swapped. Therefore, it was no longer sufficient to only
check the egress program as the tc ingress program may have created
the clsact before the egress program is attached (and only
checking the ingress program would also not be enough as the tc
ingress program may never be loaded if XDP mode is used instead).
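The fixed check can be sketched as follows. The struct and function names are hypothetical; the sketch only shows the logic of combining both attach steps, not the actual loader code.

```c
#include <stdbool.h>

/* Hypothetical bookkeeping for this sketch: which attach step, if any,
 * ended up creating the clsact qdisc. */
struct clsact_state {
	bool created_by_ingress; /* only possible when tc ingress is used */
	bool created_by_egress;
};

/* The qdisc should be removed on shutdown only if this process created
 * it (through either hook) and the user passed --force. */
bool should_remove_clsact(const struct clsact_state *s, bool force)
{
	return force && (s->created_by_ingress || s->created_by_egress);
}
```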
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Make ePPing ignore TCP SYN packets by default, so that the
initial handshake phase of the connection is ignored. Add an
option (--include-syn/-s) to explicitly include SYN packets.
The main reason it can be a good idea to avoid SYN-packets is to avoid
being affected by SYN-flood attacks. When ePPing also includes
SYN-packets it becomes quite vulnerable to SYN-flood attacks, which
will quickly fill up its flow_state table, blocking actual useful
flows from being tracked. As ePPing will consider the connection
opened as soon as it sees the SYN-ACK (it will not wait for final
ACK), flow-state created from SYN-flood attacks will also stay around
in the flow-state table for a long time (5 minutes currently) as no
RST/FIN will be sent that can be used to close it.
The drawback from ignoring SYN-packets is that no RTTs will be
collected during the handshake phase, and all connections will be
considered opened due to "first observed packet".
A more refined approach could be to properly track the full TCP
handshake (SYN + SYN-ACK + ACK) instead of the more generic "open once
we see reply in reverse direction" used now. However, this adds a fair
bit of additional protocol-specific logic. Furthermore, to track the
full handshake we will still need to store some flow-state before the
handshake is completed, and thus such a solution would still be
vulnerable to SYN-flood attacks (although the incomplete flow states
could potentially be cleaned up faster).
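A minimal sketch of the default filter, assuming that every packet with the SYN flag set (both SYN and SYN-ACK) is skipped unless the option is given. The struct and function names are hypothetical.

```c
#include <stdbool.h>

/* Simplified view of the TCP flags relevant here (hypothetical struct). */
struct tcp_flags_sketch {
	bool syn;
	bool ack;
};

/* Returns true if the packet should be skipped entirely. With the
 * default include_syn == false, any packet carrying the SYN flag is
 * skipped, so a SYN flood cannot create flow state. */
bool skip_packet(struct tcp_flags_sketch flags, bool include_syn)
{
	return flags.syn && !include_syn;
}
```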
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Since we now have libxdp as a submodule, we can switch pping over to using
libxdp for loading its XDP program. This requires switching the order of
attachment, because libxdp needs to be fed the BPF object in an unloaded
state (and will load it as part of attaching).
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Refactor functions for parsing protocol-specific packet
identifiers (parse_tcp_identifier, parse_icmp6_identifer and
parse_icmp_identifer) so they no longer directly fill in the
packet_info struct. Instead make the functions take additional
pointers as arguments and fill in a protocol_info struct.
The reason for this change is to decouple the
parse_<protocol>_identifier functions from the logic of how the
packet_info struct should be filled. The parse_packet_indentifier is
now solely responsible for filling in the members of packet_info
struct correctly instead of working in tandem with the
parse_<protocol>_identifier, filling in some members each.
This might result in a minimal performance degradation as some values
are now first filled in the protocol_info struct and later copied to
packet_info instead of being filled in directly in packet_info.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Format code using clang-format from the kernel tree. However, leave
code in original format in some instances where clang-format clearly
reduces readability of code (ex. do not remove alignment of comments
for struct members and long options).
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Add some debug info to the periodical map cleanup process. Push debug
information through the events perf buffer by using newly added
map_clean_event.
The old user space map cleanup process had some simple debug
information that was lost when transitioning to using bpf_iter
instead. Therefore, add back similar (but more extensive) debug
information but now collected from the BPF-side. In addition to stats
on entries deleted by the cleanup process, also include stats on
entries deleted by ePPing itself due to matching (for timestamp
entries) or detecting FIN/RST (for flow entries).
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
To improve the performance of the map cleanup, switch from the
user-space loop to using BPF iterators. With BPF iterators, a BPF
program can be run on each element in the map, so the cleanup can be
done in kernel-space. This should hopefully also avoid the issue the previous
userspace loop had with resetting in case an element was removed by
the BPF programs during the cleanup.
Due to removal of userspace logic for map cleanup, no longer provide
any debug information about how many entries there are in each map and
how many of them were removed by the garbage collection. This will be
added back in the next commit.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Explicitly stop the thread which performs periodical map cleanup on
shutdown, before attempting to free up any resources it might use.
For the cleanup order to make more sense, also setup the perf-buffer
before setting up the periodical map cleaning, so that the thread can
be stopped as the first part of the cleanup. This also matches better
with the next couple of commits where map cleaning debug information
will be pushed through the perf-buffer.
Finally, move the addition of the signalhandler earlier in the code to
eliminate a window where it was possible to terminate the program
without relevant cleanup code having a chance to run.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Allocate the clean_args on the stack of the main function rather than
the stack of setup_periodical_map_cleaning. The clean_args is used by
periodical_map_cleanup from a different thread, so allocating it on
the stack of setup_periodical_map_cleaning, which returns
directly after the thread is created, opens up for errors where later
function calls may overwrite the arguments, causing unpredictable
behavior.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Send a warning notifying the user that PPing failed to create a
flow/timestamp entry due to the corresponding map being full. To avoid
sending a warning for every packet, only emit warnings every
WARN_MAP_FULL_INTERVAL (which is currently hard-coded to 1s).
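The rate limiting of the warning can be sketched like this. WARN_MAP_FULL_INTERVAL and its value of 1s come from the description above; the function name and the way the last warning time is stored are assumptions for the illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define NS_PER_SECOND 1000000000ULL
/* 1 s, matching the hard-coded value mentioned above. */
#define WARN_MAP_FULL_INTERVAL NS_PER_SECOND

/* Returns true if a warning may be emitted at time now_ns, and in that
 * case records the time in *last_warn_ns. */
bool should_warn_map_full(uint64_t now_ns, uint64_t *last_warn_ns)
{
	if (now_ns - *last_warn_ns < WARN_MAP_FULL_INTERVAL)
		return false;
	*last_warn_ns = now_ns;
	return true;
}
```

One timestamp per map would let flow and timestamp warnings be limited independently.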
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Refactor code for how events are handled in the user space
application. Preparation for adding an additional event type which
should not be handled by the normal functions for printing RTT and
flow events.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Wait with sending a flow open message until a reply has been seen for
the flow. Likewise, only emit a flow closing event if the flow has
first been opened (that is, a reply has been seen).
This introduces potential (but unlikely) concurrency issues for flow
opening/closing messages which are further described in the README.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Perform both timestamping and matching on both ingress and egress
hooks. This makes it more similar to Kathie's pping, allowing the tool
to capture RTTs in both directions when deployed on just a single
interface.
Like Kathie's pping, by default filter out RTTs for packets going to
the local machine (will only include local processing delays). This
behavior can be disabled by passing the -l/--include-local option.
As packets that are timestamped on ingress and matched on egress will
include the local machine's processing delay, add the "match_on_egress"
member to the JSON output that can be used to differentiate between
RTTs that include the local processing delay, and those which don't.
Finally, report the source and destination addresses from the perspective
of the reply packet, rather than the timestamped packet, to be
consistent with Kathie's pping.
Overall, refactor large parts of pping_kern to allow both timestamping
and matching, as well as updating both the flow and reverse flow and
handle flow-events related to them, in one go. Also update README to
reflect changes.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Add an option (-R, --rtt-rate) to adapt the sampling rate based on the
RTT of the flow. The sampling interval will be C * RTT, where C is a
configurable constant (ex 1.0 to get one sample every RTT), and RTT
is either the current minimum (default) or smoothed RTT of the
flow (chosen via the -t or --rtt-type option).
The smoothed RTT (sRTT) is updated for each calculated RTT, and is
calculated in a similar manner to srtt in the kernel's TCP stack. The
sRTT is a moving average of all RTTs, and is calculated according to
the formula:
srtt = 7/8 * prev_srtt + 1/8 * rtt
To allow the user to pass a non-integer C (ex 0.1 to get 10 RTT
samples for every RTT-period), fixed-point arithmetic has been used
in the eBPF programs (due to lack of support for floats). The maximum
value for C has been limited to 10000 in order for it to be unlikely
that the C * RTT calculation will overflow (with C = 10000, overflow
will only occur if RTT > 28 seconds).
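The calculations above can be sketched with integer arithmetic only. The choice of 16 fractional bits is an assumption; it is used here because it reproduces the roughly 28 second overflow bound for C = 10000 with 64-bit nanosecond values. Seeding the sRTT with the first sample is also an assumption, and all names are hypothetical.

```c
#include <stdint.h>

/* Number of fractional bits (assumed, see the text above the block). */
#define FIXPOINT_SHIFT 16

/* Convert a non-negative user-supplied factor to fixed point. This part
 * would run in user space, where floats are available. */
uint64_t double_to_fixpoint(double c)
{
	return (uint64_t)(c * (double)(1ULL << FIXPOINT_SHIFT));
}

/* C * RTT using only integer operations, as needed on the BPF side. */
uint64_t scale_rtt(uint64_t c_fixpoint, uint64_t rtt_ns)
{
	return (c_fixpoint * rtt_ns) >> FIXPOINT_SHIFT;
}

/* srtt = 7/8 * prev_srtt + 1/8 * rtt, seeded with the first sample. */
uint64_t update_srtt(uint64_t prev_srtt, uint64_t rtt_ns)
{
	if (prev_srtt == 0)
		return rtt_ns;
	return prev_srtt - (prev_srtt >> 3) + (rtt_ns >> 3);
}
```

With these assumptions the product C * RTT stays below 2^64 as long as RTT < 2^64 / (10000 * 2^16) ns, which is about 28.1 seconds.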
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Add command-line flags for each protocol that pping should attempt to
parse and report RTTs for (currently -T/--tcp and -C/--icmp). If no
protocol is specified assume TCP. To clarify this, output a message
before start on how ePPing has been configured (stating output format,
tracked protocols and which interface to run on).
Additionally, as the ppviz format was only designed for TCP it does
not have any field for which protocol an entry belongs to. Therefore,
emit a warning in case the user selects the ppviz format with anything
other than TCP.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Allow pping to passively monitor RTT for ICMP echo request/reply
flows. Use the echo identifier as ports, and echo sequence as packet
identifier.
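A rough sketch of the mapping described above. The struct layout and names are hypothetical, and using the echo identifier for both source and destination port is an assumption; the type values 8 and 0 are the standard ICMPv4 echo request and echo reply types.

```c
#include <stdbool.h>
#include <stdint.h>

#define ICMP_ECHOREPLY_TYPE 0
#define ICMP_ECHO_TYPE 8

/* Hypothetical, simplified packet identifier for this sketch. */
struct packet_id_sketch {
	uint16_t sport;
	uint16_t dport;
	uint32_t identifier;
	bool is_reply;
};

/* Fill in pid from the fields of an ICMP echo message (host byte order).
 * Returns 0 on success, -1 if the message is not an echo request/reply. */
int parse_icmp_echo(uint8_t type, uint16_t echo_id, uint16_t echo_seq,
		    struct packet_id_sketch *pid)
{
	if (type != ICMP_ECHO_TYPE && type != ICMP_ECHOREPLY_TYPE)
		return -1;

	pid->sport = echo_id;       /* echo identifier stands in for ports */
	pid->dport = echo_id;
	pid->identifier = echo_seq; /* echo sequence identifies the packet */
	pid->is_reply = (type == ICMP_ECHOREPLY_TYPE);
	return 0;
}
```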
Additionally, add protocol to standard output format in order to be
able to distinguish between TCP and ICMP flows.
The ppviz format does not include protocol, making it impossible to
distinguish between TCP and ICMP traffic. Will add warning if ppviz
format is used together with ICMP traffic in the future.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Also intercept SIGTERM (in addition to the previously intercepted
SIGINT) and perform graceful shutdown.
Perhaps it also makes sense to perform graceful shutdown on some
additional signals, like SIGHUP and SIGQUIT?
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
The libbpf API has deprecated a number of functions used by the pping
loader. While a couple of functions have simply been renamed,
bpf_object__find_program_by_title has been completely deprecated in
favor of bpf_object__find_program_by_name. Therefore, change so that
BPF programs are found based on the C function names rather than
section names.
Also remove defines of section names as they are no longer used, and
change the section names in pping_kern.c to use "tc" instead of
"classifier/ingress" and "classifier/egress".
Finally replace the flags json_format and json_ppviz in pping_config
with a single enum for the different output formats. This makes the
logic for which output format to use clearer compared to relying on
multiple (supposedly) mutually exclusive flags (and implicitly
assuming standard format if neither flag was set).
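A single enum for the output format could look roughly like the sketch below. The format names "standard", "json" and "ppviz" are taken from the format option described elsewhere in this history; the enum and function names are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Exactly one format is selected at a time, so no combination of flags
 * can end up in a contradictory state. */
enum output_format_sketch {
	FORMAT_STANDARD,
	FORMAT_JSON,
	FORMAT_PPVIZ,
	FORMAT_INVALID,
};

/* Map the option argument to a format. NULL (option not given) selects
 * the standard format explicitly rather than implicitly. */
enum output_format_sketch parse_output_format(const char *arg)
{
	if (arg == NULL || strcmp(arg, "standard") == 0)
		return FORMAT_STANDARD;
	if (strcmp(arg, "json") == 0)
		return FORMAT_JSON;
	if (strcmp(arg, "ppviz") == 0)
		return FORMAT_PPVIZ;
	return FORMAT_INVALID;
}
```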
One potential concern with this commit is that it introduces some
"magical strings". In case the function names in pping_kern.c are
changed it will require multiple changes in pping.c.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
The rate-limit and cleanup-interval arguments were only verified to be
positive. Add a check for an upper bound to prevent the user from
passing values that result in an internal overflow. The limits for both
rate-limit and cleanup-interval have been set to one week, which should
be more than enough for any reasonable user.
Additionally, disable the periodical cleanup entirely if the value 0 is
passed to cleanup-interval.
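A minimal sketch of such a bounded parse. It assumes the argument is given in seconds and stored internally in nanoseconds; the unit, the function name and the error convention are assumptions for the illustration, while the one week limit is from the description above.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define NS_PER_SECOND 1000000000ULL
#define MAX_SECONDS (7.0 * 24 * 3600) /* one week */

/* Parse a duration in seconds from str and store it in nanoseconds.
 * Returns 0 on success, -EINVAL on malformed, negative or too large
 * input. A value of 0 is accepted (it disables the cleanup). */
int parse_bounded_seconds(const char *str, uint64_t *out_ns)
{
	char *end;
	double val;

	errno = 0;
	val = strtod(str, &end);
	if (errno != 0 || end == str || *end != '\0')
		return -EINVAL;
	if (!(val >= 0 && val <= MAX_SECONDS)) /* also rejects NaN */
		return -EINVAL;

	*out_ns = (uint64_t)(val * NS_PER_SECOND);
	return 0;
}
```

Checking the bound before the multiplication keeps the conversion to an unsigned 64-bit value well within range.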
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Make several changes to functions related to attaching and detaching
the BPF programs:
- Check the BPF program id when detaching programs to ensure that the
correct programs are removed.
- When attaching tc-programs, keep track of if the clsact qdisc was
created or existed previously. Attempt to delete the qdisc if it was
created and attaching failed. If the --force argument was given, also
attempt to delete qdisc on shutdown in case it did not previously
exist.
- Rely on XDP flags to replace existing XDP program if --force is used
rather than explicitly detaching any XDP program first.
- Print out hints for why pping might have failed attaching the XDP
program.
Also, use libbpf_strerror instead of strerror to better display
libbpf-specific error codes, and for more reliable error handling in
general (don't need to ensure the error codes are positive).
Finally, change return codes of tc programs to TC_ACT_UNSPEC from
TC_ACT_OK to allow other TC-BPF programs to be used on the same
interface as pping.
Concerns with this commit:
- When attaching a tc program libbpf will emit a warning if the
clsact qdisc already exists on the interface. The fact that the
clsact already exists is not an issue, and is handled in tc_attach
by checking for EEXIST, so the warning could be a bit
misleading/confusing for the user.
- The tc_attach and xdp_attach functions attempt to return the u32
prog_id in an int. In case the programs are assigned a very high
id (> 2^31) this may cause it to be interpreted as an error instead.
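One possible way to avoid the second concern, sketched under the assumption that the attach functions could be changed to report the id through an output parameter. The function below is a stand-in and does not attach anything; it only shows the calling convention.

```c
#include <errno.h>
#include <stdint.h>

/* Stand-in for an attach function: pretends the kernel assigned
 * assigned_id, or fails when assigned_id is 0. Purely illustrative.
 * The return value carries only the error code, so even an id above
 * 2^31 cannot be mistaken for an error. */
int attach_sketch(uint32_t assigned_id, uint32_t *prog_id)
{
	if (assigned_id == 0)
		return -ENOENT;
	*prog_id = assigned_id;
	return 0;
}
```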
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
For some machines, XDP may not be suitable due to ex. lack of XDP
support in NIC drivers or another program already being attached to
the XDP hook on the desired interface. Therefore, add an option to use
the tc-ingress hook instead of XDP to attach the pping ingress BPF
program on.
In practice, this adds an additional BPF program to the object file (a
TC ingress program). To avoid loading an unnecessary BPF program, also
explicitly disable autoloading for the ingress program not selected.
Also, change the tc programs to return TC_ACT_OK instead of
BPF_OK. While both should be compatible, the TC_ACT_* return codes
seem to be more commonly used for TC-BPF programs.
Concerns with this commit:
- The error messages for XDP attach failure have gotten slightly less
descriptive. I plan to improve the code for attaching and detaching
XDP programs in a separate commit, and will then address that.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
libbpf v0.4 added an API for attaching/detaching TC-BPF programs. So
use the new API to attach the tc program instead of calling on an
external script (which uses the tc command line utility).
Avoid removing the clsact qdisc on program shutdown or error, as
there's currently no convenient way to ensure the qdisc isn't used by
other programs as well. This means pping will not completely clean up
after itself, but this is a safer alternative than always destroying
the qdisc as done by the external script, which may pull the rug out
underneath other programs using the qdisc.
Finally, remove the pin_dir member from the configuration as pping no
longer pins any programs or maps, and remove deleted tc loading
scripts from README.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Add cleanup code to load_attach_bpfprogs function, so it should always
unpin tc program, and additionally detach the tc and xdp programs in
case of any failure.
Also unpin tc program directly after attaching it (rather than on
program shutdown), so that multiple instances of pping can be run
simultaneously (on different interfaces).
Finally, rename some of the functions for attaching/detaching tc
programs to be more consistent with the xdp ones.
Note: Still need to keep a copy of most of the cleanup code in main as
well, as the tc and xdp programs also need to be detached on program
shutdown or if later functions fail.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Simplify the three output functions by breaking them up into smaller
helper functions. Also introduce the pping_event union, which can hold
either an rtt_event or flow_event.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Make the flow_timeout function call the current output function to
simulate a flow-closing event. Also some other minor cleanup/fixes.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Add "flow events" (flow opening or closing so far) which will trigger
a printout of message.
Note: The ppviz format will only print out the traditional rtt events
as the format does not include opening/closing messages.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Use a JSON-writer library from iproute instead of complicated printf
statement. Also output timestamp, rtt and min_rtt as integers in
nanoseconds, rather than floats in seconds.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Change order of parameters for format_ip_address to follow the
convention of the printf functions where buffer is placed first,
instead of the conventions of the inet_ntop functions where buffer is
placed last.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Add per-flow tracking of number of packets and bytes
sent/received. Add these to the JSON output format.
Also update README regarding concurrency issue when updating these
statistics.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
The format option can take the values "standard" (default), "json" and
"ppviz" (new name for "machine-friendly").
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Add Kathie's "machine friendly" as an optional output format when
passing '-m' or '--machine-friendly' to pping. This format can be used
together with Kathie's ppviz tool to visualize the output.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Add the option to output in JSON format by passing '-j' or '--json' to
pping. Include the protocol in the JSON format, and fix so kernel-side
actually stores the protocol in the flow_address struct.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
To add timestamp to output, push the timestamp when packet was
processed from kernel as part of the rtt-event. Also keep track of
minimum encountered RTT for each flow in kernel, and also push that as
part of the RTT-event.
Additionally, avoid pushing RTT messages at all if no flow-state
information can be found (due to ex. being deleted from egress side),
as no valid min-RTT can then be given. Furthermore, no longer delete
flow-information once seeing the FIN-flag on egress in order to keep
useful flow-state around for RTT-messages longer. Due to the
FIN-handshake process, it is sufficient if the ingress program deletes
the flow-state upon seeing FIN. However, still delete flow-state from
either ingress or egress upon seeing RST flag, as RST does not have a
handshake process allowing for delayed deletion.
While minimum RTT could also be tracked from the userspace process,
userspace is not aware of when the flow is closed so would have to add
additional logic to keep track of minimum RTT for each flow and
periodically clean them up. Furthermore, keeping RTT statistics in the
flow-state map is useful for implementing future features, such as an
RTT-based sampling interval. It would also be useful in case pping is
changed to no longer have a long-running userspace process printing
out all the calculated RTTs, but instead simply occasionally looks up
the RTT from the flow-state map.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
When both BPF programs are kept in the same file, no longer need to
pin the maps in order to share them between the programs.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Merge the pping_kern_tc.c, pping_kern_xdp.c and pping_helpers.h into
the single file pping_kern.c. Do not change any of the BPF code,
except renaming the map ts_start to packet_ts.
To handle both BPF programs kept in single ELF-file, change loading
mechanism to extract and attach both tc and XDP programs from it. Also
refactor main-method into several smaller functions to reduce its
size.
Finally, added the --force (-f) and --cleanup-interval (-c) options to
the argument parsing, and improved the parsing of the
--rate-limit (-r) option.
NOTE: The verifier rejects the program in its current state as too
large (over 1 million instructions). Setting the TCP_MAX_OPTIONS in
pping_kern.c to 5 (or less) solves this. Unsure at the moment what
causes the verifier to think the program is so large, as the code in
pping_kern.c is identical to the one from the three files it was
merged from.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Refactor init_rodata to search for the first map with ".rodata" in its
name. Should be more robust than previous solution which first tried
to construct the name for the rodata map, and then find the map by
name.
Also remove some commented-out code that was not used.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Implement basic mechanic for parsing arguments from userspace and
passing them to a global config variable in the BPF programs.
This also changes the basic use of the program from:
$./pping interface
to:
$./pping -i interface
Also, revert to using the memset solution for the map_ipv4_to_ipv6
function to avoid the ipv4_prefix constant being stored in the .rodata
section. This makes it easier to set the value for the global config
variable from userspace, as the only thing left in the .rodata section
is the config struct.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Remove some commented-out code. Also use bpf_object__unpin_maps
instead of manually unpinning the ts_start map. Additionally, change
map_ipv4_to_ipv6 to use clearer implementation (that now also works
for tc due to always using libbpf to load program).
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Load and pin the tc-bpf program in pping.c using libbpf, and only
attach the pinned program using iproute. That way, can use features
that are not supported by the old iproute loader, even if iproute does
not have libbpf support.
To support this change, extend bpf_egress_loader with option to load
pinned program. Additionally, remove configure script and parts of
Makefile that are no longer needed. Furthermore, remove multiple
definitions of ts_start map, and place singular definition in
pping_helpers.h which is included by both BPF programs.
Also, some minor fixes based on Toke's review.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Add a per-flow rate limit, limiting how often new timestamp entries
can be created. As part of this, add per-flow state keeping track
of when last timestamp was created and last seen identifier for each
flow.
Additionally, remove timestamp entry as soon as RTT is
calculated, as last seen identifier is used to find first unique value
instead. Furthermore, remove packet_timestamp struct and only use
__u64 as timestamp, as the used member is no longer needed.
This initial commit lacks cleanup of flow-state, user-configuration of
rate limit, mechanism to handle bursts etc.
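The per-flow check can be sketched roughly as follows. The struct layout and names are hypothetical, and treating a zero-initialized state as "identifier 0 already seen" is a simplification of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-flow state, following the description above. */
struct flow_state_sketch {
	uint64_t last_timestamp; /* when the last entry was created (ns) */
	uint32_t last_id;        /* last identifier seen for the flow */
};

/* Decide whether a new timestamp entry should be created for a packet
 * carrying identifier id at time now_ns, updating the flow state. */
bool should_timestamp(struct flow_state_sketch *f, uint32_t id,
		      uint64_t now_ns, uint64_t rate_limit_ns)
{
	if (id == f->last_id)
		return false; /* only the first packet with a given id counts */
	f->last_id = id;

	if (now_ns - f->last_timestamp < rate_limit_ns)
		return false; /* too soon after the previous entry */
	f->last_timestamp = now_ns;
	return true;
}
```

Because the last seen identifier is remembered per flow, a timestamp entry can be deleted as soon as it has been matched without risking that a later packet with the same identifier recreates it.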
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Refactor TC and XDP programs to reuse common logic for parsing
packets. Add functions for parsing packets for an identifier to
pping_helpers.h which both TC and XDP parts use. Also make it easier
to extend pping with support for new protocols, as only new parsing
functions have to be added and inserted into a single place.
Also add reserved members to end of structs in pping.h to indicate
padding.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Move some members in network_tuple and rtt_event around to avoid holes.
Also remove some unnecessary parentheses before & operator, and add
local definitions of AF_INET and AF_INET6.
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>
Several changes to add IPv6 support:
- Change structs in pping.h
- replace ipv4_flow with network_tuple
- rename ts_key to packet_id
- rename ts_timestamp to packet_timestamp
- Add map_ipv4_to_ipv6 in pping_helpers.h
- Also remove obsolete fill_ipv4_flow
- Rewrite pping_kern*
- parse either IPv4 or IPv6 header (depending on proto)
- Use map_ipv4_to_ipv6 to store IPv4 address in network_tuple
Support printout of IPv6 addresses in pping.c
- Add function format_ip_address as wrapper over inet_ntop
- Change handle_rtt_event to first format IP-address strings in
local buffers, then perform single printout
While some steps have been taken to be more general towards different
types of packet identifiers (not just the currently supported TCP
timestamps), significant refactoring of pping_kern* will still be
required. Also, pping_kern_xdp and pping_kern_tc have large
sections of very similar code that can be refactored into functions.
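For illustration, storing an IPv4 address as an IPv4-mapped IPv6 address and formatting it again could look like the sketch below. The function names follow the ones mentioned above, but the signatures and bodies are assumptions, not the code from this commit.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Store an IPv4 address (network byte order) as an IPv4-mapped IPv6
 * address, ::ffff:a.b.c.d. */
void map_ipv4_to_ipv6(struct in6_addr *ipv6, uint32_t ipv4_be)
{
	memset(&ipv6->s6_addr[0], 0x00, 10);
	memset(&ipv6->s6_addr[10], 0xff, 2);
	memcpy(&ipv6->s6_addr[12], &ipv4_be, sizeof(ipv4_be));
}

/* Wrapper over inet_ntop. af is AF_INET or AF_INET6; for AF_INET the
 * address is read from the last four bytes of the mapped address.
 * Returns 0 on success, -1 on failure. */
int format_ip_address(char *buf, size_t size, int af,
		      const struct in6_addr *addr)
{
	if (af == AF_INET)
		return inet_ntop(af, &addr->s6_addr[12], buf, size) ? 0 : -1;
	if (af == AF_INET6)
		return inet_ntop(af, addr, buf, size) ? 0 : -1;
	return -1;
}
```

Keeping every address in the 16-byte form lets a single network_tuple layout serve both IP versions.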
Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>