xdp-project-bpf-examples

mirror of https://github.com/xdp-project/bpf-examples.git synced 2024-05-06 15:54:53 +00:00

Author	SHA1	Message	Date
Toke Høiland-Jørgensen	ff3e4272ff	Integrate libxdp as a submodule This adds libxdp as a submodule and link target alongside libbpf. This should make it just as easy for examples to use libxdp as it currently is for libbpf. Some hoops need to be jumped through to make libxdp link against the same version of libbpf as the one we use in this repository. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>	2022-04-20 13:12:45 +02:00
Toke Høiland-Jørgensen	58fcc521b7	pping: Guard local definitions of AF_* constants in pping_kern.c The header files included from pping_kern.c include definitions of AF_INET and AF_INET6, leading to warnings like: pping_kern.c:25:9: warning: 'AF_INET' macro redefined [-Wmacro-redefined] ^ /usr/include/bits/socket.h:97:9: note: previous definition is here ^ pping_kern.c:26:9: warning: 'AF_INET6' macro redefined [-Wmacro-redefined] ^ /usr/include/bits/socket.h:105:9: note: previous definition is here ^ 2 warnings generated. Fix this by guarding the definitions behind suitable ifdefs. Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>	2022-04-14 21:29:32 +02:00
Toke Høiland-Jørgensen	5ca0416e9e	Merge pull request #43 from simosund/pping_dualdirection_flowstate PPing dualdirection flowstate	2022-03-31 17:58:25 +02:00
Simon Sundberg	2e7595b3ca	pping: Replace boolean connection state flags with enum The connection state had 3 boolean flags related to what state it was in (is_empty, has_opened and has_closed). Only specific combinations of these flags really made sense (has_opened/has_closed didn't really mean anything if is_empty, and if has_closed one would expect is_empty to be false and has_opened to be true etc.). Therefore, replace these combinations of boolean values with a singular enum which is used to check if the flow is empty, waiting to open (seen outgoing packet but no response), is open or has closed. Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>	2022-03-30 18:12:46 +02:00
Simon Sundberg	4fb1f6de64	pping: Combine flow state in each direction to a dualflow state Combine the flow state entries for both the "forward" and "reverse" direction of the flow into a single dualflow state. Change the flowstate map to use the dualflow state so that state for both directions can be retrieved using a single map lookup. As flow states are now kept in pairs, cannot directly create/delete states from the BPF map each time a flow opens/closes in one direction. Therefore, update all logic related to creating/deleting flows. For example, use "empty" slot in dualflow state instead of creating a new map entry, and only delete the dual flow state entry once both directions of the flow have closed/timed out. Some implementation details: Have implemented a simple memcmp function as I could not get the __builtin_memcmp function to work (got error "libbpf: failed to find BTF for extern 'memcmp': -2"). To ensure that both directions of the flow always look up the same entry, use the "sorted" flow tuple (the (ip, port) pair that is smaller is always first) as key. This is what the memcmp is used for. To avoid storing two copies of the flow tuple (src -> dst and dst -> src) and doing additional memcmps, always store the flow state for the "sorted" direction as the first direction and the reverse as the second direction. Then simply check if a flow is sorted or not to determine which direction in the dual flow state that matches. Have attempted to at least partially abstract this detail away from most of the code by adding some get_flowstate_from* helpers. The dual flow state simply stores the two (single direction) flow states as the struct members dir1 and dir2. Use these two (admittedly poorly named) members instead of a single array of size 2 in order to avoid some issues with the verifier being worried that the array index might be out of bounds. Have added some new boolean members to the flow state to keep track of "connection state". In addition to the previous has_opened, I now also have a member for if the flow is "empty" or if it has been closed. These are needed to cope with having to keep individual flow states for both directions of the flow around as long as one direction of the flow is used. I plan to replace these boolean "connection state" members with a single enum in a future commit. Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>	2022-03-30 18:12:46 +02:00
Simon Sundberg	75a979fc31	pping: Refactor parsing of packet identifiers Refactor functions for parsing protocol-specific packet identifiers (parse_tcp_identifier, parse_icmp6_identifer and parse_icmp_identifer) so they no longer directly fill in the packet_info struct. Instead make the functions take additional pointers as arguments and fill in a protocol_info struct. The reason for this change is to decouple the parse_<protocol>_identifier functions from the logic of how the packet_info struct should be filled. The parse_packet_indentifier is now solely responsible for filling in the members of packet_info struct correctly instead of working in tandem with the parse_<protocol>_identifier, filling in some members each. This might result in a minimal performance degradation as some values are now first filled in the protocol_info struct and later copied to packet_info instead of being filled in directly in packet_info. Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>	2022-03-22 13:24:09 +01:00
Simon Sundberg	2935fb05cc	pping: Minor formatting fixes Format code using clang-format from the kernel tree. However, leave code in original format in some instances where clang-format clearly reduces readability of code (ex. do not remove alignment of comments for struct members and long options). Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>	2022-03-22 09:11:53 +01:00
Toke Høiland-Jørgensen	b7d365dcaa	Merge pull request #41 from simosund/pping_ref_count PPing optimization: Tracking outstanding timestamps	2022-03-21 15:52:06 +01:00
Simon Sundberg	8c0084dd0e	pping: Update TODO about timestamp ref count Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>	2022-03-15 15:39:53 +01:00
Simon Sundberg	11f1d88742	pping: Keep track of outstanding timestamps Add a counter of outstanding (unmatched) timestamped entries in the flow state. Before a timestamp lookup is attempted, check that there are any outstanding timestamps, otherwise avoid the unnecessary hash map lookup. Use 32 bit counter for outstanding timestamps to allow atomic increments/decrements using __sync_fetch_and_add. This operation is not supported on smaller integers, which is why such a large counter is used. The atomicity is needed because the counter may be concurrently accessed by both the ingress/egress hook as well as the periodical map cleanup. Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>	2022-03-15 15:39:53 +01:00
Toke Høiland-Jørgensen	a6e4dd8748	Merge pull request #34 from simosund/pping_better_map_cleaning PPing - Improve map cleanup	2022-03-10 14:28:22 +01:00
Simon Sundberg	e7201c9eab	pping: Update TODO based on map cleaning improvements Also add description of three potential issues introduced by the changes to cleanup process. Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>	2022-03-10 09:59:21 +01:00
Simon Sundberg	f22025f716	pping: More aggressive map cleanup Add conditions that allow removing old flow and timestamp entries sooner. For flow map, have added conditions that allow unopened flows and ICMP flows to be removed earlier than open TCP flows (currently both set to 30 sec instead of 300 sec). For timestamp entries, allow them to be removed if their age exceeds TIMESTAMP_RTT_LIFETIME (currently 8) times the flow's sRTT. Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>	2022-03-10 09:54:52 +01:00
Simon Sundberg	72404b6767	pping: Add map cleanup debug info Add some debug info to the periodical map cleanup process. Push debug information through the events perf buffer by using newly added map_clean_event. The old user space map cleanup process had some simple debug information that was lost when transitioning to using bpf_iter instead. Therefore, add back similar (but more extensive) debug information but now collected from the BPF-side. In addition to stats on entries deleted by the cleanup process, also include stats on entries deleted by ePPing itself due to matching (for timestamp entries) or detecting FIN/RST (for flow entries) Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>	2022-03-10 09:45:22 +01:00
Simon Sundberg	be0921d116	pping: Use BPF iterators to do map gc To improve the performance of the map cleanup, switch from the user-space loop to using BPF iterators. With BPF iterators, a BPF program can be run on each element in the map, and can thus be done in kernel-space. This should hopefully also avoid the issue the previous userspace loop had with resetting in case an element was removed by the BPF programs during the cleanup. Due to removal of userspace logic for map cleanup, no longer provide any debug information about how many entries there are in each map and how many of them were removed by the garbage collection. This will be added back in the next commit. Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>	2022-03-10 09:45:15 +01:00
Simon Sundberg	d4c33259c7	pping: Improve program cleanup on shutdown Explicitly stop the thread which performs periodical map cleanup on shutdown, before attempting to free up any resources it might use. For the cleanup order to make more sense, also setup the perf-buffer before setting up the periodical map cleaning, so that the thread can be stopped as the first part of the cleanup. This also matches better with the next couple of commits where map cleaning debug information will be pushed through the perf-buffer. Finally, move the addition of the signalhandler earlier in the code to eliminate a window where it was possible to terminate the program without relevant cleanup code having a chance to run. Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>	2022-03-09 19:20:51 +01:00
Simon Sundberg	6885771b14	pping: Allocate cleanup-args on main's stack Allocate the clean_args on the stack of the main function rather than the stack of setup_periodical_map_cleaning. The clean_args is used by periodical_map_cleanup from a different thread, so allocating them on stack for setup_periodical_map_cleaning which goes out of scope directly after the thread is created opens up for errors where later function calls may overwrite the arguments, causing unpredictable behavior. Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>	2022-03-09 14:54:20 +01:00
Toke Høiland-Jørgensen	27861e384e	Merge pull request #39 from chenhengqi/update-btf-example Update btf example	2022-03-09 12:00:39 +01:00
Hengqi Chen	e307581009	BTF-playground: Handle error and free resources explicitly Signed-off-by: Hengqi Chen <chenhengqi@outlook.com>	2022-03-09 09:04:38 +08:00
Hengqi Chen	c9398145ac	Ignore lib/libbpf-install/ Signed-off-by: Hengqi Chen <chenhengqi@outlook.com>	2022-03-06 21:54:01 +08:00
Hengqi Chen	8301679918	configure: Fix mixed indentation Signed-off-by: Hengqi Chen <chenhengqi@outlook.com>	2022-03-06 21:52:48 +08:00
Jesper Dangaard Brouer	ce714625ce	AF_XDP-interaction: Fix function tx_pkt If there were no more transmit slots, then we umem free the packet, but we continued sending it anyhow. In the places tx_pkt() is currently used this never happened. Still, fix the bug. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>	2022-02-21 19:03:49 +01:00
Toke Høiland-Jørgensen	dd24bce55c	Merge pull request #32 from simosund/pping_improve_capabilities PPing core improvements	2022-02-10 18:20:18 +01:00
Simon Sundberg	b8215e70b2	pping: Add section on potential issues to TODO-list Collect potential issues under a new section in the TODO list. These are issues I generally don't think are that severe, but may still be useful to note down and keep in mind. Move the section on potential concurrency issues from README to the new section in the TODO-list. Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>	2022-02-10 16:17:10 +01:00
Simon Sundberg	2647429081	pping: Add warnings for failing to create map entry Send a warning notifying the user that PPing failed to create a flow/timestamp entry due to the corresponding map being full. To avoid sending a warning for every packet, only emit warnings every WARN_MAP_FULL_INTERVAL (which is currently hard-coded to 1s). Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>	2022-02-10 16:17:10 +01:00
Simon Sundberg	54886c92d9	pping: Refactor event handling code Refactor code for how events are handled in the user space application. Preparation for adding an additional event type which should not be handled by the normal functions for printing RTT and flow events. Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>	2022-02-10 16:17:10 +01:00
Simon Sundberg	32bdf11a96	pping: Only consider flow opened on reply Wait with sending a flow open message until a reply has been seen for the flow. Likewise, only emit a flow closing event if the flow has first been opened (that is, a reply has been seen). This introduces potential (but unlikely) concurrency issues for flow opening/closing messages which are further described in the README. Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>	2022-02-10 16:17:10 +01:00
Simon Sundberg	8a8f538759	pping: Do both timestamping and matching on ingress and egress Perform both timestamping and matching on both ingress and egress hooks. This makes it more similar to Kathie's pping, allowing the tool to capture RTTs in both directions when deployed on just a single interface. Like Kathie's pping, by default filter out RTTs for packets going to the local machine (will only include local processing delays). This behavior can be disabled by passing the -l/--include-local option. As packets that are timestamped on ingress and matched on egress will include the local machines processing delay, add the "match_on_egress" member to the JSON output that can be used to differentiate between RTTs that include the local processing delay, and those which don't. Finally, report the source and destination addresses from the perspective of the reply packet, rather than the timestamped packet, to be consistent with Kathie's pping. Overall, refactor large parts of pping_kern to allow both timestamping and matching, as well as updating both the flow and reverse flow and handle flow-events related to them, in one go. Also update README to reflect changes. Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>	2022-02-10 16:16:24 +01:00
Simon Sundberg	928a4144a9	pping: Add RTT-based sampling Add an option (-R, --rtt-rate) to adapt the sampling rate based on the RTT of the flow. The sampling interval will be C * RTT, where C is a configurable constant (ex 1.0 to get one sample every RTT), and RTT is either the current minimum (default) or smoothed RTT of the flow (chosen via the -t or --rtt-type option). The smoothed RTT (sRTT) is updated for each calculated RTT, and is calculated in a similar manner to srtt in the kernel's TCP stack. The sRTT is a moving average of all RTTs, and is calculated according to the formula: srtt = 7/8 * prev_srtt + 1/8 * rtt To allow the user to pass a non-integer C (ex 0.1 to get 10 RTT samples for every RTT-period), fixed-point arithmetic has been used in the eBPF programs (due to lack of support for floats). The maximum value for C has been limited to 10000 in order for it to be unlikely that the C * RTT calculation will overflow (with C = 10000, overflow will only occur if RTT > 28 seconds). Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>	2022-02-10 16:11:21 +01:00
Simon Sundberg	c79c4e8571	pping: Eliminate flow creation/deletion concurrency issue Only push flow events for opening/closing flows if the creation/deletion of the flow-state was successful (as indicated by the bpf_map_*_elem() return value). This should avoid outputting several flow creation/deletion messages in case multiple instances are trying to create/delete a flow concurrently, as could theoretically occur previously. Also set the last_timestamp value before creating a new flow, to avoid a race condition where the userspace cleanup might incorrectly determine that a flow is old before the last_timestamp value can be set. Explicitly skip the rate-limit for the first packet of a new flow to avoid it failing the rate-limit. This also fixes an issue where the first packet of a new flow would previously fail the rate-limit if the rate-limit was higher than current time uptime (CLOCK_MONOTONIC). Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>	2022-02-10 16:08:23 +01:00
Simon Sundberg	1cadbe0ae7	pping: Make parsed protocols configurable Add command-line flags for each protocol that pping should attempt to parse and report RTTs for (currently -T/--tcp and -C/--icmp). If no protocol is specified assume TCP. To clarify this, output a message before start on how ePPing has been configured (stating output format, tracked protocols and which interface to run on). Additionally, as the ppviz format was only designed for TCP it does not have any field for which protocol an entry belongs to. Therefore, emit a warning in case the user selects the ppviz format with anything other than TCP. Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>	2022-02-10 16:02:24 +01:00
Simon Sundberg	bd6ded5c21	pping: Add support for ICMP echo messages Allow pping to passively monitor RTT for ICMP echo request/reply flows. Use the echo identifier as ports, and echo sequence as packet identifier. Additionally, add protocol to standard output format in order to be able to distinguish between TCP and ICMP flows. The ppviz format does not include protocol, making it impossible to distinguish between TCP and ICMP traffic. Will add warning if ppviz format is used together with ICMP traffic in the future. Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>	2022-02-03 16:04:14 +01:00
Simon Sundberg	af5e660d8e	pping: Only match TSecr in ACKs The echoed TCP timestamp (TSecr) is only valid if the ACK flag is set. So make sure to only attempt to match on ACK packets. Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>	2022-02-02 10:14:44 +01:00
Jesper Dangaard Brouer	c14f52a4d4	Merge pull request #38 from xdp-project/vestas06_tc_qdisc TC policy example of overriding netstack TXQ	2022-02-01 15:35:35 +01:00
Jesper Dangaard Brouer	91432fe471	tc-policy: Update README with info on inspecting loaded BPF-progs Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>	2022-02-01 15:30:17 +01:00
Jesper Dangaard Brouer	71d1479d1d	tc-policy: Add more advanced bpftrace script Digging into the return value of netdev_pick_tx(). Want to be able to debug the case where a socket selects another queue_id. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>	2022-02-01 14:49:47 +01:00
Jesper Dangaard Brouer	ba2c114db0	tc-policy: Take into account locally generated traffic The BPF-prog "not_txq_zero" also needed to take into account that skb->queue_mapping usually isn't set for locally generated traffic. I worry that sockets can set another queue id that could override our (BPF choice) in netdev_pick_tx(). See sk_tx_queue_set() and sk_tx_queue_get(). Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>	2022-02-01 13:28:20 +01:00
Jesper Dangaard Brouer	a3af1bb99c	tc-policy: Implement new BPF section that disallow using TXQ zero Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>	2022-02-01 13:11:31 +01:00
Jesper Dangaard Brouer	741205e13c	tc-policy: Mention XPS need to be disabled Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>	2022-02-01 12:40:14 +01:00
Jesper Dangaard Brouer	f33789e4c6	tc-policy: rename xps_setup.sh to xps_setup_ash.sh This version of the XPS script has been modified to work with the shell ash, as bash was not available on the Yocto target host. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>	2022-02-01 12:33:49 +01:00
Jesper Dangaard Brouer	cf5ecd6999	tc-policy: Adjust README for GitHub rendering The bpftrace oneliner was getting rendered wrong in GitHub html view. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>	2022-02-01 12:16:19 +01:00
Jesper Dangaard Brouer	e72d309789	tc-policy: Monitor TXQ usage with bpftrace script Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>	2022-02-01 11:56:32 +01:00
Toke Høiland-Jørgensen	007c0b6883	Merge pull request #37 from simosund/pping_graceful_shutdown pping: Add graceful shutdown on SIGTERM	2022-02-01 00:12:26 +01:00
Jesper Dangaard Brouer	520d2e6109	tc-policy: Add README documentation Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>	2022-01-31 21:40:18 +01:00
Simon Sundberg	de5d65030c	pping: Add graceful shutdown on SIGTERM Also intercept SIGTERM (in addition to the previously intercepted SIGINT) and perform graceful shutdown. Perhaps it also makes sense to perform graceful shutdown on some additional signals, like SIGHUP and SIGQUIT? Signed-off-by: Simon Sundberg <simon.sundberg@kau.se>	2022-01-31 18:47:08 +01:00
Jesper Dangaard Brouer	1bc9b91e3f	tc-policy: Verbose info when flushing TC-BPF programs Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>	2022-01-31 18:29:03 +01:00
Jesper Dangaard Brouer	c45bdfddfa	tc-policy: No need to call bpf_tc_query after bpf_tc_attach The attach_egress.prog_id has already been provided after calling bpf_tc_attach. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>	2022-01-31 18:25:32 +01:00
Jesper Dangaard Brouer	a00ed96ed6	tc-policy: Be more verbose, but add --quiet option Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>	2022-01-31 17:29:52 +01:00
Jesper Dangaard Brouer	78515fdd2e	tc-policy: Now --unload only removes our own TC-BPF prog Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>	2022-01-31 16:43:01 +01:00
Jesper Dangaard Brouer	665f26e06b	tc-policy: Make it clearer that destroying the hook kills all progs Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>	2022-01-31 15:55:06 +01:00
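
Note on commit 928a4144a9 (pping: Add RTT-based sampling): the message gives the smoothing formula srtt = 7/8 * prev_srtt + 1/8 * rtt and says C * RTT is computed with fixed-point arithmetic because eBPF lacks floats. The following is a minimal sketch of that idea, not the code from the commit. All names are hypothetical, seeding the average with the first sample is an assumption, and the 16 fractional bits are an assumption chosen because they match the quoted overflow bound (10000 * 2^16 * 28e9 ns is roughly 2^64).

```c
#include <stdint.h>

/* Assumed number of fractional bits; not confirmed by the commit message. */
#define FIXPOINT_SHIFT 16

/* Convert a non-negative rate constant C (e.g. 0.5) to fixed point.
 * Intended for user space, where floating point is available. */
static uint64_t rate_to_fixpoint(double c)
{
	return (uint64_t)(c * (double)(1ULL << FIXPOINT_SHIFT) + 0.5);
}

/* Integer-only form of srtt = 7/8 * prev_srtt + 1/8 * rtt.
 * Assumption: a zero prev_srtt means "no sample yet", so the first
 * RTT seeds the average directly. */
static uint64_t update_srtt(uint64_t prev_srtt, uint64_t rtt)
{
	if (prev_srtt == 0)
		return rtt;
	return prev_srtt - (prev_srtt >> 3) + (rtt >> 3);
}

/* Minimum time between two samples for a flow: C * RTT, in nanoseconds.
 * The product is shifted back down to drop the fractional bits. */
static uint64_t sample_interval_ns(uint64_t c_fixpoint, uint64_t rtt_ns)
{
	return (c_fixpoint * rtt_ns) >> FIXPOINT_SHIFT;
}
```

With C = 0.5 and an RTT of 1 ms, sample_interval_ns() yields 500000 ns, i.e. two samples per RTT period.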

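Note on commit 4fb1f6de64 (pping: Combine flow state in each direction to a dualflow state): the message describes keying the map on a "sorted" flow tuple, with the smaller (ip, port) pair first, so that both directions of a flow find the same entry, and using a hand-written memcmp for the comparison. A rough sketch of that scheme follows. The struct layouts and function names are hypothetical simplifications, not the definitions used in pping.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified address/port pair. */
struct endpoint {
	uint8_t ip[16];
	uint16_t port;
};

/* Hypothetical map key: the smaller endpoint is always stored first. */
struct flow_tuple {
	struct endpoint a;
	struct endpoint b;
};

/* Byte-wise comparison with a bounded loop, standing in for memcmp. */
static int simple_memcmp(const void *p1, const void *p2, uint32_t n)
{
	const uint8_t *x = p1, *y = p2;

	for (uint32_t i = 0; i < n; i++) {
		if (x[i] != y[i])
			return x[i] < y[i] ? -1 : 1;
	}
	return 0;
}

/* Order endpoints by address first, then by port. */
static bool endpoint_less(const struct endpoint *l, const struct endpoint *r)
{
	int c = simple_memcmp(l->ip, r->ip, sizeof(l->ip));

	if (c != 0)
		return c < 0;
	return l->port < r->port;
}

/* Fill in the direction-independent key. Returns true if the packet
 * already travels in the sorted direction (would map to dir1) and
 * false if the endpoints were swapped (would map to dir2). */
static bool make_sorted_key(const struct endpoint *src,
			    const struct endpoint *dst,
			    struct flow_tuple *key)
{
	bool sorted = !endpoint_less(dst, src);

	key->a = sorted ? *src : *dst;
	key->b = sorted ? *dst : *src;
	return sorted;
}
```

Because the key is identical for both directions, the return value alone is enough to pick which half of the dual flow state belongs to the current packet.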

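Note on commit 11f1d88742 (pping: Keep track of outstanding timestamps): the message describes a 32-bit counter of unmatched timestamp entries that is updated atomically and checked before doing a hash map lookup. The sketch below shows the pattern in plain user-space C with the GCC/Clang __sync builtins. The struct and function names are hypothetical, and the real BPF code may differ in how it decrements and reads the counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced flow state holding only the counter. */
struct flow_state {
	uint32_t outstanding_timestamps;
};

/* Call after a timestamp entry has been stored for the flow. */
static void timestamp_stored(struct flow_state *f)
{
	__sync_fetch_and_add(&f->outstanding_timestamps, 1);
}

/* Call after a timestamp entry was matched or cleaned up.
 * Adding (uint32_t)-1 wraps around and so subtracts one. */
static void timestamp_removed(struct flow_state *f)
{
	__sync_fetch_and_add(&f->outstanding_timestamps, (uint32_t)-1);
}

/* Skip the timestamp map lookup when nothing can possibly match. */
static bool should_lookup_timestamp(const struct flow_state *f)
{
	return f->outstanding_timestamps > 0;
}
```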
470 Commits