2021-06-17 17:02:08 +02:00
|
|
|
* DSCP-preserving TC filters
|
|
|
|
|
|
|
|
This example shows how to use BPF to preserve DSCP values across an
|
|
|
|
encapsulating interface such as Wireguard. It relies on the encapsulation layer
|
|
|
|
preserving the skb->hash value across the encapsulation, which is commonly the
|
|
|
|
case on kernel encapsulation protocols (including Wireguard).
|
|
|
|
|
2021-06-18 01:30:36 +02:00
|
|
|
*PRESERVING DSCP MARKS ACROSS ENCAPSULATION LEAKS DATA! DON'T DO THIS IF YOUR
|
|
|
|
TUNNEL GOES ACROSS THE PUBLIC INTERNET!*
|
|
|
|
|
2021-06-17 17:02:08 +02:00
|
|
|
The example contains two filters: one that parses the packet header and reads
|
|
|
|
the DSCP value out of the packet, then stores that value in a map keyed on the
|
|
|
|
skb->hash value (which is calculated if it isn't already set). And the second TC
|
|
|
|
filter reads back the skb->hash value, looks it up in the map, and if found
|
|
|
|
rewrites the packet DSCP based on that value.
|
|
|
|
|
|
|
|
The idea is that the first filter is run on the internal (encapsulating)
|
|
|
|
interface, and the second is run on the physical interface that transmits the
|
|
|
|
encapsulated packet. To install the filters, run the userspace component like:
|
|
|
|
|
|
|
|
=sudo ./preserve-dscp <ifname pre> <ifname post>=
|
|
|
|
|
|
|
|
To unload the filters again, run:
|
|
|
|
=sudo ./preserve-dscp <ifname pre> <ifname post> --unload=
|
|
|
|
|
|
|
|
Note that unloading will remove the clsact qdisc from the interfaces entirely,
|
|
|
|
so don't run this if you want to preserve that; instead manually remove the
|
|
|
|
filters using =tc=.
|
|
|
|
|
|
|
|
** Caveats
|
|
|
|
There are a couple of caveats to this approach:
|
|
|
|
|
2021-06-18 01:30:36 +02:00
|
|
|
- As mentioned above, doing this in the first place *LEAKS DATA*! I.e., it makes
|
|
|
|
it possible for an outside observer to distinguish between different types of
|
|
|
|
traffic inside the tunnel. This is generally a bad idea, especially if the
|
|
|
|
traffic goes across the public internet.
|
|
|
|
|
|
|
|
- This only works for encapsulation protocols that preserve the SKB hash in the
|
|
|
|
first place.
|
2021-06-17 17:02:08 +02:00
|
|
|
|
|
|
|
- The userspace program will try to detect if the =pre= interface has an
|
|
|
|
Ethernet header by checking if the interface has a type of =ARPHRD_NONE=, and
|
|
|
|
if so will assume the packet starts with the IP header. If this heuristic
|
|
|
|
turns out to be wrong, the filter will fail.
|
|
|
|
|
|
|
|
- There is no sanity checking on the outer filter that the packets actually
|
|
|
|
come from the interface that we ran the =pre= filter on in the first place;
|
|
|
|
there is no general way to check this from BPF, but the =write_dscp= filter can
|
|
|
|
be amended to do some other sanity checks on the packet before modifying it
|
|
|
|
(such as checking port numbers).
|
|
|
|
|
|
|
|
- Since this relies on =skb->hash=, it is flow-based; if individual packets in
|
|
|
|
the same flow have different marks, which ones will be preserved is racy.
|
|
|
|
|
|
|
|
|