67 Commits

Author SHA1 Message Date
Ties de Kock b93d7a8477 Merge pull request #117 from cjeker/avoid_empty_deltas
Avoid adding empty deltas to the cache
2024-03-01 09:36:03 +01:00
Claudio Jeker a1c36d6450 Minor cleanup as suggested by @ties 2024-03-01 09:30:33 +01:00
Claudio Jeker 277cd1e584 Return true or false from AddData().
Returns false and aborts the data addition if there is no change in the delta.
Use this to shortcut applyUpdateFromNewState() which prevents sending out
notifications for empty deltas.
2024-02-29 14:20:58 +01:00
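A minimal sketch of the shortcut described in this commit, assuming a simplified cache type. The names `cache`, `addData` and `applyUpdate` are hypothetical stand-ins, not the project's actual code; the point is only that a boolean result lets the caller skip the notification step.

```go
package main

import "fmt"

// cache is a hypothetical stand-in for the server state; only the serial
// and a notification counter matter for this sketch.
type cache struct {
	serial   uint32
	notified int
}

// addData reports whether the delta changed anything. An empty delta
// leaves the serial untouched and returns false.
func (c *cache) addData(added, removed []string) bool {
	if len(added) == 0 && len(removed) == 0 {
		return false
	}
	c.serial++
	return true
}

// applyUpdate notifies clients only when addData reports a real change.
func (c *cache) applyUpdate(added, removed []string) {
	if !c.addData(added, removed) {
		return // empty delta: no new serial, no notification
	}
	c.notified++
}

func main() {
	c := &cache{}
	c.applyUpdate(nil, nil)
	c.applyUpdate([]string{"192.0.2.0/24"}, nil)
	fmt.Println(c.serial, c.notified)
}
```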
Job Snijders 7bec069a27 Merge pull request #112 from bgp/nodelay-switch
Add switch to disable TCP_NODELAY
2024-02-09 11:41:45 +00:00
Job Snijders 2dfc26e226 Merge pull request #107 from cjeker/naming_is_hard
Naming is hard
2023-12-21 16:21:03 +01:00
Claudio Jeker d7c85e3a24 Use netip.MustParsePrefix() instead of some ugly kludge 2023-12-21 15:45:52 +01:00
Claudio Jeker d1f8fb2b00 Another badly named function. CountVRPs does not return the number
of VRPs but the number of objects in the cache.
2023-12-21 14:22:09 +01:00
Claudio Jeker 2f8e5b6d2b Kill manualserial, not used 2023-12-21 14:20:43 +01:00
Claudio Jeker 2f252df7f9 More consistent naming, do not refer to the cache as just VRPs there
is more in it now.
2023-12-21 14:20:22 +01:00
Claudio Jeker a5294449cf Try to bring more consistency in naming things.
The toplevel JSON object is now RPKIList and it contains VRPJson, VAPJson
and BgpSecKeyJson objects.
2023-12-21 14:18:47 +01:00
Claudio Jeker 477b3148c4 Don't use vrp for SendableData elements. Either use 'sd' or 'item'
depending if it is the array of SD or a single item.
2023-12-21 14:18:09 +01:00
Claudio Jeker d48fbc79f4 Remove the AFI dependency for ASPA
The AFI was removed from the ASPA profile so don't expect it anymore.
RTR still uses an older version of the ASPA profile, so for RTR we just
duplicate the object, once for IPv4 and once for IPv6. At some point
SIDROPS may finally fix this, but for now this allows exporting ASPA
objects that follow the rpki-client JSON (which no longer has the
AFI in the ASPA table).
2023-12-21 11:17:27 +01:00
Jeremiah Millay 236c4bf0dd Optimize VRP struct sizes. Use netip.Prefix instead of net.IPNet. 2023-12-18 10:50:26 -05:00
Ben Cartwright-Cox 5bd081b90b Add switch to disable TCP_NODELAY
This is an attempt to quell CPU usage, since PCCW is having issues when
their {N} number of routers all start RTR sessions at the same time.

A `perf` of the system suggests all of the CPU is going to sending
TCP traffic, and since golang enables TCP_NODELAY (infamously?) by
default, this is a good smoking gun, since this would imply that
stayrtr is sending 1 TCP packet per RTR PDU, something that would
indeed cause a lot of CPU usage in aggregate!
2023-10-13 12:18:51 +01:00
Ben Cartwright-Cox 62f5952776 Fix lock/slow sending due to a lock "moshpit"
Instead we now sort while processing, a much much safer place to
do it!
2023-03-01 14:15:18 +00:00
Ben Cartwright-Cox 28752753e0 Harden ^b2a79528c5d221f46bdd766ce9c448714f3b62d5
It appears that the sorting function can be prone to data races.
This commit puts a lock on that.

Tag: https://github.com/bgp/stayrtr/issues/92
2023-02-27 16:03:03 +00:00
Ben Cartwright-Cox b2a79528c5 Fix possible crash from ROA PDU Race Minimization logic
Should fix the issue at the bottom

Tag https://github.com/bgp/stayrtr/issues/92
2023-02-27 15:17:53 +00:00
Ben Cartwright-Cox 8a3a71e045 Ensure error PDUs are sent before the TCP socket closes
This was introduced in 6e4c533e8a, since I did not expect the PDU
sending channel to be _slightly_ slower than just yeeting the socket
closed instantly.

Regardless, TCP disconnection now happens when the sendloop is dead,
which should allow Error PDUs etc. to be sent out before the socket closes.

Tag: https://github.com/bgp/stayrtr/issues/90
2023-02-24 14:13:45 +00:00
Ben Cartwright-Cox fa548afcaf Rename BSKs (BGPsecKey) to BRKs to align with rpki-client
And rename ASPA stuff to VAPs
2023-02-23 17:01:12 +00:00
Ben Cartwright-Cox 3555d81035 Add -disable.aspa flag
Just in case!
2023-02-23 12:21:27 +00:00
Ben Cartwright-Cox e98648f8b2 Implement draft-ietf-sidrops-8210bis-10 ROA PDU Race Minimization
It will now sort entries before they go out, Sorted by:

Largest CIDR > Largest Max Length > IP address
2023-02-22 17:36:06 +00:00
Ben Cartwright-Cox 187410d9b6 Implement ASPA as defined in draft-ietf-sidrops-8210bis-10
Tag: https://github.com/bgp/stayrtr/issues/79
2023-02-22 17:18:46 +00:00
Ben Cartwright-Cox 3b73956a9c Add PDU encode/decode support for ASPA 2023-02-22 15:17:26 +00:00
Ben Cartwright-Cox 539a99d76c More cleanup of unused functions and/or struct contents 2023-02-21 22:16:26 +00:00
Ben Cartwright-Cox caad9d419f Cleanup unused struct fields 2023-02-21 22:14:20 +00:00
Ben Cartwright-Cox 6e4c533e8a Fix client.sendLoop possibly leaking/CPU burning
Basically, until now it was not possible for it to actually
exit since the break only breaks the for loop it's connected in
2023-02-21 22:11:43 +00:00
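The general Go behaviour behind this kind of bug: a bare `break` leaves only the innermost `for`, `switch` or `select`, so an enclosing loop keeps running. The sketch below assumes a `select` inside a `for`, which may not match the original code's shape; `sendLoop` here is a hypothetical simplification that uses a labeled break to exit.

```go
package main

import "fmt"

// sendLoop counts PDUs received until quit is closed, then returns.
func sendLoop(pdus <-chan string, quit <-chan struct{}) int {
	sent := 0
loop:
	for {
		select {
		case <-quit:
			// A bare "break" here would leave only the select, and the
			// for loop would spin forever. The label exits the loop.
			break loop
		case <-pdus:
			sent++
		}
	}
	return sent
}

func main() {
	pdus := make(chan string)
	quit := make(chan struct{})
	done := make(chan int)
	go func() { done <- sendLoop(pdus, quit) }()

	pdus <- "pdu-1"
	pdus <- "pdu-2"
	close(quit)
	fmt.Println("sent:", <-done)
}
```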
Ben Cartwright-Cox 513bda0e5f Implement BGPsec support
This imports and exports BGPsec router key data, and exports router
key data out over RTR to supporting clients (any version higher than 1)

Since it's obvious that at some point there will be clients that will
have issues seeing a RouterKey PDU for the first time ever, I've
included a -disable.bgpsec flag to prevent them from being sent.

That way if someone is caught off guard during an upgrade, they can
disable it and keep upgrading.

Tag: https://github.com/bgp/stayrtr/issues/57
2023-02-21 21:55:50 +00:00
Ben Cartwright-Cox b08f5383ac Convert lib.VRP to *lib.VRP
This allows the previous commit to be fully effective.

Since some tests showed potential for a nasty set of pointer
edge cases to appear, I will be running rtrmon between this and
a known "okay" version for a few hours to confirm I have not broken
anything.
2023-02-21 21:15:13 +00:00
Ben Cartwright-Cox 925ac75c42 Move all []VRP's to []SendableData in prep to support non VRP things
This does a bunch of work (and it's not fully done, since VRP needs
to be converted into *VRP across the codebase to ensure that SetFlag()
works) to let what was the VRPManager diffing/dispatch system support
things that are not VRPs. We need to do this since we are looking
to support BGPsec Router Keys and ASPA objects soon. And a previous
attempt to write such support resulted in an unacceptable amount of
duplicate code.

Doing it this way will also make it a lot easier to extend StayRTR
to support whatever is after ASPA.
2023-02-21 20:40:00 +00:00
Ben Cartwright-Cox a9d36b4707 Fix BGPsec ROUTER_KEY encoding/decoding
Also add a test to ensure it keeps decoding correctly
2023-02-21 19:52:36 +00:00
Job Snijders bd5a54d54d Always automatically generate a RTR Session ID 2023-02-06 11:10:07 +00:00
Job Snijders d5be6983b5 Bugfix: don't echo the router's session_id back to the router, instead report an error
Previously StayRTR would copy the client's Session ID back into the Cache
Response sent to the router, even though the cache's internal Session ID
was something different.

The purpose of the Session ID is to help both router and cache understand
whether they are synchronized or not. There are two opportunities to fix
desyncs: if the cache recognises the router is desynced, the cache informs
the router (through an Error Report) to reconnect and send a Reset Query.
If the router recognises it is out of sync with the cache, the router can
send a Reset Query.

According to RFC 8210 section 5.1 the cache should send "Corrupt Data" when
a router sends a Serial Query with an unknown Session ID:

```
  Session ID:  A 16-bit unsigned integer.  When a cache server is
    started, it generates a Session ID to identify the instance of the
    cache and to bind it to the sequence of Serial Numbers that cache
    instance will generate.  This allows the router to restart a
    failed session knowing that the Serial Number it is using is
    commensurate with that of the cache.  If, at any time after the
    protocol version has been negotiated (Section 7), either the
    router or the cache finds that the value of the Session ID is not
    the same as the other's, the party which detects the mismatch MUST
    immediately terminate the session with an Error Report PDU with
    code 0 ("Corrupt Data"), and the router MUST flush all data
    learned from that cache.
```

Reformat with gofmt from Ties
2023-02-03 21:37:30 +00:00
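A minimal sketch of the check the quoted RFC text calls for: compare the router's Session ID with the cache's own and, on mismatch, answer with Error Report code 0 ("Corrupt Data"). The function and constant names are hypothetical; only the error code comes from the RFC text above.

```go
package main

import "fmt"

// errCorruptData is Error Report code 0 ("Corrupt Data") from RFC 8210.
const errCorruptData uint16 = 0

// sessionError reports whether the router's Session ID differs from the
// cache's, and if so which error code to send instead of a Cache Response.
func sessionError(cacheID, routerID uint16) (code uint16, mismatch bool) {
	if cacheID != routerID {
		return errCorruptData, true
	}
	return 0, false
}

func main() {
	if code, bad := sessionError(4242, 17); bad {
		fmt.Println("send Error Report, code", code)
	}
	if _, bad := sessionError(4242, 4242); !bad {
		fmt.Println("session IDs match, answer the Serial Query")
	}
}
```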
Ben Cartwright-Cox 13659dd27e Filter VRPs if they have expired. Prevent stale JSON files from lingering
First, VRPs that have expiry times are now checked, and they are
filtered out at import time.

Second, if a VRP JSON file is too old, and the "current state"
(in the case of an update) is too old, the state will be emptied to
avoid routing on old data.

Third, every time a refresh cycle happens, the file is now reprocessed
to check for expiry. If the VRPs resulting from that processing change,
a new update+serial is pushed.

Tag: https://github.com/bgp/stayrtr/issues/15
2023-01-24 17:50:15 +00:00
Ben Cartwright-Cox 13186622bd Improve internal error messaging to match standard convention 2023-01-19 12:17:23 +00:00
Ben Cartwright-Cox 15503e8347 Use IP.Equal rather than bytes.Compare
IP.Equal handles some edge cases inside how IP addresses are represented
rather than just flat out comparing some byte arrays blindly.
2023-01-19 12:15:41 +00:00
Ben Cartwright-Cox 029060a6a1 Replace redundant errors.New(fmt.Sprintf( with fmt.Errorf(
They serve the same function, but it's more understandable what
is going on. go-static-check raises this as a warning
2023-01-19 12:11:02 +00:00
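The two forms side by side, as a small illustration; the message text and function names are made up for this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// oldStyle builds the message with Sprintf and wraps it with errors.New.
func oldStyle(n int) error {
	return errors.New(fmt.Sprintf("invalid PDU length %d", n))
}

// newStyle produces the same error text in a single call.
func newStyle(n int) error {
	return fmt.Errorf("invalid PDU length %d", n)
}

func main() {
	fmt.Println(oldStyle(3))
	fmt.Println(newStyle(3))
}
```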
Ben Castricum 4fef7114a3 Revert defer unlock in StayRTR AddVRPs
vrplock needs to be unlocked before AddVRPsDiff() because AddVRPsDiff needs a full lock.

I added some debug logging and found this deadlock:

INFO[0000] new cache file: Updating sha256 hash  -> da753c7804d6f386bf303fed6931853eaaca0771ba160ef7fdbebb17e899d78b
INFO[0001] New update (306189 uniques, 306189 total prefixes).
INFO[0001] RLocking vrplock in AddVRPs
INFO[0002] RLocking vrplock in AddVRPsDiff
INFO[0002] RUnlocked vrplock in AddVRPsDiff
INFO[0002] Locking vrplock in AddVRPsDiff
...
2022-01-26 11:20:46 +01:00
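A sketch of the locking pattern behind this revert, using `sync.RWMutex`. The `store` type and its methods are hypothetical simplifications of the functions named above: the outer function must release its read lock explicitly before calling a function that takes the write lock, because a deferred `RUnlock` would still be pending when `Lock` is called.

```go
package main

import (
	"fmt"
	"sync"
)

type store struct {
	mu   sync.RWMutex
	vrps []string
}

// addDiff takes the write lock, so callers must not hold mu in any mode.
func (s *store) addDiff(added []string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.vrps = append(s.vrps, added...)
}

// add computes the diff under the read lock and releases it explicitly
// before calling addDiff. With "defer s.mu.RUnlock()" the read lock would
// still be held when addDiff calls Lock, and Lock would wait forever.
func (s *store) add(next []string) int {
	s.mu.RLock()
	known := make(map[string]bool, len(s.vrps))
	for _, v := range s.vrps {
		known[v] = true
	}
	s.mu.RUnlock()

	var added []string
	for _, v := range next {
		if !known[v] {
			known[v] = true
			added = append(added, v)
		}
	}
	s.addDiff(added)
	return len(added)
}

func main() {
	s := &store{}
	fmt.Println(s.add([]string{"192.0.2.0/24", "198.51.100.0/24"}))
	fmt.Println(s.add([]string{"198.51.100.0/24", "203.0.113.0/24"}))
}
```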
Darren O'Connor 3726782f68 Use defers for locks 2021-10-30 09:52:55 -04:00
Darren O'Connor 91228f65e3 remove unused 2021-10-27 20:59:40 -04:00
Ties de Kock 041a1c52f3 Remove ineffectual assign 2021-10-25 20:21:07 +02:00
Darren O'Connor 968c0d5db1 Move to TDD for clients 2021-10-24 19:30:55 -04:00
Darren O'Connor fe8a0f4632 initial client test set up 2021-10-24 15:26:47 -04:00
Darren O'Connor e4acfc2178 ifs to switch 2021-10-24 10:46:03 -04:00
Darren O'Connor a72ccbe4ad More cleanup 2021-10-24 10:37:42 -04:00
Job Snijders 0bbe564d58 Correct terminology helps communicate more clearly about what is happening
Validators (such as rpki-client) ingest ROAs and emit Validated ROA Payloads (VRPs).
RTR servers exclusively deal with ingesting VRPs and emitting VRPs via RTR.
2021-05-08 15:26:08 +00:00
Mathilde Gilles d6cb793104 Fix: unbounded alloc and slice out of bounds crashes
In rtrlib.Decode():
* Now check the message length is not greater than a hardcoded limit
(2048) to prevent unbounded memory allocations
* Fix a few unchecked slice accesses that could result in crashes with the
right payload in the PDU_ID_ERROR_REPORT case.
2020-07-22 00:56:58 +02:00
John Bampton fea0197495 Fix spelling 2020-07-02 17:36:26 +10:00
Louis 107c06a4d6 More debug options on GoRTR 2020-06-05 17:58:58 -07:00
Louis Poinsignon 60070fffdb Protection against "too many open files"
* Raised in #65, if the server does not have enough sockets, Accept returns error
* Due to a bug, it was causing `invalid memory address or nil pointer dereference` if no other limit was specified
* Issue was triggered around 1024 concurrent sessions on out of box Linux (check `ulimit -a | grep "open files"`)
2020-05-18 14:51:28 -07:00
lspgn fb7be39c6a Merge pull request #56 from cloudflare/feature/json-serials
Serial control
2020-03-30 13:36:05 -07:00