Do-it-yourself automation for BGP anycast (introducing one next-hop to rule them all)
If you don't know where you're going, anycast can take you there
-- Alice in Wonderland by Lewis Carroll (paraphrased)
Ivan was at it again last week, this time trying to solve gnarly multi-pathing issues in BGP route reflector constituencies. Using his evolving Netsim Tools to trick a bunch of virtual Cisco IOS nodes into sending a few extra paths based on RFC7911 (Add path), he eventually did get the job done (of course). But at what cost?
If you can get them, IOSv nodes run as VMs that require a minimum of 3GB of RAM each (or something in that range; if I am wrong, let me know). Most laptops only have 8GB these days, so most people wouldn't be able to try out the 7-node topology illustrated above, at least not easily.
In another recent thread, Ivan explored the use of anycast in MPLS networks, and just today he followed that up with Building a BGP Anycast lab. The idea of using anycast in networks is far from new, but most use cases seem to revolve around the use of anycast towards "external" clients of the network. In this blog, I will be exploring the use of anycast for BGP next hops.
Anycast for BGP next hops
Everything has already been thought of and written, but not by me, and not for you. Unsurprisingly, even the idea of using anycast inside the network is a well worn path: See for example this section of the Advanced Solutions Guide for Nokia 7750 SROS release 9.0R1, from 2011 (part of Alcatel-Lucent at the time).
In the context of the problem at hand, the idea is illustrated above: Instead of having the route reflector announce multiple paths by enabling "add path", we "summarize" the path by introducing anycast loopbacks on the edge nodes c and d. That way, the route reflector can happily keep on sending out its single best path to y, with no need to introduce a maintenance window to negotiate a new capability. Moreover, the number of paths advertised towards peers doesn't change, so there is no risk of running out of resources. Some people even suggest that "Add Path" opens up new possibilities for DDoS attacks; that may or may not be the case, but hopefully you can agree with me that this looks like a good alternative worthy of further exploration.
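To make the resolution step concrete, here is a minimal Python sketch of how each node could resolve a single anycast next hop to its nearest edge node. The link layout, the anycast address `10.0.0.100` and the helper names are assumptions made up for illustration (the lab's actual wiring may differ); only the node names and the `10.42.42.0/24` prefix come from the text.

```python
from collections import deque

# Hypothetical topology (the real lab wiring may differ): undirected links.
LINKS = [("a", "c"), ("a", "b"), ("b", "d"), ("c", "y"), ("d", "y")]
# Assumed anycast loopback address, configured on both edge nodes c and d.
ANYCAST_IP = "10.0.0.100"
ANYCAST_HOLDERS = {"c", "d"}


def neighbors(node):
    """Return the nodes directly connected to `node`, in sorted order."""
    return sorted({v for u, v in LINKS if u == node} | {u for u, v in LINKS if v == node})


def nearest_holder(start, holders):
    """Breadth-first search: return (holder, hop_count) for the closest anycast holder."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if node in holders:
            return node, hops
        for nxt in neighbors(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None, None


# The route reflector advertises ONE path: prefix -> anycast next hop.
rr_best_path = {"10.42.42.0/24": ANYCAST_IP}

for node in ("a", "b"):
    holder, hops = nearest_holder(node, ANYCAST_HOLDERS)
    next_hop = rr_best_path["10.42.42.0/24"]
    print(f"{node}: 10.42.42.0/24 via {next_hop}, exits at {holder} ({hops} hop(s))")
```

Both a and b receive the same next hop from the route reflector, yet each one exits via its own closest edge node, because the underlay decides which node the anycast address resolves to. The sketch counts hops only; a real IGP would use link metrics and could load-balance across equal-cost paths.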
In Ivan's approach each iBGP route reflector client already uses "next-hop-self" to change the next hop to its local loopback address (instead of an interface link address, which may not be reachable to other nodes). From there it is but a small step to use an alternative IP address:
SR Linux does support "next-hop-self", but it does not (yet) allow us to set the next hop to an arbitrary IP address (given that this is not commonly used in data center contexts). Not to worry: this gives us an excuse to introduce the Nokia Virtual Service Router (VSR) into the mix. Fully integrated with Containerlab using VRnetlab, this puts us in a position to lab this out.
Update 2021-12-15: For those interested, here you can find a more elaborate comparison of Add Path versus anycast next hops, built with Netsim-Tools.
Containerlab for the win
Without further ado: https://github.com/jbemmel/srl-mpls-iot/tree/main/labs/sr-bgp-anycast contains the complete lab setup plus some detailed notes on how to set things up. Note that the 2 VSR nodes do require a license to run (not included).
Similar to Netsim Tools, Containerlab offers a template-based config system to create and apply custom configurations. For this exercise I have created (or rather mostly copy&pasted, really) 2 templates, one for each NOS platform:
- SRL nodes (a,b,m,y,rr)
- SROS nodes (c,d)
In this case I opted to use BGP both as an IGP (eBGP on each link) and towards the route reflector (iBGP), mostly because the original templates already had that figured out. One could use IS-IS or OSPF or even static routes; it doesn't really change much for the purposes of this discussion.
BGP route policies
All nodes are configured with the following export policies:
- Towards eBGP peers ("underlay"): exchange only /32 system IPs and anycast loopbacks, both from the local node and any neighbors
- Towards the Route Reflector: only export external routes (modeled as /24 prefixes)
This way, anycast routes follow the optimal paths through the network, and route tables are reduced to a minimum.
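The two export rules can be approximated as simple prefix-length filters. The following Python sketch only illustrates that logic; the function names and the sample addresses (other than `10.42.42.0/24`) are hypothetical, and the real policies live in the SR Linux and SR OS templates in the lab repository.

```python
import ipaddress


def export_to_underlay(prefix):
    """eBGP underlay: only /32 host routes (system IPs and anycast loopbacks)."""
    return ipaddress.ip_network(prefix).prefixlen == 32


def export_to_rr(prefix):
    """iBGP towards the route reflector: only external routes, modeled as /24s."""
    return ipaddress.ip_network(prefix).prefixlen == 24


# Hypothetical local route table; only 10.42.42.0/24 is taken from the text.
routes = [
    "10.0.0.3/32",     # assumed system IP
    "10.0.0.100/32",   # assumed anycast loopback
    "10.42.42.0/24",   # external prefix advertised by y
    "192.168.0.0/31",  # assumed local peering link, never exported
]

underlay = [r for r in routes if export_to_underlay(r)]
to_rr = [r for r in routes if export_to_rr(r)]

print("underlay export:", underlay)
print("RR export:      ", to_rr)
```

Matching on prefix length alone is a simplification that works here because the lab uses distinct lengths for each route type; a production policy would more likely match on prefix lists or communities.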
As illustrated above, each node has a route to the system IP of each other node, and a few local /31 peering prefixes/interfaces. The route of interest is the "anycast" route to 10.42.42.0/24 (advertised by y).
Rewriting the AS path to include a special "anycast" value (64999) isn't strictly necessary, but it helps operators identify these "special" routes.
Finally, on node a:
Node a has a shortest path via c:
Q.E.D.
- Optimal forwarding at each point in the network
- Minimal routes exchanged with the Route Reflector; the RR is not aware of local anycast resolution
- Only a local change required at nodes c & d:
+ Create anycast loopback interface (and announce in IGP)
+ Modify policy to rewrite next-hop (from self to anycast IP)
- No maintenance window
And last but not least: There exist open source routing stacks (like FRR) that support setting the next hop to an arbitrary IP. In other words: Nothing stops you from solving this problem yourself, a #truly open solution.
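As a rough illustration of the policy change on c and d, the Python sketch below contrasts next-hop-self with an anycast next hop. It is not the SR OS or FRR configuration; the `Route` class, the addresses, the origin AS 65010 and the choice to prepend 64999 (rather than rewrite the path some other way) are all assumptions for the sake of the example.

```python
from dataclasses import dataclass, replace

ANYCAST_IP = "10.0.0.100"  # assumed anycast loopback shared by c and d
ANYCAST_AS = 64999         # "anycast" marker AS mentioned in the text


@dataclass(frozen=True)
class Route:
    prefix: str
    next_hop: str
    as_path: tuple


def export_next_hop_self(route, system_ip):
    """Classic behaviour: advertise the node's own loopback as next hop."""
    return replace(route, next_hop=system_ip)


def export_next_hop_anycast(route):
    """Modified policy: advertise the shared anycast IP and tag the AS path.

    Prepending the marker AS is an assumption; the text only says the AS path
    is rewritten to include it.
    """
    return replace(route, next_hop=ANYCAST_IP, as_path=(ANYCAST_AS,) + route.as_path)


# Hypothetical route learned from y (AS number and addresses are made up).
learned = Route("10.42.42.0/24", "192.0.2.1", (65010,))

self_c = export_next_hop_self(learned, "10.0.0.3")  # assumed system IP of c
self_d = export_next_hop_self(learned, "10.0.0.4")  # assumed system IP of d
any_c = export_next_hop_anycast(learned)
any_d = export_next_hop_anycast(learned)

print("next-hop-self paths differ:", self_c != self_d)
print("anycast paths identical:  ", any_c == any_d)
```

With next-hop-self the route reflector sees two distinct paths and has to pick one (or send both using Add Path). With the anycast policy both edge nodes announce the same path, so a single best path is enough.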
Next hops
OK, so if optimizing multi-path forwarding is as simple as instantiating an extra loopback interface with an anycast IP on a suitable set of nodes, couldn't we automate that?
Or maybe, instead of changing next hops in export policies on each node, couldn't we simply instantiate only anycast loopback interface(s), and have some "intelligent" Route Reflector take care of optimizing next hops where possible?