Automating Scale-out IP Fabrics Part 2: Simplifying Underlays with BGP Unnumbered

Happy Birthday to Pluribus!

It gives me great pleasure to post this blog on Pluribus Networks’ 10th anniversary. Since its founding, Pluribus has always been focused on simplifying networking, so I can think of no better way to celebrate our innovation journey and our current market momentum than by talking about our latest innovations to make networking simpler.

In my recent blog, Automating Scale-Out IP Fabrics with the Adaptive Cloud Fabric, I focused on how new features in the Adaptive Cloud Fabric simplify building layer 2 (L2) overlay services. (Check out this blog to learn more about how the Adaptive Cloud Fabric enables overlay services using VXLAN tunnels over an L3 underlay fabric.) Today I will touch on a few customer applications for the new L2 overlay services, and then focus on another way Pluribus is simplifying networks: using BGP Unnumbered to build the L3 underlay fabric. All of these capabilities are supported in the latest Netvisor® ONE network operating system release 5.2.0, now generally available.

 

Multipoint Bridge Domains Take Off

The most powerful of the new L2 overlay services are unquestionably multipoint bridge domains with flexible encapsulation, including QinQ. As with all overlay services of the Adaptive Cloud Fabric, creation is extremely simple: a single command creates a fabric-wide bridge domain that stretches L2 connectivity across multiple sites. The automation built into the distributed control plane of the Adaptive Cloud Fabric takes care of the rest.

Every day as we talk to customers we are finding new applications for this powerful capability, such as:

  • Supporting active-active operations across two or more geographically distributed data centers for high application availability.
  • Connecting legacy infrastructure that relies on L2 connections, such as storage or security appliances, into a modern private cloud fabric without needing to change VLAN assignments.
  • Delivering multipoint Carrier Ethernet services, such as E-LAN and E-TREE, with automated one-touch provisioning across any arbitrary underlay transport technology or network topology.

I have no doubt that Pluribus customers will find a wide range of other applications for multipoint bridge domains that we haven’t even considered.

 

Simplifying the Underlay Fabric with BGP Unnumbered

Before a network operator can create these powerful, automated, fabric-wide overlay services like a multipoint bridge domain, of course, there must be underlay network connectivity. In the last blog, I noted that the industry best practice for building scale-out data center fabric architectures is to create a Layer 3 underlay fabric with a multi-stage Clos topology, such as the pod shown in Figure 1.

Figure 1: Data center pod example with four spine and eight leaf switches (Layer 3 underlay fabric)

An increasingly popular approach for building a Layer 3 underlay fabric is to use Border Gateway Protocol (BGP). While originally designed to connect autonomous networks of different operators, BGP has proven very adaptable to many other applications and has been extended to data center leaf-spine networks. It has numerous advantages over other L3 routing protocols, such as IS-IS or OSPF, including simplicity, efficiency, and availability of mature and robust open-source implementations, including FRRouting.

However, since BGP wasn’t originally designed for Clos fabrics in a data center, which are characterized by dense mesh connectivity, it needed a few tweaks in order to work simply and efficiently. Some of these had to do with the routing protocol design itself, including allowing more rapid routing updates, reducing route propagation traffic, and supporting multi-path forwarding. Today we’ll focus on a particularly clever innovation known as BGP Unnumbered. But first, we need to understand what problem it addresses.

 

Why Do We Need BGP Unnumbered?

BGP traditionally requires each neighboring node (peer) to have an IPv4 address in order to exchange IPv4 reachability information. In a leaf-spine topology with a large mesh of ports and links, each interface facing a peer needs an IPv4 address configured, and the total number of IPv4 addresses grows rapidly with network size. Even the simple data center pod from Figure 1, with four spine switches and eight leaf switches, has 4 x 8 = 32 leaf-spine links, and with an address at each end of every link that means 64 IPv4 addresses to be configured. Figure 2 depicts this, with each red dot representing an IPv4 interface address that is configured solely to enable BGP to work.

Figure 2: 64 IPv4 addresses required for four spine and eight leaf switches without BGP Unnumbered

Not only does this consume a lot of IPv4 addresses, but configuring all of those addresses box by box and port by port is time-consuming and error-prone, and can lead to troubleshooting challenges.

As networks grow, this problem compounds, because the address count scales with the number of spines multiplied by the number of leaves. Need to add one leaf cluster (two switches) to the four-spine pod? That’s 2 x 4 x 2 = 16 more IP addresses to be assigned, configured and tracked. Building an 8-spine, 96-leaf fabric? That’s 8 x 96 x 2 = 1536 addresses!
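
The arithmetic above can be checked with a short Python sketch. It assumes one link between every leaf and every spine and one IPv4 address at each end of each link; it ignores loopbacks and any links between leaf cluster members. The function name is purely illustrative.

```python
def fabric_link_addresses(spines: int, leaves: int) -> int:
    """IPv4 addresses needed when every leaf-spine link is numbered.

    Assumes a full mesh between tiers: spines * leaves links,
    with one address on each end of every link.
    """
    return spines * leaves * 2


print(fabric_link_addresses(4, 8))    # pod from Figure 1 -> 64
print(fabric_link_addresses(8, 96))   # larger fabric -> 1536
# Extra addresses when two more leaves join the four-spine pod -> 16
print(fabric_link_addresses(4, 10) - fabric_link_addresses(4, 8))
```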

 

How Does BGP Unnumbered Help?

The architects of BGP Unnumbered recognized that there was another way for BGP peers to exchange reachability information without using IPv4 addresses, and the key is actually IPv6.

The key idea behind BGP Unnumbered is captured in the full title of IETF RFC 5549, Advertising IPv4 Network Layer Reachability Information with an IPv6 Next Hop. IPv6 includes the concept of a link-local address, an IPv6 address that each interface generates automatically and that is valid only on the directly attached link. It also includes router advertisements (RAs), which let neighboring routers automatically discover each other, including (as an option) the neighbor’s MAC address. Even if a network is not using IPv6 for anything else, if the devices in the network are IPv6 capable, they can use these mechanisms for discovery.
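
To make "automatically generated" concrete, here is a minimal Python sketch of one common way an interface can derive its link-local address: the modified EUI-64 method, which builds the address from the port's MAC address. This is an illustration only. Implementations may generate link-local addresses in other ways, and nothing here says which method Netvisor ONE uses; the function name and the sample MAC are made up for the example.

```python
import ipaddress


def link_local_from_mac(mac: str) -> ipaddress.IPv6Address:
    """Derive an fe80::/64 address from a MAC using modified EUI-64."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError(f"expected a 6-octet MAC address, got {mac!r}")
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert ff:fe in the middle of the MAC to form a 64-bit interface ID.
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    prefix = bytes.fromhex("fe80") + bytes(6)  # the fe80::/64 prefix
    return ipaddress.IPv6Address(prefix + interface_id)


# Sample MAC is hypothetical.
print(link_local_from_mac("00:11:22:33:44:55"))  # fe80::211:22ff:fe33:4455
```

No operator input is involved: the address follows from information the port already has, which is why nothing needs to be configured per link.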

RFC 5549 builds on these inherent capabilities by allowing BGP to use an IPv6 address as the “Next-hop” address when advertising IPv4 routes, rather than requiring an IPv4 Next-hop. Then, each router creates a static IPv4 ARP table entry, associating a local IPv4 address with the neighboring router’s MAC address, and uses that local IPv4 address for IPv4 packet forwarding.
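
The following toy Python model sketches how these pieces fit together: a neighbor table learned from router advertisements, IPv4 routes that carry an IPv6 next hop, and a lookup that ends at the neighbor's MAC address. It is a simplification, not a description of any real router's data path. In particular, it collapses the static ARP entry step into a direct lookup, and every class name, port name, prefix and address in it is hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Neighbor:
    link_local: ipaddress.IPv6Address  # learned from the router advertisement
    mac: str                           # link-layer address carried in the RA


@dataclass(frozen=True)
class Route:
    prefix: ipaddress.IPv4Network
    next_hop: ipaddress.IPv6Address    # IPv6 next hop for an IPv4 prefix
    interface: str


def resolve(dest, routes, neighbors):
    """Return (interface, MAC) for an IPv4 destination, or None if no route."""
    addr = ipaddress.IPv4Address(dest)
    matches = [r for r in routes if addr in r.prefix]
    if not matches:
        return None
    best = max(matches, key=lambda r: r.prefix.prefixlen)  # longest match
    neighbor = neighbors[best.interface]
    if neighbor.link_local != best.next_hop:
        raise LookupError("next hop is not the neighbor seen on this port")
    return best.interface, neighbor.mac


spine1 = ipaddress.IPv6Address("fe80::211:22ff:fe33:4455")
neighbors = {"port1": Neighbor(spine1, "00:11:22:33:44:55")}
routes = [Route(ipaddress.IPv4Network("10.1.0.0/16"), spine1, "port1")]

print(resolve("10.1.2.3", routes, neighbors))   # ('port1', '00:11:22:33:44:55')
print(resolve("192.0.2.1", routes, neighbors))  # None
```

Note that no IPv4 address belonging to the link appears anywhere in the tables: the IPv4 prefix is reached through an interface and a MAC address alone.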

The net result of these techniques is that ZERO addresses need to be configured on the fabric links. This dramatically reduces the number of configurations needed and thereby simplifies the task of building and growing BGP-enabled leaf-spine underlay networks.
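
As a rough illustration of what "zero addresses on the fabric links" looks like in a configuration, the Python sketch below renders a BGP stanza in the style of FRRouting, the open-source implementation mentioned earlier, where a peer is named by interface rather than by IP address. This is FRRouting-style syntax as I understand it, not Netvisor ONE CLI, and the AS number, router ID and port names are hypothetical.

```python
def render_bgp_unnumbered(asn, router_id, fabric_ports):
    """Render an FRRouting-style BGP stanza with interface-based peers."""
    lines = [f"router bgp {asn}", f" bgp router-id {router_id}"]
    for port in fabric_ports:
        # The peer is identified by the local port name, not an IP address.
        lines.append(f" neighbor {port} interface remote-as external")
    return "\n".join(lines)


# Hypothetical leaf switch with four spine-facing ports.
config = render_bgp_unnumbered(65101, "10.0.0.11", ["swp1", "swp2", "swp3", "swp4"])
print(config)
```

The only address in the output is the router ID, which identifies the switch itself rather than any fabric link, so adding a spine-facing port means adding one line with a port name and nothing else.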

 

Networking – Simplified.

The above examples of multipoint bridge domains and BGP Unnumbered are just two of the latest innovations supported in Netvisor ONE release 5.2.0. With each new release we continue to extend our leadership in delivering data center network fabrics that are fully automated, from the leaf-spine underlay to overlay virtualized network services with deep network segmentation, in order to radically simplify operations. That’s what we call networking simplified.


About the Author

Jay Gill

Jay Gill is Senior Director of Marketing at Pluribus Networks, responsible for product marketing and open networking thought leadership. Prior to Pluribus, he guided product marketing for optical networking at Infinera, and held a variety of positions at Cisco focused on growing the company’s service provider business. Earlier in his career, Jay worked in engineering and product development at several service providers including both incumbents and startups. Jay holds a BSEE and MSEE from Stanford and an MBA from UCLA Anderson.