In the 2000s, data center network infrastructure for large companies and large compute farms was built based on a three-layer hierarchical model. This traditional network architecture is also known as a three-tier architecture, which Cisco in turn calls the “hierarchical inter-networking model.” It consists of core layer switches ($$$), which connect to distribution layer switches ($$, sometimes called aggregation switches), which in turn connect to access layer switches ($). Access layer switches are frequently located at the top of a rack, so they are also known as top-of-rack (ToR) switches.
Figure 1: The hierarchical model in a data center network switching architecture
The good news with this hierarchical model is that traffic between two nodes in the same rack, if at Layer 2 of the network stack, is sent with low latency. If the access switches are 10G, then this communication can have high throughput as well. Also, this type of configuration allows for a vast number of ports at the access layer.
However, the problem with this traditional network design is that it’s expensive and not deterministic: east-west communication between racks of gear has traffic traveling to the aggregation layer and frequently to the data center core. These multiple hops, frequently across oversubscribed backplanes, can add substantial latency, on the order of 50 µs. Any Layer 3 traffic needs to leave the rack and reach the aggregation tier of switches before being routed, even if it is headed back to the same rack it came from.
The advent of server virtualization exacerbated these problems with traditional networking as east-west traffic became even more prevalent, because virtualization essentially randomized the locations of the virtual machines (VMs). With a traditional network architecture, the data center manager could load a rack with components that were likely to communicate with each other (say, application servers and database servers). With virtualization, those components could be anywhere within the virtualized infrastructure. Virtualization also pushes the limits of Layer 2 network segmentation. For example, the maximum number of VLANs is 4,096 (a limit set by the 12-bit VLAN ID field in the IEEE 802.1Q standard), which can drive artificial limits within a virtualized facility. While a facility might naturally need thousands of VLANs for multi-tenancy, because of the VLAN limit the facility might need to be divided into multiple small virtualization clusters. This limits resource management options, for example, preventing a VM from being moved to the least loaded server if that server is in some other cluster.
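The 4,096 figure follows directly from the width of the VLAN ID field. As a rough sketch of the arithmetic (the function name and the 10,000-VLAN requirement are illustrative assumptions, not figures from a real deployment), the following estimates how many separate Layer 2 domains a facility would be forced into:

```python
import math

VLAN_ID_BITS = 12                      # width of the VLAN ID field in the 802.1Q tag
TOTAL_VLAN_IDS = 2 ** VLAN_ID_BITS     # 4,096 possible values
USABLE_VLAN_IDS = TOTAL_VLAN_IDS - 2   # IDs 0 and 4095 are reserved by the standard


def clusters_needed(required_vlans: int) -> int:
    """Return how many separate Layer 2 domains are needed to host required_vlans.

    Hypothetical helper: it assumes each cluster can use the full usable
    VLAN ID range and that no VLAN has to span two clusters.
    """
    if required_vlans < 0:
        raise ValueError("required_vlans must be non-negative")
    return math.ceil(required_vlans / USABLE_VLAN_IDS)


if __name__ == "__main__":
    print(f"VLAN IDs available: {TOTAL_VLAN_IDS} ({USABLE_VLAN_IDS} usable)")
    print(f"Clusters for 10,000 tenant VLANs: {clusters_needed(10_000)}")
```

Each extra cluster is a boundary that a VM cannot cross without re-addressing, which is the resource management limitation described above.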
There are additional problems with traditional networking. Once this hierarchical infrastructure is put in place, change is difficult. Another rack of gear not only means another ToR switch, but possibly another aggregation switch or even more ports in the core switch. Furthermore, visibility into traffic is limited and debugging is a challenge. Quality of service, traffic prioritization and packet or traffic capture for regulations or debugging are all challenges (or downright impossible). In fact, many network admins need to worry not only about mean time to recovery (MTTR) from a problem, but also MTTI – mean time to innocence.
The 2010s saw the rise of leaf-and-spine networks with a more modern design that “flattens” the more traditional hierarchical network to increase performance for east-west traffic and provide a scale-out architecture where it is simple to add capacity. Those networks, inspired by the hyperscalers, remove the aggregation layer, and each ToR switch becomes a leaf that is connected to every spine switch in its pod (at least two, for redundancy), which puts every other leaf in the pod only one spine hop away. This full mesh of leaves and spines is also known as a Clos fabric. It effectively looks like a large chassis-based switch, with leaves acting as line cards and the spine acting as a backplane. East-west data flows within this architecture take the same number of hops on the network regardless of source and destination and can be load-balanced and loop free, enabling predictable delay and latency times for information traveling through the network and increasing resiliency. Part of the predictability of this topology comes from a Layer 3 routing paradigm that interconnects the two-layer design. This network can dynamically route traffic to the best path based on network changes and traffic spikes and moves past the limitations of Spanning Tree Protocol (STP) network deployment topologies.
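To make the "same number of hops" property concrete, here is a minimal sketch that models a small pod, assuming the usual Clos wiring in which every leaf links to every spine. The node names, the function names and the CRC32-based flow hash are all illustrative assumptions; real switches use their own hardware hash functions for equal-cost multipath (ECMP) selection.

```python
import zlib
from itertools import combinations


def build_leaf_spine(num_leaves: int, num_spines: int) -> dict[str, set[str]]:
    """Build an adjacency map in which every leaf links to every spine."""
    leaves = [f"leaf{i}" for i in range(1, num_leaves + 1)]
    spines = [f"spine{i}" for i in range(1, num_spines + 1)]
    graph: dict[str, set[str]] = {node: set() for node in leaves + spines}
    for leaf in leaves:
        for spine in spines:
            graph[leaf].add(spine)
            graph[spine].add(leaf)
    return graph


def equal_cost_paths(graph: dict[str, set[str]], src: str, dst: str) -> list[list[str]]:
    """List every leaf -> spine -> leaf path between two different leaves."""
    shared_spines = sorted(graph[src] & graph[dst])
    return [[src, spine, dst] for spine in shared_spines]


def pick_path(paths: list[list[str]], flow: tuple) -> list[str]:
    """ECMP-style choice: hash the flow so all of its packets use one path."""
    index = zlib.crc32(repr(flow).encode()) % len(paths)
    return paths[index]


if __name__ == "__main__":
    fabric = build_leaf_spine(num_leaves=4, num_spines=2)
    leaves = sorted(node for node in fabric if node.startswith("leaf"))
    for src, dst in combinations(leaves, 2):
        paths = equal_cost_paths(fabric, src, dst)
        # Every pair of leaves has one path per spine, each crossing one spine.
        print(src, "->", dst, ":", len(paths), "equal-cost paths")
    flow = ("10.0.1.10", "10.0.3.20", "tcp", 49152, 443)  # hypothetical 5-tuple
    print(pick_path(equal_cost_paths(fabric, "leaf1", "leaf3"), flow))
```

Adding a spine in this model adds one more equal-cost path between every pair of leaves without changing the path length, which is the scale-out behavior described above.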
Figure 2: A more modern leaf-and-spine data center network architecture
More recently, the second half of the decade has seen the rise of software-defined networking (SDN) and network virtualization in the data center, as well as the concept of the overlay and the underlay. In this scenario, the role of the physical leaf-and-spine underlay discussed in the previous paragraph is to provide an IP-based fabric foundation for a virtualized overlay. This physical underlay can either be configured and managed box by box or automated with an SDN solution where a change can be made to all ToR switches with one command, reducing operational costs and improving configuration consistency. These SDN underlay solutions are often implemented with external controllers that hold the state of the network, such as OpenFlow-based solutions. However, this approach is often expensive in small/medium data centers and has other limitations such as greenfield-only requirements, single points of failure and more. Furthermore, it becomes prohibitively expensive in a multi-site data center or edge deployment because three controllers are required for redundancy at every data center location, and it also requires a controller of controllers in a multi-site scenario, increasing complexity and cost. Pluribus Networks takes a controllerless approach with our Adaptive Cloud Fabric™, leveraging the multi-core CPU processing power of the switch that is being deployed for the physical connectivity in the first place. You can read more about this in the blog Perspective: Controller-based vs Controllerless-based SDN Solutions.
Network virtualization is the creation of an overlay network in software with network connections defined completely in software. This is achieved by instantiating VXLAN tunnels across the Layer 3 IP fabric underlay for the data plane and pairing them with some sort of “fabric” control plane solution. The control plane could be a solution like EVPN or a much simpler pre-built solution like the Pluribus Adaptive Cloud Fabric. In addition, these tunnels are paired with virtual switches, virtual routers and virtual load balancers, so multiple networks can be instantiated completely in software and abstracted from the physical underlay, and Layer 2 and Layer 3 services can be delivered within this overlay. This allows the network operator or application developer to spin up and modify virtual networks and associated network services very quickly without touching the underlay network. The overlay solution also solves the 4,096 VLAN scaling challenge, because the 24-bit VXLAN network identifier (VNI) supports over 16 million virtual network segments. In addition, the underlay does not contain any “per-tenant” state; that is, underlay devices do not maintain and share reachability information about virtual or physical endpoints. With the overlay network fabric one can instantiate networks on a per-tenant or per-application basis very easily.
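The 16 million figure comes from the 24-bit segment identifier carried in the 8-byte VXLAN header defined in RFC 7348. The sketch below packs and parses that header to show where the identifier lives. The function names are hypothetical, and the outer Ethernet, IP and UDP headers that a real tunnel endpoint would also add are deliberately left out.

```python
import struct

VXLAN_UDP_PORT = 4789          # IANA-assigned UDP destination port for VXLAN
VXLAN_FLAG_VNI_VALID = 0x08    # "I" flag: the identifier field holds a valid value
VNI_BITS = 24
MAX_VNI = 2 ** VNI_BITS - 1    # 16,777,215


def pack_vxlan_header(vni: int) -> bytes:
    """Return the 8-byte VXLAN header for the given segment identifier."""
    if not 0 <= vni <= MAX_VNI:
        raise ValueError(f"VNI must fit in {VNI_BITS} bits, got {vni}")
    # Word 1: flags (8 bits) + reserved (24 bits)
    # Word 2: identifier (24 bits) + reserved (8 bits)
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header: bytes) -> int:
    """Extract the segment identifier from the first 8 bytes of a VXLAN payload."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VXLAN header does not carry a valid identifier")
    return vni_word >> 8


def encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prepend the VXLAN header to an inner Ethernet frame (outer headers omitted)."""
    return pack_vxlan_header(vni) + inner_frame


if __name__ == "__main__":
    packet = encapsulate(b"\x00" * 64, vni=5000)  # placeholder inner frame
    print(packet[:8].hex(), "->", unpack_vni(packet))
    print("Segments available:", 2 ** VNI_BITS)
```

Because the tenant identifier travels inside the encapsulated packet, the underlay only has to route between tunnel endpoints, which is why it can stay free of per-tenant state.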
Many overlay approaches are implemented such that the VXLAN tunnels terminate in the host compute layer at VXLAN tunnel endpoints (VTEPs). While this approach has some advantages, it can be extremely expensive with per-CPU licensing costs for every host, the need for SmartNICs to preserve host CPU cycles for applications and the need for external controllers (in addition to the SDN controllers already deployed). Pluribus Networks takes a controllerless approach here as well, leveraging the multi-core CPU built into the switches already being deployed, as well as taking advantage of the hardware acceleration available in the on-board packet processing ASIC to terminate the VXLAN tunnels. This provides a very cost-effective solution that easily stretches across geographically distributed data centers. This approach also has the advantage of unifying the underlay and overlay automation framework, providing a solution that works out of the box, ideal for overstretched IT teams. You can read more about this in the blog SDN for Physical and Virtual Networks in Space- and Cost-Constrained Environments.
Where Do I Learn More?
This blog answers the question “what is traditional networking?” and touches on innovations in data center networking over the last two decades. If you want to learn more about Pluribus Networks’ approach to modern network architectures, SDN, network virtualization and data center network automation, check out our recent blog Automating Scale-Out IP Fabrics with the Adaptive Cloud Fabric or the more in-depth Technical Brief: Achieving a Scale-Out IP Fabric with the Adaptive Cloud Fabric Architecture.
This post was originally published in July 2012, and was updated by Mike Capuano in December 2019.