Part Three of a Five-Part Series on Software-Defined Data Centers in a Multi-Cloud World. In this blog, we focus on SDN for physical networks vs. virtual networks in space- and cost-constrained environments.
In my last post, The “Easy Button” for SDN Control of Physical and Virtual Data Center Networks, I posed three questions regarding automating the data center network in pursuit of building a software-defined data center (SDDC) to support private cloud. In that blog I focused on question 1 – should I deploy open networking? – and highlighted how open networking switches have developed powerful, server-like control planes featuring multi-core CPUs complemented with plenty of RAM and flash storage.
In this blog on physical networks vs virtual networks, I will focus on two more sets of questions:
- Do I want my leaf-and-spine physical network to be deployed as a software-defined network (SDN) fabric, or am I comfortable with box-by-box configuration, operations, and troubleshooting? Are the cost and complexity of deploying SDN worth the effort? What are the various approaches?
- Do I want to create a virtualized network overlay fabric that creates a mesh of virtual tunnels between all endpoints (servers, storage, and other devices) and offers the ability to establish new network topologies and services in seconds? Are the cost and complexity of deploying a virtual network worth the effort? What are the various approaches?
When first rolling out a data center leaf-and-spine network, one needs to deploy physical switches for connectivity. Typical deployments have two 10 GbE/25 GbE connections from each server to a top-of-rack (TOR) switch, which in turn connects via 100 GbE uplinks to a spine switch. In an open networking world, the network operator loads their choice of open source network operating system (NOS) via the Open Network Install Environment (ONIE) onto the switch, with the NOS running on the switch CPU (e.g., from Intel) and programming the forwarding ASIC (e.g., from Broadcom).
In a leaf-and-spine network without SDN automation, each of these switches must be configured, managed, and monitored individually, typically through a command-line interface (CLI), which consumes a lot of time from deeply technical networking experts. SDN can automate the underlay and make 10, 20, 30, or more switches look like one logical programmable entity. In other words, instead of managing 32 switches individually, for example, the operator manages a single logical switch with a single management IP address, dramatically increasing agility, simplifying management, reducing errors, and enabling less-skilled techs to operate the network.
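To make the "one logical switch" idea concrete, here is a minimal Python sketch. It is purely illustrative: the `Switch` and `Fabric` classes and the `apply` method are hypothetical names invented for this example and do not represent any vendor's API. It only shows how a single operator action can fan out to all 32 switches instead of being repeated box by box.

```python
from dataclasses import dataclass, field


@dataclass
class Switch:
    """Hypothetical stand-in for one physical switch and its config."""
    name: str
    config: dict = field(default_factory=dict)


@dataclass
class Fabric:
    """Hypothetical logical entity that fronts every member switch."""
    switches: list

    def apply(self, key: str, value: str) -> int:
        """Apply one setting to every switch; return how many were touched."""
        for switch in self.switches:
            switch.config[key] = value
        return len(self.switches)


# Box-by-box operation would mean 32 separate CLI sessions.
# With a fabric abstraction, the operator issues one action.
switches = [Switch(f"switch-{i}") for i in range(1, 33)]
fabric = Fabric(switches)
touched = fabric.apply("vlan-100", "enabled")
print(f"One command updated {touched} switches")
```

The point of the sketch is the shape of the workflow, not the mechanism: how the change actually reaches each switch depends on the SDN approach, which is the topic of the next section.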
Controller-based vs. Controllerless SDN Underlay Automation
How the automation is achieved is where things get interesting. The more traditional SDN approach is based on the Open Networking Foundation’s (ONF) OpenFlow protocol. In this architecture, all the network fabric state is kept in a centralized SDN controller, physically separate from the network switches, and the forwarding tables in the switches are programmed via the OpenFlow protocol. Examples of solutions that use this approach come from Big Switch Networks or leverage open-source controllers such as OpenDaylight (ODL) or ONOS. To achieve high availability, typical best practice dictates three redundant controllers per location. This is certainly a workable approach for a single large data center, but it can run into economic feasibility challenges with smaller or geographically distributed sites because the network team needs to pay for and deploy three servers along with three controllers at every site. Two sites require six controllers, three sites require nine controllers, and so on.
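The controller arithmetic above is simple enough to sketch in a few lines of Python. The function name and the default of three controllers per site are assumptions taken from the best practice described in this post, not from any particular product.

```python
CONTROLLERS_PER_SITE = 3  # high-availability best practice described above


def controllers_needed(sites: int, per_site: int = CONTROLLERS_PER_SITE) -> int:
    """Total external controllers required by a controller-based design."""
    if sites < 0:
        raise ValueError("sites must be non-negative")
    return sites * per_site


for sites in (1, 2, 3, 10):
    print(f"{sites} site(s): {controllers_needed(sites)} controllers")
```

The count grows linearly with the number of sites, which is why the cost becomes noticeable once workloads are spread across many small locations.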
The alternate approach is to deploy a distributed SDN function that runs as an application in the user space of every switch in the network – leveraging the power of distributed computing – with the state of the fabric distributed into an extremely space-efficient database running on every switch. This is the approach Pluribus takes with our Netvisor® ONE and Adaptive Cloud Fabric™. With this “controllerless” approach, there is no set of external controllers and thus no associated controller cost or unnecessary consumption of space and power, which can be problematic in constrained environments. There are a number of other benefits that come with the controllerless approach, including the ability to easily insert into brownfield networks, improved resiliency with in-band control, and the ability to seamlessly stretch across geographically distributed sites regardless of distance with no need for multiple controllers at every site.
Once SDN is deployed, the underlay can be managed as a single fabric, so it’s easy to make configuration changes or troubleshoot across all switches in the fabric with a single command or query. SDN can be used to set up any topology for the underlay, including Layer 2 or Layer 3 between leaf and spine, and automation increases agility and reduces operational costs and human errors significantly.
Once you have the underlay established, the next step is to virtualize the network by creating an overlay network. The overlay network is constructed by creating software tunnels between virtualized endpoints, using an encapsulation technology such as VXLAN together with software-based switches and software-based routers. This method of software abstraction is agnostic to the physical server connections and the underlying network topology.
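For readers who want to see what VXLAN encapsulation looks like at the byte level, here is a minimal Python sketch of the 8-byte VXLAN header defined in RFC 7348. It is an illustration only: the function names are my own, and the outer Ethernet, IP, and UDP headers that a real tunnel endpoint also adds are deliberately left out.

```python
import struct

VXLAN_UDP_PORT = 4789        # IANA-assigned UDP destination port for VXLAN
VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field carries a valid value


def build_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit network identifier."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: 8 flag bits followed by 24 reserved bits.
    # Word 2: 24-bit VNI followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Extract the VNI from a VXLAN header, checking the valid flag."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return vni_word >> 8


def encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prepend the VXLAN header to an inner Ethernet frame."""
    return build_vxlan_header(vni) + inner_frame


header = build_vxlan_header(5000)
print(header.hex(), parse_vni(header))
```

Because the inner frame is carried unchanged inside the tunnel, the underlay only needs to route between tunnel endpoints, which is what makes the overlay independent of the physical topology.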
Overlay network virtualization has a number of important benefits, including scalability, multi-tenant security and operational agility.
Underlay networks, including OpenFlow-based networks that only implement VLAN-based flow redirection without an overlay, are typically limited to 4094 VLANs because the VLAN ID is a 12-bit field, while VXLAN-based overlays use a 24-bit VXLAN network identifier (VNI) and scale to over 16 million virtual network segments. This is especially valuable for multi-tenant service provider networks because it enables each tenant to control its own VLAN numbering and scale independently of other tenants up to the full range of 4094 VLANs.
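The scaling difference comes straight from the width of the identifier fields, as this short Python calculation shows. The constant names are my own; the field widths are those of the IEEE 802.1Q VLAN ID and the VXLAN network identifier in RFC 7348.

```python
VLAN_ID_BITS = 12    # IEEE 802.1Q VLAN ID field width
VXLAN_VNI_BITS = 24  # VXLAN network identifier field width (RFC 7348)

# VLAN IDs 0 and 4095 are reserved, leaving 4094 usable VLANs.
usable_vlans = 2 ** VLAN_ID_BITS - 2
vxlan_segments = 2 ** VXLAN_VNI_BITS

print(f"Usable VLANs:   {usable_vlans:,}")    # 4,094
print(f"VXLAN segments: {vxlan_segments:,}")  # 16,777,216
```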
Overlay networks also improve security in those multi-tenant networks by providing an additional layer of abstraction and full isolation between customers. That level of isolation is also highly valuable in other use cases, including separating Internet of Things (IoT) devices and applications from mission-critical corporate traffic and efficiently utilizing firewall ports and other security devices.
Overlays also enable far better scalability and service agility when stretching services across multiple sites and arbitrary topologies, because they decouple logical service provisioning from the underlying network topology and take advantage of the optimized resilience and efficient link utilization of standard Layer 3 underlay technologies, such as equal-cost multipath (ECMP) load balancing. With a virtualized overlay network running completely in software, new network services can be quickly established without having to touch the underlay.
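As a rough illustration of how ECMP spreads traffic across equal-cost uplinks, here is a minimal Python sketch that hashes a flow's 5-tuple to pick a next hop. It is a simplification under stated assumptions: real switches compute the hash in the forwarding ASIC with their own hash functions, the use of SHA-256 here is only for convenience, and the function and spine names are hypothetical.

```python
import hashlib


def ecmp_next_hop(flow: tuple, next_hops: list) -> str:
    """Pick one of several equal-cost next hops from a flow's 5-tuple.

    flow is (src_ip, dst_ip, protocol, src_port, dst_port).
    """
    if not next_hops:
        raise ValueError("at least one next hop is required")
    key = "|".join(str(part) for part in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


spines = ["spine-1", "spine-2", "spine-3", "spine-4"]
flow = ("10.0.0.1", "10.0.1.1", 6, 49152, 443)
print(ecmp_next_hop(flow, spines))
```

Hashing on the flow keeps all packets of one flow on the same path, which avoids reordering, while different flows land on different uplinks so that all links are used.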
Like SDN for the underlay, there are multiple approaches to network virtualization. One approach has the software switches and routers running on the same servers that run the application workloads. In this case, the VXLAN tunnels terminate into VXLAN tunnel endpoints (VTEPs) that run on those same servers as well. The pricing model for this approach is typically on a per-processor basis for each of the servers, often adding thousands of extra dollars per CPU. In addition, a number of separate external servers are needed for management, network controllers, edge services gateways and more. This means extra cost, space and power for the servers, as well as multiple software licenses that must be purchased. Again, like SDN for the underlay, these overlay controllers need to be deployed in clusters of three for redundancy.
Finally, because the packet processing in these solutions consumes compute power that would otherwise support application workloads, many in the industry are advocating that SmartNICs be deployed. These “smart” network interface cards (NICs) include CPUs and memory to offload network processing from the main server CPUs, increasing processing power but adding yet more expense and complexity and requiring multiple integration and configuration steps.
Solutions that take this approach include Juniper Contrail, Nokia Nuage, and VMware NSX. Some data center operators will see benefits to this approach that outweigh the added cost and complexity, but generally that will only be true for larger data center environments where having multiple servers for management and control is acceptable and where there are usually large IT teams that can manage integration complexity. In those cases, the software overlay solutions can be deployed over a Pluribus SDN-automated underlay or even over a standard manually configured underlay.
A New Approach to Overlay Fabric Networking
The alternative approach for overlay fabric networking is to leverage the power of the CPU and the packet processing power of the forwarding ASIC in the Top of Rack switches that are being deployed anyway. Just like the SDN underlay, this is the approach that Pluribus takes with our Adaptive Cloud Fabric. In this case, the VXLAN tunnels terminate on the white box TOR switch and leverage the specialized ASIC from Broadcom to hardware-accelerate the packet processing and termination of VXLAN tunnels into VTEPs.
Layer 2 and Layer 3 unicast and multicast services are distributed throughout the overlay fabric, again leveraging the distributed processing power of the switches. Not only does this eliminate the per-CPU license expense and the optional per-server SmartNIC expense of other network virtualization solutions, but it is also controllerless: no external controllers are needed, which reduces space, power and cost and is especially critical in smaller data center environments.
Underlay and Overlay Unified – It Just Works
In the case of Pluribus’ Netvisor ONE Network Operating System and Adaptive Cloud Fabric, the typical deployment is with unified SDN control of the underlay and VXLAN overlay fabric, with no external controllers needed. Not only is this approach very cost-effective, with extremely low space and power consumption, the unification of these two automation layers results in a simple deployment with minimal integration required. The solution works out of the box, from zero-touch provisioning (ZTP) of the switches to building an SDN-automated underlay to deploying the virtualized network fabric. It is simple, fast, and efficient, as you can see in this short four-minute video that shows the deployment of a simple four-switch fabric using our UNUM management system – UNUM Day-0 Automation. Once deployed, the solution is easily integrated through a northbound RESTful API with orchestration systems such as vCenter or Red Hat OpenStack.
In a blog earlier this year, I talked about the increasing trend of data center distribution and what we call distributed cloud. Fundamentally, this means more and smaller data centers, with workloads moving toward the edge and closer to users and things to improve customer experience and enable new latency-sensitive applications. Most approaches to SDN control of the underlay and overlay are complex and costly, consume a lot of space and power, and are not well designed for smaller, geographically distributed mini data center environments. By using the processing power of the switches that are being deployed anyway for physical connectivity, Pluribus enables a very cost-, space- and power-efficient SDN underlay and virtualized overlay, which makes it feasible to deploy an SDDC in small/medium data centers and constrained edge data center deployments. This is not the final layer of automation we need, however, so in the next blog in the series I will talk about the importance of network analytics and how, once again, the power of open networking switches can be leveraged with clever software to deliver cost-effective yet very granular network telemetry.
Webinar replay: If you would like more detail on how Pluribus helps put SDDC and private cloud within reach for every IT team, then watch the replay of our webinar “Realizing the SDDC: Simple, Affordable SDN and Network Virtualization for Any Size Data Center.” In this webinar, I am joined by Drew Schulke, VP Product Management, Dell EMC, and Alessandro Barbieri, VP Product Management, Pluribus Networks.
About the Author
Mike is Chief Marketing Officer of Pluribus Networks. Mike has over 20 years of marketing, product management and business development experience in the networking industry. Prior to joining Pluribus, Mike was VP of Global Marketing at Infinera, where he built a world class marketing team and helped drive revenue from $400M to over $800M. Prior to Infinera, Mike led product marketing across Cisco’s $6B service provider routing, switching and optical portfolio and launched iconic products such as the CRS and ASR routers. He has also held senior positions at Juniper Networks, Pacific Broadband and Motorola.