It can be difficult these days to keep up with all the new product releases and offerings that churn through the technological landscape. It seems that as soon as one technology matures, another technology rises up to take its place. Some of these tech trends don’t make it much further than product marketing, but some actually stick and begin to be widely accepted as practice.
From my perspective, the technologies that don’t persist are the ones that don’t solve a “real” problem – sure, there might be marginal benefits to the platform, but not enough to change buying behaviors. The trends that stick are the ones that solve a big underlying issue, opening up greater efficiency and productivity. Server virtualization is a great example of a technology that “stuck.” When virtualization was first getting started, there were plenty of misgivings, but it made a very compelling argument for itself with the resiliency, power savings, and hardware efficiency it offered. As a result, server virtualization is more or less standard these days, with software-defined storage following close behind.
Challenges of Modernizing Your Network
There is one element of IT that is lagging when it comes to modernization, though – networking. If you look at some of the underlying principles behind network engineering, that makes sense. Networking, at its core, is a distributed communications system built on long-established protocols. No single entity “owns” the Internet, nor are updates rolled out across the world simultaneously. Tinkering with a protocol to “improve things” is more likely to cut you off from the rest of the world than to modernize your infrastructure. And in the case of misconfiguration, a network outage is HIGHLY visible – if the network is down, so is everything else!
So, for those reasons, the network industry has been historically resistant to change. Configurations are still often done via the CLI. There is a steep tribal knowledge barrier to entry, and because networks can be taken down by forgetting a single keyword, any changes made to the network are tightly regulated. Networking follows the “if it ain’t broke, don’t fix it” mantra and that mantra has served it pretty well… for the most part.
I know I’m venturing into generalization, but I think networks are happiest when they are architected, configured… and then left to run. Nobody wants to be constantly logging in to the datacenter network infrastructure and making changes, given the massive blast radius when human error inevitably strikes. Human error is one of the leading causes of network outages, after all.
This mindset causes several issues when working with large “webscale” datacenters. Some of the key advantages that a virtualized datacenter brings are greater resilience, better efficiency, and overall optimization. If VMs start to run into problems at the host level, a well-designed VMware environment will work around them seamlessly – via vMotion, HA, DRS, and other key technologies – and the users should never realize how close they came to disaster behind the scenes. The VMware datacenter can be constantly refreshing, updating, optimizing, shuffling, and so on – all in the name of user experience and efficiency!
The problem is – how does this interact with a network that is designed to be static? Often the network is configured in a ‘set-it-and-forget-it’ style. VMware doesn’t natively have a great way to tell OSPF that a VM in subnet A has moved to the other side of the datacenter because it will run better over there. So, instead the virtualization engineers asked the network engineers to extend L2 everywhere so communication wouldn’t be broken when things moved around… and the network engineers were probably upset about it, but the project had already been approved. L2 extension does allow VMware to be as dynamic as it pleases, because RARP allows VMs to move around on the network and they don’t have to be re-IPed each time they move (which would cause all kinds of complexity with user access as well)… but it comes at a cost.
Those of you familiar with networking no doubt recognize some of the pain that is inherent in L2 communications. It’s fast, sure – but in its raw state, volatile. A L2 segment can often rightly be considered a failure domain. Broadcast storms cause headaches, MAC tables have hardware limitations, some flavor of STP is a necessity unless you have a technology like TRILL in place, and so on. L2 allows only a single active path for data, which throws conventional redundancy options out the window. L3, by comparison, is stable and resilient. Networking manufacturers have tried to solve the L2 datacenter problem in hardware in many ways – SPB, TRILL, FabricPath, VPLS – the list goes on. While these protocols run happily in many production environments, they can be very complex, require a good deal of configuration, and demand expensive hardware and licenses just to function.
I know that following my thought pattern is a bit of a rabbit trail here, but recall what I mentioned at the start of this blog – for a technological trend to catch on, it has to solve a real problem. And hopefully I’ve illustrated that there is a real problem when pairing a dynamic system with an environment that resists change.
Virtual Extensible LAN (VXLAN) Protocol
Enter VXLAN, a newer network protocol that has been popping up a lot lately to solve that particular problem. At its core, VXLAN is a tunneling protocol that encapsulates L2 frames inside UDP/IP packets. It orchestrates this through VXLAN Tunnel End Points (VTEPs), which act as entry points into the network fabric and construct and tear down tunnels between each other to shuttle traffic around the datacenter. VXLAN is not vendor proprietary – it was originally created collaboratively by Arista, Cisco, and VMware – and the concept of tunneling is not a new one, but tunnels have traditionally been constructed in hardware. VXLAN allows the creation of an “Overlay” network that is defined by the virtualization environment and runs on top of the “Underlay” network built from physical hardware.
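To make the encapsulation concrete, here is a minimal sketch of the 8-byte VXLAN header defined in RFC 7348: a flags byte marking the VNI as valid, a 24-bit VXLAN Network Identifier, and reserved padding. This is purely illustrative – the function names are my own, not any vendor’s API – and a real VTEP would further wrap this payload in UDP (destination port 4789), IP, and an outer Ethernet header.

```python
import struct

def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header from RFC 7348.

    Byte 0 carries the flags (0x08 = "a valid VNI follows"),
    bytes 4-6 carry the 24-bit VXLAN Network Identifier, and the
    remaining bytes are reserved (zero).
    """
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # First 32-bit word: flags byte followed by three reserved bytes.
    # Second word: VNI in the top 24 bits, one reserved byte below it.
    return struct.pack("!II", 0x08 << 24, vni << 8)

def encapsulate(vni: int, inner_frame: bytes) -> bytes:
    """Prepend the VXLAN header to an original L2 frame."""
    return vxlan_header(vni) + inner_frame
```

The key point is how small this header is: eight bytes buys the fabric 16 million (2^24) isolated segments, compared with the 4,094 usable IDs of a 802.1Q VLAN tag.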
The “Underlay” network can be constructed with a L3 design in a rock solid, resilient configuration. Network engineers only have to ensure that there is basic IP connectivity across the datacenter, and that the MTU is large enough to accommodate the VXLAN header that is added to the frame. Once the Underlay is in place, it will not often have to change.
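The MTU requirement comes straight from the arithmetic of encapsulation: the entire inner Ethernet frame rides as UDP payload, so the underlay must carry the inner MTU plus roughly 50 bytes of added headers. A quick sketch of that math (byte counts per RFC 7348, assuming IPv4 and an untagged inner frame):

```python
# Bytes added around the original VM frame when a VTEP encapsulates it:
INNER_ETHERNET = 14  # the original frame's own MAC header, now payload
OUTER_IP = 20        # outer IPv4 header added by the VTEP
OUTER_UDP = 8        # outer UDP header (destination port 4789)
VXLAN_HEADER = 8     # VXLAN header (flags + 24-bit VNI)

ENCAP_OVERHEAD = INNER_ETHERNET + OUTER_IP + OUTER_UDP + VXLAN_HEADER  # 50

def required_underlay_mtu(inner_ip_mtu: int = 1500) -> int:
    """Smallest underlay IP MTU that carries a full-size inner frame
    without fragmentation."""
    return inner_ip_mtu + ENCAP_OVERHEAD
```

For a standard 1500-byte inner MTU that works out to 1550 bytes on the underlay, which is why many designs simply round up to 1600, or run jumbo frames (9000) end to end, rather than calculating a tight minimum.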
The “Overlay” network is where the real magic happens. Using VXLAN, software can rapidly construct tunnels between the host VTEPs that ride on top of the “Underlay” network, shaping the network as required by applications. If a VM needs to think that it has L2 adjacency to a VM on the other side of the datacenter, the VTEPs can construct a L2 tunnel between the two VMs. No hardware changes are required and again, the physical network hardware only needs to give basic IP connectivity between the two hosts. Even better, because the VTEPs in software keep track of MAC address tables and L2 connectivity, your hardware only needs to remember where the VTEPs themselves are and how to move traffic between the VTEPs. Everything else is out of sight and out of mind of your physical infrastructure. The real weight of the VM MAC address table is now in software, rather than inflating the cost of our hardware.
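As a rough illustration of that software MAC table, here is a toy model of the lookup a VTEP performs: a known inner MAC maps to exactly one remote tunnel endpoint, while an unknown destination is flooded to every peer (via multicast or head-end replication in real deployments). The class and method names are hypothetical, purely for illustration; in NSX this state is populated by the control plane rather than by hand.

```python
class Vtep:
    """Toy model of one VTEP's forwarding table for a single VNI."""

    def __init__(self, peer_vteps):
        self.peers = set(peer_vteps)  # underlay IPs of the other VTEPs
        self.mac_table = {}           # inner MAC -> remote VTEP IP

    def learn(self, mac: str, vtep_ip: str) -> None:
        """Record which tunnel endpoint a given inner MAC lives behind."""
        self.mac_table[mac] = vtep_ip

    def forward(self, dst_mac: str) -> set:
        """Return the VTEP IP(s) the encapsulated frame is sent to.

        Known unicast goes to exactly one endpoint; unknown
        destinations are flooded to every peer.
        """
        if dst_mac in self.mac_table:
            return {self.mac_table[dst_mac]}
        return self.peers
```

Notice that the physical switches never see the inner MACs in this model – they only need routes to the handful of VTEP addresses, which is exactly the hardware-table relief described above.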
So, as I mentioned, there are several solutions out there that utilize VXLAN – Arista, Cisco’s ACI, VMware NSX, and others. We at Edge are particularly excited to be working with the VMware NSX platform. VMware has been virtualizing servers and abstracting storage for years now; it only makes sense that they would also tackle the last piece of the trifecta – the network!
VMware networking in its original state has some limitations. First, the standard vSS/vDS cannot route traffic – it can only switch it. This results in some less-than-ideal traffic patterns, as traffic is hairpinned through the physical network to move between VMs in a segmented multi-tier application. In addition, the physical network has to be reconfigured to support changes in the virtualized network environment… and if network change control is in place (as it should be), this can take a long time!
VMware NSX solves these issues by bringing routing, security, edge services, load balancing, NAT and more down into the virtualized environment. Virtualization administrators can now spin up all the network services that they need within the virtual environment. By bringing L3 intelligence all the way down into the host, the hairpinning problem is solved. By enforcing network traffic policies at the vNIC level, even East/West traffic is now secured.
As a result, the physical network is more or less abstracted with VMware NSX. If VMs need to be on the same subnet to operate but they get moved to separate hosts, VTEPs on each host construct a VXLAN tunnel between the two hosts to allow the traffic to pass through as if they were still L2 adjacent. These tunnels are architected by the VMware NSX platform in software, allowing them to fulfill the promise of a Software Defined Data Center.
VMware NSX consists of several key components:
- NSX Manager
- NSX Controllers
- NSX vSwitch
- NSX Edge Services Gateway
The NSX Manager integrates with vCenter Server to coordinate management across the VMware environment. It’s important to note that if the NSX Manager goes down or otherwise becomes unavailable, NSX will continue to forward traffic – you simply lose the ability to make management-plane changes until it returns.
The NSX Controllers share the burden of the control plane and coordinate network functions across the vSphere environment, ensuring that changes are kept in sync.
The NSX vSwitch is where L2, L3 and firewall decisions are made – within the host! This piece of NSX is responsible for a great deal of what gives the platform its “teeth,” so to speak.
Finally, the NSX Edge Services Gateway is a VM that provides additional services that are not included in the vSwitch – services like IPsec VPN, NAT, load balancing, DHCP, DNS relay, and more. Edge Services Gateways can also be deployed as Perimeter Edges, which bridge the gap between the physical network and your virtual environment. Using dynamic routing protocols like BGP, OSPF, or IS-IS, they can coordinate between the two environments and advertise any changes you make in the NSX environment.
All these components work together to create the software defined data center, allowing the network to finally interact with the virtualized environment in a coherent fashion. I expect to see a lot more NSX in the coming years.
There are many other benefits that NSX offers the datacenter that I haven’t covered in this blogpost – service insertion for traffic filtering, microsegmentation, service composition, data loss protection and so on – the security features alone could fill several articles. In all honesty, I’ve only scratched the surface. If you’d like to learn more about VMware NSX and get into the specifics of how it can improve your environment, please contact the Edge team online or give us a call at 888-861-8884 – we would be happy to help!