Not that scary or that hard: Two decades of VLANS

Classic or encapsulated? You choose

Fri 30 Jun 2017 // 09:48 UTC

Sysadmin blog Next year will see the 20th anniversary of IEEE 802.1Q, the standard that defines the tagged VLAN for Ethernet networks. Despite it being two decades since modern VLANs started being used in anger, a significant number of systems administrators remain afraid of them. Unfortunately, the time is upon us where VLANs are becoming a necessity even in small businesses.

VLANs are used to simulate the physical segmentation of networks without having to actually create separate physical networks. Let's say, for example, that I had two networks that were both set up to be 10.0.0.0/24, and for some reason I needed to run them on the same physical infrastructure - perhaps because of a merger - but they couldn't actually be on the same network.

Before VLANs, this was impossible. You simply could not have two systems with the same IP address on the same physical network. With VLANs, you can, so long as at least one of the networks is set up to be on a separate VLAN.

To understand the technical bits, it's easiest to start with a little bit about how tagging works in practice.

Tagging in practice

Implementing VLANs requires understanding the three basic port types. Access ports only allow untagged packets. Trunk ports only allow tagged packets. Hybrid ports allow both kinds of packets.

On most networks, end user devices plugged into switches don't have to configure a VLAN to access the network. Your Windows workstation, for example, probably just plugs in, gets an IP address via DHCP and you're off to the races. This doesn't mean that the traffic from that port isn't going to participate in a VLAN, it just means that anything attached to that port doesn't need to worry about it.

One can configure a switch to tag all traffic on an untagged port as a given VLAN. So, for example, I could say that all traffic in and out of ports 1-8 was VLAN 10, all traffic in and out of ports 9-12 was VLAN 20, and ports 13-16 were trunk ports that only passed tagged traffic.

If I plugged one computer into port 1 and another into port 8, they could talk to one another because they are both on VLAN 10. However, a computer on port 1 could not talk to one on port 9. This is because port 1's untagged traffic is set for VLAN 10 and port 9's untagged traffic is set for VLAN 20. Similarly, if I plugged a computer into any of ports 13-16 it couldn't talk to anything, because those ports would simply drop untagged traffic.
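The port layout above can be modelled in a few lines of Python. This is only a sketch of the behaviour described here, not how any real switch is configured; the names `ACCESS_VLAN`, `TRUNK_PORTS`, `ingress_vlan` and `can_talk` are made up for illustration.

```python
# Hypothetical model of the 16-port switch described above.
ACCESS_VLAN = {port: 10 for port in range(1, 9)}         # ports 1-8  -> VLAN 10
ACCESS_VLAN.update({port: 20 for port in range(9, 13)})  # ports 9-12 -> VLAN 20
TRUNK_PORTS = set(range(13, 17))                         # ports 13-16: tagged only


def ingress_vlan(port, tag=None):
    """Return the VLAN a frame lands in on arrival, or None if it is dropped."""
    if port in TRUNK_PORTS:
        # Trunk ports keep the tag they are given; untagged frames are dropped.
        return tag
    if port in ACCESS_VLAN:
        # Access ports tag untagged frames and drop anything already tagged.
        return ACCESS_VLAN[port] if tag is None else None
    return None


def can_talk(port_a, port_b):
    """Two untagged hosts can talk only if both ports map to the same VLAN."""
    vlan_a = ingress_vlan(port_a)
    vlan_b = ingress_vlan(port_b)
    return vlan_a is not None and vlan_a == vlan_b
```

With this model, `can_talk(1, 8)` is true, `can_talk(1, 9)` is false, and an untagged host on port 13 cannot reach anything.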

Also important is the concept of the native VLAN. When creating a switch fabric that includes VLANs, switches need to know which VLAN will be connected to ports otherwise unconfigured for VLANs. By default, this is VLAN 1, but this can be changed.

Trunk ports don't need to be switch-to-switch only. A virtual server, for example, is a good candidate for using a trunk port. There's a reasonable chance that different VMs operating on the server will use different VLANs.

Virtual switches can be set up to operate with VLANs. In most hypervisors one can create multiple VM networks, each with a different VLAN, and attach them to a single virtual switch. That virtual switch is then connected to one or more physical network cards, which are in turn connected to physical switches.

Assigning a virtual machine to a VM network configured to a specific VLAN means that all untagged traffic from that VM will be tagged by the virtual switch as belonging to the relevant VLAN. Configuring a virtual switch to allow the guest VM to handle tagging is possible, but it's usually a bad idea.

Personally, I tend to use hybrid ports for my virtual servers. I leave my management networking untagged and make all my VM traffic tagged. This way, if I screw something up in configuring VLANs I still can get to the management interfaces and rectify my problem.

VLAN security

To understand why guest VLAN tagging is rare, or why all ports aren't simply configured as hybrid ports, one must understand the security implications.

It's not hard for a server, switch or virtual machine to be configured to work with VLANs, so don't think that VLANs are some sort of security holy grail. VLANs are absolutely part of proper network segmentation and security; however, vigilance must be applied to ensure that workloads aren't given the opportunity to access networks they shouldn't.

This is why, as a general rule, administrators tend to limit VLANs to switches, both physical and virtual. The individual workloads should communicate with their switch untagged. An application administrator shouldn't have the opportunity to simply decide "I want to see what's going on in VLAN 20, so I'll configure my network card for that and start interrogating the network".

As an additional layer of security, network administrators will typically only allow trunk and hybrid ports access to VLANs which are required on that port instead of any arbitrary VLAN. This helps mitigate the security impact of, for example, accidentally setting a port to hybrid instead of access.
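As a rough illustration of that allow-list idea, here is a minimal Python sketch of a hybrid port. The `HybridPort` class and its field names are hypothetical and do not correspond to any vendor's configuration model.

```python
from dataclasses import dataclass


@dataclass
class HybridPort:
    """Hypothetical hybrid port: one untagged VLAN plus an allow-list of tags."""
    untagged_vlan: int                        # VLAN applied to untagged frames
    allowed_tagged: frozenset = frozenset()   # tagged VLANs permitted here

    def classify(self, tag=None):
        """Return the VLAN for a frame, or None if the frame is dropped."""
        if tag is None:
            return self.untagged_vlan
        # Tagged frames pass only if their VLAN is explicitly allowed.
        return tag if tag in self.allowed_tagged else None


# Example: management traffic untagged on VLAN 1, VM traffic tagged 10 or 20.
host_port = HybridPort(untagged_vlan=1, allowed_tagged=frozenset({10, 20}))
```

Here `host_port.classify(10)` returns 10, while `host_port.classify(30)` returns `None`: a workload that tags its own traffic for a VLAN it was never granted simply gets dropped at the port.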

It may be appropriate for virtualized routers to have unrestricted tagged access to the network. It is probably appropriate for virtual servers to have restricted tagged access. There aren't a lot of other scenarios I can think of.

A slightly technical look

VLANs work by adding a 32-bit field between the source MAC address and the EtherType/length fields in Ethernet frames. In other words, a tagged frame's header carries extra information that an untagged frame's header doesn't have.
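To make the layout concrete, the following Python sketch inserts that 32-bit tag into a raw Ethernet frame. The tag is a 16-bit identifier (0x8100) followed by 16 bits holding a 3-bit priority, a 1-bit drop-eligible flag and the 12-bit VLAN ID. The function names are made up for illustration, and real tagging is done by the NIC, the switch or the operating system rather than by application code.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def add_vlan_tag(frame: bytes, vlan_id: int, priority: int = 0) -> bytes:
    """Insert an 802.1Q tag after the destination and source MAC addresses."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be between 1 and 4094")
    if not 0 <= priority <= 7:
        raise ValueError("priority must be between 0 and 7")
    # Tag control information: priority (3 bits), DEI (1 bit, left 0), VLAN ID (12 bits).
    tci = (priority << 13) | vlan_id
    tag = struct.pack("!HH", TPID_8021Q, tci)
    # Bytes 0-11 are the two MAC addresses; the tag goes right after them.
    return frame[:12] + tag + frame[12:]


def read_vlan_id(frame: bytes):
    """Return the VLAN ID of a tagged frame, or None if the frame is untagged."""
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID_8021Q:
        return None
    return tci & 0x0FFF


# Example: dummy MAC addresses, an IPv4 EtherType and a placeholder payload.
untagged = bytes(6) + bytes(6) + b"\x08\x00" + b"payload"
tagged = add_vlan_tag(untagged, vlan_id=10, priority=5)
```

The tagged frame is exactly four bytes longer than the untagged one, and everything after the tag, including the original EtherType, is left untouched.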

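Conventional (802.1Q) VLANs only allow for 4095 VLANs, because the VLAN ID is a 12-bit field (strictly speaking, that is 4,096 values with 0 and 4095 reserved, leaving 4,094 usable). Newer 802.1aq VLANs allow for up to 16.8 million VLANs, solving a major problem for large networks.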
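The arithmetic behind those two figures is short enough to check by hand. The sketch below assumes the 16.8 million figure corresponds to a 24-bit identifier; the variable names are illustrative only.

```python
VLAN_ID_BITS = 12      # width of the VLAN ID field in an 802.1Q tag
SERVICE_ID_BITS = 24   # assumed width of the identifier behind "16.8 million"

total_vlan_ids = 2 ** VLAN_ID_BITS      # 4096 possible values
usable_vlan_ids = total_vlan_ids - 2    # 0 and 4095 are reserved
service_ids = 2 ** SERVICE_ID_BITS      # 16777216, or roughly 16.8 million
```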

VLANs can be nested, with double tagging (sometimes referred to as Q-in-Q) being an accepted practice. This allows service providers (be they telecommunications carriers or cloud operators) to use VLANs to manage their own networks while allowing tenant networks to use VLANs without interference.

If the idea of "only" 4095 VLANs being considered "a problem" seems a little bit bonkers to you, you're not alone. Manually configuring thousands of VLANs on every switch – both physical and virtual – seems patently insane.

Despite this, even my small five-person business uses 5 VLANs on our production network and 50 in our test lab. It isn't hard to see how larger organizations could run out of 4095 VLANs in a right hurry, to say nothing of hyperscale providers such as public cloud operators.

This then sets us up for a discussion about dynamic VLAN registration and software defined networking.

GARP, GVRP, MRP and MVRP and VTP

Dynamically registering network attributes is not a new concept. Generic Attribute Registration Protocol (GARP) is old, defined in 802.1p and later incorporated into 802.1D, way back in 1998. GARP VLAN Registration Protocol (GVRP) is a means by which switches and standards-compliant connected servers could register attributes with the network dynamically, including VLAN membership.

Multiple Registration Protocol (MRP) and Multiple VLAN Registration Protocol (MVRP) were the replacement protocols for GARP and GVRP. MRP and MVRP were designed as less bandwidth-intensive versions of their predecessors that also allowed for faster network convergence (response to change) times. This became increasingly critical in the mid-2000s as networks with large numbers of VLANs began to appear.

Both of these have been around so long that you'll find support for them even on cheap-as-chips Netgear switches. Of course, some vendors will claim that "there isn't widespread support" and thus push their own proprietary versions of dynamic registration protocols.

Cisco's version of the above is VLAN Trunking Protocol (VTP). Unlike the standards-based versions which rely on advertisements, VTP uses a client-server model to distribute VLAN information. I will leave it up to the real network nerds to engage in debate about why one or the other is better.

What's worth noting is that many virtual switches – most notably VMware's – don't support these protocols. This means that if one were to create a VM network on a virtual host it would not advertise to the rest of the network. Network administrators would still have to manually allow the switch ports connected to that host's network cards to carry that VLAN's traffic.

To say that this is highly inconvenient is putting it rather mildly.

Spanning Tree and Shortest Path Bridging

Spanning Tree Protocol (STP) is a means of allowing switches to cope with multiple interconnections without getting into nasty broadcast loops that can bring down entire fabrics. Think about that time you thought "I need more speed between these two switches", went ahead and plugged a second cable between them and watched the whole network collapse. STP is the thing that prevents that.
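The core idea, keeping a loop-free subset of links active and blocking the rest, can be sketched in Python. This is not the STP protocol itself: there are no BPDUs, bridge priorities or port costs here. As a simplifying assumption it treats the alphabetically lowest switch name as the root and walks outward breadth-first; the function name is hypothetical.

```python
from collections import deque


def spanning_tree(links):
    """Split links into (forwarding, blocked) so that forwarding has no loops.

    links is a list of (switch_a, switch_b) pairs; duplicates model parallel
    cables. Assumes every switch is reachable from the root.
    """
    switches = sorted({name for link in links for name in link})
    root = switches[0]           # stand-in for STP's root bridge election
    visited = {root}
    used = set()                 # indexes of links that stay forwarding
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for index, (a, b) in enumerate(links):
            if index in used:
                continue
            if a == current:
                neighbor = b
            elif b == current:
                neighbor = a
            else:
                continue
            if neighbor not in visited:
                # First link to reach a new switch stays active.
                visited.add(neighbor)
                used.add(index)
                queue.append(neighbor)
    forwarding = [links[i] for i in sorted(used)]
    blocked = [link for i, link in enumerate(links) if i not in used]
    return forwarding, blocked


# Example: two cables between sw1 and sw2, plus a triangle through sw3.
links = [("sw1", "sw2"), ("sw1", "sw2"), ("sw2", "sw3"), ("sw1", "sw3")]
forwarding, blocked = spanning_tree(links)
```

In this example the second sw1-sw2 cable and the sw2-sw3 link end up blocked, which is exactly the "extra cable" situation described above: the redundant links exist, but they carry no traffic until something fails.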

STP was defined in 802.1D and restated in 802.1Q, along with its more grown-up successors: Rapid Spanning Tree (802.1w) and Multiple Spanning Tree (802.1s). It was then completely supplanted by the more VLAN-aware Shortest Path Bridging (SPB) in 802.1aq, which was approved in 2012. SPB is generally considered one of the most significant changes in Ethernet's long history, and it's not hard to see why.

STP was significant because it allowed organizations to wire up redundant links between switches without cratering the whole network, adding reliability that would otherwise not have been possible until rapid-reconvergence, software-defined network fabrics emerged two decades later. It made modern networks possible.

SPB allows organizations to not only wire up redundant links between switches, but to use all those links simultaneously. Previous attempts (such as TRILL) were either proprietary or suffered from practical setbacks that severely limited their deployment.

SPB can understand double-tagged VLANs and ensure that packets from a given VLAN follow the same path through the network. This ensures that a given VLAN's packets don't have wildly varying latency. It also supports 16M VLANs instead of the classic 4k.

In other words SPB allows you to wire your network up like Dr Seuss' worst nightmare and it will not only not break, it will operate efficiently. Basically, it's black magic.

Encapsulation

The solution preferred by VMware (and some of the other SDN players) to all of the above is to bypass switch awareness of VLANs altogether. By all means, set up a resilient physical network fabric that uses links efficiently, but VMware believes control of VLANs should be up to software.

Instead of using 802.1Q VLANs, VMware prefers the use of the VXLAN encapsulation protocol, which does what it says on the tin. It encapsulates packets to segment networks instead of changing the packet header. (STT and GRE are competing encapsulation protocols, but VXLAN appears to be winning.)
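To show what "encapsulation" means in practice, here is a minimal Python sketch of the 8-byte VXLAN header: a flags byte with the "network identifier is valid" bit set, then a 24-bit VXLAN network identifier (VNI), with the remaining bits reserved. The function names are hypothetical, and the sketch deliberately leaves out the outer Ethernet, IP and UDP headers that a real hypervisor or kernel would wrap around the result.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # flag bit indicating the VNI field is valid


def vxlan_encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prefix an Ethernet frame with an 8-byte VXLAN header."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + 24 reserved bits. Word 2: VNI (24 bits) + 8 reserved bits.
    header = struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)
    return header + inner_frame


def vxlan_decapsulate(payload: bytes):
    """Return (vni, inner_frame) from a VXLAN payload."""
    word1, word2 = struct.unpack("!II", payload[:8])
    if not (word1 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return word2 >> 8, payload[8:]


# Example: a dummy 14-byte Ethernet header plus a placeholder payload.
inner = bytes(14) + b"data"
packet = vxlan_encapsulate(inner, vni=5000)
```

Note that the inner frame is carried unmodified. Unlike the 802.1Q tag, nothing is spliced into the original header, which is the distinction being drawn above.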

VXLAN has a lot of advantages over VLANs. For example, it's routable, and can handle 16M VLANs. Its routability means that virtual networking can be more easily accomplished across physically disconnected networks, allowing administrators to shrink the layer 2 failure domain.

While a single large network with lots of interconnects between switches can provide for high throughput without needing expensive routers, with classic VLANs a single bad NIC transmitting bad frames can wreck a VLAN or even the entire fabric.

The future

That said, neither classic VLANs nor encapsulated VLANs are the whole of the solution. What administrators both need and want is for both to interoperate and do so in a dynamic, scriptable and centrally controllable fashion.

Unfortunately, getting to that utopia would require various tech industry titans to stop trying to create monopolies and actually work together. In other words: that isn't going to happen any time soon.

For the foreseeable future classic and encapsulated VLANs will have to be managed separately, and usage of both will likely increase as networks grow increasingly complex. Best practices calling for network segmentation and the increasing pressure of regulatory compliance will drive adoption, even within small networks.

VLANs aren't scary. Even the newfangled SDN VXLANs aren't all that hard. If you haven't taken the plunge, it's time to experiment. Good luck. ®
