Solved: Re: Purpose of the native VLAN in modern networks?

a1111 · 09-24-2024

Hello,

Can someone please shed some light on the purpose of a native VLAN in a modern network?

I’ve seen the classic explanation, as outlined in this previous post:

https://community.cisco.com/t5/switching/why-native-vlan-is-used/td-p/2694265

Ahmed Muhi's phrasing captures it well:

Suppose you have the topology: SW1 <-> hub <-> SW2.

So, the switches are connected to each other via a hub. Both switches have end devices in different VLANs, and the hub has end devices plugged in as well.

The link between the switches and the hub can’t be an access link. If it were, only devices in the same VLAN as those two links could communicate. For example, if the access VLAN between the hub and switches is VLAN 10, only devices in VLAN 10 would communicate, while others in different VLANs wouldn’t.

So the links must be trunks. Hubs aren't VLAN-aware, which means they can't remove dot1q tags, and most end devices aren't VLAN-aware either. As a result, the devices that are plugged into the hub are in the native VLAN.

This explanation makes sense, but hubs are hardly used anymore. Is the native VLAN still relevant in modern networks? What is its purpose now?

Some forum posts say that DTP, STP, LLDP, CDP, VTP, PAgP, LACP, UDLD, etc. (???) traffic uses the native VLAN. This is contradicted by other posts which say that VLAN 1 is used for this purpose, even if you change the native VLAN from VLAN 1 to some other VLAN. What everyone agrees on is that these protocols still work, even if the native VLAN or VLAN 1 (depending on which is thought to carry the protocols) isn’t in the allowed list on trunks.

To make things even more confusing: some recommend that for security purposes, you change the native VLAN to a random VLAN, which has no interfaces in it. So not only does the native VLAN not serve a purpose, it's even a security risk, at least according to some!

Also, I’ve seen another explanation stating that you could have a trunk link connected to a server that is not VLAN-aware. But why wouldn’t you just bundle multiple access ports in a LAG instead of using a trunk?

Would love to hear your thoughts on this, especially in relation to current networking practices!

Ramblin Tech · 09-24-2024

My opinion, FWIW,... whatever utility the native VLAN concept might have provided long ago is ancient history now and native VLANs should be relegated to the archives (much like most of the ramblings I post, BTW).

Every frame on a trunk should be explicitly VLAN-tagged, with the only exceptions being the L2 Control Protocols (L2CP) which tend to be designed as untagged, non-transit, link-local sessions. It is my belief (unsupported by any hard evidence) that native-VLAN mismatches on trunks have caused more aggravation, security breaches, and outages than the concept is worth. Even if there is going to be only a single VLAN on a link between two switches/routers, make that link a trunk and explicitly tag that transit traffic. L2CP traffic will then be the only untagged traffic that you would ever see in a packet capture on a trunk.

Disclaimers: I am long in CSCO. Bad answers are my own fault as they are not AI generated.

Hamed Fazel · 09-24-2024

Hi @a1111

In Ethernet networking, there is no specific field within the Ethernet frame that explicitly defines the "native VLAN." The concept of a native VLAN is important when dealing with trunk ports, which are used to carry traffic from multiple VLANs across a single link. Any untagged frame that arrives on a trunk port is assumed by the switch to belong to the native VLAN.

Here’s how it works:

Native VLAN: On a trunk port, traffic from multiple VLANs is carried, but any traffic that arrives untagged (i.e., without an 802.1Q VLAN tag) is assumed to belong to the native VLAN. This allows legacy devices that don’t support VLAN tagging to communicate on a network.
Ethernet Frame: In a standard Ethernet frame, there is no field that identifies which VLAN it belongs to unless it's explicitly tagged (using 802.1Q tagging). When a frame is untagged, the switch assigns it to the native VLAN of the trunk port.
Native VLAN Mismatch Detection: Although the Ethernet frame itself doesn’t carry any native VLAN information, some control protocols do. CDP (Cisco Discovery Protocol) advertises the port's native VLAN ID, and on Cisco switches the per-VLAN spanning tree (PVST+) BPDUs (Bridge Protocol Data Units) carry the ID of the VLAN they were sent in. If there’s a mismatch between the native VLAN settings on two connected devices, the switch can detect this through these control packets and flag a "native VLAN mismatch" error.
Why It Matters: Native VLAN mismatches can lead to issues such as traffic being delivered to the wrong VLAN or dropped, and they may even expose the network to security risks, because untagged traffic ends up in whichever VLAN the receiving switch treats as native.

In summary, although Ethernet frames don’t have a field for native VLANs, switches can detect mismatches through control protocols such as CDP and spanning-tree BPDUs, which helps keep the configuration on both ends of a trunk consistent.
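
To make the classification rule above concrete, here is a minimal Python sketch of how a trunk port might decide which VLAN a received frame belongs to, and what happens when the two ends of a trunk disagree on the native VLAN. The function names (`classify_vlan`, `build_frame`) and the VLAN numbers are made up for illustration; this models the idea, not any particular switch implementation.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that marks an 802.1Q tag


def classify_vlan(frame: bytes, native_vlan: int) -> int:
    """Return the VLAN a trunk port would assign to a received frame."""
    # Bytes 12-13 follow the destination and source MAC addresses.
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype == TPID_8021Q:
        # Tagged: the low 12 bits of the next two bytes are the VLAN ID.
        (tci,) = struct.unpack_from("!H", frame, 14)
        return tci & 0x0FFF
    # Untagged: nothing in the frame names a VLAN, so the port's
    # configured native VLAN is used.
    return native_vlan


def build_frame(vlan=None) -> bytes:
    """Build a minimal IPv4-typed frame, tagged only if vlan is given."""
    macs = bytes(6) + bytes(6)  # placeholder destination and source MACs
    tag = b"" if vlan is None else struct.pack("!HH", TPID_8021Q, vlan)
    return macs + tag + struct.pack("!H", 0x0800) + b"payload"


untagged = build_frame()
tagged_30 = build_frame(vlan=30)

# Hypothetical mismatch: SW1 uses native VLAN 10, SW2 uses native VLAN 20.
sw1_native, sw2_native = 10, 20
print(classify_vlan(untagged, sw1_native))   # 10: what SW1 meant
print(classify_vlan(untagged, sw2_native))   # 20: what SW2 assumes
print(classify_vlan(tagged_30, sw2_native))  # 30: tag wins over native
```

The same untagged bytes land in VLAN 10 on one switch and VLAN 20 on the other, which is the leak a native VLAN mismatch causes. Tagged frames are unaffected.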

a1111 · 09-26-2024

Hi,

Thank you for the thorough response.

If I understand you correctly, there is no benefit in modern networks for the native VLAN.

An incorrect configuration can result in harm ("expose the network to security risks"). But a correct configuration can only result in no harm done.

Best-case scenario: the native VLAN is just administrative overhead. The only benefit of a correct configuration is that it avoids the harm that an incorrect configuration would cause.

However, eliminating the native VLAN entirely would remove both the administrative burden and the potential for misconfiguration.

I'm genuinely interested in finding as many good use cases for native VLANs in modern networks as possible. Can you please provide some?

Thank you.

Joseph W. Doherty · 09-24-2024

"Also, I’ve seen another explanation stating that you could have a trunk link connected to a server that is not VLAN-aware. But why wouldn’t you just bundle multiple access ports in a LAG instead of using a trunk?"

Can you post a reference to that explanation?

"Can someone please shed some light on the purpose of a native VLAN in a modern network?"

Laugh, because modern networks sometimes have some very old technology running within them that's critical to the business and for which there's no (simple) replacement.

For an example of another old network technology, some newer switches stopped supporting 10/half, but, of course, there's some host that only runs 10/half.

"Also, I’ve seen another explanation stating that you could have a trunk link connected to a server that is not VLAN-aware. But why wouldn’t you just bundle multiple access ports in a LAG instead of using a trunk?"

In Cisco technology, a trunk is an interface that carries multiple VLANs, and a LAG (EtherChannel) is a bundle of multiple interfaces treated as a single interface, which might, or might not, also be a trunk. I.e., two totally different things. I believe other vendors may use different terminology for the same two technologies.

a1111 · 09-26-2024

Hi,

Thank you for the thorough response.

"Can you post a reference to that explanation?"

Yes. But now that I've found the source again, I realized that I remembered it incorrectly!

Link:
https://community.cisco.com/t5/switching/vlans-default-native-and-management/td-p/2527491

Answer:
"1) first scenario for native vlan. think about a situation in which some device (perhaps your PC) that does not understand tagged frames is connected to a switch port. That switch port is configured as a trunk. If the switch sends frames over that port that are tagged the device will not understand them. But the frames sent on the native vlan are not tagged and the device will understand them and process them."

The answer doesn't talk about a server. But I still don't understand it. If it's a normal end host ("perhaps your PC" is the example), then why wouldn't the port be configured in access mode anyway?

Can you please give an example of a host that only runs 10/half?

"In Cisco technology, trunks are multi VLAN supporting interfaces and LAG (Etherchannel) is a bundle of multiple interfaces treated as a single interface, which might, or might not, also be a trunk. I.e. two totally different things. Believe other vendors may use different terminology for the same two technologies."

What I had in mind is a server with multiple physical NICs, and multiple switchports connected to it. All of those switchports are in the same access VLAN, and they are bundled in a LAG.

Can you please suggest a benefit of configuring those links as trunk links instead of access links bundled in a LAG?

Thank you.

Giuseppe Larosa · 09-24-2024

Hello @a1111 ,

>> To make things even more confusing: some recommend that for security purposes, you change the native VLAN to a random VLAN, which has no interfaces in it. So not only does the native VLAN not serve a purpose, it's even a security risk, at least according to some!

Using a dedicated native VLAN that is not assigned to any access port is a countermeasure against the double-tagging VLAN hopping L2 attack. The attacker sends frames with two 802.1Q tags, where the outer tag equals the VLAN ID of the access port the attacker is connected to. If that VLAN happens to be the native VLAN on a trunk, the switch removes the outer tag and sends the frame onto the trunk with only the inner 802.1Q tag, so the frame hops into the VLAN named by the inner tag.
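
The mechanics of the attack can be sketched as a small Python simulation. This is a simplified model under stated assumptions: the helper names (`access_ingress`, `trunk_egress`, `trunk_ingress`, `hop`) are invented for illustration, a frame is reduced to its list of VLAN tags, and the first switch is assumed to accept a tagged frame on an access port when the outer tag matches the access VLAN (some platforms drop such frames instead).

```python
def access_ingress(tags, access_vlan):
    """First switch, access port: classify the frame into the access VLAN.

    Assumption: an outer tag equal to the access VLAN is accepted and
    stripped. Any other outer tag causes the frame to be dropped.
    """
    if tags and tags[0] != access_vlan:
        return None
    return access_vlan, tags[1:]


def trunk_egress(vlan, tags, native_vlan):
    """Tags on the wire when a frame in `vlan` leaves a trunk port."""
    if vlan == native_vlan:
        return list(tags)        # native VLAN: no tag is added
    return [vlan] + list(tags)   # any other VLAN: its tag is pushed


def trunk_ingress(tags, native_vlan):
    """Second switch, trunk port: the outer tag (if any) picks the VLAN."""
    if tags:
        return tags[0], tags[1:]
    return native_vlan, []


def hop(attacker_vlan, target_vlan, native_vlan):
    """Return the VLAN the attacker's frame ends up in on the far switch."""
    crafted = [attacker_vlan, target_vlan]  # outer tag, inner tag
    vlan, rest = access_ingress(crafted, attacker_vlan)
    on_wire = trunk_egress(vlan, rest, native_vlan)
    final_vlan, _ = trunk_ingress(on_wire, native_vlan)
    return final_vlan


# Attacker sits in VLAN 1, which is also the trunk's native VLAN.
print(hop(attacker_vlan=1, target_vlan=20, native_vlan=1))    # 20
# Same attack with an unused native VLAN (999) on the trunk.
print(hop(attacker_vlan=1, target_vlan=20, native_vlan=999))  # 1
```

With the native VLAN moved to an ID that no access port uses, the frame keeps its outer tag on the trunk and stays in the attacker's own VLAN.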

The modern trend is to use all VLANs tagged between switches for user traffic.

This provides some advantages: the 4-byte 802.1Q tag includes the 802.1p CoS (priority) bits, so tagged frames can be processed by QoS mechanisms.
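
As a rough illustration of why an untagged frame has no CoS to act on, here is how the 16-bit Tag Control Information field of the 802.1Q tag splits into a 3-bit priority (PCP, the 802.1p bits), a 1-bit drop-eligible indicator (DEI), and a 12-bit VLAN ID. The function names are made up for this sketch.

```python
def build_tci(pcp: int, dei: int, vid: int) -> int:
    """Pack priority, drop-eligible bit, and VLAN ID into a 16-bit TCI."""
    if not (0 <= pcp <= 7 and dei in (0, 1) and 0 <= vid <= 4095):
        raise ValueError("field out of range")
    return (pcp << 13) | (dei << 12) | vid


def parse_tci(tci: int):
    """Split a 16-bit TCI into (pcp, dei, vid)."""
    pcp = (tci >> 13) & 0x7   # top 3 bits: 802.1p priority
    dei = (tci >> 12) & 0x1   # next bit: drop eligible indicator
    vid = tci & 0x0FFF        # low 12 bits: VLAN ID
    return pcp, dei, vid


tci = build_tci(pcp=5, dei=0, vid=100)
print(hex(tci))        # 0xa064
print(parse_tci(tci))  # (5, 0, 100)
```

An untagged frame carries none of these bits, so any L2 priority marking is lost on a link where the traffic rides the native VLAN untagged.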

An 802.1Q header is also a requirement for some advanced monitoring methods such as Ethernet OAM, which provides the L2 equivalent of IP SLA (just to make a comparison) to qualify end-to-end L2 paths in Metro Ethernet or service provider environments.

For user-facing ports, the untagged native VLAN still has some value and use. For example, you can have an ESXi host where the hypervisor can be reached untagged in the native VLAN while several VMs use tagged frames with different VLAN IDs. Sometimes when an ESXi host restarts it may be missing some of the VLAN-tagged vNICs (a special case, not the rule), so you can still connect to the hypervisor and restore the missing tagged vNICs.

Hope to help

Giuseppe

a1111 · 09-26-2024

Hi,

Thank you for the thorough response.

The first part of your answer seems to question why the native VLAN even needs to exist. Since it can be used to attack the network, it’s not only a threat, but also creates extra work for administrators, because they need to implement countermeasures.

But the second part of your answer gives a good use case for the native VLAN, if I understand you correctly.

In special cases, when the ESXi host restarts, it can't handle VLAN tags. I assume it has something to do with the virtual switch. But if you can connect to the ESXi host via the trunk link, using native VLAN frames, you can boot up the virtual switch. From then on, it can handle VLAN tags.

Is this what you have in mind?

Thank you.