I am trying to understand some interesting layer 2 behavior that occurred when pruning a VLAN.
The topology is a 3750 stack uplinked to a 6509 and a 4006 via 1 Gb fiber ports configured for 802.1Q trunking. The 6509 and 4006 are also connected by a trunk. Initially VLAN 10 was allowed on all trunks, and Spanning Tree was blocking the 3750 port to the 6509. The 4006 is the root bridge.
The original problem was intermittent packet loss on VLAN 10, with no port errors or other obvious issues. Other VLANs using the 6509 link were error-free. To force VLAN 10 traffic onto the 6509 link, VLAN 10 was cleared from the 4006 trunk port. As expected, VLAN 10 then used the 6509, and no errors were seen, at least not initially. After an hour or so, 3 IP phones (out of 100 plus on the stack) started resetting every few minutes.
Observing the traffic counters, it became clear that VLAN 10 traffic was looping in one direction. The outbound utilization on the 6509 matched the inbound on the 4006. The inbound utilization on the 4006 was about 10 times the outbound. A Sniffer SPAN'd to the port did not show this asymmetry. At that point the 3750 trunk to the 4006 was also pruned to disallow VLAN 10, which broke the loop and returned traffic to normal.
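The counter comparison described above can be sketched roughly as follows. This is only an illustration in Python, not a tool that was used here; the function names, the sample rates, and the 5x threshold are all assumptions made for the example.

```python
def direction_ratio(in_bps: float, out_bps: float) -> float:
    """Return the inbound/outbound utilization ratio for one trunk port."""
    if out_bps <= 0:
        # No outbound traffic: infinite skew if anything is arriving, else balanced.
        return float("inf") if in_bps > 0 else 1.0
    return in_bps / out_bps


def looks_one_way(in_bps: float, out_bps: float, threshold: float = 5.0) -> bool:
    """Flag a port whose traffic is heavily skewed toward one direction.

    The threshold is a hypothetical cutoff, not a vendor recommendation.
    """
    ratio = direction_ratio(in_bps, out_bps)
    return ratio >= threshold or ratio <= 1.0 / threshold


# Hypothetical sample rates: inbound roughly 10 times outbound, as on the 4006 port.
print(looks_one_way(40e6, 4e6))
# A roughly balanced port for comparison.
print(looks_one_way(10e6, 8e6))
```

Polling both ends of each trunk and comparing the two directions this way would show whether the skew is confined to one link or follows the whole path.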
The looping traffic was not Ethernet broadcasts, which the Sniffer would have seen, so I suspect it was VLAN 10 PVST BPDUs. But why would the 4006, which was the root for VLAN 10, not discard BPDUs that it had originated? Or will a port SPAN session on a trunk not show inbound traffic on a pruned VLAN that is actually being forwarded by the switch? Unfortunately I did not put the Sniffer on the 6509 port.
Clearly, when pruning VLANs, both sides of a trunk should match, but what happens when only one side is pruned? Shouldn't the pruned trunk block that VLAN in both directions?
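To make the question concrete, here is a small Python model of a trunk whose two ends have different allowed-VLAN lists. It is a sketch under stated assumptions, not a description of how any particular Catalyst behaves: the `TrunkPort` class, the `ingress_filtering` flag, and VLANs 1 and 20 are hypothetical, and whether the pruned side really drops inbound frames is exactly what is being asked.

```python
from dataclasses import dataclass, field


@dataclass
class TrunkPort:
    """One end of a trunk, with its allowed-VLAN list (hypothetical model)."""
    name: str
    allowed_vlans: set = field(default_factory=set)


def frame_crosses(sender: TrunkPort, receiver: TrunkPort, vlan: int,
                  ingress_filtering: bool = True) -> bool:
    """Return True if a tagged frame sent by `sender` is accepted by `receiver`."""
    if vlan not in sender.allowed_vlans:
        return False  # the sender does not transmit frames for a pruned VLAN
    if ingress_filtering and vlan not in receiver.allowed_vlans:
        return False  # the receiver drops frames tagged with a VLAN it does not allow
    return True


# VLAN 10 pruned on the 4006 side only; VLANs 1 and 20 are made-up examples.
sw4006 = TrunkPort("4006", {1, 20})
sw3750 = TrunkPort("3750", {1, 10, 20})

print(frame_crosses(sw4006, sw3750, 10))   # 4006 -> 3750
print(frame_crosses(sw3750, sw4006, 10))   # 3750 -> 4006, filtering assumed
print(frame_crosses(sw3750, sw4006, 10, ingress_filtering=False))  # no filtering
```

With filtering assumed, VLAN 10 is blocked in both directions even though only one side was pruned. With the flag off, the link passes VLAN 10 one way only, which is the kind of unidirectional behavior the counters suggested.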
It should block if it is pruned on one side. We have a huge setup similar to yours, and that's what we have in a client/server environment. Is it possible you have a dirty uplink or crosslink that could be causing Spanning Tree to flap? If it's a fiber environment, you should think about using UDLD if you aren't already. Also check your version of code on the 4006; I know that in earlier 6.x code there was an issue where nodes in a VLAN would just stop talking to each other for no reason. Check for any topology changes during the problem time.
CSCdt80707 Bug Details
Partial loss of connectivity on a 4006 switch.
On a Catalyst 4006 with a SupII switch engine, switch ports in the
same VLAN may lose connectivity with one another.
The loss of connectivity results in a VLAN appearing to be partitioned
into several isolated segments. A host may be able to ping one set of
devices in its VLAN, while it can not ping another set of devices in
the same VLAN.
This loss of connectivity is independent of the slot that a linecard
is installed in, i.e. the same set of ports on a given linecard are
affected regardless of the slot that the linecard is installed in.
The workaround is to reset the switch.
This problem has a software work-around in versions 5.5(7), 6.1(3), 6.2(1)
and later. The software fix for this issue will be in 5.5(10), 6.2(3), 6.3(1).
See CSCdu48749 for more details and other symptoms of this issue.
The 4006 is running 7.3, if memory serves. I am going to do some more testing, but it appears that trunk pruning results in a unidirectional link: no frames go out, but inbound broadcast frames are still flooded. Perhaps it is a 4006-only issue, but it would be good to know for sure. I agree UDLD should have been on, but I thought pruning blocked both ways.
Similar to the bug you mention, our 4006s exhibit unpredictable loss of connectivity if the Sup II gig ports are in use and a SPAN session is also set up. As a result, we just don't use the Sup ports.