
UCS B200M3, 3750G Switch connectivity oddness

Chris Hibbert
Level 1

Morning All,

Hoping someone can help, I have a strange situation that occurs with my new UCS installation.

Configuration is:

UCS Chassis connected to a pair of 6248UP Fabric Interconnects, using 2208XP IO Modules.

I connect from this to a distribution 3750 stack. I have configured two EtherChannels on the 3750 stack and in UCS Manager, and these are configured to pass all the relevant VLANs. I have three blades (B200 M3) installed in the chassis, each with VMware ESXi 5 installed.

Once they have all been set up they can communicate both ways to and from the network without problem, both the vmware networks and the LAN.

Overnight (without any changes) 2 of the blades (slots 2 and 3) stop communicating to the network. I can get them working again by making a few changes to the network settings and all will be ok until the next day.

I am at a loss as to what can be causing this.

Any help would be great.

Thanks

Chris

1 Accepted Solution


That should fix your issue. UCS will not forward unknown unicast, so if a UCS blade/VM MAC address ages out on your 3750s, the outside world will not be able to reach it. In normal production operation, servers are usually chatty enough to keep the aging timers from expiring, so you'll likely only see this now, during the install, when there are few or no VMs sending or receiving. Another option is to increase the aging timers on the 3750.

Let me know how it goes tomorrow.

Regards,

Robert


17 Replies

Robert Burns
Cisco Employee

Please connect to the UCSM CLI and collect the following output:

connect nxos

show cdp neighbor

show port-c sum

show int trunk

Paste here.

Regards,

Robert

Hi, thanks for your reply. Please see the information requested:

trl-secure-A(nxos)# show cdp neighbors

Capability Codes: R - Router, T - Trans-Bridge, B - Source-Route-Bridge

                  S - Switch, H - Host, I - IGMP, r - Repeater,

                  V - VoIP-Phone, D - Remotely-Managed-Device,

                  s - Supports-STP-Dispute

Device ID              Local Intrfce   Hldtme  Capability  Platform      Port ID

trl-secure-A(nxos)# show port

port            port-channel    port-profile    port-security

trl-secure-A(nxos)# show port-channel  summary

Flags:  D - Down        P - Up in port-channel (members)

        I - Individual  H - Hot-standby (LACP only)

        s - Suspended   r - Module-removed

        S - Switched    R - Routed

        U - Up (port-channel)

--------------------------------------------------------------------------------

Group Port-       Type     Protocol  Member Ports

      Channel

--------------------------------------------------------------------------------

7     Po7(SU)     Eth      LACP      Eth1/27(P)   Eth1/28(P)

1284  Po1284(SU)  Eth      NONE      Eth1/1/5(P)  Eth1/1/7(P)

1285  Po1285(SU)  Eth      NONE      Eth1/1/6(P)  Eth1/1/8(P)

1286  Po1286(SU)  Eth      NONE      Eth1/1/1(P)  Eth1/1/3(P)

1287  Po1287(SU)  Eth      NONE      Eth1/1/2(P)  Eth1/1/4(P)

1290  Po1290(SU)  Eth      NONE      Eth1/1/9(P)  Eth1/1/11(P)

1291  Po1291(SU)  Eth      NONE      Eth1/1/10(P)  Eth1/1/12(P)

trl-secure-A(nxos)# show in

in-order-guarantee   incompatibility      install              interface            inventory

trl-secure-A(nxos)# show interface tr

transceiver   trunk

trl-secure-A(nxos)# show interface trunk

--------------------------------------------------------------------------------

Port          Native  Status        Port

              Vlan                  Channel

--------------------------------------------------------------------------------

Eth1/15       1       trunking      --

Eth1/27       1       trnk-bndl     Po7

Eth1/28       1       trnk-bndl     Po7

Po7           1       trunking      --

Veth693       2       trunking      --

Veth698       1       trunking      --

Veth699       1       trunking      --

Veth701       2       trunking      --

Veth703       1       trunking      --

Veth705       1       trunking      --

Veth707       2       trunking      --

Veth709       1       trunking      --

Veth711       1       trunking      --

Eth1/1/33     1       trunking      --

--------------------------------------------------------------------------------

Port          Vlans Allowed on Trunk

--------------------------------------------------------------------------------

Eth1/15       1-2,5,200

Eth1/27       1-2,5,200

Eth1/28       1-2,5,200

Po7           1-2,5,200

Veth693       2

Veth698       1

Veth699       2,200

Veth701       2

Veth703       1

Veth705       2,200

Veth707       2

Veth709       1

Veth711       2,200

Eth1/1/33     1-2,5,200,4044,4047-4049

--------------------------------------------------------------------------------

Port          Vlans Err-disabled on Trunk

--------------------------------------------------------------------------------

Eth1/15       none

Eth1/27       none

Eth1/28       none

Po7           none

Veth693       none

Veth698       none

Veth699       none

Veth701       none

Veth703       none

Veth705       none

Veth707       none

Veth709       none

Veth711       none

Eth1/1/33     none

--------------------------------------------------------------------------------

Port          STP Forwarding

--------------------------------------------------------------------------------

Eth1/15       1-2,5,200

Eth1/27       none

Eth1/28       none

Po7           1-2,5,200

Veth693       2

Veth698       1

Veth699       2,200

Veth701       2

Veth703       1

Veth705       2,200

Veth707       2

Veth709       1

Veth711       2,200

Eth1/1/33     1-2,5,200,4044,4047-4049

--------------------------------------------------------------------------------

Port          Vlans in spanning tree forwarding state and not pruned

--------------------------------------------------------------------------------

Eth1/15       --

Eth1/27       --

Eth1/28       --

Po7           --

Veth693       --

Veth698       --

Veth699       --

Veth701       --

Veth703       --

Veth705       --

Veth707       --

Veth709       --

Veth711       --

Eth1/1/33     --

trl-secure-A(nxos)#
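For longer outputs, the port-channel summary pasted above can be checked mechanically. The following is a minimal Python sketch that parses the summary rows and lists any member port whose flag is not `P`. The function names and regular expressions are illustrative assumptions based only on the row layout shown in this post; other NX-OS releases may format the table differently.

```python
import re

# Row layout assumed from the output in this thread, e.g.
# "7     Po7(SU)     Eth      LACP      Eth1/27(P)   Eth1/28(P)"
ROW = re.compile(r"^\s*\d+\s+(Po\d+)\((\w+)\)\s+\S+\s+(\S+)\s+(.*)$")
MEMBER = re.compile(r"(Eth[\d/]+)\((\w)\)")


def parse_summary(text):
    """Map each port channel to its protocol and a {member: flag} dict."""
    channels = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            name, _flags, protocol, members = match.groups()
            channels[name] = {
                "protocol": protocol,
                "members": dict(MEMBER.findall(members)),
            }
    return channels


def members_not_bundled(channels):
    """Return (port_channel, member, flag) for every member not flagged P."""
    return [
        (name, port, flag)
        for name, info in channels.items()
        for port, flag in info["members"].items()
        if flag != "P"
    ]


# Two rows copied from the output above.
SAMPLE = """\
7     Po7(SU)     Eth      LACP      Eth1/27(P)   Eth1/28(P)
1284  Po1284(SU)  Eth      NONE      Eth1/1/5(P)  Eth1/1/7(P)
"""

if __name__ == "__main__":
    parsed = parse_summary(SAMPLE)
    print(parsed["Po7"])
    print(members_not_bundled(parsed))
```

An empty list from `members_not_bundled` means every member is up in its port channel, which matches what the summary in this post shows.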

Do you have CDP enabled on the 3750 stack interfaces?  No output in the show CDP neighbors.

I'd also like to see the interface config for the 3750 ports (both Port Channel and member interfaces).

show run int x/y

show run int po x

Robert

Thanks again. No, CDP isn't enabled by default on this one; I have enabled it to help find a fix.

Here's the info.

First, the CDP neighbors report:

trl-secure-A(nxos)# show cdp neighbors

Capability Codes: R - Router, T - Trans-Bridge, B - Source-Route-Bridge

                  S - Switch, H - Host, I - IGMP, r - Repeater,

                  V - VoIP-Phone, D - Remotely-Managed-Device,

                  s - Supports-STP-Dispute

Device ID              Local Intrfce   Hldtme  Capability  Platform      Port ID

U12CoreRed.trlsecure.l Eth1/27         124     R S I       WS-C3750G-24T Gig1/0/15

U12CoreRed.trlsecure.l Eth1/28         125     R S I       WS-C3750G-24T Gig2/0/18

trl-secure-A(nxos)#

--------------------------------------------------------------------------------------------------------------------------

Port-channel 7 Info:

U12CoreRed#show int port-channel 7

Port-channel7 is up, line protocol is up (connected)

  Hardware is EtherChannel, address is 0022.916c.7492 (bia 0022.916c.7492)

  Description: UBS FabA

  MTU 1500 bytes, BW 2000000 Kbit/sec, DLY 10 usec,

     reliability 255/255, txload 1/255, rxload 1/255

  Encapsulation ARPA, loopback not set

  Keepalive set (10 sec)

  Full-duplex, 1000Mb/s, link type is auto, media type is unknown

  input flow-control is off, output flow-control is unsupported

  Members in this channel: Gi1/0/15 Gi2/0/18

  ARP type: ARPA, ARP Timeout 04:00:00

  Last input never, output 00:00:00, output hang never

  Last clearing of "show interface" counters never

  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0

  Queueing strategy: fifo

  Output queue: 0/40 (size/max)

  5 minute input rate 0 bits/sec, 0 packets/sec

  5 minute output rate 3000 bits/sec, 4 packets/sec

     3850680 packets input, 3822146418 bytes, 0 no buffer

     Received 254378 broadcasts (172854 multicasts)

     0 runts, 0 giants, 0 throttles

     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored

     0 watchdog, 172854 multicast, 0 pause input

     0 input packets with dribble condition detected

     14766733 packets output, 7313073697 bytes, 0 underruns

     0 output errors, 0 collisions, 4 interface resets

     0 unknown protocol drops

     0 babbles, 0 late collision, 0 deferred

     0 lost carrier, 0 no carrier, 0 pause output

     0 output buffer failures, 0 output buffers swapped out

U12CoreRed#show int g1/0/15

GigabitEthernet1/0/15 is up, line protocol is up (connected)

  Hardware is Gigabit Ethernet, address is 0022.916c.730f (bia 0022.916c.730f)

  MTU 1500 bytes, BW 1000000 Kbit/sec, DLY 10 usec,

     reliability 255/255, txload 1/255, rxload 1/255

  Encapsulation ARPA, loopback not set

  Keepalive set (10 sec)

  Full-duplex, 1000Mb/s, media type is 10/100/1000BaseTX

  input flow-control is off, output flow-control is unsupported

  ARP type: ARPA, ARP Timeout 04:00:00

  Last input 00:00:26, output 00:00:04, output hang never

  Last clearing of "show interface" counters never

  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0

  Queueing strategy: fifo

  Output queue: 0/40 (size/max)

  5 minute input rate 0 bits/sec, 0 packets/sec

  5 minute output rate 2000 bits/sec, 4 packets/sec

     234954 packets input, 99988405 bytes, 0 no buffer

     Received 166587 broadcasts (86599 multicasts)

     0 runts, 0 giants, 0 throttles

     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored

     0 watchdog, 86599 multicast, 0 pause input

     0 input packets with dribble condition detected

     6851946 packets output, 600462501 bytes, 0 underruns

     0 output errors, 0 collisions, 4 interface resets

     54387 unknown protocol drops

     0 babbles, 0 late collision, 0 deferred

     0 lost carrier, 0 no carrier, 0 pause output

     0 output buffer failures, 0 output buffers swapped out

U12CoreRed#show int g2/0/18

GigabitEthernet2/0/18 is up, line protocol is up (connected)

  Hardware is Gigabit Ethernet, address is 0022.916c.7492 (bia 0022.916c.7492)

  MTU 1500 bytes, BW 1000000 Kbit/sec, DLY 10 usec,

     reliability 255/255, txload 1/255, rxload 1/255

  Encapsulation ARPA, loopback not set

  Keepalive set (10 sec)

  Full-duplex, 1000Mb/s, media type is 10/100/1000BaseTX

  input flow-control is off, output flow-control is unsupported

  ARP type: ARPA, ARP Timeout 04:00:00

  Last input 00:00:12, output 00:00:33, output hang never

  Last clearing of "show interface" counters never

  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0

  Queueing strategy: fifo

  Output queue: 0/40 (size/max)

  5 minute input rate 0 bits/sec, 0 packets/sec

  5 minute output rate 0 bits/sec, 0 packets/sec

     3615893 packets input, 3722198373 bytes, 0 no buffer

     Received 87958 broadcasts (86411 multicasts)

     0 runts, 0 giants, 0 throttles

     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored

     0 watchdog, 86411 multicast, 0 pause input

     0 input packets with dribble condition detected

     7924984 packets output, 6714119929 bytes, 0 underruns

     0 output errors, 0 collisions, 5 interface resets

     0 unknown protocol drops

     0 babbles, 0 late collision, 0 deferred

     0 lost carrier, 0 no carrier, 0 pause output

     0 output buffer failures, 0 output buffers swapped out

-------------------------------------------------------------------------------------------------------------------------

Port-channel 8 info

U12CoreRed#show int port-channel 8

Port-channel8 is up, line protocol is up (connected)

  Hardware is EtherChannel, address is c8f9.f9e3.118f (bia c8f9.f9e3.118f)

  Description: UBS FabB

  MTU 1500 bytes, BW 2000000 Kbit/sec, DLY 10 usec,

     reliability 255/255, txload 1/255, rxload 1/255

  Encapsulation ARPA, loopback not set

  Keepalive set (10 sec)

  Full-duplex, 1000Mb/s, link type is auto, media type is unknown

  input flow-control is off, output flow-control is unsupported

  Members in this channel: Gi1/0/16 Gi3/0/15

  ARP type: ARPA, ARP Timeout 04:00:00

  Last input never, output 00:00:00, output hang never

  Last clearing of "show interface" counters never

  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0

  Queueing strategy: fifo

  Output queue: 0/40 (size/max)

  5 minute input rate 0 bits/sec, 0 packets/sec

  5 minute output rate 4000 bits/sec, 6 packets/sec

     212497 packets input, 65688627 bytes, 0 no buffer

     Received 182191 broadcasts (173635 multicasts)

     0 runts, 0 giants, 0 throttles

     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored

     0 watchdog, 173635 multicast, 0 pause input

     0 input packets with dribble condition detected

     9493059 packets output, 930235990 bytes, 0 underruns

     0 output errors, 0 collisions, 4 interface resets

     0 unknown protocol drops

     0 babbles, 0 late collision, 0 deferred

     0 lost carrier, 0 no carrier, 0 pause output

     0 output buffer failures, 0 output buffers swapped out

U12CoreRed#show int g1/0/16

GigabitEthernet1/0/16 is up, line protocol is up (connected)

  Hardware is Gigabit Ethernet, address is 0022.916c.7310 (bia 0022.916c.7310)

  MTU 1500 bytes, BW 1000000 Kbit/sec, DLY 10 usec,

     reliability 255/255, txload 1/255, rxload 1/255

  Encapsulation ARPA, loopback not set

  Keepalive set (10 sec)

  Full-duplex, 1000Mb/s, media type is 10/100/1000BaseTX

  input flow-control is off, output flow-control is unsupported

  ARP type: ARPA, ARP Timeout 04:00:00

  Last input 00:00:13, output 00:00:01, output hang never

  Last clearing of "show interface" counters never

  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0

  Queueing strategy: fifo

  Output queue: 0/40 (size/max)

  5 minute input rate 0 bits/sec, 0 packets/sec

  5 minute output rate 2000 bits/sec, 3 packets/sec

     117044 packets input, 32002148 bytes, 0 no buffer

     Received 95561 broadcasts (87266 multicasts)

     0 runts, 0 giants, 0 throttles

     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored

     0 watchdog, 87266 multicast, 0 pause input

     0 input packets with dribble condition detected

     5234624 packets output, 480601569 bytes, 0 underruns

     0 output errors, 0 collisions, 4 interface resets

     54421 unknown protocol drops

     0 babbles, 0 late collision, 0 deferred

     0 lost carrier, 0 no carrier, 0 pause output

     0 output buffer failures, 0 output buffers swapped out

U12CoreRed#show int g3/0/15

GigabitEthernet3/0/15 is up, line protocol is up (connected)

  Hardware is Gigabit Ethernet, address is c8f9.f9e3.118f (bia c8f9.f9e3.118f)

  MTU 1500 bytes, BW 1000000 Kbit/sec, DLY 10 usec,

     reliability 255/255, txload 1/255, rxload 1/255

  Encapsulation ARPA, loopback not set

  Keepalive set (10 sec)

  Full-duplex, 1000Mb/s, media type is 10/100/1000BaseTX

  input flow-control is off, output flow-control is unsupported

  ARP type: ARPA, ARP Timeout 04:00:00

  Last input 00:00:20, output 00:00:21, output hang never

  Last clearing of "show interface" counters never

  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0

  Queueing strategy: fifo

  Output queue: 0/40 (size/max)

  5 minute input rate 0 bits/sec, 0 packets/sec

  5 minute output rate 1000 bits/sec, 1 packets/sec

     95525 packets input, 33704654 bytes, 0 no buffer

     Received 86702 broadcasts (86441 multicasts)

     0 runts, 0 giants, 0 throttles

     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored

     0 watchdog, 86441 multicast, 0 pause input

     0 input packets with dribble condition detected

     4262744 packets output, 450231499 bytes, 0 underruns

     0 output errors, 0 collisions, 4 interface resets

     0 unknown protocol drops

     0 babbles, 0 late collision, 0 deferred

     0 lost carrier, 0 no carrier, 0 pause output

     0 output buffer failures, 0 output buffers swapped out

I need the "show run int x/y" for the interfaces, not the "show int x/y"

Robert

Oops.

Please find the running config for the interfaces below. Thanks again.

U12CoreRed#show run interface g1/0/15
Building configuration...

Current configuration : 191 bytes
!
interface GigabitEthernet1/0/15
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 2,5,200
 switchport mode trunk
 channel-protocol lacp
 channel-group 7 mode active
end

U12CoreRed#show run interface g2/0/18
Building configuration...

Current configuration : 191 bytes
!
interface GigabitEthernet2/0/18
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 2,5,200
 switchport mode trunk
 channel-protocol lacp
 channel-group 7 mode active
end

U12CoreRed#show run interface g1/0/16
Building configuration...

Current configuration : 191 bytes
!
interface GigabitEthernet1/0/16
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 2,5,200
 switchport mode trunk
 channel-protocol lacp
 channel-group 8 mode active
end

U12CoreRed#show run interface g3/0/15
Building configuration...

Current configuration : 191 bytes
!
interface GigabitEthernet3/0/15
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 2,5,200
 switchport mode trunk
 channel-protocol lacp
 channel-group 8 mode active
end
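As a sanity check on the two ends of the uplink, the allowed-VLAN lists from the fabric interconnect and from the 3750 member ports can be compared. This is a minimal Python sketch; `parse_vlans` is a hypothetical helper, and the two input strings are copied from the `show interface trunk` output and the `switchport trunk allowed vlan` lines in this thread.

```python
def parse_vlans(spec):
    """Expand a trunk VLAN list such as "1-2,5,200" into a set of ints."""
    vlans = set()
    for part in spec.split(","):
        part = part.strip()
        if not part or part == "none":
            continue
        if "-" in part:
            low, high = part.split("-")
            vlans.update(range(int(low), int(high) + 1))
        else:
            vlans.add(int(part))
    return vlans


# Values copied from the outputs in this thread.
fi_po7 = parse_vlans("1-2,5,200")    # FI side: Po7 in "show interface trunk"
sw_members = parse_vlans("2,5,200")  # 3750 side: member-port allowed list

only_on_fi = sorted(fi_po7 - sw_members)
only_on_3750 = sorted(sw_members - fi_po7)

if __name__ == "__main__":
    print("allowed on FI only:", only_on_fi)
    print("allowed on 3750 only:", only_on_3750)
```

With these inputs the only difference is VLAN 1, which the FI lists on Po7 and the 3750 member ports do not. Nothing in the thread establishes whether VLAN 1 needs to cross these uplinks, so treat this as something to verify rather than a diagnosis.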

Enable Port Fast on the 3750 interfaces.  Not the main cause of your issue, but it's recommended when connected to UCS.  Should be something like "spanning-tree portfast edge trunk"

As for the connectivity issues, the only thing I can think of is that the blades' MAC addresses are aging out of the 3750 MAC table.

What is the OS of the blades having the issue?

How are the blades' OS interfaces configured? (Teamed together, separate interfaces, vSwitch, etc.?)

When you say they lose connectivity, does that mean from the blade you can't ping outwards or from the network you can't reach your blades?

Robert

I have enabled the portfast as recommended.

The MAC addresses aging out might well be the answer: the entry in the ARP table has cleared, and now that I try to ping the devices it shows as incomplete. Is there a way of stopping it "aging out"?

The end OS is ESXi. When the MAC address drops from the 3750, neither end can ping the other (i.e. LAN to blade and blade to LAN).

The OS config is pairs of virtual NICs (one from each fabric) going to a vSwitch, VLAN-tagged.

It's one of those annoying issues that just frustrates.

There's a config problem somewhere. The VM should always be able to ping outward: even if the "quiet VM" effect occurs and the MAC ages out, as soon as the VM initiates communication it will ARP for the gateway, which should rebuild the ARP table upstream.

Are the vSwitch uplinks configured with the default teaming option (Route based on virtual port ID)? Hopefully you're not trying to use IP hash on the vSwitch uplinks, which will not work with UCS.

Robert

There are no VMs on the servers yet. All three are configured with the default image, using the custom Cisco build for UCS servers (i.e. route based on virtual port ID, with the second NIC as a standby adapter).

A simple restart of the management network is "usually" enough to kick the link back to life. The odd thing is that server 1 (slot 1) almost NEVER has this problem (only once, and I believe it was misconfigured at the time), while the servers in slots 2 and 3 always get this issue overnight.

I totally agree that the symptoms point to a misconfiguration, but given that server 1 is OK and the others have profiles cloned from it, it's a real mystery.

Next time you get a host in this state, can you leave it like that? That would be the best time to troubleshoot this. From what I gather the UCS side sounds fine; I'd be sniffing around ESX/vSwitch for the config issue. If simply restarting the management service on the host gets things back online, that means UCS pinning and VLANs are all fine. This is a host-side issue.

Any reason why you're using one NIC as standby rather than both as active/active?

Robert

Hi Robert,

It was in this state the whole time during this. I think you might have hit on the answer: on the third server, rather than restarting the interface, I simply ran the test from the management interface, and this seemed to sort the problem immediately. I wonder if perhaps, as these are default installs, the servers are just too quiet and their MACs simply age out.

I am installing a base Windows server on all boxes, as frankly they chat non-stop. That might solve the issue, and in a live environment the problem will simply not be there (i.e. heartbeats etc.).

I'll update tomorrow when the servers have had a quiet night and see whether this cures the problem.

Thanks a lot for all your help so far.

Chris

That should fix your issue. UCS will not forward unknown unicast, so if a UCS blade/VM MAC address ages out on your 3750s, the outside world will not be able to reach it. In normal production operation, servers are usually chatty enough to keep the aging timers from expiring, so you'll likely only see this now, during the install, when there are few or no VMs sending or receiving. Another option is to increase the aging timers on the 3750.
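To make the timing argument concrete, here is a toy Python model of the aging behaviour described above. It is a sketch of the explanation given in this thread, not of UCS or Catalyst internals: the 300-second default is the usual Catalyst MAC aging time (an assumption for this switch), the 14400-second alternative is a hypothetical raised value chosen to match the 4-hour ARP timeout shown in the interface output, and the MAC address is made up. On Catalyst IOS the timer is normally changed with `mac address-table aging-time`; check the command reference for your release.

```python
class MacTable:
    """Toy MAC address table with a single aging timer (seconds)."""

    def __init__(self, aging_s=300):  # 300 s: assumed Catalyst default
        self.aging_s = aging_s
        self._last_seen = {}

    def learn(self, mac, now):
        """Record that a frame with this source MAC arrived at time `now`."""
        self._last_seen[mac] = now

    def knows(self, mac, now):
        """True if the entry exists and has not aged out by time `now`."""
        last = self._last_seen.get(mac)
        return last is not None and now - last < self.aging_s


def reaches_blade(table, mac, now):
    """Simplified rule from this thread: once the upstream switch no longer
    knows the MAC, the frame is unknown unicast and is assumed not to be
    forwarded to the blade."""
    return table.knows(mac, now)


if __name__ == "__main__":
    esxi_mgmt = "00:25:b5:00:00:01"  # made-up MAC for illustration
    for aging in (300, 14400):
        table = MacTable(aging_s=aging)
        table.learn(esxi_mgmt, now=0)  # the host's last transmitted frame
        for idle in (60, 600, 3600):
            ok = reaches_blade(table, esxi_mgmt, now=idle)
            print(f"aging={aging}s idle={idle}s reachable={ok}")
```

In this model a host that stays silent longer than the aging time becomes unreachable from outside until it transmits again, which is why either a chattier host or a longer aging timer hides the symptom.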

Let me know how it goes tomorrow.

Regards,

Robert

Well, this didn't entirely fix the problem: the management MACs for the same two servers dropped from the MAC table again. However, the virtual servers I added remained pingable, so I take that as a minor victory.

I have now added the ESX hosts to the cluster; I believe the HA services will be more than enough to keep the MACs from aging out.

I will close this down tomorrow assuming this has now resolved the issue.

Chris
