
Management interface on fabric interconnect losing connectivity sporadically

Jason Flory
Level 1

Hello Everyone

I keep getting kicked out of the UCS management interface. The logs just say that the management interface could not be reached on either fab A or fab B. I started some ping sessions directly to those IPs and, sure enough, one or the other drops occasionally.

 

When I SSH to fab A or B, it says both interfaces are administratively down, yet it shows the correct IP address and MAC address and shows data stats. Could this be my problem?

 

I did read that a faulty gateway can cause some of these issues, but the gateway is good.

 

Thanks

 

1 Accepted Solution

Walter Dey
VIP Alumni

The fact that mgt interface are down is a feature ? don't worry ?

Do you have duplicate IP addresses?

Are the FI management IP addresses, the VIP, and the KVM (out-of-band) addresses in the same VLAN and IP subnet? They have to be.


8 Replies

"The fact that mgt interface are down is a feature ? don't worry ?" Was that a question, or are you saying not to worry about it?

No duplicate IPs. Both management IPs and the VIP are on the same subnet, as is the ext-mgmt pool for KVM.

Ping FI-A, FI-B, and the VIP from the outside network. It would be interesting to know how long each ping outage lasts.
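To measure how long each address stays unreachable, something like the following could be left running on a workstation. This is only a rough sketch, not a Cisco tool: the function names (`ping_once`, `outage_windows`, `monitor`) are made up for illustration, the ping flags assume Linux iputils or Windows, and the addresses in the usage comment are placeholders that would need to be replaced with the real FI-A, FI-B, and VIP addresses.

```python
import platform
import subprocess
import time
from typing import Callable, Dict, Iterable, List, Tuple

Sample = Tuple[float, bool]   # (unix timestamp, was the probe answered?)
Outage = Tuple[float, float]  # (first missed probe, first answered probe afterwards)


def ping_once(host: str, timeout_s: int = 1) -> bool:
    """Send one ICMP echo via the system ping binary; True if it exited with 0."""
    if platform.system() == "Windows":
        cmd = ["ping", "-n", "1", "-w", str(timeout_s * 1000), host]
    else:
        # Assumption: Linux iputils flags (macOS treats -W as milliseconds).
        cmd = ["ping", "-c", "1", "-W", str(timeout_s), host]
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def outage_windows(samples: Iterable[Sample]) -> List[Outage]:
    """Collapse runs of failed probes into (start, end) windows.

    An outage still in progress at the last sample is closed at that
    sample's timestamp.
    """
    windows: List[Outage] = []
    start = None
    last_ts = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts                    # outage begins
        elif ok and start is not None:
            windows.append((start, ts))   # outage ends at first good reply
            start = None
        last_ts = ts
    if start is not None:
        windows.append((start, last_ts))
    return windows


def monitor(hosts: List[str], rounds: int, interval_s: float = 1.0,
            probe: Callable[[str], bool] = ping_once) -> Dict[str, List[Outage]]:
    """Probe every host once per round and return the outage windows per host."""
    history: Dict[str, List[Sample]] = {h: [] for h in hosts}
    for _ in range(rounds):
        now = time.time()
        for h in hosts:
            history[h].append((now, probe(h)))
        time.sleep(interval_s)
    return {h: outage_windows(s) for h, s in history.items()}


# Example (placeholder addresses, roughly one hour of 1-second probes):
# for host, windows in monitor(["192.0.2.11", "192.0.2.12", "192.0.2.10"], 3600).items():
#     for start, end in windows:
#         print(host, time.ctime(start), f"down for about {end - start:.0f}s")
```

Comparing the windows for FI-A, FI-B, and the VIP should show whether the drops hit one fabric at a time or all three addresses together.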

Regarding the mgmt interface down status: I seem to remember that this is not an NX-OS interface but a Linux interface, so the NX-OS command shows the wrong status.

The mgmt interface showing as down is done on purpose:

https://tools.cisco.com/bugsearch/bug/CSCta43727/?reffering_site=dumpcr

Ping the fabric interconnects from the upstream switch when the issue takes place, and see whether those pings fail to reach the FIs as well.

A screenshot of the error message you get when you are kicked out of UCSM would be helpful too.

 

-Kenny

Thanks, I will try that. However, this went from happening every day to not happening at all in the last week.

When this was happening frequently, I did have pings going to each FI from my workstation, and they did drop whenever the management interface went down. UCS would say it could not reach the management IP on either fab A or fab B.

I know you are saying to ping from the upstream switch, which I will try. Another thing I have noticed, and I am not sure whether it happens at the same time, is that periodically the FIs go through an election process and swap primary and subordinate status.

It would also be helpful to know whether any processes are crashing. Do you see any core files generated? Can you run "connect loc a|b" (do it for both a and b) and then "show pmon state"? Please run that command now, and again right after you recover access to UCSM (if it happens again).

 

-Kenny

What is your indication that an election process is running? A log file, or simply the fact that the primary is changing?

If this is happening, it means that the primary FI is crashing, and, as Kenny mentioned above, you should see a core dump file and also a change in the uptime in NX-OS.

The FI does not crash to the point where it has to reboot. Only the management IP seems to drop.

I will look for the crash dump files as suggested.  I have also opened a TAC case.
