07-24-2012 03:32 AM - edited 03-01-2019 10:31 AM
Hi all.
I have a very strange situation. A new UCS running 2.0(3a) recently arrived at our site.
After about a week of running without problems, fabric interconnect B went down (this has happened twice).
If I run:
porfic03-B# show cluster extended-state
Start time: Thu Jul 12 17:38:02 2012
Last election time: Thu Jul 12 18:29:59 2012
B: UP, PRIMARY
A: UP, SUBORDINATE
B: memb state UP, lead state PRIMARY, mgmt services state: UP
A: memb state UP, lead state SUBORDINATE, mgmt services state: UP
heartbeat state PRIMARY_OK
INTERNAL NETWORK INTERFACES:
eth1, DOWN
eth2, UP
HA READY
On porfic03-A I get:
porfic03-A# show cluster extended-state
Start time: Thu Jul 12 18:29:51 2012
Last election time: Thu Jul 12 18:29:54 2012
A: UP, SUBORDINATE
B: UP, PRIMARY
A: memb state UP, lead state SUBORDINATE, mgmt services state: UP
B: memb state UP, lead state PRIMARY, mgmt services state: UP
heartbeat state PRIMARY_OK
INTERNAL NETWORK INTERFACES:
eth1, UP
eth2, UP
HA READY
So, as you can see, everything looks fine inside UCS. From outside UCS, however, I cannot ping porfic03-B or the cluster virtual IP (because it is attached to porfic03-B, which is the primary node).
Physically, I see that the management network port on porfic03-B has link but no activity.
Can anyone point me in the right direction to solve this issue?
Rebooting porfic03-B solves the problem, but it comes back after about a week.
Any ideas?
Regards
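The two outputs above differ in one place: FI B lists eth1 as DOWN under INTERNAL NETWORK INTERFACES, while FI A lists both interfaces as UP. Below is a minimal Python sketch for spotting that kind of difference, assuming each output has been saved as text. The function names are illustrative and not part of any Cisco tool, and nothing here establishes that the eth1 state is related to the management port problem.

```python
import re

def parse_internal_interfaces(output: str) -> dict[str, str]:
    """Return {interface: state} from the INTERNAL NETWORK INTERFACES block."""
    states = {}
    in_block = False
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("INTERNAL NETWORK INTERFACES"):
            in_block = True
            continue
        if in_block:
            match = re.fullmatch(r"(eth\d+),\s*(UP|DOWN)", line)
            if match:
                states[match.group(1)] = match.group(2)
            else:
                # First line that is not an "ethN, STATE" entry ends the block.
                in_block = False
    return states

def down_interfaces(output: str) -> list[str]:
    """Names of internal interfaces reported as DOWN, sorted."""
    states = parse_internal_interfaces(output)
    return sorted(name for name, state in states.items() if state == "DOWN")

# Relevant tail of the two outputs quoted in the post.
FI_B = """heartbeat state PRIMARY_OK
INTERNAL NETWORK INTERFACES:
eth1, DOWN
eth2, UP
HA READY"""

FI_A = """heartbeat state PRIMARY_OK
INTERNAL NETWORK INTERFACES:
eth1, UP
eth2, UP
HA READY"""

print("FI B down:", down_interfaces(FI_B))
print("FI A down:", down_interfaces(FI_A))
```

Comparing the parsed states rather than the raw text keeps the check independent of the differing timestamps and node ordering in the two outputs.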
07-24-2012 05:07 AM
Hello,
Please provide the following information from FI B:
scope monitoring
scope sysdebug
show cores detail
connect nxos b
show version
show system reset-reason
show int mgmt0
------------------
Regarding network connectivity for the FI B mgmt interface, start by verifying the cabling and the upstream switch port configuration.
Padma
07-24-2012 05:54 AM
I have already checked the FI B mgmt interface cabling and the upstream switch port (no errors); the port is up on the switch. I also swapped the cable from the mgmt port of FI A to the mgmt port of FI B, and the port still showed no activity.
The output of:
porfic03-B# scope monitoring
porfic03-B /monitoring #
porfic03-B# scope sysdebug
^
% Invalid Command at '^' marker
porfic03-B# show cores detail
^
% Invalid Command at '^' marker
connect nxos b
show version:
Software
BIOS: version 3.5.0
loader: version N/A
kickstart: version 5.0(3)N2(2.03a)
system: version 5.0(3)N2(2.03a)
power-seq: Module 1: version v1.0
Module 3: version v2.0
uC: version v1.2.0.1
SFP uC: Module 1: v1.0.0.0
BIOS compile time: 02/03/2011
kickstart image file is: bootflash:/installables/switch/ucs-6100-k9-kickstart.5.0.3.N2.2.03a.bin
kickstart compile time: 6/19/2012 7:00:00 [06/19/2012 15:21:08]
system image file is: bootflash:/installables/switch/ucs-6100-k9-system.5.0.3.N2.2.03a.bin
system compile time: 6/19/2012 7:00:00 [06/19/2012 17:04:19]
Hardware
cisco UCS 6248 Series Fabric Interconnect ("O2 32X10GE/Modular Universal Platform Supervisor")
Intel(R) Xeon(R) CPU with 16622556 kB of memory.
Processor Board ID FOC161117SU
Device name: porfic03-B
bootflash: 31266648 kB
Kernel uptime is 11 day(s), 20 hour(s), 13 minute(s), 4 second(s)
Last reset
Reason: Unknown
System version: 5.0(3)N2(2.03a)
Service:
plugin
Core Plugin, Ethernet Plugin, Fc Plugin, Virtualization Plugin
show system reset-reason:
----- reset reason for Supervisor-module 1 (from Supervisor in slot 1) ---
1) No time
Reason: Unknown
Service:
Version: 5.0(3)N2(2.03a)
2) At 462964 usecs after Wed Jul 4 16:04:11 2012
Reason: Reset Requested by CLI command reload
Service:
Version: 5.0(3)N2(2.03a)
3) At 493083 usecs after Wed Jul 4 10:40:51 2012
Reason: Reset Requested by CLI command reload
Service:
Version: 5.0(3)N2(2.03a)
4) At 902919 usecs after Tue Jul 3 15:29:48 2012
Reason: Reset Requested by CLI command reload
Service:
Version: 5.0(3)N2(2.02q)
show int mgmt0:
mgmt0 is down (Administratively down)
Hardware: GigabitEthernet, address: 547f.ee8b.c060 (bia 547f.ee8b.c060)
Internet Address is xxx.xx.xx.xx/24
MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,
reliability 64808/255, txload 1/255, rxload 1/255
Encapsulation ARPA
auto-duplex, 1000 Mb/s
EtherType is 0x0000
1 minute input rate 0 bits/sec, 0 packets/sec
1 minute output rate 0 bits/sec, 0 packets/sec
Rx
472140 input packets 0 unicast packets 472140 multicast packets
0 broadcast packets 39500853 bytes
Tx
0 output packets 0 unicast packets 0 multicast packets
0 broadcast packets 0 bytes
Thanks for your reply.
Regards
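The `show int mgmt0` counters above show 472140 input packets (all multicast) and 0 output packets. Below is a small Python sketch for pulling those two numbers out of saved output and flagging the "receives but never transmits" pattern. The function names are illustrative assumptions, and the flag is only a way to surface the counters, not a diagnosis.

```python
import re

def parse_packet_counters(output: str) -> dict[str, int]:
    """Extract total Rx/Tx packet counts from `show int mgmt0` text."""
    counters = {}
    for key, label in (("rx", "input"), ("tx", "output")):
        match = re.search(rf"(\d+)\s+{label} packets", output)
        if match:
            counters[key] = int(match.group(1))
    return counters

def receives_but_never_transmits(counters: dict[str, int]) -> bool:
    """True when packets have been received but none have been sent."""
    return counters.get("rx", 0) > 0 and counters.get("tx") == 0

# Counter section copied from the output quoted in the post.
MGMT0 = """Rx
472140 input packets 0 unicast packets 472140 multicast packets
0 broadcast packets 39500853 bytes
Tx
0 output packets 0 unicast packets 0 multicast packets
0 broadcast packets 0 bytes"""

counters = parse_packet_counters(MGMT0)
print(counters)
print("receives but never transmits:", receives_but_never_transmits(counters))
```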
07-24-2012 06:18 AM
Hello,
Can you please check whether there are any core dumps on the FI by running:
scope monitoring
scope sysdebug
show cores detail
The mgmt0 status being displayed as down is a known issue, so we cannot rely on it in this scenario.
Is the MAC address of the FI B mgmt interface learned on the upstream switch port?
Padma
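Checking whether the MAC is learned upstream usually means searching the switch's MAC address table for the mgmt0 address (547f.ee8b.c060 in the `show int mgmt0` output earlier in the thread). The upstream switch model and its output format are not given here, so the Python sketch below assumes only that the table has been saved as text and that MACs may appear in dotted, colon, or hyphen notation. The sample table lines and port names are hypothetical, not taken from the thread.

```python
import re

# Matches aa:bb:cc:dd:ee:ff, aa-bb-cc-dd-ee-ff, and aabb.ccdd.eeff forms.
MAC_RE = re.compile(
    r"(?:[0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2}"
    r"|(?:[0-9A-Fa-f]{4}\.){2}[0-9A-Fa-f]{4}"
)

def normalize_mac(mac: str) -> str:
    """Reduce a MAC in any common notation to 12 lowercase hex digits."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return digits

def lines_with_mac(table_text: str, mac: str) -> list[str]:
    """Return the table lines that contain the given MAC address."""
    wanted = normalize_mac(mac)
    hits = []
    for line in table_text.splitlines():
        if any(normalize_mac(found) == wanted for found in MAC_RE.findall(line)):
            hits.append(line.strip())
    return hits

# Hypothetical MAC table dump, for illustration only.
SAMPLE_TABLE = """
 10   54:7F:EE:8B:C0:60   dynamic   Gi1/0/12
 10   0011.2233.4455      dynamic   Gi1/0/13
"""

print(lines_with_mac(SAMPLE_TABLE, "547f.ee8b.c060"))
```

An empty result from `lines_with_mac` would correspond to the switch not having learned the address on any port.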
07-24-2012 08:39 AM
Hi
No MAC address is learned on the upstream port.
porfic03-B /monitoring/sysdebug # scope monitoring
porfic03-B /monitoring # show cores detail
^
% Invalid Command at '^' marker
porfic03-B /monitoring # scope sysdebug
porfic03-B /monitoring/sysdebug # show cores detail
porfic03-B /monitoring/sysdebug # porfic03-B /monitoring/sysdebug # scope monitoring
porfic03-B /monitoring # show cores detail
^
% Invalid Command at '^' marker
porfic03-B /monitoring # scope sysdebug
porfic03-B /monitoring/sysdebug # show cores detail
porfic03-B /monitoring/sysdebug #
Thanks for the reply.
Regards
07-25-2012 04:05 AM
Hi all
When I run:
- porfic03-B(nxos)# show hardware internal cpu-mac mgmt stats
I get a lot of errors on the mgmt port. I will swap the module and check over the next few days whether the problem is solved.
Regards