Leap second: UCSM crash after NTP removal

Walter Dey
Advocate

A customer running 2.2.1b removed the NTP server from the configuration because of the approaching leap second.

UCS Manager then crashed. A check of the NX-OS uptime on both FIs showed no reboot, so it seems only UCSM was affected.

Q. Is this normal behaviour?

Q. Has this been seen before?

 

7 REPLIES 7

Keny Perez
Collaborator

Did you see any interruption of the services?

Do you see any crash or core when you connect to "local mgmt" and run "show pmon state"?

Any useful info from "show log log" and "show nvram" in NX-OS?

 

-Kenny

 

It seems the FIs didn't crash; I have no access to the systems. From that point of view, it would be a cosmetic bug only. No time to open a TAC case.

https://tools.cisco.com/bugsearch/bug/CSCus83447

doesn't seem to mention anything about this.

 

2015 Jun 29 16:19:48 ch01u201-A %UCSM-2-MANAGEMENT_SERVICES_FAILURE: [F0451][critical][management-services-failure][sys/mgmt-entity-B] Fabric Interconnect B, management services have failed

2015 Jun 29 16:20:12 ch01u201-A %UCSM-2-MANAGEMENT_SERVICES_FAILURE: [F0451][cleared][management-services-failure][sys/mgmt-entity-B] Fabric Interconnect B, management services have failed

2015 Jun 29 16:20:48 ch01u201-A %UCSM-2-MANAGEMENT_SERVICES_UNRESPONSIVE: [F0452][critical][management-services-unresponsive][sys/mgmt-entity-B] Fabric Interconnect B, management services are unresponsive

2015 Jun 29 16:21:08 ch01u201-A %UCSM-2-MANAGEMENT_SERVICES_UNRESPONSIVE: [F0452][cleared][management-services-unresponsive][sys/mgmt-entity-B] Fabric Interconnect B, management services are unresponsive
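To put a number on how long each fault was active, the raise and clear timestamps above can be paired up. This is only a rough sketch: the regular expression is inferred from the four lines quoted in this post, not from any documented UCSM syslog format, and `fault_windows` is a made-up helper name.

```python
import re
from datetime import datetime

# The four fault lines quoted above, copied verbatim.
LOG = """\
2015 Jun 29 16:19:48 ch01u201-A %UCSM-2-MANAGEMENT_SERVICES_FAILURE: [F0451][critical][management-services-failure][sys/mgmt-entity-B] Fabric Interconnect B, management services have failed
2015 Jun 29 16:20:12 ch01u201-A %UCSM-2-MANAGEMENT_SERVICES_FAILURE: [F0451][cleared][management-services-failure][sys/mgmt-entity-B] Fabric Interconnect B, management services have failed
2015 Jun 29 16:20:48 ch01u201-A %UCSM-2-MANAGEMENT_SERVICES_UNRESPONSIVE: [F0452][critical][management-services-unresponsive][sys/mgmt-entity-B] Fabric Interconnect B, management services are unresponsive
2015 Jun 29 16:21:08 ch01u201-A %UCSM-2-MANAGEMENT_SERVICES_UNRESPONSIVE: [F0452][cleared][management-services-unresponsive][sys/mgmt-entity-B] Fabric Interconnect B, management services are unresponsive
"""

# Assumed layout (inferred from the sample lines only):
# <timestamp> <host> %UCSM-<sev>-<NAME>: [<code>][<state>][<cause>][<dn>] <text>
LINE_RE = re.compile(
    r"^(?P<ts>\d{4} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) \S+ %UCSM-\d-\w+: "
    r"\[(?P<code>F\d+)\]\[(?P<state>\w+)\]\[[^\]]*\]\[(?P<dn>[^\]]+)\]"
)


def fault_windows(text):
    """Return (fault code, dn, seconds active) for each raise/clear pair."""
    open_faults = {}  # (code, dn) -> time the fault was raised
    windows = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a fault line in the assumed layout
        ts = datetime.strptime(m["ts"], "%Y %b %d %H:%M:%S")
        key = (m["code"], m["dn"])
        if m["state"] == "cleared":
            raised = open_faults.pop(key, None)
            if raised is not None:
                windows.append((*key, int((ts - raised).total_seconds())))
        else:
            # Any state other than "cleared" is treated as the raise event.
            open_faults[key] = ts
    return windows


for code, dn, seconds in fault_windows(LOG):
    print(f"{code} on {dn}: active for {seconds} s")
```

For the lines in this post, that gives 24 seconds for F0451 and 20 seconds for F0452, so both faults cleared on their own in under half a minute.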

 

show pmon state shows no crashes/cores, but on B there are two entries with signal 15:

ch01u201-B(local-mgmt)# show pmon state

 

SERVICE NAME             STATE     RETRY(MAX)    EXITCODE    SIGNAL    CORE

------------             -----     ----------    --------    ------    ----

svc_sam_controller     running           1(4)           0        15      no

svc_sam_dme            running           1(4)           0        15      no
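If there are many domains to check, the same "show pmon state" output can be screened automatically for services that were restarted or dumped core. A minimal sketch, assuming the six-column layout shown above; `restarted_services` is a hypothetical helper, and only the two rows from this post are used as input.

```python
import re

# Output pasted from the post above (blank lines removed).
PMON = """\
SERVICE NAME             STATE     RETRY(MAX)    EXITCODE    SIGNAL    CORE
------------             -----     ----------    --------    ------    ----
svc_sam_controller     running           1(4)           0        15      no
svc_sam_dme            running           1(4)           0        15      no
"""

# A data row is recognised by its RETRY(MAX) column, e.g. "1(4)".
RETRY_RE = re.compile(r"(\d+)\((\d+)\)")


def restarted_services(text):
    """Return rows whose retry count is non-zero or that left a core."""
    flagged = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 6:
            continue  # header line ("SERVICE NAME" splits into 7 fields)
        retry = RETRY_RE.fullmatch(parts[2])
        if retry is None:
            continue  # separator row of dashes
        name, state, _, exitcode, signal, core = parts
        retries = int(retry.group(1))
        if retries > 0 or core != "no":
            flagged.append({
                "service": name,
                "state": state,
                "retries": retries,
                "signal": int(signal),  # 15 is SIGTERM on Linux
                "core": core,
            })
    return flagged


for row in restarted_services(PMON):
    print(row)
```

Both svc_sam_controller and svc_sam_dme are flagged here: each was restarted once after a signal 15 and left no core, which fits management services being restarted rather than an FI reload.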

 

 

Hard to tell, Walter. You would need to check the sam_dme file in the UCSM tech support, but what you see there will be far too much to read through. You will need to know the specific time and then see whether the entries make sense given the behavior seen (hoping the logs have not rolled over).

 

-Kenny

ssumichrast
Beginner

Interesting. I was configuring our domains for UCS Central last week and was adjusting their NTP servers. When I changed the NTP servers, their time drifted and I lost connection to UCS-M. I could see a time drift potentially causing that, but it sure scared the crap out of me when I thought my FIs had reloaded. Luckily it was just UCS-M (I have to assume it's time sensitive for replication).

In my limited experience with Central, I have seen that Central is really sensitive to NTP (I want to stress that this is based on my very limited experience with UCSC).

 

-Kenny

Yes, which is why I was fixing our NTP. UCSM did not work with our round-robin DNS entry for NTP, so I was setting them to some IP addresses instead. Our domains were unable to join UCS Central because the time was just different enough. I'm talking less than 30 seconds.

 

When I removed the DNS entry and then supplied the IPs, UCSM saved the config and then crashed about 20 seconds later, probably because of a clock adjustment. I don't think UCSM likes sudden clock changes, which is understandable for the database synchronization.

Gotcha, thanks for the feedback!

 

-Kenny
