04-21-2024 08:35 PM
Cisco Fabric Interconnect shows some pmon services as failed
==================================================
FI-B(local-mgmt)# show pmon state
SERVICE NAME STATE RETRY(MAX) EXITCODE SIGNAL CORE
------------ ----- ---------- -------- ------ ----
svc_sam_controller running 0(4) 0 0 no
svc_sam_dme running 1(4) 0 15 no
svc_sam_dcosAG failed 5(4) 0 15 no
svc_sam_bladeAG running 0(4) 0 0 no
svc_sam_portAG failed 5(4) 0 15 no
svc_sam_statsAG running 0(4) 0 0 no
svc_sam_hostagentAG running 0(4) 0 0 no
svc_sam_nicAG running 0(4) 0 0 no
svc_sam_licenseAG running 0(4) 0 0 no
svc_sam_extvmmAG failed 5(4) 0 15 no
httpd.sh running 0(4) 0 0 no
httpd_cimc.sh running 0(4) 0 0 no
svc_sam_sessionmgrAG failed 5(4) 0 6 yes
svc_sam_pamProxy running 0(4) 0 0 no
dhcpd running 0(4) 0 0 no
sam_core_mon running 0(4) 0 0 no
svc_sam_netSnmpAG running 0(4) 0 0 no
svc_sam_rsdAG failed 5(4) 0 15 no
svc_sam_svcmonAG running 0(4) 0 0 no
svc_sam_samcproxy failed 5(4) 0 11 yes
svc_sam_samcstatsproxy running 0(4) 0 0 no
mtuTune running 0(10) 0 0 no
=============================================
We have a cluster of two 6454 fabric interconnects running firmware 4.2(3e). It looks like the primary fabric interconnect is not accepting commands such as
cluster lead a
pmon stop
pmon start
Can anyone confirm what is causing this issue and what the fix would be? Right now both fabric interconnects are carrying traffic.
Note:
A TAC case is in progress.
04-23-2024 07:48 AM
Fixed-width output shown in a non-fixed-width font, with consecutive spaces reduced to one, breaks my brain.
The Cisco web tools don't do it any justice because they don't paste it properly.
I reformatted your output so I could see what was happening in that jumble:
SERVICE NAME            STATE    RETRY(MAX)  EXITCODE  SIGNAL  CORE
------------            -----    ----------  --------  ------  ----
svc_sam_controller      running  0(4)        0         0       no
svc_sam_dme             running  1(4)        0         15      no
svc_sam_dcosAG          failed   5(4)        0         15      no
svc_sam_bladeAG         running  0(4)        0         0       no
svc_sam_portAG          failed   5(4)        0         15      no
svc_sam_statsAG         running  0(4)        0         0       no
svc_sam_hostagentAG     running  0(4)        0         0       no
svc_sam_nicAG           running  0(4)        0         0       no
svc_sam_licenseAG       running  0(4)        0         0       no
svc_sam_extvmmAG        failed   5(4)        0         15      no
httpd.sh                running  0(4)        0         0       no
httpd_cimc.sh           running  0(4)        0         0       no
svc_sam_sessionmgrAG    failed   5(4)        0         6       yes
svc_sam_pamProxy        running  0(4)        0         0       no
dhcpd                   running  0(4)        0         0       no
sam_core_mon            running  0(4)        0         0       no
svc_sam_netSnmpAG       running  0(4)        0         0       no
svc_sam_rsdAG           failed   5(4)        0         15      no
svc_sam_svcmonAG        running  0(4)        0         0       no
svc_sam_samcproxy       failed   5(4)        0         11      yes
svc_sam_samcstatsproxy  running  0(4)        0         0       no
mtuTune                 running  0(10)       0         0       no
I doubt this is the older bug CSCwa58954.
More likely it is the newer bug:
CSCwf39250 :: samcproxy fails due to multiple failed SSH login attempts
which is fixed in UCSM 4.3(2b).
TAC should be able to clear the issue without rebooting the Fabric Interconnect.
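For anyone triaging similar output, here is a minimal Python sketch (not a Cisco tool) that pulls the failed services out of pasted show pmon state text, even when the spacing has been collapsed. The names parse_pmon_state and signal_name are hypothetical, and the signal labels assume pmon reports standard Linux signal numbers (6 = SIGABRT, 11 = SIGSEGV, 15 = SIGTERM), which this thread does not confirm.

```python
import re

# One data row: name, state, retry(max), exit code, signal, core.
# Header, separator, and prompt lines do not match and are skipped.
ROW = re.compile(r"^(\S+)\s+(\S+)\s+(\d+)\((\d+)\)\s+(\d+)\s+(\d+)\s+(yes|no)$")

# Assumption: standard Linux signal numbering applies on the FI.
SIGNAL_NAMES = {6: "SIGABRT", 11: "SIGSEGV", 15: "SIGTERM"}


def parse_pmon_state(text):
    """Return one dict per service row found in pasted `show pmon state` text."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue
        name, state, retry, max_retry, exitcode, sig, core = match.groups()
        rows.append({
            "name": name,
            "state": state,
            "retry": int(retry),
            "max_retry": int(max_retry),
            "exitcode": int(exitcode),
            "signal": int(sig),
            "core": core == "yes",
        })
    return rows


def signal_name(num):
    """Map a signal number to a label; unknown numbers are shown as-is."""
    if num == 0:
        return "none"
    return SIGNAL_NAMES.get(num, f"signal {num}")


# A few rows copied from the output in this thread.
SAMPLE = """\
SERVICE NAME STATE RETRY(MAX) EXITCODE SIGNAL CORE
------------ ----- ---------- -------- ------ ----
svc_sam_dme running 1(4) 0 15 no
svc_sam_dcosAG failed 5(4) 0 15 no
svc_sam_sessionmgrAG failed 5(4) 0 6 yes
svc_sam_samcproxy failed 5(4) 0 11 yes
mtuTune running 0(10) 0 0 no
"""

failed = [row for row in parse_pmon_state(SAMPLE) if row["state"] == "failed"]
for row in failed:
    print(
        f"{row['name']}: {signal_name(row['signal'])}, "
        f"retries {row['retry']}/{row['max_retry']}, core={row['core']}"
    )
```

Run against the full output above, this makes it easier to see that svc_sam_sessionmgrAG (signal 6) and svc_sam_samcproxy (signal 11) are the two services that dumped core, while the other failed services show signal 15.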
04-21-2024 11:48 PM
- Possibly not a complete match but I noted it : https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwa58954
M.
04-23-2024 09:31 AM
04-23-2024 10:04 PM
The debug shell on FI-6454 is only available to TAC via a challenge-response mechanism.
04-24-2024 08:10 PM
I am happy to report that the issue with the UCS GUI has been resolved. I worked with TAC, and they confirmed that it was the same bug that Steven had identified. We followed the suggested steps to kill and restart the pmon process for user1&2 and clear the cores. After that, the UCS GUI was accessible again. Thank you for your patience and support.