Check heartbeat from HAL process

andre.godoy
Level 1

Hey Guys,

I'm hitting bug CSCvo45816, which affects switches that have been up for a long time (607 days in my case). Once the timer (a 32-bit integer) wraps around, the HAL process sees this as a missed heartbeat and reloads the switch.
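
To illustrate why a wrapped 32-bit counter can look like a missed heartbeat, here is a minimal shell sketch of generic counter arithmetic. It is not HAL source code. The function names and the sample tick values are hypothetical, and it assumes a shell with 64-bit integer arithmetic (such as bash).

```shell
# Illustration only: generic 32-bit counter arithmetic, not HAL source code.
# Function names and tick values are hypothetical.

# Naive elapsed time: plain subtraction. The result is wrong once the
# counter has wrapped past 2^32 - 1 back to 0.
naive_elapsed() {
  local last=$1 now=$2
  echo $(( now - last ))
}

# Wrap-safe elapsed time: keep only the low 32 bits of the difference,
# which is the same as reducing it modulo 2^32.
wrapsafe_elapsed() {
  local last=$1 now=$2
  echo $(( (now - last) & 0xFFFFFFFF ))
}
```

With a last heartbeat at tick 4294967280 and a current tick of 16 (just after the wrap), `naive_elapsed 4294967280 16` prints -4294967264, while `wrapsafe_elapsed 4294967280 16` prints 32. A health check built on the naive form could misread the wrap as a stalled thread.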

 
The workaround is to run "vsh_lc -c 'debug platform internal hal infra thread heartbeat disable'" on all leaf switches.
 
Do you guys know how I can confirm if the thread heartbeat was disabled?
 
Thank you!
1 Accepted Solution


andre.godoy
Level 1

I recently received this answer:

 

On the leaf, with root privileges, run:

cat /var/sysmgr/tmp_log/csco_hai_sb255_svc12_wk0_nub.log | log_decode | tail

 

If the heartbeat is enabled, you will see "punched HB" entries logged regularly, as in the sample below. If it is disabled, you will see no output…

Jun 22 2019 CDT 19:20:16.495542[INF][TXT][sb:255-svc:12-wk:0] [HB] - Monitoring thread health

Jun 22 2019 CDT 19:20:16.495565[INF][TXT][sb:255-svc:12-wk:0] [HB] - thread sb:255-svc:7-wk:0 punched HB

Jun 22 2019 CDT 19:20:16.495577[INF][TXT][sb:255-svc:12-wk:0] [HB] - thread sb:255-svc:13-wk:0 punched HB
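
The check above can be wrapped in a small helper so it does not depend on eyeballing the output. This is a minimal sketch: `hb_status` is a hypothetical function name, not a Cisco tool, and it only looks for the "punched HB" string shown in the sample entries.

```shell
# hb_status: read decoded log text on stdin and report whether any
# "punched HB" entries are present. Hypothetical helper, not a Cisco tool.
hb_status() {
  local last
  # Keep only the most recent matching entry; empty if none matched.
  last=$(grep 'punched HB' | tail -n 1) || true
  if [ -n "$last" ]; then
    echo "punched HB entries present; most recent:"
    echo "$last"
  else
    echo "no punched HB entries"
  fi
}
```

On the leaf it could be fed from the same pipeline, for example `cat /var/sysmgr/tmp_log/csco_hai_sb255_svc12_wk0_nub.log | log_decode | tail | hb_status`. Since the helper prints the most recent matching entry, comparing its timestamp with the switch's current time may help confirm that the entry is fresh rather than left over from before the workaround was applied.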

