CSR HA Problem on Azure

Kamatamannnn
Level 1

Hello.

 

We built CSR HA on Azure, referring to https://www.cisco.com/c/en/us/support/docs/routers/cloud-services-router-1000v-series/213687-csr1000v-ha-redundancy-deployment-guide.html

Recently, however, errors have started to occur.

The following logs are being output continuously.

 

CSR's show logging

 

.Jun  3 2020 00:18:24.196 JST: SHELL-EXECUTION: Session timed out (command: "ha_api.py -c ping")
.Jun  3 2020 00:28:30.814 JST: %SYS-2-NOPROCESS: No such process 497 -Process= "Cloud HA", ipl= 0, pid= 86
-Traceback= 1#9ae4e6ddd6c2aa2a21997806924e75ee  :55CAF9B14000+33391E6 :55CAF9B14000+333EE04 :55CAF9B14000+3339649 :55CAF9B14000+33A4BE6 :55CAF9B14000+339EB11
*Jun 19 2020 01:51:32.739 JST: %IOSXE-3-PLATFORM: R0/0: kernel: Memory cgroup out of memory: Kill process 2806 (python) score 40 or sacrifice child
*Jun 19 2020 01:51:32.739 JST: %IOSXE-3-PLATFORM: R0/0: kernel: Killed process 2806 (python) total-vm:144844kB, anon-rss:19260kB, file-rss:1548kB

[guestshell@guestshell ~]$ cat azure/HA/azha.log

2020-06-03 08:55:02.068733 API: socket error [Errno 2] No such file or directory errno=2
2020-06-03 08:55:02.069618 node_event -i 0 -e revert  failed on attempt 1
2020-06-03 08:55:04.072413 API: socket error [Errno 2] No such file or directory errno=2
2020-06-03 08:55:04.073158 node_event -i 0 -e revert  failed on attempt 2
2020-06-03 08:55:06.075361 API: socket error [Errno 2] No such file or directory errno=2
2020-06-03 08:55:06.075450 node_event -i 0 -e revert  failed
2020-06-03 09:00:00.925982 API: socket error [Errno 2] No such file or directory errno=2
2020-06-03 09:00:00.927458 ping failed on attempt 1
2020-06-03 09:00:03.270577 API: socket error [Errno 2] No such file or directory errno=2
2020-06-03 09:00:03.322634 ping failed on attempt 2
2020-06-03 09:00:05.662322 API: socket error [Errno 2] No such file or directory errno=2
2020-06-03 09:00:05.663601 ping failed
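
For anyone triaging a similar azha.log, here is a minimal sketch that counts the operations that gave up after all retries. It assumes the line layout seen in the excerpt above (timestamp, operation, then a trailing "failed"); the function name `final_failures` and the regular expression are illustrative and not part of the Cisco HA package.

```python
import re
from collections import Counter

# Sample lines copied from the azha.log excerpt in this post.
SAMPLE = """\
2020-06-03 08:55:02.068733 API: socket error [Errno 2] No such file or directory errno=2
2020-06-03 08:55:02.069618 node_event -i 0 -e revert  failed on attempt 1
2020-06-03 08:55:06.075361 API: socket error [Errno 2] No such file or directory errno=2
2020-06-03 08:55:06.075450 node_event -i 0 -e revert  failed
2020-06-03 09:00:03.322634 ping failed on attempt 2
2020-06-03 09:00:05.663601 ping failed
"""

# Assumed layout: "<date> <time> <operation> failed" with nothing after "failed".
# Lines ending in "failed on attempt N" are retries and are deliberately skipped.
FINAL_FAILURE = re.compile(r"^\S+ \S+ (.+?)\s+failed$")


def final_failures(log_text):
    """Count operations that failed for good, keyed by operation text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FINAL_FAILURE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for operation, count in final_failures(SAMPLE).items():
        print(f"{operation}: {count} final failure(s)")
```

Running it against the full log file instead of `SAMPLE` would show whether every operation is failing or only some of them.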

Please tell me what is happening and what to do.

 

 

3 Replies

Francesco Molino
VIP Alumni
Hi

Can you run the following command:
show app-hosting utilization appid guestshell

Try restarting your guestshell to free memory.
See the following link:
https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvq90876/?rfs=iqvred

Thanks
Francesco
PS: Please don't forget to rate and select as validated answer if this answered your question

Thank you for your reply.

 

Here is the output of "show app-hosting utilization appid guestshell":

 

CSR#show app-hosting utilization appid guestshell
Application: guestshell
CPU Utilization:
  CPU Allocation: 800 units
  CPU Used:       0.06 %
Memory Utilization:
  Memory Allocation: 512 MB
  Memory Used:       512456 KB
Disk Utilization:
  Disk Allocation: 1 MB
  Disk Used:       0.00 MB

Since the value of "Memory Used" is larger than "Memory Allocation", is it correct that the bug is applicable?
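
As a quick check of that comparison, here is a small sketch that converts both figures to the same unit. The output does not say whether "MB" means 1024 KB or 1000 KB, so both conventions are computed; that ambiguity is an assumption on my side, not something the command documents.

```python
# Values taken from the "show app-hosting utilization" output above.
ALLOCATION_MB = 512
USED_KB = 512456


def usage_ratio(used_kb, allocation_mb, kb_per_mb):
    """Return used / allocated as a fraction for a given KB-per-MB convention."""
    return used_kb / (allocation_mb * kb_per_mb)


# Assumption: it is unknown which convention IOS-XE uses, so try both.
binary = usage_ratio(USED_KB, ALLOCATION_MB, 1024)   # 1 MB = 1024 KB
decimal = usage_ratio(USED_KB, ALLOCATION_MB, 1000)  # 1 MB = 1000 KB

print(f"1 MB = 1024 KB: {binary:.1%} of allocation")
print(f"1 MB = 1000 KB: {decimal:.1%} of allocation")
```

Under the 1024 KB convention the usage is just below the allocation, and under the 1000 KB convention it is just above. Either way the guestshell is at or near its memory limit.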

 

 

I can't be 100% sure, but yes, you match the bug criteria.
I would recommend opening a TAC case to validate. Restarting the guestshell will solve the problem, but it will come back.

Thanks
Francesco