03-20-2018 04:22 PM - last edited on 03-25-2019 01:04 PM by ciscomoderator
Hi there,
I've been able to install/activate/start my LXC based application on IOx until recently. I was hoping that someone might point me in the direction of what to look at that could help me diagnose its issue. The app appears to start up and then, within a few seconds, it stops with an exit code of 1 - which of course could be anything.
If I use the "system troubleshoot" page of the Local Manager UI, the CAF log reports activation and startup of my app. There is a worrying message around SIGTERM not killing the process, but I'm unsure if that's the issue:
2018-03-20 22:15:35,055 [runtime.hosting:INFO] [CP Server Thread-10] [platformcapabilities.py:227 - check_disk_status()] Calling disk status script: /home/root/iox/caf/scripts/disk_status.sh
2018-03-20 22:15:35,082 [runtime.hosting:INFO] [CP Server Thread-10] [platformcapabilities.py:230 - check_disk_status()] Executed /home/root/iox/caf/scripts/disk_status.sh: returncode: 0, message:
2018-03-20 22:15:38,032 [runtime.hosting:INFO] [CP Server Thread-8] [controller.py:656 - start_app_async()] Starting app : loraserver
2018-03-20 22:15:38,038 [runtime.hosting:INFO] [Thread-12] [libvirtcontainer.py:144 - start()] Starting lxc container : LXC Container : Name: loraserver, UUID: 5e93f734-4cdf-4577-afae-104305acb7a2
2018-03-20 22:15:43,280 [command_wrapper:INFO] [Thread-12] [commandwrappers.py:941 - kill()] Sigterm did not kill the process 3020, so passing sigkill
2018-03-20 22:15:43,787 [pdservices:INFO] [Worker-1] [autoconfigcli.py:458 - container_event_cb()] Autoconfig network_event_cb. Event [started] ignored.
2018-03-20 22:15:50,291 [command_wrapper:INFO] [Thread-12] [commandwrappers.py:941 - kill()] Sigterm did not kill the process 3101, so passing sigkill
2018-03-20 22:15:54,243 [runtime.hosting:INFO] [Thread-12] [connectorwrapper.py:854 - setConnectorState()] Connector 'loraserver' state change: STOPPED-->RUNNING
2018-03-20 22:15:54,271 [runtime.hosting:INFO] [Thread-12] [connectorwrapper.py:777 - startConnector()] Connector 'loraserver' successfully started.
2018-03-20 22:15:54,320 [runtime.api.resources:INFO] [CP Server Thread-8] [connector.py:714 - on_post()] App: loraserver started
2018-03-20 22:15:55,824 [pdservices:INFO] [Worker-1] [autoconfigcli.py:458 - container_event_cb()] Autoconfig network_event_cb. Event [started] ignored.
...and then about a second later my program is stopped:
2018-03-20 22:16:13,389 [runtime.hosting:INFO] [MonitoringService] [connectorwrapper.py:854 - setConnectorState()] Connector 'loraserver' state change: RUNNING-->STOPPED
2018-03-20 22:16:13,391 [runtime.hosting:INFO] [MonitoringService] [connectorwrapper.py:813 - stopConnector()] Connector 'loraserver' successfully stopped.
2018-03-20 22:16:13,392 [runtime.hosting:INFO] [MonitoringService] [monitoring.py:332 - start_monitoring()] Exhausted the max restart attempts 5 for the app loraserver
2018-03-20 22:16:13,863 [pdservices:INFO] [Worker-1] [autoconfigcli.py:449 - container_event_cb()] Autoconfig network_event_cb. app_id:[loraserver] Event container stopped Payload:[[]]
2018-03-20 22:16:13,864 [pdservices:INFO] [Worker-1] [autoconfigcli.py:394 - delete()] AutoconfigcliService service. delete()
2018-03-20 22:16:13,864 [pdservices:INFO] [Worker-1] [autoconfigcli.py:404 - delete()] Autoconfig delete not done. Service not enabled
So why was the app attempting to restart? Where should I look next as none of the other logs are yielding anything obviously interesting.
Thanks for your guidance.
Kind regards,
Christopher
Solved! Go to Solution.
03-20-2018 10:58 PM
Hi Christopher,
You may also access the app console to see what happened for the LXC and the running software.
See the part of "How to access application console?" in https://developer.cisco.com/docs/iox/#troubleshooting-guide/troubleshooting-guide.
03-20-2018 04:29 PM
I should add that non of the app's logs yield anything interesting to me (container log):
2018-03-20 22:15:38.073+0000: starting up
PATH=/sbin:/usr/sbin:/bin:/usr/bin LIBVIRT_DEBUG=3 LIBVIRT_LOG_OUTPUTS=3:stderr /usr/lib64/libvirt/libvirt_lxc --name loraserver --console 22 --security=smack --handshake 25 --veth vnet1
PATH=/bin:/sbin TERM=linux container=lxc-libvirt HOME=/ container_uuid=5e93f734-4cdf-4577-afae-104305acb7a2 LIBVIRT_LXC_UUID=5e93f734-4cdf-4577-afae-104305acb7a2 LIBVIRT_LXC_NAME=loraserver startcontainer.sh
2018-03-20 22:15:38.137+0000: 1: info : libvirt version: 1.2.19
2018-03-20 22:15:38.137+0000: 1: warning : lxcContainerUnmountSubtree:617 : Failed to unmount '/.oldroot/var/volatile/run/libvirt/lxc/loraserver.devpts', trying to detach subtree '/.oldroot': Invalid argument
and watch dog:
Waiting to get the ip for interfaces: eth0
Waiting to get the ip for interfaces: eth0
Waiting to get the ip for interfaces: eth0
Waiting to get the ip for interfaces: eth0
Waiting to get the ip for interfaces: eth0
Got the ip address for interface eth0
All interfaces got the ips
APP START TIME:1521584153
App loraserver started with PID : 92
Monitoring this process now
App loraserver completed with exit code: 1
APP END TIME:1521584170
Time taken by App : 0 minutes and 17 seconds.
I do note that syslog reports a problem eth0, and there was some configuration done with the 829 yesterday (my app requires eth0 as per its package.yaml):
Mar 20 22:15:12 qemux86-64 dhclient: XMT: Solicit on eth0, interval 129960ms.
Mar 20 22:15:12 qemux86-64 dhclient: send_packet6: No such device
Mar 20 22:15:12 qemux86-64 dhclient: dhc6: send_packet6() sent -1 of 56 bytes
Mar 20 22:15:38 qemux86-64 kernel: device vnet0 entered promiscuous mode
Mar 20 22:15:38 qemux86-64 kernel: IPv6: ADDRCONF(NETDEV_UP): vnet0: link is not ready
Mar 20 22:15:38 qemux86-64 kernel: IPVS: Creating netns size=1376 id=10
Mar 20 22:15:38 qemux86-64 kernel: eth0: renamed from vnet1
Mar 20 22:15:52 qemux86-64 dhclient: Internet Systems Consortium DHCP Client 4.3.2
Mar 20 22:15:52 qemux86-64 dhclient: Copyright 2004-2015 Internet Systems Consortium.
Mar 20 22:15:52 qemux86-64 dhclient: All rights reserved.
Mar 20 22:15:52 qemux86-64 dhclient: For info, please visit https://www.isc.org/software/dhcp/
Mar 20 22:15:52 qemux86-64 dhclient:
Mar 20 22:15:52 qemux86-64 kernel: IPv6: ADDRCONF(NETDEV_CHANGE): vnet0: link becomes ready
Mar 20 22:15:52 qemux86-64 kernel: dpbr_0: port 3(vnet0) entered forwarding state
Mar 20 22:15:52 qemux86-64 kernel: dpbr_0: port 3(vnet0) entered forwarding state
Mar 20 22:15:52 qemux86-64 dhclient: Listening on LPF/eth0/52:54:dd:24:b6:c9
Mar 20 22:15:52 qemux86-64 dhclient: Sending on LPF/eth0/52:54:dd:24:b6:c9
Mar 20 22:15:52 qemux86-64 dhclient: Sending on Socket/fallback
Mar 20 22:15:52 qemux86-64 dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67
Mar 20 22:15:52 qemux86-64 dhclient: DHCPACK from 192.168.1.1
Mar 20 22:15:52 qemux86-64 dhclient: bound to 192.168.1.9 -- renewal in 37440 seconds.
Mar 20 22:15:53 qemux86-64 dhclient: Internet Systems Consortium DHCP Client 4.3.2
Mar 20 22:15:53 qemux86-64 dhclient: Copyright 2004-2015 Internet Systems Consortium.
Mar 20 22:15:53 qemux86-64 dhclient: All rights reserved.
Mar 20 22:15:53 qemux86-64 dhclient: For info, please visit https://www.isc.org/software/dhcp/
Mar 20 22:15:53 qemux86-64 dhclient:
Mar 20 22:15:54 qemux86-64 dhclient: Listening on Socket/eth0
Mar 20 22:15:54 qemux86-64 dhclient: Sending on Socket/eth0
Mar 20 22:15:54 qemux86-64 dhclient: XMT: Solicit on eth0, interval 1050ms.
Mar 20 22:15:55 qemux86-64 dhclient: XMT: Solicit on eth0, interval 2170ms.
Mar 20 22:15:57 qemux86-64 dhclient: XMT: Solicit on eth0, interval 4220ms.
Mar 20 22:16:01 qemux86-64 dhclient: XMT: Solicit on eth0, interval 8200ms.
Mar 20 22:16:10 qemux86-64 dhclient: XMT: Solicit on eth0, interval 17100ms.
Mar 20 22:16:11 qemux86-64 kernel: dpbr_0: port 3(vnet0) entered disabled state
Mar 20 22:16:11 qemux86-64 kernel: device vnet0 left promiscuous mode
Mar 20 22:16:11 qemux86-64 kernel: dpbr_0: port 3(vnet0) entered disabled state
Mar 20 22:16:11 qemux86-64 dhclient: receive_packet failed on eth0: Network is down
Mar 20 22:16:27 qemux86-64 dhclient: XMT: Solicit on eth0, interval 33590ms.
Mar 20 22:16:27 qemux86-64 dhclient: send_packet6: No such device
Mar 20 22:16:27 qemux86-64 dhclient: dhc6: send_packet6() sent -1 of 56 bytes
Thanks.
03-20-2018 10:58 PM
Hi Christopher,
You may also access the app console to see what happened for the LXC and the running software.
See the part of "How to access application console?" in https://developer.cisco.com/docs/iox/#troubleshooting-guide/troubleshooting-guide.
03-20-2018 11:01 PM
Thanks for the reply Steve. I should have mentioned that I tried that. Unfortunately, the app doesn't stay up long enough to be able to - even when I activate it with debug.
03-21-2018 01:08 AM
It turns out that I used `debug` as an argument to activation, and not `--debug`. I was able to login to the app after all, and discovered my issue. Thanks.
Discover and save your favorite ideas. Come back to expert answers, step-by-step guides, recent topics, and more.
New here? Get started with these tips. How to use Community New member guide