
XRd on CML 2.9

ricketds
Level 1

Hello,

I am having a rough day trying to get XRd working in CML 2.9. I can run the Docker image fine on CML using the launch-xrd script.
Comparing the script to my node YAML config, everything appears to be the same, so I am stuck at this point.
I get one of the following two errors from CML when booting the node. Which one appears seems random, but I assume they are related.
xrd-0: Failed to start (Unable to start node: Unexpected response from put/http://[::1]:8090/api/v0/containers/fe41076f-623e-467e-b558-e13a1910fc4c/start: {"code":30,"message":"no such process"})
xrd-0: Failed to start (Unable to start node: Unexpected response from put/http://[::1]:8090/api/v0/containers/fe41076f-623e-467e-b558-e13a1910fc4c/start: {"code":30,"message":"numerical result out of range"})

Debug logging in cockpit:
ERROR: grpc_server:43:LLD call failed (2): Unexpected response from put/http://[::1]:8090/api/v0/containers/fe41076f-623e-467e-b558-e13a1910fc4c/start: {"code":30,"message":"failed to Statfs \"/proc/130224/ns/net\": no such file or directory"}
virl2-lowlevel-driver.sh
ERROR start create network device id: fe41076f-623e-467e-b558-e13a1910fc4c device: "MgmtEth0/0/CPU0/0" err: failed to Statfs "/proc/130224/ns/net": no such file or directory group: "start"
docker-shim
ERROR link delete id: fe41076f-623e-467e-b558-e13a1910fc4c device: "MgmtEth0/0/CPU0/0" err: no such device group: "start"
docker-shim
ERROR cleanup due to id: fe41076f-623e-467e-b558-e13a1910fc4c device: "MgmtEth0/0/CPU0/0" err: failed to Statfs "/proc/130224/ns/net": no such file or directory group: "start"
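The `failed to Statfs "/proc/<pid>/ns/net"` message suggests that the process whose network namespace was being looked up no longer existed by the time the interface was created. As a rough sketch (not part of CML; the helper names `pid_from_error` and `netns_exists` are made up for illustration), the PID can be pulled out of such a log line and checked on the host:

```python
import os
import re
from typing import Optional

# Matches the namespace path quoted in the docker-shim error,
# with or without backslash-escaped quotes around it.
STATFS_RE = re.compile(r"/proc/(\d+)/ns/net")


def pid_from_error(line: str) -> Optional[int]:
    """Return the PID embedded in a 'failed to Statfs' error line, if any."""
    match = STATFS_RE.search(line)
    return int(match.group(1)) if match else None


def netns_exists(pid: int) -> bool:
    """True if the process still has a network namespace entry under /proc."""
    return os.path.exists(f"/proc/{pid}/ns/net")


line = (
    'ERROR start create network device id: fe41076f-623e-467e-b558-e13a1910fc4c '
    'device: "MgmtEth0/0/CPU0/0" err: failed to Statfs "/proc/130224/ns/net": '
    'no such file or directory group: "start"'
)
pid = pid_from_error(line)
if pid is not None:
    state = "still present" if netns_exists(pid) else "gone"
    print(f"PID {pid}: network namespace is {state}")
```

If the namespace is already gone when checked right after a failed start, that would be consistent with the container process exiting almost immediately, though it does not say why it exited.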

Node YAML attached. 

I have experimented with the driver: key, but it doesn't seem to matter which simulation driver I set; the error is the same. I have tried server, iosxrv, and unmanaged-switch. I have also tried changing XR_MGMT_INTERFACES from eth0 to match my interface (ens18), with no luck; since the launch_xrd script also uses eth0, I moved it back.

I'm looking for any assistance, or to hear from anyone who has gotten XRd working in CML. I am not finding any information beyond the recent how-to, "Create your own Docker container for CML 2.9", but this case is more of an upgrade of an existing node. The XRd tutorial documents are helpful, but they do not cover the CML side.

 

4 Replies

Martin L
VIP

Try the Reddit forum.

Torbjørn
VIP

I suspect this issue is caused by the container exiting immediately upon start. Can you try running it with the following config?

{
  "docker": {
    "image": "ios-xr/xrd-control-plane:24.4.2",
    "privileged": true
  }
}

This way we can eliminate as many variables as possible. As long as the container starts privileged you should not need the caps, devices, security_opts or volume definitions - and we can verify whether we are able to get the container running without the management interface. I don't think the busybox key needs to be defined, but try adding it again if you get a new error message.
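A minimal sketch of that trimming step, assuming the node's config is available as JSON shaped like the snippet above. The function name `minimal_config` and the extra keys in the example (`caps`, `devices`) are illustrative assumptions taken from the wording of this reply, not from CML documentation:

```python
import json

# Only the image reference is carried over from the "docker" object;
# every other key there (capabilities, devices, volumes, ...) is dropped.
KEEP_DOCKER_KEYS = {"image"}


def minimal_config(full: dict) -> dict:
    """Reduce a container config to image + privileged for a first test."""
    docker = full.get("docker", {})
    trimmed = {k: v for k, v in docker.items() if k in KEEP_DOCKER_KEYS}
    trimmed["privileged"] = True
    result = {"docker": trimmed}
    # Keep a top-level busybox key if the original had one.
    if "busybox" in full:
        result["busybox"] = full["busybox"]
    return result


example = {
    "docker": {
        "image": "ios-xr/xrd-control-plane:24.4.2",
        "caps": ["NET_ADMIN"],   # hypothetical value, for illustration only
        "devices": ["/dev/net/tun"],  # hypothetical value, for illustration only
    },
}
print(json.dumps(minimal_config(example), indent=2))
```

Starting from the smallest config and re-adding keys one at a time makes it easier to see which setting, if any, changes the error.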

Please report back whether you get this working or not! I suspect that you are not the only person who will be trying to get this working.

Happy to help! Please mark as helpful/solution if applicable.
Get in touch: https://torbjorn.dev

Thanks for the reply,

The busybox key is required; without it the node errors, which is why I added it originally. I tried both true and false, and it didn't matter.
I removed all the extras, and it still throws the same errors. I verified that the container pops up for a moment and then closes.

Here is just the config.json section that I updated:

{
  "docker": {
    "image": "ios-xr/xrd-control-plane:24.4.2",
    "privileged": true
  },
  "busybox": true
}

I tried adjusting the node configuration further, to no avail. I changed the number of serial ports and interfaces and tried different Property Inheritance settings; all resulted in the same error. Tonight, however, I am only getting one of the two errors back, and the debug output in Cockpit is still the same.

I also tried changing 'interface 0' in the node configuration, since that is the one labeled "MgmtEth0/0/CPU0/0" and that seems to be where the problem starts (at least in my simple brain). Below is the debug output from where it starts the container through to the eventual error it throws to CML.

ERROR: grpc_server:43:LLD call failed (2): Unexpected response from put/http://[::1]:8090/api/v0/containers/fe41076f-623e-467e-b558-e13a1910fc4c/start: {"code":30,"message":"numerical result out of range"}
virl2-lowlevel-driver.sh

ERROR start create network device id: fe41076f-623e-467e-b558-e13a1910fc4c device: "MgmtEth0/0/CPU0/0" err: numerical result out of range group: "start"
docker-shim

ERROR link delete id: fe41076f-623e-467e-b558-e13a1910fc4c device: "MgmtEth0/0/CPU0/0" err: no such device group: "start"
docker-shim

ERROR cleanup due to id: fe41076f-623e-467e-b558-e13a1910fc4c device: "MgmtEth0/0/CPU0/0" err: numerical result out of range group: "start"
docker-shim

Started docker-a05913303ee1a7cf9166315e6cdfa146f5d0e1bb32194f5598b681446e602ade.scope - libcontainer container a05913303ee1a7cf9166315e6cdfa146f5d0e1bb32194f5598b681446e602ade.
systemd
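The systemd line above carries the full Docker container ID, which can be used to ask Docker why the container stopped. The following is a sketch under two assumptions: that the plain `docker` CLI on the CML host talks to the same daemon CML uses, and that the container has not already been removed after the failed start. The helper names are made up for illustration.

```python
import re
import subprocess
from typing import List, Optional

# systemd names the unit "docker-<container id>.scope".
SCOPE_RE = re.compile(r"docker-([0-9a-f]+)\.scope")


def container_id_from_scope(line: str) -> Optional[str]:
    """Extract the container ID from a 'Started docker-<id>.scope' line."""
    match = SCOPE_RE.search(line)
    return match.group(1) if match else None


def inspect_cmd(cid: str) -> List[str]:
    """docker inspect call that prints the state, exit code, and error text."""
    fmt = "{{.State.Status}} exit={{.State.ExitCode}} {{.State.Error}}"
    return ["docker", "inspect", "--format", fmt, cid]


def logs_cmd(cid: str, tail: int = 50) -> List[str]:
    """docker logs call limited to the last `tail` lines."""
    return ["docker", "logs", "--tail", str(tail), cid]


def run(cmd: List[str]) -> str:
    """Run a command and return stdout and stderr combined (not called here)."""
    done = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return done.stdout + done.stderr


scope_line = (
    "Started docker-a05913303ee1a7cf9166315e6cdfa146f5d0e1bb32194f5598b681446e602ade.scope"
    " - libcontainer container"
)
cid = container_id_from_scope(scope_line)
if cid:
    for cmd in (inspect_cmd(cid), logs_cmd(cid)):
        print(" ".join(cmd))
```

On the host, passing either command list to `run(...)` would return whatever Docker reports; a non-zero exit code or early log output from the XRd container could narrow down why it closes right after starting.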

Anything other than "MgmtEth0/0/CPU0/0" for Interface 0 throws an invalid-name error:
xrd-0: Failed to validate and define (Interface #0 label MgmtEth0/0/CPU0/0 is not valid for node definiton xrd-docker: defined interface label is Mg0/PR0/CPU0/0).
I tried GigE0/0/0/0 and Mg0/PR0/CPU0/0, as well as some local interfaces, just for the sake of trying.

I agree; I think the issue is related to the Mgmt interface specification. Your best bet is probably to look at where it goes wrong in virl2-lowlevel-driver.sh. Unfortunately, I don't have any "bandwidth" to spare to look into this further for the time being.
