Solved: Re: core-rtr01 and core-rtr02 are not reachable in my sandbox

henriklb · ‎06-30-2023

10.10.20.181 dev ens160 lladdr 52:54:00:0b:09:aa STALE
10.10.20.172 dev ens160 lladdr 52:54:00:1d:88:72 STALE
10.10.20.176 dev ens160 lladdr 52:54:00:1c:86:c9 STALE
10.10.20.254 dev ens160 lladdr 00:50:56:bf:41:f0 REACHABLE
10.10.20.171 dev ens160 lladdr 52:54:00:1c:ce:e6 STALE
10.10.20.174 dev ens160 FAILED
10.10.20.178 dev ens160 lladdr 52:54:00:0a:c5:80 STALE
10.10.20.173 dev ens160 FAILED
10.10.20.177 dev ens160 lladdr 52:54:00:0f:ef:c0 STALE
10.10.20.175 dev ens160 lladdr 52:54:00:1d:ae:b5 STALE

core-rtr01 (.173) and core-rtr02 (.174) are not reachable in my sandbox. I found a similar issue from a few years ago here: https://community.cisco.com/t5/devnet-sandbox/connectivity-issues-to-network-devices-in-nso-sandbox/td-p/4497195
Please advise.
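
The listing above has the shape of a Linux neighbor table (as printed by `ip neigh`), although the post does not say which command produced it. Under that assumption, here is a minimal Python sketch for picking out the addresses whose resolution failed. The function name `failed_neighbors` is made up for illustration.

```python
def failed_neighbors(neigh_output: str) -> list:
    """Return the IP addresses whose neighbor entry ends in FAILED.

    Assumes each line looks like '<ip> dev <ifname> [lladdr <mac>] <STATE>',
    which matches the listing quoted in this post.
    """
    failed = []
    for line in neigh_output.splitlines():
        fields = line.split()
        # The address is the first field and the state is the last one.
        if len(fields) >= 2 and fields[-1] == "FAILED":
            failed.append(fields[0])
    return failed


sample = """\
10.10.20.254 dev ens160 lladdr 00:50:56:bf:41:f0 REACHABLE
10.10.20.174 dev ens160 FAILED
10.10.20.173 dev ens160 FAILED
10.10.20.175 dev ens160 lladdr 52:54:00:1d:ae:b5 STALE
"""
print(failed_neighbors(sample))
```

On the sample lines, only 10.10.20.174 and 10.10.20.173 are returned, which are the two core routers.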

henriklb · ‎06-30-2023

I will tear this one down and try to spin up a new one.

bigevilbeard · ‎06-30-2023

I think you are using the NSO sandbox? The devices run in CML. If you can open the CML UI, you might be able to console into them or reload/restart them from there.

Please mark this as helpful or solution accepted to help others
Connect with me https://bigevilbeard.github.io

bigevilbeard · ‎06-30-2023

Also, I saw an issue reported in another thread: https://community.cisco.com/t5/devnet-sandbox/nso-sandbox-does-not-start-due-to-cml-license-issue/td-p/4864706

henriklb · ‎07-03-2023

Thanks for your prompt reply.

After tearing down the first NSO sandbox/lab and (seemingly) getting a new one, the same issue is present: core-rtr01 (.173) and core-rtr02 (.174) are still not reachable. I stopped and started the routers via CML (and also manually reloaded core-rtr01 from the console), but it did not help. I also tried "stop lab" followed by "start lab" in CML; core-rtr01 boots briefly before its status changes to "stopped".
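
One way to re-check reachability after each restart, instead of waiting on the neighbor table, is a plain TCP connect test. This is only a sketch. Port 22 (SSH) is an assumption about which management service the routers expose, and `tcp_reachable` is a hypothetical helper name.

```python
import socket


def tcp_reachable(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Port 22 is an assumption (SSH); adjust it to whatever service
    the node is expected to answer on.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


# Addresses taken from the thread; run the check from inside the sandbox network.
ROUTERS = {"core-rtr01": "10.10.20.173", "core-rtr02": "10.10.20.174"}


def report() -> None:
    """Print one line per router with the result of the connect test."""
    for name, host in ROUTERS.items():
        state = "reachable" if tcp_reachable(host) else "unreachable"
        print(f"{name} ({host}): {state}")
```

Calling `report()` from a host inside the sandbox prints the state of both routers. A False result does not distinguish between a node that is down and a management interface that has no address.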

Not impressed with the lab env so far

EDIT:
For clarity, "Learn NSO: The Easy Way" is the NSO lab guide that I am following.

henriklb · ‎07-03-2023

After attempting "wipe node" on both routers, they won't start again:

core-rtr01: Failed to choose a suitable compute node (Insufficient HW resources)	2023-07-03 10:40:12	warning
core-rtr02: Failed to choose a suitable compute node (Insufficient HW resources)	2023-07-03 10:40:06	warning

I guess I should tear down the entire lab once again from the DevNet sandbox, in the hope that I was just unlucky and got assigned a faulty lab env twice.

henriklb · ‎07-03-2023

After my second teardown, I am on my third fresh lab env.

My guess is that the sandbox is choking and cannot boot the IOS XR devices because the HW/VM resource limits are too strict.
core-rtr01 log output:

|0000059699|Cisco IOS XR console     will start on the 1st serial port
|0000059699|Cisco IOS XR aux console will start on the 2nd serial port
|0000059699|Cisco Calvados console   will start on the 3rd serial port
|0000059699|Cisco Calvados aux       will start on the 4th serial port
|0000639066|Telnet escape character is '^Q'.
|0000639067|Trying 127.0.0.1...
|0000639067|Connected to localhost.
|0000639067|Escape character is '^Q'.
|0000641190|init: Unable to create device: /dev/kmsg
|0000641291|mount: can't find /dev in /etc/fstab
|0000641291|mkdir: cannot create directory '/run': File exists
|0000674608|bootlogd: ioctl(/dev/pts/2, TIOCCONS): Device or resource busy
|0000675112|Running postinst /etc/rpm-postinsts/100-dnsmasq...
|0000675112|update-rc.d: /etc/init.d/run-postinsts exists during rc.d purge (continuing)
|0000675112| Removing any system startup links for run-postinsts ...
|0000675112|  /etc/rcS.d/S99run-postinsts
|0000675212|Configuring network interfaces... done.
|0000675313|Starting system message bus: dbus.
|0000675313|Starting OpenBSD Secure Shell server: sshd
|0000675313|  generating ssh RSA key...
|0000675413|  generating ssh ECDSA key...
|0000675413|  generating ssh DSA key...
|0000675413|  generating ssh ED25519 key...
|0000675413|sshd start/running, process 2494
|0000675413|Starting rpcbind daemon...done.
|0000675614|Starting random number generator daemon.
|0000675614|Starting system log daemon...0
|0000675614|Starting kernel log daemon...0
|0000675714|tftpd-hpa disabled in /etc/default/tftpd-hpa
|0000675714|Starting internet superserver: xinetd.
|0000675714|Libvirt not initialized for container instance
|0000675815|Starting crond: OK
|0000676719|SIOCSIFTXQLEN: No such device
|0000676719|SIOCSIFTXQLEN: No such device
|0000682767|ios con0/RP0/CPU0 is now available
|0000682868|Press RETURN to get started.
|0000682868|eth0: SIOCETHTOOL ioctl: Operation not supported.
|0000707633|0/RP0/ADMIN0:Jul  3 09:18:48.298 UTC: inst_agent[3165]: %INFRA-INSTAGENT-4-XR_PART_PREP_IMG : SDR/XR image baking in progress  

|0000707935|
|0000707935|
|0000707935|
|0000707935|
|0000707935|This product contains cryptographic features and is subject to United 
|0000707935|States and local country laws governing import, export, transfer and 
|0000707935|use. Delivery of Cisco cryptographic products does not imply third-party 
|0000707935|authority to import, export, distribute or use encryption. Importers, 
|0000707935|exporters, distributors and users are responsible for compliance with 
|0000707935|U.S. and local country laws. By using this product you agree to comply 
|0000707935|with applicable laws and regulations. If you are unable to comply with 
|0000707935|U.S. and local laws, return this product immediately. 
|0000707935|
|0000707935|A summary of U.S. laws governing Cisco cryptographic products may be 
|0000707935|found at:
|0000707935|http://www.cisco.com/wwl/export/crypto/tool/stqrg.html
|0000707935|
|0000707935|If you require further assistance please contact us by sending email to 
|0000707935|export@cisco.com.
|0000707935|
|0000707935|
|0000707935|
|0000820996|0/RP0/ADMIN0:Jul  3 09:20:41.676 UTC: inst_agent[3165]: %INFRA-INSTAGENT-4-XR_PART_PREP_RPM : SDR/XR additional RPM installation is in progress  

|0001043591|0/RP0/ADMIN0:Jul  3 09:24:24.279 UTC: inst_agent[3165]: %INFRA-INSTAGENT-4-XR_PART_PREP_RESP : SDR/XR partition preparation completed successfully  

|0001058010|0/RP0/ADMIN0:Jul  3 09:24:38.713 UTC: vm_manager[3195]: %INFRA-VM_MANAGER-4-INFO : Info: vm_manager started VM default-sdr--2  

|0001182767|RP/0/RP0/CPU0:Jul  3 09:26:43.455 UTC: cvac[69143]: %MGBL-CVAC-4-CONFIG_START : Configuration is being applied from file /etc/sysconfig//iosxr_config.txt. Recommend user waits until a CVAC status message is displayed before attempting to manually configure. 

|0001184294|LC/0/0/CPU0:Jul  3 09:26:44.917 UTC: ifmgr[201]: %PKT_INFRA-LINK-3-UPDOWN : Interface GigabitEthernet0/0/0/1, changed state to Down 

|0001184294|LC/0/0/CPU0:Jul  3 09:26:44.917 UTC: ifmgr[201]: %PKT_INFRA-LINK-3-UPDOWN : Interface GigabitEthernet0/0/0/0, changed state to Down 

|0001184395|LC/0/0/CPU0:Jul  3 09:26:45.012 UTC: ifmgr[201]: %PKT_INFRA-LINK-3-UPDOWN : Interface GigabitEthernet0/0/0/0, changed state to Up 

|0001184395|LC/0/0/CPU0:Jul  3 09:26:45.014 UTC: ifmgr[201]: %PKT_INFRA-LINK-3-UPDOWN : Interface GigabitEthernet0/0/0/1, changed state to Up
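
The leading counter on each log line appears to be elapsed milliseconds: between the two `inst_agent` messages stamped 09:18:48.298 and 09:20:41.676 UTC it advances by 113363, which matches the roughly 113.4 s of wall-clock time. Assuming that reading holds, a small Python sketch can measure how long each boot phase took. The helper names are hypothetical.

```python
import re
from typing import Optional

# Matches lines such as '|0000707633|0/RP0/ADMIN0:Jul  3 ...'.
LINE_RE = re.compile(r"^\|(\d+)\|(.*)$")


def first_seen_ms(log_text: str, marker: str) -> Optional[int]:
    """Return the counter of the first prefixed line that contains marker."""
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match and marker in match.group(2):
            return int(match.group(1))
    return None


def elapsed_seconds(log_text: str, start_marker: str, end_marker: str) -> Optional[float]:
    """Seconds between two log messages, treating the counter as milliseconds."""
    start = first_seen_ms(log_text, start_marker)
    end = first_seen_ms(log_text, end_marker)
    if start is None or end is None:
        return None
    return (end - start) / 1000.0


sample_log = """\
|0000707633|0/RP0/ADMIN0:Jul  3 09:18:48.298 UTC: inst_agent[3165]: %INFRA-INSTAGENT-4-XR_PART_PREP_IMG : SDR/XR image baking in progress
|0001043591|0/RP0/ADMIN0:Jul  3 09:24:24.279 UTC: inst_agent[3165]: %INFRA-INSTAGENT-4-XR_PART_PREP_RESP : SDR/XR partition preparation completed successfully
"""
print(elapsed_seconds(sample_log, "XR_PART_PREP_IMG", "XR_PART_PREP_RESP"))
```

On the two quoted lines, the span from "image baking in progress" to "partition preparation completed" comes out at about 336 seconds.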

https://learningnetwork.cisco.com/s/question/0D56e0000CptpGZCQY/xrv-9000-image-is-flaky-and-does-not-boot-moved-from-i7-desktop-hardware-to-used-server-with-dual-xeon-e52450-v2-16-cores-total-w64gb-ram-kicking-the-tires-after-install-to-build-trust-b...

bigevilbeard · ‎07-03-2023

From your diagnostics, it appears you are correct. The Cisco engineering team would need to look at this.

henriklb · ‎07-03-2023

How do I go about letting the Cisco engineering team know? Are you able to assist with that?

bigevilbeard · ‎07-03-2023

They read the threads for users with issues; I do not think there is a direct way to reach out anymore. You could also post the thread in the Webex space: https://developer.cisco.com/site/devnet-chat/

Hope this helps.

Jesus Illescas · ‎07-03-2023

Hi, something I noticed in my case is that the management interface on the XR nodes has no configuration and is shut down. I just created a new sandbox and tested it, and it worked.

Reviewing the nodes in CML (https://10.10.20.161/), I can see there is a day-0 configuration for the nodes.

However, if we go to the XR node, the interface configuration is empty:

RP/0/RP0/CPU0:core-rtr02#show run int MgmtEth0/RP0/CPU0/0
Mon Jul  3 13:42:49.144 UTC
interface MgmtEth0/RP0/CPU0/0
 shutdown
!

RP/0/RP0/CPU0:core-rtr02#

If you look closely, the names of the mgmt interfaces differ between the day-0 config and the XR node: one is MgmtEth0/0/CPU0/0 and the other is MgmtEth0/RP0/CPU0/0. XR does load this config, but as a preconfigured interface, since it cannot find an interface with that name. I believe this is the result of a recent upgrade to the sandbox.

In my case, this was the cause of the issue I was having with the mgmt of the XR nodes when using the NSO reservable sandbox.

To fix it, copy the mgmt configuration shown under the "edit config" tab of the XR node in CML and apply it to the actual XR mgmt interface, MgmtEth0/RP0/CPU0/0.
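
If you would rather not retype the stanza by hand, the rename can be sketched in a few lines of Python. This is only an illustration: the day-0 text below is a made-up sample (the address 192.0.2.10 is a documentation placeholder, not the sandbox's real value), and `mgmt_stanza_for_running_node` is a hypothetical helper name. Paste the real config from the "edit config" tab in its place.

```python
OLD_NAME = "MgmtEth0/0/CPU0/0"    # interface name used in the day-0 config
NEW_NAME = "MgmtEth0/RP0/CPU0/0"  # interface name on the running XR node


def mgmt_stanza_for_running_node(day0_config: str) -> str:
    """Extract the mgmt interface stanza and retarget it to NEW_NAME.

    Returns an empty string if the day-0 config has no stanza for OLD_NAME.
    """
    lines = []
    inside = False
    for line in day0_config.splitlines():
        if line.strip() == f"interface {OLD_NAME}":
            inside = True
            lines.append(f"interface {NEW_NAME}")
        elif inside:
            lines.append(line)
            if line.strip() == "!":
                break  # end of the interface stanza
    return "\n".join(lines)


# Hypothetical day-0 sample; replace with the text from the CML "edit config" tab.
day0 = """\
hostname core-rtr02
!
interface MgmtEth0/0/CPU0/0
 ipv4 address 192.0.2.10 255.255.255.0
 no shutdown
!
interface GigabitEthernet0/0/0/0
 shutdown
!
"""
print(mgmt_stanza_for_running_node(day0))
```

The printed stanza can then be pasted into configuration mode on the XR console and committed.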

Another issue you may run into when entering the XR nodes from the console is that they ask for a username and password. In my case I used root/cisco123, but I also configured a cisco/cisco user, since NSO expects to work with cisco/cisco. Feel free to experiment with other credentials for the XR nodes in NSO, but if you want to use cisco/cisco for the learning lab, you can apply the following configuration to the XR nodes.

username cisco
 group root-lr
 group cisco-support
 secret 10 $6$79vo/0mCRUH56/0.$LBYhgKtuzl6zy6DOrCUgHewAQ4VC6084hinDJ1Z81T1m2fOgph0QKNWYqgsjI3.SCuAa0Uk04GE1w6hG1Q1jN.
!

I hope this helps; in my case it resolved my issues. I'm also informing the sandbox team about this so the fix can be applied automatically.

By the way, for this sandbox, if you have issues with resources, it is better to release your current reservation and create a new one.

bigevilbeard · ‎07-03-2023

Nice catch, the last time I saw that configuration was in 4.3.2!

The change in management interface naming from MgmtEth0/ to MgmtEth0/RP0/ occurred in Cisco XRv9000 IOS XR Release 6.6.2.

Prior to this release, the management interface was accessed using the IP address of MgmtEth0. In Release 6.6.2, the management interface was changed to use the IP address of MgmtEth0/RP0/. This change was made to improve the scalability and security of the management interface.
