9800-80 17.3.7 and Prime 3.8.1 connection problem

Gehrig_W
Level 1

Hello Cisco WLAN Experts,

Today I upgraded our central 9800-80 WLC from 17.3.5a to 17.3.7.

After the upgrade, the following event message appeared several times in the GUI:

Chassis 1 R0/0: ncsshd_bp: NETCONF/SSH: fatal: mm_answer_sign: Xkey_sign failed: error in libcrypto

On our Prime I also noticed that some of the APs connected to the 9800-80 WLC are reported as:

AP `xyz-123' disassociated from Controller 9800-80

I have already run a sync on Prime for the 9800-80, but the APs are still reported as "Not Registered", with Last Reboot Reason "Image Upgrade Success".

I also reset one of these APs, without improvement.

Does anyone know more about the event message and the Prime problem?

Thank you for any hints and tips.

Kind regards

Wini

18 Replies

balaji.bandi
Hall of Fame

This looks like a bug to me, but before we go down that path:

What was the reason for the upgrade?

On the Catalyst 9800 WLC itself, do you see the APs associated?

 

BB


Scott Fella
Hall of Fame

Most of the time you can Google the message to get an idea of the issue. If you can’t find anything, then opening a TAC case is your best option. As for Prime, that kind of issue is typically a version compatibility problem. I spun up a new PI 3.10 when I added the 9800s and never really tried a lower version than that. If you have the resources, spin up a new PI 3.10, add the controller, and see if the issue goes away. At least then you know the answer: compatibility.

-Scott

marce1000
VIP

 

 - It's probably similar to this one: https://bst.cisco.com/bugsearch/bug/CSCvt43974 . I would try to go beyond 17.3.x, for example to 17.9.3 (if you still have Wave 1 series APs and need support for them).

 M.



-- Each morning when I wake up and look into the mirror I always say ' Why am I so brilliant ? '
    When the mirror will then always repond to me with ' The only thing that exceeds your brilliance is your beauty! '

Rich R
VIP

Agreed with Marce - no excuse to stay on 17.3 now that 17.9.3 supports the wave 1 ac APs.

Note that 17.3 reaches end of software maintenance in two days' time, and there will be no more security fixes after September!
https://www.cisco.com/c/en/us/products/collateral/ios-nx-os-software/ios-xe-17/ios-xe-17-3-x-eol.html

Hello,

Thank you for your recommendation to use version 17.9.3 instead.

But according to your software compatibility matrix, this version does not support Cisco Prime 3.8.1. It lists Cisco Prime 3.10.2 instead. Would you still recommend 17.9.3 in this case?

Today I dug a little deeper into our Prime 3.8.1. It now shows a failed telemetry status for the 9800-80 WLC.

I also tried to open a TAC case, but our service provider failed to add the serial numbers of our 9800 WLCs to the service contract.

What a mess! Who can help with telemetry between a 9800 and Prime?

It is interesting to note that all 160 APs are working on the 9800. Exactly 16 of them, all 3800 APs, are reported as gone on the Prime system.

And how can I roll back the software to 17.3.5a on a 9800 WLC?

Thank you for your tips.

Greetings

Wini

  >...But this version does not support Cisco Prime 3.8.1 according to Your SW-compatibility-matrix. It shows Cisco Prime 3.10.2 instead. Would You still recommend to use 17.9.3 in this case ?
     Since Prime is a product line that is coming to an end, I would always advise using its latest version, which will support 17.9.3.

 M.




Hello Marce,

Thank you for your advice to use the latest Prime version. Setting up a new one on version 3.10 will take me weeks of work.

I simply trusted Cisco that a 9800-80 HA pair on 17.3.7 would work with Prime 3.8.1, as the software compatibility matrix states.

I did the upgrade to resolve a field-notice problem: new 9103 V03 units were not joining the WLC because of new drivers from new Cisco suppliers.

But apparently it does not work. It seems to me that the telemetry link has been broken since the upgrade!

I also can no longer find any telemetry configuration for the connection to our Prime in the config.

For example, this block, and many more like it, are missing after the upgrade:

telemetry ietf subscription 113891536
 encoding encode-tdl
 filter tdl-transform BsnMobileStationStats
 stream native
 update-policy periodic 90000
 receiver ip address a.b.c.d 20830 protocol cntp-tcp
telemetry ietf subscription 117267423
 encoding encode-tdl
 filter tdl-transform LradIfChannelNoise
 stream native
 update-policy periodic 180000
 receiver ip address 10.200.67.67 20830 protocol cntp-tcp

The following check also shows that the telemetry connection has been gone since the software upgrade:

WLC-9800#show telemetry internal protocol cntp-tcp manager 10.200.67.67 20830 protocol cntp-tcp source-address 10.222.126.4

% Error: Connection '10.200.67.67:20830::10.222.126.4' doesn't exist
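
To see exactly which subscriptions disappeared, the backup config and the current running-config can be compared offline. The following is only a minimal Python sketch, not a Cisco tool: the function names `parse_subscriptions` and `missing_subscriptions` are made up for illustration, and it assumes both configs are available as plain text with the sub-commands indented by at least one space, as in the block above.

```python
import re
from typing import Dict, List


def parse_subscriptions(config_text: str) -> Dict[str, List[str]]:
    """Map each 'telemetry ietf subscription <id>' line to its indented sub-commands."""
    subs: Dict[str, List[str]] = {}
    current = None
    for raw in config_text.splitlines():
        line = raw.rstrip()
        if not line.strip():
            continue  # ignore blank lines inside or between blocks
        match = re.match(r"^telemetry ietf subscription (\d+)\s*$", line)
        if match:
            current = match.group(1)
            subs[current] = []
        elif current is not None and line.startswith(" "):
            subs[current].append(line.strip())
        else:
            current = None  # any other top-level command ends the block
    return subs


def missing_subscriptions(backup: str, running: str) -> List[str]:
    """Return subscription IDs found in the backup but not in the running config."""
    before = parse_subscriptions(backup)
    after = parse_subscriptions(running)
    return sorted(set(before) - set(after), key=int)
```

Feeding it the saved 17.3.5a config and the current `show running-config` output would list the subscription IDs that were lost. It only shows the difference; whether pasting the blocks back by hand is safe is a separate question.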

 

How can I make telemetry work again after the software upgrade?

We are using an HA setup, by the way. Since the upgrade, we are running on the former standby unit.

Does it make sense to force a failover?

Would the second unit do a better job?

Please check and come back with information.

Kind regards

Wini

Prime-WLC problems often require a delete and re-add.

Note 3.8 is approaching EOL so you need to be planning the move to 3.10 anyway - software maintenance ending 15 July 2023:
https://www.cisco.com/c/en/us/products/collateral/cloud-systems-management/prime-infrastructure/prime-infrastructure-pids-pb.html
And the entire product is going EOL. You'll need to stay on 3.10 until then, because there won't be any new major releases, and software maintenance only runs until 28 September 2024:
https://www.cisco.com/c/en/us/products/collateral/cloud-systems-management/prime-infrastructure/prime-infrast-gen-appliance-lic-eol.html

 

Gehrig_W
Level 1

Hello Marce,

In the meantime I found bug CSCvx46784, "9800 Controller telemetry failure issue":

Symptom: Prime Infrastructure may fail to collect telemetry data from a managed Catalyst 9800 wireless LAN controller. On closer inspection, the "tdlcold" process in the Coral service became stuck in the CNDP_STATE_CON_INIT state. When the Coral container was restarted, the eWLC controller was able to reconnect and the Coral service's state was able to transition to CNDP_STATE_CON_CONNECTED.

Conditions: This was observed in Prime Infrastructure 3.8, managing a Catalyst 9800 wireless LAN controller.

Can someone please tell me how to restart a Coral container on Prime?

What a mess

Wini

  >...Can someone please tell me how to restart a Coral Container on a Prime?!?
   Ref : https://www.cisco.com/c/en/us/support/docs/wireless/catalyst-9800-series-wireless-controllers/214286-managing-catalyst-9800-wireless-controll.html
           >...Note: On Prime 3.8, Coral service can be restarted outside of container using 'sudo /opt/CSCOlumos/coralinstances/coral2/coral/bin/coral restart 1'

   Appendix : below are a number of other useful commands for debugging telemetry
           show telemetry ietf subscription all 
           show telemetry ietf subscription 23 detail (e.g.)
           show telemetry internal subscription all stats
           show telemetry internal connection <0-4294967294> detail
           show telemetry ietf subscription configured


 M.




Restarting the Coral service solved the issue on my side. Thanks.

UKW-NK-Cisco
Level 1

Hello marce1000,

Thank you very much for your explanations and commands. I have restarted the Coral container on Prime, but it did not help.

When I run the command "show telemetry ietf subscription configured" on the 9800-80, I see only four IDs:

WLC-9800#show telemetry ietf subscription configured
Telemetry subscription brief

ID Type State Filter type
--------------------------------------------------------
124 Configured Valid tdl-uri
125 Configured Valid transform-name
126 Configured Valid transform-name
127 Configured Valid tdl-uri

These entries are used by DNA Spaces and point to our DNA Spaces connector.

When I do the same on the 9800-L-C guest WLC, which is still running 17.3.6, I see a long list of entries like:

WLC-9800-Guest#show telemetry ietf subscription configured
Telemetry subscription brief

ID Type State Filter type
--------------------------------------------------------
102687940 Configured Valid tdl-uri
147114719 Configured Valid transform-name
221543406 Configured Valid transform-name
276706442 Configured Valid transform-name
310032319 Configured Valid transform-name

......

Looking into the details of these IDs, I can find the IP of our Prime.

So in principle, telemetry between our Prime and the 9800-L-C guest WLC (on 17.3.6) is working.

But it no longer works for the upgraded 9800-80 HA pair, now running 17.3.7!
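
The comparison between the two controllers can also be done mechanically once the lists get long. Below is a rough Python sketch under stated assumptions: it expects the four-column layout of "show telemetry ietf subscription configured" exactly as pasted above, and the helper names `parse_subscription_brief` and `ids_only_in` are hypothetical, not part of any Cisco tooling.

```python
from typing import List, Tuple

Row = Tuple[int, str, str, str]  # (id, type, state, filter type)


def parse_subscription_brief(output: str) -> List[Row]:
    """Extract the data rows from pasted 'subscription configured' output."""
    rows: List[Row] = []
    for line in output.splitlines():
        parts = line.split()
        # Data rows have exactly four columns and start with a numeric ID;
        # the prompt, title, header and separator lines do not.
        if len(parts) == 4 and parts[0].isdigit():
            rows.append((int(parts[0]), parts[1], parts[2], parts[3]))
    return rows


def ids_only_in(first: List[Row], second: List[Row]) -> List[int]:
    """Return subscription IDs present in 'first' but absent from 'second'."""
    return sorted({r[0] for r in first} - {r[0] for r in second})
```

Running `ids_only_in` on the parsed guest WLC output against the parsed 9800-80 output would give the IDs that exist only on the guest controller. Note that the IDs themselves are not expected to match between two different controllers, so the counts and filter types are the more useful comparison.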

As already mentioned, the whole telemetry block for connections to our Prime is missing from the running-config of the 9800-80.

Can you please explain how this big block of commands gets into the running-config?

Can I instead copy this block of commands manually from my backup config of the 9800-80 into the running-config?

The 9800-80 changed its active unit after the ISSU upgrade. Does it make sense to test a "redundancy force-failover" to switch back to the unit that was active before the upgrade to 17.3.7?

Would this bring back the telemetry function?

I also tried a telnet connection, in comparison with the working 9800-L-C for guests:

WLC-9800-Guest#telnet Prime-IP 20830
Trying Prime-IP, 20830 ... Open

[Connection to Prime-IP closed by foreign host]

WLC-9800#telnet Prime-IP 20830
Trying Prime-IP, 20830 ... Open

The session keeps hanging at "Open".

From Prime to the non-working 9800-80 and to the working 9800-L-C:

prime1/admin# ssh 9800-IP admin port 830
protocol identification string lack carriage return
Connection closed by 9800-IP port 830
prime1/admin#

prime1/admin#ssh 9800-L-C-IP admin port 830
protocol identification string lack carriage return
admin@9800-L-C-IP's password:
<?xml version="1.0" encoding="UTF-8"?>
<hello xmlns="urn:ietf:params:xml:ns:netconf:base:1.0">
<capabilities>
.......
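
The telnet and ssh checks above can be roughly reproduced from a management workstation with a few lines of Python. This is only a sketch: `probe_tcp` and its four result labels are invented for illustration, and it tests the path from wherever the script runs, not from the WLC or from Prime, so it complements the on-box checks rather than replacing them.

```python
import socket


def probe_tcp(host: str, port: int, timeout: float = 5.0) -> str:
    """Classify a TCP endpoint by what happens right after the connect.

    Returns one of:
      'unreachable' - the connect failed or timed out
      'open-silent' - connected, but the peer sent nothing within the timeout
      'open-closed' - connected, then the peer closed or reset without data
      'open-data'   - connected and the peer sent something (e.g. an SSH banner)
    """
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        return "unreachable"
    with sock:
        sock.settimeout(timeout)
        try:
            data = sock.recv(256)
        except socket.timeout:
            return "open-silent"
        except OSError:
            return "open-closed"
    return "open-data" if data else "open-closed"
```

In these terms, the guest WLC telnet session above ("Open", then closed by foreign host) corresponds to 'open-closed', and the hanging session from the 9800-80 corresponds to 'open-silent'. A call such as `probe_tcp("9800-IP", 830)` would use the placeholder address from the post; substitute the real one.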

Please check and advise

Kind regards

Wini

  >...The 9800-80 has changed it's active unit after the issu-upgrade. Does it make sense to test a "redundancy force-failover" to change the active unit to the unit which was active unit before the SW-Upgrade to 17.3.7 ?
  I don't think that will help, and it's not advisable. You may want to try 17.9.3, which you can upgrade to directly from 17.3.x and which again supports Wave 1 series APs.

 M.




UKW-NK-Cisco
Level 1

Hello Marce1000,

Thank you for your advice. But your recommended version 17.9.3 does not support Wave 1 2702 APs, of which we have around 600 in our zoo. Also, our Prime 3.8.1 is not supported with 17.9.3! Therefore the upgrade to 17.9.3 is not a good idea, in my opinion.

By the way, the process that is complaining on the 9800-80 is called "ncsshd_bp".

I would like to debug this process, but cannot find it using the command

WLC-9800#set platform software trace ?

Does anyone know how I can do per-process debugging for this process?

 

I also haven't found the alarm message in the error and system messages guide for 9800 WLCs:

%DMI-2-NETCONF_SSH_CRITICAL: Chassis 1 R0/0: ncsshd_bp: NETCONF/SSH: fatal: mm_answer_sign: Xkey_sign failed: error in libcrypto
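
For anyone collecting these messages from a log export, the line follows the usual Cisco `%FACILITY-SEVERITY-MNEMONIC: text` layout, where severity 2 means "critical". A small Python sketch for splitting it up follows; the function name `parse_syslog` is hypothetical and the pattern is deliberately simplified, so it will not cover every message variant.

```python
import re
from typing import Dict, Optional, Union

# Standard syslog severity levels 0-7 as used by Cisco IOS / IOS XE.
SEVERITY_NAMES = [
    "emergency", "alert", "critical", "error",
    "warning", "notification", "informational", "debugging",
]

# Simplified pattern: assumes facility and mnemonic contain only A-Z, 0-9 and '_'.
PATTERN = re.compile(
    r"%(?P<facility>[A-Z0-9_]+)-(?P<severity>[0-7])-(?P<mnemonic>[A-Z0-9_]+): (?P<text>.*)"
)


def parse_syslog(line: str) -> Optional[Dict[str, Union[str, int]]]:
    """Split a '%FACILITY-SEVERITY-MNEMONIC: text' message into its parts."""
    match = PATTERN.search(line)
    if match is None:
        return None
    severity = int(match.group("severity"))
    return {
        "facility": match.group("facility"),
        "severity": severity,
        "severity_name": SEVERITY_NAMES[severity],
        "mnemonic": match.group("mnemonic"),
        "text": match.group("text"),
    }
```

Applied to the message above, this yields facility DMI and mnemonic NETCONF_SSH_CRITICAL, which are the terms to search for in the bug tool.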

What does this mean, and how can I solve this telemetry problem?

Thank you for any tips.

Kind regards

Wini