
AXL: listRemoteDestination causing CUCM hang after upgrade to 14

lwrightssl
Level 1

Hi,

We've recently upgraded our CUCM infrastructure from 11.5 SU10 to 14.0.1.12900-161. Since then, a script we use to list all Remote Destinations causes the queried CUCM node to sit at 100% CPU without ever returning a response, and the node then has to be rebooted. An example AXL query is below; note that the issue appears to be independent of the API version used (the XML below specifies 11.5).

This issue is directly attributable to the CUCM upgrade: the script had been working unchanged for the past 6 years or so.

A simple sample XML query, sent via CURL, which demonstrates the issue:

<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope xmlns:ns0="http://www.cisco.com/AXL/API/11.5" xmlns:ns1="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <ns1:Body>
    <ns0:listRemoteDestination>
      <searchCriteria>
        <remoteDestinationProfileName>%</remoteDestinationProfileName>
      </searchCriteria>
      <returnedTags>
        <name></name>
        <destination></destination>
        <answerTooSoonTimer></answerTooSoonTimer>
        <answerTooLateTimer></answerTooLateTimer>
        <delayBeforeRingingCell></delayBeforeRingingCell>
        <remoteDestinationProfileName></remoteDestinationProfileName>
        <enableMobileConnect></enableMobileConnect>
        <mobilityProfileName></mobilityProfileName>
      </returnedTags>
    </ns0:listRemoteDestination>
  </ns1:Body>
</SOAP-ENV:Envelope>
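
For anyone wanting to vary the search pattern or API version while reproducing this, the same request can be generated rather than hand-edited. The following is only a sketch using the Python standard library; the function name and its defaults are illustrative, and the element names are copied from the query above. The generated namespace prefixes may differ from the hand-written XML, but the namespaces themselves are the same.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"

# Tags requested in the original query above.
RETURNED_TAGS = [
    "name", "destination", "answerTooSoonTimer", "answerTooLateTimer",
    "delayBeforeRingingCell", "remoteDestinationProfileName",
    "enableMobileConnect", "mobilityProfileName",
]


def build_list_remote_destination(profile_pattern="%", axl_version="11.5",
                                  tags=RETURNED_TAGS):
    """Return the listRemoteDestination SOAP envelope as UTF-8 bytes.

    Hypothetical helper for illustration; not part of any Cisco library.
    """
    axl_ns = f"http://www.cisco.com/AXL/API/{axl_version}"
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{axl_ns}}}listRemoteDestination")

    # Search criteria: '%' matches every profile, 'A%' only those starting with A.
    criteria = ET.SubElement(request, "searchCriteria")
    ET.SubElement(criteria, "remoteDestinationProfileName").text = profile_pattern

    # One empty element per field we want returned.
    returned = ET.SubElement(request, "returnedTags")
    for tag in tags:
        ET.SubElement(returned, tag)

    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


if __name__ == "__main__":
    print(build_list_remote_destination("A%", "14.0").decode("utf-8"))
```

The resulting bytes can be saved to a file and posted with CURL exactly as with the hand-written request.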

Support appreciated please - we have a formal paid-for Cisco support contract for CUCM but TAC have directed us to post here.

Thanks,

Lawrence.

16 Replies

Obviously it is SUPPOSED to be backward compatible with older schema versions, but have you tried changing your query to use the newer schema from 14.0 to see whether that resolves the problem? Specifically, changing something like this:

WSDL_FILE = 'schema/AXLAPI.wsdl'

Yup, first thing I tried I'm afraid (sorry I should have stated that in the above!)

Well it was worth asking! What is the system that is making the query? I am guessing some kind of shell script that is then calling CURL. FWIW, I have been using more and more Python for this sort of thing although sometimes I duct tape Python, shell, AWK, etc together.

Disclaimer: I do this kind of scripting/automation thing intermittently. I'll have something where I need it, but then not have to touch anything like it for months and months.

Translation: long time Unix person and long time Cisco Collab person. Python only in the last year or two.

So we're actually doing it in Python, however from a "getting support" point of view I thought it best to express in something universal (i.e. query sent via CURL) to remove the programming language from the equation and get right down to "here's the XML flying across the wire"

We're doing a whole bunch of other API calls as well (essentially we've fully automated provisioning of our users, soft and hardphones, device profiles, directory numbers, hunt groups, etc.) but listRemoteDestination is the only one we've hit a problem with post-upgrade.

I trust you have looked at all the basics, like cluster DB replication being happy? I assume you are sending the query to the PUB, but have you tried one of the SUBs? Does the query work properly and quickly if you do a 'run sql' version of it from the CLI?

Yup DB replication is happy, and we have indeed been sending queries to the PUB.

Given we've had to reboot the PUB to recover from this issue we're going to arrange an experimentation window where we can safely experiment with the other queries. I'm particularly interested in the "run sql" result (whether from CLI or via AXL-SQL) as that gives me a potential workaround (not solution, due to SQL interface potentially being unstable vs stable API!) to this issue. 

mradell
Level 1

I'm curious, does the server return anything at all before locking up? As far as troubleshooting goes, I'd maybe check the AXL logs in RTMT and see if anything stands out. I did this myself, as I've recently been writing some Python to create a tool to export/import speed dials from our 12.5 cluster. I see the following in the AXL logs from a successful query:

2023-01-25 08:06:10,729 WARN  [http-nio-1026-exec-141] wrappers.RequestHeaderWrapper - Client sent an unsupported header, SOAPAction= [null], attempting to auto correct.
2023-01-25 08:06:10,729 WARN  [http-nio-1026-exec-141] wrappers.RequestHeaderWrapper - SOAPAction header correction needed: Version is malformed or missing
2023-01-25 08:06:10,730 WARN  [http-nio-1026-exec-141] wrappers.RequestHeaderWrapper - SOAPAction header correction needed: Api is malformed or missing
2023-01-25 08:06:10,798 INFO  [http-nio-1026-exec-141] servletRouters.AXLAlpha - Executing api: getPhone in axis
2023-01-25 08:06:10,945 INFO  [http-nio-1026-exec-141] axlapiservice.GetPhoneHandler - Checking Capf fields for partial response
2023-01-25 08:06:10,961 INFO  [http-nio-1026-exec-141] filters.TimingFilter - Request 1668807300875 received from axl_api at ip 10.116.44.113 was processed in 232ms

I'm using the correct API/Header (12.5) so I'm not sure why I'm getting the WARN messages, but the request works fine and returns the correct information.

So I'm not seeing anything coming back from CURL - I'll get the server side logs checked, good call.

I looked at your query and it looks like you are trying to get ALL remote destination profiles (search criteria = '%'). Is that really what you intend? Rather than debugging CURL, it seems like it would make more sense to debug the Python you are actually using. Are you using the Zeep libraries? If not, I would suggest them. I have had really good luck with those. Something like this:

    source_axl, source_history = axl_bind(source_server, source_userid, source_password)

    my_rdp = source_axl.getRemoteDestinationProfile(name=my_rdp_name_var)

Yep, getting everything is absolutely what we intend and need to do here. We're (unfortunately) still using SUDS and will need to do a big old project to migrate everything across to something new (with Zeep absolutely being what we've looked at). At the end of the day, that'll still be (effectively) making the same query.

BUT in terms of workarounds we're looking at splitting the queries so we filter the remote destination profiles on an A% through Z% basis and then join up the result sets, or alternatively (as above) we can hopefully fall back to using an AXL-SQL query instead (though I'd really rather minimise our use of those).

Key thing is that these are workarounds: if CUCM 14 doesn't want us to query all Remote Destinations in one hit, it should give us an error, not just hang. Then there's the additional question of whether, even if we split the search space into 26 separate queries, it will be happy in the (perhaps inconceivable) situation where an awful lot of people have names beginning with one particular letter. Don't really want to get to the stage where we need to make 26^2 queries!!
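
The prefix-splitting workaround described above could look roughly like the sketch below. It is deliberately client-agnostic: `query` is a hypothetical callable that takes a search pattern such as `A%` and returns a list of dicts, wrapping whatever SUDS or Zeep call is actually in use. Including digits in the prefix list and de-duplicating on (profile name, destination) are assumptions made for illustration; profile names starting with other characters would need extra prefixes, and whether the match is case-sensitive should be checked against the cluster.

```python
import string

# Assumption: profile names start with a letter or a digit.
DEFAULT_PREFIXES = string.ascii_uppercase + string.digits


def list_by_prefix(query, prefixes=DEFAULT_PREFIXES):
    """Run one query per prefix and merge the results.

    `query` is a hypothetical callable: query("A%") -> list of dicts with at
    least 'remoteDestinationProfileName' and 'destination' keys.
    """
    merged = {}
    for prefix in prefixes:
        for row in query(f"{prefix}%"):
            # De-duplicate in case the server matches case-insensitively
            # or prefixes overlap (e.g. 'A' and 'AB').
            key = (row["remoteDestinationProfileName"], row["destination"])
            merged.setdefault(key, row)
    return list(merged.values())


if __name__ == "__main__":
    sample = [
        {"remoteDestinationProfileName": "Alice-RDP", "destination": "111"},
        {"remoteDestinationProfileName": "Bob-RDP", "destination": "222"},
    ]

    def fake_query(pattern):
        # Stand-in for the real AXL call: simple prefix match.
        prefix = pattern[:-1]
        return [r for r in sample
                if r["remoteDestinationProfileName"].startswith(prefix)]

    print(list_by_prefix(fake_query))
```

If one letter turns out to be "too popular", the same function can be called with longer prefixes for just that letter (for example `["SA", "SB", ...]`), which avoids going to 26^2 queries across the board.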

I haven't used it, but I know the AXL stuff is SUPPOSED to support pagination. I haven't used that yet either. I agree that it SHOULDN'T just hang up and spike the CPU, but I don't have anything else to suggest.

lwrightssl
Level 1

Update:

* If I change the query from % to A% then I successfully get data back, therefore the workaround of doing 26 separate queries appears viable, although there's obviously the risk that a particular letter prefix is "too popular" and results in the same hang

* The hang happens on both publisher and subscriber targets, in both cases they just sit at ~100%, and don't return any output to CURL

* No sign of anything strange in AXL logs, last thing related to the query was:

2023-01-30 20:11:26,147 INFO [http-nio-9446-exec-24] servletRouters.AXLAlpha - Executing api: listRemoteDestination in axis

Note that logging continues for other functions, even with CPU nailed to 100%

* Looks like AXL SQL queries might be a viable workaround, but I confused myself trying to work out exactly what I need to query because of all the different tables that need joining up. Will try this again at a quieter time as a backup workaround.

Any ideas? This absolutely looks like a CUCM bug, but other than posting here there doesn't appear to be any way to get support. TAC aren't interested.

Thanks!

You may be able to accomplish this by utilizing the 'skip' and 'first' tags near the end of the request. I believe they are designed to help you send smaller queries when the returned data would be too big (over 8 MB) for AXL to handle, although I've never used them. See this thread (https://community.cisco.com/t5/audio-and-video-endpoints/alx-read-requests-query-too-large-how-to-split-request/m-p/4739143#M4370). It looks like you could send an initial request for x records, then 'skip' that many records in the next request, and so on.
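
A rough sketch of the skip/first loop described above, untested against CUCM 14. `fetch_page` is a hypothetical callable that wraps the real list request (via SUDS, Zeep, or raw XML) and passes the `skip` and `first` values through to it; the page size of 500 is an arbitrary assumption, not a documented limit.

```python
def fetch_all(fetch_page, page_size=500):
    """Collect every record by requesting pages of `page_size` rows.

    `fetch_page(skip=..., first=...)` is a hypothetical wrapper around the
    real AXL list call and must return a list (or None when empty).
    """
    results = []
    skip = 0
    while True:
        page = fetch_page(skip=skip, first=page_size) or []
        results.extend(page)
        # A short (or empty) page means there is nothing left to fetch.
        if len(page) < page_size:
            return results
        skip += page_size


if __name__ == "__main__":
    data = list(range(1234))

    def fake_page(skip, first):
        # Stand-in for the real AXL call: slice a local list.
        return data[skip:skip + first]

    print(len(fetch_all(fake_page)))
```

Whether paging actually avoids the hang is an open question here: if the server still builds the full result set before applying skip/first, this would not help, so it is worth trying with a small page size first.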

Also, I've found the developer forums to be much more useful when looking for answers related to APIs like AXL. They have some specifically for collaboration (https://community.cisco.com/t5/developer-collaboration/ct-p/j-developer-collab)

Will have a look at skip/first - may provide another workaround, ta! Didn't realise the developer forums were separate, will post in there too, thanks.
