EricMonson
Beginner

ydk-cpp deadlock xmlCleanupParser

Hi all,

 

I have an application that connects to multiple NCS55a2s. Occasionally, the software requests status from all of the devices (in my case, it gathers all the alarms on the devices). To make this faster, each NETCONF request runs in its own thread, so the requests execute concurrently. My application has ended up in a deadlock. The thread that all the others are waiting on has the following stack trace:

 

 

#0  0x00007fee590f74ed in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007fee590f483e in _L_lock_39 () from /lib64/libpthread.so.0
#2  0x00007fee590f4778 in pthread_cond_destroy@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#3  0x00007fee58a08d3e in xmlFreeRMutex () from /lib64/libxml2.so.2
#4  0x00007fee58a5d160 in xmlDictCleanup () from /lib64/libxml2.so.2
#5  0x00007fee589af6e3 in xmlCleanupParser () from /lib64/libxml2.so.2
#6  0x000000000061e958 in ydk::path::RootSchemaNodeImpl::populate_new_schemas_from_payload(std::string const&, ydk::EncodingFormat) ()
#7  0x0000000000611862 in ydk::path::Codec::decode(ydk::path::RootSchemaNode&, std::string const&, ydk::EncodingFormat) ()
#8  0x00000000005f79d4 in ydk::path::netconf_output_to_datanode(std::string const&, ydk::path::RootSchemaNode&) ()
#9  0x00000000005fa7e4 in ydk::path::NetconfSession::handle_netconf_operation(ydk::path::Rpc&) const ()
#10 0x00000000005fabd4 in ydk::path::NetconfSession::invoke(ydk::path::Rpc&) const ()
#11 0x0000000000623eb3 in ydk::path::RpcImpl::operator()(ydk::path::Session const&) ()
#12 0x00000000005f1a8f in ydk::get_entity(ydk::NetconfServiceProvider&, ydk::DataStore, ydk::Entity&, char const*) ()
#13 0x00000000005f2297 in ydk::NetconfService::get(ydk::NetconfServiceProvider&, ydk::Entity&) ()
#14 0x00000000005ef463 in ydk::NetconfServiceProvider::execute_operation(std::string const&, ydk::Entity&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > >) ()
#15 0x00000000005df9dd in ydk::CrudService::read(ydk::ServiceProvider&, ydk::Entity&) ()

I am not a libxml2 expert by any means, but I've been trying to work out why we would end up in a deadlock inside an XML routine. I've discovered that libxml2 requires xmlInitParser to be called once in the main thread of a multithreaded application before any other calls to its API, and xmlCleanupParser to be called once at the end of the program. Specifically, I found the following statement on their API reference page:

 

----------------------------------------------------------

(http://xmlsoft.org/html/libxml-parser.html)

Function: xmlCleanupParser
void xmlCleanupParser (void)

This function name is somewhat misleading. It does not clean up parser state, it cleans up memory allocated by the library itself. It is a cleanup function for the XML library. It tries to reclaim all related global memory allocated for the library processing. It doesn't deallocate any document related memory. One should call xmlCleanupParser() only when the process has finished using the library and all XML/HTML documents built with it. See also xmlInitParser() which has the opposite function of preparing the library for operations. WARNING: if your application is multithreaded or has plugin support calling this may crash the application if another thread or a plugin is still using libxml2. It's sometimes very hard to guess if libxml2 is in use in the application, some libraries or plugins may use it without notice. In case of doubt abstain from calling this function or do it just before calling exit() to avoid leak reports from valgrind !

----------------------------------------------------------

 

This leads me to a couple of questions:

  1. xmlInitParser is never called in the ydk-cpp codebase. Is the application supposed to call xmlInitParser before using the ydk-cpp libraries if it is multithreaded?
  2. Why is xmlCleanupParser called in the file https://github.com/CiscoDevNet/ydk-cpp/blob/master/core/ydk/src/path/root_schema_node.cpp at line 67? In my case, since I have multiple threads all making NETCONF requests at the same time, it is entirely possible for the responses from two network devices to come back close enough together that both threads are processing a response (i.e. both are decoding XML) at the same time. It sounds like the call to xmlCleanupParser would cause undefined behavior in this use case, which would explain my deadlock.
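The concern in question 2 can be illustrated with a toy model. This is only a sketch: `GlobalLibraryModel` and `cleanup_after_first_decode_is_safe` are hypothetical names invented for illustration, and the class is a stand-in for any library with process-wide state, not a description of libxml2 internals. It replays, in a fixed order, the interleaving where one thread finishes decoding and runs the global cleanup while a second thread is still decoding.

```cpp
#include <cassert>

// Toy stand-in for a library with process-wide state (NOT libxml2 itself).
class GlobalLibraryModel {
public:
    void begin_parse() { ++active_parsers_; }
    void end_parse() { --active_parsers_; }
    // True only when no decode is in flight, i.e. when tearing down
    // the shared state cannot affect another user of the library.
    bool cleanup_is_safe() const { return active_parsers_ == 0; }

private:
    int active_parsers_ = 0;
};

// Replays a fixed interleaving: decode A starts, decode B optionally
// starts, A finishes and immediately runs the global cleanup.
// Returns whether that cleanup happened at a safe moment.
bool cleanup_after_first_decode_is_safe(bool decodes_overlap) {
    GlobalLibraryModel lib;
    lib.begin_parse();                        // thread A starts decoding
    if (decodes_overlap) lib.begin_parse();   // thread B starts decoding
    lib.end_parse();                          // thread A finishes
    const bool safe = lib.cleanup_is_safe();  // A cleans up at end of decode
    if (decodes_overlap) lib.end_parse();     // thread B finishes later
    return safe;
}
```

With `decodes_overlap` set to `false` the cleanup is harmless; with `true` it runs while the second decode is still active, which is the situation the libxml2 warning quoted above describes.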

Any thoughts or insights to this issue would greatly be appreciated.

 

Thanks,

Eric

 

 

1 ACCEPTED SOLUTION

yangorelik
Participant

I checked libxml2 documentation and found this note:

Generally xmlCleanupParser() is safe assuming no parsing is ongoing and no document is still being used, if needed the state will be rebuild at the next invocation of parser routines (or by xmlInitParser()), but be careful of the consequences in multithreaded applications.

I then commented out line 67 in the file https://github.com/CiscoDevNet/ydk-cpp/blob/master/core/ydk/src/path/root_schema_node.cpp and reran the memory leak test. It showed no memory leaks related to the libxml2 library, so the line can be safely commented out. Please recompile the YDK C++ core library, rerun your test, and let us know whether that resolves your issue.

Yan Gorelik
YDK Solutions


2 REPLIES 2

We removed the xmlCleanupParser call. We still saw an instance of a deadlock, but looking at the documentation for libxml2, we decided we should call "xmlInitParser" at the beginning of our "main" routine and "xmlCleanupParser" at the end of our "main" routine. Once we did that, the application ran cleanly over the last week. It seems like removing xmlCleanupParser from root_schema_node.cpp did help.
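The init-at-start, cleanup-at-end arrangement described above might be sketched as follows. `LibraryGuard` and `run_with_guard` are hypothetical names, and the init and cleanup steps here are stand-in lambdas that only record events, so the sketch compiles without libxml2. In a real application the assumption is that `xmlInitParser` and `xmlCleanupParser` from `<libxml/parser.h>` would be passed instead, with the guard created at the top of `main`.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical RAII helper: runs `init` when constructed and `cleanup`
// when destroyed, so cleanup happens exactly once at scope exit.
class LibraryGuard {
public:
    LibraryGuard(std::function<void()> init, std::function<void()> cleanup)
        : cleanup_(std::move(cleanup)) {
        init();
    }
    ~LibraryGuard() { cleanup_(); }
    LibraryGuard(const LibraryGuard&) = delete;
    LibraryGuard& operator=(const LibraryGuard&) = delete;

private:
    std::function<void()> cleanup_;
};

// Runs `worker_count` threads inside the guard's scope and returns the
// recorded sequence of events.
std::vector<std::string> run_with_guard(int worker_count) {
    std::vector<std::string> events;
    std::mutex events_mutex;
    auto record = [&](const std::string& event) {
        std::lock_guard<std::mutex> lock(events_mutex);
        events.push_back(event);
    };
    {
        // Real-application assumption:
        //   LibraryGuard guard(xmlInitParser, xmlCleanupParser);
        LibraryGuard guard([&] { record("init"); },
                           [&] { record("cleanup"); });
        std::vector<std::thread> workers;
        for (int i = 0; i < worker_count; ++i) {
            workers.emplace_back([&] { record("work"); });
        }
        for (auto& worker : workers) {
            worker.join();  // every worker finishes before the guard dies
        }
    }  // guard destroyed here: cleanup runs after all workers have joined
    return events;
}
```

The point of the design is ordering: every worker thread is joined before the guard goes out of scope, so the cleanup step can never overlap with a thread that is still using the library.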
