01-06-2021 06:54 AM
Hi all,
I have an application that connects to multiple NCS55a2s. Occasionally, the software requests status from all of the devices (in my case, it gathers all the alarms on the devices). To make this faster, each NETCONF request runs concurrently in its own thread. My application has ended up in a deadlock. The thread that all the others are waiting on has the following stack trace:
#0 0x00007fee590f74ed in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007fee590f483e in _L_lock_39 () from /lib64/libpthread.so.0
#2 0x00007fee590f4778 in pthread_cond_destroy@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#3 0x00007fee58a08d3e in xmlFreeRMutex () from /lib64/libxml2.so.2
#4 0x00007fee58a5d160 in xmlDictCleanup () from /lib64/libxml2.so.2
#5 0x00007fee589af6e3 in xmlCleanupParser () from /lib64/libxml2.so.2
#6 0x000000000061e958 in ydk::path::RootSchemaNodeImpl::populate_new_schemas_from_payload(std::string const&, ydk::EncodingFormat) ()
#7 0x0000000000611862 in ydk::path::Codec::decode(ydk::path::RootSchemaNode&, std::string const&, ydk::EncodingFormat) ()
#8 0x00000000005f79d4 in ydk::path::netconf_output_to_datanode(std::string const&, ydk::path::RootSchemaNode&) ()
#9 0x00000000005fa7e4 in ydk::path::NetconfSession::handle_netconf_operation(ydk::path::Rpc&) const ()
#10 0x00000000005fabd4 in ydk::path::NetconfSession::invoke(ydk::path::Rpc&) const ()
#11 0x0000000000623eb3 in ydk::path::RpcImpl::operator()(ydk::path::Session const&) ()
#12 0x00000000005f1a8f in ydk::get_entity(ydk::NetconfServiceProvider&, ydk::DataStore, ydk::Entity&, char const*) ()
#13 0x00000000005f2297 in ydk::NetconfService::get(ydk::NetconfServiceProvider&, ydk::Entity&) ()
#14 0x00000000005ef463 in ydk::NetconfServiceProvider::execute_operation(std::string const&, ydk::Entity&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > >) ()
#15 0x00000000005df9dd in ydk::CrudService::read(ydk::ServiceProvider&, ydk::Entity&) ()
I am not a libxml2 expert by any means, but I've been trying to work out why we would end up in a deadlock inside an XML routine. In the trace, xmlCleanupParser (frame #5) is called from ydk::path::RootSchemaNodeImpl::populate_new_schemas_from_payload (frame #6), which runs in one of the worker threads. I've discovered that libxml2 requires xmlInitParser to be called once in the main thread of a multithreaded application before any other calls to its API, and xmlCleanupParser to be called once at the end of the program. Specifically, I found the following statement on their API reference page:
----------------------------------------------------------
(http://xmlsoft.org/html/libxml-parser.html)
Function: xmlCleanupParser
void xmlCleanupParser (void)
This function name is somewhat misleading. It does not clean up parser state, it cleans up memory allocated by the library itself. It is a cleanup function for the XML library. It tries to reclaim all related global memory allocated for the library processing. It doesn't deallocate any document related memory. One should call xmlCleanupParser() only when the process has finished using the library and all XML/HTML documents built with it. See also xmlInitParser() which has the opposite function of preparing the library for operations. WARNING: if your application is multithreaded or has plugin support calling this may crash the application if another thread or a plugin is still using libxml2. It's sometimes very hard to guess if libxml2 is in use in the application, some libraries or plugins may use it without notice. In case of doubt abstain from calling this function or do it just before calling exit() to avoid leak reports from valgrind !
----------------------------------------------------------
This leads me to a couple of questions:
Any thoughts or insights on this issue would be greatly appreciated.
Thanks,
Eric
01-06-2021 02:26 PM
I checked libxml2 documentation and found this note:
Generally xmlCleanupParser() is safe assuming no parsing is ongoing and no document is still being used, if needed the state will be rebuild at the next invocation of parser routines (or by xmlInitParser()), but be careful of the consequences in multithreaded applications.
I then commented out line 67 (the xmlCleanupParser() call) in the file https://github.com/CiscoDevNet/ydk-cpp/blob/master/core/ydk/src/path/root_schema_node.cpp and reran the memory leak test. I did not find any memory leak related to the libxml2 library, so the line can be safely commented out. Please recompile the YDK C++ core library, rerun your test, and let us know whether that resolves your issue.
01-19-2021 10:18 AM
We removed the xmlCleanupParser call. We still saw one instance of a deadlock, but after looking at the libxml2 documentation, we decided to call xmlInitParser at the beginning of our main routine and xmlCleanupParser at the end of it. Since we did that, the application has run cleanly for the past week. It seems that removing xmlCleanupParser from root_schema_node.cpp did help.
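For anyone who finds this thread later, the arrangement described above might look roughly like the sketch below. It is only an illustration, not our actual code and not part of YDK. So that it builds without libxml2, fake_xmlInitParser and fake_xmlCleanupParser are stand-ins for the real xmlInitParser() and xmlCleanupParser(); XmlLibraryGuard, poll_device and run_application are hypothetical names made up for this example.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Stand-ins for libxml2's xmlInitParser() and xmlCleanupParser(), so that this
// sketch builds without libxml2. In a real program, include <libxml/parser.h>
// and call the real functions instead.
static std::atomic<int> g_init_calls{0};
static std::atomic<int> g_cleanup_calls{0};
static void fake_xmlInitParser() { ++g_init_calls; }
static void fake_xmlCleanupParser() { ++g_cleanup_calls; }

// Hypothetical RAII guard: initializes the library when constructed and
// cleans it up when destroyed. Create exactly one, in the main thread.
class XmlLibraryGuard {
public:
    XmlLibraryGuard() { fake_xmlInitParser(); }
    ~XmlLibraryGuard() { fake_xmlCleanupParser(); }
    XmlLibraryGuard(const XmlLibraryGuard&) = delete;
    XmlLibraryGuard& operator=(const XmlLibraryGuard&) = delete;
};

// Placeholder for the per-device work (for example, a NETCONF read).
// Worker code must never call the cleanup function itself.
static void poll_device(std::atomic<int>& completed) { ++completed; }

// Stands in for the body of main(): returns how many workers finished.
int run_application(std::size_t device_count) {
    XmlLibraryGuard guard;  // init once, before any worker thread starts
    std::atomic<int> completed{0};

    std::vector<std::thread> workers;
    workers.reserve(device_count);
    for (std::size_t i = 0; i < device_count; ++i) {
        workers.emplace_back(poll_device, std::ref(completed));
    }
    for (auto& worker : workers) {
        worker.join();  // every worker is done before cleanup can run
    }

    assert(completed.load() == static_cast<int>(device_count));
    return completed.load();
}  // guard is destroyed here: cleanup runs once, after all threads have joined
```

The point of the guard is ordering: initialization happens before the first thread is created, and cleanup cannot run until every worker has been joined, so no thread can still be inside the library when its global state is torn down.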