Open a couple of SSH CLI sessions to UCSM with the local admin account, and make sure each session is being logged to a file.
On one of the SSH sessions, enter NX-OS mode (connect nxos) and run the following:
(nxos)# test aaa group ldap <username> <password>
(nxos)# test aaa server ldap <LDAP-server-IP-address> <username> <password> (Do this for every LDAP server listed in your UCSM config)
Exit NX-OS mode and connect to local management on each FI (a, then b):
#connect local-mgmt a/b
local-mgmt#ping x.x.x.x (Do this for all LDAP server IPs and FQDNs)
local-mgmt#telnet x.x.x.x <ldap-port> (Do this for all LDAP server IPs and FQDNs)
Example: local-mgmt#telnet 192.168.1.12 389
The ping and telnet tests check for name resolution and connectivity problems between the FIs and the LDAP servers.
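If you also want to rule out the LDAP servers themselves from a management workstation, the same two checks (does the name resolve, does the LDAP port accept a TCP connection) can be sketched in Python. This is only a rough sketch: the function names are made up for illustration, it runs from wherever you launch it rather than from the FIs, so it complements the local-mgmt tests rather than replacing them, and an open port says nothing about whether a bind would succeed.

```python
import socket


def resolve(host):
    """Return the sorted IP addresses a name resolves to, or [] if resolution fails."""
    try:
        infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and failed name resolution.
        return False


def check_servers(servers, port=389):
    """Run both checks for every server; returns (server, addresses, port_open) tuples."""
    return [(s, resolve(s), tcp_port_open(s, port)) for s in servers]
```

For example, check_servers(["192.168.1.12"], port=389) would cover the server from the telnet example above. List each LDAP server by both IP and FQDN, as in the manual tests.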
On your second SSH session, enable some debug output from NX-OS mode (make sure this session is being logged to a file, as the next steps will generate a lot of output):
nxos#debug aaa aaa-requests
nxos#debug ldap aaa-request-lowlevel
nxos#debug ldap aaa-request
Test your ldap logins through the GUI or CLI.
When you want to disable the debug output, go to the other SSH session (the one not showing the LDAP debug output) and turn debugging off from NX-OS mode. The standard NX-OS command for this is:
nxos#undebug all
Once the login problems are actually occurring (it sounds like they are sporadic), I would run through all of the steps just outlined, in that order.
If all the ping/telnet tests pass while the problem is active, and the debug output doesn't make clear what is going wrong, then you will probably want to open a TAC case and provide a summary of your testing along with the debug output.
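Since the debug session log can get very large, it may help to pull out the relevant lines before writing up the summary. Below is a minimal Python sketch that does a case-insensitive keyword search over a saved session log. The function names and the default keywords ("ldap", "aaa") are assumptions for illustration, and nothing is assumed about the debug output format beyond it being plain text. Still attach the full log to the case; this is only for finding your way around it.

```python
def find_lines(lines, keywords=("ldap", "aaa")):
    """Return (line_number, text) pairs for lines containing any keyword, ignoring case."""
    wanted = [k.lower() for k in keywords]
    hits = []
    for number, text in enumerate(lines, start=1):
        lowered = text.lower()
        if any(k in lowered for k in wanted):
            hits.append((number, text.rstrip("\n")))
    return hits


def find_in_file(path, keywords=("ldap", "aaa")):
    """Apply find_lines to a saved session log; undecodable bytes are replaced, not fatal."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return find_lines(handle, keywords)
```

The line numbers in the result make it easy to point TAC at the exact spot in the full log.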