
Database Server not-running Cisco ISE

oum-odom
Level 1

Dear Cisco ISE enthusiasts,

Recently, while testing in our lab to reproduce an issue, we hit the Cisco ISE "Database Server not-running" condition.
At the same time, users saw "Posture Failed due to server issues" in the Secure Client UI.

We configured a two-node distributed deployment (PAN and SAN).
ISE v3.1 P10
Both PSNs were active-active during the issue.

+ Node 1 (PAN)
Database Server not-running  
   - oracle.jdbc.driver.T4CTTIoer11.processError(T4CTTIoer11.java:509)
   - root: notice:[application:operation:isehourlycron.sh] ISE Database is not running, aborting cleanup…

+ Node 2 (SAN)
 Database Server running
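
To catch this condition earlier, the cron message above can be searched for in exported log text. The following is only a minimal sketch: the marker substring is copied from the log line quoted above, while the function name and the second sample line are hypothetical placeholders, and nothing here reads a real ISE log path.

```python
from typing import Iterable, List

# Substring copied from the isehourlycron.sh message quoted in the post.
DB_DOWN_MARKER = "ISE Database is not running"


def find_db_down_lines(lines: Iterable[str]) -> List[str]:
    """Return every line that contains the database-down marker."""
    return [line.rstrip("\n") for line in lines if DB_DOWN_MARKER in line]


# The first line is the message from the post; the second is a placeholder.
sample = [
    "root: notice:[application:operation:isehourlycron.sh] "
    "ISE Database is not running, aborting cleanup…",
    "placeholder line without the marker",
]

hits = find_db_down_lines(sample)
print(f"{len(hits)} matching line(s)")
```

A check like this only tells you that the hourly cron job noticed the database was down; it does not explain why it went down.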

As a workaround, we stopped and started the ISE application on the affected node (PAN).
After that, everything was okay.
When we opened a case with TAC, they suspected memory exhaustion.
However, we went back through the ISE health reports and found no peak in resource usage, memory in particular.

Please share your experience and any suggestions for making sure this never happens again.
Thank you,

 

Accepted Solutions

Appreciate your support, @Aref Alsouqi.
After working with TAC, we identified this as bug CSCwr50566. TAC also suggests upgrading to the latest 3.4 patch.


7 Replies

@oum-odom Hi, as TAC explained, this can be due to memory exhaustion or memory leak bugs. I was not able to find a specific bug in the Bug Search Tool for exactly 3.1 Patch 10, but there are a few for other versions. You can check the bug list here.

Bug Search Tool

Additionally, I personally recommend testing the latest ISE major version if you plan to test and then deploy in production (I am not sure about the current use case, since it is in a lab). Versions 3.1 and 3.2 are due to reach end of support in November 2027, and development will stop in November 2026.

Please rate this and mark as solution/answer, if this resolved your issue
Good luck
KB

Thanks, @Kasun Bandara, for your comment. TAC can't determine the root cause yet; they want us to wait until the issue occurs again and then generate heap and thread dumps.

It doesn't make sense to us that we have to wait until the issue happens again.
At first we thought it was the bug below, but that bug was already resolved in our existing v3.1 P10:
https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwm48867 

Hi @oum-odom 

 Both of the following are workarounds for a memory leak:

ise/admin# reload

and

ise/admin# application stop ise
ise/admin# application start ise

Also, use the following command to check for a memory leak:

ise/admin# tech top
top - 12:11:18 up 26 days, 16:28, 2 users, load average: 4.01, 4.53, 4.92
Threads: 1534 total, 1 running, 1533 sleeping, 0 stopped, 0 zombie
%Cpu(s): 14.5 us, 2.5 sy, 0.0 ni, 82.4 id, 0.1 wa, 0.0 hi, 0.5 si, 0.0 st
KiB Mem : 98832816 total, 708812 free, 11291044 used, 86832960 buff/cache
KiB Swap: 8191996 total, 8180988 free, 11008 used. 73835392 avail Mem
...
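
To compare readings over time, the two memory summary lines above can be turned into percentages. This is a rough sketch, assuming the output keeps the same "KiB Mem" / "KiB Swap" layout as the sample; the function name and returned keys are my own and not part of any ISE tooling. Note that a low "free" value with a large "buff/cache" is normal on Linux, so "avail Mem" is the more useful figure to watch.

```python
import re

# The two memory summary lines from the `tech top` sample above.
SAMPLE = """\
KiB Mem : 98832816 total, 708812 free, 11291044 used, 86832960 buff/cache
KiB Swap: 8191996 total, 8180988 free, 11008 used. 73835392 avail Mem
"""

MEM_RE = re.compile(
    r"KiB Mem\s*:\s*(\d+)\s+total,\s*(\d+)\s+free,"
    r"\s*(\d+)\s+used,\s*(\d+)\s+buff/cache"
)
SWAP_RE = re.compile(
    r"KiB Swap\s*:\s*(\d+)\s+total,\s*(\d+)\s+free,"
    r"\s*(\d+)\s+used\.\s*(\d+)\s+avail Mem"
)


def summarize_memory(top_output: str) -> dict:
    """Parse the KiB Mem / KiB Swap lines and return usage percentages."""
    mem = MEM_RE.search(top_output)
    swap = SWAP_RE.search(top_output)
    if mem is None or swap is None:
        raise ValueError("memory summary lines not found in expected format")
    mem_total, _free, mem_used, _cache = map(int, mem.groups())
    swap_total, _swap_free, swap_used, avail = map(int, swap.groups())
    return {
        "used_pct": 100.0 * mem_used / mem_total,
        "avail_pct": 100.0 * avail / mem_total,
        "swap_used_pct": 100.0 * swap_used / swap_total if swap_total else 0.0,
    }


stats = summarize_memory(SAMPLE)
print({name: round(value, 1) for name, value in stats.items()})
```

If the available percentage keeps dropping across several readings while the load stays similar, that is a hint of a leak worth reporting to TAC; a single reading on its own does not prove much.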

 

What HW are you using (SNS 36xx, SNS 37xx, ...) ? Disk size ?

 

Hope this helps !

 

@Marcelo Morais, appreciate your support.
After working with TAC, we identified this as bug CSCwr50566.

@oum-odom ,

 thanks for the feedback !

 What HW are you using (SNS 36xx, SNS 37xx, ...) ? Disk size ?

 

 ISE 3.3 P8 has been released, but reading CSCwr50566 ("Database server not running after oracle crash"), it looks like 3.3 P8 still has the issue, because:

" .. this issue is not applicable to 3.4 where we already have updated Oracle version running ... "

Can you confirm this with TAC?

 

If I were you, I would just go ahead and spin up a new node to replace the one that has the database issue. I think that would be a cleaner and faster way to sort it out.

Appreciate your support, @Aref Alsouqi.
After working with TAC, we identified this as bug CSCwr50566. TAC also suggests upgrading to the latest 3.4 patch.