Hi,
Since the ACE stops forwarding traffic, I assume it is still communicating with the secondary, which is why failover never happens and you have to shut down the ACTIVE manually. If you are not able to log in to it, that could be because there is no minimum resource allocation for management connections, or for the Admin context you may be trying to connect to. Can you try to telnet from the secondary to the ACTIVE when the problem happens? We cannot say for sure what the problem is without data. You can enable syslog and send us the output for review.
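
As a rough sketch of what I mean by a minimum allocation and by enabling syslog, something along these lines in the Admin context. The class name RC-ADMIN, the 10% figure and the logging level are only assumptions for illustration, so please check them against your own setup and software version before applying anything. Lines starting with ! are just my notes and are not meant to be typed.

```
! Hypothetical resource class that guarantees the Admin context a share of all resources
resource-class RC-ADMIN
  limit-resource all minimum 10.00 maximum equal-to-min

context Admin
  member RC-ADMIN

! Buffer syslog messages locally with timestamps (level 6 is an assumed choice)
logging enable
logging timestamp
logging buffered 6
```

With the messages buffered, the output of show logging taken right after the problem occurs is what we would want to see.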
Did you check whether the VMAC for the VIPs was learned on the switches and peripheral devices at the time of the issue? The ACE would only stop forwarding traffic when it is out of resources (high CPU, running out of buffers, etc.), and even then existing connections should continue. We should still be able to log in to the ACE, though. You can also keep a console connection ready and get in via the console to collect information such as show tech and check other details. The best option would be to open a TAC case for this.
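
For reference, these are the kind of commands I would run from the console while the problem is happening. This is only a suggested starting list, not a complete procedure, and the exact output depends on your software version. Lines starting with ! are just my notes.

```
! Redundancy state of the FT group and of the peer
show ft group detail
show ft peer detail

! Resource consumption per context, and CPU load
show resource usage
show processes cpu

! ARP entries, buffered syslog, and the full dump for TAC
show arp
show logging
show tech-support
```

Capturing the same output from the secondary at the same time makes it easier to compare what each unit thinks the state is.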
Regards,
Kanwal