
Rogger Side B restarts every time we make it active

AnoopKrishnan
Level 1

Hi All, 

The ICM Rogger Side B restarts every time it is supposed to be active. The only cause we can find in the logs is "Connectivity with a duplexed partner has been lost due to a failure of the private network, or duplexed partner is out of service". We have checked every log we could and found no other cause. We also involved our network team and vendor and ran continuous pings between Rogger A and B on both the private and the public IPs, which also led to a dead end. When Rogger A is active, it is very stable. The issue only happens when Side B is active.

Looking for suggestions and answers if anyone has come across this kind of issue before.

 

Regards, 

Anoop 

11 Replies

Have you checked some of the basic things, like making sure the hostnames are correct on the B side?
Also, in a maintenance window, have you tried shutting down both sides and then starting B first, rather than getting B active through a failover? That isn't a fix, but it may help you determine where the issue is.

The hostnames are correct on the B side; we have also run Setup many times and verified the configuration. We shut down both sides during Microsoft patching on a weekend and brought Side B up as active. But after a day or two, it failed over to Side A. The analysis showed the same error, "Private Link Failure". We even increased the trace levels on all the ICM components that bounced along with the Rogger, but that did not help.

The UCCE version is 12.6(1).

Hi, so that's a little different from what you said earlier. Is the issue that you were surprised that you had a private link failure/took a network hit?
So if A is active and B is standby, your private network link is good.
But if B is active, it always fails over to A after a day or two. Is that the scenario?

Hi Bill, 

It is not about the network; it's about Side B, which always fails over to Side A when it is kept active. In the MDS logs, we can see the message "Connectivity with a duplexed partner has been lost due to a failure of the private network, or duplexed partner is out of service". Side A does not have any issue like this when it is active.

This might not be feasible given the risk if there is a failure, but have you tried running with only B active? Do you still see the issue? In other words, does B only have a problem when A is also running, or also when B is running and services are stopped on A?

We didn't run Side B alone with the A side shut down, as this is a production environment and one failure can impact the business. The last flap did open a Pandora's box, but we recovered immediately.

Hi,
some thoughts on this:
- Check the DNS configuration for the public interfaces as well. The checkbox "autoregister DNS" must not be set in the network interface configuration.
- Are there any monitoring applications (Prognosis, Nagios, etc.) installed? => uninstall them
- Disable Antivirus
- Disable Windows Defender Firewall (this is always causing trouble!)
- Check the firewall between both sides, if there is one present. (refer to the Port Utilization Guide)
- Has any hardening configuration been done on the Windows OS? => revert it
- Update to 12.6(2) (this includes almost all of the Engineering Specials for 12.6(1)).

Kindly let me know if any of the above helps or solves the issue.

Kind regards
Dennis
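
For the firewall item in the checklist above, a quick reachability test from one side to the other can rule out a blocked TCP port. The following is only a minimal sketch: the function name `tcp_port_open` is made up for illustration, and the host and port values are placeholders to be filled in from the Port Utilization Guide, since the thread does not list any port numbers. A successful TCP connect says nothing about UDP traffic or about intermittent packet loss.

```python
import socket


def tcp_port_open(host, port, timeout_s=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        # create_connection resolves the host and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False
```

Run from Rogger B against the Side A private address, for example `tcp_port_open("SIDE_A_PRIVATE_IP", PORT)` with the placeholders replaced, and repeat in the opposite direction.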

Hi Dennis, 

Please check below:

- Check the DNS configuration for the public interfaces as well. The checkbox "autoregister DNS" must not be set in the network interface configuration. --> Do you mean "Register this connection's addresses in DNS"?
- Are there any monitoring applications (Prognosis, Nagios, etc.) installed? => uninstall them --> Not installed
- Disable Antivirus --> As per Security, only the CrowdStrike sensor is installed on all servers
- Disable Windows Defender Firewall (this is always causing trouble!) --> Disabled a long time ago
- Check the firewall between both sides, if there is one present. (refer to the Port Utilization Guide) --> No firewall exists
- Has any hardening configuration been done on the Windows OS? => revert it --> Hardening was done; we need to check
- Update to 12.6(2) (this includes almost all of the Engineering Specials for 12.6(1)). --> In the pipeline, but we are not sure it will fix the issue, as we have been seeing this since 12.5

Post some logs; that will help a ton.

david

Muhammed Ashiq
Level 1

Run a continuous ping as well to check for connectivity issues.

Create a batch file like the one below and monitor from the Side B private IP to the Side A private IP.

=================================================

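rem Save this file as C:\Pingresponse\Pingresponse_RoggerB-to-RoggerA(Private).bat
rem Date stamp for the log name; the offsets assume %date% ends in MM/DD/YYYY
set D=%date:~-4,4%%date:~-10,2%%date:~-7,2%

rem Replace SOURCE_IP and DESTINATION_IP with the Side B and Side A private IPs
ping -S SOURCE_IP -n 60 DESTINATION_IP >> "C:\Pingresponse\Pingresponse_RoggerB-to-RoggerA(Private)-%D%.txt"

time /T >> "C:\Pingresponse\Pingresponse_RoggerB-to-RoggerA(Private)-%D%.txt"

rem Re-run this same file so the 60-ping cycle repeats until the window is closed
"C:\Pingresponse\Pingresponse_RoggerB-to-RoggerA(Private).bat"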
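
Once a log like this has been collecting for a while, counting the failed probes by hand gets tedious. The following Python sketch shows one way to summarize it. It assumes the English-locale output of Windows `ping` (lines such as `Reply from ...: bytes=32 time<1ms TTL=128` and `Request timed out.`); the function name `summarize_ping_log` is illustrative and not part of any Cisco or Microsoft tool.

```python
def summarize_ping_log(text):
    """Count successful and failed probes in English-locale Windows ping output."""
    replies = 0
    failures = 0
    for raw in text.splitlines():
        line = raw.strip()
        # A real echo reply carries a TTL; "Reply from x: Destination host
        # unreachable." does not, so it is counted as a failure below.
        if line.startswith("Reply from") and "TTL=" in line:
            replies += 1
        elif ("Request timed out" in line
              or "unreachable" in line
              or "General failure" in line):
            failures += 1
    total = replies + failures
    loss_pct = 100.0 * failures / total if total else 0.0
    return {"replies": replies, "failures": failures, "loss_pct": loss_pct}
```

To use it, read the day's log file into a string and pass it in, for example `summarize_ping_log(open(path).read())`. Because the batch file only writes a timestamp after every 60 pings, a failure can be located only to within roughly that window.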

AnoopKrishnan
Level 1

Hi All, 

We ran a continuous ping through PowerShell with the help of Cisco TAC and found packet drops on the private link, so we are now working with the service provider to fix this. Thank you, everyone, for your support.
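
The exact PowerShell command used with TAC is not shown in the thread. As a rough illustration of the same idea, the Python sketch below sends one Windows `ping` probe at a time and timestamps every result, so each drop can be matched against the failover time in the ICM logs. The function names, the one-second interval, and the one-second timeout are assumptions made for this example; `-n`, `-w`, and `-S` are the standard Windows `ping` options for count, timeout in milliseconds, and source address.

```python
import subprocess
import time
from datetime import datetime


def probe_once(source_ip, dest_ip, timeout_ms=1000, run=subprocess.run):
    """Send a single echo request with Windows ping; True means a reply arrived."""
    cmd = ["ping", "-n", "1", "-w", str(timeout_ms), "-S", source_ip, dest_ip]
    result = run(cmd, capture_output=True, text=True)
    # Look for TTL= in the output rather than trusting the exit code alone
    return "TTL=" in result.stdout


def log_line(ok, dest_ip, now=None):
    """Format one timestamped result line."""
    now = now or datetime.now()
    status = "reply" if ok else "LOST"
    return f"{now:%Y-%m-%d %H:%M:%S} {dest_ip} {status}"


def monitor(source_ip, dest_ip, count, interval_s=1.0,
            run=subprocess.run, sleep=time.sleep):
    """Probe `count` times and return the timestamped result lines."""
    lines = []
    for _ in range(count):
        ok = probe_once(source_ip, dest_ip, run=run)
        lines.append(log_line(ok, dest_ip))
        sleep(interval_s)
    return lines
```

Calling `monitor("SIDE_B_PRIVATE_IP", "SIDE_A_PRIVATE_IP", 3600)` with the placeholders replaced would cover about an hour; the returned lines can be written to a file and filtered for `LOST`.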