Flapping between Expressway C and E after regular firewall maintenance

matthew-aw
Level 1

Our customer has been suffering from a traversal zone flapping issue between Expressway C and E for more than two months, starting in August this year.

We upgraded Expressway from 12.6 to 12.7.1 and then to 14.0.11, following TAC's reply that this should be a version bug. However, the problem still exists. Here are our findings so far:

1. The issue happens after the customer's weekly firewall maintenance.
2. There is no firewall between C and E.
3. A firewall exists only between E and the public network.
4. Service resumes if we restart both Expressway C and E.
5. Before TAC suspected a bug, they checked the logs and found an NTP error at the time of the incident, showing a 40-second difference between C and E. However, when we checked the NTP status during the incident, it was normal. It is strange that the log would show a sudden NTP error for such a short time.
6. In the PCAP, TAC found that the SIP OPTIONS messages between C and E had a 40-second delay, which was their explanation for the flapping before they identified it as a system bug.
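
To turn findings 5 and 6 into evidence, the gaps between consecutive SIP OPTIONS keepalives could be measured from the capture. Below is a minimal Python sketch, assuming the packet timestamps have already been exported from the PCAP as ISO 8601 strings. The export format, the function name `find_keepalive_gaps`, and the 30-second threshold are all assumptions for illustration, not anything taken from TAC or Expressway itself.

```python
from datetime import datetime


def find_keepalive_gaps(timestamps, threshold_seconds=30.0):
    """Return (previous, current, gap_seconds) for every gap above the threshold.

    `timestamps` is a list of ISO 8601 strings, one per SIP OPTIONS packet
    (hypothetical export format; adapt the parsing to your capture tool).
    """
    # Sort so the result does not depend on the order of the export.
    parsed = sorted(datetime.fromisoformat(t) for t in timestamps)
    gaps = []
    for prev, curr in zip(parsed, parsed[1:]):
        delta = (curr - prev).total_seconds()
        if delta > threshold_seconds:
            gaps.append((prev.isoformat(), curr.isoformat(), delta))
    return gaps
```

Running this over the timestamps from both C and E and lining up the reported gaps with the firewall maintenance window would show whether the 40-second delays start at the same moment as the maintenance.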

 

We are frustrated that the problem has been outstanding for more than two months and still cannot be resolved.

 

If this is a network issue, note that our customer has checked the network status and found everything normal. If we want to attribute this to the network, we need evidence. However, telnet is not allowed on Expressway C and E.
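
Since telnet is not available on the Expressways, one option might be to test TCP reachability from another host on the same subnet as the C. This is only a rough sketch: the function name `tcp_port_reachable` is hypothetical, and the port to test must be whatever is configured on the traversal zone, which is not stated in this thread.

```python
import socket


def tcp_port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection performs the TCP handshake, much like telnet would.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, calling `tcp_port_reachable("10.26.3.27", port)` in a loop during the maintenance window, with `port` set to the traversal zone port, and logging the result with a timestamp would give a simple record of whether the path between C and E drops at that time. A successful handshake only shows the port is open; it does not prove the SIP or TLS layer is healthy.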

 

Also, if this were a network issue, it is hard to explain why service resumes after we restart Expressway C and E.

 

The symptom during flapping is that Jabber can log in, but the phone service keeps failing, and voicemail and IM keep flapping.

 

We hope someone can help us, as this problem has troubled us for so long.

 

 

10 Replies

matthew-aw
Level 1

Attached are the logs captured from Expressway C and E when the incident happened, along with the PCAP. We also have a sysdump, but it is very large.

Expressway C IPs:
10.26.26.30
10.26.36.30
10.26.36.31

Expressway E IPs:
10.26.3.27
10.26.7.27
10.26.7.28

When I ran the logs you shared through the CSA tool, I observed clock synchronization issues, an XMPP connection failure on Expressway C, and an SSL certificate error indicating that the certificate does not meet the requirements.

I am a bit confused. According to your message, you have three C and three E servers, but the logs only show two C and two E servers.

Have there been any recent changes to the configuration or setup? Additionally, could you provide more details about your certificates?

[Attached screenshots: NithinEluvathingal_0-1730617362539.png, NithinEluvathingal_1-1730617373319.png]

Dear Nithin,

Thank you for your feedback.

The certificate is not the problem. As I mentioned, the flapping happens when the customer does their regular firewall maintenance. Otherwise, the zone connection works as usual and the certificate works fine too.

Once the issue happens, we just restart any one of the Expressway E nodes and the connection is restored. TAC has checked the certificates; all of them are fine and meet Cisco's requirements.

Sorry, I missed uploading one Expressway log. There should be six logs, and I uploaded only five of them in my previous reply.

I have uploaded the logs for all six Expressway servers again.

The full dump is uploaded to Google Drive; please download it from the link below.

There have been no configuration changes so far. Our customer has used this setup for more than two years, but the incident first happened in August this year.

https://drive.google.com/file/d/1lZTgYWLV_spctw3huBV2rHDXO1i9m7Fe/view?usp=sharing

 

Not really related to your issue as such, but I cannot refrain from asking why you don’t have a firewall between the C and E? Firewall traversal, via the tunnel formed from the C to the E, is basically the main use case for Expressways. Not having a firewall there pretty much negates that use case, at least in the sense that anyone who penetrates your external firewall perimeter would have a larger attack surface.

Thanks for your feedback. According to the customer's reply, there has been no firewall between C and E since day one. If there were a firewall between C and E, I would suspect it as the root cause, but they said there is none. As a next step, if we still cannot find a solution, we will try migrating the C to the same subnet as the E.

What is actually in-between the E and C currently?

There are three ESXi hosts, each running one E and one C. The traversal zone is set up 3 to 3, with no firewall between them.

matthew-aw
Level 1

Just clarified with the customer: there is a firewall between C and E, and the flapping happens at every regular firewall maintenance. We are wondering whether this is an HA issue.