While preparing a test bed for a customer before implementing a PfRv3 solution, I stumbled upon quite odd behavior of the border routers when a TCA is generated after simulating an impairment on the WAN.
At the moment I have one DC site with 2 BRs on ISR4331 and the MC running on CSR1000v. The remote sites use 1921 and 2911 routers with 2 WAN connections each. The 2 WAN connections are configured as INET1 and INET2. For testing purposes, and in normal WAN conditions, I send the EF traffic over INET1 (DMVPN Tunnel1011) and the CS3 and CS2 traffic over INET2 (DMVPN Tunnel1012). The routing protocol is EIGRP. The PfRv3 domain is configured as IWAN and applied to the "TRANSIT" VRF in the HUB location and the "LAN" VRF in the branch location. To apply a WAN impairment I use WAN Bridge.
Here is the test bed network topology I used in my lab:
So I make a phone call between the main site (DC) and the branch site (I tried initiating the traffic from either site) and also have PING running on both sites to generate CS2 traffic. I can observe the correct channels and the correct traffic classes with the correct exits on the HUB MC and the branch MC.
Now I apply a delay of 600ms (using WAN Bridge) only to the INET1 transport at the branch site, so I affect the EF traffic only. Within 4 seconds (as per the HUB MC configuration) the branch MC receives a TCA from HUB BR1 saying that the one-way delay on the INET1 path (DMVPN Tunnel1011) to the branch is higher than the policy allows, and the branch MC instructs the branch BR to route the EF traffic to the main site over the INET2 path, which corresponds to DMVPN Tunnel1012. The HUB MC does not receive any notification from the branch BR, or should it receive the notification from the hub BR?
What I noticed is that the HUB BRs keep routing the EF traffic as if conditions were normal: BR1 keeps sending the EF traffic over INET1 (primary channel, Tunnel1011) and BR2 keeps redirecting the EF traffic over Tunnel0 to BR1 (its primary channel). This way the EF traffic from the main site to the branch site enters BR1 (which is also the DMVPN hub router for Tunnel1011) and travels over INET1 (Tunnel1011) to the branch BR, but because of the PfRv3 path change on the branch BR the return EF traffic travels over INET2 (Tunnel1012) into HUB BR2 and is then redirected back to BR1 over Tunnel0. Thus I get asymmetric routing. The same asymmetric routing happens when I impair the INET2 transport at the branch site and the CS2 and CS3 traffic is rerouted over INET1.
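To make the two cases easier to compare, here is a toy Python sketch of what I observe. It only encodes my lab observations (which side moves the EF traffic to the fallback path in each scenario); it is not a model of how PfRv3 makes decisions internally, and the scenario names and function names are made up for illustration.

```python
# EF path preference in my lab: INET1 (Tunnel1011) primary, INET2 (Tunnel1012) fallback.
PREFERRED, FALLBACK = "INET1", "INET2"

# Which side moves EF away from INET1 in each test, as observed (not derived).
OBSERVED_REACTION = {
    "delay_600ms_on_INET1": {"hub": False, "branch": True},
    "branch_INET1_interface_shutdown": {"hub": True, "branch": True},
}


def ef_path(reacted):
    """Path used for EF: the fallback if that side reacted, else the preferred one."""
    return FALLBACK if reacted else PREFERRED


def ef_paths(scenario):
    """Return (hub->branch path, branch->hub path, asymmetric?) for a scenario."""
    reaction = OBSERVED_REACTION[scenario]
    hub_to_branch = ef_path(reaction["hub"])
    branch_to_hub = ef_path(reaction["branch"])
    return hub_to_branch, branch_to_hub, hub_to_branch != branch_to_hub


for scenario in OBSERVED_REACTION:
    out, back, asymmetric = ef_paths(scenario)
    print(f"{scenario}: hub->branch {out}, branch->hub {back}, asymmetric={asymmetric}")
```

The delay case comes out asymmetric because only the branch side changes its path, while in the shutdown case both directions end up on INET2.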
When I shut down the branch BR router interface, whether it is the one that corresponds to the INET1 or the INET2 transport network, the HUB MC does receive an unreachable notification from the HUB BR and the EF or CS2 traffic is routed correctly: HUB BR1 no longer routes the EF traffic over INET1 but redirects it over Tunnel0 to BR2, which sends it over INET2 (Tunnel1012) to the branch site.
I attach the relevant router configs and also the PfR outputs such as route overrides, channels and traffic classes, plus the TCA syslog messages that were generated.
Why is this happening? Why does the HUB MC not receive any TCAs when there is high delay on the network? Does it depend on the PfRv3 policy configured on the HUB MC? For EF traffic, say, there are 3 conditions in the policy, delay, packet loss and jitter (if I remember correctly), but I guess only one of them needs to be violated (high delay), because the branch MC does receive the TCA notifications from the HUB BR.
On HUB BRs and MC I'm running Everest 16.6.4 and also tried Fuji 16.9.1. On branch 1921 and 2911 I'm running 15.7.3(M3).
Thank you in advance for any comments and clues.
Thank you for looking into my problem. I'm not at my desk at the moment, but as soon as I'm back I will update this post with the diagram and the relevant configs.
I attached the network diagram and also the relevant config parts of all 4 routers. In the original post I also attached the PfR outputs for analysis, so you can see that the HUB BRs do not generate any TCA messages to the HUB MC when the network conditions change, and therefore I end up with asymmetric routing.
In this post I attach 2 more PfR outputs (I could not attach all the files in the original post), taken when I shut down one of the branch router interfaces. When this happens the HUB BRs generate TCAs to the HUB MC, as opposed to when only high delay is introduced on the WAN links.
Thanks in advance for any clues.