inter-VRF PBR - blackhole safeguard?

mickpro77
Level 1

Hi,

I work for an ISP and we use inter-VRF PBR to route forward and route leaking via BGP to route back.

We add/remove route-map sequences into/from our PBR, and related ACLs, daily and, sometimes, we suffer traffic blackholes due to route-map sequences not matching any existing ACL.

In most cases (if not all), it's because someone has added a new route-map sequence into the PBR matching an ACL that doesn't exist (if misspelt in the RM sequence's matching clause for example), or because someone has removed an ACL before removing any route-map matching it in the PBR first...

Logic would say that if a RM sequence doesn't match any existing ACL, it shouldn't match any traffic (be ignored, basically) or, the opposite, deny it all, in which case traffic would have to use normal routing to go forward (which would cause the same issue though, i.e. a traffic blackhole, as the source VRF, where PBR is applied, doesn't have routing to the required destinations in other VRFs). But no!

Instead, the RM sequence matches ALL traffic, which is then sent to the VRF associated with that specific faulty RM sequence (we use inter-VRF PBR, as a reminder). That prevents all RM sequences after the faulty one from being hit at all, so the VRFs named in the "set" clauses of those later sequences stop receiving any traffic at the same time...

Some kind of "match-all" hidden implicit rule basically.
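
To make the shadowing effect concrete, here is a toy Python model of the behaviour described above. It is only a sketch and not IOS code: the ACL names, VRF names, and the source-only matching are all made up for illustration (real extended ACLs match on much more than the source address).

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List, Optional

# Hypothetical ACL table: name -> permitted source networks.
ACLS = {
    "ACL-CUST-A": [ip_network("10.1.0.0/16")],
    "ACL-CUST-C": [ip_network("10.3.0.0/16")],
}


@dataclass
class Sequence:
    number: int
    acl_name: str  # name used in the "match ip address" clause
    set_vrf: str   # VRF named in the "set" clause


def sequence_matches(seq: Sequence, src: str) -> bool:
    nets = ACLS.get(seq.acl_name)
    if nets is None:
        # Behaviour described in the thread: a reference to a
        # non-existent ACL matches all traffic.
        return True
    return any(ip_address(src) in net for net in nets)


def pbr_lookup(route_map: List[Sequence], src: str) -> Optional[str]:
    # Sequences are evaluated in order; the first match wins.
    for seq in sorted(route_map, key=lambda s: s.number):
        if sequence_matches(seq, src):
            return seq.set_vrf
    return None  # no sequence matched: normal routing applies


ROUTE_MAP = [
    Sequence(10, "ACL-CUST-A", "VRF-A"),
    Sequence(20, "ACL-CUST-B", "VRF-B"),  # ACL-CUST-B is not defined
    Sequence(30, "ACL-CUST-C", "VRF-C"),
]

print(pbr_lookup(ROUTE_MAP, "10.1.2.3"))  # VRF-A
print(pbr_lookup(ROUTE_MAP, "10.3.2.3"))  # VRF-B, not VRF-C
```

In this model, sequence 30 is never reached: everything that gets past sequence 10 is swallowed by sequence 20.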

So my question is simple, is there any safeguard, or way-around, preventing that "match-all" implicit rule?

In the meantime, we rely on internal deployment processes such as config peer review before implementation, which greatly help in reducing the frequency of those traffic blackholes. To fix them quickly when they do happen, we have also implemented a detection "mechanism": an IP SLA constantly pinging something that is reachable exclusively via the very last RM sequence in the PBR, monitored via SNMP. If a PBR blackhole happens and the IP SLA can no longer ping its destination, our monitoring system alerts us about that IP SLA timing out...

We have also tried to match a second dummy ACL in a RM sequence hoping that if at least 1 of the 2 ACLs exist, the "match-all" implicit rule will not trigger but it doesn't work sadly. Both ACLs must exist...


Harold Ritter
Cisco Employee

Hi @mickpro77 ,

This is a day one behavior in IOS. If the ACL configured in the route-map does not exist, all traffic matches the route-map instance. Many have been hit by this behavior in the past. Extra care should be taken when creating or modifying the route-map.

Regards,

Harold Ritter
Sr Technical Leader
CCIE 4168 (R&S, SP)
harold@cisco.com
México móvil: +52 1 55 8312 4915
Cisco México
Paseo de la Reforma 222
Piso 19
Cuauhtémoc, Juárez
Ciudad de México, 06600
México

Hi Harold,

Thanks for the prompt feedback.

So there is no way around it at all? That's surprising... as this is quite a big flaw... or if not a flaw per-se it's surely a strange design!

Hi @mickpro77 ,

Yes, you could see it as a strange design, but given that this behavior has been around since day 1 in Cisco IOS (many, many years ago), I would not expect it to change any time soon. I think configuration automation is the way to avoid this human error issue, especially if these changes are frequent, as you mentioned.

Regards,

Harold Ritter

I think I saw the same issue a few months ago. I didn't have time to check it then, but today I had some time.

Instead of using a route-map you can use a prefix-list, as in the lab below.
I use prefix-list MHM and I add seq 1000 (at the end of the prefix-list) deny 0.0.0.0/0 le 32.

Then, when I need to add a prefix, I add it with a lower seq. This prevents R1 from sending any prefix unless we manually permit it.

[Attached screenshots of the lab: Screenshot (836).png, Screenshot (837).png, Screenshot (838).png, Screenshot (839).png]

Hi @MHM Cisco World ,

The behavior will be exactly the same whether you use an ACL or a prefix list. In other words, if the ACL or prefix list used in the route-map does not exist, all packets will match the route-map instance.

Regards,

Harold Ritter

It's just a workaround, as he asked.

Adding a high seq number with deny 0.0.0.0/0 le 32 makes the prefix-list not advertise any prefix unless we permit it.

MHM

Hi @MHM Cisco World ,

The example you are giving is for BGP advertisement with a prefix list and does not use a route-map. His use case is a route-map being used in the context of PBR.

Regards, 

Harold Ritter

mickpro77
Level 1

@MHM Cisco World as Harold said it's not for BGP advertisement but for PBR indeed.

Plus, as we police traffic based on both source and destination, which PLs can't do, we need to use (extended) ACLs.

Thanks for taking the time to look into it and suggest ideas anyways!

@Harold Ritter Thanks for your feedback again. I did think about automation and started looking into NETCONF/the YANG Suite, but it's beyond my knowledge/skills to be honest.

My idea was to get netconf to check the ACL matched exists by looking up all existing ACLs within the router running cfg, whenever a new RM sequence is added. And, if it doesn't, either prevent the implementation or give a warning/error message.
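
The check described above can be sketched offline in Python, working on the text of the running config however it was retrieved (NETCONF, SSH, a backup file). This is a minimal sketch under assumptions: it only understands the usual `show running-config` layout for `ip access-list`, numbered `access-list`, and `route-map` blocks, and the ACL, route-map, and VRF names in the sample are invented.

```python
import re


def defined_acls(config: str) -> set:
    """Collect the names/numbers of every ACL defined in the config text."""
    names = set()
    for line in config.splitlines():
        # Named ACLs: "ip access-list extended NAME" (or "standard").
        m = re.match(r"^ip access-list (?:standard|extended) (\S+)", line)
        if m:
            names.add(m.group(1))
            continue
        # Numbered ACLs: "access-list 101 permit ...".
        m = re.match(r"^access-list (\d+) ", line)
        if m:
            names.add(m.group(1))
    return names


def route_map_acl_refs(config: str) -> list:
    """Return (route_map, sequence, acl) for every ACL a sequence matches on."""
    refs = []
    current = None
    for line in config.splitlines():
        m = re.match(r"^route-map (\S+) (?:permit|deny) (\d+)", line)
        if m:
            current = (m.group(1), int(m.group(2)))
            continue
        if not line.startswith(" "):
            current = None  # left the route-map block
            continue
        m = re.match(r"^\s+match ip address (.+)$", line)
        if current and m:
            tokens = m.group(1).split()
            if not tokens or tokens[0] == "prefix-list":
                continue  # prefix-list matches are out of scope here
            for acl in tokens:
                refs.append((current[0], current[1], acl))
    return refs


def dangling_refs(config: str) -> list:
    """Route-map sequences that match on an ACL which is not defined."""
    acls = defined_acls(config)
    return [ref for ref in route_map_acl_refs(config) if ref[2] not in acls]


SAMPLE = """\
ip access-list extended ACL-CUST-A
 permit ip 10.1.0.0 0.0.255.255 192.0.2.0 0.0.0.255
!
route-map PBR-IN permit 10
 match ip address ACL-CUST-A
 set vrf VRF-A
!
route-map PBR-IN permit 20
 match ip address ACL-CUST-B
 set vrf VRF-B
!
"""

print(dangling_refs(SAMPLE))  # [('PBR-IN', 20, 'ACL-CUST-B')]
```

Run against a candidate config before pushing it, a non-empty result would be the warning/error condition.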

For deletions of ACLs involved in PBR/matched in RM sequences, I had in mind to get netconf to make sure that the ACL isn't matched anywhere by looking up keywords "match ip address ACLNAME" from the router running cfg. And, if it is, again, either prevent the deletion or give a warning/error message.
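
The deletion-side check can be sketched the same way: before removing an ACL, list every route-map sequence that still matches on it. Again, this is only an illustration on config text, assuming the usual `show running-config` layout; the function names and the sample config are hypothetical.

```python
import re


def sequences_referencing(config: str, acl_name: str) -> list:
    """Return (route_map, sequence) pairs whose match clause names acl_name."""
    hits = []
    current = None
    for line in config.splitlines():
        header = re.match(r"^route-map (\S+) (?:permit|deny) (\d+)", line)
        if header:
            current = (header.group(1), int(header.group(2)))
            continue
        if not line.startswith(" "):
            current = None  # left the route-map block
            continue
        clause = re.match(r"^\s+match ip address (.+)$", line)
        if not (current and clause):
            continue
        tokens = clause.group(1).split()
        if not tokens or tokens[0] == "prefix-list":
            continue  # a prefix-list with the same name is not an ACL
        if acl_name in tokens:
            hits.append(current)
    return hits


def safe_to_delete(config: str, acl_name: str) -> bool:
    """True only if no route-map sequence still matches on the ACL."""
    return not sequences_referencing(config, acl_name)


SAMPLE = """\
ip access-list extended ACL-CUST-A
 permit ip 10.1.0.0 0.0.255.255 192.0.2.0 0.0.0.255
ip access-list extended ACL-OLD
 permit ip 10.9.0.0 0.0.255.255 any
!
route-map PBR-IN permit 10
 match ip address ACL-CUST-A
 set vrf VRF-A
!
"""

print(sequences_referencing(SAMPLE, "ACL-CUST-A"))  # [('PBR-IN', 10)]
print(safe_to_delete(SAMPLE, "ACL-CUST-A"))         # False
print(safe_to_delete(SAMPLE, "ACL-OLD"))            # True
```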

But I don't even know if it's possible/achievable.

If it is, would you be kind enough to provide guidance/docs that may help me achieve this please?

Or perhaps you have better/more realistic ideas to do this? I'm all ears if you do.

Don't worry, I thought it was for BGP prefix advertisement.

Anyway, do you use PBR for route leaking between VRF and global?

MHM

Hi @MHM Cisco World ,

Not for route leaking, but rather for PBR.

Regards,

Harold Ritter

We use inter-VRF PBR for forward routing and BGP leaking for routing back.

So from one VRF to another in other words.

We don't use PBR for routing between VRFs and global, no.

We use global exclusively for routing between our public IP ranges and the Internet.

Friends,

Instead of PBR we can try using vrf receiver.

MHM

Please elaborate, but keep in mind we are talking about an established network relying heavily on PBR. We currently have hundreds (possibly thousands) of PBR RM sequences and ACLs routing/policing production traffic as we speak, so we obviously cannot simply get rid of PBR, not easily at least.

Ab26
Level 1

@mickpro77 I’m trying to implement inter-VRF PBR, so would you please share a sample configuration that you use?

@Harold Ritter I’d be grateful if you could help.

I’ve used some of the configuration from this link, but I haven’t really succeeded:

 https://www.cisco.com/c/en/us/td/docs/switches/lan/catalyst9500/software/release/17-15/configuration_guide/rtng/b_1715_rtng_9500_cg/configuring_vrf_aware_pbr.html?dtid=osscdc000283
