06-07-2016 08:22 PM - edited 03-05-2019 04:11 AM
All,
I haven't dealt with this kind of problem before, and because the actual output may contain proprietary information, I unfortunately have to be vague. Here's the situation:
A customer noticed an asymmetric path through his BGP cloud while investigating an issue. It wasn't causing a problem, but it was odd. On investigation, we found that a BGP community was missing from the BGP table entry for the IP address where the asymmetry was noted. The routes that should have carried that community were missing, which accounted for the asymmetry.
To resolve the issue, the customer edited the route map that sets the community, removed the set community statement, re-applied it, and did a soft reset of BGP after each change (the commands are shown below).
Everything is working now and has been for two weeks, but the customer is requesting an explanation for why the peer simply stopped acknowledging the community in question.
BGP is not my strength, so I'm at a loss for how to set up monitoring for such an event. First, I need to define the problem. Second, I need a list of show or debug commands that would tell me whether the router we control is at fault or whether the problem lies with the peer (an ISP router).
Below is some relevant output, sanitized with RFC 1918 addressing:
HOSTNAME#sh ip bgp 10.77.216.2
BGP routing table entry for 10.77.208.0/20, version 7922490
Paths: (8 available, best #3, table default)
Advertised to update-groups:
36 132 138
Refresh Epoch 9
3549 10000, (received & used)
10.4.0.12 (metric 2) from 10.4.0.12 (10.4.0.12)
Origin incomplete, metric 0, localpref 100, valid, internal
Community: 100:10001 201:10000
Refresh Epoch 1
1803 10000, (received-only)
192.168.100.250 from 192.168.100.250 (10.247.4.58)
Origin incomplete, localpref 100, valid, external
Community: 201:10000
Extended Community: RT:1803:1929
Refresh Epoch 1
65000 10000
192.168.150.169 from 192.168.150.169 (192.168.150.169)
Origin incomplete, localpref 100, valid, external, best
Community: 100:10001
Refresh Epoch 1
65000 10000, (received-only)
192.168.150.169 from 192.168.150.169 (192.168.150.169)
Origin incomplete, localpref 100, valid, external
Refresh Epoch 1
2828 10000
192.168.160.9 from 192.168.160.9 (192.168.160.9)
Origin incomplete, localpref 100, valid, external
Community: 200:10001 201:10000
Refresh Epoch 1
2828 10000, (received-only)
192.168.160.9 from 192.168.160.9 (192.168.160.9)
Origin incomplete, localpref 100, valid, external
Community: 201:10000
Refresh Epoch 1
65000 10000
192.168.220.9 from 192.168.220.9 (192.168.220.9)
Origin incomplete, localpref 100, valid, external
Community: 100:10001 201:10000
Extended Community: RT:65000:65172
Refresh Epoch 1
65000 10000, (received-only)
192.168.220.9 from 192.168.220.9 (192.168.220.9)
Origin incomplete, localpref 100, valid, external
Community: 201:10000
Extended Community: RT:65000:65172
HOSTNAME(config)#route-map NAME_OF_ROUTE_MAP permit 100
HOSTNAME(config-route-map)#no set community 100:10001 additive
HOSTNAME(config-route-map)#do clear ip bgp * soft
HOSTNAME(config-route-map)#set community 100:10001 additive
HOSTNAME(config-route-map)#do clear ip bgp * soft
HOSTNAME(config-route-map)#^Z
HOSTNAME#sh ip bgp 10.77.216.2
BGP routing table entry for 10.77.208.0/20, version 7924362
Paths: (9 available, best #1, table default)
Advertised to update-groups:
132 134 138
Refresh Epoch 1
Local, (received & used)
10.5.0.44 (metric 20) from 10.5.0.10 (10.5.0.10)
Origin incomplete, metric 1, localpref 100, valid, internal, best
Community: 201:10001
Originator: 10.5.0.44, Cluster list: 10.5.0.10
Refresh Epoch 1
Local, (received & used)
10.5.0.44 (metric 20) from 10.5.0.11 (10.5.0.11)
Origin incomplete, metric 1, localpref 100, valid, internal
Community: 201:10001
Originator: 10.5.0.44, Cluster list: 10.5.0.11
Refresh Epoch 1
1803 10000, (received-only)
192.168.100.250 from 192.168.100.250 (10.247.4.58)
Origin incomplete, localpref 100, valid, external
Community: 201:10000
Extended Community: RT:1803:1929
Refresh Epoch 1
65000 10000
192.168.150.169 from 192.168.150.169 (192.168.150.169)
Origin incomplete, localpref 100, valid, external
Community: 100:10001
Refresh Epoch 1
65000 10000, (received-only)
192.168.150.169 from 192.168.150.169 (192.168.150.169)
Origin incomplete, localpref 100, valid, external
Refresh Epoch 1
2828 10000
192.168.160.9 from 192.168.160.9 (192.168.160.9)
Origin incomplete, localpref 100, valid, external
Community: 200:10001 201:10000
Refresh Epoch 1
2828 10000, (received-only)
192.168.160.9 from 192.168.160.9 (192.168.160.9)
Origin incomplete, localpref 100, valid, external
Community: 201:10000
Refresh Epoch 1
65000 10000
192.168.220.9 from 192.168.220.9 (192.168.220.9)
Origin incomplete, localpref 100, valid, external
Community: 100:10001 201:10000
Extended Community: RT:65000:65172
Refresh Epoch 1
65000 10000, (received-only)
192.168.220.9 from 192.168.220.9 (192.168.220.9)
Origin incomplete, localpref 100, valid, external
Community: 201:10000
Extended Community: RT:65000:65172
Can anyone advise a strategy that might explain this? The customer has no idea when the community was originally lost, so we have no time reference for searching the logs for a potential event. There was, however, a power outage in the DC where this router is housed, and the traceroutes the customer was running were to verify operation following that outage.
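For the monitoring side, one approach I've considered is to capture the output of sh ip bgp for the prefix on a schedule and check it with a script. Below is a minimal sketch, not a tested tool. It assumes the output format shown above, and it assumes that the (received-only) entries are the paths as received from the neighbor before inbound policy is applied (which, as I understand it, requires soft-reconfiguration inbound on that neighbor). The names parse_paths and check_neighbor are my own, and SAMPLE is copied from the output above.

```python
import re

# Matches the "... from <neighbor> (<router-id>)" line of each path.
NEIGHBOR_RE = re.compile(r"\bfrom (\S+) \((\S+)\)")

# Two paths for the same neighbor, copied from the output in this post.
SAMPLE = """\
Refresh Epoch 1
65000 10000
192.168.150.169 from 192.168.150.169 (192.168.150.169)
Origin incomplete, localpref 100, valid, external, best
Community: 100:10001
Refresh Epoch 1
65000 10000, (received-only)
192.168.150.169 from 192.168.150.169 (192.168.150.169)
Origin incomplete, localpref 100, valid, external
"""


def parse_paths(output):
    """Split 'show ip bgp <prefix>' text into one dict per path.

    Assumes every path block starts with a 'Refresh Epoch' line,
    as in the output pasted above.
    """
    paths = []
    current = None
    for raw in output.splitlines():
        line = raw.strip()
        if line.startswith("Refresh Epoch"):
            current = {"as_path": None, "neighbor": None,
                       "received_only": False, "communities": set()}
            paths.append(current)
            continue
        if current is None:
            continue  # still in the header lines before the first path
        if current["as_path"] is None:
            # First line of a path: AS path plus optional flags.
            current["received_only"] = "(received-only)" in line
            current["as_path"] = line.split(",")[0].strip()
            continue
        match = NEIGHBOR_RE.search(line)
        if match and current["neighbor"] is None:
            current["neighbor"] = match.group(1)
        elif line.startswith("Community:"):
            # 'Extended Community:' lines do not match this prefix.
            current["communities"] = set(line.split(":", 1)[1].split())
    return paths


def check_neighbor(paths, neighbor, community):
    """Report whether `community` is on the received-only and post-policy
    paths learned from `neighbor` (None means no such path was seen)."""
    result = {"received": None, "post_policy": None}
    for path in paths:
        if path["neighbor"] != neighbor:
            continue
        key = "received" if path["received_only"] else "post_policy"
        result[key] = community in path["communities"]
    return result


if __name__ == "__main__":
    print(check_neighbor(parse_paths(SAMPLE), "192.168.150.169", "100:10001"))
```

If received is False and post_policy is True, as in this sample, that would suggest the community is attached by our own inbound route map rather than sent by the peer, so a later post_policy of False would point at the local router rather than the ISP.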
I know this is vague, but I would appreciate any recommendations. Thank you very much in advance.