Fabric Edge Node not forwarding DHCP Request to PETR/Border Node

pdomingues
Level 1

We have a very simple SDA setup: one fabric edge node that clients connect to, and one Border/Control Plane node that connects out to the rest of our non-fabric environment, specifically an N7K fusion router. Our DHCP server sits out in the non-fabric environment.

When a client sends a DHCP request, it hits the edge node (confirmed by packet capture), which then sets Option 82 and the giaddr. I checked the circuit and remote circuit IDs through ip dhcp debugs and confirmed that it is setting the correct RLOC, VLANs, etc. (also confirmed by embedded packet capture).

However, after this the process seems to stop. No more debugs appear, embedded packet capture finds no packets leaving the interfaces toward our Border/PETR node, and no DHCP packets (UDP 67/68) are captured inbound on the Border/PETR node. I'm thinking it might have something to do with our LISP table.
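
For reference, the embedded packet captures were set up roughly along these lines (a sketch only; the capture name and interface are placeholders, not our exact commands):

monitor capture DHCPCAP interface TenGigabitEthernet1/1/1 both
monitor capture DHCPCAP match ipv4 any any
monitor capture DHCPCAP buffer size 10
monitor capture DHCPCAP start
! ... client sends its DHCP Discover/Request here ...
monitor capture DHCPCAP stop
show monitor capture DHCPCAP buffer brief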

DHCP server is 172.20.100.110.

#sh ip cef vrf DEV_VN 172.20.0.0 det
172.20.0.0/16, epoch 0, flags [check lisp eligibility]
  LISP remote EID: 48 packets 15266 bytes fwd action fwd native
  LISP fwd-native source
    Dependent covered prefix type LISP-FWD, cover 0.0.0.0/0
  1 IPL source [no flags]
  attached to LISP0.4099

 

#sh ip lisp map-cache 172.20.100.0 instance-id 4099
LISP IPv4 Mapping Cache for LISP 0 EID-table vrf DEV_VN (IID 4099), 1 entries

172.20.0.0/16, uptime: 00:29:58, expires: 00:14:38, via map-reply, forward-native
  Sources: map-reply
  State: forward-native, last modified: 00:29:58, map-source: 172.21.255.1
  Active, Packets out: 48(15266 bytes), counters are not accurate (~ 00:00:22 ago)
  Encapsulating to proxy ETR

Not sure which steps to take or where to go from here.

I can ping the DHCP server from the fusion router.

I can ping the DHCP server from the global routing table and from the DEV_VN VRF on the border node (as long as the ping is sourced from the loopback DNA Center created on the border).

I can ping the DHCP server from the global routing table but NOT from the DEV_VN VRF on the fabric edge.

Should I be able to ping the PETR RLOC address from the DEV_VN VRF on the fabric edge? Is it possible that it doesn't know how to reach the PETR from the VRF and that's the issue? I can ping the PETR RLOC from the global routing table on the fabric edge but not from the VN VRF.
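
For what it's worth, this is roughly how I checked RLOC reachability from the underlay/GRT (a sketch; the loopback name is an assumption):

show ip route 172.21.255.1
ping 172.21.255.1 source Loopback0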

 

DHCP snooping is enabled with the correct VLANs attached, the fabric edge is configured as a PITR, and option 82 originate is on. Not sure where I'm going wrong. Thanks.
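
For context, the relevant relay pieces on the edge look roughly like this (a sketch, not our exact config; the VLAN number, SVI, and source interface are illustrative):

ip dhcp relay information option
ip dhcp snooping
ip dhcp snooping vlan 1021
!
interface Vlan1021
 ip helper-address 172.20.100.110
 ip dhcp relay source-interface Loopback0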

More info:

Fabric Edge LISP -

Information applicable to all EID instances:
Router-lisp ID: 0
Locator table: default
Ingress Tunnel Router (ITR): disabled
Egress Tunnel Router (ETR): enabled
Proxy-ITR Router (PITR): enabled RLOCs: 172.21.255.2
Proxy-ETR Router (PETR): disabled
NAT-traversal Router (NAT-RTR): disabled
Mobility First-Hop Router: disabled
Map Server (MS): disabled
Map Resolver (MR): disabled
Mr-use-petr: disabled
First-Packet pETR: disabled
Multiple IP per MAC support: disabled
Delegated Database Tree (DDT): disabled
Multicast Flood Access-Tunnel: disabled
Publication-Subscription: enabled
Publisher(s): *** NOT FOUND ***
ITR Map-Resolver(s): 172.21.255.1
ETR Map-Server(s): 172.21.255.1
xTR-ID: 0x570C1D3C-0x75B2B221-0x8FB89F16-0x428F42C4
site-ID: unspecified
ITR local RLOC (last resort): *** NOT FOUND ***
ITR use proxy ETR RLOC(Encap IID): 172.21.255.1

 

Border Node LISP - 

Information applicable to all EID instances:
Router-lisp ID: 0
Locator table: default
Ingress Tunnel Router (ITR): disabled
Egress Tunnel Router (ETR): enabled
Proxy-ITR Router (PITR): enabled RLOCs: 172.21.255.1
Proxy-ETR Router (PETR): enabled
NAT-traversal Router (NAT-RTR): disabled
Mobility First-Hop Router: disabled
Map Server (MS): enabled
Map Resolver (MR): enabled
Mr-use-petr: enabled
Mr-use-petr locator set name: default-etr-locator-set-ipv4
First-Packet pETR: disabled
Multiple IP per MAC support: disabled
Delegated Database Tree (DDT): disabled
Multicast Flood Access-Tunnel: disabled
Publication-Subscription: enabled
Publisher(s): *** NOT FOUND ***
ITR Map-Resolver(s): 172.21.255.1
ETR Map-Server(s): 172.21.255.1


5 Replies

jedolphi
Cisco Employee

Hi. What IOS XE version? On which device did you capture those show commands, please? Does your Fabric Edge Node have a /32 underlay/GRT route for all Border Node Lo0s? If not, please add one. You could also check control-plane request/response state from the Fabric Edge Node CLI:

lig instance-id 4099 172.20.100.110

show lisp instance-id 4099 ipv4 map-cache 0.0.0.0/0

pdomingues
Level 1

Fabric Edge: IOS XE 17.9.4a, C9407R
Border/Control Node: IOS XE 17.9.4a, C9410R

Commands were captured on the respective nodes listed above. The CLI show commands near the top of the thread were captured on the fabric edge.

The fabric edge does NOT have a /32 underlay/GRT route for the Border Node Lo0; it just has an IS-IS L2 default route pointing to the border node. It can reach the Lo0 fine.

I added a static route to the Lo0 on the FE, but nothing changed.

 

FE1#sh lisp instance-id 4099 ipv4 map-cache 0.0.0.0/0
LISP IPv4 Mapping Cache for LISP 0 EID-table vrf DEV_VN (IID 4099), 1 entries

0.0.0.0/0, uptime: 4d21h, expires: never, via static-send-map-request
  Sources: static-send-map-request
  State: send-map-request, last modified: 4d21h, map-source: local
  Exempt, Packets out: 1(318 bytes), counters are not accurate (~ 4d21h ago)
  Configured as EID address space
  Encapsulating to proxy ETR

FE1#lig instance-id 4099 172.20.100.110
Mapping information for EID 172.20.100.110 from 172.21.255.1 with RTT 1 msecs
172.20.0.0/16, uptime: 00:02:55, expires: 00:14:59, via map-reply, forward-native
  Encapsulating to proxy ETR

jedolphi
Cisco Employee

A TAC case would be faster than a forum.

A /32 in the Fabric Edge Node GRT for all BN Lo0s is mandatory; a summary or default route is insufficient. Better to use a dynamic protocol over a static route if possible; I suggest adding the BN Lo0 to IS-IS.
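
For example, something along these lines (a sketch; Loopback0 is an assumption, the RLOC address is the one from this thread):

! On the Border Node: advertise Lo0 into the IS-IS underlay
interface Loopback0
 ip router isis
!
! Then verify from the Fabric Edge Node that the /32 is present:
show ip route 172.21.255.1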

Michel has a troubleshooting presentation that might help: https://www.ciscolive.com/c/dam/r/ciscolive/emea/docs/2024/pdf/BRKTRS-3820.pdf

Feel free to send me a DM if you prefer that to solving it here.

jedolphi
Cisco Employee

Is there a reason you're not using LISP Pub/Sub? That is the recommended CP routing architecture moving forward.

Managed to get the lab into the same state as yours. From the Edge Node CLI, running the same IOS XE version, where the PETR /32 is missing from the RIB:

S-EN#show run | i petr
use-petr 192.168.8.38
S-EN#
S-EN#show ip route 192.168.8.38
% Network not in table
S-EN#
S-EN#lig instance-id 4099 172.20.100.110
Mapping information for EID 172.20.100.110 from 192.168.8.38 with RTT 1 msecs
128.0.0.0/1, uptime: 00:00:00, expires: 00:14:59, via map-reply, forward-native
  Encapsulating to proxy ETR
S-EN#
S-EN#show lisp instance-id 4099 ipv4 map-cache
LISP IPv4 Mapping Cache for LISP 0 EID-table vrf CORP_VN (IID 4099), 3 entries

0.0.0.0/0, uptime: 00:42:44, expires: never, via static-send-map-request
  Encapsulating to proxy ETR
10.4.6.0/24, uptime: 00:42:44, expires: never, via dynamic-EID, send-map-request
  Encapsulating to proxy ETR
128.0.0.0/1, uptime: 00:00:10, expires: 00:14:50, via map-reply, forward-native
  Encapsulating to proxy ETR
S-EN#
S-EN#show ip cef vrf CORP_VN 172.20.100.110 detail
128.0.0.0/1, epoch 0, flags [check lisp eligibility]
  LISP remote EID: 2 packets 1152 bytes fwd action fwd native
  LISP fwd-native source
    Dependent covered prefix type LISP-FWD, cover 0.0.0.0/0
  1 IPL source [no flags]
  attached to LISP0.4099
S-EN#

And now the same commands when the PETR /32 is present in the RIB. Note that the next hop is populated in CEF in the second scenario, unlike the first:

S-EN#show run | i petr
use-petr 192.168.8.38
S-EN#
S-EN#show ip route 192.168.8.38
Routing entry for 192.168.8.38/32
  Known via "isis", distance 115, metric 20, type level-2
  Redistributing via isis
  Last update from 172.31.218.66 on TenGigabitEthernet1/1/3, 00:00:17 ago
  Routing Descriptor Blocks:
  * 172.31.218.66, from 172.31.218.64, 00:00:17 ago, via TenGigabitEthernet1/1/3
      Route metric is 20, traffic share count is 1
S-EN#
S-EN#lig instance-id 4099 172.20.100.110
Mapping information for EID 172.20.100.110 from 192.168.8.38 with RTT 2 msecs
128.0.0.0/1, uptime: 00:02:07, expires: 00:14:59, via map-reply, forward-native
  Encapsulating to proxy ETR
S-EN#
S-EN#show lisp instance-id 4099 ipv4 map-cache
LISP IPv4 Mapping Cache for LISP 0 EID-table vrf CORP_VN (IID 4099), 3 entries

0.0.0.0/0, uptime: 00:44:48, expires: never, via static-send-map-request
  Encapsulating to proxy ETR
10.4.6.0/24, uptime: 00:44:49, expires: never, via dynamic-EID, send-map-request
  Encapsulating to proxy ETR
128.0.0.0/1, uptime: 00:02:14, expires: 00:14:53, via map-reply, forward-native
  Encapsulating to proxy ETR
S-EN#
S-EN#show ip cef vrf CORP_VN 172.20.100.110 detail
128.0.0.0/1, epoch 0, flags [subtree context, check lisp eligibility]
  SC owned,sourced: LISP remote EID - locator status bits 0x00000000
  LISP remote EID: 2 packets 1152 bytes fwd action encap
  LISP source path list
    nexthop 192.168.8.38 LISP0.4099
  2 IPL sources [no flags]
    nexthop 192.168.8.38 LISP0.4099
S-EN#

pdomingues
Level 1
(Accepted Solution)

I changed back to LISP Pub/Sub as opposed to LISP/BGP.

I couldn't get it working until I ensured there was a default route in the DEV VRF on the Border+CP node. After checking the routing table, I realized there was no default route in that VRF's routing table, so I added a static 0.0.0.0/0 route, although I'm sure I could have done that dynamically via BGP. Maybe next week I'll get that set up.
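
The static route was along these lines (a sketch; the next hop is a placeholder for our fusion-router handoff address):

ip route vrf DEV_VN 0.0.0.0 0.0.0.0 10.50.0.1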

From DMs with jedolphi:

You absolutely should not need the petr CLI on the EN; it's regressive, effectively hard-coding a default route. Definitely remove it, and stay with Pub/Sub.

This says default route is down, priority 255:

172.21.255.1 255/10 /- 4099 640816641/5633 Default

In Pub/Sub, an External Border Node will only register itself as a default ETR if it has a default route in the RIB. Please add 0/0 to the RIB on the BN and then re-check the same CLI command. If that's the fix, it would be great if you could update the communities discussion, so others can learn and it's closed out. Cheers!

This fixed it.

If you're having issues with fabric edge node connectivity in a VRF and are using the Pub/Sub control plane, make sure the VRF routing table on the border node has a default route in it.
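
A quick way to confirm both ends (commands from earlier in this thread):

! On the Border Node: confirm the VRF default route is in the RIB
show ip route vrf DEV_VN 0.0.0.0

! On the Fabric Edge Node: the map-reply should now return an encapsulating entry
lig instance-id 4099 172.20.100.110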