06-26-2024 07:36 AM - edited 06-26-2024 07:49 AM
Problem Statement:
We are being notified of the following logs from the core switch:
2024 Jun 23 20:07:55.084 DCCore-1-N9K %ICAM-4-SCALE_THRESHOLD_EXCEEDED_WARN: Utilization of 93 percent for feature OSPF LSAs is over the warning threshold.
2024 Jun 23 22:07:58.794 DCCore-1-N9K %ICAM-4-SCALE_THRESHOLD_EXCEEDED_WARN: Utilization of 93 percent for feature OSPF LSAs is over the warning threshold.
2024 Jun 24 00:08:01.277 DCCore-1-N9K %ICAM-4-SCALE_THRESHOLD_EXCEEDED_WARN: Utilization of 93 percent for feature OSPF LSAs is over the warning threshold.
On further checking, the Nexus 9500 has a limit of 100000 OSPF LSAs, and we are currently near full utilization.
OSPF LSAs          100000  100000  85330  85.33  101.15  92.20  2024-06-18 18:35:23  117.68  2023-09-14 23:07:41
 (VDC:1,TAG:700)   -       -       1      0.09   0.00    0.09   2024-06-11 12:31:37  0.09    2022-08-20 09:16:53
 (VDC:1,TAG:100)   -       -       75372  75.37  88.71   81.41  2024-06-18 18:35:23  106.74  2023-09-14 23:07:40
As per the screenshot, OSPF process ID 100 is consuming more than 88% of the LSA capacity.
The Type-5 LSAs account for most of the entries in the database. Please find the attached Type-5 LSA routes.
Process 100 database summary
LSA Type Count
Opaque Link 0
Router 36
Network 18
Summary Network 10539
Summary ASBR 289
Type-7 AS External 0
Opaque Area 0
Type-5 AS External 64481
Opaque AS 0
Non-self 70753
Total 75363
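For reference, the per-type breakdown above can typically be regenerated, and the dominant Type-5 entries inspected, with commands along these lines on NX-OS (the instance tag 100 is taken from this thread; exact syntax may vary by release):
show ip ospf 100 database database-summary
show ip ospf 100 database external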
We have identified external routes being redistributed into OSPF, causing increased database utilization.
Upon inspection of several Type-5 routes, we found that they are learned from various locations.
Summarizing these routes is complex, as we lack visibility into which routes are actively used by location teams.
We have a DC and DR setup across all locations, so summarization is difficult and a misstep could cause an outage.
Note:
All these segments are learned from other locations. Each segment is also different in each location, so summarizing these routes is complex at the originating ABR/ASBR.
Please provide a possible alternate solution instead of summarization. Thank you…
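One lever short of summarization that is sometimes considered is tightening the redistribution filter on the originating ASBRs, so that only required external prefixes become Type-5 LSAs. A hedged NX-OS-style sketch (the prefix-list, route-map name, prefix, and BGP AS are hypothetical):
ip prefix-list REDIST-ALLOW seq 10 permit 10.0.0.0/8 le 24
route-map OSPF-REDIST permit 10
  match ip address prefix-list REDIST-ALLOW
router ospf 100
  redistribute bgp 65000 route-map OSPF-REDIST
Note this only limits what each ASBR injects locally; it does not remove Type-5 LSAs already originated elsewhere in the domain.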
06-26-2024 08:42 AM
- FYI : https://bst.cloudapps.cisco.com/bugsearch/bug/CSCwh43960
You may want to investigate further using these commands:
show hardware internal forwarding table utilization
show icam scale
show icam resource fib-tcam mod 1 inst 0
M.
06-26-2024 09:09 AM - edited 06-26-2024 09:11 AM
Thank you for the response.
I have verified the output of the command below and identified OSPF process ID 100 as the cause of the high database utilization.
#show icam scale utilization
I have attached the logs. Could you please review them and confirm whether we are affected by the bug, or whether any alternate solution is available instead of route summarization?
Note: All these segments are learned from other locations. Each segment is also different in each location, so summarizing these routes is complex at the originating ABR/ASBR.
06-26-2024 09:43 AM
- I do not have enough expertise to comment on those outputs; it is best to always use the latest advisory software version for the involved platforms. I think you will need TAC support to get further solutions (if needed).
M.
06-27-2024 04:56 AM
FYI, TAC suggested performing route summarization.
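For context, the summarization TAC refers to is usually an inter-area range on the ABR and/or an external summary on the ASBR. A hedged NX-OS-style sketch (the area ID and prefixes are hypothetical placeholders):
router ospf 100
  area 0.0.0.10 range 10.20.0.0/16
  summary-address 172.16.0.0/12
The area range collapses Summary (Type-3) LSAs at the ABR, while summary-address collapses redistributed Type-5 LSAs at the ASBR, which is where the bulk of this database sits.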