
IP_LEAF grows on the trident line card on the A9K. Only reloading the line card helps.

Anton Popov
Level 1

A few months ago, a problem appeared on my ASR9k-RSP4 with a Trident line card (A9K-8T-L), running IOS XR 4.2.3.

I use only BGP functionality. My router receives two full views (IPv4 + IPv6).

 

The logs began to show these messages:
LC/0/0/CPU0: Jun 8 15:50:57.033 fib_mgr[171]: %ROUTING-FIB-2-OOR: CEF has run out of DATA_TYPE_TABLE_SET resource memory. No more route updates will be handled by CEF. CEF.
LC/0/0/CPU0: Jun 8 15:50:57.033 fib_mgr[171]: %ROUTING-FIB-6-RSRC_OK: CEF resource state has returned to normal. CEF has exited resource constrained operation

The CPU load on the line card also rises to as much as 40%.

I ran a few diagnostic commands:

show cef platform resource location 0/1/CPU0

Node: 0/1/CPU0
----------------------------------------------------------------
RPF_STRICT usage is same on all NPs

NP: 0 struct 9: RPF_STRICT (maps to ucode stru = 15)

Used Entries: 0 Max Entries: 65536
-------------------------------------------------------------

IPV4_LEAF_P usage is same on all NPs

NP: 0 struct 23: IPV4_LEAF_P (maps to ucode stru = 54)

Used Entries: 766515 Max Entries: 1314000
-------------------------------------------------------------

IPV6_LEAF_P usage is same on all NPs

NP: 0 struct 24: IPV6_LEAF_P (maps to ucode stru = 55)

Used Entries: 69958 Max Entries: 656500
-------------------------------------------------------------

IP_LEAF usage is same on all NPs

NP: 0 struct 4: IP_LEAF (maps to ucode stru = 8)

Used Entries: 1458081 Max Entries: 1460000
-------------------------------------------------------------

NP: 0 struct 12: TX_ADJ (maps to ucode stru = 8)

Used Entries: 1458081 Max Entries: 1460000
-------------------------------------------------------------



show cef resource hardware ingress detail location 0/1/CPU0

CEF resource availability summary state: RED
CEF will drop route updates
No. of times HW caused oor: 251
CEF entered oor at : Jun 8 16:33:45.574
CEF came out of oor at : Jun 8 16:33:25.622
ipv4 shared memory resource:
CurrMode GREEN, CurrAvail 1050771256 bytes, MaxAvail 1231384576 bytes
ipv6 shared memory resource:
CurrMode GREEN, CurrAvail 1050771256 bytes, MaxAvail 1231384576 bytes
mpls shared memory resource:
CurrMode GREEN, CurrAvail 1050771256 bytes, MaxAvail 1231384576 bytes
common shared memory resource:
CurrMode GREEN, CurrAvail 1050771256 bytes, MaxAvail 1231384576 bytes
DATA_TYPE_TABLE_SET hardware resource: RED
DATA_TYPE_TABLE hardware resource: RED
DATA_TYPE_IDB hardware resource: RED
DATA_TYPE_IDB_EXT hardware resource: RED
DATA_TYPE_LEAF hardware resource: RED
DATA_TYPE_LOADINFO hardware resource: RED
DATA_TYPE_PATH_LIST hardware resource: RED
DATA_TYPE_NHINFO hardware resource: RED
DATA_TYPE_LABEL_INFO hardware resource: RED
DATA_TYPE_FRR_NHINFO hardware resource: RED
DATA_TYPE_ECD hardware resource: RED
DATA_TYPE_RECURSIVE_NH hardware resource: RED
DATA_TYPE_TUNNEL_ENDPOINT hardware resource: RED
DATA_TYPE_LOCAL_TUNNEL_INTF hardware resource: RED
DATA_TYPE_ECD_TRACKER hardware resource: RED
DATA_TYPE_ECD_V2 hardware resource: RED
DATA_TYPE_ATTRIBUTE hardware resource: RED
DATA_TYPE_LSPA hardware resource: RED
DATA_TYPE_LDI_LW hardware resource: RED
DATA_TYPE_LDSH_ARRAY hardware resource: RED
DATA_TYPE_TE_TUN_INFO hardware resource: RED
DATA_TYPE_DUMMY hardware resource: RED
DATA_TYPE_IDB_VRF_LCL_CEF hardware resource: RED
DATA_TYPE_TABLE_UNRESOLVED hardware resource: RED
DATA_TYPE_MOL hardware resource: RED
DATA_TYPE_MPI hardware resource: RED
DATA_TYPE_SUBS_INFO hardware resource: RED
DATA_TYPE_GRE_TUNNEL_INFO hardware resource: RED
DATA_TYPE_LISP_RLOC hardware resource: RED
DATA_TYPE_LSM_ID hardware resource: RED
DATA_TYPE_INTF_LIST hardware resource: RED
DATA_TYPE_TUNNEL_ENCAP_STR hardware resource: RED

 

I tried to clear the CEF table, but it did not help. The only solution I found is to reload the line card.

What could be the problem? How can I solve it?


Giuseppe Larosa
Hall of Fame

Hello Anton,

You are running quite an old version of IOS XR on your device (IOS XR 4.2.3).

 

You say you are running eBGP with two ISPs

>> I use only BGP functionality. My router receives two FW v4 + v6.

Is it correct that you are receiving two full BGP tables, for both IPv4 and IPv6?

 

The error messages say that the linecard runs out of memory for CEF tables.

>>

The logs began to issue a message:
LC/0/0/CPU0: Jun 8 15:50:57.033 fib_mgr[171]: %ROUTING-FIB-2-OOR: CEF has run out of DATA_TYPE_TABLE_SET resource memory. No more route updates will be handled by CEF. CEF.
LC/0/0/CPU0: Jun 8 15:50:57.033 fib_mgr[171]: %ROUTING-FIB-6-RSRC_OK: CEF resource state has returned to normal. CEF has exited resource constrained operation

>>It also increases the load CPU on line card up to 40%

 

Note: because the line card has run out of CEF resources, its CPU has to process some prefixes in software. This is the reason for the increased CPU usage.

 

From the other show commands you provided, it looks like the resources are not near their limits:

 

IPV4_LEAF_P usage is same on all NPs

NP: 0 struct 23: IPV4_LEAF_P (maps to ucode stru = 54)

Used Entries: 766515 Max Entries: 1314000

However, the 766515 used entries are more than 50% of the 1314000 maximum entries.

 

So my first suggestion is to upgrade your ASR 9000 to a newer version of IOS XR, as 4.2.3 is really old, and also to upgrade the firmware on all line cards. This may solve your issue, which has been triggered by the latest increase in the total number of IPv4 prefixes in the full BGP table(s).

Can you check if one of your eBGP sessions flapped just before the event?

I wonder if the line card was overwhelmed while trying to recompute the CEF table for all IPv4 prefixes. This should happen only in a case such as one eBGP session going down.

 

Edit:

Looking again at your diagnostic show commands, we see that in the following sections you are close to the maximum number of entries:

IP_LEAF usage is same on all NPs

NP: 0 struct 4: IP_LEAF (maps to ucode stru = 8)

Used Entries: 1458081 Max Entries: 1460000
-------------------------------------------------------------

NP: 0 struct 12: TX_ADJ (maps to ucode stru = 8)

Used Entries: 1458081 Max Entries: 1460000
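
To make the comparison between the tables explicit, here is a minimal Python sketch that parses "Used Entries / Max Entries" pairs from the output pasted in this thread and prints the utilization of each structure. The function names, the regular expressions, and the 95% warning threshold are my own assumptions for illustration; they are not part of any Cisco tool, and the parser only handles the layout shown above.

```python
import re

# Text copied from the `show cef platform resource` output in this thread.
SAMPLE = """\
NP: 0 struct 23: IPV4_LEAF_P (maps to ucode stru = 54)

Used Entries: 766515 Max Entries: 1314000

NP: 0 struct 24: IPV6_LEAF_P (maps to ucode stru = 55)

Used Entries: 69958 Max Entries: 656500

NP: 0 struct 4: IP_LEAF (maps to ucode stru = 8)

Used Entries: 1458081 Max Entries: 1460000
"""

# Patterns matched against the layout above (an assumption about the format).
STRUCT_RE = re.compile(r"NP:\s*\d+\s+struct\s+\d+:\s+(\w+)")
USAGE_RE = re.compile(r"Used Entries:\s*(\d+)\s+Max Entries:\s*(\d+)")

# Hypothetical warning threshold, chosen only for this example.
NEAR_LIMIT_PERCENT = 95.0


def parse_usage(text):
    """Return {structure_name: (used, max_entries)} from the CLI text."""
    usage = {}
    current = None
    for line in text.splitlines():
        header = STRUCT_RE.search(line)
        if header:
            current = header.group(1)
            continue
        counts = USAGE_RE.search(line)
        if counts and current is not None:
            usage[current] = (int(counts.group(1)), int(counts.group(2)))
            current = None
    return usage


def utilization(used, max_entries):
    """Percentage of the table that is in use."""
    return 100.0 * used / max_entries


if __name__ == "__main__":
    for name, (used, max_entries) in parse_usage(SAMPLE).items():
        pct = utilization(used, max_entries)
        flag = "  <-- near limit" if pct >= NEAR_LIMIT_PERCENT else ""
        print(f"{name:12s} {used:>8d} / {max_entries:<8d} {pct:6.2f}%{flag}")
```

With the numbers from this thread, IPV4_LEAF_P comes out at roughly 58% and IP_LEAF at roughly 99.9%, so IP_LEAF is the structure that is practically full.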

 

If you receive 720,000 routes from each of your two upstream ISPs, then at the CEF level the device may need one entry for each received prefix in order to be ready to use the backup path.

Are you using BGP multipath? This might explain the double usage of these tables.
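
The arithmetic behind this guess can be sketched in a few lines of Python. The factor of two entries per prefix is only the hypothesis under discussion here, not confirmed platform behaviour, and the variable and function names are made up for the example. The route count and the IP_LEAF figures are the ones quoted in this thread.

```python
# Figures quoted in this thread.
ROUTES_PER_UPSTREAM = 720_000   # approximate routes received per upstream
IP_LEAF_USED = 1_458_081        # "Used Entries" reported for IP_LEAF
IP_LEAF_MAX = 1_460_000         # "Max Entries" reported for IP_LEAF

# ASSUMPTION: one entry per received prefix per upstream (primary + backup).
# This is a hypothesis to test, not documented behaviour of the line card.
ASSUMED_ENTRIES_PER_PREFIX = 2


def projected_entries(prefixes, entries_per_prefix):
    """Entries consumed if every prefix needs `entries_per_prefix` slots."""
    return prefixes * entries_per_prefix


def percent_of(value, total):
    """`value` expressed as a percentage of `total`."""
    return 100.0 * value / total


projected = projected_entries(ROUTES_PER_UPSTREAM, ASSUMED_ENTRIES_PER_PREFIX)
headroom = IP_LEAF_MAX - projected

print(f"projected entries : {projected}")
print(f"share of maximum  : {percent_of(projected, IP_LEAF_MAX):.2f}%")
print(f"headroom left     : {headroom}")
print(f"observed usage    : {IP_LEAF_USED}")
```

Under that assumption the projection is 1,440,000 entries, which is within about 20,000 of the 1,460,000 maximum and close to the observed 1,458,081. That is why doubled usage is worth ruling in or out.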

 

Hope this helps

Giuseppe

 

 

Hello Giuseppe,


@Giuseppe Larosa wrote:

>> Is it correct that you are receiving two full BGP tables, for both IPv4 and IPv6?

Yes.

>> Can you check if one of your eBGP sessions flapped just before the event?

I looked at the logs and did not see any BGP flaps over the past few days.

>> Are you using BGP multipath? This might explain the double usage of these tables.

That is an interesting thought about multipath, and it was the very first thing I started checking. I did not set up multipath on this router. Perhaps it is enabled by default? How can this be verified?
However, since my eBGP sessions are with different upstreams, the BGP paths will be different, so I think multipath should not apply in any case.

Thanks!
