David Balseca
Beginner

Cisco 3800X: traffic loss in MPLS network

An ME-3800X acting as a PE in an MPLS network is showing the following logs and experiencing traffic loss:

 

18289145: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed
18289146: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed
18289147: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed
18289148: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed
18289149: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed
18289150: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed
18289151: Oct  9 19:09:09.565: label allocation failed for label val 22
18289152: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed
18289153: Oct  9 19:09:09.565: label allocation failed for label val 22
18289154: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed
18289155: Oct  9 19:09:09.565: label allocation failed for label val 22
18289156: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed
18289157: Oct  9 19:09:09.565: label allocation failed for label val 22
18289158: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed
18289159: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed
18289160: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed
18289161: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed
18289162: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed
18289163: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed
18289164: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed
18289165: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed
18289166: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed
18289167: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed
18289168: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed
18289169: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed

18289170: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed
18289171: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed

18289172: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed
18289173: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed
18289174: Oct  9 19:09:09.565: nmpls_next_label_check: Label allocation Failed
18289175: Oct  9 19:09:09.569: nmpls_next_label_check: Label allocation Failed
18289176: Oct  9 19:09:09.569: nmpls_next_label_check: Label allocation Failed
18289182: Oct  9 19:09:12.553: nmpls_next_label_check: Label allocation Failed
18289183: Oct  9 19:09:12.593: nmpls_next_label_check: Label allocation Failed
18289184: Oct  9 19:09:12.593: nmpls_next_label_check: Label allocation Failed
18289185: Oct  9 19:09:12.593: nmpls_next_label_check: Label allocation Failed
18289186: Oct  9 19:09:12.593: nmpls_next_label_check: Label allocation Failed
18289187: Oct  9 19:09:12.593: nmpls_next_label_check: Label allocation Failed
18289188: Oct  9 19:09:12.593: nmpls_next_label_check: Label allocation Failed
18289189: Oct  9 19:09:12.593: nmpls_next_label_check: Label allocation Failed
18289190: Oct  9 19:09:12.593: nmpls_next_label_check: Label allocation Failed
18289191: Oct  9 19:09:12.593: nmpls_next_label_check: Label allocation Failed
18289192: Oct  9 19:09:12.593: nmpls_next_label_check: Label allocation Failed
18289193: Oct  9 19:09:12.593: nmpls_next_label_check: Label allocation Failed
18289194: Oct  9 19:09:12.593: nmpls_next_label_check: Label allocation Failed
18289195: Oct  9 19:09:12.593: nmpls_next_label_check: Label allocation Failed
18289196: Oct  9 19:09:12.593: nmpls_next_label_check: Label allocation Failed
18289197: Oct  9 19:09:12.593: nmpls_next_label_check: Label allocation Failed
18289198: Oct  9 19:09:12.593: nmpls_next_label_check: Label allocation Failed
18289199: Oct  9 19:09:12.593: nmpls_next_label_check: Label allocation Failed
18289200: Oct  9 19:09:12.593: nmpls_next_label_check: Label allocation Failed
18289201: Oct  9 19:09:12.593: nmpls_next_label_check: Label allocation Failed
18289202: Oct  9 19:09:12.593: nmpls_next_label_check: Label allocation Failed
18289203: Oct  9 19:09:12.593: nmpls_next_label_check: Label allocation Failed
18289204: Oct  9 19:09:12.597: nmpls_next_label_check: Label allocation Failed
18289205: Oct  9 19:09:12.597: nmpls_next_label_check: Label allocation Failed
18289206: Oct  9 19:09:12.597: nmpls_next_label_check: Label allocation Failed
18289207: Oct  9 19:09:12.597: nmpls_next_label_check: Label allocation Failed
18289208: Oct  9 19:09:12.597: nmpls_next_label_check: Label allocation Failed
18289209: Oct  9 19:09:14.589: %PARSER-5-CFGLOG_LOGGEDCMD: User:pbermudez  logged command:isis metric 500
18289210: Oct  9 19:09:15.221: label allocation failed for fib 10.11.51.48/30 Tbl:1 label val 1756
18289211: Oct  9 19:09:15.221: label allocation failed for fib 10.11.51.48/30 Tbl:1 label val 1756
18289212: Oct  9 19:09:31.325: %SYS-5-CONFIG_I: Configured from console by pbermudez on vty1 (10.90.54.2)

 

 

Is this a bug? What is the fix?

 

Regards,

4 REPLIES
Vinit Jain
Cisco Employee

Could you please share the output of the following commands:

sh platform aspdma template | include MPLS
sh platform nile adjmgr all | include EMPLS
sh plat nile adjmgr fid_usage | in EMPLS
show ip route summary
show version | in image

Were there any recent events on the router or in the network, such as the addition of new paths, a shift of traffic from primary to secondary, or link flaps?

Once we have these outputs, we shall have a much better picture of why these logs are appearing.

Thanks

Vinit


Hi Vinit

Between the two devices (3800) there are two L3 links (no port channel, IS-IS as the IGP). Traffic is normally forwarded over both interfaces (load balancing), but one device showed these logs and traffic was being forwarded over only one interface. I had to raise the metric on the interface with the problem, after which the log messages stopped.

 

MCHPASJE01#sh platform aspdma template | include MPLS
NILE_NUM_EOMPLS_TUNNELS                  =  4000
NILE_NUM_ROUTED_EOMPLS_TUNNELS           =  128
NILE_NUM_MPLS_VPN                        =  2000
NILE_NUM_MPLS_SERVICES                   =  4000
NILE_NUM_MPLS_INGRESS_LABELS             =  29500
NILE_NUM_MPLS_EGRESS_LABELS              =  36000
MPLSD_TABLE                                   = 34816
EMPLS3LD_TABLE                                = 36864
MCHPASJE01#sh platform nile adjmgr all | include EMPLS
EMPLS3LD Total Alloc:4718837 Total Free:4699926 Usage:18911
EMPLSINTD Total Alloc:4465 Total Free:3938 Usage:527
MCHPASJE01#sh plat nile adjmgr fid_usage | in EMPLS
EMPLS3LD Total Alloc:4718837 Total Free:4699926 Usage:18911
EMPLSINTD Total Alloc:4465 Total Free:3938 Usage:527
MCHPASJE01#show ip route summary
IP routing table name is default (0x0)
IP routing table maximum-paths is 4
Route Source    Networks    Subnets     Replicates  Overhead    Memory (bytes)
connected       0           9           0           648         1620
static          0           0           0           0           0
isis 1          2           2577        0           185688      464220
  Level 1: 0 Level 2: 2579 Inter-area: 0
eigrp 106       0           0           0           0           0
bgp 28006       0           0           0           0           0
  External: 0 Internal: 0 Local: 0
eigrp 105       0           0           0           0           0
internal        21                                              108420
Total           23          2586        0           186336      574260
MCHPASJE01#show version | in image
System image file is "flash:/me380x-universalk9-mz.153-3.S2/me380x-universalk9-mz.153-3.S2.bin"
MCHPASJE01#
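
As a rough aid for reading the adjacency manager output above, the counters can be pulled out with a short script. This is a minimal sketch in Python, not a Cisco tool: the helper name `parse_adjmgr` is hypothetical, and the reading that `Usage` equals `Total Alloc` minus `Total Free` is an assumption, although the pasted numbers are consistent with it.

```python
import re

# Sample text copied from the `sh platform nile adjmgr all | include EMPLS` output above.
SAMPLE = """\
EMPLS3LD Total Alloc:4718837 Total Free:4699926 Usage:18911
EMPLSINTD Total Alloc:4465 Total Free:3938 Usage:527
"""

# One counter line: table name, then the three numeric fields.
LINE_RE = re.compile(r"^(\S+) Total Alloc:(\d+) Total Free:(\d+) Usage:(\d+)$")


def parse_adjmgr(text):
    """Return {table: {"alloc", "free", "usage"}} for each matching line (hypothetical helper)."""
    tables = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            alloc, free, usage = (int(v) for v in m.groups()[1:])
            tables[m.group(1)] = {"alloc": alloc, "free": free, "usage": usage}
    return tables


if __name__ == "__main__":
    for name, c in parse_adjmgr(SAMPLE).items():
        # Assumption: Usage should equal Total Alloc minus Total Free.
        consistent = c["alloc"] - c["free"] == c["usage"]
        print(f"{name}: usage={c['usage']} (alloc - free matches: {consistent})")
```

The same function can be fed output captured from other devices to compare their EMPLS3LD usage against the `EMPLS3LD_TABLE` value reported by the template command.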

Hello David

Thanks for the logs. I think the logs and your description explain the problem.

MCHPASJE01#sh platform aspdma template | include MPLS
NILE_NUM_EOMPLS_TUNNELS                  =  4000
NILE_NUM_ROUTED_EOMPLS_TUNNELS           =  128
NILE_NUM_MPLS_VPN                        =  2000
NILE_NUM_MPLS_SERVICES                   =  4000
NILE_NUM_MPLS_INGRESS_LABELS             =  29500
NILE_NUM_MPLS_EGRESS_LABELS              =  36000
MPLSD_TABLE                                   = 34816
EMPLS3LD_TABLE                                = 36864  <<<<< LIMIT

In the above output, the value 36864 is the limit that the platform currently has for labels.

MCHPASJE01#sh platform nile adjmgr all | include EMPLS
EMPLS3LD Total Alloc:4718837 Total Free:4699926 Usage:18911

Note that the usage is 18911, but this is the state after you increased the metric on the other link, so there is no ECMP in the picture at the moment. With ECMP you would have double this count (about 37822, which is above the 36864 limit), so you actually ran out of resources. That is why those messages were being printed.

In such situations, you can expect traffic loss and management connectivity loss for the device.
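
The arithmetic behind this explanation can be checked directly. The sketch below assumes, as described above, that running ECMP over two links roughly doubles the EMPLS3LD usage; the exact scaling on a real device may differ, and the function name `ecmp_headroom` is hypothetical.

```python
EMPLS3LD_LIMIT = 36864     # EMPLS3LD_TABLE from `sh platform aspdma template`
USAGE_SINGLE_PATH = 18911  # EMPLS3LD Usage measured with ECMP disabled


def ecmp_headroom(usage_single_path, paths, limit):
    """Estimate free entries, assuming usage scales linearly with the number of equal-cost paths."""
    return limit - usage_single_path * paths


for paths in (1, 2):
    left = ecmp_headroom(USAGE_SINGLE_PATH, paths, EMPLS3LD_LIMIT)
    status = "fits" if left >= 0 else "exceeds the table"
    print(f"{paths} path(s): estimated usage {USAGE_SINGLE_PATH * paths}, headroom {left} ({status})")
```

With these numbers, two paths give an estimated 37822 entries, 958 more than the 36864 limit, which is consistent with the allocation failures in the logs.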

Hope this clarifies.

Regards

Vinit


Dear Vinit,

 

I would like to better understand the problem and what the advisable long-term solution is. We recently encountered the same error message on a 3800X and, as in David's case, it dropped all traffic; the issue was only resolved after a reboot. There are about 7 IGP adjacencies on it, since it aggregates several ring topologies.

 

This is the current output of two of the commands that you requested from David:

 

XXXX-UPE-02#sh platform aspdma template | include MPLS
NILE_NUM_EOMPLS_TUNNELS = 4000
NILE_NUM_ROUTED_EOMPLS_TUNNELS = 128
NILE_NUM_MPLS_VPN = 2000
NILE_NUM_MPLS_SERVICES = 4000
NILE_NUM_MPLS_INGRESS_LABELS = 29500
NILE_NUM_MPLS_EGRESS_LABELS = 36000
MPLSD_TABLE = 34816
EMPLS3LD_TABLE = 36864
XXXX-02#sh platform nile adjmgr all | include EMPLS
EMPLS3LD Total Alloc:1199 Total Free:1073 Usage:126
EMPLSINTD Total Alloc:1080 Total Free:396 Usage:684

 

I would appreciate your feedback. We are considering replacing the device, given two similar outages with the same symptom.

 

Regards