03-01-2016 08:04 AM - edited 08-28-2017 03:02 AM
In IOS XR release 5.3.3, new serviceability enhancements were introduced to help troubleshoot packet drops in Network Processor (NP) microcode. This document explains the enhancements in detail and touches briefly on other input drop troubleshooting techniques.
We recommend using the new troubleshooting features "monitor np interface" and "show controllers np capture". Use the old "monitor np counter" only as a last resort, under direct guidance from TAC.
Input drops were previously quite challenging to troubleshoot. Starting with IOS XR release 5.3.3, per-interface packet drops in NP microcode can be investigated quite easily.
There are currently close to 1200 NP statistics counters that provide great insight into what actions the NP microcode is performing. These counters are stored in memory that must be very fast and very close to the NP cores, which limits its size. As a consequence, NP statistics counters are global. To achieve per-interface drop counting, we have carved out a portion of the statistics memory for per-interface drop counters. In NP microcode terminology, these are per-uidb drop counters. UIDB (or μIDB) stands for Microcode Interface Descriptor Block; in other words, it is the NP's view of a (sub)interface.
The usual starting point for this kind of troubleshooting is observing input drops on an interface, for example:
GigabitEthernet0/0/1/6.1 is up, line protocol is up
<..output omitted..>
     307793 packets input, 313561308 bytes, 227987 total input drops
The following command monitors the drops on this particular sub-interface. In this example two iterations are executed, each lasting one second.
RP/0/RSP0/CPU0:our9001#monitor np interface g0/0/1/6.1 count 2 time 1 location 0/0/CPU0
Monitor NP counters of GigabitEthernet0_0_1_6.1 for 2 sec
<..output omitted..>
**** Sun Jan 31 22:14:32 2016 ****
Monitor 2 non-zero NP1 counters: GigabitEthernet0_0_1_6.1
Offset  Counter                                         FrameValue   Rate (pps)
-------------------------------------------------------------------------------
 262    RSV_DROP_MPLS_LEAF_NO_MATCH_MONITOR                    101           49
1307    PARSE_DROP_IPV4_CHECKSUM_ERROR_MONITOR                 101           50
(Count 2 of 2)
RP/0/RSP0/CPU0:our9001#
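If you collect this output from many interfaces, it can help to pull the counter rows out programmatically. The following is a minimal Python sketch, assuming the four-column row layout shown above (offset, counter name, frame value, rate); the function name `parse_counters` and the dictionary keys are illustrative choices, not part of any Cisco tool.

```python
import re

# Counter rows copied from the "monitor np interface" output above.
SAMPLE = """\
 262    RSV_DROP_MPLS_LEAF_NO_MATCH_MONITOR                    101           49
1307    PARSE_DROP_IPV4_CHECKSUM_ERROR_MONITOR                 101           50
"""

# Assumed row layout: offset, counter name, frame value, rate in pps.
ROW = re.compile(r"^\s*(\d+)\s+([A-Z0-9_]+)\s+(\d+)\s+(\d+)\s*$")


def parse_counters(text):
    """Return one dict per counter row; lines that do not match are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            offset, name, frames, pps = match.groups()
            rows.append({
                "offset": int(offset),
                "counter": name,
                "frames": int(frames),
                "pps": int(pps),
            })
    return rows


counters = parse_counters(SAMPLE)
# List the drop reasons, highest rate first.
for row in sorted(counters, key=lambda r: r["pps"], reverse=True):
    print(row["counter"], row["frames"], row["pps"])
```

Header, separator and timestamp lines do not match the pattern and are ignored, so the full command output can be passed in unchanged.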
Now you have an idea of the drop reason. Some drops are self-explanatory, like the IPv4 checksum error. For the remaining ones you need to look into the dropped packet header to investigate further. This is where the next new feature comes to the rescue.
Starting with IOS XR release 5.3.3, packets recently dropped by NP microcode are saved for further troubleshooting.
The NP microcode saves the headers of recently dropped packets into a circular buffer. Tomahawk line cards keep the most recent 128 dropped packets; Typhoon line cards keep the most recent 32.
You can view the recently dropped packets using the show controllers np capture command.
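To illustrate what "circular buffer" means in practice, here is a small Python sketch. It only models the retention behaviour (oldest entries are overwritten once the buffer is full) and is not how the microcode is implemented; the depths come from the text, while the dictionary keys and function name are illustrative.

```python
from collections import deque

# Buffer depths taken from the text; the keys are illustrative labels.
CAPTURE_DEPTH = {"tomahawk": 128, "typhoon": 32}


def make_capture_buffer(family):
    """Return a fixed-depth buffer that silently evicts its oldest entry."""
    return deque(maxlen=CAPTURE_DEPTH[family])


buf = make_capture_buffer("typhoon")
for seq in range(100):       # pretend 100 packets were dropped
    buf.append(seq)          # appends beyond the depth overwrite the oldest

# Only the 32 most recent drops (sequence numbers 68..99) remain.
print(len(buf), buf[0], buf[-1])
```

The practical consequence is that under a high drop rate the buffer turns over quickly, so the capture should be read soon after the drops of interest occur.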
RP/0/RSP0/CPU0:our9001#sh controllers np capture np1 location 0/0/CPU0
NP1 capture buffer has seen 426268 packets - displaying 32
Sun Jan 31 22:55:13.935 : RSV_DROP_MPLS_LEAF_NO_MATCH
From GigabitEthernet0_0_1_6: 1222 byte packet on NP1
0000: 84 78 ac 78 ca 3e 30 f7 0d f8 af 81 81 00 03 85
0010: 88 47 05 dc 11 ff 45 00 00 64 01 ae 00 00 ff 01
0020: 62 c3 ac 12 00 02 ac 10 ff 02 00 00 02 3a 00 0a
<..output omitted..>
In the snapshot shown above you can see that the dropped packet was received on port GigabitEthernet0/0/1/6. Because the packet capture feature works at port level, you have to look into the L2 encapsulation header to figure out the subinterface on which the packet was received. In the above snapshot the encapsulation was 802.1Q, indicated by EtherType 0x8100. The two octets that follow contain the 3-bit PCP, 1-bit DEI and 12-bit VLAN ID. The VLAN ID here is 0x385, which is 901 in decimal.
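The VLAN lookup described above can be checked with a few lines of Python. This is a minimal sketch that assumes a single 802.1Q tag directly after the source MAC address, as in this capture; the function name `parse_dot1q` is illustrative.

```python
import struct

# First 18 bytes of the captured frame: destination MAC, source MAC,
# 802.1Q TPID, TCI, and the inner EtherType.
frame = bytes.fromhex(
    "84 78 ac 78 ca 3e 30 f7 0d f8 af 81 81 00 03 85 88 47"
)


def parse_dot1q(frame):
    """Split the 802.1Q tag that follows the two MAC addresses."""
    tpid, tci, inner_type = struct.unpack("!HHH", frame[12:18])
    if tpid != 0x8100:
        raise ValueError("frame is not 802.1Q tagged")
    return {
        "pcp": tci >> 13,            # top 3 bits
        "dei": (tci >> 12) & 0x1,    # next bit
        "vlan": tci & 0x0FFF,        # low 12 bits
        "ethertype": inner_type,
    }


tag = parse_dot1q(frame)
print(tag["vlan"], hex(tag["ethertype"]))
```

For this frame the VLAN ID is 901 and the inner EtherType is 0x8847 (MPLS unicast). Frames with stacked tags (QinQ) would need the same step repeated for each tag.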
Using any off-line packet decoder tool, the above captured frame is easily decoded to:
Ethernet II, Src: 30:f7:0d:f8:af:81, Dst: 84:78:ac:78:ca:3e
    Type: 802.1Q Virtual LAN (0x8100)
802.1Q Virtual LAN, PRI: 0, CFI: 0, ID: 901
    Type: MPLS label switched packet (0x8847)
MultiProtocol Label Switching Header, Label: 24001, Exp: 0, S: 1, TTL: 255
    MPLS Label: 24001
    MPLS Experimental Bits: 0
    MPLS Bottom Of Label Stack: 1
    MPLS TTL: 255
Internet Protocol, Src: 172.18.0.2 (172.18.0.2), Dst: 172.16.255.2 (172.16.255.2)
Internet Control Message Protocol
    Type: 0 (Echo (ping) reply)
    Code: 0 ()
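If no decoder tool is at hand, the MPLS label can also be extracted by hand from the four octets that follow EtherType 0x8847 in the capture (05 dc 11 ff). A minimal Python sketch, using the standard MPLS label stack entry layout (20-bit label, 3-bit Exp, 1-bit S, 8-bit TTL); the function name `parse_mpls` is illustrative.

```python
import struct

# The four octets after EtherType 0x8847 in the captured frame.
entry = bytes.fromhex("05 dc 11 ff")


def parse_mpls(entry):
    """Decode one 32-bit MPLS label stack entry."""
    (word,) = struct.unpack("!I", entry)
    return {
        "label": word >> 12,         # top 20 bits
        "exp": (word >> 9) & 0x7,    # 3 experimental / traffic class bits
        "s": (word >> 8) & 0x1,      # bottom-of-stack flag
        "ttl": word & 0xFF,          # low 8 bits
    }


mpls = parse_mpls(entry)
print(mpls)
```

This yields label 24001 with the bottom-of-stack bit set and TTL 255, matching the decoder output above.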
The next step would be to check the MPLS forwarding table for the label 24001.
You can disable or enable the capturing of specific drop reasons by using the filter option:
RP/0/RSP0/CPU0:our9001#sh controllers np capture np1 filter RSV_DROP_MPLS_LEAF_NO_MATCH disable location 0/0/CPU0
Disable NP1 packet capture for: RSV_DROP_MPLS_LEAF_NO_MATCH
You can see which drop counters are eligible for capture by using the help option:
RP/0/RSP0/CPU0:our9001#sh controllers np capture np1 help location 0/0/CPU0
NP1 Status            Capture Counter Name
---------------------+------------------------------
Capturing             PARSE_UNKNOWN_DIR_DROP
Capturing             PARSE_UNKNOWN_DIR_1
<..output omitted..>
Starting with XR release 4.3.x, the content of a packet processed by the NP can be dumped using the monitor np counter CLI. This method is explained in detail in ASR9000/XR: How to capture dropped or lost packets.
Use this only as a last resort, when the two new troubleshooting features can't help.
The drawback of this approach, compared to the two new methods described in this document, is that all captured packets are dropped. In addition, an NP reset is required upon capture completion (~50 ms traffic outage on Typhoon, ~150 ms on Tomahawk).
Important Note: In some XR releases the NP reset after the execution of "monitor np counter" is optional. We strongly recommend always selecting the reset option after running the monitoring. The NP reset will be unconditional starting with XR releases 6.1.4 and 6.2.2.
We're working on further drop troubleshooting enhancements. In the meantime, we hope the two new tools will already make troubleshooting packet drops much easier.
Let us know your comments/questions.