With Xander Thuijs
During this event, Cisco expert Xander Thuijs provides an in-depth overview of the Cisco ASR 9000 Series Aggregation Services Routers. He also presents a packet walkthrough, explains troubleshooting best practices and tips, and discusses quality of service (QoS) implementation and the forwarding architecture.
Xander Thuijs is a principal engineer for the Cisco ASR 9000 Series and the Cisco IOS XR product family at Cisco. He is an expert and advisor in many technology areas, including IP routing, WAN, WAN switching, MPLS, multicast, BNG, ISDN, VoIP, Carrier Ethernet, system architecture, network design, and many others. He has more than 20 years of industry experience in carrier Ethernet, carrier routing, and network access technologies. Xander holds a dual CCIE certification (number 6775) in service provider and voice technologies. He has a master of science degree in electrical engineering from the Hogeschool van Amsterdam.
The following experts helped Xander answer some of the questions asked during the session: Aleksandar Vidako, Sadananda Phadke, and Krishna Eranti. All three are members of the ASR 9000 escalation team and have vast knowledge of the platform.
A. This is answered in the Ask the Expert Event.
A. No, super-framing is implemented in the hardware of the Fabric Interface ASIC (FIA). Also, there is no show command that provides the number of packets aggregated into super-frames. The question has been raised before whether it makes sense, from a troubleshooting point of view, to create such counters; Cisco determined that this was not value-added, so that ability is not available. Super-framing provides the efficiency of fabric forwarding and does not have an impact on performance.
A. That depends on the QoS configuration. Each network processing unit (NPU) has frame memory attached to it, and this frame memory is where packets are buffered. The Trident -L and -B cards can buffer a 50 ms burst of traffic; the Trident -E card can buffer up to a 150 ms burst. The Typhoon base line card has about three times as much buffer as Trident, so roughly 300 ms per NPU, and it is served first-come, first-served. So if a Trident card has one interface on an NPU, that interface can use the full 50 ms or 150 ms of buffering. If you create two sub-interfaces, the 150 ms is shared by those two sub-interfaces, and it then depends on the queue-limit configuration in the QoS policy how much of that buffer is assigned to each interface. You can allow oversubscription, but then you run into packet anarchy.
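For illustration, a minimal IOS XR sketch (the policy name, interface, and values are ours, not from the session) of pinning down a queue's share of the NPU frame memory with an explicit queue-limit:

policy-map EGRESS-BUFFER
 class class-default
  ! cap this queue at 20 ms worth of buffering instead of letting it
  ! compete first-come first-served for the shared frame memory
  queue-limit 20 ms
 !
end-policy-map
!
interface TenGigE0/1/0/0.100
 service-policy output EGRESS-BUFFER
!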
A. This is answered in the Ask the Expert Event.
A. NBAR is not supported.
A. Yes. Both fabric connections from the LC to the RSPs are used to send data at the same time, in a load-balancing fashion.
A. This gives 1:1 redundancy on feed failure (like the AC modules). In this mode you need to ensure that, if you lose one feed, the remaining power bricks can still provide the power required by the cards.
A. This is answered in the Ask the Expert Event.
A. There are no plans to end-of-life (EOL) the RSP2 at this time. The extra RAM is for control-plane processing, the routing table, and so on.
A. Yes. The fabric of both RSPs can be used simultaneously to forward traffic; the fabric is active/active.
A. Multicast is replicated on the MOD160 the same way as on other card types. Modular port adapters (MPAs) do not make L2/L3 forwarding decisions. In general, multicast is replicated by the fabric (to the egress LCs) and, within the egress LC, by the Fabric Interface ASIC (FIA), the bridge (in the case of Trident cards), and the network processor for the egress ports.
A. This is not required. If a packet needs to be dropped because a virtual queue index (VQI) overflows (or because of flow control from the egress LC), it is dropped in the ingress FIA itself. High-priority traffic is always preserved as marked on the ingress interface; VQI flow-off happens on a per-priority basis.
A. The SIP-700 type of line card supports shared port adapters (SPAs).
A. The decision is made on the ingress LC, based on a hash computed from the packet header contents.
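To see which egress path the hash selects for a specific flow, a hedged example (the addresses and location are placeholders) using the exact-route lookup:

show cef exact-route 10.0.0.1 10.0.0.2 location 0/1/CPU0

Layer 4 details (protocol and ports) can also be supplied so the lookup matches the fields your load-balancing configuration actually hashes on.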
A. With per-prefix label allocation, the label is directly associated with a forwarding adjacency, which cannot exist on a BVI. Therefore a lookup must be enforced after the MPLS label is popped; for that reason you need per-CE or per-VRF labels to force that extra lookup.
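A minimal IOS XR sketch (the AS number and VRF name are illustrative) of forcing that extra lookup with per-VRF label allocation:

router bgp 1
 vrf CUSTOMER
  address-family ipv4 unicast
   ! allocate one aggregate label for the whole VRF so an IP
   ! lookup is performed after the label is popped
   label mode per-vrf
  !
 !
!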
A. FAT PW is more useful on core (P) routers, since without FAT PW, even if there are multiple ECMP links, the L2 traffic uses only one ECMP link.
Note that on a PE router, even if you use FAT PW, we still select the egress path based on the inner label before the FAT label is inserted, so the decision is made on the PW label.
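A minimal IOS XR sketch (the pw-class name is illustrative) of enabling the FAT PW flow label so that core routers can balance individual flows:

l2vpn
 pw-class FAT
  encapsulation mpls
   ! push (and expect) an extra flow label derived from the payload
   load-balancing
    flow-label both
   !
  !
 !
!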
A. Cisco has no plans for a CLI command to disable any drop counter.
A. This is answered in the Ask the Expert Event.
A. This is answered in the Ask the Expert Event.
A. No; ISSU is supported starting with XR 4.3.0.
A. The biggest super-frame size is 9K.
A. No. Address Resolution Protocol (ARP) tables, and hence the adjacency tables, are local to a line card. The ingress card only needs to know that the prefix is associated with a certain output interface; the adjacency lookup and L2 rewrites are performed on the egress card. MAC addresses have associated with them the egress LC/ports on which they were learned.
For L2VPN, all MACs learned in every bridge domain are synchronized to the L2 tables of all NPUs.
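To verify this per-LC behavior, hedged examples of show commands that exist in IOS XR (the location is a placeholder):

show arp location 0/1/CPU0
show adjacency location 0/1/CPU0
show l2vpn forwarding bridge-domain mac-address location 0/1/CPU0

Running the same commands against another location shows that the ARP/adjacency entries differ per line card, while the L2VPN MAC tables are populated on every NPU.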
A. This is answered in the Ask the Expert Event.
A. No. Each LC keeps only the ARP and adjacency entries for addresses attached to that line card; these are not exchanged between line cards.
A. It will be line rate if it is put into the MOD160, because the MOD160 provides two NPUs; you can then do one NPU per 4x10G out of that MPA. This MPA is scheduled to be released along with XR 4.3.1 in May 2013.
A. The ASR 9001 has two NPUs, and each NPU serves two of the four on-board 10G ports. Since a Typhoon NPU can do about 60G and 44 million packets per second, the two fixed 10G ports leave 40G of NPU bandwidth available for the modular bay. You can use 1x40G, 4x10G, 2x10G, or 20x1G in the bay without oversubscribing the NPUs.
A. It is not oversubscribed, but just like the MOD80 LC it depends on the features you enable; you may not get the same performance.
With a 4-port 10G MPA you have 6x10G per NPU, the same as on the 36x10G LC.
A. Yes, you can use different generations of line cards in the same chassis.
A. Clustering is a system-wide function; from a logical perspective the cluster is still a single router. VRFs are configured the same way as in a single-chassis configuration.
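A minimal sketch (the VRF name and addresses are illustrative) showing that a VRF spans both racks of a cluster just as it spans slots in one chassis; ports on the peer rack simply carry rack ID 1 in their name:

vrf CUSTOMER
 address-family ipv4 unicast
 !
!
interface TenGigE0/0/0/0
 ! port on rack 0
 vrf CUSTOMER
 ipv4 address 192.0.2.1 255.255.255.252
!
interface TenGigE1/0/0/0
 ! same VRF on a rack 1 port
 vrf CUSTOMER
 ipv4 address 198.51.100.1 255.255.255.252
!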
A. The 8T/L card is an oversubscribed card; it will be line rate for larger packet sizes and limited features with dual RSPs.
The second RSP can be used to reduce the oversubscription level to 1:2. With a single RSP the bandwidth is limited to 46G; with dual RSPs, to 92G.
The NPU is limited to about 15G, so with a single RSP the bottleneck is the fabric links, while with dual RSPs the limit is caused by the NPU.
A. This is answered in the Ask the Expert Event.
A. Yes, all the line cards support IP over DWDM. CWDM can also be done, since there are different optics: you can use colored optics with a fixed wavelength, or tunable optics for which you can configure the wavelength. We also have the ability to do G.709 FEC; however, that requires a software license, and it is supported on all Typhoon line cards but only on the Trident line cards A9K-2T20G and 8x10.
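A hedged IOS XR sketch (the port and channel number are illustrative) of tuning a DWDM optic and enabling G.709 FEC on the controller:

controller dwdm 0/1/0/0
 ! tune a tunable optic to an ITU channel
 wavelength 20
 ! enhanced G.709 forward error correction (requires the license)
 g709 fec enhanced
 admin-state in-service
!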
A. This is answered in the Ask the Expert Event.
A. This is answered in the Ask the Expert Event.
A. Generally, the two nodes of the cluster do what we call rack locality: the local member of the bundle is preferred, or the local ECMP path down to the CE. In those cases the inter-chassis bandwidth requirements are very minimal. Only traffic for single-homed devices uses the Inter Rack Link (IRL), when packets are received on the peer node. So it depends on the number of CEs and their bandwidth requirements, and on how many of them are single-homed; that essentially constitutes your IRL requirement. It is advisable to have a minimum of two 10G links for redundancy, because the IRL is also used for keepalives between the racks.
A. There is a tremendous amount of interest in nV edge clustering; it is a very popular concept. Clustering is rather new, but a lot of customers are deploying it, especially in the US region, where a lot of providers leverage this capability. It is definitely something worth considering and is embraced by the community.
A. Inter-chassis traffic for downstream: no.
A. All ASR 9000 Series chassis (ASR 9922, ASR 9010, ASR 9006, and ASR 9001) can work as an nV host.
Hi, Xander:
I'm currently experiencing a problem with a 2x100G Typhoon card. I have a PW (regular xconnect) that starts at a 100G port, goes through an MPLS/LDP cloud, and ends at an NCS 5501. I have a traffic generator that is trying to inject a single L2 flow above 13G into this port (say 40G). It is my understanding that, as long as you use a single L2 flow (single src-mac/dst-mac), the flow will be assigned to a single NP and anything above 13G will be dropped, because of VQI behavior/limits.
Is there a way to work around this behavior for a single flow? Will it change if I use IP traffic and change the l2vpn load-balancing settings?
Thanks for your help!
Regards,
c.
hi carlos,
the pw starts at the 2x100 typhoon card as an attachment circuit, or as the core transport?
on the PE that starts the pw, you get ether/ip in. if you have the default lb scheme (mac), you produce a hash based on the macs and that will select a single vqi, path, or member.
if you go for l3-src-dest, you take the L3+L4 from the AC and produce a hash on that, and you'll get per-flow load balancing over the same 3 items.
in short, on the 2x100 typhoon card a single flow, that is a unique L3/L4 tuple (in l3 balancing mode) or a single mac combo (in l2 mode), is limited to 13G (you can configure high-bandwidth mode up to 17G or so, but it is still limited).
there is an attractive trade-in program btw for the 2x100 typhoon to help with a migration to the MOD200 with 1x100G MPAs.
xander
Thank you, Xander.
Yes, the 100G port is an AC. And what I gather is: if I'm using a traffic generator, I should create multiple different MAC-based or IP-based flows; there is no other way around it.
If the ACs connect to an internet router on both sides of the PW, should I, at least on the side of the 9904, configure (under l2vpn):
#load-balancing flow ?
src-dst-ip Use source and destination IP addresses for hashing
Correct?
Thanks again!
c.
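For reference, a minimal IOS XR sketch of the configuration being discussed (whether it helps a single flow still depends on the per-NPU limit described above):

l2vpn
 ! hash attachment-circuit traffic on IP source/destination
 ! instead of the default MAC-based scheme
 load-balancing flow src-dst-ip
!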
Hello Xander,
We ran into an issue with forwarding on the LC when we moved a prefix from a static route pointing to NH a.a.a.2 to a BGP route pointing to NH b.b.b.2 on a different subinterface. The RIB and FIB on the box converged, but the FIB entry on the LC got stuck at the old NH, and traffic was wrongly forwarded out of the old subinterface toward NH a.a.a.2.
RP/0/RSP0/CPU0:a9k6#sh route vrf inet x.x.x.0/24
Thu Jul 13 16:36:47.547 CEST
Routing entry for x.x.x.0/24
Known via "bgp 1", distance 20, metric 0
Tag 111, type external
Installed Jul 12 11:35:40.360 for 1d05h
Routing Descriptor Blocks
b.b.b.2, from b.b.b.2, BGP external
Route metric is 0
No advertising protos.
RP/0/RSP0/CPU0:a9k6#sh cef vrf inet x.x.x.0/24
Thu Jul 13 16:42:57.056 CEST
x.x.x.0/24, version 6698698982, internal 0x4000001 (ptr 0xb0d34a14) [1], 0x0 (0x0), 0x0 (0x0)
Updated Jul 12 11:35:40.369
Prefix Len 24, traffic index 0, precedence n/a, priority 3
via b.b.b.2, 7 dependencies, recursive, bgp-ext [flags 0x6020]
path-idx 0 [0xb2200544 0x0]
next hop b.b.b.2 via b.b.b.2/32
RP/0/RSP0/CPU0:a9k6#sh cef vrf inet x.x.x.0/24 location 0/1/CPU0
Thu Jul 13 16:48:55.236 CEST
x.x.x.0/24, version 3115832151, internal 0x4000001 (ptr 0x8a147d64) [1], 0x0 (0x0), 0x0 (0x0)
Updated Nov 9 00:24:24.757
Prefix Len 24, traffic index 0, precedence n/a, priority 3
via a.a.a.2, 3 dependencies, recursive [flags 0x0]
path-idx 0 [0x90220fe4 0x0]
next hop a.a.a.2 via a.a.a.2/32
I have tried several things to reprogram the LC, but with no success; the record is still stuck in the LC hardware.
When I check the CEF summary, the counter for route update drops due to version mismatch increases each time I change the route for this particular prefix:
RP/0/RSP0/CPU0:a9k6#sh cef vrf inet summary location 0/1/CPU0 | in CEF
Fri Jul 14 10:40:34.125 CEST
IP CEF with switching (Table Version 0) for node0_1_CPU0
0 CEF route update drops, 371317788 revisions of existing leaves
16 CEF route update drops due to version mis-match
No other prefixes have this problem; they get programmed into the same LC without issues.
The box is an ASR 9006 with RSP-4G and MOD160-TR + MPA 8x10GE running XR 4.3.4 SP4 (and this is the first problem we've had with the box, which has been running non-stop for 2y25w).
Thank you.
Regards,
Miro
it could be that you are forwarding against the l2 adjacency instead of the rib entry.
check cisco live session id 2904 from san francisco 2014, slide 59 onwards, for some details on that.
xander
Hi Xander,
thank you for the provided info. I've checked the slides, but this did not seem to be a rib vs. l2 adjacency issue. As a fix suggested on c-nsp, I restarted the ipv4_rib process and the FIB on the LC was successfully reprogrammed with the correct next hop.
RP/0/RSP0/CPU0:a9k6#sh cef vrf inet x.x.x.0/24 location 0/1/CPU0
Fri Aug 4 02:13:33.666 CEST
x.x.x.0/24, version 3115832151, internal 0x4000001 (ptr 0x8a147d64) [1], 0x0 (0x0), 0x0 (0x0)
 Updated Nov 9 00:24:29.163
 Prefix Len 24, traffic index 0, precedence n/a, priority 3
   via a.a.a.2, 3 dependencies, recursive [flags 0x0]
    path-idx 0 [0x90220fe4 0x0]
    next hop a.a.a.2 via a.a.a.2/32

RP/0/RSP0/CPU0:a9k6#process restart ipv4_rib location 0/RSP0/CPU0
Fri Aug 4 02:13:35.896 CEST
RP/0/RSP0/CPU0:Aug 4 02:13:35.994 : sysmgr_control[65854]: %OS-SYSMGR-4-PROC_RESTART_NAME : User [me] (con0_RSP0_CPU0) requested a restart of process ipv4_rib at 0/RSP0/CPU0

RP/0/RSP0/CPU0:a9k6#sh processes ipv4_rib detail location 0/RSP0/CPU0
Fri Aug 4 02:14:52.508 CEST
                  Job Id: 1121
                     PID: 374133031
         Executable path: /disk0/iosxr-infra-4.3.4/0x100000/bin/ipv4_rib
              Instance #: 1
              Version ID: 00.00.0000
                 Respawn: ON
           Respawn count: 2
  Max. spawns per minute: 12
            Last started: Fri Aug 4 02:13:36 2017
           Process state: Run (last exit due to SIGTERM)
           Package state: Normal
           Process group: v4-routing
                    core: MAINMEM
               Max. core: 0
                   Level: 170
               Mandatory: ON
               Placement: Placeable
            startup_path: /pkg/startup/ipv4_rib.startup
                   Ready: 0.556s
               Available: 0.571s
            Running path: /disk0/iosxr-infra-4.3.4/0x100000/bin/ipv4_rib
            Package path: /pkg/bin/ipv4_rib
             Job-id-link: 1121
               group_jid: 0,0,0,0
              fail_count: 0
                this pcb: 0xe7ffe848
                next pcb: 0x601ef3e0
             jobid on RP: 0
          Standby Status: 2
              Send Avail: YES
       role_event_clinfo: 5008bc1c
         Proc role state: primary
    cleanup_event_clinfo: 0x5008bc1c
        Process cpu time: 15.035 user, 0.387 kernel, 15.422 total
JID    TID  CPU Stack  pri  state        TimeInState    HR:MM:SS:MSEC   NAME
1121   1    0   188K   10   Receive      0:00:02:0828   0:00:05:0267    ipv4_rib
1121   2    1   188K   10   Receive      0:01:16:0039   0:00:00:0000    ipv4_rib
1121   3    0   188K   10   Receive      0:01:16:0003   0:00:00:0000    ipv4_rib
1121   4    0   188K   10   Receive      0:00:52:0288   0:00:00:0022    ipv4_rib
1121   5    0   188K   10   Sigwaitinfo  0:01:15:0698   0:00:00:0000    ipv4_rib
1121   6    0   188K   10   Receive      0:01:00:0557   0:00:02:0828    ipv4_rib
1121   7    0   188K   10   Receive      0:01:15:0686   0:00:00:0000    ipv4_rib
1121   8    0   188K   10   Receive      0:01:15:0685   0:00:00:0000    ipv4_rib
1121   9    0   188K   10   Receive      0:01:15:0685   0:00:00:0000    ipv4_rib
1121   10   0   188K   10   Receive      0:01:15:0684   0:00:00:0000    ipv4_rib
1121   11   0   188K   10   Receive      0:01:15:0683   0:00:00:0000    ipv4_rib
1121   12   0   188K   10   Receive      0:00:30:0514   0:00:00:0000    ipv4_rib
1121   13   1   188K   10   Receive      0:00:02:0996   0:00:05:0460    ipv4_rib
1121   14   0   188K   10   Receive      0:01:00:0721   0:00:00:0000    ipv4_rib
1121   15   0   188K   10   Receive      0:00:03:0104   0:00:01:0612    ipv4_rib
1121   16   0   188K   10   Receive      0:01:01:0014   0:00:00:0002    ipv4_rib
1121   17   0   188K   10   Receive      0:00:02:0889   0:00:00:0224    ipv4_rib
1121   18   0   188K   10   Receive      0:00:48:0738   0:00:00:0001    ipv4_rib
1121   19   0   188K   10   Receive      0:00:10:0688   0:00:00:0001    ipv4_rib
-------------------------------------------------------------------------------
RP/0/RSP0/CPU0:a9k6#sh cef vrf inet x.x.x.0/24 location 0/1/CPU0
Fri Aug 4 02:15:30.831 CEST
x.x.x.0/24, version 479658, internal 0x4000001 (ptr 0x8a147d64) [1], 0x0 (0x0), 0x0 (0x0)
 Updated Aug 4 02:14:07.267
 Prefix Len 24, traffic index 0, precedence n/a, priority 3
   via b.b.b.2, 8 dependencies, recursive, bgp-ext [flags 0x6020]
    path-idx 0 [0x8a1443e4 0x0]
    next hop b.b.b.2 via b.b.b.2/32
The process restart was hitless; traffic was forwarded as usual throughout.
Regards,
Miro
hi miro,
ah ok, good to hear the issue is resolved. but did we happen to check whether the cef entry on the lc was showing good data? if not, RCC could have caught that.
on the point of the nh inconsistency, i have seen that on 434 before.
i'll try to see if i can find an actual precise ddts on that, but note that it may be a good idea to consider a recent emr such as 534 or 614.
cheers!
xander
The RIB and CEF on the RP were showing the good next hop; CEF on the LC (sh cef vrf inet x.x.x.0/24 location 0/1/CPU0) and the HW FIB (sh cef vrf inet x.x.x.0/24 hardware ingress/egress location 0/1/CPU0) were both showing the wrong next hop. Do you reckon RCC or LCC could have fixed that as well, if run? As I pasted previously, CEF on the LC failed to reprogram the next hop due to a version mismatch for this particular route entry:
16 CEF route update drops due to version mis-match
Thanks.
yeah, RCC will check the route consistency between the RP and the LC. since you have a mismatch there, RCC can help.
but this problem was known in 434 as i recall. if you are not SMU current, it might be good to load all the latest.
xander
Thanks Xander, I will try RCC on the next FIB update failure (it might take a while, as this was the first forwarding problem we have had on 10+ A9Ks running 4.3.4 over two years :)
We are in the process of upgrading to 5.3.4, so hopefully we won't see it at all.
Hello Xander @xthuijs
I have a cluster which is missing the 'services-infra' package and the mgbl package. My question is: in order to install those two packages, will it require the system to restart, or will it be hitless?
Hello Xander! @xthuijs
I'm a little confused about FIB convergence time on the ASR 9000. The PE receives the full internet table (650k prefixes) from a peer in a VRF. When the BGP session goes down (admin down), packets continue going toward this peer for about 15 minutes. I found a hint of the problem in the rib trace:
Oct 16 06:24:14.825 rib/ipv4_rib/rib-bcdl 0/RSP0/CPU0 t15 Bcdl (0x1): update group frozen 1 context max buffers exceeded
Oct 16 06:24:29.483 rib/ipv4_rib/rib-bcdl 0/RSP0/CPU0 t9 Bcdl (0x1): update group unfrozen 0 context resume callback rcvd
... and these messages repeated until the FIB on the LC had downloaded all updates:
Oct 16 06:41:20.487 rib/ipv4_rib/rib-bcdl 0/RSP0/CPU0 t15 Bcdl (0x15): bcdl_agent unregistering from all tables
For the RSP, FIB convergence time is about 6 minutes.
What is the real speed of FIB programming on the NP? And why is there such a difference between the convergence time of the RSP FIB and the LC FIB?
ASR-9001
Cisco IOS XR Software, Version 5.3.3
We use 'advertise best-external' and 'additional-paths selection' with the policy 'set path-selection backup 1 install' for BGP PIC, and when an interface goes down, convergence time is less than 10 seconds...
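For reference, a minimal IOS XR sketch of the BGP PIC configuration described above (the AS number, VRF name, and policy name are illustrative):

route-policy PIC-BACKUP
  ! install the second-best path as a precomputed backup
  set path-selection backup 1 install
end-policy
!
router bgp 1
 vrf CUSTOMER
  address-family ipv4 unicast
   advertise best-external
   additional-paths selection route-policy PIC-BACKUP
  !
 !
!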
Best regards,
Niki.
Hello,
What is the command to check how long an interface has been down on a Cisco ASR router?
Hi Xander, how are you? I have some questions referring to how an LNS works.
Do you know of any standard, industry agreement, or IETF RFC that declares or explains the following: when the LNS has received the AVPs (in fact AVP 100) from the LAC through an L2TP tunnel, the LNS reports the NAS-Port-Type to RADIUS as Virtual: NAS-Port-Type [61] 6 Virtual [5].
Or are all these matters proprietary definitions?
Is there a mechanism in the LNS, a way to manipulate the value of this attribute and send another value to RADIUS? What does Virtual mean here?
I look forward to hearing from you.
Best regards,
Javier