04-01-2020 05:18 AM
We noticed utilization spikes on one of our port channels yesterday, but found that the traffic did not actually reach the levels indicated. The two physical interfaces that make up the port channel did not show these spikes, but EOBC0/0 did show corresponding spikes. To rule out MRTG as the source, we checked SolarWinds, which showed the same spikes on the interface. Our concern is that there may be a physical issue in the fabric, or possibly an IOS issue with SNMP.
As we were digging into this, I noticed that our 6704 line cards appear to have only two 8G fabric connections. Since our port channel spans two 6704s, could this cause an issue as we start to push toward 16G of traffic? Should we have our in and out 20G port channels all on one 6704?
Attached are the MRTG graphs, as well as the 7600 architecture document. "show mod" is below:
Mod Ports Card Type                              Model              Serial No.
--- ----- -------------------------------------- ------------------ -----------
  2     8 8 port 1000mb GBIC Enhanced QoS        WS-X6408A-GBIC     SAL0827B44T
  3     8 8 port 1000mb GBIC Enhanced QoS        WS-X6408A-GBIC     SAL0718CF0W
  4     4 CEF720 4 port 10-Gigabit Ethernet      WS-X6704-10GE      SAL1114KYME
  5     2 Route Switch Processor 720 (Active)    RSP720-3CXL-GE     SAL17299PHR
  6     2 Route Switch Processor 720 (Hot)       RSP720-3CXL-GE     SAL17299PGC
  9     4 CEF720 4 port 10-Gigabit Ethernet      WS-X6704-10GE      SAL1528JG4W

Mod MAC addresses                      Hw    Fw            Sw           Status
--- ---------------------------------- ----- ------------- ------------ -------
  2 0011.5c33.13d8 to 0011.5c33.13df   3.5   5.4(2)        (sierra_main Ok
  3 000c.ce57.bc10 to 000c.ce57.bc17   3.5   5.4(2)        (sierra_main Ok
  4 001a.a10e.bbf0 to 001a.a10e.bbf3   2.5   12.2(14r)S    15.2(4)S     Ok
  5 2894.0fd1.816c to 2894.0fd1.816f   5.14  12.2(33r)SR   15.2(4)S     Ok
  6 2894.0fd1.8170 to 2894.0fd1.8173   5.14  12.2(33r)SR   15.2(4)S     Ok
  9 30e4.db78.7e84 to 30e4.db78.7e87   3.2   12.2(14r)S    15.2(4)S     Ok

Mod  Sub-Module                  Model              Serial      Hw      Status
---- --------------------------- ------------------ ----------- ------- -------
   4 Distributed Forwarding Card WS-F6700-DFC3B     SAD11060CLM 4.4     Ok
   5 Policy Feature Card 3       7600-PFC3CXL       SAL17299LM7 1.2     Ok
   5 C7600 MSFC4 Daughterboard   7600-MSFC4         SAL17299NNL 5.0     Ok
   6 Policy Feature Card 3       7600-PFC3CXL       SAL17141Y1M 1.2     Ok
   6 C7600 MSFC4 Daughterboard   7600-MSFC4         SAL17299NNE 5.0     Ok
   9 Distributed Forwarding Card WS-F6700-DFC3B     SAL1528JFLH 4.9     Ok
04-01-2020 05:19 AM
04-01-2020 08:43 AM
04-01-2020 08:55 AM
All our traffic that matters is traversing only the 6704's. Every 2-port port channel is split across the two 6704's.
04-02-2020 01:12 AM - edited 04-02-2020 01:19 AM
Hello @philclemens1835 ,
I agree with @Joseph W. Doherty: your 6704 line cards are fabric-enabled and each also hosts a DFC, so they do not need to consult the supervisor's CEF table, because the DFC holds a complete copy of it.
Your network graphs are showing that the port-channel is able to reach 19.53 Gbps in one direction.
As noted previously, you can run into issues only if traffic over the port-channel is destined to or originates from the two older 8-port 1 GE line cards (WS-X6408A-GBIC), because that would force the use of the shared bus.
About the spikes on the EOBC: because CEF is distributed across the two 6704 line cards, every event that causes the supervisor to send a complete CEF table to the line cards will produce a spike on the EOBC. This does not mean that user traffic is being dropped.
To improve the performance of your C7600 chassis, you should replace the 6408 modules with 6708 line cards fitted with the correct DFC.
Edit:
I have reviewed your show module output again, and there is an issue with the DFC sub-modules installed on the 6704 line cards: the DFC type is listed as DFC3B, but your supervisor's PFC is a 3CXL. This means the supervisor can build a larger CEF table than the one the DFCs can hold.
This is very dangerous, because if that happens some IP prefixes will be process switched, driving the supervisor CPU to 100%. The DFC type must match the supervisor type for this reason.
This may be the root cause of the excessive traffic on the EOBC interface.
Your current hardware configuration is not recommended, and there is no way to fix it other than replacing the DFCs with the correct type, DFC3CXL.
Hope this helps
Giuseppe
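The DFC/PFC comparison described above can be sketched in a few lines of Python. This is only an illustration, not a Cisco tool: it reuses the sub-module rows from the "show mod" output earlier in this thread, and the function names (forwarding_engines, mismatched_dfcs) are hypothetical. It assumes the forwarding-engine level can be read from the model string (for example "DFC3B" or "PFC3CXL").

```python
import re

# Sub-module rows copied from the "show mod" output earlier in this thread.
SUB_MODULES = """\
4 Distributed Forwarding Card WS-F6700-DFC3B SAD11060CLM 4.4 Ok
5 Policy Feature Card 3 7600-PFC3CXL SAL17299LM7 1.2 Ok
5 C7600 MSFC4 Daughterboard 7600-MSFC4 SAL17299NNL 5.0 Ok
6 Policy Feature Card 3 7600-PFC3CXL SAL17141Y1M 1.2 Ok
6 C7600 MSFC4 Daughterboard 7600-MSFC4 SAL17299NNE 5.0 Ok
9 Distributed Forwarding Card WS-F6700-DFC3B SAL1528JFLH 4.9 Ok
"""

# Assumption: the engine level is the suffix after "DFC"/"PFC" in the model
# name, e.g. "DFC3B" -> "3B", "PFC3CXL" -> "3CXL".
ENGINE_RE = re.compile(r"(DFC|PFC)(3[A-Z]+)")


def forwarding_engines(text):
    """Return (slot, kind, level) for every PFC or DFC row in the text."""
    found = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or not fields[0].isdigit():
            continue  # skip headers, separators, and blank lines
        for field in fields[1:]:
            match = ENGINE_RE.search(field)
            if match:
                found.append((int(fields[0]), match.group(1), match.group(2)))
                break
    return found


def mismatched_dfcs(text):
    """Return (slot, level) for each DFC whose level matches no installed PFC."""
    engines = forwarding_engines(text)
    pfc_levels = {level for _, kind, level in engines if kind == "PFC"}
    return [
        (slot, level)
        for slot, kind, level in engines
        if kind == "DFC" and level not in pfc_levels
    ]


if __name__ == "__main__":
    for slot, level in mismatched_dfcs(SUB_MODULES):
        print(f"slot {slot}: DFC{level} does not match the supervisor PFC")
```

Run against the rows above, this flags the DFCs in slots 4 and 9, which is the mismatch pointed out in the edit.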
04-02-2020 05:52 AM
This is great info, and much appreciated. Fortunately in this case we have fairly small tables on this box.
Even more fortunately, we have a couple of ASR 9904's on order to replace this router and its sibling!
While I have your attention on this, please check out the graphs that show high bandwidth on the port channel but not on the two member interfaces, Te4/1 and Te9/1. It looks as if the utilization on the port channel is being misreported. Have you ever seen this behavior?
Thanks,
Phil
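One way to cross-check a graph like this is to recompute the rates from raw counter samples and compare the port channel against the sum of its members. The sketch below is a minimal illustration, assuming the poller reads octet counters such as ifHCInOctets (64-bit) or ifInOctets (32-bit); the function names and the sample values in the usage note are hypothetical and do not come from this router. It does not identify the cause of the spikes; it only shows the arithmetic a poller would normally perform.

```python
def rate_bps(prev_octets, curr_octets, interval_s, counter_bits=64):
    """Average bits per second between two octet-counter samples.

    The modulo handles a single counter wrap between the two samples.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = (curr_octets - prev_octets) % (2 ** counter_bits)
    return delta * 8 / interval_s


def port_channel_discrepancy(pc_bps, member_bps):
    """Relative difference between the port-channel rate and the member sum.

    0.0 means the port channel matches its members; 0.5 means the port
    channel reports 50% more than the members combined.
    """
    total = sum(member_bps)
    if total == 0:
        return 0.0 if pc_bps == 0 else float("inf")
    return (pc_bps - total) / total
```

For example, with made-up samples taken 300 seconds apart, rate_bps(1_000_000_000, 38_500_000_000, 300) gives 1 Gbps for one member. If both members report 1 Gbps while the port channel reports 3 Gbps, port_channel_discrepancy(3e9, [1e9, 1e9]) returns 0.5, which would point at the port-channel counter or the poller rather than at real traffic.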
04-02-2020 08:15 AM