Erroneous Bandwidth Reporting - 7609 - EOBC Issue?

philclemens1835
Level 1

We noticed utilization spikes on one of our port channels yesterday, but discovered that the traffic did not actually spike to the levels indicated. In fact, the two physical interfaces that make up the port channel did not reflect these spikes, but EOBC0/0 did show associated spikes. To rule out MRTG as the cause, we checked SolarWinds, which showed the same spikes on the interface. Our concern is that there may be a physical issue in the fabric, or maybe an IOS issue with SNMP.

 

As we were digging into this, I noticed that our 6704 line cards appear to have only two 8G fabric connections. Since we are spanning our port channel across two 6704s, could this cause an issue as we start to push towards 16G of traffic? Should we have our in and out 20G port channels all on one 6704?

 

Attached are the MRTG graphs, as well as the 7600 architecture document.  "show mod" is below:

 

 

Mod Ports Card Type                              Model              Serial No.
--- ----- -------------------------------------- ------------------ -----------
  2     8 8 port 1000mb GBIC Enhanced QoS        WS-X6408A-GBIC     SAL0827B44T
  3     8 8 port 1000mb GBIC Enhanced QoS        WS-X6408A-GBIC     SAL0718CF0W
  4     4 CEF720 4 port 10-Gigabit Ethernet      WS-X6704-10GE      SAL1114KYME
  5     2 Route Switch Processor 720 (Active)    RSP720-3CXL-GE     SAL17299PHR
  6     2 Route Switch Processor 720 (Hot)       RSP720-3CXL-GE     SAL17299PGC
  9     4 CEF720 4 port 10-Gigabit Ethernet      WS-X6704-10GE      SAL1528JG4W

Mod MAC addresses                      Hw    Fw            Sw           Status
--- ---------------------------------- ----- ------------- ------------ -------
  2 0011.5c33.13d8 to 0011.5c33.13df   3.5   5.4(2)        (sierra_main Ok
  3 000c.ce57.bc10 to 000c.ce57.bc17   3.5   5.4(2)        (sierra_main Ok
  4 001a.a10e.bbf0 to 001a.a10e.bbf3   2.5   12.2(14r)S    15.2(4)S     Ok
  5 2894.0fd1.816c to 2894.0fd1.816f   5.14  12.2(33r)SR   15.2(4)S     Ok
  6 2894.0fd1.8170 to 2894.0fd1.8173   5.14  12.2(33r)SR   15.2(4)S     Ok
  9 30e4.db78.7e84 to 30e4.db78.7e87   3.2   12.2(14r)S    15.2(4)S     Ok

Mod  Sub-Module                  Model              Serial      Hw      Status
---- --------------------------- ------------------ ----------- ------- -------
  4  Distributed Forwarding Card WS-F6700-DFC3B     SAD11060CLM 4.4     Ok
  5  Policy Feature Card 3       7600-PFC3CXL       SAL17299LM7 1.2     Ok
  5  C7600 MSFC4 Daughterboard   7600-MSFC4         SAL17299NNL 5.0     Ok
  6  Policy Feature Card 3       7600-PFC3CXL       SAL17141Y1M 1.2     Ok
  6  C7600 MSFC4 Daughterboard   7600-MSFC4         SAL17299NNE 5.0     Ok
  9  Distributed Forwarding Card WS-F6700-DFC3B     SAL1528JFLH 4.9     Ok

 

 


philclemens1835
Level 1

MRTG Files attached this time.

Joseph W. Doherty
Hall of Fame
The 6704 line cards are dual 20G fabric cards. However, your 6408A line cards are classic bus, which means your chassis is operating in mixed mode, and traffic between your two types of line cards needs to transit your supervisor. Where does the traffic on your port-channel go to/from?

All our traffic that matters is traversing only the 6704's.  Every 2-port port channel is split across the two 6704's.
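For other readers, the capacity question can be sanity-checked with simple arithmetic. The sketch below is only an illustration: it assumes the "dual 20G" figure given above (two fabric channels of 20G per 6704), assumes each of the two 20G port channels contributes one 10G member per card, and ignores how individual ports map to fabric channels, which is not stated in this thread. The function name `card_headroom_gbps` is made up for the example.

```python
def card_headroom_gbps(member_ports_gbps, fabric_channels=2, channel_gbps=20):
    """Fabric capacity left on one line card after its port-channel members
    run at line rate. Defaults assume a dual 20G fabric card (assumption
    taken from the reply above, not measured on this chassis)."""
    capacity = fabric_channels * channel_gbps
    offered = sum(member_ports_gbps)
    return capacity - offered


# One 10G member from the "in" port channel and one from the "out"
# port channel land on each 6704 when every bundle is split across cards.
per_card_members = [10, 10]
print(card_headroom_gbps(per_card_members))
```

Under these assumptions, splitting each port channel across the two cards leaves fabric headroom on both cards even with all members at line rate.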

 

 

Hello @philclemens1835 ,

I agree with @Joseph W. Doherty: your 6704 line cards are fabric enabled, and they also host a DFC, so they do not need to consult the supervisor's CEF table, because the DFC contains a complete copy of it.

Your network graphs are showing that the port-channel is able to reach 19.53 Gbps in one direction.

As noted previously, you can have issues only if the traffic over the port-channel is destined to or originated by the two older 8-port 1 GE line cards, because that would cause the use of the shared bus.

About the spikes on the EOBC: because CEF is distributed across the two 6704 line cards, every event that causes the supervisor to send a complete CEF table to the line cards will cause a spike on the EOBC, but this does not mean that user traffic will be dropped.

 

To improve your 7600 chassis performance, you should replace the 6408A modules with 6708 line cards fitted with the correct DFC.

Edit:

I have reviewed your show module output again, and there is an issue with the DFC sub-modules installed on your 6704 line cards: the DFC type is DFC3B, but your supervisor type is 3CXL. This means that the supervisor can build a larger CEF table than the one that can be hosted on the DFCs.

This is very dangerous, because if this happens some IP prefixes will be process switched, causing the supervisor CPU to go to 100%. The DFC type must match the supervisor type for this reason.

This may be the root cause of the excessive traffic over the EOBC interface.

Your current HW configuration is not recommended, and there is no way to fix it other than replacing the DFCs with the correct type, DFC3CXL.
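For anyone who wants to check this on other chassis, here is a rough Python sketch that scans the sub-module section of "show module" output and lists DFCs whose type differs from the supervisor PFC type. It is only an illustration: the helper names are made up, and it assumes the model strings end in "DFC3..." or "PFC3..." as they do in the output posted in this thread.

```python
import re

# Matches the forwarding-engine suffix of a model string,
# e.g. "WS-F6700-DFC3B" -> ("DFC", "3B"), "7600-PFC3CXL" -> ("PFC", "3CXL").
MODEL_RE = re.compile(r"(DFC|PFC)(3[A-Z]+)$")


def forwarding_engines(submodule_text):
    """Return (module, kind, type) tuples found in the sub-module table."""
    engines = []
    for line in submodule_text.splitlines():
        tokens = line.split()
        if not tokens or not tokens[0].isdigit():
            continue  # skip headers, separators, and blank lines
        for token in tokens[1:]:
            match = MODEL_RE.search(token)
            if match:
                engines.append((int(tokens[0]), match.group(1), match.group(2)))
                break
    return engines


def mismatched_dfcs(engines):
    """List (module, dfc_type) for DFCs that match no supervisor PFC type."""
    pfc_types = {etype for _, kind, etype in engines if kind == "PFC"}
    return [(mod, etype) for mod, kind, etype in engines
            if kind == "DFC" and etype not in pfc_types]


SAMPLE = """\
Mod Sub-Module Model Serial Hw Status
4 Distributed Forwarding Card WS-F6700-DFC3B SAD11060CLM 4.4 Ok
5 Policy Feature Card 3 7600-PFC3CXL SAL17299LM7 1.2 Ok
5 C7600 MSFC4 Daughterboard 7600-MSFC4 SAL17299NNL 5.0 Ok
6 Policy Feature Card 3 7600-PFC3CXL SAL17141Y1M 1.2 Ok
9 Distributed Forwarding Card WS-F6700-DFC3B SAL1528JFLH 4.9 Ok
"""

print(mismatched_dfcs(forwarding_engines(SAMPLE)))
```

With the output from this thread, both DFCs (modules 4 and 9) are reported, since they are 3B while the supervisors are 3CXL.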

 

Hope to help

Giuseppe

 

@Giuseppe Larosa 

 

This is great info, and much appreciated.  Fortunately in this case we have fairly small tables on this box.

 

Even more fortunately, we have a couple of ASR 9904's on order to replace this router and its sibling!

 

While I have your attention on this, please check out the graphs that indicate high bandwidth on the port channel, but not on the two member interfaces, Te4/1 and Te9/1. It looks like the utilization on the port channel is being mis-reported. Have you ever seen this behavior?
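To make the comparison concrete, this is a minimal Python sketch of the check described here: turn two octet-counter samples into a bit rate, then compare the port-channel rate against the sum of its members. The function names and the 5% tolerance are assumptions for illustration, not something MRTG or SolarWinds exposes. Counter wrap is handled because a 32-bit octet counter wraps in roughly 3.4 seconds at 10 Gb/s, although nothing in this thread confirms that wrap is the cause here.

```python
def counter_delta(prev, curr, bits=64):
    """Difference between two counter samples, allowing for one wrap."""
    return (curr - prev) % (1 << bits)


def rate_bps(prev_octets, curr_octets, interval_s, bits=64):
    """Average bit rate between two octet-counter samples."""
    return counter_delta(prev_octets, curr_octets, bits) * 8 / interval_s


def bundle_discrepancy(bundle_bps, member_bps, tolerance=0.05):
    """Return (suspicious, ratio) comparing a bundle to the sum of members.

    ratio is bundle rate divided by the summed member rates; the result is
    flagged when it differs from 1.0 by more than the tolerance.
    """
    total = sum(member_bps)
    if total == 0:
        return bundle_bps > 0, float("inf") if bundle_bps > 0 else 1.0
    ratio = bundle_bps / total
    return abs(ratio - 1.0) > tolerance, ratio


# Hypothetical numbers: the bundle reports 19.5 Gb/s while the two
# members together carry 8.2 Gb/s over the same polling interval.
print(bundle_discrepancy(19.5e9, [4.0e9, 4.2e9]))
```

A ratio well above 1.0 over the same interval points at the reporting of the port-channel counters rather than at real traffic.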

 

Thanks,

Phil

Giuseppe, great catch on the DFC and sup mismatch! Besides the difference between the sups being XLs and the DFCs not, they are different gens too, 3B vs. 3C (although you can mix them, it causes the system to run in 3B mode).

Although OP mentions having ASR9904s to replace these 7600s, for other readers, since there's only the two fabric cards, another option might be to replace the DFCs with CFCs (as the DFCs are a bit pricey and they may not be truly needed in this case).