05-08-2011 12:01 PM - edited 03-06-2019 04:57 PM
Hi there,
Recently we saw a dramatic increase in packet drops on the EOBC interface, followed by an err-disable on multiple 10G interfaces. These interfaces were not located on the same linecard, but spread across several cards. There were no apparent errors detected on these interfaces, e.g. UDLD. This has happened twice within a period of about 6 months, and several hardware replacements have been performed. The first time, the chassis and a 6704 linecard were replaced, as these were initially suspected of being the root cause. The next failure caused us to replace the supervisor.
The switch, a 6509-E/Sup720-3B, represents one half of a distribution layer composed of two of these switches. The neighboring switch is an exact match of this one, both with regard to hardware and IOS release, but we've never seen this particular problem on it. Where the failing switch can produce more than 5,000 drops on the EOBC in less than 3 months, the other one has been running for a little over a year and so far we've only recorded some 60 drops.
We've also checked the cables and transceivers on both our 10G linecards, one being a 6704, the other a 6708. No errors of any kind have been registered on any of the interfaces, but as a preventive measure we've replaced the 10G transceivers on the interfaces that were err-disabled on both occasions. We suspect that somehow UDLD results are not passed on to the Sup, as we've seen UDLD error messages in the syslog on both occasions just prior to the failure, but subsequent troubleshooting of UDLD reveals nothing that would indicate a UDLD error.
We've reported both incidents to TAC, and since the last one we haven't seen any problems, but I still see a considerable number of drops on the EOBC interface and fear that this incident will repeat itself. I'm working on some interim countermeasures to work around the problem, should it repeat itself.
But aside from replacing the aforementioned hardware parts, can anyone think of anything else that could cause these kinds of drops?
Any suggestions will be greatly appreciated
Thanx
/Ulrich
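For a rough sense of scale, the two drop counts quoted above can be turned into per-day rates. This is only a minimal sketch: it assumes "less than 3 months" is about 90 days and "a little over a year" is about 365 days, which are approximations rather than figures from the post, and the function name is made up for illustration.

```python
def drops_per_day(drops, days):
    """Average number of EOBC drops per day over an observation window."""
    return drops / days

# Figures from the post; the day counts are assumptions (see above).
failing_rate = drops_per_day(5000, 90)    # failing switch, ~3 months
healthy_rate = drops_per_day(60, 365)     # neighboring switch, ~1 year

# How many times faster the failing switch accumulates drops.
ratio = failing_rate / healthy_rate

print(f"failing switch: {failing_rate:.1f} drops/day")
print(f"healthy switch: {healthy_rate:.2f} drops/day")
print(f"ratio: about {ratio:.0f}x")
```

Under those assumptions the failing switch accumulates drops a few hundred times faster than its neighbor, so the difference is unlikely to be explained by uptime alone.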
05-09-2011 07:19 AM
Can you provide some output verifying the drops, or any error messages you might have encountered?
05-10-2011 05:57 AM
Hi Sunil,
Below is the output from 'sh eobc'
Interface information:
Interface EOBC0/0 (idb = 0x50E9C920)
Hardware is Mistral EOBC (revision 5)
Address is 0000.1500.0000 (bia 0000.1500.0000)
Encap size = 14 hardware status = 0x210840
IDB type = 18 IDB state = 4
Encap type = 0x1 Span encap size = 0
Error threshold = 5000 Error count = 0
Counters:
rxring = 0x8D3C140 rx ring entries = 512
rx_head = 408 rx_tail = 0
inputs = 470239172 rx_cumbytes = 123374641679
hw inputs = 0 hw rx_cumbytes = 0
rx rate (bits/sec) = 255000 rx rate (packets/sec) = 97
rx_buf_unavail = 0 rx input drops = 6058
input broadcast = 31 input resource = 105596932
input error = 0 input giants = 0
input crc = 6058 rx illegal length = 0
rxr eobc shadow = 0x50FAA01C txr eobc shadow = 0x47CD1B40
txring = 0x8D3E180 tx ring entries = 0x200
tx_head = 482 tx_tail = 482
outputs = 465663999 tx_cumbytes = 31068706335
hw outputs = 0 hw tx_cumbytes = 0
tx rate (bits/sec) = 58000 tx rate (packets/sec) = 95
tx_retry_error = 0 tx_retry_count = 101354
tx_process_stopped = 0 tx total drops = 0
Mistral Registers
soft_reset_cfg = 0x000000 dma_buffer_size_reg = 0x000000
int_mask_hi = 0x00007E int_mask_lo = 0xE7001A58
rxdscp_cnt = 512 txdscp_cnt = 0
rxwork_dscp = 0xDAC0 txwork_dscp = 0xF098
mistral_eobc_ds = 0x47BC41B8 mistral_dma_register = 0x30000000
mistral_glbl_reg = 0x10020000
Misc. Global Registers:
global_cfg = 0x20 mis_init_sts = 0xF
dimm_parm_cfg_hi = 0x00000566 dimm_parm_cfg_lo = 0x42040F5A
tm_init_size_cfg = 0x8000
/Ulrich
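To compare these counters between the two switches, or over time, it may help to pull them out of the text programmatically. The following is a minimal sketch, not a Cisco tool: it assumes the output keeps the `name = value` layout shown above, uses only three lines copied from that output, and the names `PAIR` and `parse_counters` are made up for illustration.

```python
import re

# Three lines copied from the 'sh eobc' output above.
SAMPLE = """\
inputs = 470239172 rx_cumbytes = 123374641679
rx_buf_unavail = 0 rx input drops = 6058
input crc = 6058 rx illegal length = 0
"""

# A counter name (which may contain spaces), '=', then a hex or decimal value.
PAIR = re.compile(r"\s*([^=]+?)\s*=\s*(0x[0-9A-Fa-f]+|\d+)")

def parse_counters(text):
    """Return a dict mapping counter names to integer values."""
    counters = {}
    for line in text.splitlines():
        for name, value in PAIR.findall(line):
            base = 16 if value.lower().startswith("0x") else 10
            counters[name] = int(value, base)
    return counters

counters = parse_counters(SAMPLE)

# CRC errors per million received frames.
crc_per_million = counters["input crc"] / counters["inputs"] * 1e6

print(counters["rx input drops"], counters["input crc"])
print(f"{crc_per_million:.1f} CRC errors per million inputs")
```

In this output `rx input drops` and `input crc` are both 6058, which is at least consistent with every recorded input drop being a CRC error, although the counters alone do not prove that.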
05-10-2011 06:37 AM
It is too hard to pinpoint the cause. It might be the supervisor: some non-fatal internal events could cause the software to run out of memory and hence crash.
Here is good documentation on
05-10-2011 09:47 AM
Hi Sunil,
Thanks for the feedback.
However, I'm unable to open the URL; I get a 403 error. I have a valid CCO account, but I'm still unable to read the document. Is there an alternative way to locate it?
/Ulrich
05-11-2011 12:09 AM
That's strange, try this URL
Regards,
Sunil
05-11-2011 12:33 AM
Hi Sunil,
Much better, thanks a million.
/Ulrich