strange BUFFER_THRESHOLD_EXCEEDED log on N9K

from88
Level 4

Hello,

 

We have this type of device:

Software
  BIOS: version 05.39
  NXOS: version 7.0(3)I7(8)
  BIOS compile time:  08/30/2019
  NXOS image file is: bootflash:///nxos.7.0.3.I7.8.bin
  NXOS compile time:  3/3/2020 20:00:00 [03/04/2020 04:49:49]


Hardware
  cisco Nexus9000 C93240YC-FX2 Chassis 
  Intel(R) Xeon(R) CPU D-1526 @ 1.80GHz with 24571696 kB of memory.
  Processor Board ID FDO24080WJB

  Device name: CORE01
  bootflash:  115805708 kB
Kernel uptime is 259 day(s), 20 hour(s), 12 minute(s), 34 second(s)

 

 

Sometimes log messages like this appear:

%TAHUSD-SLOT1-4-BUFFER_THRESHOLD_EXCEEDED: Module 1 Instance 0 Pool-group buffer 90 percent threshold is exceeded!

Could the bug described here: https://www.cisco.com/c/en/us/td/docs/switches/datacenter/nexus9000/sw/7-x/release/notes/70379_nxos_rn.html and here: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCvu69850 be the reason why the new 100G link sometimes flaps? Or is this bug unrelated to the flapping itself, and the switch just reacts to it incorrectly? Thanks

 

7 Replies

marce1000
VIP

>....why new 100G link goes sometimes flaps

Probably not. Check the logs on the switch when this happens and look for any additional information. As far as TAHUSD-SLOT1-4-BUFFER_THRESHOLD_EXCEEDED is concerned, you may try the recommended software version mentioned in this document:

https://www.cisco.com/c/en/us/td/docs/switches/datacenter/nexus9000/sw/recommended_release/b_Minimum_and_Recommended_Cisco_NX-OS_Releases_for_Cisco_Nexus_9000_Series_Switches.html

Then check whether the problem persists afterwards.

 

 M.



-- Each morning when I wake up and look into the mirror I always say ' Why am I so brilliant ? '
    The mirror then always responds with ' The only thing that exceeds your brilliance is your beauty! '

Hi @marce1000, I have this exact issue even after upgrading NX-OS to 9.3(8) as recommended.

What's the command to see the current configured buffer depth on the interface?

Hello!

The show hardware internal buffer info pkt-stats command, run after attaching to the relevant module with the attach module <x> command, displays an instantaneous snapshot of each ASIC slice's buffer utilization.

For more information about the BUFFER_THRESHOLD_EXCEEDED syslog, I highly recommend reviewing the document "Understand the TAHUSD BUFFER_THRESHOLD_EXCEEDED Syslog and Congestion on Nexus 9000 Cloud Scale ASIC NX-OS Switches". It explains what this syslog means and how you can identify congested egress interfaces on Cisco Nexus 9000 Series switches with the Cloud Scale ASIC.
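If you collect these syslogs on a log server, it can help to see which module and ASIC instance they keep pointing at before you attach to the module. Below is a minimal sketch in Python, assuming the message format is exactly the one quoted in this thread; the function names parse_buffer_alert and count_by_instance are hypothetical helpers, not part of any Cisco tool.

```python
import re
from collections import Counter

# Pattern built from the syslog text quoted in this thread (an assumption:
# other NX-OS releases may word the message differently).
PATTERN = re.compile(
    r"%TAHUSD-SLOT(?P<slot>\d+)-4-BUFFER_THRESHOLD_EXCEEDED: "
    r"Module (?P<module>\d+) Instance (?P<instance>\d+) "
    r"Pool-group buffer (?P<percent>\d+) percent threshold is exceeded"
)


def parse_buffer_alert(line):
    """Return the numeric fields of the syslog as a dict, or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}


def count_by_instance(lines):
    """Count alerts per (module, instance) pair across many log lines."""
    counts = Counter()
    for line in lines:
        alert = parse_buffer_alert(line)
        if alert is not None:
            counts[(alert["module"], alert["instance"])] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "%TAHUSD-SLOT1-4-BUFFER_THRESHOLD_EXCEEDED: Module 1 Instance 0 "
        "Pool-group buffer 90 percent threshold is exceeded!",
        "%TAHUSD-SLOT1-4-BUFFER_THRESHOLD_EXCEEDED: Module 1 Instance 3 "
        "Pool-group buffer 90 percent threshold is exceeded!",
    ]
    print(count_by_instance(sample))
```

The counts only tell you which instance is reporting the threshold most often; identifying the congested egress interface still requires the switch-side commands and the document mentioned above.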

I hope this helps - thank you!

-Christopher

Something is definitely up with these later updates. We were running stable on a much older version of NX-OS, but when we updated to 9.3(7) our switches became unstable to the point that traffic wasn't passing as expected, disrupting both services and storage traffic.

We were running stable for up to 600 days on the much older OS version.

9.3(8) is just as unstable as 9.3(7), and there is definitely something wrong in it causing our switches to slow down, drop traffic, and spam buffer-full logs. Our network architecture has not changed much since before the update, with the exception of 10-20 additional 10G links, which we had capacity for in our 2 x 2 side-by-side VPC design.

Due to how my company interacts with Cisco I'm unable to open a TAC case directly right now, but this needs escalation: the error now appears within hours of a reboot, whereas the first instance showed up 4 months after the first reboot.

Started seeing this error on a C93240YC-FX2 running 9.3(5).

We updated past 9.3(x) and the issue went from affecting 10 ports to one.
Our current suspicion for our issue is the SDN causing packet splitting when going from a 1500 MTU to a 14xx MTU.

Hope this helps!

kwuenP
Level 1

We have the same issue with a cisco Nexus9000 C9364C Chassis:

Software
BIOS: version 05.44
NXOS: version 9.3(8)
BIOS compile time: 04/02/2021
NXOS image file is: bootflash:///nxos.9.3.8.bin
NXOS compile time: 8/4/2021 13:00:00 [08/05/2021 05:25:26]

Whenever the syslog "%TAHUSD-SLOT1-4-BUFFER_THRESHOLD_EXCEEDED: Module 1 Instance 3 Pool-group buffer 90 percent threshold is exceeded!" is generated, the switch drops traffic and causes severe traffic loss. We tried reloading the switch, but it still happens afterwards.
