09-13-2016 09:53 PM - edited 03-05-2019 07:03 AM
OK, I installed brand new 3650 L3 switches last week, replacing some older Cisco 3750s. I've been adding and removing VLANs and a large number of IPv4 addresses without issues. Observium shows less than 20% CPU/RAM usage, so everything is fine there. For some oddball reason tonight, when I tried to add a /29 subnet as a secondary subnet to one of the existing VLANs, my entire network slowed to a crawl the very second I did it, and it stayed that way for over 15 minutes. Literally everything public is affected, but my own access to the switch remains normal (?). The very second I remove the secondary subnet, the network immediately returns to normal. NOT NORMAL!
I replicated this with different VLANs and also with entirely different class C subnets. Same issue. Very strange, and I never saw this happen even on the old 3750s. Why is this happening suddenly on the new, far more powerful 3650s, especially after a week with no problems at all? I'm baffled and need to know.
Any help would greatly be appreciated.
Info:
- replicated on 2 different VLANs and 2 entirely different class C ranges
- CPU 20%
- 2 x 3650 switches stacked
- not a network backbone issue at all; total throughput is between 80 and 150 Mbps, and my burstable limit is 300 Mbps.
- been using these switches since Saturday without issues
- Model: WS-C3650-24TS (both switches)
09-14-2016 05:12 AM
Are you able to run a debug while adding the /29?
Could it be a case of a duplicate IP configured on another node?
If not, it is indeed odd!
09-14-2016 07:53 AM
Not duplicate IPs at all.
Here's what I found out from Cisco tech, not sure how to resolve this.
++ You were adding a secondary IP address on an interface VLAN. Once you applied the command, the network became slow and you got the following errors:

*Sep 14 09:03:48.129: %L3UNICAST_ERRMSG-3-fib_err: 1 fed: FIB installation failure for unicast route: No Hardware Resource Available
*Sep 14 09:03:57.958: %L3UNICAST_ERRMSG-3-fib_err: STANDBY:2 fed: FIB installation failure for unicast route: No Hardware Resource Available (Cisco3650-2)

++ We took packet captures before and after applying the commands; we can clearly see that there are TCP retransmissions after the connection malfunctioned.

++ I checked the TCAM utilization and noticed that "Directly or indirectly connected routes" is almost full:

CAM Utilization for ASIC# 0
Table                                        Max Values    Used Values
--------------------------------------------------------------------------------
Directly or indirectly connected routes      16384/7168    16127/7168

++ I checked the SDM templates available for this platform and found that both the "advanced" template and the "VLAN" template allocate the same number for Directly or indirectly connected routes.

++ Checking this more deeply, we are hitting a hardware limitation with the 3650. There are two different SDM templates you can have for the 3650, and they are referenced in the link below. From the documentation, though, there is no template that will support more than 7168 masks, and you are exceeding that value. To get around this you are going to have to better summarize your routes and get the number of indirectly connected routes down.

http://www.cisco.com/c/en/us/td/docs/switches/lan/catalyst3650/software/release/3se/system_management/configuration_guide/b_sm_3se_3650_cg/b_sm_3se_3650_cg_chapter_01100.html
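As a rough illustration of what "summarize your routes" means in the tech's note, here is a minimal Python sketch using the standard `ipaddress` module. The prefixes are hypothetical (taken from the 192.0.2.0/24 documentation range), not the subnets on these switches, and the helper name `summarize` is made up for this example. It only shows how adjacent prefixes collapse into fewer entries; whether the hardware table shrinks the same way is an assumption, not something confirmed in this thread.

```python
import ipaddress

def summarize(prefixes):
    """Collapse adjacent or overlapping prefixes into the smallest equivalent list."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(nets)]

# Hypothetical /29s, four contiguous blocks covering .0 through .31.
contiguous = ["192.0.2.0/29", "192.0.2.8/29", "192.0.2.16/29", "192.0.2.24/29"]
print(summarize(contiguous))   # ['192.0.2.0/27']

# Non-adjacent blocks cannot be merged and stay as separate entries.
scattered = ["192.0.2.0/29", "192.0.2.16/29"]
print(summarize(scattered))    # ['192.0.2.0/29', '192.0.2.16/29']
```

The point is that four scattered /29 secondaries cost four entries, while one contiguous /27 costs one, so assigning subnets in contiguous blocks is what makes summarization possible.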
09-14-2016 01:48 PM
Hi;
Can you try upgrading or downgrading to the Cisco-recommended software release? Maybe some unreported bug is affecting the software version you are running.
Thanks & Best regards;
09-14-2016 02:42 PM
I removed nearly 1,000 routed IPs and I still cannot even add a single /29 back on despite not coming close to the max allowance. Cisco support is useless.
CAM Utilization for ASIC# 0
Table                                        Max Values    Used Values
--------------------------------------------------------------------------------
Unicast MAC addresses                        32768/512     169/22
Directly or indirectly connected routes      16384/7168    15784/7168
L2 Multicast groups                          4096/512      0/7
L3 Multicast groups                          4096/512      0/9
QoS Access Control Entries                   3072          52
Security Access Control Entries              1536          190
Netflow ACEs                                 768           15
Input Microflow policer ACEs                 256           7
Output Microflow policer ACEs                256           7
Flow SPAN ACEs                               512           13
Control Plane Entries                        512           240
Policy Based Routing ACEs                    1024          9
Tunnels                                      256           9
Input Security Associations                  256           4
Output Security Associations and Policies    256           9
SGT_DGT                                      4096/512      0/0
CLIENT_LE                                    4096/64       0/0
INPUT_GROUP_LE                               6144          0
OUTPUT_GROUP_LE                              6144          0
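To make the numbers in that routes row easier to read, here is a small Python sketch that splits the row and computes the free space on each side of the slash. Treating the two slash-separated figures as two separate pools is an assumption on my part, based on the Cisco tech's note calling 7168 the limit being exceeded; the function names `parse_pair_row` and `headroom` are made up for this example.

```python
import re

# Row copied from the output above.
ROW = "Directly or indirectly connected routes 16384/7168 15784/7168"

def parse_pair_row(row):
    """Split 'name maxA/maxB usedA/usedB' into name, (maxA, usedA), (maxB, usedB)."""
    m = re.match(r"^(.*\S)\s+(\d+)/(\d+)\s+(\d+)/(\d+)\s*$", row)
    if m is None:
        raise ValueError(f"unrecognised row: {row!r}")
    name = m.group(1)
    max_a, max_b, used_a, used_b = (int(g) for g in m.groups()[1:])
    return name, (max_a, used_a), (max_b, used_b)

def headroom(limit, used):
    """Entries still free in a pool."""
    return limit - used

name, first, second = parse_pair_row(ROW)
print(name)
print("first pool free: ", headroom(*first))    # 600
print("second pool free:", headroom(*second))   # 0
```

Read this way, removing the routed IPs freed entries only in the first pool (16127 down to 15784), while the second figure is still at 7168 of 7168, which would be consistent with a new /29 still failing to install.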
09-15-2016 12:09 AM
Hello,
hard to find any documentation for the issue you are describing. You might want to try and change the MTU setting on your VLAN interface (1500 is the default) to something else, such as 1000, or 1400:
Switch(config)#interface vlan 100
Switch(config-if)#ip mtu 1000