05-08-2014 11:30 AM - edited 03-07-2019 07:23 PM
This past weekend we replaced our Catalyst 4506 switches with a combination of Catalyst 4500-X switches (which provide fiber and 10 Gb connections to our closet switches and our UCS) and 2960-X switches (where all other servers and printers are connected). Our backup server has two NICs teamed into a NIC team (ports 35 and 36 on the 2960-X switch). The NIC team was set to the following:
Team Type Selection: 802.3ad Dynamic with Fault Tolerance
Transmit Load Balancing: Dest IP Address
Since we moved to the new switches we are seeing a high number of discards from our backup server on one of the ports in the NIC team. I verified the port-channel configuration on the switches and set "port-channel load-balance" to src-dst-ip. When I run the test etherchannel command from one of my servers to the backup server, it responds that it would choose port 36. When I run a backup from this same source server to my backup server, it chooses port 35 for everything. I would expect a single port to be used when all the data comes from the same source IP to the backup server. To confirm this I added another test job from a different source IP. It still only chooses port 35.
If I "show lacp 3 counters" here's what I see:
             LACPDUs         Marker      Marker Response    LACPDUs
Port       Sent   Recv     Sent   Recv     Sent   Recv      Pkts Err
---------------------------------------------------------------------
Channel group: 3
Gi2/0/35   14616  12368    0      0        0      0         0
Gi2/0/36   14628  12368    0      0        0      0         0
From this it appears the switch is pretty evenly distributing data across both channels. Am I correct? If that is true then is it a fair assumption that something on the backup server is rejecting the data on port 36 and thus causing the discards?
Just need an opinion on what might be going on to help convince people as to where the problem lies.
Thanks.
05-14-2014 07:32 AM
Disclaimer
The Author of this posting offers the information contained within this posting without consideration and with the reader's understanding that there's no implied or expressed suitability or fitness for any purpose. Information provided is for informational purposes only and should not be construed as rendering professional advice of any kind. Usage of this posting's information is solely at reader's own risk.
Liability Disclaimer
In no event shall Author be liable for any damages whatsoever (including, without limitation, damages for loss of use, data or profit) arising out of the use or inability to use the posting's information even if Author has been advised of the possibility of such damage.
Posting
You moved a busy server from a 4506 to a 2960-X? If so, you might be bumping into the situation where low-end Cisco switches often don't have sufficient buffers for really busy (i.e., traffic-bursting) ports.
As you've already noted, flows between a pair of hosts will use the same link. The fact that you used just one different source IP isn't definitive, as you have a 50/50 chance of using the same port as before.
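To illustrate the 50/50 point, here is a minimal sketch of a two-link src-dst-ip hash. It assumes a simplified hash (XOR of the two addresses, modulo the link count); the real hash on a given Cisco platform may differ, and the addresses and the name pick_link are hypothetical.

```python
import ipaddress


def pick_link(src: str, dst: str, links: int = 2) -> int:
    """Simplified src-dst-ip hash (an assumption, not the switch's actual
    algorithm): XOR the two addresses and take the result modulo the
    number of links in the bundle."""
    s = int(ipaddress.ip_address(src))
    d = int(ipaddress.ip_address(dst))
    return (s ^ d) % links


# Hypothetical addresses: one backup server, several source servers.
backup = "10.0.0.50"
sources = [f"10.0.0.{n}" for n in range(10, 18)]

choices = {src: pick_link(src, backup) for src in sources}
for src, link in choices.items():
    print(f"{src} -> {backup}: link {link}")
```

With only two links, any two source addresses have roughly an even chance of hashing to the same member port, so one extra test source is not enough to show that the distribution is broken. Testing several sources gives a clearer picture.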
05-14-2014 10:13 AM
4500-X says:
Port Buffers
32-MB Shared Memory
2960X says:
Feature | 2960-XR | 2960-X | 2960-S | 2960 |
Egress buffers | 4 MB | 4 MB | 2 MB | 2 MB |
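As a rough sense of what those buffer sizes mean, the sketch below estimates how long a buffer can absorb a burst when traffic arrives faster than the egress port can drain it. It assumes the whole buffer is available to a single port (in practice it is shared across ports and queues, so real numbers would be lower) and a hypothetical 10 Gb/s sender feeding a 1 Gb/s server port; burst_absorb_ms is an illustrative name.

```python
MB = 1024 * 1024


def burst_absorb_ms(buffer_bytes: int, in_bps: float, out_bps: float) -> float:
    """Milliseconds until the buffer fills when data arrives at in_bps
    and drains at out_bps. Returns infinity if the port keeps up."""
    excess_bps = in_bps - out_bps
    if excess_bps <= 0:
        return float("inf")
    return buffer_bytes * 8 / excess_bps * 1000


# Assumption: the entire buffer serves one congested port.
for name, size in (("2960-X (4 MB)", 4 * MB), ("4500-X (32 MB)", 32 * MB)):
    ms = burst_absorb_ms(size, in_bps=10e9, out_bps=1e9)
    print(f"{name}: buffer fills after about {ms:.1f} ms")
```

Under these assumptions the larger buffer rides out a burst eight times longer before drops begin, which is the kind of difference that can show up as discards on a busy backup port.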
05-10-2014 12:37 AM
Hello.
To see the port utilization, it would be better to use the "sh int g2/0/3x" command on each member port and compare the 5-minute input/output rates.
"Since we switched to the new switches we are seeing a high number of discards from our backup server on one of the ports in the NIC team"
Where do you see this and could you provide the log?
"When running the test etherchannel command from one of my servers to the backup server it responds that it would choose port 36. When I run a backup from this same source server to my backup server it chooses port 35 for everything"
Could you provide the output of the commands, and how do you see that the flow is sent via port 35? Are you sure the server has only one IP address from which to originate the backup transfer?
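If it helps to compare the two member ports side by side, a small script can pull the 5-minute rates and the output-drop counter out of saved "show interface" text. This is only a sketch: the sample text and its numbers are hypothetical, not taken from this thread, and the function name summarize is made up for illustration.

```python
import re

RATE_RE = re.compile(r"5 minute (input|output) rate (\d+) bits/sec")
DROPS_RE = re.compile(r"Total output drops: (\d+)")


def summarize(show_int_text: str) -> dict:
    """Extract 5-minute input/output rates and total output drops
    from the text of a 'show interface' command."""
    stats = {f"{direction}_bps": int(rate)
             for direction, rate in RATE_RE.findall(show_int_text)}
    match = DROPS_RE.search(show_int_text)
    stats["output_drops"] = int(match.group(1)) if match else None
    return stats


# Hypothetical excerpt, for illustration only.
SAMPLE_35 = """
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 1200
  5 minute input rate 4000000 bits/sec, 500 packets/sec
  5 minute output rate 650000000 bits/sec, 54000 packets/sec
"""

print(summarize(SAMPLE_35))
```

Running the same function on the output for both ports in the channel makes it easy to see whether one member carries most of the traffic and whether the drops line up with that port.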
05-14-2014 06:47 AM
I had also looked at the port utilization and didn't see any errors in the port statistics during the time frame I was testing. I was just looking at the LACP counters to verify that LACP was in fact working.
We are seeing the discards in the Solarwinds Orion console. I can't really provide the information from this system.
The servers in question each have only one IP address. I verified that the NIC team is set up with only that one IP address. I didn't see any way to set an IP address on the individual NICs in the team, as this option was greyed out.
The issue I had was that we were seeing discards even before the switch replacement so once the load balance method matched between the switch and the server it seemed that the issue would be on the server.
However, an update since this was posted. We have contacted HP and they had us change a couple of settings on the NIC team itself. If that doesn't resolve the issue then HP wants us to update the software for the NIC team.
Thanks for the help.
05-14-2014 07:32 AM
We moved from 4506s to two 4500-Xs (where our closet connections and the 10 Gb connection to our UCS reside) combined with three 2960-Xs. This design came from a Cisco sales engineer in our area. Would the 4500-Xs have sufficient buffers? If so, I've got a bunch of 1 Gb GBICs that my UCS doesn't need anymore that I can try in the 4500-Xs.
Good thing is that we're in the process of moving to a different backup solution where this won't be an issue. But we have to address the immediate issue for the short term.
Thanks.
05-16-2014 05:52 AM
Thanks for bringing this up. This turned out to be the solution.
05-16-2014 05:47 PM
Ah, good! Thanks for letting us know too.