03-08-2012 03:01 AM - edited 03-07-2019 05:25 AM
Hi guys,
I need some ideas about the minutiae of LACP. Our server teams want to use LACP bundling for some servers, and not for others. Furthermore, even for the servers they normally want to connect via an aggregate, while the server is being installed they want to use the links individually. I am trying to find a configuration that will allow them to decide whether to bundle or not, without having to ask me to intervene on the switch.
I have tried channel-group n mode active and channel-group n mode passive, but both have the same problem. When the server team decides to bundle, the connection comes up and I have connectivity to the server in under 5 seconds. But, if they decide they do not want to bundle, it takes about 60 seconds to establish the connection. According to the syslog, the links come up in about 3 seconds, but the line protocol does not get established until about 60 seconds. And that is long enough to give them problems with the BOOTP for a re-installation ... which is precisely the situation where they want to treat the links individually. I guess during those 60 seconds the switch is listening to see if the server is going to say something LACP or not.
I should say that I am sure this is not a Spanning Tree problem. I have portfast on the individual links as well as on the composite PortChannel interface. Spanning Tree goes straight into forwarding whether or not the server is bundling. I can see that is so from a debug - the STP goes straight into forwarding as soon as the line protocol gets established.
I have also disabled DTP, so it is not that.
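For readers following along, the setup described so far might look roughly like the sketch below. The interface numbers, the channel-group number, and the use of access mode are assumptions for illustration, since the post does not give them. On trunk ports the portfast line would be `spanning-tree portfast trunk` instead.

```
! Sketch only: Gi0/1-2, group 1 and access mode are assumed, not from the post.
interface range GigabitEthernet0/1 - 2
 switchport mode access
 ! Disable DTP on the member ports
 switchport nonegotiate
 spanning-tree portfast
 ! LACP active; "mode passive" showed the same delay in the test above
 channel-group 1 mode active
!
interface Port-channel1
 switchport mode access
 switchport nonegotiate
 spanning-tree portfast
```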
So, my question is, how can I reduce the time that the switch waits before it concludes that the server is not going to say anything LACP, and puts the links into individual mode. The servers are Solaris 10 and the switches are 2960G. (Although hopefully they will soon be Nexus 2248s.)
BTW, just a tip: if you want a configuration that can handle both individual and aggregate situations, do not configure flowcontrol receive desired. If one link negotiates flow control before the other one does, then the second link will get suspended and everything stops.
Thanks for reading this.
Kevin Dorrell
Luxembourg
03-08-2012 09:41 AM
Hi Kevin,
Sadly, there is awfully little to be configured about LACP on Catalyst switches. Apart from selecting the mode and configuring the LACP system and port priority, there is nothing you can change about its behavior.
Nevertheless, I ran an experiment in our lab. Using a 2960 running 12.2(58)SE1, I connected it via two ports to another switch. The other switch was left totally unconfigured. On my switch, I bundled both ports into a LACP-driven EtherChannel and observed how soon the ports would come up as individual ports and start providing connectivity.
With both active and passive modes, the line protocol on these ports came up 10 seconds after turning them on, and they became independently capable of forwarding data. In other words, it took LACP 10 seconds to decide that the other end was not going to bundle the ports, and to leave the ports working in independent mode. I used RSTP on both switches, and as these two ports were the only ones interconnecting them, one of the links rapidly became the root link.
So while the LACP cannot be tweaked, I was unable to reproduce your 60 seconds of unavailable connection. I wonder if you have the option of running debug lacp ... debugs. Perhaps they would provide more information about what's going wrong.
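A possible way to time the fallback is sketched below. The post does not say which debug keywords were meant, so `debug lacp event` is only one plausible choice, and the hostname `Switch` is a placeholder. In the `show etherchannel summary` legend, ports that have gone individual should be flagged as stand-alone rather than bundled or suspended.

```
! Sketch only: the choice of debug keyword is an assumption.
Switch# configure terminal
Switch(config)# service timestamps debug datetime msec
Switch(config)# service timestamps log datetime msec
Switch(config)# end
Switch# debug lacp event
! Bounce the server-facing ports, then compare the link-up and
! line-protocol-up timestamps in the log.
Switch# show etherchannel summary
Switch# show lacp neighbor
Switch# undebug all
```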
Best regards,
Peter
03-09-2012 04:48 AM
Hi Peter,
Thank you for labbing it up for me. Your results are interesting.
At one time, the situation was even worse. I remember that if you set mode active or passive and the attached host did not talk LACP, the 2960 would suspend the links regardless. That was with version 12.25(25)EWA. I opened a TAC case about it, and they acknowledged it as CSCsd02587. That was fixed in 12.2(35)SE and I am now running 12.2(53)SE2.
On reflection, I wonder whether they simply did not fix it fully, and still needed to trim the dead time. I think I need to read the release notes for 12.2(58)SE1 and see if there is any mention, or maybe even try 12.2(58)SE1.
Thanks again, and I'll let you know what I come up with.
Best regards
Kevin Dorrell
Luxembourg
03-09-2012 04:53 AM
Peter / Kevin,
I know that it's not recommended per Cisco, but is the transition time faster if the channel-mode was set to on and lacp was specified as the protocol to use?
John
03-09-2012 05:09 AM
Hello John,
I haven't ever tested this combination of settings to be honest. In fact, at this point, I would believe that they would not be even accepted - and even if they are, the channel-protocol command should not actually activate LACP or PAgP - it should merely disallow using any other commands that would activate a different protocol.
Do you have any other experiences? I should test this in a lab, definitely.
Best regards,
Peter
03-09-2012 05:23 AM
Hi John,
Thank you for the suggestion, but I would be reluctant to try it. I think if the switch had bundling on and the server did not, then we would risk blackholing traffic in the direction switch --> server, because half the traffic would be sent to the wrong NIC.
I have to admit I have never understood the utility of the channel-protocol command. I thought the protocol was already implied in the active, passive, desirable, auto, on keyword. Except perhaps as a "belt and braces" to ensure that you use the right keyword for the right protocol.
Best regards
Kevin
03-09-2012 05:27 AM
To Kevin's point, I don't know if it would work if the server doesn't have a way to specifically set it, but I do have etherchannels between a 6509 and IBM servers that required lacp. I set mine to on/lacp and it works fine. When I get in this morning, I'll post the output for you to look at. I'll also test in my lab as well. In my defense, I don't remember why I hardcoded it. I think I had a problem with it negotiating at all and decided that it would only work with the state set to on.
John
03-09-2012 06:27 AM
John,
I've labbed it: sadly, as I expected, your suggestion is not going to work. After configuring the channel-protocol lacp, entering the channel-group 1 mode on will lead to a refusal message stating a channel protocol mismatch.
Exactly as Kevin noted, I have never understood the point of the channel-protocol command either. The keywords "active/passive", "desirable/auto" and "on" fully and unambiguously identify the channel protocol to be used, and hence this command appears to be absolutely useless. Indeed, its only use, it seems, is as the "belt and braces" to help people who don't know or can't memorize which keywords map to which channel protocols.
Best regards,
Peter
03-09-2012 06:52 AM
Peter,
Did you set both sides to On? You shouldn't have gotten a protocol mismatch. See the excerpt from Cisco. I looked at my interfaces and I do have mine set to ON. Interestingly though, ON enables lacp
Use the channel-protocol command only to restrict a channel to LACP or PAgP. If you set the protocol by using the channel-protocol command, the setting is not overridden by the channel-group interface configuration command.
You must use the channel-group interface configuration command to configure the EtherChannel parameters. The channel-group command also can set the mode for the EtherChannel.
You cannot enable both the PAgP and LACP modes on an EtherChannel group.
PAgP and LACP are not compatible; both ends of a channel must use the same protocol.
Here's another piece below:
I'm about to go into my lab and test as well. I also don't understand the point of using channel-protocol now. From what I've read, it enables the protocol strictly and the mode that's used can't override, but aside from that, you only have active/passive/on for lacp and according to the above, you can't use channel-protocol with the on command. Contradictory?
03-09-2012 07:03 AM
John,
Did you set both sides to On?
In fact, I was not even able to create the EtherChannel. See:
SW-Dist1(config)#int ra gi0/16 - 17
SW-Dist1(config-if-range)#channel-protocol lacp
SW-Dist1(config-if-range)#channel-group 10 mode on
FEC_ER_DIFF_CHGP_REJECT
Command rejected (Channel protocol mismatch for interface Gi0/16 in group 10): the interface can not be added to the channel group
% Range command terminated because it failed on GigabitEthernet0/16
The channel-group 10 was a new group not yet created on the switch.
Interestingly though, ON enables lacp
That would be completely contrary to what I know. The "on" mode should enable neither LACP nor PAgP.
Best regards,
Peter
03-09-2012 07:39 AM
Peter,
I did this in a lab as well and here's what I came up with:
1st scenario:
channel-protocol lacp
channel-group 1 mode on
This one fails with the protocol mismatch indicating that it's just On. I'm assuming that it just enables etherchannel, but in Kevin's case wouldn't work (more than likely) if he needs a certain protocol. (The other end needs to be on as well.)
2nd scenario:
channel-protocol lacp
channel-group 1 mode active
The 2nd scenario works, but what is the point? You're already telling the interface to use lacp by indicating active which is what I believe Kevin was saying earlier.
In the end, I wonder if the channel-protocol command is a command that was one of those commands from the days of old that may have been required in certain scenarios and Cisco just never took it out? It really doesn't make much sense...
This was fun
P.S. Kevin, apologies if you feel I hijacked your thread. I was strictly under the impression that the On mode didn't negotiate and just came up like dynamic desirable vs switchport trunk / nonegotiate.
03-09-2012 08:45 AM
Hi John,
The 2nd scenario works, but what is the point? You're already telling the interface to use lacp by indicating active which is what I believe Kevin was saying earlier.
Absolutely so. But had you formerly configured channel-protocol pagp , you wouldn't be able to use the channel-group 1 mode active/passive. And surely, with channel-protocol lacp, you aren't able to use the channel-group 1 mode desirable/auto.
Once again - I believe that the channel-protocol command actually does nothing beyond disallowing using a form of channel-group command that would activate a different protocol. Nothing beyond that.
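The behaviour reported in this thread can be summarised in the sketch below. The interface and group numbers are illustrative. The accept/reject outcomes are those described in the posts above, not independently verified on other platforms or releases.

```
! Sketch only: interface and group numbers are illustrative.
interface GigabitEthernet0/16
 channel-protocol lacp
 ! Accepted: active and passive are LACP modes
 channel-group 10 mode active
!
interface GigabitEthernet0/17
 channel-protocol lacp
 ! Rejected with "Channel protocol mismatch": "on" runs no protocol
 channel-group 10 mode on
 ! Reported as not allowed either: desirable and auto are PAgP modes
 channel-group 10 mode desirable
!
interface GigabitEthernet0/18
 channel-protocol pagp
 ! Reported as not allowed: active is an LACP mode
 channel-group 10 mode active
```

Seen this way, channel-protocol behaves as a guard against entering the wrong keyword, and the mode keyword alone still decides which protocol runs.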
Best regards,
Peter
03-25-2012 01:25 AM
Thanks guys. We have just had our maintenance slot and we updated all the 2960Gs to 12.2(58)SE2. The LACP "time to go individual" has been dramatically reduced from 60 seconds to 7-8 seconds. That is much more acceptable. Problem solved.
Thanks again.
Kevin Dorrell
Luxembourg
03-09-2012 12:42 PM
It's OK, I'm enjoying the discussion. I'll let you know if TAC gets back to me.
K.
03-09-2012 05:26 AM
I didn't find anything in the release notes. OTOH, the original bug isn't in the release notes either. So I have asked TAC to re-open the case with a supplementary question.
Kevin