Adding a 3750X switch to an existing network with 3550 switches disables them both

MikeC06933
Level 1

Hi - 

Have an odd problem I can't get to the bottom of.

 

I have an existing network with a couple of WS-C3550-12T switches on it that link two buildings via fiber. These have been working fine for years without any problems. They are running 12.2(44)SE6 - IP-SERV-CRYPTO

 

I recently obtained some WS-C3750X-48P-L switches, which I have reset and which are now running c3750e-universalk9npe-mz.152-4.E10.bin

 

Both sets of switches are pretty much at minimal configuration - just an IP address assigned to VLAN1 and an enable password and secret and vty password - essentially what the initial basic setup configuration does (not set to cluster command switch and snmp not enabled). I also paired two 3750X's in a stack with two stack cables.

 

I took this newly configured 3750X stack on-site and plugged it into the existing network - connecting one of the 48 gigabit ports on the 3750X stack to the existing network via a non-Cisco switch. There are at least two non-Cisco switches between the new 3750X stack and the existing 3550 switches.

About 5 minutes later - I got frantic calls from users that the network was down - seems like the two 3550's were no longer routing any traffic between the two buildings. I couldn't access them from the network either.

As I'd done very little else to cause this problem (both 3550 switches seemed to be up and running according to the lights on them) - I thought I'd unplug the 3750x stack from the network. After about 5-10 minutes - and without me doing anything else - the 3550's came back to life and started passing traffic between them again.

I tried this once more a few hours later - plugged in the 3750X stack - five or so minutes later - the 3550's stopped passing traffic. Unplugged the 3750X from the network - about 5-10 minutes later - the 3550's were working again.

 

As it's a live network and this caused huge disruption - I couldn't really test it - so I set up a test network - very similar - this time three 3750X's - two in a stack - one stand-alone - and two 3550's - again - non-cisco switches between the two (part of another existing network)

 

Fortunately I was able to replicate the issue straight away - after a few minutes of plugging the 3750X switches into the existing network - the 3550's became inaccessible. So did the 3750X plugged into the network - pings to the router stopped and they showed up as missing in CNA

 

Unfortunately no amount of trouble shooting has got me any closer to a solution or figuring out what is going on.

I had a console connection to the 3550 that was connected to the existing network. When the problem occurred it logged the following:

00:04:00: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/10, changed state to down
00:04:03: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/10, changed state to up

And immediately afterwards you could no longer ping the switch - console access was still fine and no other errors were shown. I tried turning off spanning tree - no difference. sh int shows the interface up and line protocol up and all details look OK (same as without problem). show interface status err-disabled comes back with nothing

 

The 3750X switch was similar - same lineproto down and up - no other errors - sh int normal and status err-disabled blank

 

No messages were logged when the switch became accessible again

 

I set the logging level to debugging - slightly more info when the fault occurs on the 3550 - but nothing helpful (at least to me):

000020: *Mar 1 03:10:04: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/10, changed state to down
000021: *Mar 1 03:10:04: %LINEPROTO-5-UPDOWN: Line protocol on Interface Vlan1, changed state to down
000022: *Mar 1 03:10:07: %LINEPROTO-5-UPDOWN: Line protocol on Interface GigabitEthernet0/10, changed state to up
000023: *Mar 1 03:10:37: %LINEPROTO-5-UPDOWN: Line protocol on Interface Vlan1, changed state to up

 

The Vlan1 changing back to UP does NOT restore connectivity to the switch

 

Now when the switch becomes accessible again some minutes after unplugging the 3750X it logs the following:

000024: *Mar 1 03:52:38: %LINEPROTO-5-UPDOWN: Line protocol on Interface Vlan1, changed state to down
000025: *Mar 1 03:53:08: %LINEPROTO-5-UPDOWN: Line protocol on Interface Vlan1, changed state to up

 

 

The problem occurs with a single 3550 and a single 3750x

I have not tried linking the two switches together directly - they have always been connected with at least two other non-Cisco switches between them

They have the same passwords

I never added them to the same community in CNA (but I don't think that has any effect on the switches themselves?)

The non-cisco switches were different between the live and test networks - so I don't think the make of them is relevant.

Also - the non-cisco switches seem unaffected by any of this and continue to work throughout.

 

I'm completely out of ideas as to what to try next or any sort of troubleshooting commands or steps that might shed further light on the issue. (I'm not a Cisco guru)

 

Thanks in advance,

 

Mike

11 Replies

Ryan994080
Level 1

Can we see the configurations for the interfaces that are connected to the two non-cisco switches? Is spanning-tree BPDU guard at play by chance?

Hi - 

 

Thanks for your reply.

 

I don't think I made any configuration changes to any of the interfaces - so they should be at factory default settings - as per my post - just a minimal configuration - except I did also set up VTP under VLAN configuration on both switches as well.

 

I will try to get the config from the switches once I've set up the test network again.

 

Regards, 

 

Mike

Hello
It's hard to understand the root cause because the way you've described your network isn't really that clear to me.

Now, if I am correct in what you have explained, just connecting a single 3750 stack via a single port on that stack to your existing network caused this outage. If that is the case, then it wouldn't be a spanning-tree issue. However, if the 3750 stack had a duplicate IP address assigned to it, this would conflict with the existing network and could be a possible cause of the outage.

Can you post a topology diagram of the network and possibly the running configs of the 3750 stack and 3550?


Please rate and mark as an accepted solution if you have found any of the information provided useful.
This then could assist others on these forums to find a valuable answer and broadens the community’s global network.

Kind Regards
Paul

Hi -

 

Thanks for your reply

 

Apologies for not being clearer on the network topology - however you are entirely correct - just by connecting a single 3750 stack via a single port on that stack to my existing network - it caused the outage.

 

The 3750 stack was plugged into a non-cisco switch (an HP Procurve I think) - which was connected via another switch and a fiber optic link using stand-alone transceivers - one of which was plugged into the existing 3550 switch

 

I'm pretty certain that the new switch had a unique IP address assigned to it before I connected it to the existing network

 

Definitely when I went to re-create the problem on my test network - there were no conflicts as I could access both switches remotely via their IP addresses in the short period before they became inaccessible.

 

I will recreate my test network and look to post the configs of the switches

 

I will also try connecting the two switches directly as well as via other switches

 

What I found odd is that - other than the messages about the interfaces going down - there was no other error message logged on either switch.

 

Regards,

 

Mike


AdamF1
Level 1

I’m just as confused about the topology as the others.  

You stated you plugged directly into the 35xx stack but then stated there are at least 2 other switches between the 3750X and 35xx stacks… Sounds like BPDU guard or spanning tree in action downstream or upstream.

What does the port configuration look like on both sides of the link you plugged into?

Are these L2 switches or L3?

 

Hi - 

 

Thanks for your reply

 

I re-read my original post - I'm afraid I can't see where I said I plugged directly into the 3550 - also the 3550 isn't part of a stack - there were two 3550's linked via fiber in the original network - and just a single 3550 in my test network - only the 3750x's were in a stack

 

The only thing I plugged directly into the 3550 was the console port

 

I'll try to recreate my test network and get the port configurations

 

As far as I know - the 3750X stack is L2 only but the 3550 is L3

 

Regards,

 

Mike

When I read the original post my first thought was about duplicate IP addresses. But then I realized that this could not be the issue. The IP addresses on the switches are only for management purposes and do not have any impact on layer 2 forwarding. And the complaint from users makes it pretty clear that layer 2 forwarding was impacted. My guess is that the issue is something like a mismatch in VTP.

HTH

Rick

I had assumed that management IP addresses would have no impact on l2 forwarding - but thanks for confirming.

 

In looking at the VTP configuration - I realized it wasn't set on either switch - but whilst on that window - I moved to the Configure Port tab (in CNA - Switching->VLANs->Configure Port)

 

I noticed that on the 3750X stack - all ports are set to an Administrative Mode of "Dynamic Auto" - I also noticed that the port connected to the non-cisco switch (that the 3550 was also plugged in to) had an Operational Mode of "ISL Trunk" - and I would have expected an Operational Mode of "Static Access"

 

I modified the port and set an Administrative Mode of "Static Access" - leaving it on the default VLAN of 1 (for some reason it came up in red with errors that needed to be fixed - but didn't say what they were - and after clicking apply and refresh the port showed as both Administrative and Operational modes as Static Access) - and lo and behold - I immediately gained access to the switch via the non-cisco switch.

 

After a few minutes - the 3550 port connected to the non-cisco switch changed from Down to Up - and everything is working OK!

 

The 3550 switch ports default to "Dynamic Desirable" and the port connected to the non-cisco switch had/has an Operational Mode of "Static Access" - which is what I would expect

 

So the problem was down to the default "Dynamic Auto" setting on the 3750X switch leading to an Operational Mode of "ISL Trunk" when connected to the non-cisco switch

 

The non-Cisco switch on my test network is an HP Procurve J9660A switch at factory default settings apart from its management IP address.

 

So I have a fix/workaround - but I don't really understand why the problem occurred:

 

Why does the 3750X switch on default settings set an operational mode of ISL trunk when connected to the HP Procurve?

 

Why does the fact that the 3750x switch set the op mode to ISL trunk cause the 3550 switch that just happens to be on the same L2 network to shut down the port that connects it to that part of the L2 network?

 

And why does the 3550 not say why it shut the port down?

 

(I also tried setting the 3750X port to an Admin Mode of "Dynamic Desirable" to match the 3550 settings - but it still set an Op mode of "ISL Trunk" and shut down the 3550 port)

 

So the 3750X on default settings interacts with the HP Procurve in a way I would not expect it to and in a way which the 3550 switch does not

 

I guess I'm going to have to set the Admin Mode on the 3750x ports manually to avoid this problem
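
For reference, the CLI equivalent of that change might look like the sketch below. It assumes the affected ports are GigabitEthernet1/0/1 to 1/0/48 on stack member 1 and that they should all be plain access ports in VLAN 1; the interface range is illustrative and is not taken from the actual switch configuration. The `switchport nonegotiate` line additionally stops the port from sending DTP frames, which goes a step beyond the CNA change described above.

```
! Sketch only: the interface range and VLAN are assumptions, adjust to the real ports
configure terminal
 interface range GigabitEthernet1/0/1 - 48
  ! Force static access mode instead of the default "dynamic auto"
  switchport mode access
  switchport access vlan 1
  ! Disable DTP negotiation on these ports
  switchport nonegotiate
 end
```

Any port that genuinely needs to carry multiple VLANs would be left out of the range and configured as a trunk explicitly instead.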

 

Regards,

 

Mike


Mike

Thanks for the update. Glad to know that you found a mismatch in the config that seems to be what was causing the issue. I am curious about whether having the Procurve in the middle contributes to the issue. If you connect a 3750 with the default configuration directly to a 3550, would you still have the same issue? And I am puzzled why (or how) connecting the 3750 had such a drastic impact on the 3550. I would have assumed that a mismatch between port configurations would prevent the 3750 and 3550 from communicating. But what causes the 3550 to basically quit processing its access ports?

HTH

Rick

Hi - 

 

I've done some more testing.

 

If I plug the 3750X and 3550 together directly - the problem does NOT occur - but this is because the 3550 also changes the port mode to ISL Trunk - so both switches match configs with ISL trunk and communication works.

 

I also tried two other non-Cisco switches - a managed Netgear and an unmanaged 10/100 Fast Ethernet switch - with both - the 3750X set an op mode of ISL trunk and the problem occurred.

 

I did one further test - I unplugged the 3550 from the third party switch - the 3750X then set the Op Mode of the port to Static Access, which is what I would expect - plug the 3550 back in to the third party switch - the 3750X switches the port to ISL trunk and the problem occurs - unplug the 3550 and after some minutes - it reverts back to static access

 

So it seems to me the problem is a bug in the 3750x IOS that incorrectly sets the port op mode to "ISL Trunk" when there is a 3550 switch indirectly connected to it via another switch. The 3550 behaves correctly and only sets "ISL Trunk" mode when the 3750x is directly connected to the port.

 

There then seems to be the separate issue on the 3550 of why it shuts down a port on a network where there is an indirect connection to a switch with a misconfigured port (ISL trunk mode to a non-Cisco switch instead of static access) - however - as the 3550 is way over EOL.....

 

Regards,

 

Mike

 

Mike

Thanks for the update. It appears that the issue is the behavior of ports configured for Dynamic Auto. The dynamic part looks for some negotiation. When the 3750 is directly connected to the 3550 the negotiation is successful, and I am a bit surprised that the result has both using ISL trunk. When connected to the Procurve there is no negotiation and the 3750 using ISL trunk is a problem. I cannot say whether it is a bug for the 3750 to insist on ISL trunk when set for Dynamic Auto - anyone from Cisco care to address this? But clearly if you want to connect the 3750 to the Procurve then you need to change the settings on the 3750 and not use the default.
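
One way to check from the CLI what a port has actually negotiated is sketched below. The interface name GigabitEthernet1/0/1 is an assumption (substitute the port facing the other switch); the commands themselves are standard IOS show commands, and they report the same Administrative Mode and Operational Mode values that CNA displays.

```
! Administrative vs. operational mode and trunking encapsulation for one port
show interfaces GigabitEthernet1/0/1 switchport
! All ports currently operating as trunks
show interfaces trunk
! DTP state for the port
show dtp interface GigabitEthernet1/0/1
```

Running these on both the 3750X and the 3550 before and after the third-party switch is connected should show whether the two ends agree on access or trunk mode.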

 

HTH

Rick