08-24-2016 08:37 AM - edited 03-05-2019 04:33 AM
Hi guys,
I have set up HSRP between two Catalyst 4000 L3 switches (software image cat4000-I9S-M), and it has been giving me problems for the past two weeks. The topology is attached.
In the lab, HSRP failover itself works fine: when the tracked interface on one switch goes down, the other switch takes over as the active one.
But when the interface comes back up and preempt makes the higher-priority switch take over again, users from certain networks stop being able to access the Internet. This only happens for subnets whose active gateway is CORE_1; for subnets on Core_2 it works fine.
08-24-2016 09:10 AM
Here is my trace from a subnet that uses CORE 1 as its HSRP gateway, before the interface goes down:
C:\Users\lucas>tracert -d 8.8.8.8
Tracing route to 8.8.8.8 over a maximum of 30 hops
1 <1 ms 1 ms 1 ms 192.168.9.1
2 1 ms 1 ms 1 ms 10.0.100.1
3 1 ms 1 ms 1 ms 10.0.101.1
4 4 ms 2 ms 1 ms 192.168.1.1
5 * * * Request timed out.
6 97 ms 78 ms 82 ms 10.16.16.3
7 74 ms 133 ms 92 ms 10.16.17.1
8 245 ms 217 ms 213 ms 41.72.61.70
9 281 ms 238 ms 273 ms 197.149.148.105
10 359 ms 313 ms 272 ms 197.149.151.4
11 337 ms 327 ms 327 ms 185.148.112.22
12 324 ms 353 ms 357 ms 193.136.250.20
13 321 ms 293 ms 317 ms 216.239.49.242
14 320 ms 357 ms 347 ms 209.85.245.237
15 293 ms 307 ms 287 ms 216.239.57.227
16 388 ms 362 ms 343 ms 216.239.62.153
17 * * * Request timed out.
18 436 ms 357 ms 364 ms 8.8.8.8
Trace complete.
After the interface goes down and comes back up, I get HSRP messages confirming that CORE 1 is the active switch for the subnet again, but the subnet is still not able to reach the Internet.
CORE_1_TESTE#
01:25:51: %HSRP-6-STATECHANGE: Vlan6 Grp 5 state Standby -> Active
01:25:51: %HSRP-6-STATECHANGE: Vlan7 Grp 6 state Standby -> Active
01:25:51: %HSRP-6-STATECHANGE: Vlan8 Grp 7 state Standby -> Active
01:25:51: %HSRP-6-STATECHANGE: Vlan10 Grp 9 state Standby -> Active
01:25:51: %HSRP-6-STATECHANGE: Vlan12 Grp 18 state Standby -> Active
01:25:51: %HSRP-6-STATECHANGE: Vlan19 Grp 11 state Standby -> Active
01:25:51: %HSRP-6-STATECHANGE: Vlan24 Grp 23 state Standby -> Active
01:25:51: %HSRP-6-STATECHANGE: Vlan17 Grp 205 state Standby -> Active
01:25:51: %HSRP-6-STATECHANGE: Vlan45 Grp 45 state Standby -> Active
Trace results after CORE 1 becomes active again:
C:\Users\lucas>tracert -d 8.8.8.8
Tracing route to 8.8.8.8 over a maximum of 30 hops
1 1 ms 18 ms 20 ms 192.168.9.1
2 <1 ms 1 ms 1 ms 10.0.100.1
3 1 ms 1 ms 1 ms 10.0.101.1
4 2 ms 2 ms 1 ms 192.168.1.1
5 * * * Request timed out.
6 98 ms 157 ms 97 ms 10.16.16.3
7 117 ms 127 ms 98 ms 10.16.17.1
8 156 ms 113 ms 142 ms 41.72.61.70
9 190 ms 192 ms 217 ms 197.149.148.105
10 293 ms 352 ms 288 ms 197.149.151.4
11 268 ms 298 ms 336 ms 185.148.112.22
12 329 ms 312 ms 337 ms 193.136.250.20
13 327 ms 358 ms * 216.239.49.242
14 209 ms 217 ms 227 ms 209.85.245.237
15 298 ms 222 ms 232 ms 216.239.57.227
16 357 ms 312 ms * 216.239.62.153
17 * * * Request timed out.
18 * * * Request timed out.
19 * * * Request timed out.
20 * * * Request timed out.
21 * * * Request timed out.
22 * * * Request timed out.
23 * * * Request timed out.
24 * * * Request timed out.
25 * * * Request timed out.
26 * * * Request timed out.
27 * * * Request timed out.
28 * * * Request timed out.
29 * * * Request timed out.
30 * * * Request timed out.
Trace complete.
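To compare the two traces hop by hop, a small script can help. The following is only a minimal sketch, assuming Windows `tracert -d` output in the format shown above; the function names `parse_tracert` and `last_responding_hop` are made up for illustration and do not come from any tool. The sample lines are hops 15 to 18 copied from the two traces.

```python
import re

# One hop line of `tracert -d`: hop number, three probe results
# ("<1 ms", "12 ms" or "*"), then an IP address or a timeout note.
HOP_RE = re.compile(r"^\s*(\d+)\s+((?:(?:<?\d+ ms|\*)\s+){3})(.*)$")
IPV4_RE = re.compile(r"\d+\.\d+\.\d+\.\d+")


def parse_tracert(text):
    """Return a list of (hop_number, address_or_None) tuples."""
    hops = []
    for line in text.splitlines():
        m = HOP_RE.match(line)
        if not m:
            continue  # skip headers such as "Tracing route to ..."
        target = m.group(3).strip()
        addr = target if IPV4_RE.fullmatch(target) else None
        hops.append((int(m.group(1)), addr))
    return hops


def last_responding_hop(hops):
    """Return the last hop that answered at least one probe, or None."""
    responding = [h for h in hops if h[1] is not None]
    return responding[-1] if responding else None


# Hops 15-18 of the trace taken before the interface went down.
BEFORE = """\
15 293 ms 307 ms 287 ms 216.239.57.227
16 388 ms 362 ms 343 ms 216.239.62.153
17 * * * Request timed out.
18 436 ms 357 ms 364 ms 8.8.8.8
"""

# Hops 15-18 of the trace taken after CORE 1 became active again.
AFTER = """\
15 298 ms 222 ms 232 ms 216.239.57.227
16 357 ms 312 ms * 216.239.62.153
17 * * * Request timed out.
18 * * * Request timed out.
"""

print(last_responding_hop(parse_tracert(BEFORE)))  # (18, '8.8.8.8')
print(last_responding_hop(parse_tracert(AFTER)))   # (16, '216.239.62.153')
```

In both traces the path is identical up to hop 16, which has a public address; only the final destination stops answering in the second trace.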
08-25-2016 12:05 AM
Hi,
I would not say "...users from certain network stop accessing the Internet."
Hops #8 - 16 in your tracert outputs have public IP addresses, so they are within the Internet, aren't they?
So the users should be able to reach some Internet destinations, I guess?
I'm just guessing, but maybe there is some issue between your Core switches and your FW, and the packets returning from the Internet are sent from the FW to the other Core switch?
Best regards,
Milan
08-25-2016 01:57 AM
Could it be a loop happening on the network? But it does not happen for subnets on my Core_2: when the tracked interface goes down and is restored, everything works fine. I'm running OSPF as the routing protocol on my two core switches and upwards in the topology. On my firewall, the two interfaces facing the core switches are both configured as inside.
08-25-2016 02:52 AM
Hi,
what are the exact symptoms?
When the track interface goes Down on Core1 switch, HSRP moves the Active interface to Core2 and users are able to connect to the Internet with no problem?
But when the track interface goes Up again, the users are not able to connect to the Internet at all? Or to some destinations only? Or does the connection recover after some time again?
How is the routing between your FW and the Core switches done in detail?
When the track interface goes Down on Core1, is the traffic for the subnets behind Core1 switch forwarded from the FW to Core2? Or still to Core1?
Are there any suspicious messages visible in the FW log?
Best regards,
Milan
08-25-2016 04:53 AM
1. When Core 2 assumes the active role for subnets under Core 1, all works fine and users are still able to access the Internet. But when Core 1 is restored and assumes the active role again, users from those subnets get timeouts and can no longer access the net.
2. OSPF runs between the devices; the firewall and the two core switches exchange routes just fine.
3. No log messages at all.
I'm starting to think it has to do with the image, because subnets on CORE 2 work just fine whether it is active or standby.
08-25-2016 05:08 AM
I'm using Cat images on my layer 3 switches, but the two switches are running different ones. I was wondering whether I could use the same image on both. I'm starting to suspect a bug.
08-26-2016 01:17 AM
Hi,
ad 1. ...when Core 1 restores and assumes back the role as active users from subnet get time out and stop accessing the net.
So the users are not able to connect to any Internet destination at that time?
And it does not recover after some time?
ad 2.
I can see only
router ospf 1
 log-adjacency-changes
 network 10.0.10.0 0.0.0.255 area 0
 network 192.168.5.0 0.0.0.255 area 0
 network 192.168.7.0 0.0.0.255 area 0
 network 192.168.9.0 0.0.0.255 area 0
in Core 1 config but more network ... commands in Core 2.
Shouldn't that be the same in both configs?
ad 3. Sure, it's better to run the same image on both switches.
BR,
Milan
08-26-2016 02:34 AM
Hi Milan ,
I'm running tests on subnet 7; that is why I only added a few subnets to OSPF. I will upgrade the switches to the same IOS and try again.
08-26-2016 04:51 AM
08-26-2016 10:20 AM
Yes, that was one of the possibilities.
Or a problem with some next-hop MAC address.
That's why I was asking several times:
When the track interface goes Up again, the users are not able to connect to the Internet at all? Or to some destinations only? Or does the connection recover after some time again?
BR,
Milan
08-30-2016 09:34 PM
It was definitely something wrong between the firewall and my core switches, namely the firewall's stateful inspection. So I had to change my topology so that the firewall has only one inside interface and still connects to both core switches. All is working fine now, and thanks for your support, Milan.
New topology attached
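For later readers, here is one plausible way stateful inspection can break traffic when a firewall has two inside interfaces. This is a toy model only, not the behaviour of any particular firewall: the class `ToyFirewall`, the interface names `inside1`, `inside2` and `inside`, and the rule "a reply must leave on the interface the session was opened on" are all assumptions made for illustration. The thread does not confirm the exact mechanism.

```python
class ToyFirewall:
    """Hypothetical stateful firewall; illustrative only, not a real product."""

    def __init__(self, return_route):
        # return_route: subnet -> inside interface the firewall would use to
        # send traffic back toward that subnet (assumed, e.g. learned via OSPF).
        self.return_route = return_route
        # (subnet, destination) -> inside interface the session was opened on.
        self.sessions = {}

    def outbound(self, subnet, dst, ingress_if):
        """Record a new session opened from `subnet` through `ingress_if`."""
        self.sessions[(subnet, dst)] = ingress_if

    def inbound_reply(self, subnet, dst):
        """Decide what happens to the reply coming back from `dst`."""
        opened_on = self.sessions.get((subnet, dst))
        if opened_on is None:
            return "drop: no session"
        egress_if = self.return_route[subnet]
        if egress_if == opened_on:
            return "pass"
        return "drop: interface mismatch"


SUBNET = "192.168.9.0/24"

# Two inside interfaces: the request arrives through CORE 1 (inside1),
# but the firewall's route back to the subnet points at CORE 2 (inside2).
two_inside = ToyFirewall({SUBNET: "inside2"})
two_inside.outbound(SUBNET, "8.8.8.8", ingress_if="inside1")
print(two_inside.inbound_reply(SUBNET, "8.8.8.8"))  # drop: interface mismatch

# One inside interface shared by both core switches: no mismatch is possible.
one_inside = ToyFirewall({SUBNET: "inside"})
one_inside.outbound(SUBNET, "8.8.8.8", ingress_if="inside")
print(one_inside.inbound_reply(SUBNET, "8.8.8.8"))  # pass
```

Under these assumptions, collapsing the two inside interfaces into one removes the mismatch case entirely, which fits the fix described above.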