OSPF and HSRP Routing through standby?

stevericks
Level 1

Hi,

We had a network outage to a remote site.

The site has 2 Virgin Media links into 2 different comms rooms. They connect to 3560 switches and they in turn connect to 2 ASA Firewalls.

The link between the 2 switches went down, which seemed to take the standby switch offline. After that we couldn't get to the remote site and they couldn't run our apps.

I have tried to recreate the problem using Cisco Packet Tracer.

What confuses me is that the tracert I am doing goes via 10.36.100.2, the standby address. (Ignore the first tracert.)

Looking at my monitoring, traffic seems to come in on the active switch, go across the gigabit link to the standby switch, come back again to the active one and then go on to the firewall. (The firewall is off the diagram to the right.)

It's doing the same on the live network, so I'm getting somewhere.

The HSRP address is 10.36.100.1 and the active address is 10.36.100.3.

Sometimes it does go via 10.36.100.3, so it seems to be load balancing a bit. But I would expect it to go via 10.36.100.3 all the time.

So my question is: why is it going via the standby switch at all?

Hope that all makes some sense.....

Any help will be much appreciated.

Thanks,

Steve.

14 Replies

Philip D'Ath
VIP Alumni

I'm assuming there are no dynamic routing protocols, and everything is statically routed.

Losing the link between the HSRP switches would make both switches go HSRP active, since they would both think the other has failed. You should consider using two links in a channel.

The routing behaviour of the two Virgin routers is unknown.  They could load balance or be primary/standby.  In either case they would normally only forward to the HSRP address.  In this case, both routers would be forwarding to both switches, since they are both directly attached and both switches are HSRP active.
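
The dual-active behaviour described above can be illustrated with a minimal Python sketch. This is a simplified model, not the real HSRP state machine: the switch names and priority values are assumptions, preemption is assumed to be enabled, and a switch simply becomes active when it hears no hello from a higher-priority peer.

```python
def hsrp_active(priorities, link_up):
    """Return the set of switches that consider themselves HSRP active.

    priorities: dict of switch name -> HSRP priority (hypothetical values).
    link_up: True if HSRP hellos can cross the inter-switch link.
    """
    active = set()
    for name, prio in priorities.items():
        # Priorities of the peers whose hellos this switch can actually hear.
        heard = [p for n, p in priorities.items() if n != name and link_up]
        # Simplification: become active if no heard peer has a higher priority.
        if all(prio > p for p in heard):
            active.add(name)
    return active


priorities = {"sw-active": 110, "sw-standby": 100}  # assumed values

print(sorted(hsrp_active(priorities, link_up=True)))   # ['sw-active']
print(sorted(hsrp_active(priorities, link_up=False)))  # ['sw-active', 'sw-standby']
```

With the link down, neither switch hears the other, so both answer for the virtual IP at the same time.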

Thanks for the quick reply guys.

Philip, that does make good sense. So the best way forward is to put another link between the switches in an EtherChannel.

To answer some of the questions: yes, it is OSPF between the routers and the switches. The switches have VLAN interface IP addresses on them. The link between the 2 switches is a single fibre trunk. The Virgin routers are in an active/standby state, using HSRP again I believe.

The other thing to mention is that the ASA firewall apparently failed over to the standby one. I don't manage the firewalls (a third-party company does), so I can't see the config or logs.

Steve.

Steve

Adding another link would help but your traceroute doesn't make sense unless your diagram is wrong.

If the link between the switches is down then the only way for the router to send traffic to the standby switch is via the other router but your traceroute does not show that.

It shows the packet arriving at the bottom router and then the next hop is the standby switch but it can't get to that switch via the other switch because the link is down.

So it would have to go via the other router and this would show as another hop in your traceroute.

Jon

Here is a tracert on the live network. It shows traffic going the correct way, but then it goes to the address of the VLAN interface on the backup switch. Very confusing.

Could it be that OSPF sees the 1 Gb link to the other switch as a faster route than the 100 Mb link to the firewall?

Or is it that the VLAN is on both switches and it doesn't matter which VLAN interface it uses?

Tracing route to SHDC1 [172.24.10.11]
over a maximum of 30 hops:

  1    <1 ms    <1 ms    <1 ms  172.16.32.156
  2    <1 ms    <1 ms    <1 ms  fghfw3.ottouk.com [172.16.32.209]
  3     1 ms     1 ms    <1 ms  listerhills_demarcation_sw01.ottouk.com [10.227.178.2]
  4    <1 ms    <1 ms    <1 ms  10.227.177.11 (Goes onto Virgin Media MPLS)
  5     *        *        *     Request timed out.
  6     *        *        *     Request timed out.
  7     *        *        *     Request timed out.
  8     *        *        *     Request timed out.
  9     *        *        *     Request timed out.
 10     5 ms     5 ms     5 ms  host-80-193-79-229.static.cable.virginmedia.com [80.193.79.229] (This is the correct Virgin Media Router at their end)
 11     5 ms     5 ms     6 ms  10.36.100.2 (Hits the Backup Switch??? Should be .3)
 12     5 ms     5 ms     5 ms  10.37.100.1 (Hits their firewall cluster address)
 13     *        *        *     Request timed out.
 14     6 ms     5 ms     5 ms  SHDC1 [172.24.10.11]

Trace complete.

Thanks,

Steve.
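
On the question above about the 1 Gb link versus the 100 Mb link: on Cisco IOS the default OSPF interface cost is the reference bandwidth (100 Mbps by default) divided by the interface bandwidth, with a minimum of 1. A minimal sketch of that calculation follows. It assumes the reference bandwidth has not been changed with `auto-cost reference-bandwidth` and that no manual `ip ospf cost` is set, neither of which is confirmed for this network.

```python
def ospf_cost(interface_mbps, reference_mbps=100):
    """Default Cisco-style OSPF cost: reference / bandwidth, rounded down, minimum 1."""
    return max(1, reference_mbps // interface_mbps)


print(ospf_cost(100))    # 1
print(ospf_cost(1000))   # 1
print(ospf_cost(10))     # 10

# If the reference bandwidth were raised to 1000 Mbps, the two links would differ:
print(ospf_cost(100, reference_mbps=1000))   # 10
print(ospf_cost(1000, reference_mbps=1000))  # 1
```

With the defaults, a 100 Mb and a 1 Gb interface both get a cost of 1, so link speed alone would not make OSPF prefer the gigabit link.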

Steve

Is this with the link between the 3560s up ?

If your diagram is accurate it has to be up because if it is down the only way to get to the backup switch is via the other router and that is not showing as a next hop.

Can you clarify ?

Jon

Yes, sorry, it's all up and working now.

I just can't figure out why it would report the backup switch's VLAN interface IP address as one of the hops.

Steve.

Steve

If you are using OSPF and you do not have a static route with a lower AD pointing to the VIP, then the routers can choose either switch as the next-hop IP. Because it is a common VLAN, they will see equal-cost paths to all destinations.

Edit - that is assuming the routers have OSPF routes to the internal networks which depends on whether the firewalls are advertising them or not.

Jon
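
The selection logic Jon describes can be sketched as follows. This is a simplified model of route selection for a single prefix, not router code: the lowest administrative distance wins, then the lowest metric, and ties are kept as equal-cost next hops. The next-hop addresses come from this thread; the AD values are the Cisco defaults (OSPF 110, static 1); the metric values are purely illustrative.

```python
def best_next_hops(routes):
    """Pick next hops for one prefix: lowest AD, then lowest metric; ties are all kept."""
    best_ad = min(r["ad"] for r in routes)
    candidates = [r for r in routes if r["ad"] == best_ad]
    best_metric = min(r["metric"] for r in candidates)
    return sorted(r["next_hop"] for r in candidates if r["metric"] == best_metric)


# Both switches advertise the same prefix over the common VLAN at equal cost.
ospf_routes = [
    {"next_hop": "10.36.100.3", "ad": 110, "metric": 20},  # metric is illustrative
    {"next_hop": "10.36.100.2", "ad": 110, "metric": 20},
]
print(best_next_hops(ospf_routes))
# ['10.36.100.2', '10.36.100.3']

# A static route to the HSRP VIP has a lower AD and replaces both OSPF paths.
static_to_vip = {"next_hop": "10.36.100.1", "ad": 1, "metric": 0}
print(best_next_hops(ospf_routes + [static_to_vip]))
# ['10.36.100.1']
```

The first result matches the traceroute behaviour in this thread, where traffic sometimes goes via .2 and sometimes via .3.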

I think that is what I suspected, Jon. There are static routes on the switches and OSPF then advertises them around the network.

It's been years since I did my CCNA and I had forgotten all about administrative distances.

Looking at the switches quickly, I can see they all appear to use default ADs. So you're right.

So if I give the route I want a lower AD, traffic should always go that way unless it's down.

Although, just a thought: we are using HSRP, so that would be the advertised route.

So I presume OSPF will pick either route.

Steve.

Steve

It's difficult to say what to do without understanding the traffic flows you want in terms of links used both to and from the site.

You can use statics to force traffic a certain way, but you may only be doing that because of the interaction of HSRP with OSPF. Ideally you would only be running one of them between the routers and switches.

Which to use really does depend on what you are trying to achieve and we would need to understand how you want the WAN links used for traffic both ways.

Jon

We are going to do a bit of testing in the next couple of weeks, shutting down links etc.

I think going forward I can recommend an extra fibre be installed between the 2 switches and also have a chat with Virgin as to why we have HSRP as well as OSPF.

Thanks for all the help, been a while since I had to look so deeply into any network issue.

Steve.

Steve

No problem.

One last thing before your chat: bear in mind what happens with a link failure between either of the routers and its corresponding switch, because it is not just the switch interconnect you need to think about.

I suspect that is part of the reason for using both HSRP and OSPF and you may find you can use one or the other instead of both.

By all means come back if you want to discuss further at a later date.

Jon

If you are using OSPF I don't understand why you need HSRP.

But yes, I would use dual links in an EtherChannel.

Good point....

This was implemented by a predecessor, so I'm trying to get my head around it.

Giuseppe Larosa
Hall of Fame

Hello Steve,

I don't know how good Packet Tracer is at replicating a scenario like yours.

In order to get better help you should provide:

- a complete network diagram including the ASA firewalls, which are likely deployed in an active/standby setup.

- a description of the routing protocols between the SP routers (Virgin Media routers in your case) and the two C3560s. Are they speaking OSPF between them? Or are there only static routes using HSRP VIP addresses as IP next hops?

- whether there is an L2 path between the two C3560s, so that if the standby C3560 receives a packet destined for a downstream network it can reach the active ASA firewall via the other C3560.

In other words, is the VLAN shared by the two C3560s and the two SP routers carried over a trunk between the two C3560s? (I expect so, as you have configured HSRP, which cannot run across multiple broadcast domains.)

If the above is true, you don't need to worry about the traceroute hitting the standby switch: at OSI L2 the traffic is then forwarded to the active ASA via the other C3560 and everything should work.

If you can, attach the configuration of the two SP routers and of the two C3560s.

In your real network when the standby C3560 was isolated from the network the whole site became unreachable.

You should check the logs of the C3560 devices and of the ASA to see what happened.

Hope to help

Giuseppe

Edit:

Philip is right: when the inter-switch link is broken, both switches claim to be the HSRP active router because they don't see each other's HSRP messages, and so each SP router forwards to its directly connected switch. The difference is downstream, where only one ASA is likely active. If your normally standby C3560 connects to the standby ASA, it has no way to reach the active ASA while the inter-switch link is down.

Recommendation: use a port channel for the inter-switch link in order to achieve better redundancy.
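
To illustrate why a port channel helps, here is a minimal sketch of the failure case above. It assumes the active ASA hangs off only one switch and that the inter-switch connection is the only L2 path between the two switches; the switch and fibre names are hypothetical.

```python
def inter_switch_up(member_links):
    """A port channel stays up while at least one member link is up."""
    return any(member_links.values())


def can_reach_active_asa(switch, asa_switch, member_links):
    """True if traffic arriving at `switch` can reach the active ASA.

    asa_switch: the switch the active ASA is attached to (assumed single-homed).
    member_links: dict of link name -> True if that physical link is up.
    """
    return switch == asa_switch or inter_switch_up(member_links)


single = {"fibre-1": False}                   # the one trunk has failed
bundle = {"fibre-1": False, "fibre-2": True}  # one member of a two-link channel has failed

print(can_reach_active_asa("sw-b", "sw-a", single))  # False
print(can_reach_active_asa("sw-b", "sw-a", bundle))  # True
```

With a single trunk, one fibre failure isolates the switch that is not attached to the active ASA. With two members in the channel, the same failure leaves the path intact.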
