ASA Failover with HSRP issues

Good Morning,

This past weekend I scheduled some maintenance to move our LAN gateway off our ASA 5520s and onto our collapsed core, which consists of 3750 stacks. I was able to move the gateway down onto the collapsed core, the point-to-point network between the ASAs worked and was able to route packets, and the internal devices were able to get out to the internet. I then tried to fail over to the other ASA just to make sure everything was working properly, and this is where my project hit a wall.

When failing over to the other unit, the primary IP address for some reason didn't float over to the secondary. For example, let's say I used 10.1.0.0/28, where the primary ASA IP is 10.1.0.1 with a standby of 10.1.0.2, and the two internal 3750 stacks had IPs of 10.1.0.4 and 10.1.0.5 with an HSRP instance using the 10.1.0.3 IP.

So using the IPs above, when working off the primary (let's call it ASA1) everything worked properly and I was able to ping the secondary ASA. But when I failed over to the secondary (ASA2), ASA1 got the IP of 10.1.0.2 but ASA2 didn't get the IP of 10.1.0.1, and I wasn't able to get into it since the IP didn't move over.

The odd thing about all of this is that the config changes were being sent to the other ASA, so I am confused as to why it didn't work. Both 3750 stacks had a route statement of 0.0.0.0 0.0.0.0 10.1.0.1, which I believe is correct, and the ASA had a route back into the LAN that seemed to work properly off the ASA. When I rolled back, the gateway on the ASAs was able to ping regardless of which one was the primary.

Has anyone experienced this issue before who can shed some light on it or explain what I did wrong? I can give more information if needed.

Thanks,

13 REPLIES

How did you effect the failover? Did you run "show failover" and/or "show ip address" when you were having the problem?

From the high level description it sounds as if you did it right. I could speculate about arp cache entries but it would be a shot in the dark....


Hello Marvin,

Thanks for your reply. No failover commands were issued during our maintenance, so I feel that the failover config shouldn't have been affected. I guess the only part of the failover config that would have changed would have been the primary and standby IP addresses for the inside interfaces, although I figured that wouldn't have been an issue. I did issue show failover commands and show ip address commands. I wish I had captured the output.

From what I remember, the show failover output showed the correct state. By that I mean when failing over to the secondary it showed that the Primary was standby and the Secondary was active. When I did a show ip address on the Primary I got an IP of 10.1.0.2, but I couldn't SSH to or ping 10.1.0.1, which should have been the IP on the active ASA.

I guess the more I talk it out it may be a routing problem? From the ASA, routing to our default gateway I moved down (using fake IPs again) I had:

route inside 10.2.0.0 255.255.0.0 10.1.0.3

and from both the cores I had:

ip route 0.0.0.0 0.0.0.0 10.1.0.1


I asked "effect" not "affect". In other words what did you do to make the active node role switch to be on the secondary unit?

If the secondary unit believed it was not ready to take on the active role (usually due to a monitored interface being unavailable), it could mess things up.

If there's a good failover link up and for whatever reason you can't get onto one of the units, you can sometimes get good results with the "failover exec ..." command to pull information from the other unit.
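As a rough sketch of that suggestion, the commands below could be run from whichever unit is still reachable in order to query its peer across the failover link. The `mate` keyword targets the other unit; the particular show commands are illustrative choices only, not output or steps captured during the maintenance described in this thread.

```
! Run from the unit you can still reach (privileged EXEC)
! Ask the peer for its own view of the failover state
failover exec mate show failover
! Ask the peer which addresses it currently holds on each interface
failover exec mate show ip address
```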


Doh, silly me. On the primary ASA ASDM I clicked the make standby button. Only reason I used ASDM was for screenshot purposes for our prod controls team.

Your comment about a monitored interface being unavailable made me think of something. I am using HSRP on the core layer so that the 10.1.0.3 IP always remains up. Is it possible that the 10.1.0.1 IP moved to the secondary but it wasn't ready to take on the active role because the core stack connected to ASA1 was still answering the 10.1.0.3 traffic, and not the stack connected to ASA2?

Ah, thank you for the advice. That is good to know because I was very curious about the other ASA.
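For reference, the ASDM "make standby" button has a CLI counterpart. A minimal sketch, assuming console or SSH access to the units; which of the two forms applies depends on which unit you are logged in to:

```
! On the currently active unit: give up the active role
no failover active

! Alternatively, on the currently standby unit: take the active role
failover active

! Afterwards, confirm which unit holds which role and which IPs
show failover
show ip address
```

Capturing the `show failover` and `show ip address` output from both units right after the switch makes it much easier to see whether the active addresses actually moved.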


I did some googling and found some potential issues. Here is an example of my HSRP commands issued on the Core level 3750 stacks.

###Main Core###

ip route 0.0.0.0 0.0.0.0 10.1.0.1

interface GigabitEthernet 1/0/1
 switchport mode access
 switchport access vlan 902

interface vlan 902
 ip address 10.1.0.4 255.255.255.240
 standby 1 ip 10.1.0.3
 standby 1 priority 200
 standby 1 preempt
 standby 1 authentication md5 key-string XXXXX

interface vlan 10
 ip address 10.10.0.2 255.255.0.0
 standby 1 ip 10.10.0.1
 standby 1 preempt
 standby 1 priority 200
 standby 1 authentication md5 key-string XXXXXX

###Secondary Core###

ip route 0.0.0.0 0.0.0.0 10.1.0.1

interface GigabitEthernet 1/0/22
 switchport mode access
 switchport access vlan 902

interface vlan 902
 ip address 10.1.0.5 255.255.255.240
 standby 1 ip 10.1.0.3
 standby 1 preempt
 standby 1 authentication md5 key-string XXXXXXXXX

interface vlan 10
 ip address 10.10.0.3 255.255.0.0
 standby 1 ip 10.10.0.1
 standby 1 preempt
 standby 1 authentication md5 key-string XXXXXXX

Is the issue I am running into caused by me not adding a tracking command to my SVI for VLAN 902? If it is, how does the switch know to fail over the IP of 10.1.0.3 if it's a software failover and there isn't a disconnect between the ASA and the core?


Should this go under the Route/Switching forum? I haven't heard back from anyone and don't want to double post.


It's a user forum. We try to chime in as our day job allows.

Your HSRP should not have to fail over. HSRP operates independently of ASA failover. ASA failover is mostly about the downstream switch recognizing that the MAC address associated with the active ASA inside address is now reachable out the port connected to ASA2 instead of ASA1. Since ASAs use a virtual MAC which lives on the active unit (primary or secondary, as the case may be), that should not be an issue for you. (Reference)
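One way to check this from the 3750 stacks, sketched here with the placeholder addresses used earlier in this thread, is to compare what the switch has learned for 10.1.0.1 before and after a failover. If failover behaves as described, the MAC in the ARP entry should stay the same while the port it is learned on changes. The exact command spelling can differ between IOS releases, so treat this as an assumption to verify on your platform.

```
! Which MAC does the core resolve for the active ASA address?
show ip arp 10.1.0.1

! On which port is that MAC currently learned in the ASA-facing VLAN?
show mac address-table vlan 902
```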


Ah, my mistake Marvin, I thought there were staff members on these forums as well. I appreciate the time you are taking out of your day to respond!

OK, so do you think I have an incorrect routing statement somewhere? I guess I am just stumped as to why this isn't working. It must be a single line of config or something I am not doing properly. I don't get why the 10.1.0.1 address seems to drop off the network on a failover.

What I was thinking, up until reading your reference and your post, was this: when I failed over and ASA2 became active, the core stack in the other building held the 10.1.0.3 address, and in order to route traffic out using my route 0.0.0.0 0.0.0.0 10.1.0.1 command it would have had to send it across the trunk to the core in the other building and then up to ASA2, which for some reason it wasn't doing.


There are Cisco staff around but there's no SLA or obligation for them to reply here. The TAC (if you have support contract coverage) is the avenue for a guaranteed Cisco response.

It smells more like a L2 problem to me.

Does the inter-switch trunk allow VLAN 902? (It should if the HSRP group is forming properly.)

Can the main core switch reach the standby ASA 10.1.0.2 address?

Is the failover state healthy on the ASAs? What does "show failover" report?
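The three checks above could be run roughly as follows. This is only a sketch: the VLAN number and addresses are the placeholder values from earlier in the thread, and the first three commands are meant for the main core switch while the last one is meant for either ASA.

```
! On the main core 3750 stack
! Is VLAN 902 allowed and forwarding on the inter-switch trunk?
show interfaces trunk
! Has the HSRP group formed, with one active and one standby peer?
show standby brief
! Can the core reach the standby ASA address?
ping 10.1.0.2

! On the ASA
show failover
```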


Yes, when doing a show interfaces trunk I see the VLAN allowed on the inter-switch trunk.

Here is the show failover output you were asking for. Since we had to roll back, I don't know if it will help illustrate what went wrong on Saturday. One section looks odd to me: the inside interface on the other host shows 0.0.0.0, when the standby should be 10.10.0.2. I will see what you think though.

FW01# show failover
Failover On
Failover unit Primary
Failover LAN Interface: failover GigabitEthernet0/3 (up)
Unit Poll frequency 1 seconds, holdtime 15 seconds
Interface Poll frequency 5 seconds, holdtime 25 seconds
Interface Policy 1
Monitored Interfaces 3 of 160 maximum
Version: Ours 8.4(3), Mate 8.4(3)
Last Failover at: 06:40:31 EDT Oct 20 2012
        This host: Primary - Active
                Active time: 10031620 (sec)
                slot 0: ASA5520 hw/sw rev (2.0/8.4(3)) status (Up Sys)
                  Interface outside (X.X.X.X): Normal (Monitored)
                  Interface inside (10.10.0.1): Normal (Waiting)
                  Interface dmz (X.X.X.X): Normal (Monitored)
                slot 1: ASA-SSM-20 hw/sw rev (1.0/7.0(7)E4) status (Up/Up)
                  IPS, 7.0(7)E4, Up
        Other host: Secondary - Standby Ready
                Active time: 11034958 (sec)
                slot 0: ASA5520 hw/sw rev (2.0/8.4(3)) status (Up Sys)
                  Interface outside (X.X.X.X): Normal (Monitored)
                  Interface inside (0.0.0.0): Normal (Waiting)
                  Interface dmz (X.X.X.X): Normal (Monitored)
                slot 1: ASA-SSM-20 hw/sw rev (1.0/7.0(7)E4) status (Up/Up)
                  IPS, 7.0(7)E4, Up

Stateful Failover Logical Update Statistics
        Link : failover GigabitEthernet0/3 (up)
        Stateful Obj    xmit       xerr       rcv        rerr
        General         595126873  0          826888610  560787
        sys cmd         2680812    0          2680812    0
        up time         0          0          0          0
        RPC services    0          0          0          0
        TCP conn        172374390  0          223907351  212485
        UDP conn        122738711  0          195481873  348302
        ARP tbl         297162520  0          404571469  0
        Xlate_Timeout   0          0          0          0
        IPv6 ND tbl     0          0          0          0
        VPN IKEv1 SA    4499       0          5612       0
        VPN IKEv1 P2    120583     0          182830     0
        VPN IKEv2 SA    0          0          0          0
        VPN IKEv2 P2    0          0          0          0
        VPN CTCP upd    0          0          0          0
        VPN SDI upd     0          0          0          0
        VPN DHCP upd    0          0          0          0
        SIP Session     0          0          0          0
        Route Session   0          0          0          0
        User-Identity   45358      0          58663      0

        Logical Update Queue Information
                        Cur     Max     Total
        Recv Q:         0       31      1045252720
        Xmit Q:         0       1503    745317343


What's the complete configuration of interface "inside"?

Does it have the standby IP address configured in the interface configuration? I guess it should, since you have pinged it?

- Jouni


Yes that output looks odd. The standby IP should be showing up.

What does the interface address section of your inside interface look like? I would expect something like:

nameif inside

security-level 100

ip address 10.10.0.1 255.255.255.240 standby 10.10.0.2
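Applied to the example /28 from the original post, the same pattern might look like the sketch below. The addresses are the placeholder values used earlier in this thread, the physical interface name is taken from the config the poster shares later, and the `monitor-interface` line is an assumption about what should be monitored rather than something taken from the actual configuration.

```
interface GigabitEthernet0/1
 nameif inside
 security-level 100
 ! Active address first, then the address the standby unit will use
 ip address 10.1.0.1 255.255.255.240 standby 10.1.0.2
!
! Include the inside interface in failover health monitoring
monitor-interface inside
```

Without the `standby` keyword and address on the interface, the standby unit has no address of its own on that network, which would match a `show failover` line reporting 0.0.0.0 for the peer.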


Here is what the inside interface looks like on a show run:

interface GigabitEthernet0/1

nameif inside

security-level 100

ip address 10.10.0.1 255.255.0.0 standby 10.10.0.2

When we rolled back, for some reason the standby command didn't take on the inside interface, so I had to manually add it just now for that interface. During Saturday's maintenance I remember all interfaces showing as monitored.

Here is the new show failover output:

CLE-FW01# show failover
Failover On
Failover unit Primary
Failover LAN Interface: failover GigabitEthernet0/3 (up)
Unit Poll frequency 1 seconds, holdtime 15 seconds
Interface Poll frequency 5 seconds, holdtime 25 seconds
Interface Policy 1
Monitored Interfaces 3 of 160 maximum
Version: Ours 8.4(3), Mate 8.4(3)
Last Failover at: 06:40:31 EDT Oct 20 2012
        This host: Primary - Active
                Active time: 10033667 (sec)
                slot 0: ASA5520 hw/sw rev (2.0/8.4(3)) status (Up Sys)
                  Interface outside (X.X.X.X): Normal (Monitored)
                  Interface inside (10.10.0.1): Normal (Monitored)
                  Interface dmz (X.X.X.X): Normal (Monitored)
                slot 1: ASA-SSM-20 hw/sw rev (1.0/7.0(7)E4) status (Up/Up)
                  IPS, 7.0(7)E4, Up
        Other host: Secondary - Standby Ready
                Active time: 11034958 (sec)
                slot 0: ASA5520 hw/sw rev (2.0/8.4(3)) status (Up Sys)
                  Interface outside (X.X.X.X): Normal (Monitored)
                  Interface inside (10.10.0.2): Normal (Monitored)
                  Interface dmz (X.X.X.X): Normal (Monitored)
                slot 1: ASA-SSM-20 hw/sw rev (1.0/7.0(7)E4) status (Up/Up)
                  IPS, 7.0(7)E4, Up

Stateful Failover Logical Update Statistics
        Link : failover GigabitEthernet0/3 (up)
        Stateful Obj    xmit       xerr       rcv        rerr
        General         595396365  0          826888884  560787
        sys cmd         2681086    0          2681086    0
        up time         0          0          0          0
        RPC services    0          0          0          0
        TCP conn        172455720  0          223907351  212485
        UDP conn        122836608  0          195481873  348302
        ARP tbl         297252441  0          404571469  0
        Xlate_Timeout   0          0          0          0
        IPv6 ND tbl     0          0          0          0
        VPN IKEv1 SA    4500       0          5612       0
        VPN IKEv1 P2    120636     0          182830     0
        VPN IKEv2 SA    0          0          0          0
        VPN IKEv2 P2    0          0          0          0
        VPN CTCP upd    0          0          0          0
        VPN SDI upd     0          0          0          0
        VPN DHCP upd    0          0          0          0
        SIP Session     0          0          0          0
        Route Session   0          0          0          0
        User-Identity   45374      0          58663      0

        Logical Update Queue Information
                        Cur     Max     Total
        Recv Q:         0       31      1045252994
        Xmit Q:         0       1503    745636858
