FWSM OSPF neighbor stuck in init state

sjung
Level 1

We have an FWSM in a 6509 chassis that is running OSPF on the inside, outside, and DNS DMZ interfaces.  The inside and outside neighbor relationships are fine and functioning with no problems.  The DNS DMZ, which runs OSPF with 2 servers for anycast DNS, is stuck in init state.  A debug ospf events on the firewall shows that the DMZ gateway is sending hellos to 224.0.0.5 and that it is receiving hellos from the two servers.  A packet sniff on the DMZ VLAN, or on the EtherChannel feeding the switch that houses the servers, shows only the hellos sent by the servers.
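For anyone wondering why one-way hello loss leaves the neighbor in init rather than down: a router only moves a neighbor to 2-Way once it sees its own router ID listed in that neighbor's hello. The Python below is a minimal sketch of that rule, not FWSM code. The `Router` class, the `exchange` helper, and the router IDs are made up for illustration, and timers such as the dead interval are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class Router:
    router_id: str
    # Maps neighbor router ID -> "Init" or "2-Way" (absent means Down).
    neighbors: dict = field(default_factory=dict)

    def build_hello(self):
        # A hello lists every neighbor this router has heard a hello from.
        return {"router_id": self.router_id, "seen": set(self.neighbors)}

    def receive_hello(self, hello):
        sender = hello["router_id"]
        # Finding our own ID in the sender's list proves the sender hears us.
        if self.router_id in hello["seen"]:
            self.neighbors[sender] = "2-Way"
        else:
            self.neighbors[sender] = "Init"


def exchange(a, b, a_to_b_delivered, b_to_a_delivered, rounds=3):
    """Simulate a few hello intervals; a flag set to False models hellos lost in transit."""
    for _ in range(rounds):
        hello_a, hello_b = a.build_hello(), b.build_hello()
        if a_to_b_delivered:
            b.receive_hello(hello_a)
        if b_to_a_delivered:
            a.receive_hello(hello_b)


# Hypothetical router IDs: the gateway's hellos are lost, the server's arrive.
gateway = Router("fwsm-dmz")
server = Router("dns-server-1")
exchange(gateway, server, a_to_b_delivered=False, b_to_a_delivered=True)
print(gateway.neighbors)  # the gateway hears the server but is never listed by it
print(server.neighbors)   # the server never hears the gateway at all
```

With the gateway's hellos dropped, the gateway keeps the server in Init indefinitely, which matches what the debug and the sniffer show here.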

This configuration was up and working before our data center had power maintenance performed this weekend.  IP connectivity is working and the servers are reachable, but it seems that the OSPF multicast is being lost somewhere in layer 2.  I tried to establish an OSPF adjacency on a different DMZ VLAN with exactly the same results: the DMZ gateway sends hellos but they get lost somewhere in layer 2 land.  This was also confirmed with a packet sniffer on the VLAN and physical port where the test router was connected.

Nothing in the config has changed and, as stated previously, it was in working condition until the reboot after the power maintenance.  This exact same configuration is up and working in our southern data center.  I've stripped the OSPF config out and put it back with the same results.  The only place you ever see a hello generated by the gateway is in the debug on the FWSM.  Both sides are sending, but the servers and the test router never see the gateway's hellos.  Any thoughts would be greatly appreciated.  Thanks for your time.

Scott

3 Replies

sjung
Level 1

Just in case anyone else runs into this scenario, I wanted to give a few more details and the resolution to this problem.  This 6509 has a 6513 mate that shares HSRP responsibilities.  The 6513 also has an FWSM that is usually the secondary/standby unit.  I failed over to the secondary to make it active and the OSPF neighbors came up with no problem.  Keep in mind that nothing in layer 2 changed, as the 6509 was still the active switch.  After a reset of the FWSM in the 6509 and a failover back to it, the same symptoms returned.  I even swapped in the FWSM from our lab 6509, with the same results.

The next step was to actually reboot the 6509, thinking that this would clear any processes that might be "stuck."  Same results.  The neighbors on the DMZ were still stuck in init and the hellos from the gateway were still not being seen on the VLAN.

After comparing the two 6500s, I found that the only difference is that the 6509 has an IDSM-2.  Deciding to focus on that, I started digging deeper and found a known caveat for the IOS version that is running on both 6500s.  The version in production is 12.2(18)SXF11.  This bug can cause multicast problems when there is an FWSM AND an IDSM in the same chassis.  The bug ID is CSCsk62017.  The workaround is to create an SVI for the DMZ VLAN on the 6500 and assign it an arbitrary IP address.  You do NOT have to bring the interface up.

interface Vlan205
description ** DO NOT CHANGE - NEEDED FOR OSPF ON DMZ **
ip address 10.1.1.1 255.255.255.0
shutdown

After creating the SVI I made the 6509 FWSM active again and the neighbor relationships came up with no problem.  I still don't understand why this fixes the problem, because I do not have a solid understanding of how the 6500 and the FWSM communicate.  I know that mon sess 1 (service module session) and po270 play a critical role, but the details are unclear to me.

What had me looking in the wrong direction is that this config worked initially without the SVI on the 6509.  Seeing that the OSPF neighbor relationships on the inside and outside interfaces worked, I realized that those are the only two interfaces that are not being monitored by the IDS via the VACL.  This config working initially must have been sheer luck, as reboots and process clearing would not budge it.  Our southern data center was running on the secondary FWSM, which is in a 6513 that doesn't have an IDSM in it.  The same config change was made to the primary 6509 and its neighbor relationships came up with no problem either.
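If you want to check other chassis for the same combination, the comparison above boils down to three conditions: an FWSM present, an IDSM present, and the IOS version I reported. The Python below is only a rough sketch of that check. The function name and the module model strings are illustrative, and it tests only for the one version named in this post because I don't know the full affected range for CSCsk62017.

```python
# The only IOS version this thread confirms the behavior on (assumption: other
# releases may or may not be affected; check the bug details for CSCsk62017).
REPORTED_IOS_VERSION = "12.2(18)SXF11"


def matches_reported_conditions(module_models, ios_version):
    """Return True if a chassis has both an FWSM and an IDSM on the reported IOS version.

    module_models is a list of module model strings collected by hand or from
    inventory; the exact strings are illustrative, only the substrings matter.
    """
    has_fwsm = any("FWSM" in model.upper() for model in module_models)
    has_idsm = any("IDSM" in model.upper() for model in module_models)
    return has_fwsm and has_idsm and ios_version == REPORTED_IOS_VERSION


# Hypothetical inventories mirroring the two chassis in this thread.
chassis_6509 = ["WS-SVC-FWSM-1", "WS-SVC-IDSM-2"]
chassis_6513 = ["WS-SVC-FWSM-1"]

print(matches_reported_conditions(chassis_6509, "12.2(18)SXF11"))
print(matches_reported_conditions(chassis_6513, "12.2(18)SXF11"))
```

A match only means the chassis looks like the one that failed here, so the SVI workaround is worth having ready before the next reload or failover.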

I hope this helps anyone who runs into this scenario in the future as it really threw me for a loop.

I cannot thank you enough for your information.  We ran across this exact same bug today and, through the magic of Google, found your post here.  Your write-up, the explanation of your troubleshooting, and your posted workaround were wonderful.  It really saved us.

Thank you so much!

I'm happy to hear that this helped someone. You are very welcome.
