(TL;DR at the bottom)
We're in the process of rolling out VoIP across our campus. The network is divided up such that we have a router per building (about 20 buildings on site), off each of which anything from 10 to 40 edge switches hang depending on building size. Each building router is uplinked to each of our two core devices. We've converged our VoIP and data infrastructure, so the VoIP network in each building is merely a different VLAN on the same equipment as our user PCs (all switches can provide PoE to every port). Each building router has, as much as possible, an identical configuration. The only differences from building to building are the actual VLAN IDs and the subnet defined for each VLAN of a specific type. Across all buildings, all VLANs of a given type share a wider range: for example, all VoIP VLANs might sit within 10.10.0.0/16, even though an individual building might only have 10.10.44.0/22. Any firewall config references 10.10.0.0/16, so we can use the same firewall config on every router.
DHCP for all networks, including VoIP, is handled by a central isc-dhcpd server which understands and hands out all the extended attributes required by IP phones, wireless access points, PXE-booting machines, etc., based on the source IP (i.e. the gateway address) the request is being forwarded from.
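As a sanity check on that addressing scheme, a minimal Python sketch could confirm that every per-building VoIP subnet falls inside the range the shared firewall config references. The building names and the second subnet are made up for illustration; only 10.10.0.0/16 and 10.10.44.0/22 come from the example above.

```python
import ipaddress

# Shared range referenced by the firewall config (from the example in the post).
VOIP_SUPERNET = ipaddress.ip_network("10.10.0.0/16")

# Per-building VoIP subnets. Names and the second entry are hypothetical.
BUILDING_VOIP_SUBNETS = {
    "building-a": ipaddress.ip_network("10.10.44.0/22"),
    "building-b": ipaddress.ip_network("10.10.48.0/22"),
}


def outside_supernet(subnets, supernet):
    """Return the names of subnets the shared firewall rule would not cover."""
    return [name for name, net in subnets.items() if not net.subnet_of(supernet)]


print(outside_supernet(BUILDING_VOIP_SUBNETS, VOIP_SUPERNET))
```

An empty list means every building's VoIP subnet is covered by the common rule.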
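To illustrate the relay-based selection, here is a rough Python sketch of the idea only, not of how isc-dhcpd implements it. The scope table, the "data" subnet, and the `scope_for_relay` name are all hypothetical; only 10.10.44.0/22 comes from the example above.

```python
import ipaddress

# Hypothetical scope table: network -> type of VLAN it serves.
SCOPES = {
    ipaddress.ip_network("10.10.44.0/22"): "voip",
    ipaddress.ip_network("10.20.44.0/22"): "data",  # invented for illustration
}


def scope_for_relay(giaddr):
    """Return (network, kind) for the scope containing the relay address, or None."""
    addr = ipaddress.ip_address(giaddr)
    for net, kind in SCOPES.items():
        if addr in net:
            return net, kind
    return None


print(scope_for_relay("10.10.44.1"))
```

A request relayed from a gateway address that matches no scope returns None here, which corresponds to the server having nothing to offer for that subnet.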
Now on to the actual issue:
We've completed about 15 buildings of POTS -> VoIP migration so far, with no problems in any of them except the latest one. In the latest building, we've had numerous complaints of phones displaying the message "IP Address Released" and requiring a reboot to get them back online. We've yet to manage to catch a packet dump of the moment this occurs, but mirroring the port the phone is on *after* the occurrence simply shows no traffic going from the phone out to the network. It's not attempting DHCP discover/request or anything, although the switch shows the port as up, on the correct VLAN, and sees the phone's MAC as being on that port. There is broadcast traffic for the relevant VLAN being sent out to the phone.
Everything I can find online about the "IP Address Released" message has to do with how to force this state through the settings menu of the phone itself: you purposefully release an IP on one VoIP VLAN for use by another device, then move the phone you cleared the IP on to a different VoIP VLAN where it will obtain a new IP. We've never had to do this, as our DHCP server sets the age of a lease to 48 hours and frees the IP if the phone hasn't re-requested it within that timeframe (all devices should re-request before the max lease time in the original DHCP response). I assume the manual release is mainly relevant if your CUCM is also your VoIP DHCP server.
But we're getting phones which randomly show this message with no user input. Users are sitting at their desks, working away fine on their PCs, and suddenly notice that the phone is displaying this error message. There has been no noticeable interruption of networking for the PC (generally plugged into a datapoint going to the next port up on the same switch). If they unplug and reconnect the phone's network cable, causing the phone to reboot, it'll come online just fine and obtain an IP. I just have no idea what could be causing this message to occur all over this one building but nowhere else. Do the phones enter this state if they re-request their current IP from the DHCP server and receive no response? Our DHCP server is run by a different team, so it's awkward but not impossible to get logs from it to see if a request came in.
TL;DR Phones on a specific VoIP subnet keep displaying "IP Address Released" message and needing a reboot to get back online. Users are not choosing the option in the settings menu that manually releases the IP, so what could be causing this behaviour? All other phones on site, on other VoIP subnets, are fine.
Do the phones enter this state if they re-request their current IP from the DHCP server and receive no response?
Yes. The phones will attempt to renew their DHCP-assigned address when 50% of the lease period has passed. In your case, this means the phone will issue a DHCP Request (not a Discover) every 24 hours. If it does not get an Acknowledge response it will release its address entirely.
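For reference, the renewal arithmetic can be sketched in a few lines of Python. This assumes the RFC 2131 default timer fractions (T1 at 0.5 and T2 at 0.875 of the lease), which a server can override with DHCP options 58 and 59; whether the phone firmware follows these defaults exactly is an assumption, and `dhcp_timers` is just an illustrative name.

```python
from datetime import timedelta


def dhcp_timers(lease, t1_fraction=0.5, t2_fraction=0.875):
    """Return (T1, T2, expiry) as offsets from the moment the lease was granted.

    T1 is when the client starts unicast renewal, T2 is when it falls back
    to broadcast rebinding, and expiry is when it must stop using the address.
    """
    return lease * t1_fraction, lease * t2_fraction, lease


t1, t2, expiry = dhcp_timers(timedelta(hours=48))
print("renew at:", t1)
print("rebind at:", t2)
print("expires at:", expiry)
```

With a 48-hour lease this gives a first renewal attempt at 24 hours and, under the standard defaults, a rebinding attempt at 42 hours before the lease finally expires.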
If this is really limited to a single site then either something is causing the DHCP server to behave differently for that scope (e.g. a policy preventing devices from retaining the same address, or address exhaustion), or you have a DHCP helper problem. Maybe check that the IOS version on the DHCP helper is the same between a known-working building and the suspect building.
Thanks for the reply. Our switches and routers aren't actually Cisco, so don't run IOS, but I take your point. I had a look at the DHCP server logs and we're getting requests from the problematic phones and sending responses, but for some reason the phones are presumably not getting them. It looks like the phone is then falling back to a discover, and we're sending an offer but then never seeing a request, so presumably the phone sometimes isn't getting the offer either. I'll investigate communication with our DHCP server for this subnet. Oddly, it's only affecting 6941 model phones. We've got a number of 7940s and 7960s on the same subnets, even down to the same edge switches in some locations, and none of them are affected.
We're also seeing a message in the logs on the phones themselves - "AMMU: ammu_simple_msg(). Queue post for message FAILED, error = 3d0002". I can't find any details of that error message or just the error code on Google or Cisco's bug search tool, but I assume it is related to the lack of IP...
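For anyone wanting to spot the same DISCOVER/OFFER-with-no-REQUEST pattern in their own server logs, a small Python sketch along these lines might help. The regular expressions assume the usual isc-dhcpd syslog message shapes ("DHCPOFFER on <ip> to <mac> via <relay>" and so on), so adjust them to your actual log format; the sample lines and MAC addresses are invented, and `stalled_after_offer` is a hypothetical helper name.

```python
import re

MAC = r"(?P<mac>[0-9A-Fa-f:]{17})"

# Assumed isc-dhcpd message shapes; verify against your own logs.
PATTERNS = {
    "DISCOVER": re.compile(r"DHCPDISCOVER from " + MAC),
    "OFFER": re.compile(r"DHCPOFFER on \S+ to " + MAC),
    "REQUEST": re.compile(r"DHCPREQUEST for \S+ (?:\(\S+\) )?from " + MAC),
    "ACK": re.compile(r"DHCPACK on \S+ to " + MAC),
}


def stalled_after_offer(lines):
    """Return MACs whose most recent logged event is an OFFER with no REQUEST after it."""
    last_event = {}
    for line in lines:
        for kind, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                last_event[match.group("mac").lower()] = kind
                break
    return sorted(mac for mac, kind in last_event.items() if kind == "OFFER")


# Invented sample data: the first phone stalls after an offer, the second renews normally.
SAMPLE = [
    "DHCPREQUEST for 10.10.44.23 from 00:11:22:33:44:55 via 10.10.44.1",
    "DHCPACK on 10.10.44.23 to 00:11:22:33:44:55 via 10.10.44.1",
    "DHCPDISCOVER from 00:11:22:33:44:55 via 10.10.44.1",
    "DHCPOFFER on 10.10.44.23 to 00:11:22:33:44:55 via 10.10.44.1",
    "DHCPREQUEST for 10.10.44.30 from 66:77:88:99:aa:bb via 10.10.44.1",
    "DHCPACK on 10.10.44.30 to 66:77:88:99:aa:bb via 10.10.44.1",
]

print(stalled_after_offer(SAMPLE))
```

Cross-referencing the resulting MACs against phone models would show whether the stalled clients are all the same model.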
Is the problem resolved? If yes, what steps did you take to resolve it?
I faced the same issue. We have Juniper switches, and the phones were getting their info through LLDP. At TAC's request we disabled CDP on all 6941 IP phones, which resolved the issue.
Did anyone find a resolution for this issue? We are facing the same problem, but we cannot disable CDP as all of our devices are Cisco.