
DNS resolve issues - content and group rule

gmiiller
Level 1

I'm having intermittent issues with DNS resolution where clients aim at a VIP for DNS, with requests farmed off to a cluster of resolvers. There is also a fairly vanilla group config for the VIP address and the services.

Things generally work okay, but under some failure conditions (e.g. a DNS resolver stops) the CSS picks up the service state accurately, yet clients that it previously directed to that particular resolver (balance srcip) can no longer resolve DNS. There also appears to be a high number of DNS errors even under normal conditions, and behaviour differs between requests that use 53 as the source port and those that use random high ports.

Should I be doing a

'flow-state 53 udp flow-disable nat-enable' ?

4 Replies

Gilles Dufour
Cisco Employee

The CSS does not like UDP traffic with a source port in the 1-1024 range.

So you should indeed disable flow mapping for DNS traffic.

The command you suggested should be correct.

[Be careful: it does not exist in every version.]

Regards,

Gilles.

We have run into this issue also.

For a few months there were no problems and then boom! DNS responses started dropping at the CSS.

The problem would temporarily go away if we either recycled the DNS service on the back-end server or if we manually cleared the FCB on the CSS (i.e.: first determine the flow id number in HEX using the "flow-agent show active_fcbs tuple ..." debug-mode command and then issue the debug-mode command: flow-agent action kill_fcb 0x########).

Recycling the DNS service on the web server causes the server to use a new random UDP source port for future DNS queries, which forces the CSS to create a brand new (working) FCB.

I have a few questions about this issue. Why does the CSS not have this flow state disabled by default for DNS? Why did this start failing on us only now? Could it be because the server is now generating a lot more DNS traffic than it did in the past, which keeps the UDP 53 flow active for too long and prevents the "flow garbage collection" process from kicking in? If so, is there a known threshold for UDP flows that would explain the break-down point (i.e. a maximum number of packets or bytes for a given UDP flow, or a maximum time that a given UDP flow can be sustained)?

Thanks in advance,

Daniel

There is no threshold at which a flow breaks.

It should work forever.

I would suggest capturing the output of 'flow-agent show fcb 0x....', where 0x.... is the flow ID that you get from 'flow-agent show active'.

Verify that the flow info is correct, especially the MAC addresses.

Capture a trace on all ports of the css and see if the traffic is being dropped or forwarded to the wrong destination.
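
To generate repeatable test traffic while such a trace is running, something like the Python sketch below may help. It is not a Cisco tool, only a minimal illustration using the standard library: it builds a plain DNS A-record query and sends it to the VIP either from source port 53 or from an OS-chosen high port, so the two cases mentioned at the top of the thread can be compared. The function names `build_query` and `probe` and the lookup name `example.com` are made up for this example; 192.168.0.1 is the VIP from the group config posted in this thread.

```python
import socket
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, other counts 0.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def probe(vip: str, name: str, src_port: int = 0, timeout: float = 2.0) -> bool:
    """Send one query to vip:53 from src_port (0 lets the OS pick a high port).

    Returns True if a reply carrying the same query ID arrives in time.
    Binding to port 53 normally needs root and fails with OSError if a
    local DNS service already holds the port.
    """
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.bind(("", src_port))
        sock.sendto(query, (vip, 53))
        try:
            reply, _ = sock.recvfrom(4096)
        except socket.timeout:
            return False
    return reply[:2] == query[:2]


# Example usage (assumed values, run while the trace is capturing):
#   probe("192.168.0.1", "example.com", src_port=53)   # low source port
#   probe("192.168.0.1", "example.com")                # random high port
```

Running both variants in a loop and lining the results up against the trace should show whether the drops follow the source port or a particular back-end resolver.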

Gilles.

I am setting up DNS cache load balancing. It is not yet in production, so how do I protect my setup against this flow problem from day 1?

My setup is very vanilla at the moment: the DNS caches use the CSS as their gateway for lookups, and I have a working setup with the following group definition:

group dnscache
  add service dnscache1
  add service dnscache2
  vip address 192.168.0.1
  active

I have a pair of CSS 11503s running 8.1.x. Is it as simple as adding 'flow-state 53 udp flow-disable nat-enable'?

And does setting the servers' gateway to the CSS interface have any performance impact?
