DNS TTL rewriting

Posted: Wed Aug 24, 2011 5:34 pm
by duanewayne
I've seen that there is a patch for dnsmasq that allows setting a minimum TTL value when caching DNS responses, and I was wondering if it is available in RouterOS.

The history of how we got here is detailed below in a posting I did on the cisco website.

Regards,
--Duane

We've managed to get ourselves into a bind (no pun intended) with DNS TTL values. A little history is required.

We use a couple of service providers to run a pair of private intranets, with one of the intranets getting some services from the other via two pairs of IOS routers. The routers run dynamic destination NAT in both directions into distinct /24 virtual pools on older 12.3 code. This code unconditionally resets the DNS TTL to 0, probably because NAT translations can get reused, and if that happens the cached DNS answers will end up pointing to the wrong hosts.

We've been doing this for a long time, and the clients started controlling routing to these hosts by querying the specific address of the statically NATted DNS boxes (which we also control). Later they also queried DNS servers that actually live on the intranet itself and are authoritative for some hosts living there, but that still forward to the statically NATted DNS IP on these routers and pass along the results. We have two DNS boxes and each forwards to a single NAT gateway, so the clients can still control routing for these forwarded domains because the results from each pair of NAT routers come from a unique /24.

Anyway, a client had a DNS meltdown, which they claim was caused by our DNS TTL 0 values, and they've asked us to start honoring the DNS TTL and pass it through. Some of our architecture guys did some research, and a newer version of code (12.4T) will allow us to pass the DNS TTL unchanged; the setting is "no ip nat service dns-reset-ttl". That, coupled with turning up the NAT time-outs to a sufficiently large value, should take care of the issue, because we don't have enough hosts that we'll ever reclaim NAT translations. The catch is that we NAT into two different virtual pools in two different data centers, which means that we could end up blackholing traffic for up to the DNS TTL value of 1200 seconds if we had a failure.

I championed statically NATting all the hosts (~150) and having our DNS be authoritative for the static NAT values. This allows us to NAT in both data centers into the same virtual pool and not have to worry about synchronization (stateful NAT) across our internal networks or the provider's WAN. We'd let the provider deal with the routing dynamically and it should all just work. This was deemed too big a change and we're at a stalemate. Sure, the clients would lose the ability to control routing via DNS, but they have better ways to do that anyway.

We've moved from a strategic fix for everyone to a tactical fix for the one client. My question is, do you know if you can "DNS Doctor" the DNS TTL values in an ASA/PIX/IOS router? I'm pretty sure that I can't do it in an IOS box and I didn't see the functionality in the ASA docs either, but I figured that I'd ask the experts.

I received no replies on the support forum, so I started looking at other alternatives. There are several DNS servers out there that will perform this function (unbound, pdnsd, and, if patched, dnsmasq). I'd really rather deploy an IOS/PIX/ASA device at the client site if it could do this rewriting for me, but I can't figure out how to make that happen. I've tested unbound and it works as advertised, but that means I'll need to convince the client to do this or else build him a box. If it comes to that, I might see if any of the embedded Linux firewall/routers (RouterOS, Vyatta) do it via a patched version of dnsmasq.

I’m hoping that you can help because I’d MUCH rather support a known box, which to me would be a Cisco…

Re: DNS TTL rewriting

Posted: Thu Aug 25, 2011 9:46 am
by changeip
Seriously, just fix the problem instead of trying to bandaid it. You should NEVER mess with TTLs on DNS queries. Fix the Cisco so it passes them through as they should be.

Re: DNS TTL rewriting

Posted: Thu Aug 25, 2011 4:19 pm
by duanewayne
You missed a couple of points in my posting. The first was that I wanted to statically NAT all the hosts into a single virtual pool and have our DNS box resolve to the static NATs. Once we do that, we can stop rewriting the DNS TTL values on the router and it will all just work. I think this is your "fix the problem instead of bandaid it" part. I want to do this, but it was deemed too big a change. We absolutely need the NAT in between these two networks, so the only options are static NAT into a single pool in each data center, or dynamic stateful NAT between the two locations with a single pool. I like the first one better because confused state between the two routers would be hard to debug...

The second point was that we control all the DNS boxes the client can get to so there is NO risk in me messing with the DNS TTL values because I know what they should all be.

The third point was that we NAT into two distinct pools, so without fixing that, passing through the real TTL values introduces the possibility of the client blackholing traffic to a virtual NAT pool that is unreachable.
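
To put a rough number on that risk, here is a minimal Python sketch of how long a client could keep sending traffic to a dead pool. It is only an illustration under simple assumptions: the client caches one answer, honors the TTL exactly, and the pool fails some time after the answer was cached. The function name `stale_window` and its parameters are made up for this example.

```python
def stale_window(cached_at, ttl, failed_at):
    """Seconds a client keeps using a cached address after its NAT pool fails.

    Assumes (hypothetically) that failed_at >= cached_at and that the client
    re-queries as soon as the cached record expires.
    """
    expires_at = cached_at + ttl
    # If the record already expired before the failure, nothing is blackholed.
    return max(0, expires_at - failed_at)


# Pool fails 1 second after the client cached a 1200-second answer.
print(stale_window(cached_at=0, ttl=1200, failed_at=1))
# Same failure with the TTL rewritten to 0: the client re-queries right away.
print(stale_window(cached_at=0, ttl=0, failed_at=1))
```

The worst case is a failure right after the answer is cached, which gives a window of nearly the full TTL, in line with the 1200-second figure from the original posting.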

So, back to the question. Does anyone know if the dnsmasq patch that lets you override minimum TTL values has been integrated into RouterOS? Link: http://lists.thekelleys.org.uk/pipermai ... 00253.html
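
For anyone unsure what a minimum TTL override means in practice, the idea can be sketched in a few lines of Python. This is not the dnsmasq patch itself, just an illustration of the clamping behavior; the function name `apply_min_ttl`, the record layout, and the hostnames and addresses are all invented for the example.

```python
def apply_min_ttl(records, min_ttl):
    """Return copies of (name, ttl, address) records with each TTL raised
    to at least min_ttl. TTLs already above the floor are left alone."""
    return [(name, max(ttl, min_ttl), address) for name, ttl, address in records]


# Hypothetical answers as they might arrive from the NAT routers (TTL reset to 0)
# next to a record that already carries a longer TTL.
answers = [
    ("app.example.internal", 0, "10.1.1.10"),
    ("db.example.internal", 3600, "10.1.1.20"),
]

for name, ttl, address in apply_min_ttl(answers, min_ttl=300):
    print(name, ttl, address)
```

A caching resolver at the client site that applied a floor like this would hand out the TTL 0 answers with a usable TTL while leaving longer TTLs untouched.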

I talked with Cisco and it is on the roadmap for the ASA line as an enhancement request, so other people are in situations like this too.