DNS TTL rewriting
Posted: Wed Aug 24, 2011 5:34 pm
I've seen that there is a patch for dnsmasq that allows setting a minimum TTL value when caching DNS responses, and I was wondering whether it is available in RouterOS.
The history of how we got here is detailed below in a posting I made on the Cisco website.
Regards,
--Duane
We've managed to get ourselves into a bind (no pun intended) with DNS TTL values. A little history is required.
We use a couple of service providers to run a pair of private intranets, with one intranet getting some services from the other via two pairs of IOS routers. These routers run dynamic destination NAT in both directions into distinct /24 virtual pools on older 12.3 code. This code unconditionally resets the DNS TTL to 0, probably because NAT translations can get reused, and if that happens the cached DNS answers will end up pointing to the wrong hosts.
We've been doing this for a long time, and the clients started controlling routing to these hosts by querying the specific address of the statically NATted DNS boxes (which we also control). Later they began querying DNS servers that live on the intranet itself and are authoritative for some hosts there, but that still forward to the statically NATted DNS IP on these routers and pass along the results. We have two DNS boxes and each forwards to a single NAT gateway, so the clients can still control routing for these forwarded domains because the results from each pair of NAT routers fall in a unique /24.
Anyway, one client had a DNS meltdown, which they attribute to our DNS TTL values of 0, and they've asked us to start honoring the original DNS TTL and pass it through. Some of our architecture guys did some research, and a newer version of code (12.4T) will allow us to pass the DNS TTL unchanged; the setting is "no ip nat service dns-reset-ttl". That, coupled with turning up the NAT timeouts to a sufficiently large value, should take care of the issue, because we have so few hosts that we'll never need to reclaim NAT translations. The catch is that we NAT into two different virtual pools in two different data centers, which means that after a failure we could end up blackholing traffic for up to the DNS TTL value of 1200 seconds.
I championed statically NATting all the hosts (~150) and having our DNS be authoritative for the static NAT values. This would allow us to NAT in both data centers into the same virtual pool and not have to worry about synchronization (stateful NAT) across our internal networks or the provider's WAN. We'd let the provider deal with the routing dynamically and it should all just work. This was deemed too big a change and we're at a stalemate. Sure, the clients would lose the ability to control routing via DNS, but they have better ways to do that anyway.
We've moved from a strategic fix for everyone to a tactical fix for the one client. My question is, do you know if you can "DNS Doctor" the DNS TTL values in an ASA/PIX/IOS router? I'm pretty sure that I can't do it in an IOS box and I didn't see the functionality in the ASA docs either, but I figured that I'd ask the experts.
I received no replies on the support forum, so I started looking at other alternatives. There are several DNS servers out there that will perform this function (unbound, pdnsd, and dnsmasq if patched). I'd really rather deploy an IOS/PIX/ASA device at the client site if it could do this rewriting for me, but I can't figure out how to make that happen. I've tested unbound and it works as advertised, but that means I'll need to convince the client to run it or else build him a box. If it comes to that, I might see if any of the embedded Linux firewalls/routers (RouterOS, Vyatta) do it via a patched version of dnsmasq.
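To make it clear what I mean by "minimum TTL" rewriting, here is a rough Python sketch of the behavior I'm after in a caching forwarder: whatever TTL arrives from upstream (0 in our case) is raised to a floor before the answer is cached and handed to the client. This is only an illustration and is not how unbound or dnsmasq actually implement it; the names clamp_ttl, TtlRewritingCache and MIN_TTL, the 300 second floor, and the example host and address are all made up. In unbound the equivalent knob is, as far as I know, the cache-min-ttl option.

```python
import time

MIN_TTL = 300  # hypothetical floor in seconds; pick to suit the client


def clamp_ttl(ttl, min_ttl=MIN_TTL, max_ttl=None):
    """Raise ttl to at least min_ttl and optionally cap it at max_ttl."""
    if ttl < 0:
        raise ValueError("TTL must be non-negative")
    clamped = max(ttl, min_ttl)
    if max_ttl is not None:
        clamped = min(clamped, max_ttl)
    return clamped


class TtlRewritingCache:
    """Toy DNS answer cache that enforces a minimum TTL on stored records."""

    def __init__(self, min_ttl=MIN_TTL, clock=time.monotonic):
        self.min_ttl = min_ttl
        self.clock = clock      # injectable so the cache can be tested
        self._entries = {}      # name -> (address, expiry timestamp)

    def store(self, name, address, upstream_ttl):
        # The upstream TTL (e.g. 0 from the NAT router) is rewritten here.
        ttl = clamp_ttl(upstream_ttl, self.min_ttl)
        self._entries[name] = (address, self.clock() + ttl)

    def lookup(self, name):
        """Return (address, remaining_ttl), or None if absent or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        remaining = expires_at - self.clock()
        if remaining <= 0:
            del self._entries[name]
            return None
        return address, int(remaining)
```

With a floor like this, a record that arrives with TTL 0 is served from the cache for MIN_TTL seconds, while a record that arrives with 1200 keeps its 1200. The tradeoff is the same one described above: the floor is also the longest a stale NAT address can linger at the client after a failover.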
I’m hoping that you can help because I’d MUCH rather support a known box, which to me would be a Cisco…