Mikrotik DNS server issues with Amazon S3 - low TTL 60sec

Hello,

I’ve run into an issue where a customer using a Mikrotik RB751 running the latest RouterOS 5 (5.8 at the time?) has a lot of issues using Amazon S3, apparently due to the very low TTL that Amazon uses (60 seconds). I’ve heard of others with this issue as well. Is the only workaround to not use the DNS server within RouterOS?

Hey, this explains the exact problem I was experiencing!

Is there an ETA for a fix?


Thanks

I’ve noticed this once or twice and didn’t think much of it since I typically run my own DNS cache on a server, but yeah, that issue definitely sounds familiar.

YES! This is the exact problem a few of my customers have reported, and I have confirmed it. It seems to have started about 3-4 months ago. Running 5.5, caching Google’s DNS servers.

Why not just use mangle to raise the TTL?

TTL isn’t measured in seconds, it’s a hop count.

In DNS terms, TTL is indeed the number of seconds to cache a DNS record.

Amazon keeps the TTL low for various reasons. Mucking with it would likely cause you to be connecting to the wrong IP address.
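To make the DNS meaning of TTL concrete, here is a minimal Python sketch of a cache that honours the record TTL. The class name `TtlCache`, the hostname and the address are made up for illustration, and this is not a claim about how RouterOS implements its cache.

```python
import time


class TtlCache:
    """Toy DNS-style cache: each entry lives for its TTL in seconds."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable so expiry can be simulated
        self._entries = {}           # name -> (address, expiry timestamp)

    def put(self, name, address, ttl):
        """Store an address together with the moment it stops being valid."""
        self._entries[name] = (address, self._clock() + ttl)

    def get(self, name):
        """Return the cached address, or None if it is missing or expired."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # expired: the caller must ask upstream again
            return None
        return address
```

With a 60-second TTL, a lookup at 59 seconds is served from the cache and a lookup at 60 seconds is not. Forcing a longer lifetime would keep handing out the old address after the upstream record may already have changed.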

Ah, well I deal with network engineering mostly so my brain defaults to network terms.

Our entire LAN is using a MT RB750G unit as a DNS server, which randomly doesn’t resolve some domains (notably Amazon S3, due to its widespread use). I’ve had to resort to giving out Google DNS IPs via DHCP until this is fixed, which is annoying since it breaks all the internal LAN DNS I had set up. Been having this problem for over a year, glad to see I’m not the only one!

Just had another customer call and complain of this issue. Any ideas to resolve this issue, or information I can provide to assist in the resolution?

I have had random resolve issues since the 5.x versions too. I can’t remember exactly which one, but it was around 5.10, maybe earlier.

And it’s strange: browsers, curl/wget and any other software can’t resolve the name, but nslookup works without problems. So usually I have:

xeron@macbook:~$ wget — can't resolve
xeron@macbook:~$ wget — can't resolve
xeron@macbook:~$ nslookup — resolved
xeron@macbook:~$ wget — can't resolve
wait 1-2 minutes
xeron@macbook:~$ wget — resolved

And really often this problem happens with Amazon S3 hosts, but not only S3.

I tried to increase max-udp-packet-size, but still no luck.
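A small Python sketch for logging this kind of intermittent failure. It goes through `socket.getaddrinfo`, the system resolver path that wget and browsers generally use, whereas nslookup sends its own queries to the DNS server. The function name `probe` and its parameters are hypothetical, chosen only for this example.

```python
import socket
import time


def probe(hostname, attempts=5, interval=1.0,
          resolve=socket.getaddrinfo, sleep=time.sleep):
    """Resolve hostname repeatedly; return a list of (attempt, ok, detail)."""
    results = []
    for attempt in range(1, attempts + 1):
        try:
            infos = resolve(hostname, None)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the IP address as a string.
            addresses = sorted({info[4][0] for info in infos})
            results.append((attempt, True, addresses))
        except socket.gaierror as exc:
            results.append((attempt, False, str(exc)))
        if attempt < attempts:
            sleep(interval)
    return results


# Example use (the hostname is only a placeholder for the failing host):
# for attempt, ok, detail in probe("s3.amazonaws.com", attempts=10, interval=5):
#     print(attempt, "resolved" if ok else "FAILED", detail)
```

Running it while the problem is happening gives a record of which attempts failed, which can go into a support request.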

I have seen issues with this as well. We are currently running RouterOS 5.13 and have seen DNS TTL issues crop up. It’s not exactly limited to Amazon S3, but that seems to be the worst offender. I have also seen that if a large number of records expire in the cache, it takes 100% CPU for a few seconds to clear those entries out. If you have a lot of requests coming in and a lot of cache records expiring, it causes the entire router to slow down on all duties.

Exactly the SAME PROBLEM!
Not only Amazon, it happens randomly with everything!

Where is the MikroTik team?

I have the same problem with DNS.
And my DNS settings sometimes randomly revert to the previous ones by themselves.
ROS 5.14 - RB751U-2HnD.

Still an issue. Anyone from MikroTik about to weigh in on this? Very annoying.

If everyone in this thread would send an email to support@mikrotik.com, it would probably get seen as a high priority issue!

sent

It is not clear what issues you are having. If you want to use the router’s DNS cache and the cache time of an entry is very short, just adjust the max limit to some lower value, or check whether the DNS server that responds gives you the correct cache time to begin with, as all values below cache-max-ttl are set according to the value in the reply.
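As a rough illustration of that rule, the lifetime of a cache entry is the TTL from the reply, capped at cache-max-ttl. The function name `effective_ttl` and the one-week cap are assumptions made for this sketch, not values taken from any particular router.

```python
def effective_ttl(reply_ttl, cache_max_ttl):
    """Seconds an entry stays cached: the reply's TTL, capped at the maximum."""
    if reply_ttl < 0 or cache_max_ttl < 0:
        raise ValueError("TTL values must be non-negative")
    return min(reply_ttl, cache_max_ttl)


WEEK = 7 * 24 * 3600  # example cap; use whatever cache-max-ttl is set on the router

print(effective_ttl(60, WEEK))          # 60: a short TTL from the server is kept as-is
print(effective_ttl(30 * 86400, WEEK))  # 604800: a longer TTL is capped
```

So a 60-second TTL from Amazon is cached for 60 seconds regardless of the cap, and lowering the cap only affects records whose TTL is above it.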

As for the other issues, try increasing the reply size by raising max-udp-packet-size to something large like 4096, because DNSSEC returns huge replies. That will be the default in later RouterOS versions, but an existing setting will not be changed by updating to a newer version.
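For background on why the UDP size matters: classic DNS limits UDP messages to 512 bytes, and a client that wants larger replies advertises a bigger buffer in an EDNS0 OPT record. The Python sketch below builds such a query by hand. The function name `build_query` and the default transaction ID are made up for this example, and how RouterOS applies max-udp-packet-size internally is not stated in this thread.

```python
import struct


def build_query(name, qtype=1, txid=0x1234, udp_size=4096, dnssec_ok=True):
    """Build a DNS query with an EDNS0 OPT record advertising udp_size bytes."""
    # Header: ID, flags (RD set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=1
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 1)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!HH", qtype, 1)  # QCLASS 1 = IN
    # OPT pseudo-record: root name, TYPE 41, CLASS carries the UDP payload size,
    # TTL carries extended RCODE/version/flags (DO bit is 0x8000), RDLENGTH 0
    ttl_field = 0x00008000 if dnssec_ok else 0
    opt = b"\x00" + struct.pack("!HHIH", 41, udp_size, ttl_field, 0)
    return header + question + opt
```

Sending the result over UDP to port 53 of a resolver and comparing reply sizes with `udp_size=512` and `udp_size=4096` is one way to see whether large replies are getting through.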

Holy crap! This might be the problem I had almost a year and a half ago!

I was using OpenDNS, and randomly Google and Amazon DNS-related stuff wouldn’t resolve. I never had the time to really look into it, as it crippled our network and I needed to just get it working. It was something TTL-related for sure, but I couldn’t investigate further.

I’m keeping an eye on this one!

Hi, I’m having the same DNS problem. It seems to be random.
When I try to resolve several times I get no answer:
:resolve dns.domain.com
failure: dns name does not exist
and then, without my doing anything, it works fine (I had flushed the cache beforehand).
I can add some information, although for now I can’t say why:
It seems to work fine with a DNS server on Solaris 9 (BIND 8.3.3) but not with a DNS server on Debian (lenny with bind9 1.9.6). I hope this helps in some way!
Thanks!

To get this resolved, take a packet capture of port 53 on the external interface and highlight the query and the response in Wireshark. Then send a supout along with those results. If you can prove that the response came back in but didn’t get used, then maybe MikroTik will finally look at it.
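If the capture is large, pairing queries with responses by hand gets tedious. Here is a minimal Python sketch that does it by DNS transaction ID, assuming the raw DNS payloads (the UDP data, not whole Ethernet frames) have already been extracted from the capture in order. The names `parse_header` and `summarize` are hypothetical, and the sketch ignores that transaction IDs can be reused in a long capture.

```python
import struct

RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN"}


def parse_header(payload):
    """Return (txid, is_response, truncated, rcode) from a raw DNS message."""
    if len(payload) < 12:
        raise ValueError("DNS header is 12 bytes; payload too short")
    txid, flags = struct.unpack("!HH", payload[:4])
    is_response = bool(flags & 0x8000)   # QR bit
    truncated = bool(flags & 0x0200)     # TC bit
    rcode = flags & 0x000F
    return txid, is_response, truncated, rcode


def summarize(payloads):
    """Map each query's transaction ID to (rcode name, truncated) or None."""
    summary = {}
    for payload in payloads:
        txid, is_response, truncated, rcode = parse_header(payload)
        if not is_response:
            summary.setdefault(txid, None)       # query seen, no answer yet
        elif txid in summary:
            summary[txid] = (RCODES.get(rcode, str(rcode)), truncated)
    return summary
```

An ID that maps to `None` never got a reply on the wire. An ID that maps to `("NOERROR", False)` while the client still reported a failure is the kind of evidence the post above is asking for.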