I have been having an issue with my MikroTik being really slow at resolving DNS requests that aren’t cached. I thought having the MikroTik act as the DNS server only affected cached requests and not uncached ones, but when I set the router’s LAN address as the DNS server, it somehow manages to slow down uncached requests. The images are the results with the MikroTik as DNS and without it.
The effects can be felt in regular use as well: a new page, or a page I haven’t visited in a while, tends not to load at first when the MikroTik is the DNS server.
What could be causing this issue?
The cpu usage increases from 2% to about 4% max when mikrotik is the DNS, so the board has plenty of resources.
I made sure the following firewall rule is used for the DNS to work:
;;; defconf: accept established,related,untracked
chain=input action=accept connection-state=established,related,untracked
in-interface=ether1 log=no log-prefix=""
.
Your precision freeware isn’t showing any parameter step size on its x-axis??
… so for starters: hard to say if this is an issue or a neurotic prolapse … (sorry!) :)
.
Not sure what kind of size (router and network) we are talking about?
… in my opinion, Steve Gibson aimed this DNS test tool at small-to-midsize service providers!! (Why would a single mom with two kids stress out all these DNS servers?)
… if you are one of them (a provider, not a single mom!), it should be possible to run a dedicated DNS proxy (Pi-hole, for example), which brings protection from botnets and adware to everyday internet life, plus dedicated DNS caching skills!?
.
DNS is an application-layer thing … it needs CPU … and, given all the servers you query (with what number of requests?!), also RAM, which may be short on a router!
… so consider your use case ! … provider-single-mom ; ) !
On the X axis, you will notice a dark dashed line going up and down. This line is equal to 180 milliseconds.
When not using the MikroTik, DNS queries are resolved in 180 milliseconds or less. But when the MikroTik is the DNS server, the queries are resolved in 1,260 milliseconds, which means the MikroTik takes 7 times as long to resolve the exact same queries.
As for the size and equipment, it is the rb750r3 and it is being used in a home of 5 people with multiple devices. The cpu usage is low and the ram usage is low even when performing the test.
I stress out because when the DNS tester shows this latency, there is also excessive latency on the rest of the network, something that doesn’t happen if the MikroTik is not being used for DNS. Ultimately the network comes to a halt, Netflix stops, games are unplayable, and it’s only an issue when the MikroTik is the primary DNS.
A difference in response time that big looks to me like the first configured resolver is timing out and a backup entry is doing the job.
Try a tool like dig or nslookup to check whether your MikroTik is resolving.
Change to /ip dns and run export terse on your config to give us some insight into what’s in it.
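If dig or nslookup aren’t handy, a rough Python sketch like the one below can time a single lookup against a given server. It is only a sketch under stated assumptions: build_query and time_query are names made up for this example, it sends one plain UDP A-record query, and it ignores retries, TCP fallback and parsing of the answer.

```python
import random
import socket
import struct
import time


def build_query(name, qtype=1, txid=None):
    """Build a minimal DNS query packet (qtype 1 = A record, class IN)."""
    if txid is None:
        txid = random.randint(0, 0xFFFF)
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # Question name: each label prefixed with its length, ended by a zero byte.
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def time_query(server, name, timeout=5.0):
    """Return the round-trip time in milliseconds, or None if no valid reply."""
    packet = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.perf_counter()
        sock.sendto(packet, (server, 53))
        try:
            reply, _ = sock.recvfrom(4096)
        except socket.timeout:
            return None
        elapsed = time.perf_counter() - start
    # Only accept a reply that carries the same transaction ID we sent.
    if reply[:2] != packet[:2]:
        return None
    return elapsed * 1000.0
```

Calling time_query("192.168.88.1", "www.mikrotik.com") and then time_query("1.0.0.1", "www.mikrotik.com") would compare the router from this thread with one of its upstream servers. The first lookup of a name is the interesting one, since a repeat is likely to be answered from cache.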
/ip dns set allow-remote-requests=yes max-concurrent-queries=120 query-total-timeout=8s servers=1.0.0.1,8.8.4.4,4.2.2.2
/ip dns static add address=192.168.88.1 comment=defconf name=router.lan
I think I may have pinpointed more of the issue.
I have adjusted the “Query Server Timeout” and the “Query Total Timeout” to see if there was a difference, and there was for the server timeout but not for the total timeout.
My issues were happening with Query Server Timeout set to 2s. When I changed it to 1s the network just completely stopped working. Yet adjusting it to 3s has brought DNS latency back within a normal range.
I will keep monitoring to see if the issues come back, but setting Query Server Timeout to 3s seems to have fixed the problem for now.
Seems weird. I would still like to know why that may be.
You should not shorten the timeouts too much. Upstream servers have to perform recursive resolution if the wanted result is not already cached. Here’s an example: if you want to get the A record (IP address) for www.mikrotik.com, the recursive resolver has to perform the following steps:
1. It checks its internal table of root DNS servers.
2. It sends a query for the NS record of the .com TLD to one of the root servers. If it chooses poorly, the time to get the answer could be up to 0.5 seconds. The result will include the IP addresses of the authoritative name servers.
3. It sends a query for the NS record of mikrotik.com to one of the .com authoritative servers. Again the query can take up to 0.5 seconds. The result will include the IP addresses of the mikrotik.com authoritative servers.
4. It sends a query for the A record of www.mikrotik.com, and the authoritative server replies with the IP address. Again it can take up to 0.5 seconds to get that reply.
5. The result gets returned to your router.
From start to end it can easily take between 1 and 2 seconds, and we haven’t even mentioned possible problems that are hidden from your router:
Any of the servers contacted in steps 2, 3 or 4 can time out (the resolver has its own timers), and the resolver then has to try another authoritative server.
The result in step 4 might actually be a CNAME to a completely different domain, and the recursive resolver has to do all the steps again, this time for the new FQDN.
…
You get the picture …
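To put rough numbers on the steps and the hidden problems above, here is a small back-of-the-envelope sketch. The function name and its parameters are made up for illustration. The 0.5 second figures are the “up to” values from the steps, and the 1 second retry penalty is purely an assumed number, not a measurement.

```python
def worst_case_seconds(hop_latencies, cname_restarts=0, upstream_retry_penalty=0.0):
    """Estimate how long an uncached recursive lookup could take.

    hop_latencies: seconds for each upstream query (root, TLD, authoritative).
    cname_restarts: how many times a CNAME forces the whole chain to be redone.
    upstream_retry_penalty: extra seconds per pass lost to upstream timeouts
        (an assumed value, used only for illustration).
    """
    one_pass = sum(hop_latencies) + upstream_retry_penalty
    return one_pass * (1 + cname_restarts)


# Three queries at up to 0.5 s each, nothing going wrong.
print(worst_case_seconds([0.5, 0.5, 0.5]))                    # 1.5
# Same chain, but the answer is a CNAME to another domain.
print(worst_case_seconds([0.5, 0.5, 0.5], cname_restarts=1))  # 3.0
# One assumed 1 s upstream timeout on top of the plain chain.
print(worst_case_seconds([0.5, 0.5, 0.5], upstream_retry_penalty=1.0))  # 2.5
```

Even this simplified estimate lands above a 2 second per-server timeout as soon as one thing goes wrong upstream.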
And if your router times out before the upstream server has unrolled everything, it sends the same query to the next upstream server, which might have the answer cached or might have to do everything again … during that time your router would probably have received the answer from the first server if it had still been waiting. So by shortening the timeout you actually achieved the opposite of what you wanted.
The fast responses from some well-known public recursive resolvers are mostly due to their wide use, which makes it more likely that a result is already cached. The speed is also partly due to fast servers (quick retrieval of cached results) and to low-latency access to many of the authoritative servers for the more relevant domains.
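The effect can be shown with a toy model. This is not how RouterOS is documented to behave internally, just a sketch of the argument above under loud assumptions: the router tries its upstream servers one after another (wrapping around), waits at most server_timeout for each, throws away any reply that arrives late, and gives up after total_timeout. The function name and the per-server answer times are invented for the example.

```python
def time_to_answer(resolve_times, server_timeout, total_timeout):
    """Seconds until the router gets an answer, or None if it gives up.

    resolve_times: how long each upstream server needs to answer this query.
    Assumption: a reply arriving after server_timeout is simply discarded.
    """
    if not resolve_times or server_timeout <= 0:
        raise ValueError("need at least one server and a positive timeout")
    elapsed = 0.0
    attempt = 0
    while elapsed < total_timeout:
        needed = resolve_times[attempt % len(resolve_times)]
        if needed <= server_timeout and elapsed + needed <= total_timeout:
            return elapsed + needed
        # Gave up on this server; the time spent waiting is lost.
        elapsed += server_timeout
        attempt += 1
    return None


# Every upstream needs 2.5 s for an uncached name, total timeout 8 s.
print(time_to_answer([2.5, 2.5, 2.5], 3.0, 8.0))   # 2.5
print(time_to_answer([2.5, 2.5, 2.5], 2.0, 8.0))   # None

# Third server happens to have the answer cached (0.25 s).
print(time_to_answer([1.5, 2.5, 0.25], 3.0, 8.0))  # 1.5
print(time_to_answer([1.5, 2.5, 0.25], 1.0, 8.0))  # 2.25
```

In this model a 2 s server timeout never gets an answer for a name that takes 2.5 s to resolve upstream, while 3 s gets it on the first try. Even when a later server has the name cached, the shorter timeout ends up slower than simply waiting for the first server.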