Static CNAMEs cause DNS cache memory leak

I recently deployed an RB5009UG+S+IN as both a primary gateway and local DNS server. For about three months after deployment it was leaking memory, eventually reaching ~95% memory usage, at which point the router would reboot. This was happening quite quickly too - the last such cycle took a little less than 6 days from startup (about 15% usage) to reboot.

I have been monitoring the router’s memory usage via SNMP since the initial deployment, so I was able to watch memory usage drifting upwards in real time. The upwards trend would occasionally pause for an hour or two, but outside of such periods it was a consistent upwards drift. Interestingly the rate at which memory was used actually increased several times, coinciding with changes being deployed to the running config - I will return to this later.

Eventually a colleague of mine stumbled across supout.rif, so I generated one right as the memory usage was at its peak and they analysed it. From this we were able to learn that the resolver process (which we assumed to be the local DNS server) had allocated itself 843MB of memory! In other words, it was using over 80% of the RB5009’s 1GB of memory on its own!

Edit: I forgot to mention here that I had increased the DNS cache size to fit the configured adlist. However it was only increased to 20MB or so, so the conclusions don’t change.
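To show why the larger cache doesn't change the conclusion, here is the arithmetic as a short Python sketch. The 20MB figure is the approximate cache size mentioned above, and treating 1GB as 1024MB is an assumption.

```python
RESOLVER_ALLOCATED_MB = 843  # from the supout.rif analysis
CACHE_SIZE_MB = 20           # approximate configured DNS cache size
TOTAL_RAM_MB = 1024          # RB5009 has 1GB of memory (assumed 1024MB)

overshoot = RESOLVER_ALLOCATED_MB / CACHE_SIZE_MB
share_of_ram = RESOLVER_ALLOCATED_MB / TOTAL_RAM_MB

print(f"resolver allocation is ~{overshoot:.0f}x the configured cache size")
print(f"resolver allocation is ~{share_of_ram:.0%} of total memory")
```

Even a completely full cache would account for only a small fraction of what the resolver process had allocated.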

Earlier I had temporarily disabled the DNS adlist feature, thinking that might be causing the issue. However, this hadn’t made any difference, so I did some more research. During my research I turned up the following threads:

I think all these threads are related, but the first and second are the most similar to my issue - particularly the second, which actually describes the router rebooting due to an out of memory condition, which I haven’t seen mentioned in any other threads.

I saw several users report that converting static CNAMEs to A records resolved issues with the DNS cache being filled, so I converted all the CNAME records on my router to A records. Upon making this change the router’s memory usage immediately stopped increasing. However, memory usage didn’t drop until I rebooted the router a few days later; since then it has been stable at about 15% usage.
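For a larger set of records the conversion can be worked out offline. Below is a minimal Python sketch of the idea, not a RouterOS tool: it takes a list of static records and replaces each CNAME with an A record carrying the address its target finally resolves to. The record layout and the example.lan names are hypothetical. CNAMEs that point outside the local records are left alone, because their address would have to be looked up elsewhere.

```python
def flatten_cnames(records):
    """Replace CNAME records with A records pointing at the final address.

    records: list of dicts such as
        {"name": ..., "type": "A", "address": ...} or
        {"name": ..., "type": "CNAME", "cname": ...}
    CNAMEs that loop or do not end at a local A record are returned unchanged.
    """
    a_records = {r["name"]: r["address"] for r in records if r["type"] == "A"}
    cnames = {r["name"]: r["cname"] for r in records if r["type"] == "CNAME"}

    def resolve(name):
        seen = set()
        while name in cnames:  # follow the chain of aliases
            if name in seen:
                return None  # CNAME loop
            seen.add(name)
            name = cnames[name]
        return a_records.get(name)  # None if the target is not local

    flattened = []
    for r in records:
        address = resolve(r["name"]) if r["type"] == "CNAME" else None
        if address is None:
            flattened.append(r)  # A record, or a CNAME left for manual handling
        else:
            flattened.append({"name": r["name"], "type": "A", "address": address})
    return flattened


# Hypothetical example records.
records = [
    {"name": "nas.example.lan", "type": "A", "address": "192.168.88.10"},
    {"name": "files.example.lan", "type": "CNAME", "cname": "nas.example.lan"},
    {"name": "backup.example.lan", "type": "CNAME", "cname": "files.example.lan"},
    {"name": "www.example.lan", "type": "CNAME", "cname": "host.example.com"},
]
for record in flatten_cnames(records):
    print(record)
```

The trade-off is that flattened A records no longer follow their target: if the target's address changes, every flattened record has to be updated by hand.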

Returning to the increased memory consumption rates I mentioned earlier: I compared the deployed changes with the inflection points, and found that each coinciding change had indeed created one or more new CNAMEs.

Edit2: Note about RouterOS version(s): when I first noticed the leak the router was running v7.14.3, and it was running v7.17.2 when I managed to resolve the issue. I have since upgraded to v7.18.2 - I have not tried CNAMEs since upgrading, but the changelog doesn’t indicate any relevant fixes to the DNS resolver, so I assume the problem persists.


To me this seems like fairly strong evidence that RouterOS is mishandling static CNAME records, but I have one more spanner to throw in the works: I have an RB4011 running at another site which, despite having several static CNAMEs, does not exhibit a memory leak. However, I am aware that the CNAMEs configured on the RB4011 are for uncommonly queried names (in fact those records may not be queried at all), whereas the CNAMEs set on the RB5009 were queried extremely often.

It is time that MikroTik abandon their resolver toy and go for something like “unbound”.

router != dns server. simple.

RoS shouldn’t try to play dns resolver, to begin with.

I don’t agree.
Any router should be capable of providing DNS resolver functions!
However, that is a difficult programming project, and it is being under-estimated by MikroTik.
So many bugs like this one have already been fixed in the past, and each time another problem usually followed (whack-a-mole).
And then we do not even have modern functionality like DNSSEC validation or DoH/DoT service!

The same thing happened on an RB3011 router that has no static records.
Captură de ecran 2025-03-31 004913.png

Impossible to tell if that is the same problem.
The DNS cache has a configurable size, default is ridiculously small.
When you haven’t changed it, it can easily get full, but that is not the same as the DNS process using up all memory (even above the configured cache size)!

But until now, the cache has never filled up without an adlist or other special settings.

Sorry to be pedantic, but this screenshot doesn’t demonstrate the same issue. All it shows is that your DNS cache is full. While that may be an issue for your setup, on its own it’s not indicative of a bug.

The issue in my post is not simply that the DNS cache is full, it’s that it has a memory leak. This leak causes it to improperly consume more memory than has been allocated to it, eventually exhausting all memory on the router and forcing it to reboot.

And in my case the amount of memory used increases:
Captură de ecran 2025-04-11 101100.png
Captură de ecran 2025-04-11 101354.png
After reboot:
Captură de ecran 2025-04-11 101716.png

The graphing pattern suggests that there is a memory leak for each query to the resolver.
It may also depend on whether the target of the CNAME is also local to the router, or is some remote record.
We have seen many interesting problems with CNAME in the DNS resolver before.

Very strange records:
Captură de ecran 2025-04-11 112413.png

I’m experiencing a similar issue on an RB750Gr running RouterOS 7.18.2, though I haven’t been able to confirm that it’s caused specifically by CNAME entries.

I’ve tried the following, but none have resolved the issue:

  • Reducing the cache TTL
  • Decreasing and then increasing the cache size

The DNS cache is immediately reported as full, and memory usage continues to increase with no recovery. Eventually the router reboots when memory is exhausted, which points to a memory leak in the DNS service.

:bulb: Coming from a software development background, I suspect that the condition used to detect a “cache full” state differs between the warning mechanism and the logic responsible for freeing LIFO entries.
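To make that suspicion concrete, here is a purely hypothetical toy model in Python. Nothing in it reflects actual RouterOS code. It only illustrates how a cache can report "full" while never freeing anything when the warning check and the eviction check count different things. The class, its fields, and the idea of "non-evictable" entries are all assumptions made for the illustration.

```python
class ToyCache:
    """Toy cache whose 'full' warning and eviction logic disagree."""

    def __init__(self, limit):
        self.limit = limit
        self.entries = []  # each entry is (name, size, evictable)

    def used(self):
        return sum(size for _, size, _ in self.entries)

    def is_full(self):
        # Warning path: counts every entry.
        return self.used() >= self.limit

    def evict(self):
        # Eviction path: counts and frees only evictable entries, so it
        # never sees the space held by non-evictable ones.
        evictable_used = sum(s for _, s, e in self.entries if e)
        while evictable_used > self.limit:
            for i, (_, size, evictable) in enumerate(self.entries):
                if evictable:
                    del self.entries[i]  # drop the oldest evictable entry
                    evictable_used -= size
                    break

    def add(self, name, size, evictable=True):
        self.entries.append((name, size, evictable))
        self.evict()


cache = ToyCache(limit=100)
for i in range(1000):
    # Hypothetical: every query leaves behind a small non-evictable entry.
    cache.add(f"q{i}", size=1, evictable=False)

print(cache.is_full(), cache.used())  # True 1000
```

In this toy the cache is reported as full from the 100th entry onwards, yet usage keeps growing to ten times the limit, which is the same shape as the symptom described above.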

I just stumbled across this post. I’ve had a support case open since December on the DNS cache issue with the only suggestion being to just “try the latest version”. You’d think someone from support might be looking at the forums and could have suggested the CNAME workaround on the ticket, but no.

Are you sure Adlist isn’t enabled (while the cache size hasn’t been increased)?
https://help.mikrotik.com/docs/spaces/ROS/pages/37748767/DNS#DNS-adlistAdlist

The Adlist is enabled, but I have increased the DNS cache size to compensate:

[ADMIN@MIKROTIK] > /ip/dns/adlist/print 
Flags: X - disabled 
 0   url="https://raw.githubusercontent.com/StevenBlack/hosts/master/hosts" ssl-verify=no match-count=261542 name-count=184950 
[ADMIN@MIKROTIK] > :put [/ip/dns/get cache-size]
20480
[ADMIN@MIKROTIK] > :put [/ip/dns/get cache-used]
14071

Also, an update: since removing all CNAMEs my router has been running with stable memory usage for weeks. Prior to the fix it was unable to last more than two weeks without restarting due to a lack of memory, and by the time I worked out the problem it wasn’t even lasting a whole week!

I finally got a response to the support issue I opened back in December…

The issue has been reproduced, we look forward to fixing it on upcoming RouterOS versions.

FWIW