ROS 5.21, DNS false negative?

We have a VRRP pair of RouterBOARD 1200, IP addresses 192.168.1.2 and 192.168.1.3, VRRP address 192.168.1.1.
Both are running ROS 5.21. These are our gateway devices to the Internet, in a typical “We’re an internal LAN, the MikroTiks let us out to the Internet” medium-sized office configuration. Nothing too special.

We just began having a problem where internal DNS resolution requests return false negatives. These requests go to the MikroTiks’ VRRP address (192.168.1.1), which is configured with a list of six public DNS servers (two from our upstream ISP, two from Google, and two from OpenDNS).

At the moment, the VRRP master is 192.168.1.3. The backup (and therefore very lightly loaded MikroTik) is 192.168.1.2.


Example:

$ host msgxsignature.domain.net 192.168.1.1
Using domain server:
Name: 192.168.1.1
Address: 192.168.1.1#53
Aliases:

Host msgxsignature.domain.net not found: 3(NXDOMAIN)

$ host msgxsignature.domain.net 192.168.1.2
Using domain server:
Name: 192.168.1.2
Address: 192.168.1.2#53
Aliases:

msgxsignature.domain.net is an alias for ec2-5x-1x-6x-22x.compute-1.amazonaws.com.

$ host msgxsignature.domain.net 192.168.1.3
Using domain server:
Name: 192.168.1.3
Address: 192.168.1.3#53
Aliases:

Host msgxsignature.domain.net not found: 3(NXDOMAIN)

So the VRRP master, whose cache is very full of entries according to /ip dns cache print, falsely returns NXDOMAIN (both on the VRRP address and on its own address), while the VRRP backup (largely idle) looks the name up properly and returns the right information.

I did an /ip dns cache flush on the VRRP master, and it began returning the correct DNS lookup results.
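
The three-way comparison above can be scripted so that the same name is sent to all three addresses in one go, for instance to catch the next occurrence before flushing. What follows is only a minimal sketch using the Python standard library: the helper names (build_query, parse_rcode, ask) are made up for illustration, it sends a plain A query over UDP, and it reads only the response code and answer count from the DNS header.

```python
import socket
import struct

# Response codes from RFC 1035; anything else is printed as a number.
RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query (one question, class IN, recursion desired)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def parse_rcode(response):
    """Return (rcode, answer_count) from the 12-byte DNS header."""
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return flags & 0x000F, ancount


def ask(server, name, timeout=2.0):
    """Send one A query to server:53 over UDP and summarise the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        response, _ = sock.recvfrom(4096)
    rcode, ancount = parse_rcode(response)
    return RCODES.get(rcode, str(rcode)), ancount


if __name__ == "__main__":
    name = "msgxsignature.domain.net"
    # VRRP address, backup, master (addresses as in the post above).
    for server in ("192.168.1.1", "192.168.1.2", "192.168.1.3"):
        try:
            print(server, *ask(server, name))
        except OSError as exc:  # includes timeouts
            print(server, "error:", exc)
```

Run from cron or a loop, a mismatch between the master and the backup (NXDOMAIN on one, NOERROR with answers on the other) would give a timestamp for when the bad answers start.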


Can anyone help me guess whether this is more likely a bug (MikroTik DNS cache gets full, MikroTik DNS server starts behaving badly) or something to do with caching negative responses? I don’t think the latter is what happened, but is there a MikroTik DNS server setting for how long to cache a negative response? For that matter, how can I do an /ip dns cache print where… command to show cached negative responses?

Thanks,

The same problem has just happened again. Once again, it was solved (temporarily) by flushing the cache.
I’ve now also reduced cache-max-ttl from 1w (the default, I guess) to 2d, in the hope that it helps. I’m not confident that it will.
This really does look like a bug: when the DNS cache gets “too full” (whatever that might mean; in this case, there were around 3200 entries, using 1976KiB out of a cache-size of 2048KiB), the MikroTik DNS server starts failing some(?) lookups.
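
To put the “too full” figures in perspective, here is a quick back-of-the-envelope calculation in Python. It uses only the numbers quoted above; the per-entry average is a rough estimate, since how RouterOS actually accounts for cache memory is not known here.

```python
# Figures quoted in the post above.
cache_used_kib = 1976
cache_size_kib = 2048
entries = 3200  # "around 3200", so everything derived from it is approximate

fill_ratio = cache_used_kib / cache_size_kib
free_kib = cache_size_kib - cache_used_kib
avg_bytes_per_entry = cache_used_kib * 1024 / entries

print(f"fill: {fill_ratio:.1%}")
print(f"free: {free_kib} KiB")
print(f"average per entry: {avg_bytes_per_entry:.0f} bytes")
```

That works out to a cache about 96.5% full with 72 KiB free, at an average of roughly 632 bytes per entry, so there would be room for only a hundred or so more entries of average size.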

MikroTik, are you listening?
Any suggestions on debugging or proving what’s going on?

Thanks,
-Jay