Strange DNS caching problem

Hi,

I have a strange problem on my network, and I think I have narrowed it down to the DNS cache in my Mikrotik Hex S router.

3 of my devices keep losing their connection. Unfortunately, all 3 are very closed IoT devices :frowning:
A Yale/August Connect Bluetooth gateway, a Netatmo weather station and a Grohe Blue Home water tap.

The strange thing is that as soon as I flush the cache in the router, all 3 reconnect to the internet almost instantly.

I have looked through the cached records, but I can’t find anything that looks wrong.

I can’t really find any common denominator for all three. The August gateway and Grohe seem to be using AWS:

rbs-sticky.august.com: rbs-sticky.august.com is an alias for rbs-sticky-prod-aws.august.com.
rbs-sticky-prod-aws.august.com is an alias for rbs-prod-aws.august.com.
rbs-prod-aws.august.com is an alias for ireland-prod-rbs-legacy.august.com.
ireland-prod-rbs-legacy.august.com is an alias for a3d8823bcc4694a5a8064c19ae4437d8-ea1fa960054d012f.elb.eu-west-1.amazonaws.com.
a3d8823bcc4694a5a8064c19ae4437d8-ea1fa960054d012f.elb.eu-west-1.amazonaws.com has address 108.128.67.33
a3d8823bcc4694a5a8064c19ae4437d8-ea1fa960054d012f.elb.eu-west-1.amazonaws.com has address 52.210.203.100
a3d8823bcc4694a5a8064c19ae4437d8-ea1fa960054d012f.elb.eu-west-1.amazonaws.com has address 99.81.169.172

idp-apigw.cloud.grohe.com: idp-apigw.cloud.grohe.com is an alias for grohe-idp-prod-apigwlb-01-665378892.eu-central-1.elb.amazonaws.com.
grohe-idp-prod-apigwlb-01-665378892.eu-central-1.elb.amazonaws.com has address 3.124.61.13
grohe-idp-prod-apigwlb-01-665378892.eu-central-1.elb.amazonaws.com has address 3.124.127.18
grohe-idp-prod-apigwlb-01-665378892.eu-central-1.elb.amazonaws.com has address 3.74.7.173
grohe-idp-prod-apigwlb-01-665378892.eu-central-1.elb.amazonaws.com has address 35.157.44.100

But Netatmo seems to be using Azure:

api.netatmo.com: api.netatmo.com is an alias for front-azure.netatmo.net.
front-azure.netatmo.net has address 51.145.143.28

Anyone have any hints?

I wish I could provide more info, but all three are completely closed, and I have no problems with any of the other stuff on my network.

My DNS config:
[admin@MikroTik] > ip/dns/print
servers: 1.1.1.1,8.8.8.8,9.9.9.9
dynamic-servers:
use-doh-server:
verify-doh-cert: no
allow-remote-requests: yes
max-udp-packet-size: 4096
query-server-timeout: 2s
query-total-timeout: 10s
max-concurrent-queries: 100
max-concurrent-tcp-sessions: 20
cache-size: 2048KiB
cache-max-ttl: 1h
cache-used: 425KiB

I would sniff the DNS exchange between the IoT devices and the Mikrotik, before and after flushing the cache, into a file and use Wireshark to compare the first response (for each fqdn queried) after flushing the cache with the others.
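
If a full capture is hard to arrange, a lighter alternative (not a packet capture, and not a replacement for one) is to poll the router's resolver and an upstream resolver for the same name and log any differences. The sketch below uses only the Python standard library and handles A records only. The function names are made up for illustration, and 192.168.88.1 in the usage comment is an assumption (substitute the router's actual LAN address).

```python
import socket
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record, with recursion desired."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def _skip_name(msg: bytes, pos: int) -> int:
    """Return the offset just past an encoded (possibly compressed) name."""
    while True:
        length = msg[pos]
        if length == 0:
            return pos + 1
        if length & 0xC0 == 0xC0:  # compression pointer, always 2 bytes
            return pos + 2
        pos += 1 + length


def parse_a_records(msg: bytes) -> list[tuple[str, int]]:
    """Return (address, ttl) for every A record in the answer section."""
    _id, _flags, qdcount, ancount, _ns, _ar = struct.unpack("!HHHHHH", msg[:12])
    pos = 12
    for _ in range(qdcount):
        pos = _skip_name(msg, pos) + 4  # skip QTYPE and QCLASS
    records = []
    for _ in range(ancount):
        pos = _skip_name(msg, pos)
        rtype, _rclass, ttl, rdlength = struct.unpack("!HHIH", msg[pos:pos + 10])
        pos += 10
        if rtype == 1 and rdlength == 4:  # A record; CNAMEs etc. are skipped
            records.append((socket.inet_ntoa(msg[pos:pos + 4]), ttl))
        pos += rdlength
    return records


def query(server: str, name: str, timeout: float = 2.0) -> list[tuple[str, int]]:
    """Ask one specific DNS server for the A records of a name over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        data, _addr = sock.recvfrom(4096)
    return parse_a_records(data)


def diff_answers(router_records, upstream_records):
    """Return (addresses only the router gave, addresses only upstream gave)."""
    router = {addr for addr, _ttl in router_records}
    upstream = {addr for addr, _ttl in upstream_records}
    return sorted(router - upstream), sorted(upstream - router)


# Usage (assumed addresses; run it periodically and log the result):
#   diff_answers(query("192.168.88.1", "api.netatmo.com"),
#                query("1.1.1.1", "api.netatmo.com"))
```

A difference is not proof of a fault by itself: load-balanced names such as the ELB hostnames above normally rotate their addresses. What would be interesting is a router answer that stays different from upstream for a long time, or a TTL that looks implausible, right up to the point where a device drops off.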

Yes, you are right, that is probably the best way to find the problem.
But I was hoping I could avoid that, since it sometimes takes days before a device goes offline :frowning:

Okay, I will set up a sniffer once I find the time to do it.
Thanks

I have a netatmo weather station as well and I do not have this issue. I wonder if this is related to changes in IP at the service level? Do they all stop working at the same time?
Have you tried handing out 1.1.1.1 as a secondary DNS server via DHCP, so the devices can resolve externally in case of internal issues?

Good idea. I have added it, and it seems to have made a difference!

The Yale/August gateway was the most unstable, and I don’t think I have seen it offline since I added the secondary DNS server.
I even got so confident that I reconnected my Netatmo Welcome camera, which has been very unstable. Unfortunately it still is :frowning: But that is possibly unrelated.

However I discovered something else. My alarm system is connected to the network by a small OpenWRT router, and some simple shell code I have hacked together (I’m NOT a programmer!).
It uses MQTT and mosquitto_pub to send updates to my Home Assistant installation.
In the log I see lots of: “Unable to connect (Lookup error.).”

So I added the hostname to /etc/hosts, since my alarm system is naturally very important to me.
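
One drawback of pinning the name in /etc/hosts is that the entry goes stale if the Home Assistant address ever changes. A possible middle ground, sketched here in Python purely as an illustration (the alarm script itself is shell, and `resolve_with_fallback` is a made-up name), is to try a normal lookup first and only fall back to the last known good address when the lookup fails:

```python
import socket


def resolve_with_fallback(host, last_good=None, resolver=socket.getaddrinfo):
    """Return (address, fresh).

    fresh is True when the lookup succeeded and False when the remembered
    address was used because the lookup failed. The resolver argument exists
    only so the function can be exercised without a network.
    """
    try:
        infos = resolver(host, None, socket.AF_INET, socket.SOCK_STREAM)
        # Each entry is (family, type, proto, canonname, sockaddr);
        # for IPv4, sockaddr is an (address, port) pair.
        return infos[0][4][0], True
    except socket.gaierror:
        if last_good is None:
            raise  # nothing remembered, so the failure is passed on
        return last_good, False
```

The caller would store the address whenever `fresh` is True and log a warning whenever it is False, so the lookup errors stay visible instead of being hidden by the static entry.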