DNS cache and memory usage, without adlist

I have seen a lot of posts about the DNS cache and memory usage, both here and elsewhere on the Internet. Most of them seem to think it has to do with adlist. I am less sure.

This is a new router installation that has been in production for less than a week. Adlist has never been enabled, yet it’s leaking(?) memory. Also, /ip/dns/cache/flush doesn’t reduce cache-used.

[admin@router-01] /system/package/update> /system/routerboard/print 
       routerboard: yes               
             model: CCR2004-1G-12S+2XS
     serial-number: HE308HSBB7V       
     firmware-type: al64              
  factory-firmware: 7.6               
  current-firmware: 7.18.2            
  upgrade-firmware: 7.18.2
  
[admin@router-01] /ip/dns> print
                      servers: 1.1.1.1  
                               8.8.8.8  
                               8.8.4.4  
              dynamic-servers:          
               use-doh-server:          
              verify-doh-cert: no       
   doh-max-server-connections: 5        
   doh-max-concurrent-queries: 50       
                  doh-timeout: 5s       
        allow-remote-requests: yes      
          max-udp-packet-size: 1024     
         query-server-timeout: 2s       
          query-total-timeout: 10s      
       max-concurrent-queries: 500      
  max-concurrent-tcp-sessions: 20       
                   cache-size: 100000KiB
                cache-max-ttl: 1w       
      address-list-extra-time: 0s       
                          vrf: main     
           mdns-repeat-ifaces: vlan1604 
                               vlan1605 
                   cache-used: 62342KiB 

[admin@router-01] /ip/dns> /ip/dns/cache print count-only
1031

[admin@router-01] /ip/dns> adlist/print 
Flags: X - disabled 

[admin@router-01] /ip/dns> forwarders/print
Flags: X - disabled

cache-used is currently 62342KiB only because I have just raised the cache-size limit from 50000KiB. It maxed out the 50 MB in less than a day, and the logs started to complain that the cache was full. 50 MB of memory for ~1000 cached DNS entries seems like a lot. :)
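To put a number on "seems like a lot", here is a minimal Python sketch of the arithmetic, using only the two figures from the output above. The 512 bytes per record used for comparison is purely an assumption for illustration, not a documented RouterOS value.

```python
# Figures copied from the /ip/dns/print and "cache print count-only" output above.
cache_used_kib = 62342
cache_entries = 1031

# Average memory attributed to each cached entry.
kib_per_entry = cache_used_kib / cache_entries
print(f"{kib_per_entry:.1f} KiB per cached entry")

# ASSUMPTION (illustration only): a generous 512 bytes per cached record.
assumed_bytes_per_record = 512
expected_kib = cache_entries * assumed_bytes_per_record / 1024
print(f"about {expected_kib:.1f} KiB expected at {assumed_bytes_per_record} B per record")
```

Even with a generous per-record assumption, the reported usage is roughly two orders of magnitude higher than the entry count alone would suggest.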

I can also say that the mdns-repeater seems unrelated. I had the memory issues before allowing mdns packets through the filters, so that service was never hit.

Since it’s a new router installation I have not tried any older firmware than the one it’s currently running.

Am I missing something, or is there a huge memory leak? The DNS server does receive a lot of DNS requests. But it’s most often for the same hostnames/domains.

I’ve been seeing this as well, though not as drastically, and have been scouring the Internet for an explanation to no avail. I’m not using adlist, and memory usage has been steadily ticking up while the number of cache entries remains steady. I changed my DNS arrangement to have Pi-hole hit Cloudflare directly (I had it going through the RB4011 to work around a previous issue) and flushed the cache, but memory usage barely dropped despite going from ~3k entries to <200. The only thing that makes sense to me is that there’s a memory leak. I saw a post somewhere (Reddit?) where someone claimed their issue was resolved by changing a CNAME record to an A record. I do have several static local CNAME records, but changing them isn’t an option.

Now, 4 days later, most entries in the cache have expired.

/ip/dns/cache print count-only

told me I had ~860 entries, most of those being static. So even after entries expire, there isn’t enough memory to add new ones. Eventually the cache will be empty while still using 100 MB of RAM.

I increased the cache size from 100000KiB to 250000KiB. Within minutes it was using 190 MB instead of 100 MB. I now have ~1700 entries in the cache, and 193 of 250 MB cache memory in use. But I can’t really keep adding 100 MB of memory per week. :)

I am honestly not sure the cache size limit even works. It seems to stop entries from being added, but the router also seems to keep allocating memory for “something else”. If I increase cache-size, cache-used increases instantly, and by quite a lot. 10 MB of genuine cache entries in 2 seconds isn’t likely. This must be quite a serious bug, and I honestly don’t understand why it doesn’t get more attention.

[admin@router-01] /ip/dns> /system/clock/print ; /ip/dns/print ; /ip/dns/cache print count-only
                  time: 10:46:09        
                  date: 2025-04-10      
  time-zone-autodetect: yes             
        time-zone-name: Europe/Stockholm
            gmt-offset: +02:00          
            dst-active: yes             
                      servers: 1.1.1.1  
                               8.8.8.8  
                               1.0.0.1  
                               8.8.4.4  
              dynamic-servers:          
               use-doh-server:          
              verify-doh-cert: no       
   doh-max-server-connections: 5        
   doh-max-concurrent-queries: 50       
                  doh-timeout: 5s       
        allow-remote-requests: yes      
          max-udp-packet-size: 1024     
         query-server-timeout: 2s       
          query-total-timeout: 10s      
       max-concurrent-queries: 500      
  max-concurrent-tcp-sessions: 20       
                   cache-size: 250000KiB
                cache-max-ttl: 1w       
      address-list-extra-time: 0s       
                          vrf: main     
           mdns-repeat-ifaces: vlan1604 
                               vlan1605 
                   cache-used: 250000KiB
1380
[admin@router-01] /ip/dns> /system/clock/print ; set cache-size=270000KiB 
                  time: 10:46:33        
                  date: 2025-04-10      
  time-zone-autodetect: yes             
        time-zone-name: Europe/Stockholm
            gmt-offset: +02:00          
            dst-active: yes             
[admin@router-01] /ip/dns> /system/clock/print ; /ip/dns/print ; /ip/dns/cache print count-only
                  time: 10:46:35        
                  date: 2025-04-10      
  time-zone-autodetect: yes             
        time-zone-name: Europe/Stockholm
            gmt-offset: +02:00          
            dst-active: yes             
                      servers: 1.1.1.1  
                               8.8.8.8  
                               1.0.0.1  
                               8.8.4.4  
              dynamic-servers:          
               use-doh-server:          
              verify-doh-cert: no       
   doh-max-server-connections: 5        
   doh-max-concurrent-queries: 50       
                  doh-timeout: 5s       
        allow-remote-requests: yes      
          max-udp-packet-size: 1024     
         query-server-timeout: 2s       
          query-total-timeout: 10s      
       max-concurrent-queries: 500      
  max-concurrent-tcp-sessions: 20       
                   cache-size: 270000KiB
                cache-max-ttl: 1w       
      address-list-extra-time: 0s       
                          vrf: main     
           mdns-repeat-ifaces: vlan1604 
                               vlan1605 
                   cache-used: 261137KiB
1402

If I leave it capped for a week, it will jump 50 MB in a second once I raise the limit. :(

Have you already tried playing with the cache-max-ttl? I have set it to 1d (instead of your 1w).

Are you sure your DNS server is only used internally (and its ports aren’t open to the world)?

No, I haven't tested that. It doesn't make sense to me, since only 1400 records are stored and they take > 250 MB of memory. But I can go ahead and reduce it to an hour straight away.

I have made sure the DNS server does not respond to anything on the WAN interfaces. I have double-checked 53/tcp from an external IP using telnet, and the UDP rules for port 53 are identical (apart from the protocol), so I am sure that works. Running the host command against my IP also confirms that nothing responds. And even if it were open to the Internet, 1400 records in the cache still shouldn't use 250 MB of memory; it doesn't make sense. And as others have reported, flushing the cache changes nothing. The entries in the cache are dropped, but the memory usage stays the same.

But thank you for your suggestion. At 12:10:37 local time I changed max ttl to 1h. Cache used is currently 263315KiB. I will check cache usage again later today.

Could you share your dns settings?

/ip dns export

Make sure to remove anything not relevant (like static DNS entries).

I would love to. It's a simple config.

[admin@router-01] /ip/dns> /ip/dns/export
# 2025-04-11 08:25:40 by RouterOS 7.18.2
# software id = EI63-W5GT
#
# model = CCR2004-1G-12S+2XS
# serial number = xxxxxxxxxxx
/ip dns
set allow-remote-requests=yes cache-max-ttl=1h cache-size=270000KiB max-concurrent-queries=500 max-udp-packet-size=1024 \
    mdns-repeat-ifaces=vlan1604,vlan1605 servers=1.1.1.1,8.8.8.8,1.0.0.1,8.8.4.4
/ip dns static

Plus 9 static CNAMEs on top of that. They look like this (slightly modified):

add cname=xxx.localnet name=yyy.localnet type=CNAME

Another “proof” that memory is likely being allocated beyond the configured limit.

Cache used is 270000 KiB, which is the configured max. I raise the limit to 500000 KiB. Within 4 seconds (the time between my commands) I have 356223 KiB used. That’s an instant increase of roughly 86000 KiB. It is this large because I waited ~2 days before raising the limit.

Right now it seems like I will have to reboot the router every other month or so to make sure that it has enough memory.

[admin@router-01] /ip/dns> /system/clock/print ; /ip/dns/print ; /ip/dns/cache print count-only
                  time: 09:26:11        
                  date: 2025-04-13      
  time-zone-autodetect: yes             
        time-zone-name: Europe/Stockholm
            gmt-offset: +02:00          
            dst-active: yes             
                      servers: 1.1.1.1  
                               8.8.8.8  
                               1.0.0.1  
                               8.8.4.4  
              dynamic-servers:          
               use-doh-server:          
              verify-doh-cert: no       
   doh-max-server-connections: 5        
   doh-max-concurrent-queries: 50       
                  doh-timeout: 5s       
        allow-remote-requests: yes      
          max-udp-packet-size: 1024     
         query-server-timeout: 2s       
          query-total-timeout: 10s      
       max-concurrent-queries: 500      
  max-concurrent-tcp-sessions: 20       
                   cache-size: 270000KiB
                cache-max-ttl: 1h       
      address-list-extra-time: 0s       
                          vrf: main     
           mdns-repeat-ifaces: vlan1604 
                               vlan1605 
                   cache-used: 270000KiB
868
[admin@router-01] /ip/dns> /ip/dns/set cache-size=500000KiB
[admin@router-01] /ip/dns> /system/clock/print ; /ip/dns/print ; /ip/dns/cache print count-only
                  time: 09:26:15        
                  date: 2025-04-13      
  time-zone-autodetect: yes             
        time-zone-name: Europe/Stockholm
            gmt-offset: +02:00          
            dst-active: yes             
                      servers: 1.1.1.1  
                               8.8.8.8  
                               1.0.0.1  
                               8.8.4.4  
              dynamic-servers:          
               use-doh-server:          
              verify-doh-cert: no       
   doh-max-server-connections: 5        
   doh-max-concurrent-queries: 50       
                  doh-timeout: 5s       
        allow-remote-requests: yes      
          max-udp-packet-size: 1024     
         query-server-timeout: 2s       
          query-total-timeout: 10s      
       max-concurrent-queries: 500      
  max-concurrent-tcp-sessions: 20       
                   cache-size: 500000KiB
                cache-max-ttl: 1h       
      address-list-extra-time: 0s       
                          vrf: main     
           mdns-repeat-ifaces: vlan1604 
                               vlan1605 
                   cache-used: 356223KiB
904
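The size of these jumps can be tabulated from the timestamps and cache-used values in the outputs above. This is a small Python sketch of that bookkeeping; the values are copied from the prints, and nothing here queries the router.

```python
from datetime import datetime

# (time before, cache-used before, time after, cache-used after).
# All values are copied from the outputs above; cache-used is in KiB.
observations = [
    ("2025-04-10 10:46:09", 250000, "2025-04-10 10:46:35", 261137),
    ("2025-04-13 09:26:11", 270000, "2025-04-13 09:26:15", 356223),
]

def summarize(obs):
    """Return (seconds elapsed, KiB added) for one before/after pair."""
    t0, used0, t1, used1 = obs
    fmt = "%Y-%m-%d %H:%M:%S"
    seconds = (datetime.strptime(t1, fmt) - datetime.strptime(t0, fmt)).total_seconds()
    return seconds, used1 - used0

results = [summarize(o) for o in observations]
for seconds, jump_kib in results:
    print(f"+{jump_kib} KiB ({jump_kib / 1024:.1f} MiB) within {seconds:.0f} s")
```

In both cases the entry count barely moved (1380 to 1402, and 868 to 904), so the added memory cannot plausibly be explained by new cache entries.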

As an experiment I have left the DNS cache-size setting unchanged and just monitored memory usage on the router itself. The two resource prints below were taken roughly 5 days apart. Free memory has dropped by ~216 MiB in that time. I believe it’s the DNS server.

[admin@router-01] /ip/dns> /system/resource/print
                   uptime: 2w2d22h5m
                  version: 7.18.2 (stable)
               build-time: 2025-03-11 11:59:04
         factory-software: 7.6
              free-memory: 3046.9MiB
             total-memory: 4096.0MiB
                      cpu: ARM64
                cpu-count: 4
            cpu-frequency: 1700MHz
                 cpu-load: 3%
           free-hdd-space: 99.1MiB
          total-hdd-space: 128.0MiB
  write-sect-since-reboot: 20185
         write-sect-total: 38036
               bad-blocks: 0.1%
        architecture-name: arm64
               board-name: CCR2004-1G-12S+2XS
                 platform: MikroTik
[admin@router-01] /ip/dns> /system/resource/print
                   uptime: 3w1d12m17s
                  version: 7.18.2 (stable)
               build-time: 2025-03-11 11:59:04
         factory-software: 7.6
              free-memory: 2830.9MiB
             total-memory: 4096.0MiB
                      cpu: ARM64
                cpu-count: 4
            cpu-frequency: 1700MHz
                 cpu-load: 6%
           free-hdd-space: 99.0MiB
          total-hdd-space: 128.0MiB
  write-sect-since-reboot: 27643
         write-sect-total: 45494
               bad-blocks: 0.1%
        architecture-name: arm64
               board-name: CCR2004-1G-12S+2XS
                 platform: MikroTik
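A rough growth rate can be derived from the two prints above. The Python sketch below parses the uptime strings and divides the drop in free-memory by the elapsed time. The linear extrapolation at the end is an assumption, and free-memory covers the whole router, so attributing all of it to the DNS service is a belief rather than something this calculation proves.

```python
import re
from datetime import timedelta

UNIT_SECONDS = {"w": 604800, "d": 86400, "h": 3600, "m": 60, "s": 1}

def parse_uptime(text):
    """Parse an uptime string such as '2w2d22h5m' into a timedelta."""
    parts = re.findall(r"(\d+)([wdhms])", text)
    return timedelta(seconds=sum(int(n) * UNIT_SECONDS[u] for n, u in parts))

# (uptime, free-memory in MiB), copied from the two /system/resource/print outputs.
first = (parse_uptime("2w2d22h5m"), 3046.9)
second = (parse_uptime("3w1d12m17s"), 2830.9)

elapsed_days = (second[0] - first[0]).total_seconds() / 86400
lost_mib = first[1] - second[1]
rate_mib_per_day = lost_mib / elapsed_days

# ASSUMPTION: growth stays linear, which the data above cannot confirm.
days_until_exhausted = second[1] / rate_mib_per_day

print(f"{lost_mib:.1f} MiB lost over {elapsed_days:.2f} days")
print(f"about {rate_mib_per_day:.1f} MiB/day, free memory gone in ~{days_until_exhausted:.0f} days")
```

Under that linear assumption the result lands in the same range as the "reboot every couple of months" estimate.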

What surprised me is that increasing the cache size by 200 MB wasn’t enough. I guess it was already overfull. I had to raise the limit by almost 240000 KiB before cache-used dropped below it.

[admin@router-01] /ip/dns> /system/clock/print ; /ip/dns/print ; /ip/dns/cache print count-only
                  time: 13:20:26
                  date: 2025-04-24
  time-zone-autodetect: yes
        time-zone-name: Europe/Stockholm
            gmt-offset: +02:00
            dst-active: yes
                      servers: 1.1.1.1
                               8.8.8.8
                               1.0.0.1
                               8.8.4.4
              dynamic-servers:
               use-doh-server:
              verify-doh-cert: no
   doh-max-server-connections: 5
   doh-max-concurrent-queries: 50
                  doh-timeout: 5s
        allow-remote-requests: yes
          max-udp-packet-size: 1024
         query-server-timeout: 2s
          query-total-timeout: 10s
       max-concurrent-queries: 500
  max-concurrent-tcp-sessions: 20
                   cache-size: 500000KiB
                cache-max-ttl: 1h
      address-list-extra-time: 0s
                          vrf: main
           mdns-repeat-ifaces: vlan1604
                               vlan1605
                   cache-used: 500000KiB
873
[admin@router-01] /ip/dns> /ip/dns/set cache-size=700000KiB
[admin@router-01] /ip/dns> /system/clock/print ; /ip/dns/print ; /ip/dns/cache print count-only
                  time: 13:20:29
                  date: 2025-04-24
  time-zone-autodetect: yes
        time-zone-name: Europe/Stockholm
            gmt-offset: +02:00
            dst-active: yes
                      servers: 1.1.1.1
                               8.8.8.8
                               1.0.0.1
                               8.8.4.4
              dynamic-servers:
               use-doh-server:
              verify-doh-cert: no
   doh-max-server-connections: 5
   doh-max-concurrent-queries: 50
                  doh-timeout: 5s
        allow-remote-requests: yes
          max-udp-packet-size: 1024
         query-server-timeout: 2s
          query-total-timeout: 10s
       max-concurrent-queries: 500
  max-concurrent-tcp-sessions: 20
                   cache-size: 700000KiB
                cache-max-ttl: 1h
      address-list-extra-time: 0s
                          vrf: main
           mdns-repeat-ifaces: vlan1604
                               vlan1605
                   cache-used: 700000KiB
873
[admin@router-01] /ip/dns> /ip/dns/set cache-size=1000000KiB
[admin@router-01] /ip/dns> /system/clock/print ; /ip/dns/print ; /ip/dns/cache print count-only
                  time: 13:20:55
                  date: 2025-04-24
  time-zone-autodetect: yes
        time-zone-name: Europe/Stockholm
            gmt-offset: +02:00
            dst-active: yes
                      servers: 1.1.1.1
                               8.8.8.8
                               1.0.0.1
                               8.8.4.4
              dynamic-servers:
               use-doh-server:
              verify-doh-cert: no
   doh-max-server-connections: 5
   doh-max-concurrent-queries: 50
                  doh-timeout: 5s
        allow-remote-requests: yes
          max-udp-packet-size: 1024
         query-server-timeout: 2s
          query-total-timeout: 10s
       max-concurrent-queries: 500
  max-concurrent-tcp-sessions: 20
                   cache-size: 1000000KiB
                cache-max-ttl: 1h
      address-list-extra-time: 0s
                          vrf: main
           mdns-repeat-ifaces: vlan1604
                               vlan1605
                   cache-used: 738528KiB
898

I am still a strong believer that there is a major memory leak in the DNS service.

I replaced all my CNAMEs with A records and the memory leak is gone. Too bad that MikroTik doesn’t acknowledge this bug.
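For anyone wanting to try the same workaround, here is a minimal Python sketch that rewrites static entries of the form shown in the export earlier in the thread into A records. It assumes each CNAME target has a fixed, known address. The `TARGET_ADDRESSES` mapping and the 192.0.2.10 address are hypothetical placeholders, and the generated lines should be reviewed before being pasted under /ip dns static.

```python
import re

# HYPOTHETICAL mapping of CNAME targets to their addresses; replace with real ones.
TARGET_ADDRESSES = {"xxx.localnet": "192.0.2.10"}

# Matches static entries in the form shown in the export earlier in the thread.
CNAME_LINE = re.compile(r"^add cname=(\S+) name=(\S+) type=CNAME$")

def cname_to_a(line, addresses):
    """Turn one static CNAME 'add' line into an equivalent static A 'add' line."""
    match = CNAME_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a recognised CNAME entry: {line!r}")
    target, name = match.groups()
    if target not in addresses:
        raise KeyError(f"no address known for CNAME target {target}")
    return f"add address={addresses[target]} name={name} type=A"

converted = cname_to_a("add cname=xxx.localnet name=yyy.localnet type=CNAME", TARGET_ADDRESSES)
print(converted)
```

The obvious trade-off is that each A record has to be updated by hand if the target's address changes, which is exactly what the CNAMEs were avoiding.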

I hope that’s fixed in 7.19, but the changelog is far too vague. A clearer changelog could have saved me many hours.

*) dns - improved DNS server service stability;

I can confirm I also got a memory leak in the DNS resolver cache when using CNAMEs on the most recent stable firmware (7.19.3).