DNS over HTTPS, round robin support

Starting with RouterOS v6.47, it is possible to use DNS over HTTPS (DoH).

Something like this:

/ip dns set use-doh-server=https://cloudflare-dns.com/dns-query verify-doh-cert=yes

This is the example from https://wiki.mikrotik.com/wiki/Manual:IP/DNS.

Well, cloudflare-dns.com has multiple IP addresses. (Both IPv4 and IPv6.)

How does RouterOS handle this? It is not documented anywhere (or at least I could not find it), and I wonder if RouterOS will try the second address if the first one does not work.

Add two static DNS entries for cloudflare-dns.com, pointing to 104.16.248.249 and 104.16.249.249.

I’m sorry, that was not the question. The question was this: does RouterOS handle round-robin A records for DNS over HTTPS? For example, if the first address for the given domain is not available, will it try the second one? Or will it fail with SERVFAIL?

Please note that Cloudflare was brought up as an example only. I don’t want to use Cloudflare. What I really want is to run my own remote Pi-hole servers with multiple addresses, and use them from MikroTik routers over HTTPS. But I want to know whether RouterOS will distribute DNS requests between them, and I also want to know what happens if one of the servers becomes unavailable. This is (AFAIK) not documented.

Round robin is on the server side, not the client side. RouterOS is a client here.

Yes, it plays the role of an HTTPS client. All major browsers do this: they fetch all addresses, and if one address fails, they try another until they can connect. I think many other tools do the same (for example Squid, Apache proxy, etc.). For normal (port 53) DNS requests, having a primary and a secondary DNS server provides fault tolerance. I think that for DoH, round robin can do the same. If RouterOS can use round robin to provide fault tolerance for DoH, then I’m a happy camper. If it cannot, then the DoH feature in RouterOS is a toy that should be used in production only with caution.

But which way is it? (It is not documented anywhere.)

I thought that was the whole idea of using the router’s DNS service.
It would attempt to resolve DNS for you.
a. via its own cache
b. via your dynamic server entries
c. via your ISP connection if all above failed.

There are at least two DNS lookups involved. The first resolves the IP address for the domain in the DoH URL. Then the URL (and the addresses resolved for its domain part) is used to perform all other DNS lookups. For the first one to succeed, you must set up a regular DNS server. This is also documented here:

https://wiki.mikrotik.com/wiki/Manual:IP/DNS#DNS_over_HTTPS

Note that you need at least one regular DNS server configured for the router to resolve the DoH hostname itself.

My question was NOT about this first part. My question was about the second part: what happens if the domain name given in the DoH URL resolves to multiple addresses, and only some of them can be reached?

  • Will RoS connect to the first address? Or randomly? Or round-robin?
  • Will RoS try to connect to the next one if the first fails?

I have not worked / looked into DNS in detail for a couple of years, but suspect it has not changed that much.

DNS round robin does not provide fault tolerance; it provides a crude way of load balancing. As far as I know, a DNS server will not send the client “all” A / AAAA records. If it did, the round robin / local net priority functionality on DNS servers would be useless.

So, as mentioned earlier by @msatter, the router acts as a client in this case and will only receive one record when making a DNS request. Once the TTL has expired, it might get a different record.

You are wrong in many ways.

  1. DNS rr does provide fault tolerance. I’m actually the operator of a website that partly uses DNS rr for fault tolerance. (Also mentioned here: https://en.wikipedia.org/wiki/Round-robin_DNS) But it can only be used for fault tolerance if the https client is cooperatively using it the right way.
  2. DNS rr also provides load distribution. That is the correct term for it. The difference between balancing and distribution is that a balancer can monitor the state of the worker nodes and actively divert more traffic to less occupied workers. HAProxy can do load balancing, DNS rr can only do load distribution. (We are using both, because both have advantages and disadvantages.)

Wrong again. A DNS server usually sends a few A/AAAA records. If there are only a couple of records, it usually sends all of them. If there are too many, it might not send all of them. Actually, there is a known technique that avoids long-distance connections by returning the addresses of nearby servers (for the same host name). For example, if you use your ISP’s DNS server to look up google.com, it will probably return several addresses, most of which will be nearby.

It is true, however, that many programs call gethostbyname(3) (a library function, not a system call) and use only the first address it returns. But many other programs (including all major browsers) do not do this. They retrieve multiple addresses (if they can) and connect to the first one that is available. In other words, fault tolerance is possible with DNS rr only because the browsers (“HTTPS clients”) are designed that way.

My question was exactly this: is RouterOS DoH (acting as an HTTPS client) designed that way? Can it use DNS rr for fault tolerance or not?

This is nonsense. You should at least try before you post something like that.

╭─root@ns1 ~
╰─# host cloudflare-dns.com
cloudflare-dns.com has address 104.16.248.249
cloudflare-dns.com has address 104.16.249.249
cloudflare-dns.com has IPv6 address 2606:4700::6810:f9f9
cloudflare-dns.com has IPv6 address 2606:4700::6810:f8f9

How many addresses do you see?

Stupid question, but how does the router know which IP address cloudflare-dns.com resolves to, if you use only DoH?

I already gave an answer for that. But I’m going to paste it here for you.

https://wiki.mikrotik.com/wiki/Manual:IP/DNS#DNS_over_HTTPS

Note that you need at least one regular DNS server configured for the router to resolve the DoH hostname itself.

I don’t know the answer, and right now I’m too lazy to test it. But you can easily do it yourself. To watch the default behaviour, just add a logging rule in the output chain for destinations with tcp/443 (or whatever your DoH server uses). Then, again in the output chain, you can block (reject/drop) connections to selected address(es) and see how RouterOS deals with that.
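A minimal sketch of the two rules described above, assuming the Cloudflare addresses quoted earlier in this thread and the default DoH port 443. The comments and the log prefix are made up for illustration. Rule order matters: the rules must sit above anything in the output chain that would match the same traffic first.

```
/ip firewall filter
# Log every HTTPS connection attempt that the router itself originates
add chain=output protocol=tcp dst-port=443 action=log log-prefix="doh-out" comment="watch DoH connections"
# Disabled by default; enable it to simulate an outage of one address
add chain=output protocol=tcp dst-port=443 dst-address=104.16.248.249 action=reject reject-with=tcp-reset disabled=yes comment="simulate DoH outage"
```

With the second rule enabled, the log should show whether the router goes on to try 104.16.249.249.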

Well thanks a lot. The whole point of asking it on this forum was to avoid building a whole testing environment. :slight_smile: I can only test this if I do these first:

  • set up a domain name for my own DoH servers, with multiple A records
  • set up DNS over HTTPS servers on at least two of them, which also needs valid HTTPS certificates
  • set up a spare MikroTik router plus a packet sniffer (so that I’ll be able to check the connection attempts made by RouterOS)

I know how to test this! You can say that “you can easily do it yourself”, and you are right. It is not difficult. But it is time consuming. And this should be documented in the official docs anyway. I was hoping that somebody (maybe a MikroTik staff member?) could answer this question. This should not be something that needs to be tested to find out how it works. There are users here with access to the RouterOS source code. All it takes is a quick look.

Well yeah, it seems that I have no other option but to build a test environment for this. And if it turns out that DNS rr is not utilized for failover, then it will have been a waste of time. :frowning:

I do not disagree with you. It’s just that when I need to know something, I would rather spend a few minutes testing it than wait and hope that someone gives me the answer.

And for a basic test, simply add two records in IP->DNS->Static for any made-up hostname with random addresses, use that hostname as the DoH server address, and see where the router tries to connect. You don’t need working servers or any actual response, just to see whether the router tries to connect to both.
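The basic test could look roughly like this. The hostname doh.test.invalid and the 192.0.2.x addresses (a range reserved for documentation) are placeholders chosen for illustration, not anything from a real setup.

```
/ip dns static
add name=doh.test.invalid address=192.0.2.10
add name=doh.test.invalid address=192.0.2.20
/ip dns set use-doh-server=https://doh.test.invalid/dns-query verify-doh-cert=no
# Log the router's own connection attempts towards the placeholder range
/ip firewall filter add chain=output protocol=tcp dst-port=443 dst-address=192.0.2.0/24 action=log log-prefix="doh-test"
# After triggering a few lookups, check which addresses were tried
/log print where message~"doh-test"
```

If both placeholder addresses show up in the log, the router is at least attempting more than one address.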

At least, you have learned something after that. :slight_smile:

For me, it would take hours to set up a working, usable test environment. I’m not lazy, but I may not be clever enough to do it in less time.

All right, this is a good idea. Actually, it is the first usable answer. :slight_smile: I will try this. Although… if it works, then it also means that the documentation is not 100% correct. It says that in order to use DoH, you must set a regular DNS server, because it is needed to resolve the address of the DoH host. If this part works with static DNS entries, then maybe you don’t need to set a regular DNS server…

Scientists do experiments to find out how things are, because they don’t have a choice. I tend to learn things from documentation (and from others) when possible. It is just more effective.

Of course, there is a good side of learning something the hard way. But it would be better if it was documented.

I’d say it’s just a simplification for the most common target audience. If the DoH resolver is given by hostname (instead of a numeric address like https://1.1.1.1/dns-query), the router needs to get its IP address from somewhere. Another regular resolver is the best choice for most users, because they don’t have their own DoH resolvers, and if they added a local static record with the current address of some public one, they couldn’t be sure it would be the same tomorrow.

Okay, I managed to set up a test environment and play with it. Here are the results.

  1. Do we need a regular DNS server for resolving the hostname of the DoH server?

The answer is no, we don’t. Static DNS entries can be used to resolve the hostname of the DoH server.
I tested this the following way:

/ip dns static
add name=cloudflare-dns.com address=104.16.248.249
add name=cloudflare-dns.com address=104.16.249.249
/ip dns set servers="" use-doh-server=https://cloudflare-dns.com/dns-query verify-doh-cert=no
/ip dns cache flush
/system reboot

The reboot was probably not necessary. I just wanted to make 100% sure that the router can reboot and work without any regular DNS server IP configured.

  2. Does RouterOS use DNS round robin? Will it try a second IP for a DNS over HTTPS request when the first IP does not work?

The answer is yes, but there is some latency.

I tested this by mirroring the WAN port to another port and using Wireshark on that port to display all packets sent to or from the two static IP addresses, 104.16.248.249 and 104.16.249.249.
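For anyone without a spare port to mirror, the built-in packet sniffer might serve as a rough substitute. This is only a sketch of an alternative, not the method used in the test above, and the file name doh-test.pcap is made up.

```
# Capture only traffic to or from the two DoH addresses on port 443
/tool sniffer set filter-ip-address=104.16.248.249/32,104.16.249.249/32 filter-port=443 file-name=doh-test.pcap
/tool sniffer start
# Trigger a few lookups, then stop; the capture file can be downloaded and opened in Wireshark
/tool sniffer stop
```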

These are my observations:

  • RoS builds a new HTTPS connection when the first DNS lookup happens, and it keeps that connection alive (Connection: keep-alive) if possible.
  • So the very first request is a bit slow; further requests are reasonably fast.
  • RoS also seems to connect to both servers, even when both of them work perfectly.
  • These connections were closed every 2 seconds or so. (I’m not sure whether that was because of RoS or Cloudflare.) In other words, an HTTPS request is made periodically, every few seconds. As a result, DoH lookups are usually slower (on average) than normal DNS lookups.

I simulated a server outage with a firewall rule that prevents communication with one of the servers. A failure like this first results in a lookup error, but lookups work again once a new HTTPS connection is built. If you make multiple DNS requests within a second, sometimes several of them fail (until the HTTPS connection is built).

Example:

[admin@MikroTik] /ip firewall filter<SAFE> set 1 disabled=yes      
[admin@MikroTik] /ip firewall filter<SAFE> set 2 disabled=no       
[admin@MikroTik] /ip firewall filter<SAFE> :put [/resolve index.pl]  
failure: dns server failure
[admin@MikroTik] /ip firewall filter<SAFE> :put [/resolve index.pl]
failure: dns server failure
[admin@MikroTik] /ip firewall filter<SAFE> :put [/resolve index.pl]
194.8.46.13
[admin@MikroTik] /ip firewall filter<SAFE>

As you can see, I could retry the same request quickly enough to see two failures. I don’t think this would be a problem in production use, because DNS server outages are rare. All in all, I’m very satisfied with this. It means that DoH can be used in production, because the implementation is fault tolerant.
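If a script depends on a lookup during such a failover window, retrying a few times is one way to ride out the initial failures. A minimal sketch, where the limit of 5 attempts and the 1 second delay are arbitrary values picked for illustration:

```
{
    :local name "index.pl"
    :local attempt 0
    :local done false
    # Retry until one lookup succeeds or 5 attempts have been made
    :while (($done = false) && ($attempt < 5)) do={
        :set attempt ($attempt + 1)
        :do {
            :put [:resolve $name]
            :set done true
        } on-error={
            :put "attempt $attempt failed, retrying"
            :delay 1s
        }
    }
}
```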

Thanks for your help!

You are right, most users don’t have their own DoH resolvers. But now that we have a good DoH resolver in RouterOS, I can see how it will make my life much easier. :slight_smile: I have to take care of the networks of multiple (small) companies. I always wanted to do DNS-based content filtering, but it would have been too much work to set up and manage separate redundant Pi-hole servers for all of them. The new DoH implementation gives us the power to do DNS-based content filtering on fault-tolerant centralized servers, and to make these lookups secure and private at the same time.

The same could have been achieved with a different approach (e.g. DNS over VPN), but it is always good to have something that is so simple to set up on the client side. Just add two static DNS records, download a certificate, set the DoH URL, and you are ready to go! :slight_smile:
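As a rough sketch, the client-side setup could look like this. The hostname doh.example.com, the 192.0.2.x addresses, the /dns-query path and the file name my-ca.pem are all placeholders; the CA certificate is assumed to have been uploaded to the router already.

```
# Import the CA certificate that signed the DoH servers' certificates
/certificate import file-name=my-ca.pem passphrase=""
# One static record per DoH server address
/ip dns static
add name=doh.example.com address=192.0.2.10
add name=doh.example.com address=192.0.2.20
# Point the resolver at the DoH URL and verify the server certificate
/ip dns set use-doh-server=https://doh.example.com/dns-query verify-doh-cert=yes
```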