RouterOS default config suffers severe QoS issues on a 5G connection during downloads, caused by bufferbloat

I’ve got a hAP ax³ as my edge router. The internet connection is 5G, nominally 100Mbit download and 10Mbit upload, but the download is often only 10Mbit. So the connection speed is unstable and the available bandwidth is unknown.

The connection was behaving strangely: sometimes pages took a few seconds to even begin loading, and sometimes they wouldn’t load at all. These issues were caused by the router’s DNS cache failing to resolve queries for domain names that weren’t yet cached. The resolution failures were in turn caused by bufferbloat whenever something was being downloaded from a fast upstream host.

My normal ping is 30~100ms. While downloading something it went over 1400ms, sometimes over 2000ms, and DNS responses were being dropped, all due to bufferbloat.
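
To reproduce this kind of measurement, here is a rough sketch from the router terminal. It assumes 1.1.1.1 as an arbitrary reachable target (any stable host would do). Run it once on an idle link and once while a large download is in progress, then compare the reported round-trip times.

```routeros
# 30 echo requests; compare min/avg/max RTT on an idle link vs. under load
/ping 1.1.1.1 count=30
```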

I’ve configured a queue tree with CAKE, and the bufferbloat issue is mitigated. Now ping only increases to 50~150ms during a download, and DNS responses are no longer lost. I used CAKE because the bandwidth is unknown/unstable and CAKE offers the cake-autorate-ingress option.

I wonder why MikroTik doesn’t provide something like this in the default setup. And why are there no docs with a quick-setup recommendation for anti-bufferbloat QoS? This issue seriously worsens the internet experience (via a MikroTik router). MikroTik routers are often praised for routing performance, but is that because QoS is simply not handled at all by default, so the router has less work to do?

Other vendors provide this feature behind a single checkbox. https://www.waveform.com/tools/bufferbloat recommends the Amazon eero Pro 6 mesh WiFi, the NETGEAR Nighthawk Pro Gaming 6-Stream WiFi 6 Router (XR1000), the IQrouter – IQRV3 Self-Optimizing Router with Dual Band WiFi, or the Ubiquiti EdgeRouter 4.

My CAKE queue tree configuration:

/ip firewall mangle
add action=mark-packet chain=forward in-interface-list=WAN new-packet-mark=wan_download
add action=mark-packet chain=forward new-packet-mark=wan_upload out-interface-list=WAN

/queue type
add cake-autorate-ingress=yes cake-flowmode=dual-dsthost cake-nat=yes kind=cake name=cake-download
add cake-autorate-ingress=yes cake-flowmode=dual-srchost cake-nat=yes kind=cake name=cake-upload
/queue tree
add max-limit=95M name=cake-download packet-mark=wan_download parent=bridge queue=cake-download
add max-limit=9500k name=cake-upload packet-mark=wan_upload parent=ether1 queue=cake-upload

# disable fasttrack, export should contain action=fasttrack-connection as disabled=yes
/ip firewall filter
...
add action=fasttrack-connection chain=forward comment="defconf: fasttrack" connection-state=established,related disabled=yes hw-offload=yes
...
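
A minimal sketch of how the fasttrack rule could be disabled and the setup checked from the terminal. It assumes the defconf fasttrack rule is the only filter rule with action=fasttrack-connection, and that the mangle rules and queue names are the ones shown above:

```routeros
# disable every filter rule whose action is fasttrack-connection
/ip firewall filter disable [find action=fasttrack-connection]
# confirm the rule is now listed with the X (disabled) flag
/ip firewall filter print where action=fasttrack-connection
# while traffic flows, counters should grow on both mangle rules and both queues
/ip firewall mangle print stats
/queue tree print stats
```

If the queue counters stay at zero during a download, the traffic is probably not reaching the queues, for example because it is not being marked.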

Also, is the above a correct setup of a CAKE queue tree? I mean, it works, but is this the way it should be done? Or can something be improved?

CAKE forces all traffic through software traffic control, which is very slow; you won’t be able to use 1Gbps connections with MikroTik hardware if it is on by default.

Your pipe is so small and inconsistent, why bother with queuing at all?

This makes WAN bandwidth an even scarcer resource, one that shouldn’t be fully consumed by a single client or download connection for its own needs at the expense of every other client or stream.

Hosts send data at a rate that the client’s available bandwidth can absorb without loss, or lower, but not higher, since hosts don’t want to waste their own bandwidth on undelivered packets.

Throttling the connection, i.e. slowing down the forwarding of packets received from WAN, delays the client’s receipt of packets in the download stream. The client then responds later with TCP ACKs (or with application-level transmission control packets for UDP-based protocols), which makes the host conclude that the client can’t take data any faster. As a result, a single client or download connection doesn’t use the full bandwidth of the WAN connection.

As always: queuing traffic in the Tx direction is easy; the sender only has to observe its Tx buffer. Shaping in Rx, however, is hard and more often than not relies on knowing the maximum available throughput. If that figure fluctuates heavily, the receiver can’t really throttle effectively.

So your efforts to make download smoother may turn out to be futile.

Well, this makes synthetic benchmark results look great on paper, but the real-life experience is bad.

MikroTik’s official test results should include benchmarks with a QoS setup that can adapt to the variable maximum bandwidth of a WAN connection and still give all clients fair use of it. Otherwise, the test result numbers are very misleading for common household/SOHO use.

In my experience, all past providers, including wired 100Mbit and 1Gbit connections, had a contention ratio of 1:4 or worse, some even 1:16. The link to the ISP’s networking equipment was 100Mbps or 1Gbit, but the ISP’s own connection to the rest of the world had much less bandwidth and could cover only 1/16th of the aggregate bandwidth of all users.

Besides benchmarks that skew the impression of MikroTik devices’ real-life performance,…

It would be good if MikroTik at least provided official quick-start docs for QoS setup on WAN connections with variable maximum bandwidth. Other routers provide one checkbox in the admin interface.

Far from futile.

The above setup reduced the latency increase under load from an extra 1500~2000ms to only an extra 25~100ms.

It made the connection fairly usable: browsing is possible (no more resolution issues for cache-miss DNS queries), and call quality is now almost unaffected while some app, update, or other thing is being downloaded from a fast host.

I did blame the ISP for poor connection quality, but the issue was the QoS setup on my side. Whenever someone in the household started downloading something during my calls, video call quality got bad. And it was NOT the ISP’s fault.

OK, this was my fault. One cable was faulty after I re-arranged my network equipment, and it resulted in a 10Mbps link being negotiated somewhere in the chain of:

5G antenna - cable - PoE injector - cable - PoE injector - cable - hAP ax³

It seemed somewhat off that the connection speed dropped to 10Mbit and never went back up, not even during very-definitely-not-peak hours. Now the WAN connection is going over 40Mbps even during peak hours.