some quick comments on configuring cake

“The only thing you have to prevent users from doing is setting the
CAKE ‘bandwidth’ parameter when it’s installed as a leaf qdisc under
HFSC. Running cake in ‘unlimited’ mode (which is the default anyway) as
a leaf in an HFSC tree is perfectly fine, so if users want to do that,
I’d say let them.”

is where Toke left off. This is also where I am confused as to which, if any, of the successful benchmarks above had HFSC at all…

To avoid further flooding of the 7.30beta thread with Cake topics, here are some results taken from my home network:

RB5009, ROS 7.2.2, Fiber uplink at SFP1 using PPPoE with NAT capped at nominal 500/100 by the ISP equipment at the other end of the fiber.
The ISP UL shaper does a reasonable job, but the DL shaper is awful, as visible in the plot without a queue.

ROS simple queue setup targeting the PPPoE uplink interface

/queue type
add name=cake-WAN-tx kind=cake cake-diffserv=diffserv3 cake-flowmode=dual-srchost cake-nat=yes
add name=cake-WAN-rx kind=cake cake-diffserv=besteffort cake-flowmode=dual-dsthost cake-nat=yes

/queue simple
add max-limit=500M/100M name=queue1 queue=cake-WAN-rx/cake-WAN-tx target=wan-pppoe1

This gives very good results in the flent rrul test. Almost no latency increase and both DL/UL are running at nominal speed, saturated by the 4 parallel connections.
flentres_cake.png
The same test with the simple queue on wan-pppoe1 disabled shows high buffer bloat >100ms. The latency under load increases by a factor of 10.
The total DL is about 1/2 of the line rate, because the 4 parallel connections are fighting each other.
flentres_noqueue.png
Given how well it works, it would really be interesting to hear what exact reasons MT has for disallowing use cases such as the one above in the latest 7.3 beta.

Disclaimer - I’m barely a dabbler when it comes to this stuff and I’m still not comfortable with some of the jargon and abbreviations. But I’m trying.

I have a RB760iGS (hEX S) (256MB RAM) that is my office’s gateway/firewall/router. Our internet is provided via a WISP - they’re using Ubiquiti equipment. Our connection is 25M/5M - it tests out a little less.

I just upgraded the router to 7.3rc1 and implemented the following based on some examples I’ve seen:

/queue type
add cake-ack-filter=filter cake-diffserv=diffserv4 cake-flowmode=dual-srchost \
    cake-memlimit=32.0MiB cake-mpu=84 cake-nat=yes cake-overhead=38 \
    cake-overhead-scheme=ethernet cake-rtt-scheme=internet kind=cake name=\
    cake_UL
add cake-diffserv=diffserv4 cake-flowmode=dual-dsthost cake-memlimit=32.0MiB \
    cake-mpu=84 cake-nat=yes cake-overhead=38 cake-overhead-scheme=ethernet \
    cake-rtt-scheme=internet cake-wash=yes kind=cake name=cake_DL

/queue simple
add dst=ether1-Internet name=queue1 queue=cake_UL/cake_DL target=""

The cake-memlimit - I’ve seen suggestions that 32M is a good number to start with and this router certainly has plenty available. Question - what’s the default value?

The interface queues are all “only-hardware”. I also disabled the fasttrack I had in my forward filters. Is this all that’s needed for me to get started with this? What information can I provide to assist with validating performance? Running the Waveform bufferbloat test gives me an A+. However - I also get that A+ with the queue disabled.

Am I correct that with the queue enabled cake is supposed to automagically implement qos without my needing to mark packets in mangle?

Thank you for the explanation… I’m on cable, with a modem I have zero control over… suspect that is the culprit.

Your target has no data, and simple queuing does not take effect.

/queue simple
add limit-at=940M/143M max-limit=950M/146M name=CAKE queue=cake-down/cake-up \
    target=pppoe-out1

My “target” should have been set (ether1) - don’t know why it didn’t show up in the command line (it was set via the Winbox GUI). I’ve now set both “dst” and “target” to ether1.

I’m assuming the limit numbers you’re showing are for your own connection - as I stated mine is 25M/5M. I left the limits out as I thought cake would auto-configure/adapt without explicit limits set.

It doesn’t need one. But a simple queue needs a target interface or IP set before you get to CAKE. And if you know that the link you are inserting CAKE (or any queue) on has a fixed speed, then you should set that speed in the queue. All queues benefit from having a theoretical/ideal max speed as a basis.
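
As a rough sketch of that advice applied to the 25M/5M link described above: the queue name `cake-wan` is made up for illustration, the interface name and the `cake_UL`/`cake_DL` queue types are the ones from the earlier post, and the download/upload ordering of `max-limit` and `queue` is an assumption copied from the other interface-targeted examples in this thread. The limits are the nominal plan rates; since the link "tests out a little less", the real values would presumably need to sit at or below what it actually measures.

```routeros
/queue simple
# Hypothetical queue name; target is the WAN-facing interface.
# With an interface target, the other examples in this thread list
# download first and upload second, so that order is assumed here.
add name=cake-wan target=ether1-Internet max-limit=25M/5M queue=cake_DL/cake_UL
```

Without `target` the queue matches nothing, and without `max-limit` the queue has no rate to shape to, so both are set explicitly here.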

Clearly the gist of this thread is that you should use Flent, https://flent.org, to test/compare your different settings… Especially if you’re saying you get the same “A+” speedtest both with CAKE and without it: there isn’t much for CAKE to do if your packet flow is working okay.

Random plug:

I am very happy to have picked up more flent converts. My favorite feature of flent is the ability to do comparison plots!

Given that it is too hard to install easily on OSX, and the core devs don’t have much time to maintain it, I am trying to put together a proposal to form a foundation around it, whether via NSF or SBIR grant, or some other process, I don’t know. But if you are into forming orgs, or know anyone that is? Please help with https://docs.google.com/document/d/19GPpuFG4p9uG1sR_jVW8O3ptpoBfuQJxPuo5lIZf7hc/edit

I really hope they understand this now.

In other random work on trying to help mikrotik and the userbase out today, I posted these. I note that a big point for me was always that getting these algorithms running natively, at line rate, has a lot of benefits (really low cpu cost that way, for starters), and that at least some of the shaping y’all have been forced to do was due to the lack of backpressure in the ethernet and wifi drivers.

http://forum.mikrotik.com/t/bql-and-default-queue-lengths/158510/1 - BQL and fq_codel cut many ms without shaping out of line rate.

And on fq_codel running native on the wifi:

http://forum.mikrotik.com/t/fq-codel-on-mikrotik-wifi/158515/1

I fear that mikrotik is also running way behind on these fronts. :frowning:

I regret that I have to scale back my commitment (my grant ran out, nobody is paying me) to helping y’all out here for a while. Please summon me when there is progress on mikrotik’s front? Also, my natural metier is email - the cake mailing list over on lists.bufferbloat.net is pretty active, in particular. Thx for all the wonderful work on documenting what works, and the flent plots, and everything. I will try to somehow condense this mutually educational exercise into a more compact form over the coming months.

I have lost track… can an interface queue also shape inbound?

Seems to be egress only.

I’m sharing some of Flent tests I ran on hAP ac2 with ROS 7.3.1.

Baseline, no queues, fasttrack enabled. Router CPU utilization around 6% during the test.
no_queues_fasttrack.png
FQ_codel, simple queue configuration, fasttrack disabled. CPU 29%.

/queue type
add fq-codel-limit=1000 fq-codel-quantum=300 fq-codel-target=12ms kind=fq-codel name=fq-codel
/queue simple
add max-limit=118M/11M name=fq-codel queue=fq-codel/fq-codel target=ether1

fq_codel_fasttrack.png
Cake, simple queue configuration, fasttrack disabled. CPU 31%.

/queue type
add cake-flowmode=dual-srchost cake-nat=yes kind=cake name=cake-up
add cake-flowmode=dual-dsthost cake-nat=yes kind=cake name=cake-down
/queue simple
add max-limit=118M/11M name=cake queue=cake-down/cake-up target=ether1

cake_simple_queue.png
The next two tests are with fasttrack enabled. Queue trees are attached to the interface HTB, which allows the queues to work along with fasttrack. This is how I’ve been using fq_codel for some time now, with good real-world experience.

Fq_codel, queue tree, fasttrack enabled. CPU 16%.

/queue type
add fq-codel-limit=1000 fq-codel-quantum=300 fq-codel-target=12ms kind=fq-codel name=fq-codel
/queue tree
add bucket-size=0.01 max-limit=118M name=download packet-mark=no-mark parent=bridge1 queue=fq-codel
add bucket-size=0.01 max-limit=11M name=upload packet-mark=no-mark parent=ether1 queue=fq-codel

fq_codel_simple_queue.png
Cake, queue tree, fasttrack enabled. CPU 19%.

/queue type
add cake-flowmode=dual-srchost cake-nat=yes kind=cake name=cake-up
add cake-flowmode=dual-dsthost cake-nat=yes kind=cake name=cake-down
/queue tree
add bucket-size=0.01 max-limit=118M name=download packet-mark=no-mark parent=bridge1 queue=cake-down
add bucket-size=0.01 max-limit=11M name=upload packet-mark=no-mark parent=ether1 queue=cake-up

cake_fasttrack.png
I don’t fully understand how to interpret the graphs other than looking at the average black line. My takeaways:

  • It seems there isn’t much difference between simple queue vs queue tree with interface HTB. But with interface HTB I can have fasttrack enabled, which significantly lowers CPU utilization, almost by a half with fq_codel. Also, the graphs are more smooth with fasttrack on.
  • Observing WAN interface traffic graph, I could clearly see how cake’s graph was a lot more jagged and also smaller overall (less download bandwidth) - compared to the same scenario with fq_codel.
  • There is less overlap in the different TCP upload lines for cake, so it seems to separate those different types of traffic better. Not sure what that means in practical terms.
  • If I interpret the graphs correctly, both cake and fq_codel showed good results, massively better over baseline.
  • There are definitely a lot of options with cake to play with, I used almost all defaults. For fq_codel I have more customized configuration based on several recommendations from trusted sources.

For the time being I’m sticking with fq_codel as it’s known to use less CPU (which I need available for several WireGuard VPN tunnels). I have been using fq_codel because it wasn’t clear if there were issues with cake in RouterOS. Now that 7.3 came out, and the whole matter has been clarified about how it’s supposed to be configured, this is a good option too. I just don’t know if I see any improvements over my current configuration. Maybe there are, I just don’t know what to look for in RRUL tests :slight_smile: .

If anyone cares to comment on the results or make suggestions, I would be interested to hear them out.
rrul-2022-06-11T150550.863792.fq_codel-queue_tree-fasttrack_on.flent.gz (221 KB)
rrul-2022-06-11T151710.095351.cake-queue_tree-fasttrack_on.flent.gz (239 KB)
rrul-2022-06-11T151204.298760.no_queues-fasttrack_on.flent.gz (228 KB)
rrul-2022-06-11T152832.655991.fq_codel-simple_queue-fasttrack_off.flent.gz (227 KB)
rrul-2022-06-11T152347.781716.cake-simple_queue-fasttrack_off.flent.gz (231 KB)

Really great report, thank you. I’m hoping this is a really stable release! I’m pleased that in this scenario cake is only about 3% more cpu.

  1. something weird happened on “Cake, simple queue configuration, fasttrack disabled.” - did you reset the qdisc? A typical “hit” from some other flow on the link affects throughput, not latency…

  2. Loading the *.flent.gz files into flent-gui allows you to produce comparison plots, so you could look at the fq_codel and cake results on the same plot. You can also rescale multiple plots to be on the same scale. It helps a LOT when looking at plots. It was really great that you achieved more throughput in general than the FIFO. I think. While the main rrul plot is VERY useful and needed to check for anomalies (like what happened in the first item above), I actually tend to revert to using the cdf or whisker plots.

(I try to get folk to upload their flent.gz files, but even more so to go hog wild with flent “Data->add other open files”. I have hundreds of thousands of tests at this point I can just fly through…)

  3. Your cake result shows the effective use of diffserv to put best effort (BE), background (CS1), CS5 and EF into different bins. If you don’t want to differentiate between dscp types, use cake besteffort (which saves on cpu) (or the rrul_be test). It also explains why the graphs look different.

  4. You will get a little more upload throughput if you enable cake’s ack-filter. This test does not show the benefit of per host/per flow fq that cake can do.
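
For anyone wanting to try those last two suggestions, a minimal sketch could look like the following. It reuses the parameters already shown in this thread (`cake-diffserv=besteffort`, `cake-ack-filter=filter`); the names `cake-up-be` and `cake-down-be` are made up here, and the existing queue tree or simple queue entries would still have to be pointed at them.

```routeros
/queue type
# Hypothetical names. besteffort puts all traffic in a single tin,
# i.e. no differentiation between DSCP classes.
# The ack-filter is only enabled on the upload side.
add name=cake-up-be kind=cake cake-diffserv=besteffort cake-flowmode=dual-srchost cake-nat=yes cake-ack-filter=filter
add name=cake-down-be kind=cake cake-diffserv=besteffort cake-flowmode=dual-dsthost cake-nat=yes
```

Rerunning the same test against these types (or running rrul_be against the original ones) should make the plots easier to compare with the fq_codel runs.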

My dreams:

  1. Get more of y’all (ideally mikrotik) to obsolete the default on interface pfifo AND sfq in favor of fq_codel (and test that without shaping - running 2 ports into one). And increase the packet limit especially if they also get BQL.
  2. one day!! be able to use cake’s integral shaper on both egress and ingress without the htb.
  3. Get useful statistics back from fq_codel and cake on drops, marks, and backlog into a Grafana-like tool

and lastly, get this into their WiFi device drivers, if they haven’t already: https://www.cs.kau.se/tohojo/airtime-fairness/

I’m not sure what happened there. Even though I chose a quiet time on the home network, I thought some phone or device started a backup or something. But if you say latency would not have been affected, then I don’t know. I wasn’t touching anything during the test.


(ideally mikrotik) to obsolete the default on interface pfifo AND sfq in favor of fq_codel

Would there be a benefit in putting fq_codel on the physical interface instead of default “only-hardware-queue” and run that along with cake in a simple queue? The configuration certainly allows it.
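
To make the question concrete, this is roughly what that change would look like, assuming the `fq-codel` queue type defined in the earlier post and `ether1` as the WAN port. It is only a sketch of the configuration being asked about, not a recommendation; whether it helps is exactly the open question.

```routeros
/queue interface
# Replace the default only-hardware-queue on the WAN port with the
# fq-codel queue type defined earlier (interface queues act on egress).
set ether1 queue=fq-codel
```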


If you don’t want to differentiate between dscp types, use cake besteffort (which saves on cpu)

What would be the real-world implications in differentiating between dscp types vs not? If I understand correctly, it would improve certain latency-sensitive traffic even further.


I appreciate your feedback and suggestions, I will make the changes and try to find time to run more tests. Also, I attached the flent.gz files to the original post just in case.

Good day guys. I’m really glad that the cake sqm was finally added to Router OS V7 and it’s working perfectly great in my network. I want to apply the diffserv [4] to my network instead of best effort but I have no idea how to apply the DSCP to the firewall like what I normally do in OpenWRT. I’m just a hobbyist with a Small Business and just had a small knowledge in Mikrotik Routers.

Started using cake now. It’s working quite well in the simple queue on my WAN interface, except that when I’m downloading a file over tcp/443, websites slow down (image load times etc.). This doesn’t happen when watching video or browsing on other devices though.

My guess is that cake can’t separate the http traffic load on the same device.. is there any setting that can fix this issue?

edit: setting my QT’s PCQ to both src and dst address did the separation for different http sources for same device.

For cake qos, I read the manual on the internet but still don’t know if I should use wash or ack filter and which diffserv to use.. can someone help me with this?

edit2: I have now set the DSCPs in my mangle rules, each according to its traffic type, and read up on diffserv, wash and ack filter. It is even better now.
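
For others asking how DSCP marking in mangle fits together with cake's diffserv modes, here is a hedged sketch. The port matches, comments and the queue type name `cake-up-ds4` are purely hypothetical; `change-dscp`/`new-dscp` are the mangle action and parameter, and the assumption is that the queue sees the packet after the postrouting mangle chain and that the traffic is not fasttracked (fasttracked packets bypass mangle).

```routeros
/ip firewall mangle
# Hypothetical example: mark outgoing UDP to port 5060 as EF (DSCP 46).
add chain=postrouting out-interface=ether1 protocol=udp dst-port=5060 action=change-dscp new-dscp=46 passthrough=yes comment="hypothetical: EF"
# Hypothetical example: mark an assumed bulk port range as CS1 (DSCP 8).
add chain=postrouting out-interface=ether1 protocol=tcp dst-port=6881-6889 action=change-dscp new-dscp=8 passthrough=yes comment="hypothetical: CS1 bulk"

/queue type
# Hypothetical name; diffserv4 sorts traffic into four tins by DSCP.
add name=cake-up-ds4 kind=cake cake-diffserv=diffserv4 cake-flowmode=dual-srchost cake-nat=yes
```

Which ports or addresses to match is entirely site-specific, so treat the two rules as placeholders for your own classification.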

I have been using cake for a while now. My connection is LTE 50M/10M. Most of the day I see 40-45M and 9-10M when I run speedtest.net. Bandwidth can vary, as my ISP uses some kind of “priority management” algorithm to distribute the available bandwidth to clients. So the downlink can sometimes drop to 25M and the uplink to 5-6M. Then bufferbloat really kicks in. My ISP seems to shape the “line-rate” quite well, but once my connection is throttled, latency increases to well over 100-200ms.

Now that I have discovered the neat flent cli and graphs posted here in the forum, and the waveform bufferbloat test, I really wonder: how can I reduce latency to < 5ms? I have run dozens of flent tests and changed just about every queue config I could think of. Tried using a queue tree instead of a simple queue. Changed cake params. But I can’t get the latency down. It stays around 30-40ms. And I really wonder why.

To be absolutely sure I am in control of the buffer and not my ISP, I set a max-limit of 20M/5M. That’s a rate I am sure I have at minimum.

Basically I use this config:

/queue simple
add max-limit=20M/5M name=cake-50 queue=cake-ingress/cake-egress target=lte1
/queue type
add cake-diffserv=besteffort cake-flowmode=dual-dsthost cake-nat=yes kind=cake name=cake-ingress
add cake-flowmode=dual-srchost cake-nat=yes kind=cake name=cake-egress

/ip firewall filter
add action=accept chain=forward comment="queue-cake - upload" \
    connection-state=established,related out-interface=lte1 \
    src-address=192.168.0.0/24
add action=accept chain=forward comment="queue-cake - download" \
    connection-state=established,related dst-address=\
    192.168.0.0/24 in-interface=lte1
add action=fasttrack-connection chain=forward comment="defconf: fasttrack" \
    connection-state=established,related hw-offload=yes

This ensures that lte-traffic is not fast-tracked.
I can see from the rates, that the simple-queue maxes out at 20/5 when running flent.


No queue ideal conditions:
cake-20-5.png
No queue - lte connection throttled by ISP
no_queue2.png
Cake simple queue
no_queue.png
https://www.waveform.com/tools/bufferbloat?test-id=8b641985-265c-4786-89fd-d10d785ba09b

https://www.waveform.com/tools/bufferbloat?test-id=edaca352-b3aa-4671-9bce-19c2bee73d9a

I keep trying to get ISPs to put better queue management on their links, so perhaps you can show them your data, and point them at Preseem and LibreQos as one key way for them to manage their bandwidth better? These sorts of middleboxes within the ISP can help immensely and are very inexpensive to set up.

Still it is best to have better CPE at the bottleneck links, but there is no way you can compensate for when they lose control of their queuing like that.

Largest ISP in Austria (A1). I see no hope of convincing them. Squeeze the lemon till the last drop. They have no need to improve service quality. Customers who notice AND can name/identify it as bufferbloat are maybe 0.001% of my ISP’s customer base. Why go the “extra mile” for those?