fq_codel/CAKE stories?

I’m always interested in how people are using cake and fq_codel. We have been adding some new features to cake in particular of late.

How´s it working out for you?

Please add a built-in ‘auto-rate’ that dynamically adjusts Cake’s settings to match network speeds when LTE/NR congestion fluctuates.

As a home user, my needs are simple, but I wrote up my configuration anyway. It will serve as a report on my experience, and perhaps you will see an error I’ve made and advise me.

Great summary!

Just basically a cut and paste from another forum post on queues, but this is my queue story:

I have 300M up and down from FiOS and honestly it’s pretty good out of the box, but it isn’t perfect all the time. I’m always wondering if it would be better to just use FastTrack with the default only-hardware-queue, or to run cake or fq_codel in a simple queue on eth1 without FastTrack, or to run a queue tree on eth1 and the bridge with cake or fq_codel to keep FastTrack but possibly slow the bridge traffic down.

I’ll go a month with FastTrack on and nothing but the default only-hardware-queue set, but then we’ll get a Zoom or YouTubeTV or PS5 hiccup and I’ll convince myself that queueing will help. So then I’ll throw cake or fq_codel on a simple queue on eth1, but then I’ll convince myself that it isn’t doing anything, as there will be a similar random hiccup even with queueing enabled. Then I’ll mistrust the simple queue and throw a cake or fq_codel queue tree on eth1 and the bridge so that FastTrack can stay on, but then I’ll wonder if that’s unnecessarily slowing down the bridge due to it needing to be set with a max-limit of 290ish, so I’ll just take it all off and start over with nothing but FastTrack enabled. I also tried just changing the queue type to fq_codel for all the interfaces, but I couldn’t figure out if it was doing anything without having any max-limits set. The settings are there, so tinkerers gonna tinker… I just haven’t noticed much of a difference, which may very well just mean that FiOS is pretty good and 300/300 is enough to handle what we do regardless of queue setups… I just don’t know.

Enabling fq_codel or cake indisputably gives me better scores on bufferbloat tests, but it’s not always apparent to what degree various queues improve our day-to-day internet use. With or without cake/fq_codel queues enabled, someone will infrequently complain about a loading screen, or dying in War Thunder or Fortnite, or a Hue light not working, or a video call… but it isn’t obvious if it’s something a queue can fix. Is it our network, or is YouTubeTV slow or not buffered enough to handle the NFL? Same thing with a video call… is it our side or their side? Is it bufferbloat we can fix with queueing on our end, or just a lack of data from the other side? I don’t know how to answer these questions, but queueing (and cake in particular) sounds like an interesting possible path to the best of all possible connectivity.

I’m currently running cake in a simple queue on eth1 with FastTrack disabled largely because the router (RB5009) can definitely handle doing this at 290/290 and I’m hoping this will eliminate any potential poor performance from our side of the pipe. I run the cake queue as a kind of preventative medicine, even if I’m not sure how to measure its effect on day-to-day internet use in my household.
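In case a config sketch helps, what I’m running boils down to something like this (ether1 standing in for what I’ve been calling eth1; names are illustrative, and FastTrack has to stay off because fasttracked connections bypass simple queues):

/queue type
add kind=cake name=cake-simple comment="cake with its default settings"
/queue simple
add name=cake-eth1 max-limit=290M/290M queue=cake-simple/cake-simple target=ether1 comment="290/290 leaves headroom under the 300/300 FiOS rates"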

All this said, it’s great to hear that cake is still being developed! I look forward to more best practice discussions on queueing and on future developments.

The algorithms in cake-autorate could possibly be ported to MikroTik.

Ideally some sort of BQL-like mechanism could also be applied to the outgoing interface so it runs at the varying natural rate. Last I looked, however, all LTE and 5G devices had a fixed amount of overlarge buffering in the device itself.

You are very right in that sometimes the internet itself is glitchy. I do keep hoping we will see fq_codel and cake running on more backbone links, but there will always be a glitch somewhere.

In terms of seeing whether cake or fq_codel is doing any good, drop statistics over time would be an indicator, but when I last looked, MikroTik exposed none. Could you look again?

Good config. I agree with you that if your nearest CDN is less than 30ms away, and you aren’t trying to communicate worldwide, regional is a good setting.

However, the ack-filter on rx is largely useless and eats CPU. You only need it on tx in your environment.

Am I not gambling that my furthest commonly-used CDN is less than 30ms away, and that the consequences of being wrong occasionally are minor?


However, the ack-filter on rx is largely useless and eats CPU. You only need it on tx in your environment.

That’s insightful, thanks. I’ve not only removed that useless bit from the config, I added this paragraph:

“We do this on the cake-tx side alone because in typical cases, we can only control what we send to remote TCP/IP stacks, not what they do in turn. There are exceptions to this, as with a site-to-site WireGuard tunnel, but even in that case ACK filtering has its most useful effect on the narrow side of an asymmetric pipe. Consequently, the best strategy is to do this ACK filtering on each peer’s Tx side alone, improving the remote site’s Rx queuing from afar.”
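In config terms the same idea looks roughly like this (a sketch; the names are mine, and I believe cake-ack-filter is the relevant RouterOS knob, defaulting to none, so the rx type simply omits it):

/queue type
add kind=cake cake-ack-filter=filter name=cake-tx comment="filter acks on the upload side only"
add kind=cake name=cake-rx comment="no ack-filter: acks are rarely queued on rx"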

There is one additional important detail you have to keep in mind: the bandwidth limits are given from the target interface’s perspective, not that of your client. We are used to specifying asymmetric Internet bandwidths like these in terms of the client perspective, but that will give you an upside-down view of the way RouterOS queues work. WinBox will tell you that my “Target Upload” speed is 256M and my “Target Download” speed is 24M, which feels wrong until you realize that when I download something to my client computer, I do so by making the WAN interface upload that traffic to my client.
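In other words, for my 256M-down / 24M-up service as seen from my client, the queue reads like this sketch (interface name illustrative):

/queue simple
add name=wan-cake max-limit=256M/24M target=ether1 comment="from ether1's perspective: its upload is my download, and vice versa"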

Are you saying, @tangent, that upload and download settings should be reversed when setting up a simple queue? This wouldn’t affect me for speed settings, as I have symmetric, but it certainly would for things like “cake-flowmode=dual-dsthost” versus “cake-flowmode=dual-srchost.” If what you cite is correct, I think I have these reversed, and I bet a good number of other users have these settings reversed as well.

@dtaht I don’t see any drops at all right now, but I’ll keep looking. I’m not sure if this means there are no drops to count, or drops are not being counted/kept by the simple queue for fq_codel or cake.
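For the record, I’m checking with something along these lines (the queue name is whatever your cake simple queue is called; this matches my earlier sketch):

/queue simple print stats where name="cake-eth1"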

Given that my configuration feeds two CAKE queues into a simple queue where the max-limit gets applied, I’d have to say, “Yes.”

But why not try it and find out? Your symmetric connection doesn’t preclude experimentation. If it’s 1G/1G, try 900M/800M and see which direction ends up with the sticky end of the lollipop.

I came to this conclusion after trying it here and getting puzzled as to the behavior, then coming up with the quoted explanation.

Yep… they look reversed in the reported simple queue data, at least relative to what I, an admitted amateur at this, would expect. I added a new besteffort Cake queue and fired up a YouTubeTV stream to see whether upload or download increased more in the “Traffic” tab and the “Statistics” panel of the Cake simple queue. The queue seems to report what I would consider download traffic as upload traffic, in that I would expect download to increase far more than upload when a streaming service starts on the network. These values fluctuate, but I didn’t observe them flipping to have larger values for download over upload.

Traffic
Target Upload: 24.3 Mbps
Target Download: 283.8 kbps
Total Dropped: 0

Statistics
Avg. Rate (Target Upload): 13.1 Mbps
Avg. Rate (Target Download): 404.5 kbps

So I read this as “expected” simple queue download really being upload… and vice versa.

I found the same: the rx/tx values in a simple queue are swapped relative to what one would expect. RX is the target interface’s upload and TX is the target interface’s download.

The configuration below works well on a 1000/100 cable connection where users were complaining that downloads or uploads of many and/or large files interfered with parallel Teams/Zoom etc. calls.

The rx besteffort queue limits the ingress WAN traffic to just below the point where the ISP traffic shaper kicks in. This has been shown to have less impact on latency.
The tx diffserv4 queue does a good job of prioritizing outgoing Teams/Zoom traffic when the uplink is fully loaded. At least the complaints have stopped.

/queue type
add cake-diffserv=diffserv4 cake-flowmode=dual-srchost cake-nat=yes kind=cake name=cake-WAN-tx
add cake-diffserv=besteffort cake-flowmode=dual-dsthost cake-nat=yes kind=cake name=cake-WAN-rx
/queue simple
add comment="RX = Target Upload, TX = Target Download" max-limit=950M/95M name=cake-WAN queue=cake-WAN-rx/cake-WAN-tx target=bridge1_WAN

I have always been puzzled as to why the dual rather than triple isolate mode is used. With nat enabled, I thought triple-isolate was better.

Drops are not being counted with cake or fq_codel. Please feel free to pester MikroTik more about it on my behalf.

I remain curious as to the reasoning for overriding the default flow mode.

Just trying to follow the reasoning: ‘overriding’ as in any of the examples in this thread?

Yes, and the consequences of being wrong occasionally are very minor. It is better, especially when inbound shaping, to use your typical RTT as the estimate so as to get control of the queue before it flows into the ISP shaper you are defeating.

The codel algorithm was designed to scale to worldwide RTTs (280ms) with its default settings of target 5ms, interval 100ms. It was also designed so that a single TCP flow could use up nearly all the bandwidth, so long as the RTT was in that range. The most common use cases nowadays are 15 short flows from a web browser, 5 or more from torrent, and interactive flows like VoIP and videoconferencing, which in general use so little bandwidth that they never see a drop.
With lots of flows and/or short RTTs, codel isn’t actually responsive enough.
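If I recall the RouterOS parameter names correctly, those defaults map directly onto its fq-codel queue type; spelled out explicitly, a sketch:

/queue type
add kind=fq-codel fq-codel-target=5ms fq-codel-interval=100ms name=fq-codel-default comment="these should match the unset defaults"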

A case where using the regional RTT (30ms interval, 1.5ms target) might hurt is where a single packet takes longer than the target to transmit. At 10Mbit, a single full-size packet (1500 bytes ≈ 12,000 bits) fits into about 1.3ms, nearly the whole target.

Since the development of codel, Linux-based hosts have adopted packet pacing, which tends to work better at both short and long RTTs. I haven’t got any real benchmarks of that, however.

Anyway, unless you are in a case where you have to transmit lots of data over a single flow to somewhere further than 100ms away, and so long as you have more than 10Mbit/s of bandwidth, the regional setting should be fine.
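In cake that is the rtt knob. If I am reading the RouterOS parameters right, the regional setting would be something like this sketch:

/queue type
add kind=cake cake-rtt=30ms name=cake-regional comment="regional: ~30ms interval, ~1.5ms target; cake-rtt-scheme=regional should be equivalent"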

“We do this on the cake-tx side alone when we have a greater-than-10x1 download/upload ratio, on the narrow end of an asymmetric pipe, where the acks from a single TCP flow can saturate the upload. Ack-filtering does not do any good for encrypted traffic, as with a site-to-site WireGuard tunnel. Acks will almost never be queued on the rx side, so there is no point in attempting to filter there.”

"

The world seems to have copy/pasted the dual-whatever settings, where, in nat mode, diffserv3 benchmarked out best. Dual-whatever does make sense without nat on.
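For comparison with the queue types earlier in the thread, a triple-isolate variant would look roughly like this (a sketch; cake-nat=yes is kept so cake isolates on the pre-NAT addresses, and since triple-isolate is direction-agnostic, one type can serve both sides):

/queue type
add kind=cake cake-flowmode=triple-isolate cake-nat=yes name=cake-WAN-ti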

I played with rrul a while back. I can’t remember the exact settings; it would have been a simple queue on my 100 DL / 20 UL connection with 95 DL / 15 UL limits.

What I found cool was how you can clearly see diffserv4 in action.

Right now I am running CAKE on a queue tree with FastTrack enabled for both IPv4 and IPv6 (testing 7.18beta).

Previously I had FastTrack disabled and mangle rules classifying stuff with DSCP marks for diffserv, but I don’t trust my mangling ability enough to keep that setup.
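The rough shape of that queue tree, as a sketch (interface names and rates are illustrative, reusing the queue-type names from the config earlier in the thread; the reason for parenting on the interfaces rather than global is that, as far as I understand it, fasttracked packets bypass simple queues and global queue trees but not interface-parented ones):

/queue tree
add name=cake-up parent=ether1 queue=cake-WAN-tx max-limit=15M comment="egress toward the ISP"
add name=cake-down parent=bridge1 queue=cake-WAN-rx max-limit=95M comment="egress toward the LAN"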

No queues:
rrul-cake-disabled.png
fq_codel:
rrul-fq-codel.png
CAKE dual-src/dsthost diffserv4 nat internet:
rrul-internet.png

Typically you can put cake, with the exact, correct framing, hard up against the upload limit (and also use the ack-filter there). On downloads, though, no more than 92%. And thanks for playing with rrul!
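Concretely, on the 100/20 line above, I would expect a sketch like this to behave well (the overhead value is a guess: 18 assumes Ethernet with FCS, and you must replace it with your link’s real framing):

/queue type
add kind=cake cake-overhead=18 cake-ack-filter=filter name=cake-up-exact comment="exact framing + ack-filter; run at the full 20M"
add kind=cake cake-overhead=18 name=cake-down-92 comment="hold downloads to about 92% of the 100M line"
/queue simple
add name=cake-rrul max-limit=92M/20M queue=cake-down-92/cake-up-exact target=ether1 comment="first figure is the target interface's upload, i.e. my download"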