And re-re-reading this question (wow, did my eyes glaze over): cake pays no attention to VLAN priorities. It can, with a tc rule, assuming it’s a modern enough cake. Asking your question on the cake mailing list might get you somewhere…
I noticed one of the RTT schemes is “satellite”… We sometimes use high-speed but high-latency (500ms) GEO point-to-point IP links (10-100Mb/s SCPC)… Historically, “TCP acceleration” is the approach for dealing with these reliable but high-RTT links for normal “web traffic” (i.e. some variant of “split TCP” using pepsal/SCPS-TS, sometimes using Hybla[-like] CC). In our case, the sat link has a fixed RTT and fixed/known, non-shared bandwidth, which is why I think CAKE may be of some use. Since we typically route sat links into Mikrotik ROS, CAKE should be easy to apply in v7.1.
But I’m curious about your thoughts on whether “TCP acceleration” is even needed if a CAKE queue is used on either end of [a high RTT, high BW] bridged L2 satellite link?
Since the TCP CC algorithm/config employed by actual clients can dramatically affect TCP performance at high RTT, it’s not that easy to simulate in a lab (e.g. Apple’s TCP stack responds differently than Linux, same for Windows, etc., and then also differently across those OS versions since TCP CC flavors change). Thus I’m curious what your experience is with CAKE in satellite use cases.
If you have a correct estimate of RTT across the satellite link, use rtt that_number + 60ms. Definitely do not use the default rtt estimate (100ms) here as it will not fill the link. “satellite” is a SWAG.
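As a rough illustration of the advice above, here is a minimal Python sketch that adds the suggested 60 ms margin to a measured link RTT and prints the bandwidth-delay product, to give a feel for how much more data has to be in flight at 500 ms than at 100 ms. The 500 ms and 100 Mb/s figures come from the question; the function names are made up for this example.

```python
def cake_rtt_ms(measured_rtt_ms: float, margin_ms: float = 60.0) -> float:
    """Suggested cake rtt setting: measured link RTT plus a 60 ms margin."""
    return measured_rtt_ms + margin_ms


def bdp_bytes(rate_bit_s: float, rtt_ms: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to keep the link busy."""
    return rate_bit_s * (rtt_ms / 1000.0) / 8.0


if __name__ == "__main__":
    rtt = 500.0   # GEO link RTT from the question, in ms
    rate = 100e6  # top of the 10-100 Mb/s range, in bit/s
    print(f"cake rtt setting: {cake_rtt_ms(rtt):.0f}ms")
    print(f"BDP at {rtt:.0f} ms RTT: {bdp_bytes(rate, rtt) / 1e6:.2f} MB")
    print(f"BDP at 100 ms RTT: {bdp_bytes(rate, 100.0) / 1e6:.2f} MB")
```

This is only arithmetic; whether a given rtt value actually fills the link still has to be measured.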
cake supports RFC 3168-style ECN: if you enable that on your endpoints, you can do congestion control losslessly. Win. The FQ portion will keep lower-rate request/response and VoIP protocols separate from the AQM, and (nearly) never drop those.
https://www.bufferbloat.net/projects/cerowrt/wiki/Enable_ECN/ [1]
There are a bunch of other ways to go with a “TCP accelerator” depending on your topology. If you are using a TCP proxy, enabling ECN on those endpoints will control the amount of data in flight. Using a delay-sensitive TCP does so also.
I would like very much a flent “rrul” test from an actual real-world satellite link, with and/or without a proxy. I have plenty from starlink, nothing from GEO, would love to emulate the other new constellations coming up. some packet captures too!
[1] Apple has made it more difficult to use ECN of late. The additional sysctl required to re-enable ecn negotiation always is
sudo sysctl -w net.inet.tcp.disable_tcp_heuristics=1
See also:
https://github.com/apple-opensource/xnu/blob/master/bsd/netinet/tcp_cache.c#L164
This disables mptcp and tfo also.
Your core question “are proxies even needed”, I didn’t answer. Please go measure.
Good evening, and thank you Dave for your many years of work along with the rest of your team combating bufferbloat! I have been following along for many years, and still feel like I know so little.
I am so glad to finally have fq_codel and cake in Mikrotik! Previously I had run an OpenBSD router at home for many years and it was great, but I have been running Mikrotik for a few years now. Anyhow, on to my testing..
INFO
Mikrotik CCR-1009
RouterOS 7.1 Stable
AT&T VDSL2 100/20
San Antonio, Tx.
Results of a ping to test server with unloaded pipe for reference:
--- dallas.starlink.taht.net ping statistics ---
27 packets transmitted, 27 received, 0% packet loss, time 26035ms
rtt min/avg/max/mdev = 28.288/28.699/29.860/0.314 ms
Test 1 - No queue

Test 2 - CAKE defaults
name=“cake-default” kind=cake cake-bandwidth=0bps cake-overhead=0 cake-overhead-scheme=“” cake-rtt=100ms
cake-diffserv=diffserv3 cake-flowmode=triple-isolate cake-nat=no cake-wash=no cake-ack-filter=none

Test 3 - Cake with NAT on download/upload and ACK filter on upload
name=“cake-up” kind=cake cake-bandwidth=0bps cake-overhead=0 cake-overhead-scheme=“” cake-rtt=100ms
cake-diffserv=diffserv4 cake-flowmode=triple-isolate cake-nat=yes cake-wash=no cake-ack-filter=filter
name=“cake-down” kind=cake cake-bandwidth=0bps cake-overhead=0 cake-overhead-scheme=“” cake-rtt=100ms
cake-diffserv=diffserv4 cake-flowmode=triple-isolate cake-nat=yes cake-wash=no cake-ack-filter=none

Test 4 - Adding bridged ptm
name=“cake-up” kind=cake cake-bandwidth=0bps cake-overhead=22 cake-atm=ptm cake-overhead-scheme=bridged-ptm
cake-rtt=100ms cake-diffserv=diffserv4 cake-flowmode=triple-isolate cake-nat=yes cake-wash=no cake-ack-filter=filter
name=“cake-down” kind=cake cake-bandwidth=0bps cake-overhead=22 cake-atm=ptm cake-overhead-scheme=bridged-ptm
cake-rtt=100ms cake-diffserv=diffserv4 cake-flowmode=triple-isolate cake-nat=yes cake-wash=no cake-ack-filter=none

Test 5 - Adding wash on download
name=“cake-up” kind=cake cake-bandwidth=0bps cake-overhead=22 cake-atm=ptm cake-overhead-scheme=bridged-ptm
cake-rtt=100ms cake-diffserv=diffserv4 cake-flowmode=triple-isolate cake-nat=yes cake-wash=no cake-ack-filter=filter
name=“cake-down” kind=cake cake-bandwidth=0bps cake-overhead=22 cake-atm=ptm cake-overhead-scheme=bridged-ptm
cake-rtt=100ms cake-diffserv=diffserv4 cake-flowmode=triple-isolate cake-nat=yes cake-wash=yes cake-ack-filter=none

Test 6 - Remove bridged ptm, and set overhead to 22 (same as bridged ptm) and also add MPU 44 (would not let me save that with bridged ptm selected)
name=“cake-up” kind=cake cake-bandwidth=0bps cake-overhead=22 cake-mpu=44 cake-atm=ptm cake-overhead-scheme=“”
cake-rtt=100ms cake-diffserv=diffserv4 cake-flowmode=triple-isolate cake-nat=yes cake-wash=no cake-ack-filter=filter
name=“cake-down” kind=cake cake-bandwidth=0bps cake-overhead=22 cake-mpu=44 cake-atm=ptm cake-overhead-scheme=“”
cake-rtt=100ms cake-diffserv=diffserv4 cake-flowmode=triple-isolate cake-nat=yes cake-wash=yes cake-ack-filter=none
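For readers wondering what the overhead=22, mpu=44 and ptm settings in Test 6 do to each packet, here is a minimal Python sketch of the size accounting as I understand it from the Linux sch_cake source: add the overhead, round up to the MPU, then add one byte per 64 bytes (or part thereof) for PTM. Treat it as an approximation to check against the kernel you actually run; the function name is invented for this example.

```python
def cake_wire_len(pkt_len: int, overhead: int = 22, mpu: int = 44,
                  ptm: bool = True) -> int:
    """Approximate size cake charges the shaper for one packet."""
    length = pkt_len + overhead          # per-packet framing overhead
    if length < mpu:                     # minimum packet unit
        length = mpu
    if ptm:                              # 64b/65b encoding, rounded up
        length += (length + 63) // 64
    return length


if __name__ == "__main__":
    for size in (20, 40, 1500):
        print(f"{size:5d} byte packet is charged as {cake_wire_len(size)} bytes")
```

The accounting only feeds the shaper, so with bandwidth set to 0 (unlimited) these parameters would not be expected to change anything.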

Admittedly, AT&T is doing a pretty darn good job as of late. Bufferbloat used to be much worse with this same setup; I know they have pushed out several firmware updates over the years to this modem. It is especially head and shoulders better than my old cable modem with Spectrum. I was lucky to fight through that horror to finally learn that it had a Puma 6 chipset, which after a period of time became known to actually introduce latency to varying degrees at random! GRR
Anyhow, I can’t leave well enough alone and why leave my buckets up to them to control, so here I am! Also, I have noticed there has not been much testing that I could find so figured I would help! Next week, I am supposed to get my new 5009 router so that will free up this one for more lab style testing. I have a CRS309 (10gb switch), CRS326 (1gb with 10gb uplinks) here so even though it would be local.. maybe I can help by doing some testing as you mentioned about like 10gb → 1gb, etc.
Let me know how I can help, and I look forward to your feedback on my results. P.S. - thank you in advance for letting me use your server
I had done some testing against mine in Dallas, but was worried that it didn’t have enough CPU to generate the traffic needed?
One more edit… here are my results from waveform’s test after the last config:
https://www.waveform.com/tools/bufferbloat?test-id=6ae9ad8a-90e9-46fa-b561-6299892a2b21
Thx so much for testing. I have a low standard right now… “does it crash?”, so far, so good.
Your first result, sans cake, was really quite good, and indicates your AT&T link has only about 20ms of buffering in it, or so. Believe it or not, that’s actually “underbuffered” by prior standards, and makes it harder for a single flow to sustain full rate. But: a little underbuffering is totally fine by me, and I don’t care all that much if a single flow is unable to achieve full rate, I’d rather have low latency.
It’s easier to determine the buffer depth via a single upload test like this:
flent -x --step-size=.05 --socket-stats -t the_options_you_are_testing --te=upload_streams=1 -H the_closest_server tcp_nup
Use the gui to print the “tcp_rtt” stats. If you use the -t option to name your different runs, you can also do comparison plots via “add other data files” in flent-gui.
there are servers in atlanta and in fremont, california, if either of those would be closer for you.
OK, ok, I gave in, in order to do science, could you also try a tcp_nup with upload_streams=4? and =16?
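To keep the 1, 4 and 16 stream runs consistent, the flent invocation above can be generated rather than retyped. The following Python sketch only builds and prints the command lines, using the flags exactly as given in the post; the server name is the same placeholder, and the title scheme passed to -t is just one possible naming convention.

```python
import shlex


def flent_tcp_nup_cmd(server: str, title: str, streams: int) -> list[str]:
    """Build the tcp_nup command line for a given number of upload streams."""
    return [
        "flent", "-x", "--step-size=.05", "--socket-stats",
        "-t", title,                          # names the run for later comparison
        f"--te=upload_streams={streams}",
        "-H", server,
        "tcp_nup",
    ]


if __name__ == "__main__":
    for n in (1, 4, 16):
        cmd = flent_tcp_nup_cmd("the_closest_server", f"cake_{n}up", n)
        print(shlex.join(cmd))  # pass cmd to subprocess.run() to execute it
```

Printing first makes it easy to eyeball the options before spending a minute of link time on each run.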
Test 1 appears to show an old issue rearing its head, TCP global synchronization: the queue is so short that all the flows synchronize and drop simultaneously, as per panel 3 of your first plot. But in order to do “science” here, simplifying the test to just uploads would help.
Secondly it appears that something on the path is treating the CS1 codepoint as higher priority than the CS0 codepoint, when CS1 is supposed to be “background”.
Does that VDSL device do hardware flow control? Or are you shaping via cake via htb? (I’m happy to hear the bandwidth=0 parameter seems to be working otherwise?), but the only way I can think of you getting results this good is if the vdsl modem is exerting flow control…
Anyway, your last result is a clear win over what you had before, methinks. I’d like a tcp_nup test of that config too, when you find the time.
No crashing, I have run the CCR1009 very heavy for several days without issue! Full transparency, tonight I am on the RB5009, it just showed up yesterday so I have been toying with it. So, I will be using it for my testing tonight. I can always swap around if you would like. Either way, they are both running 7.1 Stable.
I agree with your statement on under buffering and would also much prefer lower latency than a single stream achieving full rate.
Yes sir, I am in San Antonio so server is ~30ms from me. Here is the result with the test requested, sans queueing:

HAH, I was hoping to pique your interest
Science incoming!
I just thought of something that is very annoying about this modem/“router” from ATT. I have it in ‘bypass’ mode so that it assigns the public IP to the router; however, it is still NAT’d traffic, for lack of better words. I am not sure how it actually works, but it still has its own state table, etc. The FIOS guys have figured out a way to bypass it because they also have an ONT, etc. But since this is DSL, I am stuck with whatever they are doing inside the black box. Maybe this is what is causing the codepoint funny business, as I am not doing anything with DSCP, etc.
On to the data!




I am not sure if the VDSL device does or not to be honest. It is an ATT branded box model BGW210. I have it in passthrough mode, but as stated above it is still some black magic NAT but ‘passes’ the public IP to my router.
The very first test I posted was without any queue in the Mikrotik router. Everything after that was with cake, and bandwidth=0 definitely works well! Obviously, tweaking that helps, but it is fine for a general out-of-the-box setup.
** NOTE ** I just realized I had made a mistake in my config. I was leaving bandwidth at 0 in the cake config, and was setting up the target max limit for upload under the simple queue general settings to 19M. However, no limiting on the download. I will need to try these tests again later setting that to unlimited and setting bandwidth within cake itself. Curious to see if that makes any difference.
Setting up that last config with tcp_nup results:




OK.
0) Still mostly very happy it doesn’t crash.
1) Your DSL device’s buffer is sized in packets, not bytes. The reason we only saw a 20ms RTT before on the rrul test, versus the much larger RTT on the tcp_nup test, is that the ACKs from the return flows on the path filled up the queue also. I leave it as an exercise for the reader to calculate the packet buffer length on this device…
2) I figured I was either looking at a shaper above cake, or at DSL flow control. (I like hw flow control, btw. I was perpetually showing off an ancient DSL modem with a 4 packet buffer and hw flow control + fq-codel in the early days, as FQ plus the time-based AQM, versus a FIFO, worked with that beautifully and cost 99% less CPU to do that way. Sadly, most DSL modems moved to a switch and don’t provide that backpressure anymore.) Not quite sure you just tested that without a shaper.
3) I do want to verify you are not using BBR on your client. The 5ms simultaneous drops are still a mite puzzling.
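For the packet-buffer exercise above, one hedged way to set up the arithmetic: a FIFO that adds a given queueing delay at a given drain rate holds roughly rate × delay bytes, and dividing by the packet size gives a packet count. The Python sketch below uses placeholder numbers (20 Mbit/s, 150 ms), not measurements from this thread; substitute the induced delay read off the tcp_rtt plot and the real uplink rate. The second function shows why a buffer counted in packets produces less delay when part of it is filled with small ACKs.

```python
def buffer_packets(rate_bit_s: float, queue_delay_ms: float,
                   pkt_bytes: int = 1500) -> float:
    """Packets a FIFO must hold to add queue_delay_ms at the given rate."""
    queued_bytes = rate_bit_s / 8.0 * (queue_delay_ms / 1000.0)
    return queued_bytes / pkt_bytes


def queue_delay_ms(n_packets: float, avg_pkt_bytes: float,
                   rate_bit_s: float) -> float:
    """Delay added by a full buffer of n_packets of a given average size."""
    return n_packets * avg_pkt_bytes * 8.0 / rate_bit_s * 1000.0


if __name__ == "__main__":
    rate = 20e6      # placeholder uplink rate, bit/s
    delay = 150.0    # placeholder induced delay, ms (NOT from the thread)
    n = buffer_packets(rate, delay)
    print(f"about {n:.0f} full-size packets")
    # Same packet count, but half of the slots taken by 66-byte ACKs:
    mixed = (1500 + 66) / 2
    print(f"delay with mixed traffic: {queue_delay_ms(n, mixed, rate):.1f} ms")
```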
I do dream of hardware flow control, so: no shaper, bandwidth=0 for cake, as a tcp_nup test.
But I expect to be unlucky. Anyway, your fiddling with the frame parameters without a cake shaper active should have done nothing (I think), so that run was puzzling…
cake nat besteffort the_right_dsl_option bandwidth XMbit is easiest to reason about. Do you have visibility into the sync rate of the modem? Anyway, get that number right next, then try tcp_ndown… Note you cannot measure TCP RTT from this direction via flent directly, so we resort to inference or packet captures.
At some point I might ask you to stick your *.flent.gz files somewhere. Pleased to have so vastly improved tcp rtt.
I will need to do much more studying to find the answer to question 1. =) I assume it will at least partially involve the RTT and bandwidth as part of the equation.
I think at this point I need to start over somewhat, considering I was NOT using the bandwidth limit within cake, and was setting the bandwidth limit outside of it. Without thinking when I started, I fell back on my old way of queueing in Mikrotik: setting up a simple queue with sfq.
It is interesting to me that you mention the hw flow control, and 4 packet buffer. Earlier I watched the youtube video for the first time of you explaining the 4 packet buffer with the people in the audience as packets. Also, I had watched another video where the gentleman had mentioned hardware. I will link both here for those interested.
https://www.youtube.com/watch?v=ZeCIbCzGY6k
https://www.youtube.com/watch?v=Q6SAcO-H6b0
BBR, good point I hadn’t thought of that! I am using PopOS which is a derivative of Debian. Sure enough..
❯ sysctl net.ipv4.tcp_congestion_control
net.ipv4.tcp_congestion_control = bbr2
I assume for our testing purposes, we would want to disable that, correct?
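A small sketch for checking the client’s congestion control before each run, so the value can be recorded alongside the test title. It reads the standard Linux /proc entry when present and otherwise falls back to parsing the sysctl output shown above; the helper names are invented for this example.

```python
import os

CC_PATH = "/proc/sys/net/ipv4/tcp_congestion_control"  # Linux only


def parse_sysctl_line(line: str) -> tuple[str, str]:
    """Split 'key = value' as printed by sysctl into (key, value)."""
    key, _, value = line.partition("=")
    return key.strip(), value.strip()


def is_bbr(cc_name: str) -> bool:
    """True for bbr, bbr2 and similar names (a simple prefix match)."""
    return cc_name.strip().lower().startswith("bbr")


if __name__ == "__main__":
    if os.path.exists(CC_PATH):
        with open(CC_PATH) as fh:
            current = fh.read().strip()
    else:
        # Fall back to the value shown in the post above.
        _, current = parse_sysctl_line("net.ipv4.tcp_congestion_control = bbr2")
    if is_bbr(current):
        print(f"client uses {current}; for a cubic run try:")
        print("sudo sysctl -w net.ipv4.tcp_congestion_control=cubic")
    else:
        print(f"client uses {current}")
```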
OK, no bandwidth shaping, and the following cake config – and tcp_nup tests..
name=“cake-up” kind=cake cake-bandwidth=0bps cake-overhead=22 cake-mpu=44 cake-atm=ptm cake-overhead-scheme=“”
cake-rtt=100ms cake-diffserv=diffserv4 cake-flowmode=triple-isolate cake-nat=yes cake-wash=no cake-ack-filter=filter
name=“cake-down” kind=cake cake-bandwidth=0bps cake-overhead=22 cake-mpu=44 cake-atm=ptm cake-overhead-scheme=“”
cake-rtt=100ms cake-diffserv=diffserv4 cake-flowmode=triple-isolate cake-nat=yes cake-wash=yes cake-ack-filter=none


Ahh yes, it looks like your assumptions were correct!
On to the next test! Here is the info you requested from the modem’s interface:


Here is the config for the following tcp_ndown tests:
name=“cake-up” kind=cake cake-bandwidth=19.0Mbps cake-overhead=22 cake-mpu=44 cake-atm=ptm cake-overhead-scheme=“”
cake-rtt=100ms cake-diffserv=besteffort cake-flowmode=triple-isolate cake-nat=yes cake-wash=no cake-ack-filter=filter
name=“cake-down” kind=cake cake-bandwidth=100.0Mbps cake-overhead=22 cake-mpu=44 cake-atm=ptm cake-overhead-scheme=“”
cake-rtt=100ms cake-diffserv=besteffort cake-flowmode=triple-isolate cake-nat=yes cake-wash=yes cake-ack-filter=none




Now this data has me interested.. running a RRUL test as well, because why not

Ask and you shall receive! Here are the files:
http://zylone.org/taht/tcp_nup-2021-12-12T022959.752069.cake_4up_bw0.flent.gz
http://zylone.org/taht/tcp_nup-2021-12-12T023230.003301.cake_16up_bw0.flent.gz
http://zylone.org/taht/tcp_ndown-2021-12-12T024214.088705.cake_4down.flent.gz
http://zylone.org/taht/tcp_ndown-2021-12-12T024453.641245.cake_16down.flent.gz
http://zylone.org/taht/rrul-2021-12-12T024809.190330.cake_best_effort.flent.gz
You don’t have hw flow control.
Nice to know (I guess) that BBR2 still struggles with itself. Try resetting that to cubic on the up, please, and shape to 19
add ack-filter to the up
I’m running cubic on that server for the down.
Your baseline rtt might drop in half without bonding OR if you can disable interleaving (yes, as well as your bandwidth).
Is it possible to scrape that rate? cake supports dynamically changing its config without reloading the qdisc (tc qdisc change dev whatever cake bandwidth the_new_bandwidth), but I doubt mikrotik can do that with their API (?). You should be able to get really close to the actual uplink rate (22xxx kbps) with the right framing. Those little ping spikes are a bit puzzling (something out of band like PPPoE?). I note some DHCP and some PPP messages now exist in some implementations that actually do send the link and/or shaped rate and framing.
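To make “scraping the rate” concrete, here is a minimal Python sketch that turns a sync rate, however it was obtained from the modem, into the tc command quoted above. It applies to a Linux box running cake, not to RouterOS. The device name, the 20000 kbit/s figure and the optional safety fraction are all hypothetical placeholders for illustration; the modem’s real sync rate is not filled in here.

```python
import shlex


def cake_change_cmd(dev: str, sync_rate_kbit: int,
                    fraction: float = 1.0) -> list[str]:
    """Build 'tc qdisc change' for cake at a fraction of the sync rate."""
    rate = round(sync_rate_kbit * fraction)
    return ["tc", "qdisc", "change", "dev", dev,
            "cake", "bandwidth", f"{rate}kbit"]


if __name__ == "__main__":
    # Hypothetical values: replace with the scraped uplink sync rate.
    cmd = cake_change_cmd("eth0", 20000, fraction=0.95)
    print(shlex.join(cmd))  # run as root, e.g. via subprocess.run(cmd)
```

Because only the bandwidth keyword is changed, the other cake options on the existing qdisc are left as they were.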
Your download was really pretty. But anyway, I’d like to solidify the upload using cubic at 19mbit first, ack-filter on (I worry about that option), then I’d love to see sfq (unshaped and shaped) to the same rate with both bbrv2 and cubic. We are kinda getting down to attempting rigorous science here, so perhaps scripting and some packet captures are in order. On the other hand, if you can keep the tested options straight in the -t option, we can easily compare things later.
I have a long-standing hypothesis: since SFQ was so popular in the WISP markets (ubnt uses it), and I long ago proved its queue was too short to sustain fat TCP flows, it was acting as an AQM also in this market, which is why the observed bufferbloat was only in the 80-100ms range, and as people started shaping to faster and faster rates and using 8+ multiflow speedtests, they didn’t notice they were killing single-flow TCP performance ( https://www.bufferbloat.net/projects/bloat/wiki/Wondershaper_Must_Die/ ). The poor results I’d got then, however, predate the advent of the Linux stack’s pacing, and single flows have actually been scaling higher than 12mbit since against sfq’s default 128 packet limit.
The reason why the rrul upload looks spotty is actually more related to sampling error, not an actual problem per se, and you are also zoomed way in. You can scale plots relative to each other as you wish, or combine them, via flent. I like to zoom in but try to stay cognizant of the scale, and there’s a version of the plot that won’t zoom on you, also.
Somewhat puzzled about the QoS stuff, but I’d rather get the bandwidth param right first. I note I’m not a huge fan of QoS in the first place due to all the differing interpretations, and there was also a bug in some version or another that wasn’t reading the DSCP field properly with some encapsulations. cake has a “wash” option if you are actually seeing mismarks on ingress, or are doing something special on egress that you don’t want upstream to see. I do keep hoping we can “export” a standards-compliant diffserv set in the hope that the ISP might respect it, and vice versa…
The rrul test is a stress test using greedy traffic and not indicative of the intent of QoS. Were it to be more representative, it would send VoIP-like isochronous traffic through the VI queue, videoconferencing 16ms frame-like traffic through the video queue, and something torrent-like through the background queue. It semi-intentionally, and semi as a mistake, only exercises 3 of the 4 cake diffserv4 or wifi hw queues; rrulv2 does this more right, but I haven’t finished the spec yet.
Demonstrating the sad results of sending greedy traffic through a QoS system that thinks its traffic was going to obey the rules was also on my mind at the time. You still see a lot of strict priority queues out there where, if one user lucks into the right DSCP marking, they can starve out everyone else. Cake’s game theory here uses soft admission control so that that doesn’t happen, and in general shows the benefits of short queues and 5-tuple fair queuing over any form of QoS, and furthermore does per-host FQ, so the worst a user can do is do themselves in, not everybody else.
There are 110 other tests in the suite. I’ve got rather good at reading the rrul test over the years; it’s the way to get a picture with the least amount of effort, then we do the tcp_nup and tcp_ndown tests. I might not have needed to suggest that had I not noticed that it looked like you were running BBR. The square wave tests are useful, as are the various _var versions which let you test different servers.
Roger that, I figured no hw control was the case.
I have changed net.ipv4.tcp_congestion_control=cubic on the client, and will use the same settings as the last test which have 19M for upload and ack=filter. Pasting config here for sanity.
name=“cake-up” kind=cake cake-bandwidth=19.0Mbps cake-overhead=22 cake-mpu=44 cake-atm=ptm cake-overhead-scheme=“”
cake-rtt=100ms cake-diffserv=besteffort cake-flowmode=triple-isolate cake-nat=yes cake-wash=no cake-ack-filter=filter
name=“cake-down” kind=cake cake-bandwidth=100.0Mbps cake-overhead=22 cake-mpu=44 cake-atm=ptm cake-overhead-scheme=“”
cake-rtt=100ms cake-diffserv=besteffort cake-flowmode=triple-isolate cake-nat=yes cake-wash=yes cake-ack-filter=none


I am going to upload data to my google drive. Hopefully that makes things a little easier to keep track of.
https://drive.google.com/drive/folders/1rE9AuPvhHoLHtShdaZ63Cd8-6fMcTO_v?usp=sharing
This run will be in folder 1-Change to cubic on client
It appears that I cannot turn off interleaving in the modem. However, I can tell it to use line 1, line 2, or both. I will have to wait until next weekend to test this out. The ol lady is home and if she loses her internet.. well, we all know where that leads us
I think she’ll be happy with your efforts so far.
I am not sure what you mean by scrape the rate? Do you mean change the bandwidth limit real time during a test, or possibly using it as part of a script to help automate testing using the API?
Here are the test results:
SFQ shaped cubic


SFQ unshaped cubic


SFQ shaped bbr


SFQ unshaped bbr


All data can be found in the 2-SFQ Testing folder
https://drive.google.com/drive/folders/1rE9AuPvhHoLHtShdaZ63Cd8-6fMcTO_v?usp=sharing
Just wanted to say, thank you for your analysis and education. I need to go back a few more times and re-read this thread to try and consume it all better. =) I am going to set the client back to cubic and fire cake back up and play around with it some more to see if I can get the framing tweaked to maybe get closer to the upload rate as you had stated.
I have given up asking her how the internet is doing.. she is very binary. It either works or it doesn’t. AHAHA!!!
The only thing I could do to make her happier is move the AP on that side of the house into her room so she has a better signal from her devices; she is right on the edge of GOOD 5g. I have done a lot of testing/tweaking on the wifi here as well (all mikrotik). That will be the next round of testing after all of this: go back and re-test/tweak the wifi and, out of curiosity, see how the flent tests fare over the air compared to over the wire.