Cake does fair queuing using an algorithm derived from DRR++ (deficit round robin). It attempts to deliver an MTU's worth of bytes from each flow, in order (DRR), and it also services flows with an arrival rate less than the total departure rate first. Do not combine htb (simple queues) + cake bandwidth XX...
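A minimal sketch of that scheduling idea, under simplifying assumptions: this is not the kernel's cake or fq_codel code, there is no AQM or shaper in it, and the names (`Flow`, `Scheduler`, `QUANTUM`) are made up for illustration. Each flow may send about one quantum of bytes per round, and a flow that has just become active goes on a "new" list that is served ahead of the flows that already have a backlog.

```python
from collections import deque

QUANTUM = 1514  # bytes per round; roughly one full ethernet frame (illustrative)


class Flow:
    def __init__(self, name):
        self.name = name
        self.packets = deque()  # packet sizes in bytes
        self.deficit = 0        # bytes this flow may still send this round


class Scheduler:
    def __init__(self):
        self.flows = {}
        self.new_flows = deque()  # flows that just became active (sparse)
        self.old_flows = deque()  # flows that have used up a quantum (bulk)

    def enqueue(self, name, size):
        flow = self.flows.setdefault(name, Flow(name))
        idle = flow not in self.new_flows and flow not in self.old_flows
        if idle:
            # A flow with no backlog starts on the new list with a fresh quantum.
            flow.deficit = QUANTUM
            self.new_flows.append(flow)
        flow.packets.append(size)

    def dequeue(self):
        while self.new_flows or self.old_flows:
            from_new = bool(self.new_flows)
            active = self.new_flows if from_new else self.old_flows
            flow = active[0]
            if flow.deficit <= 0:
                # Quantum used up: refill it and go to the back of the old list.
                flow.deficit += QUANTUM
                active.popleft()
                self.old_flows.append(flow)
                continue
            if not flow.packets:
                # Empty flow: drop it, but pass an emptied new flow through the
                # old list once so it cannot keep jumping the queue.
                active.popleft()
                if from_new and self.old_flows:
                    self.old_flows.append(flow)
                continue
            size = flow.packets.popleft()
            flow.deficit -= size
            return flow.name, size
        return None
```

With a backlogged "bulk" flow and a single small packet arriving later on another flow, the small packet is dequeued as soon as the bulk flow has used its quantum, instead of waiting behind the whole backlog.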
I would love a packet capture of this behavior. YES, if packets are fragmented, they will be delivered out of order. Ideally your tunnel should be signalling back that the mtu is too big. Most IPSEC implementations I reviewed had a reasonably sized reorder buffer - at least 32 packets. Some were muc...
I am starting up (with some funding from NLNET and comcast) a bufferbloat.net project as of this month to fix some outstanding bugs and add new features. It would be so great if mikrotik would throw in some dough (and gear) for development and testing also. To this day I do not know what, if any, fq...
I have never really understood why qosify seemed so needed, with torrent especially. A) configure cake with nat on, and run torrent on a different IP. B) tell torrent to use CS1 for uploads.
0) In general, I do recommend experimentation with these concepts. Get the L4S code - try it on your machines at home! But a few facts, where I will attempt to be unbiased. For the record, I originally backed an RFC3168 backward compatible version of this idea in the IETF, called SCE, but backed out...
We have quite a few mikrotik users in the LibreQos chat server, and a burgeoning WISP community sharing insights and code. We recently switched to "zulip" chat which is highly interactive and a great deal of fun to use. It's here: https://chat.libreqos.io/join/fvu3cerayyaumo377xwvpev6/ I a...
Offtopic here but: I tried OpenWRT on a Linksys WRT3200ACM with CAKE enabled. Connected my Mikrotik Chateau LTE12 on WAN-port (all ROS queues disabled). It did not perform any better and bufferbloat was still evident. Then I tried https://github.com/lynxthecat/cake-autorate and I did not know why, ...
In many cases people are still using the factory, out of tree, wifi device driver supplied by the manufacturer, and their UI tightly tied to those APIs. Some of the factory drivers have features that are needed, also, like txop reduction in the beacon. We tried to make all the wifi manufacturers awa...
The fq_codel type is set for wired (Ethernet, SFP) interfaces in order to reduce bufferbloat. No interface queue for LTE interface itself. @MikroTik staff. Yes, this is good news, this is a massive step forward in the industry (Yes, I am serious). But there's a problem. MikroTik RouterOS Linux queu...
The fq_codel type is set for wired (Ethernet, SFP) interfaces in order to reduce bufferbloat. No interface queue for LTE interface itself. That's true, physically ports can use "non rate limit" queues. But still not sure what the change does... e.g. what does "wired port" mean i...
Over here is an implementation of BQL for the mvpp2 chip, which may be similar to the chip you are using, with test results. https://github.com/wojtas-marcin/Linux-Kernel/commit/44dacb14698b0e7756d427acd3591980057a70fb Having BQL is a godsend, it lets you run fq_codel at line rate at about 1/20th th...
QoS-HW is compatible with L3HW. You can use both features together. Every supported device has 8 TX queues per port , and users will be able to assign QoS profiles to TX queues: either grant a QoS profile exclusive access to a queue or share a queue (or group of queues) between multiple profiles. T...
I just stumbled across this thread. 6 comments: cake's diffserv4 model is the closest we could come to matching multiple mutually conflicting rfcs. because of confusion over the cs1 codepoint as background, the LE codepoint was created: RFC8622. There is a new codepoint pending with basically the sa...
I would be interested in that but I have a feeling you would wipe out the memory and CPU. It would be cool of you to try. A method that *would* work would be to divide the bandwidth evenly between all those subscribers by bumping up the number of flows from the default, and using a tc filter by dest...
I don't know if you also requested that RFC3168 marks be logged? A surprising amount of packets are being marked by fq_codel/cake etc, today, being signaled end to end by at least some (Apple's?) devices requesting ECN support. RFC3168 ecn is a way to do congestion control without dropping packets, ...
Packet loss is not a particularly good metric to use against a cake or fq_codel instance, as it uses packet loss to control congestion if RFC3168 is not enabled by the endpoints. In exchange for decreased latency, you get more packet loss. So in both fq_codel and cake you should have seen an increas...
I am always looking for more stories about how well (or not) fq_codel and cake are working for y'all. Recently I had a bit of a tiff with juniper, solving an uplink problem on a circuit ostensibly rate limited to 2Gbits but only delivering 1.2. I honestly don't know how often stuff like this happens...
The word priority does not apply to fq_codel or cake (in besteffort mode). A better word would be "sparsity", the relevant paper on this is here: https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8469111 - but in a nutshell it translates to flows having an arrival rate of less than the ...
Thank you for the detailed explanation. You are right! I will implement and check what impact it has on the CPU. I read your post and my my, what a gem of information it is. I'll test it out and share results as well. I have 15.2 GB memory free on the CCR so I think I won't need to change memlimi...
I am of two minds about BBR. I have been mostly waiting for BBRv2 to come out before recommending it for anything other than its original purpose: being better than DASH (netflix) style traffic for youtube. It is presently ill-suited for sharded web sites in particular (does not compete with itself ...
I do hope a table of performance metrics appears for more mikrotik gear. Sometimes I just hope they will put the fq_codel or cake algorithms more directly into their PCQ implementation. I am blogging more and more, and hoped that someday that insanely long thread on cake for mikrotik got boiled down...
single threaded like iperf tests, without also monitoring tcp latency (Try a packet capture in wireshark, do a RTT plot) will not, in general, show the underlying benefits of fq_codel or cake. You are measuring a dragster on a single track, with a plain iperf test. What of the ability to steer? The ...
I'm not ;-) I only apply such filters for stupid paying customers wanting it. Because they only know YouTube for video and Facebook for social media. So they think trying to block those two sites helps anything. What I sometimes do on sites with low bandwidth uplink is using tls-host rules to apply...
I would just implement it and go measure, and be ready to roll back. Pound it flat with artificial traffic at 4am? (I am one of the authors of fq_codel and cake, but I do not have enough data either, on how well this stuff scales on given bits of mikrotik hardware). In most cases it is the per custo...
I am interested in success and failure stories with folk deploying fq_codel and cake, the kind of problems you ran into, and features you would be interested in. In particular, I am curious as to how or if you modified BNG-style head-end deployments to suit your needs leveraging either or both these...
Hi Dave (dtaht), > A modern version of cake has support for the new diffserv LE codepoint. I'd dearly like support for that in mikrotik given how problematic CS1 proved to be, and it's a teeny patch +1! Would be great if you could submit a request at https://help.mikrotik.com so it is formalized. T...
I don't know what they mean by no queues. There *always* is a queue. If they have a zero length fifo + a ringbuffer of some size on this product, it's still a queue. Figuring out when they drop packets from it and how big it is is always on my mind. A lot of subsystems have BQL now in the linux kern...
I don't understand your methodology for this bit. It doesn't look like you tested fq_codel on the interface queue, all by itself, no trees, no shaping, just replacing what is a (very small) fifo, in their default modes? In my part of the world (linux, openwrt, ios), fq_codel is the native qdisc on t...
I am in general, a big fan of the rrul test suite for evaluating workloads against router types. There's a huge thread using that, over here: viewtopic.php?p=961294
I have CCR2004's, RB4011's, and RB5009's handling 2Gbps of traffic, but without queues. All of them have quad-core 1+GHz CPU's. Each 1Gbps of traffic (without queues) uses about 10% of CPU throughput. Supposedly Cake is less CPU intensive than fq-codel, but I haven't enabled either one on customer-...
Cake co-author here. I feel a need to clarify a few things. cake unshaped - running at the native line rate of the interface - should be able to do its job on most of the hardware discussed. So if your service is gbit, and that's the line rate of the interface, you are in business on outbound. Howev...
This YouTube video came out a few days ago: https://www.youtube.com/watch?v=UICh3ScfNWI It does a great job of explaining the symptoms of bufferbloat, and how to fix it. While it doesn't call out mikrotik's recent adoption of cake and fq_codel, it would be great if y'all capitalized on the incre...
Puckishly, I could suggest you change your business model, guaranteeing a minimum of bandwidth/number_of_ips_on_the_link, and just applying cake, which does that automatically (it is fair to each ip). Customers would then be able to use up to the 500mbits available, and most usage patterns are *way*...
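To make the "guaranteed minimum, but free to use whatever is idle" idea concrete, here is a small sketch of an idealized per-host max-min fair split. It is a model of the outcome, not cake's actual algorithm, and the function name and the demand figures are invented for the example.

```python
def max_min_fair_share(capacity_mbit, demands_mbit):
    """Idealized per-host max-min fair split of a link.

    demands_mbit maps a host to the rate it would like to use.
    Hosts wanting less than an equal share keep their demand; the
    leftover is split equally among the hosts that want more.
    """
    allocation = {}
    pending = dict(demands_mbit)
    remaining = float(capacity_mbit)
    while pending:
        share = remaining / len(pending)
        small = [host for host, want in pending.items() if want <= share]
        if not small:
            # Everyone left wants more than an equal share: split evenly.
            for host in pending:
                allocation[host] = share
            break
        for host in small:
            allocation[host] = float(pending.pop(host))
            remaining -= allocation[host]
    return allocation


# Four hosts on a 500 Mbit link; two are nearly idle, two are greedy.
print(max_min_fair_share(500, {"a": 5, "b": 20, "c": 400, "d": 400}))
```

In this made-up case every host is guaranteed at least 500/4 = 125 Mbit if it wants it, but the two busy hosts end up with more than that because the other two leave most of their share unused.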
We (the cake developers) are struggling on how to implement your design with cake as well. We designed it primarily to control the CPE (and would recommend you implement it there as well, as it pays to shape/drop packets/etc. as close to the exit point as possible). We know how to do it using htb, in linu...
cake's default mode of per host/per flow fairness would be simplest. If you are natting at the router that would be cake nat diffserv4 on both up and down. You are going to run into severe bandwidth limitations however with that many users on the link, and it might behoove you to add some diffserv p...
Hi, could anyone please simply explain the difference between besteffort and diffserv4? besteffort does not attempt any differentiation between diffserv classes. It is equivalent to fq_codel in this mode, except it uses an 8-way set associative method to (nearly) guarantee each flow its own queue, and th...
Through persistence, jim gettys bugged enough smart people to care, until he found me. Through our persistence, we found over 500+ other people to participate on the mailing list, came up with ways to make wifi and routers much more performant, got 1000s of researchers involved on google scholar, fo...
I managed to reproduce some fq_codel spikes with flent (100 connections on IPv6): flent 100s ipv6 fq-codel.png I also implemented measurement of up and down latencies in crusader and redid some fq_codel tests (100 connections on IPv6). 5 tests had spikes on the down latency and 1 on the up latency....
I keep trying to get ISPs to put better queue management on their links, so perhaps you can show them your data, and point them at Preseem and LibreQos as one key way for them to manage their bandwidth better? These sorts of middleboxes within the ISP can help immensely and are very inexpensive to s...
I'm back from vacation and you seem to have achieved satori whilst I was gone. :) The research into trying to make cake adapt to LTE better is over here, with a shell and lua script: https://forum.openwrt.org/t/cake-w-adaptive-bandwidth/108848 There is a severe (mostly multicast related) bug in the ...
I am on vacation and in general far from internet. Expect sparse replies if any for the next week or so. 1) I had thought crusader was sampling TCP_INFO, not using another measurement flow. My bad. The "packet loss" you are reporting is actually "measurement packet loss", not the...
Turn off wifi location services and afd on osx, if that's your wifi client. To be clear, ROS and openwrt on any given piece of hardware should have roughly the same performance, although logging into openwrt shouldn't burp as bad as ROS. Yes, all the latest de-bloating code goes into openwrt first. ...
Why do you feel it is necessary to also rate limit each ap? cake does per host fq itself (although, if natting, you need the nat option on) Certainly it is helpful to run most wifi APs at well below their maximum rate unless they have this: https://www.cs.kau.se/tohojo/airtime-fairness/ An easy way ...
@zoxc I just was about to file a feature request for staggered start over here: https://github.com/Zoxc/crusader and then I realized you were the author!!!
GREAT WORK. Nice to meet ya! Thx for writing this tool.
Your last result is really puzzling, though. 30sec of oscillation like that... this on ethernet or wifi? this was to, rather than through? What if you tune the inbound shaper down a bit more (at least another 10% to start with). I wish crusader would distinctly show the up vs down latency on this la...
In looking at your crusader data a couple notes made by inference (can you post your config?) A) On inbound, shaping is less effective than outbound. It's the nature of the beast. You are going to overshoot at least 200ms with codel in place on a test this extreme, and taking 4 seconds to get these ...
One thing I wonder is why there isn't a total queue byte limit with per-packet overhead instead of the very vague packet limit for codel, fq_codel and cake. For fixed bandwidth links I'd like to configure them to constrain the worst case buffering to say 30 ms. codel is quite slow to act which can...
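The arithmetic behind that wish is simple enough to sketch. The helpers below are hypothetical (not a RouterOS or tc feature): one converts a delay budget into a byte limit for a fixed-rate link, the other shows how much worst-case delay a given packet limit allows depending on packet size.

```python
def queue_byte_limit(rate_mbit, max_delay_ms):
    """Bytes that drain in max_delay_ms on a link running at rate_mbit."""
    return int(rate_mbit * 1_000_000 / 8 * max_delay_ms / 1000)


def worst_case_delay_ms(rate_mbit, packet_limit, packet_bytes):
    """Time to drain a full queue of packet_limit packets of packet_bytes each."""
    return packet_limit * packet_bytes * 8 / (rate_mbit * 1_000_000) * 1000


# A 30 ms budget on a 100 Mbit link.
print(queue_byte_limit(100, 30))

# The same 1000-packet limit means very different delays for
# full-size frames and for small packets.
print(worst_case_delay_ms(100, 1000, 1514))
print(worst_case_delay_ms(100, 1000, 64))
```

That spread is why a packet count is a vague bound: the delay it permits depends entirely on the packet sizes in the queue, while a byte limit maps directly to time at a fixed rate.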
I think it will be useful to integrate some kind of strategy or mechanism to differentiate maybe 4 kinds/priorities of traffic: 1. High-priority traffic like VoIP, and real-time gaming match traffic 2. Videoconferencing and remote management protocols. 3. Light, bursty but low-time connections like in...
Thank you for describing the process towards your aha moment. These days it seems sane to try to buy advertising on keywords like these, in the hope that one day, more folk link to the right things. Regrettably the bufferbloat project runs on essentially zero budget.
I think this is the model you have, which was supported by openwrt 19: https://openwrt.org/toh/mikrotik/rb952ui-5ac2nd_hap_ac_lite The reflashing procedure is a bit painful (and I wouldn't risk it on your roommates if that's the only router you got!), but after upgrading your main router, and have sp...
I was very very happy with all your help testing and exploring fq_codel and cake on this enormous thread: https://forum.mikrotik.com/viewtopic.php?p=937633 (which I don't want to add to!) and equally happy that it seems to have stabilized in current mikrotik releases. But I had questions unanswered ...
@Techtress Glad you so thoroughly figured it out for yourself!! I keep hoping to isolate that "aha!" moment where people make that cognitive jump from "my internet is slow" to "oh! it's bufferbloat", and seriously, if you can remember the moment that triggered you findi...
It would be useful to be able to deduce and reduce the hardware tx ring size on wifi, especially on older (6) "stable" releases of mikrotik, to reduce latency and jitter under multi-station contention. This appears to presently be a number in the 128-256 packet range, which is overkill eve...
I don't know if this box can shape gbit in both directions simultaneously. ? A decent i5 or better middlebox can. Doing a transparent bridge IS a good idea if you cannot do anything else: https://apenwarr.ca/log/20180808
--socket-stats will capture the tcp rtt directly for uploads and supply plot options to look at (for example) the difference between target 5ms and 12ms. While you might think you want a little more bandwidth than what you got, smaller queues lead to better behavior in the event of a hash collision...
The cake_2 plot above seems to show that something went wrong twice during the test - either there was other traffic, or something glitched somewhere. Doing a comparison plot of that test run will show lower bandwidth and higher latency - this is why we show the detailed results first before going t...
Very good post. However the reason why you see worse udp BK ping latency and a worse average latency on the rrul test cdf for cake is that the foreground tcp traffic is getting more priority than the background traffic, which is actually what you want. The rrul_be test, or using cake besteffort, wou...
My dreams: 0) Get more of y'all (ideally mikrotik) to obsolete the default on interface pfifo AND sfq in favor of fq_codel (and test that without shaping - running 2 ports into one). And increase the packet limit especially if they also get BQL. 1) one day!! be able to use cake's integral shaper on b...
Really great report, thank you. I'm hoping this is a really stable release! I'm pleased that in this scenario cake is only about 3% more cpu. 0) something weird happened on "Cake, simple queue configuration, fasttrack disabled." - did you reset the qdisc? A typical "hit" from som...
To avoid further flooding of the 7.30beta thread with Cake topics, here some results taken from my home network: RB5009, ROS 7.2.2, Fiber uplink at SFP1 using PPPoE with NAT capped at nominal 500/100 by the ISP equipment at the other end of the fiber. The ISP UL shaper does a not so bad job, but th...
Random plug: I am very happy to have picked up more flent converts. My favorite feature of flent is the ability to do comparison plots! Given that it is too hard to install easily on OSX, and the core devs don't have much time to maintain it, I am trying to put together a proposal to form a foundati...
as of linux 4.12, we'd put in some major enhancements to linux wifi, originally for the ath9k chip, nowadays the ath10k, most of mediatek's mt76 line, and most of intel's chipsets, making fq_codel run native (near zero cpu cost), and solving the packet aggregation and airtime fairness problems thoro...
Since linux 3.3, many ethernet drivers gained support for the BQL facility, which limits the amount of data in the ethernet tx ring enormously. Basically it stores up "just enough data" to smooth out an interrupt. At a 100Mbit that might be *2* large packets, or 30 small ones, and then pun...
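To put rough numbers on "just enough data", here is a small sketch of the time it takes to serialize what sits in the tx ring. The packet sizes (1514 bytes for a large frame, 64 for a small one) and the 256-slot ring used for comparison are illustrative assumptions, not values read out of any driver.

```python
def serialization_time_us(total_bytes, rate_mbit):
    """Microseconds needed to put total_bytes on the wire at rate_mbit."""
    return total_bytes * 8 / rate_mbit  # bits / (Mbit/s) gives microseconds


RATE_MBIT = 100

# Roughly what BQL might allow at 100 Mbit.
print(serialization_time_us(2 * 1514, RATE_MBIT))   # two large packets
print(serialization_time_us(30 * 64, RATE_MBIT))    # thirty small packets

# A full, unmanaged 256-slot ring of large packets, for comparison.
print(serialization_time_us(256 * 1514, RATE_MBIT))
```

Both BQL-sized cases come to a few hundred microseconds of buffering, while the full ring holds on the order of 31 ms that a qdisc like fq_codel sitting above it cannot reorder or drop from.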
What's new in 7.3beta40 (2022-May-11 12:18): !) queue - do not allow using CAKE type in simple and tree setups (already configured queues will be disabled); Ok. Cake is not allowed for simple queues and tree queues anymore. Will be disabled. Got it. What's new in 7.3rc1 (2022-May-27 11:50): *) queu...
But it works for physical ones, for example, my WAN interface is ether1. I haven't tested if it actually functions properly, but RouterOS lets me assign the queue. In my tests it never was possible to attach cake as interface queues on virtual interfaces. But what works, at least for me up to ROS ...
It's worth noticing that it's not possible to combine or benefit from queues that contain different "shapers" with each other. A "shaper" is just a packet scheduler that delays packets to reach a desired speed, thus there is no point in stacking two "shapers" on top o...
PS I'd really like some benchmarks substituting fq_codel or cake for the default fifo "interface" queue, as well as the sfq based wireless queue, *without a packet limit*. The default mikrotik packet limit of 50 seems far, far too low for modern bandwidths > 100Mbit. Single stream benchmark...
Cake was *designed* to run the same at line rate, with an external shaper, and/or with its own shaper. In the case where it is used as a htb or hfsc leaf qdisc, you have to run it in bandwidth unlimited mode (which is the default). Disallowing or ignoring usage of the "bandwidth" keyword s...
"The only thing you have to prevent users from doing is setting the CAKE 'bandwidth' parameter when it's installed as a leaf qdisc under HFSC. Running cake in 'unlimited' mode (which is the default anyway) as a leaf in an HFSC tree is perfectly fine, so if users want to do that, I'd say let the...
Do you mean that the interface needs to be set to eth1 (as opposed to the pppoe-interface) or rather the upcoming change in ROS 7.3 that does not allow cake as a simple queue type? Mikrotik couldn't fix the bug so they cut the feature. Now cake in 7.3beta40 is useless. They did reach out to me, and ...
Nice result! And yes, under load on some technologies it's possible to get less latency, as remarkable as it is. Powersave is often a problem. A device will go to sleep until there are more packets to transmit. This is a somewhat foolish behavior network-wise, in that - for example - a tcp syn then ...
I have tried to reach out to mikrotik before on how better to implement cake and fq_codel, even better than it was already working. I'm easily found at dave dot taht at gmail dot com. Or on the cake mailing list on lists.bufferbloat.net. I've been working for free, here, on getting a quality impleme...
I didn't run into anything like that during my testing, although it's possible that I wasn't putting enough load on the router to cause a problem like that to occur. My router is also a different model with a different architecture than the ones mentioned there. I'll chime in on the thread you link...
I don't "get" why anyone needs to set a bucket when using cake. Cake has its own shaper. Setting it on the bridge is usually the wrong thing. You want to set it on the actual physical interface you are trying to shape, so it sees all traffic. Per customer shaping is different, but elided h...
I'm glad to hear the ipv6 problem appears to be fixed. I would love to see some flent benchmarks of ipv6 traffic to prove that though.... Also, on the overhead parameter. Certain forms of DSL are very inefficient at small packets, with 60% overhead. So while one benchmark might look ok (using large ...
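A sketch of why small packets suffer, assuming the DSL link in question is ATM-based (the case cake's atm keyword is meant for). ATM carries 48 bytes of payload in each 53-byte cell, and the last cell is padded out. The 40-byte per-packet encapsulation overhead used here is an arbitrary placeholder, not a measured value for any particular ISP, and the function name is made up.

```python
ATM_CELL_BYTES = 53
ATM_PAYLOAD_BYTES = 48


def atm_wire_bytes(packet_bytes, overhead_bytes=40):
    """Bytes actually sent on an ATM link for one IP packet.

    overhead_bytes is the per-packet encapsulation overhead; the
    default is a placeholder and must be replaced with the real value.
    """
    framed = packet_bytes + overhead_bytes
    cells = -(-framed // ATM_PAYLOAD_BYTES)  # ceiling division
    return cells * ATM_CELL_BYTES


for size in (64, 1500):
    wire = atm_wire_bytes(size)
    print(size, wire, round(wire / size, 2))
```

The ratio of wire bytes to packet bytes is far worse for the small packet than for the large one, which is why a benchmark dominated by large packets can look fine while a shaper with the wrong overhead setting still overshoots on small-packet traffic.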
I am not huge on present-day speedtests and have shown off flent through this thread. I really wish we could get good stats out of mikrotik to see the effectiveness of things. Another thing you can be doing is a packet capture of your speedtests and plotting rtt in wireshark. Historically, lte driver...
I am not much of a lobbyist, but if there is a mikrotik group representing WISP interests, I'd like to talk to them about how to spend a billion $ better:
I am very happy to see the demand for cake and fq_codel here, and do hope the ipv6 problem is resolved soon. I've been trying to help with cake specific configuration primarily over here: https://forum.mikrotik.com/viewtopic.php?t=179307 and here: https://forum.mikrotik.com/viewtopic.php?t=181289 I ...
In a more ISP-ideal world the ISP router would shape down to the customer, and the CPE router shape the up. It's more cpu efficient, and packets get dropped (and acks filtered) before they hit the bottleneck link. Our efforts to do both sides of the shaping on the consumer end router was more an act...
@mducharme No, I had failed to "connect the dots" on your terminology and subguis. (I'm still seeking a decent high end mikrotik box to try) Sounds like cake's shaper (bandwidth param) can be used natively using that last panel you showed, without an htb. Yay! cake's "bandwidth" ...
thx for attempting to meet my mind in the middle! Perhaps there is a mikrotik person that can fill in more of my blanks? Most likely this would require somebody to explain what is actually happening on the back end (CLI equivalent) for these configuration options I've shown you above for the differ...
I don't have insight into mikrotik terminology. To me, an interface queue is one line of code that can look like this:
tc qdisc replace dev eth0 root fq_codel # This runs the egress interface at the L2 negotiated line rate (ethernet 10mbit, 100mbit, 1gbit, 2.5gbit, 10gbit)
tc qdisc replace dev eth0 ...
1. it isn't stable and finished 2. it is likely not the optimal type for the usage scenarios hinted by the type name you would use such queues Hmm.. I strongly disagree there. I've been part of the bufferbloat testers for over a decade now. fq_codel can be enabled for everything you'd like to try i...
There has been a lot of work in the openwrt world to make the cake autorate feature work better over here. https://forum.openwrt.org/t/cakes-autorate-ingress/108848/ It's not ported to mikrotik yet, nor do I know how we would do that as yet. The math is shaping up, at least. See also: https://forum....
I am going to take a break from this thread for a while. The bitag report I'd been working on was released yesterday and I hope it's even more incentive for y'all to focus on reducing "working latency" in your networks. Please reshare; https://www.bitag.org/latency-explained.php 1) Can someone ...
I am going to take a break from this thread for a while. The bitag report I'd been working on was released yesterday and I have to go deal with the political fallout. Please reshare; https://www.bitag.org/latency-explained.php 1) Can someone confirm that 7.2 perhaps has a working ipv6? 2) I'd love t...
I am not big on strict priority queues except under strictly controlled circumstances. Cake does soft admission control which gets the game theory more right, all classes are guaranteed service, all classes can borrow. For strict priority queues I might leverage the prio qdisc + fq_codel sub-qdiscs,...
From the list: If you just want to use cake with priority tins based on the MPLS "Traffic Class" (TC) field (i.e. the renamed original "EXP" field, see RFC5462), I think you can use a tc flower filter (https://man7.org/linux/man-pages/man8/tc-flower.8.html) matching on mpls_tc va...
At the ISP level I would really like some of these queuing disciplines to have native support for MPLS. The typical way to do QoS on an MPLS network is with EXP bits and not DSCP for the actual IP packet being transported. It would be ideal if something like cake could support prioritization based ...
Over here is my demo of how the entire field/world mis-understood the benefits and problems of wifi packet aggregation in the 802.11n standard and later. https://www.youtube.com/watch?v=Rb-UnHDw02o&t=1540s If somehow millions more folk spent 8 minutes of time watching that segment demand for the...
David, Thank you for your thoughts, I think info like this really helps people understand where your project is coming from and working towards. I am hoping to pick your brain some more though. There are plenty of us running Mikrotik all the way from the customer to our core and back out to the int...
It would be pretty cool if fq-codel or cake were backported to the most popular stable shipping mikrotik release. That's how we got deployment on ubnt's gear, initially. It might be very hard for them to backport as the kernel that RouterOS v6 runs on is very old (3.3.5), although I don't know how ...
I note I am enjoying "jamming with an audience" and hope in the end we end up with a quality reference implementation guide for mikrotik's stuff (very sad to hear ipv6 is currently borked). I think there is a wonderful "pile of money" and genuine benefit to be created in just ame...
I have been away for the holidays and not paying too much attention to the internet. I'll be back for real next week. Thank y'all for sharing your topologies. Quick question: What statistics are you currently able to collect on the behavior of your bottleneck links (via snmp... or?) Also, a current ...
0) The right way to manage LTE outbound is with backpressure from the radio. :various cusswords elided: as to why we still don't have a good way to do that. 1) Same goes for managing the queues on the enode-bs and other bottlenecks on the path. We designed fq-codel to be lightweight, with backpressu...
Really good example to show the effects of a simultaneous speedtest against a rrul test! But I wouldn't call it "bad bandwidth metrics", but perhaps, "sanely reduced?". What that test shows is the side effect of that traffic on the rrul workload through cake. Given that the laten...
Right now I do not trust mikrotik's treatment of the diffserv bits. Could you kill the wash option and use rrul_be? No way should that download have been able to run away like that. I guess I should get a mikrotik box myself and experiment? I haven't used it in years. I'm very interested in the many core...
@blurrybird, also I thought you were going to test a 100Mbit link, not a gbit one?
I am pretty sure cake has a role in a gbit/50mbit scenario on just the uplink, but it has historically required good x86 hardware to inbound shape the down at a gbit.
My guess is you are thoroughly out of CPU on the download, not being able to crack 400Mbit. So I would suggest applying cake with the ack-filter - to the upload only, at say, 40Mbit, to start with. Cake with the right encapsulation options can get very close to the rated rate (say, 48) on the uplink...
Anyway, since we lost data, and I don't remember what it was, It would be good to post a summary of what the actual mikrotik configurations ended up being. The journey was educational for us, but I imagine to the outside observer, kind of frightening. thx, and merry christmas! I may not be online a ...
"My understanding is that everything Cisco and Juniper is wred. It can handle huge bandwidth amounts due to offloading to the ASIC, but is almost certainly much worse than any of the newer AQM solutions. I believe those running Cisco and Juniper have no ability to even consider codel or fq_code...
Utilization as a metric is useful for backhauls and other links that have a large amount of already statistically mixed traffic from many sources. It's also well known that it's a lousy metric, in that 50% utilization over an interval might actually mean 100% utilization for half that interval ...
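A tiny sketch of that averaging problem, using invented per-second throughput samples on a 100 Mbit link. The function name is hypothetical, and it assumes the number of samples is a multiple of the window size.

```python
def windowed_utilization(samples_mbit, capacity_mbit, window):
    """Average utilization (0.0 to 1.0) over consecutive windows of samples."""
    return [
        sum(samples_mbit[i:i + window]) / (window * capacity_mbit)
        for i in range(0, len(samples_mbit), window)
    ]


# One minute of per-second samples: saturated for 30 s, idle for 30 s.
per_second = [100] * 30 + [0] * 30

print(windowed_utilization(per_second, 100, 60))  # one 60-second average
print(windowed_utilization(per_second, 100, 30))  # two 30-second averages
```

The one-minute average reports a half-used link, while the shorter windows show that it was fully saturated (and building queues) for the first half of the minute.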
Let me get the biggest negative motivations out of the way first. 1) ISPs have perverse incentives to discourage actual use of the bandwidth. The more bandwidth you can sell, and the less your users use it, the more money they make. Monopoly ISPs, especially, have a tendency to not move very fast. 2...
two last bits of backstory: I founded an ISP in 1993 (sold it in 1999), and was trying to get my WISP off the ground in Nicaragua in 2007, but my fresh shiny new 802.11n network failed when it rained. There were numerous times over the course of my history in these markets where I would have killed ...
I'm sorry it's taken me so long to address this. Backstory first: fq_codel and cake were "bottom up" efforts throughout, developed upon jim gettys seeing a real problem with how the internet had degraded, that nobody understood, that he kept digging into and into, with the aid of many of t...
Well, you shouldn't see that long term growth pattern either. This is after you tuned up the multipath tx/rx thing? What happens with bandwidth down less 20Mbit? Anyway, thx again. I'm packing up for a trip south, (not to mexico! trying to get closer to the spacex launch), and can't look at this har...
cake. also follow with ecn off without resetting the qdisc, to make sure it's not permanently driven wonky? If it's permanently driven wonky, that's almost a CVE, and I've had enough of those this week.
don't celebrate too soon. Luck counts, and there still may be an obscure bug... :) And you mean cake memlimit or physical memory? Is the ack-filter on on brother's egress? Again, given my still held doubts on having the offsets right for dscp, ecn, and that, having it on may do bad things, but it's ...
Squinting, in both cases it did get better for all but the furthest distance (which is kind of expected). If you started the dallas flow last, they'd converge quicker, or if you ran the test longer (-l 300). So anyway, I'm pretty sure that how we calculate the default for inbound shaping is wrong, some...
The behavior of multiple queues in series is kind of complex. Theorists like very much to think about things in terms of a fountain of water, but the real world is batchy in so many respects. Take packets hitting the rx ring. A batch arrives and the ring was nearly full in the first place. A whole b...
so this last one had 200MB on ingress? Dang. I gotta point at available queue space at the provider, or a limited rx ring, (or that bug with bursty failures) to explain a failure to improve here.
Moah! Moah! 8x more! You have the memory to burn. (when we developed cake, *32MB* of ram in the router was a lot) I tried to explain the "default" calculation had some overheads in it that didn't make as much sense on inbound shaping as out. I can try to explain that better.... The default...
fqcodel_dl.png despite this being better, it appears to my eye that you were running out of queue on the down due to the synchronized drops - which could be hitting a limit at the provider or... is there a 1000 packet limit or memory limit? Cake scales this correctly for you on the down, or should....
Without ack filtering it is extremely difficult to achieve full download speeds at a 15x1 ratio of down to up or worse. Also rx rings need to be properly sized, as DOCSIS is bursty. An rx ring of 256 is too small. Don't know if you can change that. I wish more folk were taking packet captures of thei...
thank you so much for sharing your raw flent.gz files and packet captures. So many things in this world cannot be captured by a single number, a summary plot, and while a cdf might hint at a problem, looking at a system's evolution, over time, is always helpful. The explanation for why we saw this b...
Well, that grouped bifurcation shouldn't be happening in that way. fq-codel suffers from the birthday problem, where you can expect a hash collision at around sqrt(1024) flows, so at 32 flows there's a good chance you'd see 2 flows colliding and getting different behavior from the rest. Cake uses an 8-way set-associative hash so you ...
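The birthday-problem arithmetic can be checked with a few lines of Python. This is a sketch under an idealized assumption: every flow hashes uniformly and independently into one of 1024 buckets. It does not model cake's set-associative hash, and the function name is only illustrative.

```python
def collision_probability(flows, buckets=1024):
    """Probability that at least two of `flows` share a bucket,
    assuming uniform, independent hashing (an idealization)."""
    p_no_collision = 1.0
    for i in range(flows):
        # The i-th flow must avoid the i buckets already taken.
        p_no_collision *= (buckets - i) / buckets
    return 1.0 - p_no_collision


for n in (8, 16, 32, 64):
    print(n, round(collision_probability(n), 3))
```

Running it for a few flow counts shows how quickly the odds of at least one collision climb once the flow count approaches the square root of the bucket count.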
How much memory does this router have? And if there's a way to, say, double the packet and memory limits on the fq_codel rtt_fair test on your home machine maybe those sync'd drops would go away. I didn't see those options in the gui... a lot of people patch down the 10000 packet limit and 32MB limi...
shouldn't be a nat related issue. In wireshark, to verify if ecn was exerted on an upload, filter on tcp.flags.ecn == 1. Yes, the flag is getting set, but my wireshark does not appear to show ECN properly on the tcptrace tool. That is not looking particularly healthy on my xplot either. sacks, resets,...
Let me tackle the download portion of the test. :rant: *nobody* for some reason, tests up and downloads and ping simultaneously, as if people just sat there, did an upload, waited, then did a download, and then did a ping. It's a really bothersome aspect of almost all the web tests today. Real traff...
cake on the edgerouter: https://community.ui.com/questions/Cake-compiled-for-the-EdgeRouter-devices/fc1ff27c-f321-4344-8737-fcc755cae8a2 cake on the udm pro: https://github.com/fabianishere/udm-kernel The whole bufferbloat project is full of hackers desperate to have low latency bandwidth and willin...
The download component of your test looks a touch odd to me, I asked above what it was set to. Also the --te=upload_streams parameter has no function on the rtt_fair tests, they generate one stream per -H server option. Here's where fq-codel begins to pull ahead of SFQ in a couple respects. Your bas...
I have since setup whatever ubiquiti's default simple queue is on their USG.. I believe it is fq_codel? He is amazed, and his facetime video is super clear. He is on a 25/5 cable modem so that made an enormous improvement for them. As he said, I can stream netflix and game at the same time now! heh...
This enormous thread, debugging fq_codel and cake on mikrotik ( https://forum.mikrotik.com/viewtopic.php?t=179307 ), spawned this question from @mducharme ... and it seemed best to fork it here. > dtaht: > Lastly, I do not know how much wred is deployed anymore. 5 tuple FQ - all by itself - seems to...
To restore your eyeball to what the current "real world" looks like for everyone else, try that rtt_fair test with all this fancy schmancy stuff off, just the default fifo on the modem. Your situation is different than that 2013 demo in that you have a vastly shorter queue than the 250+ms q...
The FQ component of fq_codel, cake, and fq-pie has what we call the "sparse flow optimization". Request/response (DNS, syn, syn/ack), the first packet of any new flow, acks, voip, and gaming packets usually "fly through" without observing any queuing at all. In this example we have ...
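A toy model may help show why sparse flows "fly through". The sketch below is a simplified two-list deficit round robin in Python, loosely modeled on the new-flows/old-flows idea. It is not the kernel implementation, it leaves out CoDel entirely, and the class names, flow names, and the 1514-byte quantum are assumptions chosen for illustration.

```python
from collections import deque

QUANTUM = 1514  # bytes of credit per round; assumed value for illustration


class Flow:
    def __init__(self, name):
        self.name = name
        self.packets = deque()  # queued packet sizes in bytes
        self.deficit = 0


class SparseFlowScheduler:
    """Toy two-list DRR: flows that just became active are served first."""

    def __init__(self):
        self.flows = {}
        self.new_flows = deque()
        self.old_flows = deque()

    def enqueue(self, name, size):
        flow = self.flows.setdefault(name, Flow(name))
        idle = flow not in self.new_flows and flow not in self.old_flows
        flow.packets.append(size)
        if idle:
            # A flow with no backlog counts as sparse and gets a fresh quantum.
            flow.deficit = QUANTUM
            self.new_flows.append(flow)

    def dequeue(self):
        while self.new_flows or self.old_flows:
            from_new = bool(self.new_flows)
            queue = self.new_flows if from_new else self.old_flows
            flow = queue[0]
            if flow.deficit <= 0:
                # Credit used up: refill and go to the back of the old list.
                flow.deficit += QUANTUM
                queue.popleft()
                self.old_flows.append(flow)
                continue
            if not flow.packets:
                queue.popleft()
                # An emptied new flow takes one pass through the old list,
                # so it cannot keep re-entering as "new" and starve others.
                if from_new and self.old_flows:
                    self.old_flows.append(flow)
                continue
            size = flow.packets.popleft()
            flow.deficit -= size
            return flow.name, size
        return None


sched = SparseFlowScheduler()
for _ in range(5):
    sched.enqueue("bulk", 1500)
print([sched.dequeue() for _ in range(3)])  # bulk has used its credit by now
sched.enqueue("dns", 100)                   # a sparse flow shows up late
print(sched.dequeue())                      # the dns packet goes out first
```

In this run the late `dns` packet is dequeued ahead of the two `bulk` packets still waiting, which is the behavior the post describes. A real implementation hashes packets into flows and runs an AQM on each queue as well.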
since your eye is now "trained" for a fairly short rtt, try fremont.starlink.taht.net or london, singapore, or sydney .starlink.taht.net. We also have tests for these competing against each other, as in the usual case we are not sending flows to a single server. SFQ will start to underperfor...
@kevinb361 the ecn result is very disturbing. But it could be mikrotik (a checksum failure or parsing the wrong bits on this encapsulation, which was a bug that I can't remember when we fixed in some release of linux and cake), the modem, the path, something at linode, where my server is. Anyway, fq...
@kevinb361 I was up very late yesterday and will sleep soon. I can live with not knowing ecn works before I wake.:) thx again for going to town on this and making such "interesting" mistakes. It's all data to me, and I think the bug you had on the xanwhatever itwas kernel was rather intere...
Everyone else working on hardware implementations kind of went dark earlier this year and stopped returning my emails. I like to think that's a good sign.
@mducharme thx for tagging along through this enormous thread. I do hope we prove the 7.1 implementation of these algorithms is solid... mikrotik is very late to this party but can benefit from - for example - all the progress made since docsis-pie was standardized ( https://blog.apnic.net/2021/12/0...
but a good test of fq-codel with ecn disabled would comfort me, first. There should be differences in the overall distribution particularly in the 32 flows test... but throughput should stay flat, not that horrible thing that just happened....
OK, it's back up. ECN neg is enabled (but the bits could be getting washed out on the path, OR I'd disabled it on the previous boot). To go to your BBR vs cubic question. :lecture mode: TCP reno was the "internet standard" for a long time. It had a "sawtooth", and an initial wind...
To summarize a few things. Yesterday we ended up in a state where a bunch of flows weren't even going through the host at the right rate, so we weren't stress testing the qdisc, and thus not seeing any difference in latency between the three different qdiscs under test. It was seeing SFQ act the sam...
In order for me to look at that machine (ecn neg might be disabled) I will need to shut it down and put a new password on it. Anyway, if yer still testing, let me know when done.
Nope. We failed to negotiate ecn. (in the packet capture the syn had ecn cwr, the syn/ack didn't, could be the modem, could be failure to read the dscp field properly on the mikrotik, could be my server, will check the server as soon as I remember the password) But comforting that the result was ess...
Thank you for the packet capture. You can, btw, filter out all your other traffic by specifying "host dallas.starlink.taht.net" This is the correct sort of carnage that cubic does, there's retransmits, dup acks, out of order stuff - strangely comforting after puzzling over that last captur...
That's MUCH more correct looking, thank you! Next, to see if ecn is working properly, (e.g. the mikrotik marking it correctly, the path not stomping on it) you can run the exact same test series, but with: sudo sysctl -w net.ipv4.tcp_ecn=1 I use ecn primarily as an AQM debugging tool (given how rare...
wow. You don't have enough loss on that link, only a couple retransmits to speak of, and I'm leaning towards an issue with your host tcp. At one level, it's great, but extremely, extremely weird. are you using the "fq" qdisc on your host, also? And sure you are using cubic? throughput.png ...
I'm glad you are digging it, and I can feed off your energy somewhat. for analyzing packet captures I use wireshark a lot, especially looking for retransmits, reorders, and the various plots.... I often use tcptrace and xplot.org - apt-get install tcptrace xplot.org Example of use tcptrace -G thecap...
Your also-degrading-over-flows sfq results are perversely cheering me up. flent bug. tcp bug. me, not mentally conceiving how a 19mbit bonded uplink "should work". the packet caps will tell. but it's 3am here, going back to bed, thx for testing sfq. Also I would consider the xianmod kern...
Could you delete the --step-size portion of your flent command line? Really hoping this is flent and sampling error... In fq-codel, we have what is now the second largest queue management system in the world, from a standing start of me and Eric Dumazet at 4AM PDT in May of 2012, admittedly a distan...
I hate bugs. :/ Anyway, a packet capture of the 16 flow test would be good at this point. tcpdump -i your-interface -s 128 -w 16flowscake.cap We'd never tested bonding until today... and I could imagine us having a lot of packet reordering in a variety of ways. Assuming this is a bug that isn't in f...
I'm off researching kernel versions. NOT relevant to this was the wireguard patch that went into 5.7. https://github.com/dtaht/sch_cake/issues/141#issuecomment-984503893 If you have a mikrotik account (I am not a mikrotik customer), and can file a bug, I'm a bit concerned. I wouldn't mind, however, ...
And at some point, when your gf is not looking, reboot and try cake again at up16? I return to my initial objective, not crashing. This is 5.6.x? cpu arch?
no, I didn't notice. 19 makes my head hurt less for now? In general dsl tends to fluctuate in rain, over the course of a day, etc, so leaving yourself headroom is a good idea.
by "scraping the rate" I meant rolling some sort of script to pull it off the modems sync rate, but since your isp is shaping you instead, stick to the 19.
I am pretty sure you have the overhead right at this point. I'm also happy to see it not crash. In the interest of science, however, if at some point you could also repeat the 4up test with htb + fq_codel, that would be interesting. Also if you were to enable ecn for a fq_codel vs cake comparison on...
Is it possible to scrape that rate? cake supports dynamically changing its config *without* reloading the qdisc, but I doubt mikrotik can do that with their api (?): tc qdisc change dev whatever cake bandwidth the_new_bandwidth. You should be able to get really close to the actual uplink rate (22xxx...
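If you can scrape the rate, the glue is small. The sketch below only builds the `tc qdisc change` command line and does not run it. The function name, the `eth0` device, the 5% headroom, and the `root` handle (assuming cake is attached as the root qdisc) are all assumptions for illustration. How you obtain the sync rate is device specific and not shown.

```python
import shlex


def cake_change_command(dev, sync_rate_kbit, headroom_pct=5):
    """Build (but do not run) a `tc qdisc change` command for cake.

    Assumes cake is already attached as the root qdisc on `dev`;
    `headroom_pct` is an assumed safety margin below the scraped rate.
    """
    if sync_rate_kbit <= 0 or not 0 <= headroom_pct < 100:
        raise ValueError("rate must be positive and headroom within 0..99")
    shaped_kbit = sync_rate_kbit * (100 - headroom_pct) // 100
    return ["tc", "qdisc", "change", "dev", dev, "root", "cake",
            "bandwidth", f"{shaped_kbit}kbit"]


cmd = cake_change_command("eth0", 20000)
print(shlex.join(cmd))  # prints "tc qdisc change dev eth0 root cake bandwidth 19000kbit"
```

Passing the resulting list to `subprocess.run` as root would apply it. Since `change` leaves the qdisc in place, existing flows keep their queues while the shaper rate moves.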
You don't have hw flow control. Nice to know (I guess) that BBR2 still struggles with itself. Try resetting that to cubic on the up, please, shape to 19, and add ack-filter to the up. I'm running cubic on that server for the down. Your baseline rtt might drop in half without bonding OR if you can disa...
i do dream of hardware flow control, so no shaper, bandwidth=0 for cake as a tcp_nup test. But i expect to be unlucky. Anyway, your fiddling with the frame parameters without a cake shaper active should have done nothing (I think), so that run was puzzling... cake nat besteffort the_right_dsl_option...
OK. 0) Still mostly very happy it doesn't crash. 1) Your dsl device's buffer is sized in packets, not bytes. The reason we only saw a 20ms RTT before on the rrul test, versus the much larger RTT on the tcp-nup test, is that the acks from the return flows on the path filled up the queue also. I le...
Does that VDSL device do hardware flow control? Or are you shaping via cake via htb? (I'm happy to hear the bandwidth=0 parameter seems to be working otherwise?), but the only way I can think of you getting results this good is if the vdsl modem is exerting flow control.... Anyway, your last result ...
OK, ok, I gave in, in order to do science, could you also try a tcp_nup with upload_streams=4? and =16? The Test 1 *appears* to show an old issue raising its head - tcp global synchronization - the amount of queue is so short that all the flows synchronize and drop simultaneously, as per panel 3 of...
Thx so much for testing. I have a low standard right now... "does it crash?", so far, so good. Your first result, sans cake, was really quite good, and indicates your AT&T link has only about 20ms of buffering in it, or so. Believe it or not, that's actually "underbuffered" b...
If you have a correct estimate of RTT across the satellite link, use rtt that_number + 60ms. Definitely do not use the default rtt estimate (100ms) here as it will not fill the link. "satellite" is a SWAG. cake supports RFC3168 - style ecn - if you enable that on your endpoints you can do ...
And re-re-reading this question (wow, did my eyes glaze over), cake pays no attention to vlan priorities. It can, with a tc rule. Assuming it's a modern enough cake. asking your question of the cake mailing list might get you somewhere...
in direct answer to your question, I don't know of any linux mainline device drivers that do anything clever with lte, like bql or aql. Most of these drivers are out of tree, and I do hope somewhere in some OS, for android or for ios, there's intelligent life down there. One of these days someone wi...
one of the things discussed on that openwrt thread was using a tcptrace-like tool, and elsewhere, deeply inspecting tcp rtt inflation with ebpf and one of Kathie Nichols' innovations, pping. Some info here: https://lists.bufferbloat.net/pipermail/bloat/2020-June/015772.html however mikrotik is far, ...
re - mpls. I have no idea if the linux flow dissector is good enough to get that far into the packet to do any good there. (I can look). It can cope with ppp-oe. If it can't find "flows", since there is seemingly no way to get at statistics in mikrotik, you would end up with a single queue...
Good catch. The bandwidth parameter should be optional for cake. As for whether or not you can run an LTE interface at line rate wisely, most of the linux drivers for that were terribly overbuffered, so the backpressure you got arrived very late. I hope that something like AQL or ...
We try to stress that the default options for cake (essentially just the bandwidth parameter) are good enough for most purposes. That said, there are two important differences between how cake's bandwidth shaper works vis a vis htb that are useful to highlight. Token bucket designs date back to the ...
I thought I'd write a brief note about SFQ vs CAKE. I think highly of SFQ. If I could go back in time to 2002, when it first arrived in linux, I'd have tried to make it the default, instead of a FIFO, given what I know now. It was *the* fundamental component in wondershaper. Nearly any place you hav...
Do you have any tips for LTE connections? Especially ones that go from ~5Mbps to 70Mbps in a few hours? The auto ingress doesn't always act as I'd expect it to, and I'm not sure if it's RouterOS' implementation, or a bug, or me not understanding things. Don't use them? We get the "how can an e...
To give an example of where I'd hoped to see fq_codel or cake make more of a dent in the mikrotik universe, consider a topology like this: 10Gbit -> 1GBit port A -> 1Gbit port B 10 more ports In ANY fast->slow rate transition fair queuing, and aqm, can soften the impact of that 10Gbit interface (or ...
Ah, I read the doc. All those options are exposed. Yay! I put a PSA about cake's options over here: https://forum.mikrotik.com/viewtopic.php?p=885000#p885000 Selfishly I'd really like to see from y'all some before (say, htb + sfq)/ after results (cake) on mikrotik's hw. Particularly the higher end s...
Hi, one of the contributors to cake here. I'm pleased y'all are finally shipping it, but I have a few comments: * A modern version of cake has support for the new diffserv LE codepoint. I'd dearly like support for that in mikrotik given how problematic CS1 proved to be, and it's a teeny patch. * One...
You are the first person I've found on mikrotik fiddling with their fq_codel implementation. (I'm one of the authors of fq_codel, cake, etc) And you are tackling the worst problem "out there" for bufferbloat - inbound shaping a low rate lte connection. These sorts of connections are not on...
Merely someone here posting a tc -s qdisc show with cake output showing some drops or marks and backlog would make me very happy after waiting all this time for mikrotik to catch up. Hi Dave, fancy seeing you around here! Thanks for your work in ridding the world of bufferbloat! Unfortunately, Mikrot...
Dave, Thank you for the work in making CAKE even exist. It will finally give me a DSCP aware queuing discipline on Mikrotik. Now I just need them to use a modern implementation of CAKE where the Lower Effort DSCP mark is LE(000001) instead of CS1(001000). And if they could please add Queue Type sel...
What's the correlation between the date a user joined a forum and the actual content of it... We all have our needs and what feels important to us, and when we focus on what we love individually, we tend to see more of it and that's natural. So you dismiss it as unimportant in a disdainful manner. Why...
I do also keep hoping for fq_codel or cake in mikrotik. However, it's not an "or" choice so much, but an informed one. Wifi: fq_codel for 3 wifi chipsets (mt76, ath9k, and ath10k) has existed now for a couple years ( https://lwn.net/Articles/705884/ ) however a key feature for the ath10k...
Have you got fq_codel working yet in 7.0? I have done fq_codel backports (where needed) for several manufacturers at this point, and can help out - I also do things like BQL - and I would really like to get some performance numbers out of it + sqm-scripts on your higher end hardware, when your next ...
To clear up some stuff - fq_codel became stable in linux 3.6, and was backported by qualcomm-atheros folk as far back as linux 2.6.32 as part of the compat-wireless patchset. If you are using that patchset on an older kernel, you already have fq_codel. Most of the core de-bufferbloating work entered...
A couple notes: 1) codel and fq_codel are not words for the same thing. codel is a drop strategy that keeps queue lengths shorter and overall latency lower. fq_codel combines drr-style packet scheduling with a few twists to give sparser flows (think dns, voip, and gaming packets) priority in the que...