> I want to use queues for traffic shaping so a single HTTP download doesn't starve more important traffic.

And that objective has been achieved, so maybe you should just leave it at this?
> There will be plenty of high-bandwidth TCP connections in real-world usage (lots of large file uploads, for example). If they can't use the full connection capacity, that's a bit disappointing.

But they can... only not in a single TCP session. When many users are trying to up- and download lots of things, it will work just fine.
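The "only not in a single TCP session" limit is usually down to per-flow hashing: to keep the packets of a connection in order, the router pins each flow to one core. A minimal sketch of the idea, where the hash, the tuple format, and the core count are illustrative assumptions rather than RouterOS internals:

```python
import hashlib

NUM_CORES = 9  # e.g. a CCR1009

def core_for_packet(src_ip, src_port, dst_ip, dst_port, proto="tcp"):
    """Pick a core by hashing the 5-tuple: every packet of one
    connection lands on the same core, preserving per-flow order."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}/{proto}".encode()
    return int.from_bytes(hashlib.md5(key).digest()[:4], "big") % NUM_CORES

# A single bulk download: the 5-tuple never changes, so every packet
# hits the same core -- throughput is capped at one core's capacity.
print(core_for_packet("10.0.0.5", 51234, "93.184.216.34", 80))

# Many parallel connections: source ports differ, so the flows spread
# out and the aggregate can use all cores.
cores = {core_for_packet("10.0.0.5", p, "93.184.216.34", 80)
         for p in range(51234, 51334)}
print(sorted(cores))  # typically all of 0..8
```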
> Always check the detailed load of the CCR in tools->profile by selecting CPU: all. When you get 10 or 20% CPU load on a CCR1009, it can mean that one or two cores are fully loaded and the others are almost idle. The CPU is still the bottleneck in that case, because it apparently is a single-threaded task.

As R1CH showed in his image, no single CPU is bottlenecking. The load is nicely distributed across all cores.
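A quick worked example of the quoted point, with invented numbers:

```python
# Invented per-core loads on a 9-core CCR1009: one core is pegged by a
# single-threaded task while the others are nearly idle.
per_core = [100, 2, 3, 1, 2, 2, 1, 3, 2]

overall = sum(per_core) / len(per_core)
print(f"overall load: {overall:.1f}%")    # ~12.9% -- looks harmless
print(f"busiest core: {max(per_core)}%")  # 100% -- the actual bottleneck
```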
> Well, that can still happen when the task is single-threaded and limited by CPU. The immediate performance is limited by the single CPU, but the actual CPU running the code is switched a few times per second, so you still see evenly loaded processors in the profiling.

If no single core goes over 60%, how exactly is it limited by CPU?
> The load figures you see in profiling are averages, and the CPU limits are instantaneous.

Describe the scenario where an average load of 60% would result in a throughput of 62% of max.
> Only a single processor can be active for the thread at one time; when the thread is regularly scheduled onto a different processor, the average load of each processor can be low even when the thread is CPU-bound.

If that's what is actually happening, I would consider that a pretty big flaw in the scheduler. It shouldn't take away the CPU of a 100% CPU-bound process when there are plenty of other CPUs available.
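A small simulation of the quoted explanation, assuming a fully CPU-bound, single-threaded task that the scheduler moves between cores a few times per profiling window (tick counts and migration rate are invented):

```python
import random

NUM_CORES = 9
TICKS = 9000         # scheduling ticks in one profiling window
MIGRATE_EVERY = 100  # move the thread to a random core every 100 ticks

busy = [0] * NUM_CORES
core = 0
for t in range(TICKS):
    if t % MIGRATE_EVERY == 0:
        core = random.randrange(NUM_CORES)
    busy[core] += 1  # the thread runs flat out, but only on this core

for c, ticks in enumerate(busy):
    print(f"core {c}: {100 * ticks / TICKS:.0f}% average load")

# Every core averages roughly 11%, yet at every instant the task was
# consuming 100% of exactly one core: throughput stays capped at one
# core's worth. Background traffic on the other cores would raise the
# averages (e.g. toward the 60% figure above) without lifting the cap.
```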
> Unsurprising, because TCP has a sequence number which of course has to be handled atomically.

I disagree. While it might be implemented like this in RB devices, intermediate devices (e.g. routers) are not required to process TCP packets in sequence. Every receiver's TCP stack has to implement an out-of-order delivery mechanism (incidentally, this mechanism is also used when doing retransmits).
That is true, but the connection tracking has to implement a sliding sequence number to be able to reject segments with a bad sequence number as "invalid". So while the segments themselves may arrive out of sequence and should not be queued (some segments may not be seen at all, e.g. when load balancing is in use), there still should be handling of the acked sequence numbers.
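A simplified sketch of that sliding-window idea: segments are never queued or reordered, only checked against a window, and anything outside it is flagged invalid. This is a toy model, not MikroTik's or Linux conntrack's actual code:

```python
class TcpWindowTracker:
    """Sliding-window check for one direction of a TCP flow, loosely in
    the spirit of connection tracking: segments are never reordered,
    only validated (sequence-number wraparound ignored for brevity)."""

    def __init__(self, isn, window):
        self.max_end = isn    # highest sequence number seen so far
        self.window = window  # receiver's advertised window

    def check(self, seq, length):
        end = seq + length
        # Out-of-order arrival is fine, as long as the segment stays
        # within one window of what we have already seen.
        if end < self.max_end - self.window or seq > self.max_end + self.window:
            return "invalid"
        self.max_end = max(self.max_end, end)
        return "ok"

tr = TcpWindowTracker(isn=1000, window=65535)
print(tr.check(1000, 1460))    # ok: in order
print(tr.check(3920, 1460))    # ok: arrived ahead of the gap
print(tr.check(2460, 1460))    # ok: fills the gap, out of order
print(tr.check(900000, 1460))  # invalid: far outside the window
```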
I can't think of a way NAT would not sit on top of load balancing. E.g. how could NAT work if not all packets passed through a single NAT instance (the physical link on either side might be load-balanced)? Or am I misunderstanding what you wrote about packets not being seen when load balancing is in use?
> That's good to hear it is reproducible. I will contact Mikrotik support and hope for an explanation.

We are having a similar problem with queueing of 1Gbps of MPLS traffic. In our case it isn't single-stream performance that we are hitting, but the total MPLS traffic across the interface, and it exactly matches your 700Mbps figure. I suspect that the router is treating the bulk MPLS traffic (which just has labels identifying the packets) similarly to how it would treat a single TCP stream, resulting in similar bottlenecks.
Hi @mducharme, we have experienced the same behavior with ARM devices. On what devices have you experienced this behavior? Have you found any way to solve it?
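One way to picture the suspicion in the MPLS posts above, assuming the router spreads load by hashing whatever flow identifiers it can see (purely illustrative, not MikroTik's actual implementation): packets identified only by an MPLS label all share one hash key, and therefore one core, just like a single TCP stream.

```python
import hashlib

NUM_CORES = 9

def core_for(key: str) -> int:
    """Toy load-balancing hash (same idea as the earlier sketch)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:4], "big") % NUM_CORES

# Plain TCP: 5-tuples differ per connection, so flows spread out.
tcp = {core_for(f"10.0.0.{h}:{40000 + h}->192.0.2.1:443") for h in range(50)}
print("TCP flows land on cores:", sorted(tcp))

# MPLS: if the balancer only sees the label, every packet of the LSP
# shares one key, so the whole 1Gbps behaves like a single flow.
mpls = {core_for("mpls-label:30") for _ in range(50)}
print("MPLS packets land on cores:", sorted(mpls))
```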