Severe Performance Drop RB3011

I have a RB3011 in a location with Cable Internet. We just had the location upgraded from 300mbit download and 20mbit upload to 400mbit download and 20mbit upload. Only the download speed was upgraded.

Over recent weeks I have done periodic speed tests and hit about 120mbit over WiFi. In past times I’ve been able to hit the full 300mbit and actually beyond up to about 350mbit. For some reason speeds haven’t quite been as great but I attributed this to just WiFi issues.

Today when we got the speed upgraded, we did some tests and I had hard wired my laptop to the router and only got an average of about 215mbit, 250mbit maybe at best.

I did a test by going directly to the modem and got full speed, in fact, hitting up to 480mbit down. Quite a difference!

While doing some tests, I disabled some firewall rules, which improved things somewhat, but not by much.

In further tests I disabled all my mangle rules and, what do you know, I was able to hit the full 400+mbit behind the 3011.

Simply enabling one mangle rule, any rule, reduced the speed test to an average of about 215mbit.

The 3011 is well capable of beyond 200mbit. What is making the 3011 only perform at this lower speed?

The mangle rules are simply marking packets for our queue tree and QoS.

Why is this having such a huge impact on our speed?

While performing a speed test I’ve checked CPU Usage and it’s hardly ever at 30%. So much more power available.

The 3011 is on the latest software and firmware currently available.

We have about 5 VLANs with DHCP servers for each. Some firewall rules to drop packets between certain VLANs.

Other simple firewall rules to allow IPsec/L2TP.

Some other NAT rules for our PBX and web server.

Nothing super complicated.

Any help is appreciated.

What kind of speed test is it? If it is TCP based, how many parallel streams/connections does it use? If only one, that might be the issue, as such a test is strongly affected by latency. Adding a mangle rule is going to introduce a slight delay, as the packet must be processed in another block of code.

Just to make sure - do you have fasttrack rule set up?

If you share your /export hide-sensitive it might be easier to spot any possible issue straight away. Feel free to use find&replace in your favourite text editor to systematically replace all occurrences of each public IP address potentially identifying you by a distinctive pattern such as my.public.ip.1. (credit for this sentence goes to sindy)

The test is SpeedTest.net. I’m not sure if they are UDP or TCP.

Regardless of this though, the latency doesn’t change when I enable/disable the mangle rules (we’re only talking about less than 10 of them in total).

Latency remains very good at about 9-14ms depending on the test.

Overall throughput changes big time.

With rules disabled I get all we are allowed to get at about 480mbit down.

With even just one mangle rule enabled a huge drop to about 215mbit.

Both times latency is excellent and seemingly unaffected.

I disabled fasttrack in IP → Settings last night when I was trying some stuff, but it made no difference. I don’t have any fasttrack rules either, though.

I’m not at my computer right now but I will see what I can do about posting the config.

They are multi-stream TCP, testing the download direction first using four streams and then the upload one using another four streams.

This is due to one type of optimization - if there are no rules at all in the firewall, the firewall processing is skipped completely.

Fasttracking is another kind of optimization, where you skip most of the firewall processing in a controlled way for most mid-connection packets, so only the packets establishing the connection and every n-th mid-connection packet are handled by all stages of the firewall. Without fasttracking, the CPU may be insufficient to handle the traffic, depending on the RB model. The bad news is that fasttracking is incompatible with mangling (and IPsec policy matching). The reverse does not hold, though: setting up a mangle rule does not disable fasttracking for all traffic (which is good); you just get unexpected behaviour if you use both without taking additional measures. The correct way to make the two coexist is described here; if the /ip route rules, with their limited number of match conditions, are sufficient to cover your policy routing needs, you can use them instead of mangle rules and fasttracking will still work without any extra measures.
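As a rough illustration of making the two coexist, one commonly used pattern is to fasttrack only the connections that carry no connection mark, so that marked connections keep passing through mangle and the queues. This is a minimal sketch under that assumption and is not necessarily what the linked description proposes; it also assumes the traffic to be shaped is identified by connection marks set elsewhere in mangle.

```
/ip firewall filter
# Fasttrack only established/related connections that have no connection mark.
# Connections marked in mangle are skipped here and stay on the slow path.
add chain=forward action=fasttrack-connection connection-state=established,related connection-mark=no-mark comment="fasttrack unmarked connections only"
# Packets that were not fasttracked still need a regular accept rule.
add chain=forward action=accept connection-state=established,related
```

With this arrangement, only the marked (shaped) share of the traffic costs full firewall and queue processing; everything else takes the fasttrack shortcut.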

Thank you Sindy. I will take a look at that and do some testing.

I’m still not sure though why the performance is dropping so badly. CPU usage is not above 30% during that test. Even if the CPU was processing all those packets, there is plenty of processing power available. I would expect slow downs if the CPU usage was at 100%, right? Please correct me if my understanding of this is incorrect.
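One thing that may be worth checking here: the overall CPU figure is an average across cores, so a per-core and per-process view can show more than the single percentage does. A sketch of the commands one might run while the speed test is in progress:

```
# Per-core load
/system resource cpu print

# Per-process breakdown (firewall, networking, queuing, ...) for every core
/tool profile cpu=all
```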

At the end of the day, I am trying to use the mangle to mark packets from specific groups of nodes within the network to prioritize and shape bandwidth.

My understanding to do this correctly is:

Add mangle rule to mark connections FROM IP (Upload traffic), passthrough = yes
Add mangle rule to mark packets matching connection above, passthrough = no

Add mangle rule to mark connections TO IP (Download traffic), passthrough = yes
Add mangle rule to mark packets matching connection above, passthrough = no

Then going to the queue tree, lets say we’re starting from scratch, we add:

Global Upload queue
Global Download queue

Then we add:

Upload queue for marked packets from above (Of course setting any max limits etc) to the Global Upload queue
Download queue for marked packets from above (Of course setting any max limits etc) to the Global Download queue
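The steps above could be sketched roughly as follows. All names here are hypothetical placeholders (the address list Shaped-Hosts, the WAN interface ether1-WAN, the mark names and the limits), and this variant marks the connection once and then splits packets by direction using the inbound interface, which is an assumption rather than the exact rule set described above.

```
/ip firewall mangle
# Mark new connections opened by the hosts to be shaped.
add chain=prerouting action=mark-connection connection-state=new src-address-list=Shaped-Hosts new-connection-mark=shaped-conn passthrough=yes
# Packets of those connections arriving from the WAN are download traffic.
add chain=prerouting action=mark-packet connection-mark=shaped-conn in-interface=ether1-WAN new-packet-mark=shaped-down passthrough=no
# Packets of those connections arriving from anywhere else are upload traffic.
add chain=prerouting action=mark-packet connection-mark=shaped-conn in-interface=!ether1-WAN new-packet-mark=shaped-up passthrough=no

/queue tree
# Parent queues for each direction (limits are placeholders).
add name="Global Upload" parent=global max-limit=20M
add name="Global Download" parent=global max-limit=400M
# Leaf queues matching the packet marks set above.
add name="Shaped Upload" parent="Global Upload" packet-mark=shaped-up max-limit=10M priority=4
add name="Shaped Download" parent="Global Download" packet-mark=shaped-down max-limit=100M priority=4
```

Note that in this sketch only connections initiated by the listed hosts get marked; inbound connections to them would need an additional mark-connection rule matching on the destination address list.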

Essentially this method works great. This way I can assign a priority accordingly as well as any bandwidth limitations.

In the same manner for our public WiFi which is on a separate VLAN, I mark connections from the public IP range, say 172.31.0.0/24, for example. Then mark their packets from that connection.

The queue is similar as described earlier but I now have the bandwidth throttle for the entire group to have a max and set it to a PCQ type of queue.

What I have described here is all I am trying to accomplish but unfortunately lose half our available bandwidth. So I’m trying to figure out why. There is nothing I can think of that should be affecting bandwidth this much, right? Or are the 3011’s just not as good as I think they are?

I am considering a complete factory reset and starting over. It seems that something is just not right here. I appreciate your help in trying to dig in to this.

Sadly, queueing is incompatible with fasttracking as well, for obvious reasons. So if you need to shape the traffic, you have to pour in more horsepower, which means replacing the 3011 with something that can handle the requirements.

What would you suggest?

Going off of MikroTik’s test results I figured the 3011 would be more than sufficient:

Their test in routing with 25 ip filter rules shows a result of 2,453.1 Mbps. Granted, this is not with queuing. But is queuing really this hard on the available horsepower?

Would Simple Queuing be less intensive? I don’t like it though, since I can’t use address lists. But if that’s the solution then I’d have to make it work.

The new RB4011, only about $200.

I’ve had my eye on the 4011, sadly it’s not quite available just yet.

For test results that approximate real world performance, look at 25 ip filter rules with 512 byte packet size (not 1518 byte packet size). With those test results, the 3011 is only capable of 836.0 Mbps with essentially a default config plus maybe a few more rules. Adding more rules etc are going to drop that rate from the 836 Mbps figure even lower.

But his problem is getting only 120 Mbps, about one eighth of the 512 byte / 25 ip rules scenario. I know one can always use something CPU heavy and get this result. But it seems… excessive in this case. Don’t you think?

That is exactly my point. Adding one simple mangle rule throws everything off.

I’ve had better results using an older 450G! Although I did use Simple Queues back then.

Something just isn’t right but I can’t put my finger on it. CPU usage never really goes above 30%.

I think I’m just going to have to do a factory reset and start over.

I have a 3011 and do not have this issue. Can you export your mangle rule? I can see if I can test it.

A single mangle rule may just activate some heavy processing which is not done without that rule, so yes, without seeing the actual configuration and pinpointing that rule it is hard to guess.

Plus if it is as you describe, there is no guarantee that the 4011 will not fall into the same trap, except that it would probably perform better in absolute figures. It uses the same architecture. And so does e.g. hAP ac², so it would be easy to test e.g. for me if I had a fat enough pipe to the net.

Sorry for the delay with this guys.

So here is what I have:

/ip firewall address-list
add address=172.31.0.0/24 list=Public-WiFI
add address=172.31.1.0/24 list=Public-WiFI

The Mangle Rules:

/ip firewall mangle
add action=mark-packet chain=prerouting new-packet-mark="Public-WiFi Upload" passthrough=no src-address-list=Public-WiFI
add action=mark-packet chain=prerouting dst-address-list=Public-WiFI new-packet-mark="Public-WiFi Download" passthrough=no

Essentially the first rule is marking packets coming from IP addresses on our Public WiFi network, and being marked as Upload Packets, and the second rule is marking packets going to those Public WiFi network IP addresses and being marked as Download Packets.

From here under the Queue Tree I have a Global Download and a Global Upload queue with sub items

/queue tree
add name="Public WiFi Upload" parent="Global Upload" max-limit=150M packet-mark="Public-WiFi Upload" queue=pcq-upload-default
add name="Public WiFi Download" parent="Global Download" max-limit=5M packet-mark="Public-WiFi Download" queue=pcq-download-default

So this essentially shares a 150mbit bandwidth pool for download speed for our 2 Public WiFi subnets and essentially shares a 5mbit bandwidth pool for upload speed for our 2 Public WiFi subnets.

Very simple.

At the end of the day, activating just one of these mangle rules from above takes a very big hit on our overall throughput. Running a test from a server which has no restrictions on the router and should get the full 480mbit we can hit, only gets about 200 - 215 mbit, and CPU usage never hitting above 30% during this test. With or without the queue tree options enabled.

Creating a random mangle rule with dummy information that would never match an actual packet produces the same result. So clearly something is wrong here, and again I conclude that I will factory reset the router and start over, as something is terribly wrong. I never had this problem before and have had our mangle rules in place for months without a problem. I only identified an issue now that our pipe got upgraded and I wasn’t getting the expected results during our tests. I had noticed the WiFi not really performing as it should over recent weeks, but didn’t clearly identify the problem until now.

I am guessing that perhaps a recent RouterOS update may have caused a problem, but I don’t know.

Another note here: I have tried different mangle rules, first marking the new connections from and to these address lists and then marking the packets of those connections with the packet marks. It’s no different.

Well, the next test should be to leave the mangle rules in place but disable the queues.

But again, as you have posted only an excerpt from your configuration, it is hard to guess what all happens.

I can imagine that if you have no other firewall rules than these two, and you remove/disable these two, RouterOS says “hooray, no firewall rules, no need to push packets via firewall, so switching on fastpath” (I don’t know whether it takes care about existence of configuration items under /queue tree when taking this decision). But as soon as you add a single mangle rule, fastpath gets disabled, and as you cannot activate it using fasttrack (because you need mangling), you see the performance drop, and further performance drop may come from the queue handling as these mangle rules not only are present but also let at least part of the traffic be handled by queues.
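Whether fastpath or fasttrack is actually being used can be checked rather than guessed. A small sketch, assuming a RouterOS v6 build where the IP settings expose the fast path and fasttrack status fields and counters:

```
# Shows allow-fast-path plus the read-only status and counter fields
# (e.g. ipv4-fast-path-active, ipv4-fast-path-packets, ipv4-fasttrack-active)
/ip settings print
```

Comparing the packet counters before and after a speedtest, once with the mangle rules disabled and once with a single rule enabled, would show whether the traffic leaves the fast path when the rule is added.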

So if you enable the mangle rules but disable the queues, you may see yet another speedtest result.

EDIT: Verbiage updated. Thanks!

I’ve already tried that actually. Disabling the queues and enabling even just one mangle rule, bandwidth drops substantially. Disabling all mangle rules and leaving all the queues on, full bandwidth.

I will work to get the full configuration on here as I understand there could be an underlying issue besides the simple mangle rules, and obviously as something really strange is going on here.

What is the difference between the “leaving the queue tree enabled and simply disabling the mangle rules” case and the “disabling all mangle rules and leaving the queues on” case? To me they are the same, but you report a different impact on bandwidth.