Routerboard: "IP Firewall - VLAN - Bridges" real life performance stats to verify

Hi all,

I have done some performance tests on my (not so typical) home network and would like some comments/confirmation. Perhaps my config could/should be optimized, or it is not “the way ROS is supposed to be used”; please let me know as well.

The bottom line is that using the IP Firewall under Bridges is a huge performance hit, more so than I expected.

A bit of a wall of text, please bear with me.

  • I have an RB2011USA-2HnD with eth1 connected upstream (eventually reaching the Internet) and Netgear switches connected to eth2-5, mainly Netgear GS724T and GS110T models.


  • On the RB2011 I have 5 VLANs which propagate to all switches on eth2-5, configured like this:
    I have defined a bridge for each VLAN. Each physical port has a VLAN interface for each VLAN, and each VLAN interface is attached to the corresponding bridge.
    So eth2 has 5 VLAN interfaces, each attached to its own bridge. As these bridges represent the 5 “security” zones, I only need firewall rules to filter between bridges, since all traffic within a given bridge is at the same security level.

I’m using iperf3 -s on an iMac and iperf3 -c x.x.x.x -b 0 -V -f m -i 1 on a MacBook. I have tested different packet sizes ranging from 64 to 1480 bytes. Each test result is the average of 10 runs.

Packet sizes of 256 bytes and up yield similar results; smaller packets are significantly slower, which is to be expected, so the averages shown below are for 256 bytes and up.
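For anyone wanting to repeat the averaging step, here is a minimal Python sketch. It assumes each run was saved with iperf3's JSON output (the `-J` flag), which the commands above did not use, and that the receiver-side figure is the one of interest. The function name `average_mbits` is made up for this example.

```python
import json
from statistics import mean


def average_mbits(json_reports):
    """Average receiver-side throughput in Mbits/s over several iperf3 runs.

    json_reports: iterable of JSON strings, one per `iperf3 -J` run
    (an assumption; the tests in this post used the plain text output).
    """
    rates = []
    for report in json_reports:
        data = json.loads(report)
        # TCP runs report the receiver total under end.sum_received.
        bits_per_second = data["end"]["sum_received"]["bits_per_second"]
        rates.append(bits_per_second / 1e6)
    return mean(rates)


if __name__ == "__main__":
    # Two hand-written stand-in reports, not real measurements.
    fake_runs = [
        json.dumps({"end": {"sum_received": {"bits_per_second": 500e6}}}),
        json.dumps({"end": {"sum_received": {"bits_per_second": 540e6}}}),
    ]
    print(f"{average_mbits(fake_runs):.1f} Mbits/s")
```

Feeding it the 10 saved reports for one packet size would give a single average comparable to the figures listed below.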

The results
The first two tests are to see the maximum performance possible on the components:

  1. Baseline test: macbook - direct cable - imac : 982 Mbits/s
  2. Baseline switch test: macbook - cable - Netgear GS724T - cable - imac : 935 Mbits/s

The next three tests show the performance of my ROS config and the impact of different settings. I’m using an RB951 for these three tests; its ROS config is exactly the same as on the RB2011, but without all the other traffic crossing my network:
3) Baseline ROS config test (Bridge do not Use IP Firewall) : macbook - cable - RB951 - cable - imac : 529 Mbits/s
4) Baseline ROS config test (Bridge Use IP Firewall) : macbook - cable - RB951 - cable - imac : 243 Mbits/s
5) Baseline ROS config test (Bridge Use IP Firewall and macbook/imac in different VLANs) : macbook - cable - RB951 - cable - imac : 190 Mbits/s

That is quite an impact. For the next two tests, I connected the MacBook directly to the RB2011 on eth5:
6) Full test (Bridge do not Use IP Firewall) : macbook - cable - RB2011 - house wall cabling - Netgear GS724T - cable - imac : 453 Mbits/s
7) Full test (Bridge Use IP Firewall and macbook/imac in different VLANs): macbook - cable - RB2011 - house wall cabling - Netgear GS724T - cable - imac : 145 Mbits/s

The final test is done on my full home network under normal operations (including Internet inbound/outbound traffic, Bridge IP Firewall + use firewall for VLAN, cross-VLAN traffic, the house cabling, wall outlets, switches and the RB2011):
8) Full test (Bridge Use IP Firewall and macbook/imac in different VLANs): macbook - cable - Netgear GS110T - house wall cabling - RB2011 - house wall cabling - Netgear GS724T - cable - imac : 126 Mbits/s

The RB2011 CPU is between 20% and 55% during the testing. The performance hit from using the IP Firewall for bridges and VLANs is huge, but even without the IP Firewall on the bridges, the difference in throughput between two hosts in the same VLAN and two hosts in different VLANs is big too.

I understand there will be performance losses from filtering, CPU vs switch chipset, cabling and distances, but this is a bit more than I expected.
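To put the hit in percentages, here is a small Python sketch using only the averages reported above. The helper name `relative_drop` and the choice of which runs to pair are my own assumptions for illustration; each firewall run is compared with the matching "do not Use IP Firewall" run on the same device.

```python
def relative_drop(baseline_mbits, measured_mbits):
    """Return the throughput loss as a percentage of the baseline."""
    return (baseline_mbits - measured_mbits) / baseline_mbits * 100


# Averages reported in the tests above, in Mbits/s:
# (run without bridge IP firewall, run with it)
results = {
    "RB951, firewall on, same VLAN (test 4 vs 3)": (529, 243),
    "RB951, firewall on, different VLANs (test 5 vs 3)": (529, 190),
    "RB2011, firewall on, different VLANs (test 7 vs 6)": (453, 145),
}

for label, (baseline, measured) in results.items():
    drop = relative_drop(baseline, measured)
    print(f"{label}: {drop:.0f}% below the no-firewall run")
```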

  • Can anyone comment/confirm on what I’m seeing here?
  • Am I using/testing this wrong?
  • Would it make a difference if I replace the RB2011 with a CCR1009, for instance?

Hello

Yes, if you replace it with a CCR you will probably get wire speed. But it’s not the same budget. On the routerboard website you can see the throughput you can expect.
http://routerboard.com/RB2011UiAS-2HnD-IN (bottom of the page)
As you can see, it depends heavily on the packet size, and in the worst case you can go down to 30 Mbps. Compare with the CCR1009 page and you will notice the difference :slight_smile:

Now, if you want better performance from your RB2011, I suggest avoiding bridges and VLANs where possible and using the integrated switch chip instead. You can also “tune” your firewall rules and put the most used ones on top, so packets are not run through rules needlessly and waste CPU.

Remember that the RB951/RB2011 is very cheap compared to other brands (with the same functionality). They work very well for some uses, but if you want Gbps IP filtering, they are not the best choice (firewalling and bridging use the CPU, and the mipsbe CPU is limited).

Hope this helps

Good points, unfortunately I do need the vlans and firewalling but perhaps it is possible to do it without the bridges. Still it will use the CPU and that’s where I thought the CCR1009 would perhaps be more suitable.

I did read the comparison of the various routerboards and also the fine print underneath the tables :slight_smile: The performance drop-off in the tables is not as severe as in my real-world tests.

I would like to know if someone has done similar tests with similar results, or perhaps someone from Mikrotik can chime in to say if this amount of performance hit is “normal” for each of those tests.

Another thing I would like to know is whether there is a similar drop in performance (percentage-wise) when I re-run the same tests on a CCR1009. Basically, whether a more powerful CPU really means better performance, so that I don’t end up with results which are only slightly better.

Not sure how well optimized the software itself is, as opposed to the underlying hardware platform.

Well, to answer my own question :open_mouth:

I bought a CCR1009-8G-1S-1S+ to replace the RB2011 and re-ran tests 3 through 8. The results are .. well .. night and day.

I’ll share only test 8 as this is the one that really matters. The test is done in exactly the same manner as in the first post.

8) Full test (Bridge Use IP Firewall and macbook/imac in different VLANs): macbook - cable - Netgear GS110T - house wall cabling - CCR1009 - house wall cabling - Netgear GS724T - cable - imac : 628 Mbits/s

The 9-core CPU doesn’t break a sweat; it only goes up to 8-10% during the test.

That result still seems rather disappointing to me; there shouldn’t be any reason why you can’t get 900+ Mbps over a LAN. Granted, there will be some overhead, but a loss of almost 400 Mbps is rather poor if the CPU isn’t even close to being loaded.

What does tool / profile say?

My tests #1 and #2 show the performance for each component and I have retested all the components listed in test #8, they all score over 900 Mbits/s on their own.

I agree a 300-400 Mbits/s loss is a lot for house wall cabling; I’m not sure what else it can be. I’m already very happy with the increase in performance over the 126 Mbits/s :smiley:
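For reference, the size of that gap can be worked out from the numbers already posted in this thread. This is only arithmetic on the reported averages, written as a short Python sketch; the constant and function names are made up for the example.

```python
# Averages reported in this thread, in Mbits/s.
RB2011_TEST8 = 126
CCR1009_TEST8 = 628
BASELINES = {"direct cable (test 1)": 982, "single switch (test 2)": 935}


def gap_to_baseline(baseline_mbits, measured_mbits):
    """Absolute throughput still missing compared with a baseline run."""
    return baseline_mbits - measured_mbits


speedup = CCR1009_TEST8 / RB2011_TEST8
print(f"CCR1009 vs RB2011 on test 8: {speedup:.1f}x")

for name, baseline in BASELINES.items():
    gap = gap_to_baseline(baseline, CCR1009_TEST8)
    print(f"Gap to {name}: {gap} Mbits/s")
```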

The house wall cabling is CAT 5e, runs through walls and does cross power cable junctions. Although long, it is well within 100 meters. I’ll see if I can take the house wall cabling out of the equation and test with replacement cables.

On the CCR1009, try using Ethernet 6-8, which are connected directly to the CPU; ports 1-4 share a single 1G port to the CPU.

CCR1009 block diagram:

http://i.mt.lv/routerboard/files/CCR1009-140630151432.pdf

Good point, I didn’t think of the connection between the switch and the CPU. I’ll try it and let’s see what happens :slight_smile:

Update: Something else weird happened. A couple of weeks ago the performance dropped significantly on the CCR. I have not changed any configuration item except adding unrelated firewall rules for the VPNs and updating the firmware to the latest stable release (currently 6.33.5).

To troubleshoot, I first did a baseline test to see the maximum cable speed between the two MacBooks I was going to use.

BASELINE: connect two macbooks directly via thunderbolt ethernet

[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-10.00  sec  1.10 GBytes   941 Mbits/sec                  sender
[  4]   0.00-10.00  sec  1.10 GBytes   941 Mbits/sec                  receiver

Connected to CCR on switch chipset group 1 (vlan34) port3 to (vlan34) port4

[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-10.00  sec  23.3 MBytes  19.5 Mbits/sec                  sender
[  4]   0.00-10.00  sec  23.2 MBytes  19.5 Mbits/sec                  receiver

As you can see, 20 Mbits/s !!!

Connected to CCR on switch chipset group 1 (vlan34) port4 and directly to cpu (vlan34) port8

[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-10.00  sec  22.7 MBytes  19.1 Mbits/sec                  sender
[  4]   0.00-10.00  sec  22.6 MBytes  19.0 Mbits/sec                  receiver

Same result, so it does not matter whether I connect both machines to ports on the switch chipset or connect one to the switch chipset and one directly to the CPU and copy data towards the chipset.

Connected directly to cpu (vlan34) port6 to (vlan34) port8

[ ID] Interval           Transfer     Bandwidth
[  4]   0.00-10.00  sec  1.09 GBytes   938 Mbits/sec                  sender
[  4]   0.00-10.00  sec  1.09 GBytes   938 Mbits/sec                  receiver

Now this is different: if I do not use the switch chipset, the speed is what you would expect!
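If you want to compare such runs without reading the numbers by eye, a rough Python sketch like the one below can pull the sender/receiver rates out of the iperf3 summary lines pasted above and flag a run that falls far below the baseline. The function names and the 50% threshold are assumptions picked for illustration, not anything iperf3 or RouterOS defines.

```python
import re

# Matches iperf3 summary lines such as:
# [  4]   0.00-10.00  sec  23.3 MBytes  19.5 Mbits/sec      sender
SUMMARY_LINE = re.compile(
    r"\[\s*\d+\]\s+[\d.]+-[\d.]+\s+sec\s+[\d.]+\s+\w+\s+"
    r"(?P<rate>[\d.]+)\s+(?P<unit>[KMG])bits/sec\s+(?P<role>sender|receiver)"
)
TO_MBITS = {"K": 1e-3, "M": 1.0, "G": 1e3}


def summary_rates(text):
    """Return e.g. {'sender': 19.1, 'receiver': 19.0} in Mbits/s."""
    rates = {}
    for match in SUMMARY_LINE.finditer(text):
        rates[match["role"]] = float(match["rate"]) * TO_MBITS[match["unit"]]
    return rates


def is_degraded(rates, baseline_mbits, threshold=0.5):
    """True if the receiver rate is below `threshold` times the baseline.

    The default threshold of 0.5 is an arbitrary choice for this sketch.
    """
    return rates["receiver"] < threshold * baseline_mbits


if __name__ == "__main__":
    # Summary lines copied from the switch-chipset-to-CPU-port run above.
    switch_to_cpu = """\
[  4]   0.00-10.00  sec  22.7 MBytes  19.1 Mbits/sec                  sender
[  4]   0.00-10.00  sec  22.6 MBytes  19.0 Mbits/sec                  receiver
"""
    rates = summary_rates(switch_to_cpu)
    print(rates, "degraded:", is_degraded(rates, baseline_mbits=941))
```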

  • So, the question is: is the switch chipset broken on my CCR ?
  • What kind of other tests, logs, debugging can I do to find out ?

Keep in mind that it did work, see the fourth post above.

@chechito , you were right. After this issue with the switch chipset performance drop, I’ve now connected my access switches directly to the CPU on ports5-8. Throughput has gone up from about 630 to above 900 Mbits/s.