Filter rules performance and ordering strategy ?

This thread is a “spin-off” of my thread about “how are counted filter rules on Mikrotik’s products’ test result pages”.
It will focus more on filter rules definition and ordering performance.


I could not find any information about how to “rank” rules by performance (besides the obvious L7 matcher). Question 1: Does anyone have a link to such a ranking ?

For instance, I’m wondering the following from a performance point of view, maybe you could share your experience:

  • Overall, my rule set lists 1) input chain rules, 2) output chain rules, and finally 3) forward chain rules → Question 2: Is this order important (i.e. would it change anything if I put the input and output rules after the forward ones)?
  • For readability, I usually have 1 port per rule for a target device/interface (except in some cases where grouping ports makes sense), i.e. for N ports I have N rules. Question 3: Does this have a real impact on performance versus a single rule with “port1, port2, port3, … portN” (assuming the src or destination is not rechecked each time)?
  • I use “jump” rules based on “in interfaces” (i.e. if in. interface = “interface name” then jump to chain). Question 4: Is there a performance difference between doing it this way and matching on src IP addresses with a mask?
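
To make Questions 3 and 4 concrete, here is a minimal sketch of the variants being compared. The chain name `fwd-ops`, the interface name `vlan-ops`, the addresses and the ports are hypothetical placeholders, not taken from a real config.

```routeros
/ip firewall filter
# Question 3, variant A: one rule per port (N ports, N rules)
add chain=fwd-ops protocol=tcp dst-address=192.0.2.10 dst-port=80 action=accept
add chain=fwd-ops protocol=tcp dst-address=192.0.2.10 dst-port=443 action=accept
add chain=fwd-ops protocol=tcp dst-address=192.0.2.10 dst-port=8080 action=accept
# Question 3, variant B: a single rule with a comma-separated port list
add chain=fwd-ops protocol=tcp dst-address=192.0.2.10 dst-port=80,443,8080 action=accept
# Question 4, variant A: jump based on the in. interface
add chain=forward in-interface=vlan-ops action=jump jump-target=fwd-ops
# Question 4, variant B: jump based on the source subnet
add chain=forward src-address=192.168.10.0/24 action=jump jump-target=fwd-ops
```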

Further here is my global “rules ordering strategy” (for the forward chain):

  • fasttrack connection
  • allow established, related, untracked
  • drop invalid
  • if protocol = icmp then jump to ICMP chain (then I do not check again for the protocol in the following rules and last rule of the chain is a drop)
  • if protocol = udp and port 123 allow traffic (ntp)
  • for each in. interface <vlan_interface> jump to specific “vlan_name” chain (Then in each chain I do not check the input interface anymore, and only have rules checking the protocol/dst port/out interface, and finish the chain with a drop)
  • drop all the rest
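
As an illustration only, the ordering above could be written roughly like this. The chain names (`icmp-fwd`, `fwd-vlan10`, `fwd-vlan20`) and interface names (`vlan10`, `vlan20`) are assumptions made up for the sketch, and the log prefix is arbitrary.

```routeros
/ip firewall filter
add chain=forward action=fasttrack-connection connection-state=established,related
add chain=forward action=accept connection-state=established,related,untracked
add chain=forward action=drop connection-state=invalid
# ICMP gets its own chain, so later rules never see ICMP
add chain=forward action=jump jump-target=icmp-fwd protocol=icmp
# NTP
add chain=forward action=accept protocol=udp dst-port=123
# one jump per input VLAN, busiest VLAN first
add chain=forward action=jump jump-target=fwd-vlan10 in-interface=vlan10
add chain=forward action=jump jump-target=fwd-vlan20 in-interface=vlan20
# drop all the rest, logging only here
add chain=forward action=drop log=yes log-prefix="fwd-drop"
```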

Logging happens only at “drop” time (I have very few “filter related” log entries per day). The maximum number of filter rules (including the rules before the jump and the rules for the specific input VLAN) is 19. Around ninety percent of the traffic is handled by the first 2-3 rules of each VLAN chain (i.e. legit traffic is accepted by rule number 7 to 9, except in some chains which may need a few more checks), and the VLAN chains are ordered so that the ones with the most traffic are at the top of the list. I also have “sub chains” for rules shared by multiple VLANs (for instance for printers, or for access to some shared services).

Question 5: Does this strategy sound “ok” to you ?

I know that one improvement would be to replace the “drop” at the end of each VLAN chain with a “return” and to move the jump to the ICMP chain and the ntp rule to just before the “drop all the rest” rule. This would “save” two matches for most of the traffic that is neither fast-tracked nor ICMP, but I would lose readability. This is why I’m not doing it.
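
For comparison, a sketch of that “return” variant, again with made-up chain and interface names (`fwd-vlan10`, `icmp-fwd`, `vlan10`, `vlan20`) and a single example accept rule standing in for the real VLAN rules:

```routeros
/ip firewall filter
add chain=forward action=jump jump-target=fwd-vlan10 in-interface=vlan10
# inside the VLAN chain: accept rules first, then hand back instead of dropping
add chain=fwd-vlan10 action=accept protocol=tcp dst-port=443 out-interface=vlan20
add chain=fwd-vlan10 action=return
# reached only by traffic that no VLAN chain accepted
add chain=forward action=jump jump-target=icmp-fwd protocol=icmp
add chain=forward action=accept protocol=udp dst-port=123
add chain=forward action=drop
```

The trade-off is the one described above: the common case skips the ICMP and NTP checks, but a VLAN chain no longer tells the whole story on its own.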

Question 6: Would it be worth it to move these rules to the end in your experience/opinion ?

Thank you in advance !

Yes, it is always advisable to put seldom-evaluated rules in separate chains and branch to them with a jump rule.
Also, when you need content matching (usually you do not, and if you think you do, you probably need to think again), it is not advisable to do it in filter rules. Use mangle rules instead: match on L7 only for traffic without a connection mark, and have that L7 rule apply a connection mark.
Then you can use the connection mark to handle these connections (block them, assign a priority, assign a packet mark to use in a queue, whatever).
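
A minimal sketch of that pattern, assuming a hypothetical L7 matcher named `l7-example` with a placeholder regexp and a hypothetical mark name `l7-hit`:

```routeros
/ip firewall layer7-protocol
add name=l7-example regexp="example-pattern"
/ip firewall mangle
# L7 is evaluated only for connections that carry no mark yet
add chain=prerouting connection-mark=no-mark layer7-protocol=l7-example action=mark-connection new-connection-mark=l7-hit passthrough=yes
/ip firewall filter
# the filter rule matches the connection mark, not the content
add chain=forward connection-mark=l7-hit action=drop
```

Dropping is just one possible use of the mark; the same mark could feed a priority or a queue instead.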

It does not matter in which order the chains show up in the list. What matters, and is best practice, is that each chain is ordered within itself.
Thus one should never see a forward chain rule mixed in with input chain rules, etc.

This is problematic. One does not deal with ICMP and UDP 123 (NTP) rules in the forward chain; perhaps you meant the INPUT chain?

Furthermore, do not waste your time with a gazillion ICMP rules and a jump; it is overly complex with no gain. Simply accept ICMP in the input chain and move on to more important things.

KISS → Defaults + What you need to allow + drop all.
https://forum.mikrotik.com/viewtopic.php?t=180838

@Anav, this is an excellent opportunity for you to write another new “best practice” with explanations. :-)


I may have misunderstood your point, but when it comes to, for example, input/forward, it is rather a WinBox GUI problem (http://forum.mikrotik.com/t/basic-question-about-firewall-rule-organization-and-grouping-by-chains/146338/1). This typically confuses new users when they build their first config.


Agreed, XOXOXO :wink:

My 2 cents: most rules (apart from L7 rules) probably take a more or less similar amount of resources to process. So the ordering of rules (within a chain) should follow these guidelines:

  • rules affecting more packets should be higher in the chain
  • more specific rules should be higher than more general rules with opposite action.
    Example would be: “accept UDP packets with dst-port=123” higher than “drop UDP packets”

The exact priority of rules above is not cast in concrete, the resulting rule order needs to deliver correct behaviour first and good performance comes second.
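
The example from the second guideline, written out as a sketch (the forward chain is only an assumption here; the same holds in any chain):

```routeros
/ip firewall filter
# the specific accept comes first ...
add chain=forward protocol=udp dst-port=123 action=accept
# ... then the more general rule with the opposite action
add chain=forward protocol=udp action=drop
```

With the two rules swapped, NTP would never reach the accept rule, whatever the packet counters say.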

@anav thank you for your answer and time. But…


Problematic to have ICMP and NTP rules in the forward chain ? Uh… why ?
I have intervlan traffic, and need some ICMP traffic to be allowed between some VLANs which are part of an interface list.
The NTP server is not on the router, and devices in VLANs need to contact it (and the router as well, that’s why there is an output rule as well).
Can you please tell me how to achieve this “with input rules only” to make it less “problematic” ?


Gazillion ? Waste your time ? Overly complex ? No gain ? Simply ? Move on ? More important things ?
Nice combo sentence… I’m happy to read that you know better than I do what should be “more important” to me :smiley:

That being said, I definitely won’t keep a flat, 100-rule-long forward chain. I prefer to have 12 jumps and chains composed of 3 to 13 rules each.
Now you can prefer coarse-grained filter rules, I prefer fine-grained ones. But that’s a choice.
If you want some “gains” of this choice: accept fast, fast chain evaluation ending, increased readability, increased understandability of what is allowed and happening on the network, increased maintainability, easier detection of suspicious events to name a few.

Completely agree on this; this is what I’m doing. I would add that jumps to the most-used sub-chains should be as high as possible in the chain, and that what is checked in the “jump” rule should not be re-checked in the sub-chain.

The question was more about the computing complexity of each rule.
For instance : which one is faster to match: address or interface ? address list or interface list ?
Or as I wrote above: 4 rules that accept traffic to 4 different tcp ports or a single rule which accepts tcp traffic to all 4 ports (separated by a “,”)

Does anyone have figures showing “when it stops being negligible” ?

The impact and scalability of FW rules depend on device capabilities such as the number and speed of CPU cores, RAM size and L3 HW offload in the switch chip.
So it is hard to come up with numbers among different MT devices.

For RB4011/5009 in my experience routing performance is not impacted in a relevant way up to ca. 100 “normal” FW rules. I never used L7 rules so far.
But your mileage may vary.

And yes, it would be nice if Mikrotik would specify those “25 rules” used for the “official” measurements.

I agree that overall raw performance will depend on the device, its resources and how they are used.

I was wondering if anyone had access to “official” statements such as (I’m completely inventing the case) “address matching is faster than interface name matching and consumes less memory” or “if it makes sense, regroup ports to be checked in a single rule as port field is checked using regexp matching which is less CPU intensive and memory consuming than having separate rules.”

This kind of statement could help better write rules.


100 total for input + output + forward chain ?
What happens above those 100 rules ? Is the performance drop linear or exponential ? (I’m asking because I’m interested in the 5009 once PIM-SM is supported)

100 total for input + output + forward chain ?

Yes, 100 “non raw” rules in total.

What happens above those 100 rules ? Is the performance drop linear or exponential ? (I’m asking because I’m interested in the 5009 once PIM-SM is supported)

For our use cases, it is good enough if RB5009 forwarding can max out the 1 Gbit/s uplinks. This works with 100 rules.
Some not-so-scientific lab experiments showed 1.5 - 2 Gbit/s for the RB5009 with NAT and 100 rules, and 3-4 Gbit/s with NAT and 25 rules.
L2 bridging with VLAN filtering has full HW support on the RB5009 and runs at wire speed with no CPU impact.

Regarding rule order:
As rule processing stops at the first matching rule, it is advisable to put the rules that match with higher probability first. Ideally, the rule packet counters descend as the rule number increases, as far as that is possible while still achieving the desired logic.
On routers, I usually place the accept/forward rules for “established, related” on top of the input/forward rules. They match by far the biggest amount of packets.
Everything possible with RAW rules (like dropping IP spoofs on input) should be done with RAW rules so packets are dropped early before connection tracking starts.
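
A sketch of such a RAW rule, assuming an interface list named `WAN` exists and that the LAN uses 192.168.0.0/16 (both are assumptions for the example):

```routeros
/ip firewall raw
# drop packets that arrive from the internet side but claim a LAN source address;
# this happens in prerouting, before connection tracking sees the packet
add chain=prerouting in-interface-list=WAN src-address=192.168.0.0/16 action=drop
```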

Mixing order of input/forward/output rules does not impact performance, as input packets are only matched against input rules, forward packets only against forward rules etc.
But keeping the input/forward/output rules grouped in order obviously increases the maintainability of the rule set.

No worries, Kraal, now that I know you have a LAN based NTP server on the input chain. That makes more sense for other LAN users to be able to access or not the NTP server.

Also, I have never encountered a config where someone felt it necessary to overcontrol ICMP on the forward chain. I think it’s bogus on your part, but as you can tell I am a sceptical sort LOL. Most of the time people don’t understand that even though VLANs are blocked from each other, one can still ping the router’s IP address in any subnet as a normal function. If you want to block the ability to do that, there was a way, but I don’t remember it, as it seemed pointless to me.

I would be curious to see how you have structured your VLAN rules, as I typically don’t have that many rules or as complex a setup; simply grouping VLANs in smart interface lists allows me to minimize the number of rules for the desired functionality.

I am aware of using jump for something close to what you stated, in this case dst-nat rules where one has a fixed WAN IP and many associated rules. Here it makes sense: if the fixed IP changes, one only changes one line of the config!!
/ip firewall nat
add chain=dstnat dst-address=1.2.3.4 action=jump jump-target=port-forward
add chain=port-forward protocol=tcp dst-port=443 action=dst-nat to-addresses=192.168.1.10
add chain=port-forward protocol=tcp dst-port=3389 action=dst-nat to-addresses=192.168.10.33
...
add chain=port-forward protocol=tcp dst-port=25 action=dst-nat to-addresses=192.168.20.13

https://forum.mikrotik.com/viewtopic.php?t=180838

No, no - I mean a new one that’s more detailed! ; -) “KISS”

What’s missing LOL… It’s all there…
Oh I forgot, do not smash the MT device with a bat when frustrated?

So you have no evidence that using some weird matching scheme from which you build a number of chains is better than using a single flat rule list?
Or that processing speed is even affected by the number of chains?
And you did all that just to watch packets flow through your nicely built chains?

The processing of chains is sequential until a rule accepts or rejects the traffic. So when you have 100 rules that each check a combination of match criteria (input interface, protocol, port number), it is more efficient to move rules that share part of their criteria, and that most of the traffic does not match, into separate chains.
E.g. when you really want to split hairs on ICMP traffic (what types/codes you want to allow or not) it is better to first match all ICMP with a rule that jumps to a separate chain for ICMP handling and in that chain do all that checking. Most traffic will not be ICMP and it will not take that branch, and will not have to process all those rules.
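
As a sketch of that layout (the chain name `icmp-fwd` and the selection of types/codes are only an example, not a recommendation):

```routeros
/ip firewall filter
# non-ICMP traffic is checked against this one rule only
add chain=forward protocol=icmp action=jump jump-target=icmp-fwd
# all the hair-splitting happens in the separate chain
add chain=icmp-fwd protocol=icmp icmp-options=8:0 action=accept comment="echo request"
add chain=icmp-fwd protocol=icmp icmp-options=0:0 action=accept comment="echo reply"
add chain=icmp-fwd protocol=icmp icmp-options=3:0-1 action=accept comment="net/host unreachable"
add chain=icmp-fwd protocol=icmp icmp-options=11:0 action=accept comment="time exceeded"
add chain=icmp-fwd action=drop
```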

Similarly it is often good to separate chains for clarity, e.g. I often match on an interface(list) matching traffic coming from internet and send it off to a separate chain like input-inet. This can also improve performance when there is a lot of traffic from several other interfaces.
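
For example, assuming an interface list named `WAN` and a hypothetical service on UDP port 51820 that should be reachable from the internet:

```routeros
/ip firewall filter
# everything arriving from the internet side is handled in its own chain
add chain=input in-interface-list=WAN action=jump jump-target=input-inet
add chain=input-inet protocol=udp dst-port=51820 action=accept comment="hypothetical service"
add chain=input-inet action=drop
```

Traffic from the other interfaces then passes a single jump rule instead of every internet-facing rule.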

But, besides clarity, is there any proven performance improvement?

I explained above how it works. When you have 20 rules that each check a different variant of ICMP and you replace them with a jump to a separate chain, non-ICMP traffic is checked against one rule instead of 20, so that part will perform a factor of ~20 better.
Of course that will not make your router 20 times faster. It is only about the firewall part.

Yes, but 1) I don’t want to give anybody access to infrastructure devices, 2) I want to be able to use ping between some VLANs without having to access infrastructure devices, 3) I don’t want to go to the server rack and plug a laptop into the management VLAN access port each time I need to do a ping, 4) I haven’t had the time yet to implement dashboards that would give all users an insight into devices/services that may be down or the ping time to a machine, and 5) users are able and willing to use ping to check when needed.

As a result I had to add forward rules to allow some ICMP.
Again if there is any other way I’ll implement it willingly.
But if you want to call it bogus (as inter-VLAN ping shouldn’t be allowed from a security perspective) or laziness, feel free, I won’t be angry at you :slight_smile:


In fact, nothing fancy. As stated in the first post, there are some common rules, then jumps to vlan-specific rulesets triggered by a match on the in_interface. The jumps are ordered by priority/occurrence and so are rules within each ruleset.

For instance, the operating VLAN (the one with the most specific rules) has: allow traffic to WAN; allow https traffic to the reverse proxy server hosted on the shared services VLAN; allow tcp/udp for AFP/NFS to the NAS and allow git over ssh, all three deployed on the storage VLAN; jump to the dns chain to allow DNS traffic to the piHole server (tcp/udp + to tcp blocked pages), which is hosted on the shared services VLAN; jump to the printing chain to access network printers and scanners; 1 rule to silently drop unwanted broadcast traffic; and the final catch-all rule to drop anything else and log it… a total of 12 “specific” rules.

Another example: the management VLAN has 16 specific rules, as there are specific needs (such as the ability to ssh to firewalls and the DMZ, access to the proxy for updates, specific protocols/ports for remote device management, and specific logging in order to easily identify issues without having them buried among other logs).

Some have only 2-3 rules, for instance the printing VLAN, which is only allowed to access the local proxy for updates and selected ports of devices on the network. Each ruleset has at least 1 rule for silently dropping some traffic (to avoid flooding the logs, as these packets are understood but are to be ignored).

Is it difficult to manage? Definitely not. Was it difficult to set up? Not in my opinion. It took some time to document the requirements for each VLAN, but this was an iterative process over a few days (not full time of course): allow all and log everything, then analyse each log entry, decide whether to allow or silently drop, and document it. Now, if there is a change, a new service, a new device, or if anything else new/unexpected happens, it gets logged and is easily detected. As it is documented, it can be reused easily.


Exactly

Yes… the concept is simple, and can be applied to all types of traffic.

Leaving aside established, related, untracked and invalid, which obviously must be processed before everything else and need no splitting,
first splitting traffic by type (ICMP/UDP/IPSEC-ESP/TCP/GRE/IPv6-encap/not redirected) avoids checking the packet against every firewall rule;
instead, it is processed “at least” once for each “redirect” (and “return”, if needed), but not by all the subsequent checks in the firewall.
The same applies to the RAW and NAT sections.
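
A rough sketch of such a split, with hypothetical chain names (`fwd-tcp`, `fwd-udp`, `fwd-icmp`) and placeholder ports:

```routeros
/ip firewall filter
# one "redirect" (jump) per traffic type
add chain=forward protocol=tcp action=jump jump-target=fwd-tcp
add chain=forward protocol=udp action=jump jump-target=fwd-udp
add chain=forward protocol=icmp action=jump jump-target=fwd-icmp
# TCP rules are only ever evaluated for TCP packets
# (protocol= is repeated because, as far as I know, RouterOS wants it on any rule that uses dst-port)
add chain=fwd-tcp protocol=tcp dst-port=22,443 action=accept
add chain=fwd-tcp action=return
# same idea for UDP
add chain=fwd-udp protocol=udp dst-port=53,123 action=accept
add chain=fwd-udp action=return
```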