You mention VLANs but it seems they exist outside the 'Tik, right? So the 'Tik is a pure router, and there is nothing to save by offloading some bridge functionality to the switch chips.
Even though it has 10+1 gigabit Ethernet ports, the Mikrotik is a software router. As you use a lot of routing marks, I assume a significant share of the traffic cannot be fasttracked, so it cannot be handled with reduced CPU effort, which would explain why your CPU cannot handle more than 100 Mbit/s of this complex packet handling. Use /tool profile to check how busy your CPU cores are when the machine starts dropping packets.
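A minimal way to run that check from the terminal, assuming a multi-core device (cpu=all asks for a per-core breakdown rather than the total):

```
# show which subsystems (firewall, networking, ...) consume CPU, per core
/tool profile cpu=all
# overall per-core load, for comparison
/system resource cpu print
```

If the firewall-related entries dominate one or more cores while packets are being dropped, the rule set is the likely bottleneck.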
So if you can identify the type of traffic which occupies the most bandwidth and use fasttracking to handle that traffic, it may help a bit or a lot, depending on the percentage of the total traffic which could be fasttracked.
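A sketch of what that could look like, assuming the bulk traffic comes from sources that need no routing mark at all. The address list name no-policy-routing is made up for this illustration; fasttracked packets skip mangle, so only traffic that does not depend on mangle marks should match the first rule:

```
/ip firewall filter
# hypothetical list: sources whose traffic needs no routing mark
add chain=forward action=fasttrack-connection connection-state=established,related src-address-list=no-policy-routing
# packets that were not fasttracked still have to be accepted the normal way
add chain=forward action=accept connection-state=established,related
```

The fasttrack rule only matches one direction here; whether that is enough depends on how your forward chain is built, so treat it as a starting point rather than a finished rule set.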
The other thing is that your mark-routing rules are organized in a terribly inefficient way, as every single packet has to be matched against all the million and six rules because you have set passthrough=yes just to be able to mark a few of them at the end of the chain.
As the 'Tik is a software router, you have to think about the firewall as an algorithm (a program), because that’s what it actually is, and optimise it using the same methods programmers use to make searches efficient. RouterOS won’t do that for you.
Example: you have to map eight src-address-lists (A to H) to eight routing-marks (a to h), and then add a packet mark to some packets. The way you do it is linear:
set routing-mark a if src-address-list=A
...
set routing-mark h if src-address-list=H
set packet-mark x if packet matches some other condition orthogonal to those above
So each packet has to pass through all 9 of those rules, regardless of whether it actually matches already at the first one or only at the last one.
A faster way would be a binary tree:
if src-address-list=ABCD {                            <--A:1--H:1--
    if src-address-list=AB {                          <--A:2--
        if src-address-list=A {set routing-mark a}    <--A:3--
        else {set routing-mark b}
    }
    if src-address-list=CD {
        if src-address-list=C {set routing-mark c}
        else {set routing-mark d}
    }
}
if src-address-list=EFGH {                            <--H:2--
    if src-address-list=EF {                          <--H:3--
        if src-address-list=E {set routing-mark e}
        else {set routing-mark f}
    }
    if src-address-list=GH {                          <--H:4--
        if src-address-list=G {set routing-mark g}    <--H:5--
        else {set routing-mark h}                     <--H:6--
    }
}
if (packet matches some other condition orthogonal to those above) {set packet-mark x}    <--A:4--H:7--
So you can see that in the best case (when the source address of the packet matches src-address-list=A), the packet only has to be matched against 4 rules; even in the worst case (when the source address of the packet matches src-address-list=H), the packet still has to be matched against only 7 rules instead of 9 in the linear case. So on average this step becomes roughly two times faster than the linear processing, and that’s only for 8 mark-routing rules; with your 25, the relative improvement will be even better (something like 6 rules matched on average instead of 25).
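In RouterOS terms, the nested ifs can be sketched with action=jump into user-defined chains. This is only an outline under several assumptions: the chain names (rm-abcd, rm-ab, rm-cd) are made up, the combined address lists ABCD, AB and CD have to be maintained by you, and the UDP port 5060 match merely stands in for "some other condition". Because passthrough=no on a leaf ends mangle processing for the packet, the packet-mark rule is placed first here rather than last:

```
/ip firewall mangle
# orthogonal packet mark first; passthrough=yes lets the packet continue into the tree
add chain=prerouting protocol=udp dst-port=5060 action=mark-packet new-packet-mark=x passthrough=yes
# top level: pick one half of the tree
add chain=prerouting src-address-list=ABCD action=jump jump-target=rm-abcd
# second level of the ABCD half
add chain=rm-abcd src-address-list=AB action=jump jump-target=rm-ab
add chain=rm-abcd action=jump jump-target=rm-cd
# leaves: the second rule of each chain plays the role of "else"
add chain=rm-ab src-address-list=A action=mark-routing new-routing-mark=a passthrough=no
add chain=rm-ab action=mark-routing new-routing-mark=b passthrough=no
add chain=rm-cd src-address-list=C action=mark-routing new-routing-mark=c passthrough=no
add chain=rm-cd action=mark-routing new-routing-mark=d passthrough=no
```

The EFGH half would mirror this with its own chains. On RouterOS v7 the routing tables named in new-routing-mark also have to exist under /routing table before these rules are accepted.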
If you can use matching to src-address instead of src-address-list in the example, the matching will be faster too, as address lists have to use an internal hash algorithm while plain src-address matching uses single from-to intervals at worst.
But it is even better to assign a connection mark, albeit based on relatively expensive operations like address list matching, to the whole connection when processing its initial packet, and to translate the connection mark to a routing mark for all subsequent packets of the connection in the upload direction, as described here. You cannot organize the connection mark->routing mark translation rules into a binary tree, but matching a connection mark is a faster operation than matching an address list.
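For a single tenant, the pair of rules could look roughly like this. The connection mark name conn-a and the interface list LAN are assumptions made for the sketch; the interface list is what restricts the routing mark to the upload direction:

```
/ip firewall mangle
# initial packet only: the expensive address-list match runs once per connection
add chain=prerouting connection-state=new src-address-list=A action=mark-connection new-connection-mark=conn-a passthrough=yes
# all upload packets of the connection: cheap connection-mark match
add chain=prerouting connection-mark=conn-a in-interface-list=LAN action=mark-routing new-routing-mark=a passthrough=no
```

The first group of rules (one per address list) can still be arranged as a tree of jumps, since it is only evaluated for new connections; the second group stays linear.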
Yet another option, as you seem to use routing marks based only on source addresses: I was always wondering whether routing rules were just a relic of the past or only applicable together with dynamic routing protocols, but inspired by your multi-tenant configuration, I’ve done a quick test and it seems to me that, unlike routing marking in mangle, routing rules can coexist with fasttracking. They support far fewer match conditions, actually only source and destination prefixes (no lists, no intervals) and routing-mark, but for your scenario the src-address should be sufficient. And fasttracking really does make a difference.
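A sketch of such routing rules, with invented tenant prefixes and assuming the routing tables a and b already exist:

```
# RouterOS v6 menu; on v7 the equivalent menu is /routing rule
/ip route rule
add src-address=10.0.1.0/24 action=lookup table=a
add src-address=10.0.2.0/24 action=lookup table=b
```

With these in place the corresponding mark-routing rules in mangle are no longer needed for those prefixes, which is what leaves room for fasttracking.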