I have added a new VLAN to a couple of RB1000s that act as intranet routers/firewalls. I have run many tests and the best I could get is 600Mbps full duplex (600Mbps in at eth3, 600Mbps out at eth1). At those rates, CPU usage rises to 85%-95% and I can perceive increased latency, but nothing to worry about. I have no data about packet sizes.
I’m not using queues or QoS rules. I have more than 200 firewall/nat/mangle rules, but I have a selector at the beginning of the forward chain which distributes packets according to source and/or destination IP, so packets don’t have to be checked against every rule.
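To illustrate the selector idea, here is a rough Python model, not RouterOS configuration. The subnets, rule names, and the `rules_to_check` helper are all made up for the example; the point is only that a packet is matched against one small group of rules instead of all 200+.

```python
import ipaddress

# Hypothetical sub-chains keyed by source subnet (illustrative values only).
SUBCHAINS = {
    ipaddress.ip_network("10.0.1.0/24"): ["rule-a1", "rule-a2"],
    ipaddress.ip_network("10.0.2.0/24"): ["rule-b1", "rule-b2", "rule-b3"],
}


def rules_to_check(src_ip):
    """Return only the rules of the sub-chain whose subnet contains src_ip."""
    addr = ipaddress.ip_address(src_ip)
    for network, rules in SUBCHAINS.items():
        if addr in network:
            return rules
    # No selector matched: in this model the packet skips every sub-chain.
    return []


if __name__ == "__main__":
    print(rules_to_check("10.0.2.7"))
```

In this model the cost per packet is a handful of selector checks plus the rules of one group, which is why the total rule count matters less than the length of the longest path.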
Let’s assume that the average packet size is 512 bytes. The maximum routed traffic that the RB1000 can handle in this case is ~750Mbps. If you are saying that the firewall rules are optimized and packets do not travel through all the rules, then there is not much left to optimize. You need faster hardware.
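For a sense of what those figures mean in packets per second, here is a small back-of-the-envelope calculation in Python. It assumes the 512-byte average from the post above and ignores Ethernet framing overhead; the `packets_per_second` helper is just for illustration.

```python
def packets_per_second(throughput_mbps, packet_bytes):
    """Convert a throughput in Mbps to packets per second for a given packet size."""
    bits_per_packet = packet_bytes * 8
    return throughput_mbps * 1_000_000 / bits_per_packet


# Assumed 512-byte average packet size (an assumption, not a measurement).
pps_750 = packets_per_second(750, 512)  # the ~750Mbps routed limit
pps_600 = packets_per_second(600, 512)  # the 600Mbps seen in the tests

if __name__ == "__main__":
    print(round(pps_750), round(pps_600))
```

If the real average packet is smaller than 512 bytes, the same packet rate corresponds to fewer Mbps, which could account for part of the gap between 600 and 750.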
OK, so if I want to use it as a firewall, the limit is around 750Mbps. If I just need a router, I could disable conntrack and get nearly 1Gbps (tested). Not too bad for this little board.