CPU Load for bridging with queues

We have a Pentium 4 3.0GHz unit that was running MikroTik v2.8.28. It acts as an Ethernet bridge with two interfaces, bridging Internet-bound traffic and shaping it with rules based on the local IP. The unit crashed with kernel panic errors three times over the past week, so last night I decided to try upgrading the software. To keep the changes as limited as possible, I went with the oldest version I could download, v2.9.51. Some changes to the mangle and queue setup were still necessary, but after making them I found that the unit got backlogged and CPU load pegged at 100% once traffic neared typical levels. I'm wondering whether we really need to upgrade the hardware, or whether there is some way to change the queueing/bridging method to be more efficient. Here are the details:

Typical traffic passing through the unit peaks around 50Mbps
There are about 1500 IPs we are shaping separately
In v2.8.28 we were using firewall mangle with mark-flow and then queue tree with pfifo type for each IP - one src-addr and one dst-addr mangle match for each IP
Now in v2.9.51 I switched to using firewall mangle with new-packet-mark and passthrough=no, still with a queue tree using pfifo

Originally, on v2.8.28, CPU load would peak around 50% at 40-50Mbps; now CPU load hits 100% around 20Mbps. We also tried disabling connection tracking and still couldn't get over about 30Mbps.
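For reference, this is roughly how we disabled connection tracking (v2.9 command path, from memory - adjust if your build differs):

```
# Disable connection tracking entirely (RouterOS v2.9 syntax)
/ip firewall connection tracking set enabled=no
```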

Any ideas on what we should change? Or maybe where we could find the most recent v2.8.xx release? This setup had been running fine for several years without trouble.

Thanks,
-Ryan

So if I understood correctly, you have 3000 queue tree rules? :open_mouth:
I would suggest to group users with similar bandwidth and use PCQ.
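A rough sketch of what I mean, assuming a group of users all assigned 1Mbps and v2.9 syntax (the names, marks, and rates here are just examples):

```
# One PCQ type per direction; each src/dst address gets its own
# sub-queue limited to pcq-rate
/queue type add name="pcq-down-1M" kind=pcq pcq-rate=1000000 pcq-classifier=dst-address
/queue type add name="pcq-up-1M" kind=pcq pcq-rate=1000000 pcq-classifier=src-address

# One queue tree entry per direction for the whole group,
# instead of one queue per IP
/queue tree add name="Group1M-down" parent=global-out packet-mark=group-1M queue=pcq-down-1M
/queue tree add name="Group1M-up" parent=global-in packet-mark=group-1M queue=pcq-up-1M
```

With this you would mark each group's traffic once in mangle, and two queue tree entries replace hundreds of per-IP queues.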

We do have 3000 firewall mangle rules (2 per IP), but only 1500 queue tree rules (pfifo type) - one per IP, with both mangle rules pointing to the same queue tree rule. We did this so that a single queue covers both incoming and outgoing traffic: if we assign a customer 1Mb, they can use 1Mb up, or 1Mb down, or 512K down + 512K up, but not 1Mb up + 1Mb down. I think that wasn't possible with PCQ… is that right?

Below is a sample of what rules we use.
Type rules:
/queue type add name="tower" kind=pfifo pfifo-limit=10
/queue type add name="customer" kind=pfifo pfifo-limit=10

Queue rule per tower:
/queue tree add name="Tower1" parent=global-out packet-mark="" limit-at=1024 queue=tower priority=8 max-limit=8192 burst-limit=0 burst-threshold=0 burst-time=0s

Rules for each customer:
/ip firewall mangle add chain=forward action=mark-packet new-packet-mark=Cust50 passthrough=no src-address=192.168.2.50
/ip firewall mangle add chain=forward action=mark-packet new-packet-mark=Cust50 passthrough=no dst-address=192.168.2.50
/queue tree add name="Cust50" parent=Tower1 packet-mark=Cust50 limit-at=512 queue=customer priority=8 max-limit=1024 burst-limit=2048 burst-threshold=512 burst-time=30m

Any recommendations for how we can make a similar setup work on this box?

Thanks,
-Ryan