We are running into some trouble with our PPPoE access concentrator (CCR1072 with ROS v6.46.4 stable). We have more than ~1k sessions and around 2 Gbit/s peak traffic. The bandwidth of our DSL customers is shaped at the DSLAMs (in most cases, not all), but our FTTH and radio customers are shaped by queues on the CCR1072. At the moment there are ~300 queues. Most queues are dynamic via PPPoE, but some are static simple queues with standard settings (apart from the upload/download limit).
Here is the problem: when a customer with FTTH 500 Mbit/s uses his full bandwidth, the CPU load increases to nearly 100%, with all the resulting consequences!
Please take a look at the pictures:
Queues on the Access Concentrator, sorted by Max Limit.
Last 12h statistics. You can see the massive CPU load and very high latency in the whole network.
Last 7 days statistics. You can see that there is no problem with traffic/performance as long as the “big” queues aren’t fully used.
Does anyone have an idea how to deal with that problem? The customers are connected to the network with CRS317/CRS328-4C-20S-4S+. We used to shape customers by “Switch → Port → Ingress/Egress Rate”, but our customers complained about packet loss. So we changed to “open” ports and queues…
If you need more info, please ask! Thanks in advance.
No, the router did not reboot. We do have firewall and NAT on the router, but no masquerading, only src-nat, and only for IPs that we use for our servers etc. Our customers get IPv4 addresses out of a dedicated pool. For all IPs in that pool we have RAW rules which disable connection tracking.
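A minimal sketch of what such RAW rules typically look like, assuming a placeholder customer pool of 198.51.100.0/24 (not the real range); the actual rules on the router may differ in detail:

```
# Placeholder pool; substitute the real customer range.
/ip firewall raw
add chain=prerouting action=notrack src-address=198.51.100.0/24 comment="customer pool, upstream"
add chain=prerouting action=notrack dst-address=198.51.100.0/24 comment="customer pool, downstream"
```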
About the problem queues: those queues are set on bridges.
For customers without PPPoE, we add a bridge (one for each customer). To this bridge we add a VLAN interface (interface → vlan). The bridges are needed for L2 filters. Then we configure a /30 IP address on the bridge and put the queue for bandwidth control on the bridge.
Is it possible that the problem is the bridge as queue target? Should we put the queue on the VLAN interface itself? Or is it the wrong queue type?
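The per-customer setup described above can be sketched roughly as follows. All names, the VLAN ID, the uplink port and the /30 address are hypothetical placeholders, and the L2 bridge filters are left out:

```
# One bridge per customer (names and VLAN ID are placeholders)
/interface bridge add name=br-cust100
/interface vlan add name=vlan100 vlan-id=100 interface=sfp-sfpplus1
/interface bridge port add bridge=br-cust100 interface=vlan100

# /30 transfer network on the bridge (documentation address, not a real one)
/ip address add address=192.0.2.1/30 interface=br-cust100

# Simple queue with the bridge as target, default queue type
/queue simple add name=cust100 target=br-cust100 max-limit=500M/500M
```

Putting the queue on the VLAN interface instead would only mean changing target=br-cust100 to target=vlan100 in the last line.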
The CCR1072 isn’t made for queues or firewall/NAT; it was made to route packets. While it can handle some queues and firewall/NAT rules, it doesn’t have the clock speed to handle them well.
OK, then maybe queues on the CCR aren’t the most efficient way, but it’s the only way to shape customers’ traffic. And why 100% CPU? We have queues with 250 Mbit/s without any problem, but 500 Mbit/s brings down the whole device?
I have now done a lab test with 3 CCR1072s. I connected the 3 CCRs to each other like this:
CCR1 sfp+1 ↔ sfp+1 CCR2 sfp+2 ↔ sfp+1 CCR3 (all connections with 10G DAC)
Then I put IP addresses on the interfaces and started a bandwidth test from CCR3 to CCR1 through CCR2.
=> ~10Gbit/s throughput
Then I added a simple queue with 200 Mbit/s max limit on CCR2 => CPU at 100% on CCR2 for a while, then 0% for a while, and after a minute 100% again…
There is no other config on the CCRs, just what I’ve written. So ONE queue with 200 Mbit/s knocks out the CCR1072. Maybe the 1072 is not the right device for traffic shaping, but since the MT switches for customer access can’t handle the shaping, I’m forced to do it on the 1072.
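For anyone who wants to reproduce this, the lab can be sketched roughly as below. The addresses, the queue name, the queue target and the bandwidth-test credentials are assumptions for illustration; only the topology and the 200 Mbit/s limit come from the test described above:

```
# CCR1 (bandwidth-test target)
/ip address add address=10.0.12.1/30 interface=sfp-sfpplus1
/ip route add dst-address=10.0.23.0/30 gateway=10.0.12.2

# CCR2 (router under test)
/ip address add address=10.0.12.2/30 interface=sfp-sfpplus1
/ip address add address=10.0.23.1/30 interface=sfp-sfpplus2

# CCR3 (bandwidth-test source)
/ip address add address=10.0.23.2/30 interface=sfp-sfpplus1
/ip route add dst-address=10.0.12.0/30 gateway=10.0.23.1
/tool bandwidth-test address=10.0.12.1 protocol=udp direction=transmit user=admin password=""

# CCR2, added only after the unshaped baseline has been measured
/queue simple add name=lab-limit target=10.0.23.2/32 max-limit=200M/200M
```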
+1
I have the same problem. We have about 1 Gbit/s of active traffic. When I try to add queues, the CPU goes to 100%. The device freezes and reboots by itself after 2 minutes. This is a very serious problem.
We want a solution. MikroTik, what is the solution?
Hi,
The strange thing is that with a CCR1036 I can apply up to 3 Gbps of queues without issues and with less than 50% CPU usage, so the CCR1036 can probably handle 6 Gbps of queues without issues, but the CCR1072 can’t handle 1 Gbps of queues.