I have a MikroTik CCR2216 setup with the following configuration:
Two CCR2216 devices are used for PPPoE, each handling approximately 3000 customers.
NAT is managed by one CCR2216.
Issue:
I am experiencing high CPU load (>70%) on the CCR2216 that is handling NAT, which is impacting performance.
Questions:
1. What might be causing the high CPU load in this configuration?
2. Are there any recommended configurations or optimizations for handling such a large number of NAT connections?
3. Would it be advisable to distribute the NAT load between additional devices?
Additional Information:
Current RouterOS version.
Any relevant logs or CPU usage statistics.
@Znevna is right: professionals should know that PPPoE and NAT are never done on the same machine. It only takes one user losing connection, for whatever reason, and the whole machine slows down while it clears the connection tables. That's why separate machines are used, so that a trivial flicker on some Ethernet link doesn't block the connection of 6000 people for minutes...
Reading that post again, he does seem to have 2x CCR2216 for PPPoE and one extra CCR2216 for NAT. Either way, those customers deserve professional help.
I am looking for a solution to reduce CPU utilization. When CPU load increases, I observe packet loss and ping timeouts, which indicate network performance degradation.
Yep, but at the moment you are failing to clearly describe your setup. Your post is ambiguous; from what you wrote you may have:
TWO CCR2216 devices in total, one managing 3000 PPPoE users and the other one managing 3000 PPPoE users AND managing NAT
THREE CCR2216 devices in total, one managing 3000 PPPoE users, the second one managing 3000 PPPoE users and the third one managing NAT
something else
Back to your questions:
Q1. What might be causing the high CPU load in this configuration?
A1. Something in the configuration, assuming that this load is not simply normal for so many customers.
Q2. Are there any recommended configurations or optimizations for handling such a large number of NAT connections?
A2. Maybe yes, maybe no.
Q3. Would it be advisable to distribute the NAT load between additional devices?
A3. Yes, meaning no: a machine doing NAT should ONLY do NAT (and needs to be powerful enough for all customers combined).
Official test results indicate that without L3 HW offload and without a very carefully configured firewall, this device is capable of handling anything between 11Gbps and 30Gbps. Could it be that you're hitting the current performance plateau?
When thinking about L3HW, beware of limitations: 4.5k fasttrack connections and 4k NAT entries ... if only IPv4 is used. These numbers don't seem enough in your use case (2x3k subscribers).
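As a rough sanity check on those numbers, here is a minimal Python sketch. The two limits and the subscriber count are the ones quoted in this thread; the 50 concurrent connections per subscriber is purely my assumption for illustration, so plug in whatever your connection tracking table actually shows.

```python
# Rough capacity check for L3 HW offload limits vs. subscriber count.
# Limits as quoted in the post above (IPv4 only).
FASTTRACK_LIMIT = 4500   # offloaded fasttrack connections
NAT_ENTRY_LIMIT = 4000   # offloaded NAT entries

# From the thread: 2 PPPoE routers, roughly 3000 customers each.
SUBSCRIBERS = 2 * 3000

# ASSUMPTION (not from the thread): average concurrent connections
# per subscriber. Replace with a measured value.
ASSUMED_CONNS_PER_SUBSCRIBER = 50


def per_subscriber_share(limit: int, subscribers: int) -> float:
    """Hardware entries available per subscriber if split evenly."""
    return limit / subscribers


def offload_coverage(limit: int, subscribers: int, conns_each: int) -> float:
    """Fraction of all concurrent connections that could fit in hardware."""
    total_connections = subscribers * conns_each
    return min(1.0, limit / total_connections)


fasttrack_share = per_subscriber_share(FASTTRACK_LIMIT, SUBSCRIBERS)
nat_share = per_subscriber_share(NAT_ENTRY_LIMIT, SUBSCRIBERS)
coverage = offload_coverage(
    FASTTRACK_LIMIT, SUBSCRIBERS, ASSUMED_CONNS_PER_SUBSCRIBER
)

print(f"fasttrack entries per subscriber: {fasttrack_share:.2f}")
print(f"NAT entries per subscriber:       {nat_share:.2f}")
print(f"share of connections offloadable: {coverage:.1%}")
```

Both per-subscriber figures come out below one entry each, so under these assumptions nearly all traffic would still be handled by the CPU.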
I don't think it is polite at all to waste the time of those of us who are here to help for free. If you think that people will stay here GUESSING because you don't post your config... you are wrong. See ya.