CCR1072 running out of CPU, what next for a PPPoE ISP?

Hi Michael - we are currently testing a split-duties configuration on one of our smaller networks. In essence, we will use a 1036 as NAT/router, and one or more 1016s as PPPoE concentrators enforcing rate limits via queues. We can thus scale with more 1016s (cheaper) as required.

I’ll post here when we see results. The next step will be to implement this on our network with 1500+ CPEs.

A few mitigating actions we have taken have helped reduce the duration of “flaps” and service loss from 15-20 minutes to around 2 minutes:

1. Reduce all the connection tracking timeouts to the bare minimum; ours are in the attached screenshot (Screen Shot 2020-12-02 at 09.42.08.png).
2. Offload DNS to an external server (e.g. Bind9 or Unbound), thus freeing up CPU on the CCR.
3. Set the RADIUS timeout to 1000 ms, and go with RadSec (RADIUS over TLS, which runs on TCP) instead of UDP. We find TCP a lot more solid at times of reconnection floods.
4. Make sure your RADIUS server is tuned to cope with large peak request volumes.
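
For anyone wanting a starting point, items 1 to 3 might look roughly like this on the RouterOS CLI. This is only a sketch under assumptions: the timeout values are placeholders rather than the ones in the screenshot, 192.0.2.53 is a documentation address standing in for your external resolver, "radsec-client" is a hypothetical certificate name, and the `protocol=radsec` option is only available on RouterOS versions that support RadSec. Check each property against your own version before applying.

```
# 1. Shorter connection tracking timeouts (placeholder values, tune for your traffic)
/ip firewall connection tracking set tcp-established-timeout=1h tcp-close-wait-timeout=10s tcp-fin-wait-timeout=10s tcp-time-wait-timeout=10s udp-timeout=10s udp-stream-timeout=1m icmp-timeout=10s generic-timeout=2m

# 2. Hand PPPoE clients an external resolver and stop the CCR answering DNS itself
#    (192.0.2.53 is a stand-in address; "default" is assumed to be the profile in use)
/ppp profile set [find name=default] dns-server=192.0.2.53
/ip dns set allow-remote-requests=no

# 3. RADIUS timeout of 1000 ms and RadSec transport
#    ("radsec-client" is a hypothetical certificate already imported on the router)
/radius set [find] timeout=1s protocol=radsec certificate=radsec-client
```

Apply the connection tracking change in a quiet period first, since very short timeouts can drop long-idle sessions such as SSH or VPN tunnels.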

Good luck!