High CPU load when PPPoE sessions disconnects

Posted: Tue Feb 07, 2017 7:27 pm
by rpra
I think this problem appeared in some recent versions of ROS.
It definitely exists on 6.35.4 and 6.38.1.

When we have network outages, some PPPoE sessions disconnect with 'peer not responding' errors.
At those moments the CCR CPUs are 100% utilised, so the router almost stops passing any traffic.

This can continue for several minutes. The router loses its OSPF neighbors, and it all becomes a catastrophe!
If the number of disconnecting sessions is over 200-300, it collapses completely.

Seeing this on several different CCRs.

Re: High CPU load when PPPoE sessions disconnects

Posted: Sun Jul 02, 2017 3:06 pm
by yovetal
Hello!

We have the same problem on our CCR-1036. We now use RouterOS v6.39.2, and the problem still exists.

I created a trouble ticket with MikroTik Support half a year ago, but they still have not found a solution.

Re: High CPU load when PPPoE sessions disconnects

Posted: Mon Jul 17, 2017 10:25 am
by supaul77
I am facing the same problem. Any solution?

Re: High CPU load when PPPoE sessions disconnects

Posted: Wed Jul 19, 2017 12:18 am
by p3rad0x
Hi,

Are you using masquerade?

Re: High CPU load when PPPoE sessions disconnects

Posted: Tue Aug 22, 2017 10:39 pm
by stergulc
Any workaround?
We have the same issue...

Re: High CPU load when PPPoE sessions disconnects

Posted: Tue Sep 26, 2017 10:48 pm
by ntmanxp
+1 for me

Re: High CPU load when PPPoE sessions disconnects

Posted: Thu Sep 28, 2017 4:49 pm
by tomaskir
If you are using masquerade on the router, that is the problem.
When using masquerade, RouterOS has to do a full connection tracking recalculation on EACH interface connect/disconnect.

So if you have lots of PPPoE sessions connecting/disconnecting, connection tracking will constantly be recalculated, which will cause high CPU usage.

Solution:
Stop using masquerade on routers that have a lot of dynamic interfaces.
Either use srcnat, or fix your architecture (use routing).
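
For readers who want to try the suggestion above, a minimal sketch of swapping a masquerade rule for a fixed src-nat rule might look like the following. The interface name ether1 and the address 203.0.113.10 are hypothetical placeholders, not values from this thread, and the sketch assumes the WAN address is static (src-nat needs a fixed to-addresses value).

```routeros
# Hypothetical values: ether1 is the WAN interface,
# 203.0.113.10 is its static public address.
/ip firewall nat
# Rule being replaced (shown only as a comment):
#   add chain=srcnat action=masquerade out-interface=ether1
# Replacement with an explicit source address:
add chain=srcnat action=src-nat out-interface=ether1 to-addresses=203.0.113.10
```

Disable or remove the old masquerade rule once the src-nat rule is in place, otherwise whichever rule comes first in the list still matches.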

Re: High CPU load when PPPoE sessions disconnects

Posted: Fri Sep 29, 2017 4:10 pm
by ntmanxp
In my case, I do not use masquerading, just src-nat.
And even when selecting sessions (i.e. 20-30 of them) that use public IP addresses from the pool and disconnecting them, the CPU rises and takes a while to go down.
Months ago I used Interim Update. It was a disaster: lots of load on the CCR.
Putting two and two together, maybe it's related to accounting, or to the packet counts that the sessions send to the RADIUS server (the counting process itself).
Let's hope that the support/development staff can dig deeper and show us a fix.

Re: High CPU load when PPPoE sessions disconnects

Posted: Mon Oct 02, 2017 10:05 am
by aacable
Facing the same issue here as well, using the following RBs (with 6.40.3):
RB3011xxx
CCR1016
CCR1036

Re: High CPU load when PPPoE sessions disconnects

Posted: Mon Oct 02, 2017 11:29 am
by Chupaka
aacable wrote: Facing same issue here as well
Same as what? Do you use 'action=masquerade'?

Re: High CPU load when PPPoE sessions disconnects

Posted: Mon Oct 02, 2017 1:39 pm
by aacable
Yes, masquerade and routing are both in one box.
The scenario is something like this:
4 WAN DSL links configured with PCC using the src-address approach, for a specific group of users.
1 WAN link for routing users with public IPs.

Re: High CPU load when PPPoE sessions disconnects

Posted: Mon Oct 02, 2017 1:56 pm
by Chupaka
So if you have the same issue, then the solution should also be the same: viewtopic.php?p=620765#p620765

Re: High CPU load when PPPoE sessions disconnects

Posted: Mon Oct 02, 2017 2:30 pm
by aacable
Noted. I saw that later. Will test it.

Re: High CPU load when PPPoE sessions disconnects

Posted: Tue Oct 03, 2017 4:34 pm
by rsvieira
Hello, we have the same problem, using src-nat and not masquerade. We have already reviewed our architecture and cannot find a problem. On a box with 1000 sessions, when 100 to 200 sessions drop, all the routing and tunnels go down and the CPU rises to 90-95%.

Re: High CPU load when PPPoE sessions disconnects

Posted: Sat Oct 07, 2017 1:33 pm
by ViREnG
I have the same problem, and I am not using any masquerade.

My PPPoE clients are connected via VPLS; each VPLS handles about 100-200 clients, and if one of them disconnects because of a network fault, CPU usage rises to 100% and the router freezes.

ROS: 6.38.7
Devices: 3x CCR1036
Each device handles about 1K PPPoE clients

* If I disable connection tracking, the problem goes away, but then I can't use src-nat.

cpu_usage_pppoe_disconnect.png
Does anybody know what the problem is?

Re: High CPU load when PPPoE sessions disconnects

Posted: Sat Oct 07, 2017 4:59 pm
by ViREnG
Hi, guys, take a look :)
Image

Re: High CPU load when PPPoE sessions disconnects

Posted: Sat Oct 07, 2017 11:55 pm
by mducharme
ViREnG wrote: anybody know what's the problem ?
Are your clients using public IPs or private?

If your clients are on public IPs for the most part, you can have connection tracking turned off for some things and on for other things, controlling that with the Raw table.

If your clients are on private IPs, a good workaround might be to do the NAT on a separate router rather than the same device.

Re: High CPU load when PPPoE sessions disconnects

Posted: Sun Oct 08, 2017 7:40 am
by ViREnG
mducharme wrote: If your clients are on public IPs for the most part, you can have connection tracking turned off for some things and on for other things, controlling that with the Raw table.

:D Yes, yes :)
Do I need to use action=notrack?
Is this enough?
/ip firewall raw
add action=notrack chain=prerouting src-address-list=public_pools

Re: High CPU load when PPPoE sessions disconnects

Posted: Sun Oct 08, 2017 7:51 am
by mducharme
Yes. Also note that if you want some of the contents of your public_pools to still be processed by connection tracking, you can "accept" that traffic above the notrack rule. Accepted traffic is still processed by the main firewall as well, so accept doesn't mean "I trust this"; it means "I want to track this".
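
As a rough sketch of that ordering, the Raw table could look like the following. The address list public_pools is the one used earlier in the thread; the list tracked_hosts is a hypothetical name for addresses that should remain in connection tracking. Both directions are covered, since traffic towards the public pools is otherwise still tracked.

```routeros
/ip firewall raw
# tracked_hosts is a hypothetical address list: traffic matching it is
# accepted here, so it never reaches the notrack rules below and is
# still handled by connection tracking and the main firewall.
add chain=prerouting action=accept src-address-list=tracked_hosts
add chain=prerouting action=accept dst-address-list=tracked_hosts
# Everything else from or to the public pools bypasses connection tracking.
add chain=prerouting action=notrack src-address-list=public_pools
add chain=prerouting action=notrack dst-address-list=public_pools
```

Untracked traffic cannot be NATed, so this only suits addresses that are routed as-is.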

Re: High CPU load when PPPoE sessions disconnects

Posted: Sun Oct 08, 2017 8:09 am
by ViREnG
I still have some connections with destination addresses in my public pools.
Do I need to turn tracking off for the dst-address of my public pools too?

Re: High CPU load when PPPoE sessions disconnects

Posted: Sun Oct 08, 2017 8:12 am
by mducharme
Yes probably.

Re: High CPU load when PPPoE sessions disconnects

Posted: Sun Oct 08, 2017 8:56 am
by ViREnG
@mducharme Thanks , Problem Solved.

Re: High CPU load when PPPoE sessions disconnects

Posted: Sat Dec 30, 2017 3:56 pm
by ntmanxp
Hi There

I was facing the same problem.
I added a 'notrack' rule for both src-address and dst-address with the public IP pools.
No OSPF, and no routing protocols other than those involved in the PPP concentrator. RADIUS is authenticating users.
I am not using masquerading, just src-nat for when the public IP pools fill up and clients fall back to private ones.
For the private pools, NAT is done with src-nat.

But anyway, if I select, for example, 25 sessions, all from the public IP pool, and hit disconnect, the CPU rises close to 90-95%, and the profiler shows that the firewall is taking most of it.
It takes about 10-15 seconds to return to its normal state.

In case of a power blackout, we get massive disconnects. This causes a long period of high CPU usage.
Then the LCP echo of the PPP server gets many timeouts for sessions that are actually still running (the lucky ones with power at home), which leads to more 'peer is not responding' disconnects, which raises CPU usage, which causes more 'peer is not responding'... and you can imagine the movie, right? Snowball!!!

If someone has an idea, I'll be glad to hear it.

Regards

Andres.-

Re: High CPU load when PPPoE sessions disconnects

Posted: Sat Dec 30, 2017 4:01 pm
by tomaskir
Just DO NOT use NAT on any routers that have a high number of connecting/disconnecting interfaces.

Use the basic networking principle of 'separation of concerns'.
Each device in your network should be responsible for one function - don't mix too many things into one device.

Place an additional router "in front" of the PPPoE concentrator, and do NAT there.
Problem solved.

Re: High CPU load when PPPoE sessions disconnects

Posted: Sat Dec 30, 2017 5:31 pm
by ntmanxp
Hi Tomaskir
I do not have the option of giving out more public IPs.
There are more users than available public IPs.
So I must give them private IPs and NAT them to browse.

Besides, if you read the post, even with disconnects from the PUBLIC IP pools (25 at once, for example), the CPU rises to the top.
No matter if public pool or private pool, the CPU hits 100% on 20+ disconnects.
Much as I'd like to, there is no way to control the town's electricity, so there is no way to control massive disconnects. :(

From this point of view, if disconnecting public IP sessions raises the CPU to 90-100%, it makes no sense to add a point of failure by NATing between the router and the PPP concentrator.
First the CPU rise must be brought close to zero; then I can think about solving NAT.

Regards

Andres-.

Re: High CPU load when PPPoE sessions disconnects

Posted: Sat Dec 30, 2017 6:46 pm
by tomaskir
It doesn't matter if the user has public or private IP, it's about interfaces.

When interfaces connect/disconnect, in combination with NAT, it gives you high CPU usage.
So simply eliminate NAT from that router.

Have a separate router "in front" of the PPPoE concentrator, that NATs the traffic from the private IPs.
Set up routing (even static routes) between the PPPoE concentrator and the new router.
Terminate public and private IPs on the PPPoE concentrator.

That way, you will not have CPU usage issues.

Re: High CPU load when PPPoE sessions disconnects

Posted: Sat Dec 30, 2017 7:21 pm
by n21roadie
tomaskir wrote:
If you are using masquerade on the router, that is the problem.
When using masquerade, RouterOS has to do a full connection tracking recalculation on EACH interface connect/disconnect.

So if you have lots of PPPoE sessions connecting/disconnecting, connection tracking will constantly be recalculated, which will cause high CPU usage.

Solution:
Stop using masquerade on routers that have a lot of dynamic interfaces.
Either use srcnat, or fix your architecture (use routing).

Just curious: if static interfaces were also disconnecting/reconnecting, would this also result in high CPU usage by connection tracking?
Also, if dynamic interfaces use more resources than static ones, is it possible to set PPPoE client interfaces as static?

Re: High CPU load when PPPoE sessions disconnects

Posted: Sun Dec 31, 2017 12:12 am
by tomaskir
Any interface connecting/disconnecting - does not matter if dynamic or static.

Re: High CPU load when PPPoE sessions disconnects

Posted: Sun Dec 31, 2017 2:30 am
by sebastia

Re: High CPU load when PPPoE sessions disconnects

Posted: Thu Jan 04, 2018 12:06 pm
by deansouwit1996
Hi all..

My clients use a basic MikroTik to PoE to radio setup. However, when viewing the logs on our NAS/high-site, we notice that an error keeps appearing: a session connects, then terminates with "Peer not responding".
Any idea what would cause this issue?

Re: High CPU load when PPPoE sessions disconnects

Posted: Thu Jan 04, 2018 8:06 pm
by ntmanxp
Hi Tomaskir

Well, I set up a CCR as a NAT server, doing all the NAT there instead of on the PPPoE concentrator.
Apart from some workarounds I needed for misconfigured/hijacked DNS NATing and issues like that (finally solved with mangle and a routing mark), I disabled every IP firewall NAT rule, and the disconnection test (with 900 sessions, public and private IPs) worked like a charm.
The CPU didn't even notice the hit.
RADIUS handled 900 requests, and more than 600 sessions soon reconnected (the others were absorbed by other PPP concentrators).

So, thanks all

Regards

Andres.-

Re: High CPU load when PPPoE sessions disconnects

Posted: Sat Jan 06, 2018 2:22 am
by jimi
Hi ntmanxp ,
I have the same problem with PPPoE disconnects. Can you show me how you configured one CCR as the NAT server and the other CCR as the PPPoE concentrator?
thnx

Re: High CPU load when PPPoE sessions disconnects

Posted: Tue Jan 09, 2018 9:45 pm
by ntmanxp
HI there!

On the NAT server I just added the IP address for the WAN side and the IP address for the LAN side, taking care that the mask covers the network to be NATed.
Then I added the src-nat rules to do the job.

/ip address
add address=181.xxx.yyy.235/24 comment="--- PPP1 ---" interface=WAN10 network=181.xxx.yyy.0
add address=172.221.1.1/16 comment="--- PPP1 Privada ---" interface=WAN10 network=172.221.0.0
#
/ip firewall nat
add action=src-nat chain=srcnat comment="--- Nateo PPP1 Privadas ---" src-address=172.21.0.0/20 to-addresses=181.xxx.yyy.235

Then on each PPP server, disable (comment out) every NAT rule. Then:

/ip firewall mangle
add action=mark-routing chain=prerouting comment="--- Envío IPs-PRIVADAS al SVR-NAT ---" new-routing-mark=SVR-NAT passthrough=no src-address-list=IPs_Privadas

/ip route
add distance=1 gateway=181.xxx.yyy.235 routing-mark=SVR-NAT

This way, the routing mark sends the traffic that needs NAT to the NAT server,
which does the NAT for the PPP concentrator.
Both the NAT server and the PPP concentrator are on the same WAN segment.

Regards

Re: High CPU load when PPPoE sessions disconnects

Posted: Thu Jan 11, 2018 6:12 am
by cicserver
The following is my current scenario: only 1 CCR, configured as a PPPoE server. It has a public IP interface and a few DSL links too.
So everything runs in one box. I am using routing marks to distinguish users: if a user has a private IP, he is NATed via DSL, and if he has a public IP, he goes via the default route of the public link. Now when PPPoE users disconnect, the CPU hits 90-100%. I understand it's due to the interface status change. How can I introduce a 2nd MikroTik so that all NATing happens there?
ccr.png

Re: High CPU load when PPPoE sessions disconnects

Posted: Sat Jan 13, 2018 3:20 pm
by aacable
@cicserver

I would suggest marking the private IP ranges and then adding a route so that they are ROUTED to the 2nd router, where NAT will happen. You may need to create a reverse route on the 2nd router as well, so that communication between the subnets works.
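
A minimal sketch of this two-router layout, in the RouterOS v6 syntax used elsewhere in the thread, might look like the following. All names and addresses are hypothetical placeholders: private clients in 10.10.0.0/16, the concentrator at 192.0.2.2, the NAT router at 192.0.2.1 with WAN interface ether1 and public address 203.0.113.10.

```routeros
# --- On the PPPoE concentrator (no NAT rules here) ---
/ip firewall address-list
add list=private_clients address=10.10.0.0/16
/ip firewall mangle
# Mark traffic from private clients so it follows a separate default route.
add chain=prerouting action=mark-routing new-routing-mark=to-nat passthrough=no src-address-list=private_clients
/ip route
# Marked traffic is routed to the NAT router.
add dst-address=0.0.0.0/0 gateway=192.0.2.1 routing-mark=to-nat

# --- On the NAT router ---
/ip route
# Reverse route: replies for the private clients go back via the concentrator.
add dst-address=10.10.0.0/16 gateway=192.0.2.2
/ip firewall nat
add chain=srcnat action=src-nat src-address=10.10.0.0/16 out-interface=ether1 to-addresses=203.0.113.10
```

Note that a prerouting mark like this also applies to traffic between private clients and local destinations, so a real setup may need to exclude those destinations from the mangle rule.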

Re: High CPU load when PPPoE sessions disconnects

Posted: Sun Feb 11, 2018 1:59 pm
by rocknight13
tomaskir wrote:
If you are using masquerade on the router, that is the problem.
When using masquerade, RouterOS has to do a full connection tracking recalculation on EACH interface connect/disconnect.

So if you have lots of PPPoE sessions connecting/disconnecting, connection tracking will constantly be recalculated, which will cause high CPU usage.

Solution:
Stop using masquerade on routers that have a lot of dynamic interfaces.
Either use srcnat, or fix your architecture (use routing).

Can you please show us how to do that? I am facing the same problem.
Assume this:
ISP--->Switch----->Real IP:Radius Server
------>Real IP:CCR1036(PPPOEserver---->NAT---->)EndUser
Where do I have to add the NAT server, and how do I configure it to work with the PPPoE server?

Real IP:for Radius and CCR1036 with 212.XX.YY.ZZ/26
EndUserIP:172.18.255.0/16
Thanks in advance
regards

Re: High CPU load when PPPoE sessions disconnects

Posted: Thu Mar 15, 2018 9:29 am
by abaidwellnet
tomaskir wrote:
It doesn't matter if the user has public or private IP, it's about interfaces.

When interfaces connect/disconnect, in combination with NAT, it gives you high CPU usage.
So simply eliminate NAT from that router.

Have a separate router "in front" of the PPPoE concentrator, that NATs the traffic from the private IPs.
Set up routing (even static routes) between the PPPoE concentrator and the new router.
Terminate public and private IPs on the PPPoE concentrator.

That way, you will not have CPU usage issues.

Can someone explain the whole setup with a configuration guide? We are also facing the same problem right now...

Re: High CPU load when PPPoE sessions disconnects

Posted: Thu Mar 15, 2018 9:35 am
by abaidwellnet
Last month we set up 1 CCR 1036 12G-4S
with load balancing and NAT on the same router...
The problem: just connecting or disconnecting PPPoE clients causes 100% CPU usage...

Yesterday we purchased another CCR 1036 12G-4S...
We do not understand how to do NAT or load balancing on the 2nd router, with the PPPoE users on the 1st router...

Can someone give us configurations with any IP series, so we can adapt them to our network...
Thanks...

Re: High CPU load when PPPoE sessions disconnects

Posted: Fri Mar 16, 2018 11:56 pm
by sindy
Have you read the previous posts in this topic? In particular, have you noticed that the use of masquerade to do src-nat is most likely the cause of the problem? Are the addresses you get via PPPoE static, or do they change over time?

Re: High CPU load when PPPoE sessions disconnects

Posted: Fri Apr 13, 2018 12:04 pm
by Kusaybia001
I also face this problem: sessions terminating with 'peer is not responding'.

Not on all routers.
Can anyone help me?

Re: High CPU load when PPPoE sessions disconnects

Posted: Mon Aug 27, 2018 2:00 pm
by rini
Hi

I was getting the same problem, so now I do NAT on one CCR and run the PPPoE server on another CCR. All is fine, but I have some problems pinging MikroTik devices behind the PPPoE server.


I have a CCR1036-8G-2S+ configured as the PPPoE server. PPPoE clients are OK with IP range 10.10.0.0/24.
On the interface that runs the PPPoE server I have some CRS switches, MikroTik access points and Ubiquiti APs (all of those are in IP range 10.20.22.0/24).

Example: MikroTik AP 10.20.22.55 or 10.20.22.56, Ubiquiti AP 10.20.22.65 or 10.20.22.66

I have a route that sends everything to the NAT server. I don't do mangle, NAT or filtering here; the NAT section is empty.

In the route section is:

dst-address=0.0.0.0/0 gateway=NAT IP

On this PPPoE server I can ping all my devices, MikroTik APs or Ubiquiti. I can manage them from here with telnet and ssh.

Problem.

A client connected here to this pppoe server is getting ip 10.10.0.2. All is ok.

I Do traceroute www.google.com and it is shown

hop1: 192.168.2.1 internal ip of client router
hop2: 10.10.0.1
hop3: NAT IP
hop4: Gateway of my ISP Provider
......
......
.........GOOGLE IP

Internet is all OK on all PPPoE clients.

But if I want to manage an access point in my network, I can only manage the Ubiquiti ones.
When I ping from the client side, I can ping the Ubiquiti devices but can't ping the MikroTik devices.

I do traceroute from client and I get
for ubiquiti
Hop 1: 10.10.0.1
Hop 2: 10.20.22.65

for Mikrotik
Hop 1: 10.10.0.1
Hop 2: Timeout

But from this client i can ping all the other clients that have mikrotik cpe or other devices

Problem 2:

On the NAT side I have only a firewall NAT rule (masquerade). I don't have any filters or mangle rules, nothing but masquerade.

In the route section I have the default gateway of my ISP, static routes to reach the PPPoE clients directly, and a static route for my APs.

The static routes are:

dst-address=0.0.0.0/0 gateway=ISP Gateway
dst-address=10.10.0.0/24 gateway=PPPOE SERVER IP
dst-address=10.20.22.0/24 gateway=PPPOE SERVER IP

I can ping all clients. When I want to ping my access points that are behind the PPPoE server, I can ping and manage only the Ubiquiti ones. I can't ping or manage the MikroTik devices.

What am I missing?

Best Regards

Re: High CPU load when PPPoE sessions disconnects

Posted: Mon Aug 27, 2018 2:11 pm
by mkx
I'd say check FW rules on those Mikrotik devices.

Re: High CPU load when PPPoE sessions disconnects

Posted: Mon Aug 27, 2018 9:56 pm
by rini
Those are all configured in bridge mode.
I have created a bridge and put all ports in that bridge. There is no firewall at all.

Do I have to create forward rules for ICMP?

Re: High CPU load when PPPoE sessions disconnects

Posted: Mon Aug 27, 2018 11:01 pm
by sindy
Do these Mikrotik devices have any route configured on them? I.e. do they know where to send the icmp response?

Re: High CPU load when PPPoE sessions disconnects

Posted: Mon Aug 27, 2018 11:13 pm
by rini
No. They don't have any firewall; there is only a bridge on the AP:

/interface bridge
add mtu=1500 name=bridge1 protocol-mode=none
/interface bridge port
add bridge=bridge1 interface=ether1
add bridge=bridge1 interface=wlan1

/ip address
add address=192.168.200.2/30 interface=bridge1 network=192.168.200.0

Re: High CPU load when PPPoE sessions disconnects

Posted: Mon Aug 27, 2018 11:18 pm
by sindy
If they have no route, you can only access/ping them from their own subnet. They will bridge any traffic on L2, but if they have to send a packet anywhere outside 192.168.200.0/30, they need a route.

Re: High CPU load when PPPoE sessions disconnects

Posted: Mon Aug 27, 2018 11:23 pm
by rini
Thanks for the fast response.

Can you write me an example of how to add a route there?

Re: High CPU load when PPPoE sessions disconnects

Posted: Mon Aug 27, 2018 11:29 pm
by rini
Thank you.

Problem solved, like you said.

They needed a route.

Now I can access them.

Re: High CPU load when PPPoE sessions disconnects

Posted: Mon Aug 27, 2018 11:31 pm
by sindy
/ip route add dst-address=0.0.0.0/0 gateway=192.168.200.1 (the other IP address in the only connected subnet which hopefully is a device closer to the machine from which you ping).

Re: High CPU load when PPPoE sessions disconnects

Posted: Thu Dec 13, 2018 2:14 pm
by okoun
tomaskir wrote:
It doesn't matter if the user has public or private IP, it's about interfaces.

When interfaces connect/disconnect, in combination with NAT, it gives you high CPU usage.
So simply eliminate NAT from that router.

Have a separate router "in front" of the PPPoE concentrator, that NATs the traffic from the private IPs.
Set up routing (even static routes) between the PPPoE concentrator and the new router.
Terminate public and private IPs on the PPPoE concentrator.

That way, you will not have CPU usage issues.

Masquerade is not the problem.
NAT in general is the problem.
Is there any rule that specifically causes the problem?
Shouldn't MikroTik treat dynamic interfaces as an exception, so that adding/removing them does not affect the connection table in the firewall?

Separating PPPoE and NAT is a nice solution, but it means buying extra hardware unnecessarily.

Re: High CPU load when PPPoE sessions disconnects

Posted: Mon Jan 14, 2019 6:48 pm
by josueflat
I have the same problem on a CCR1072, v6.43.8.

Re: High CPU load when PPPoE sessions disconnects

Posted: Wed Jan 16, 2019 4:02 pm
by AVA
We also have the same problem. I changed all masquerade rules to src-nat rules, but the CCR1036 (PPPoE server with ~1000 sessions) went down again when the router lost ~200 sessions due to a failing link... I decided to build an OSPF setup with redundant PPPoE servers (2 routers without connection tracking) + 2 routers for BGP with connection tracking.

Re: High CPU load when PPPoE sessions disconnects

Posted: Sun Oct 06, 2019 4:15 am
by vipnet
The problem is that queues use 100% CPU. Please fix.