pppoe connection was already active closing previous one

oxigeno20 · Tue Nov 23, 2021 11:24 pm

Hi, this is part 2 of the post viewtopic.php?p=892815, which I am continuing here because the original is marked as Solved.

I have a CCR 1072 with almost 1600 PPPoE tunnels, with public and private addresses and IPv6 dual stack.
I have been using the same version of RouterOS for a long time.
The whole network runs on VLANs; every PPPoE server runs over a different VLAN.
Memory and CPU usage are really low.
Everything was going perfectly fine, and a few weeks ago these unexpected, random bulk disconnections started.
I updated RouterOS from 6.48 to 6.48.5, but at least once a day the problem appears again.
It would be nice to have an answer from MikroTik staff.

It's quite strange that when 200 customers disappear, the graphs show a total cut, as if the whole network had no internet.

oxigeno20 · Wed Nov 24, 2021 2:18 pm

I think this problem may have started a month and a half ago, when I changed the PPP secrets to migrate from public static IPs to private static IPs with IPv6 dual stack.
My CPU usage changed a little from that point, and I wonder whether the use of netmap (or the increased CPU usage) could be related to these massive random disconnections.

CPU Usage

oxigeno20 · Wed Nov 24, 2021 4:43 pm

In other posts I have read that the solution could be to do NAT on another router. That does not seem like a professional solution. The 1072 is supposed to be big enough to handle all the requests.

Maggiore81 · Mon Jan 03, 2022 7:10 pm

Well, you ask a question, then answer it yourself... so what is the purpose of the forum?
You didn't even share your configuration so we could check whether everything is OK...

advaitha · Wed Jan 05, 2022 5:21 pm

Hello All,
I have configured my RG750 with a LAN pool and a VPN pool. However, when we connect to the VPN, the IP assigned is from the LAN range, but the gateway shows as 0.0.0.0 and the subnet mask as 255.255.255.255. We need the subnet mask to be 255.255.255.0, like our LAN. Please advise.
Thanking you in advance.

harjeetv · Thu Feb 03, 2022 11:18 am

Doing NAT on another router doesn't do any good. We are facing the same issue with this setup.

sindy · Thu Feb 03, 2022 12:41 pm

Doing NAT (in fact, anything related to connection tracking) on another router addresses just one possible source of CPU load, but there are other sources too.

The real issue is that tearing down and establishing a PPPoE connection is a CPU-intensive task, and that there is apparently no prioritization of PPPoE connection control packets over other traffic. So when "something big" happens that delays the processing of connection control packets so much that the connections are considered dead, the teardown process begins and loads the CPU even more, so even more connections are considered dead, and ultimately all connections go down and start re-establishing slowly. And if the CPU load is close to the edge, that "something big" may not be that big at all; it is just big enough to push the load over the edge, and the self-locking effect takes care of the rest.
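To illustrate the self-locking effect described above, here is a minimal toy model in Python. It is only a sketch: the function name `surviving_sessions`, the load units, the `teardown_cost` value and the linear "share of sessions lost" rule are all assumptions made for illustration, not measured RouterOS behaviour.

```python
import math


def surviving_sessions(sessions, base_load, spike, capacity=100.0,
                       teardown_cost=0.5, ticks=20):
    """Toy model: return how many sessions are still up after `ticks` steps.

    All numbers are illustrative units, not measured RouterOS values.
    """
    active = sessions
    dropped_last_tick = 0
    for tick in range(ticks):
        # One-off traffic spike ("something big") in the first tick only.
        extra = spike if tick == 0 else 0.0
        # Tearing down the sessions lost in the previous tick costs CPU now.
        load = base_load + extra + teardown_cost * dropped_last_tick
        overload = max(0.0, load - capacity)
        # Assumption: the share of sessions whose control packets are
        # processed too late grows linearly with the overload (max 100 %).
        fraction_lost = min(1.0, overload / capacity)
        dropped_last_tick = math.ceil(active * fraction_lost)
        active -= dropped_last_tick
    return active


# Same spike, different base load.
print(surviving_sessions(1600, base_load=95.0, spike=10.0))  # near the edge
print(surviving_sessions(1600, base_load=60.0, spike=10.0))  # with headroom
```

In this model the same spike takes every session down when the base load is close to capacity, and is absorbed without any loss when there is headroom, which is the point of lowering the base load.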

So moving "NAT" away from the PPPoE server just lowers the base load and leaves more headroom for traffic spikes to be handled without hitting the disaster threshold. However, if you move "NAT" away from the PPPoE server but keep connection tracking active on it, you gain nothing, because the real load associated with NAT consists of matching every single packet against the full list of existing connections, regardless of whether the connection to which the packet belongs is actually NATed or not. Rewriting the packet's source and/or destination address takes much less CPU than the search.

In specific environments, that "something big" may be the removal of masqueraded connections from connection tracking, but it only happens when the IP address to which these connections are masqueraded changes. So a teardown of a single PPPoE connection triggers that removal at the client, not at the server (unless the client acts as an uplink of the server, but that doesn't happen in real-world deployments). At the server side, it only happens in multi-WAN environments with masquerade rules, where a WAN interface going down triggers the removal of masqueraded connections, but that's a special case and should normally not happen unless your WAN IPs are dynamic and you thus cannot use a plain src-nat instead of masquerade.

Also, people sometimes forget that it is necessary to add blackhole routes towards the subnets from which the PPPoE clients get their address assignments. When a client disconnects, responses to connections opened by this client keep coming, but as the client is down, the connected /32 route to its address doesn't exist any more, and the packets are sent down the default route, which means back to the upstream router. So, depending on their TTL, they keep circulating between the PPPoE server and the upstream router, adding extra load to the CPU. A blackhole route to the pool subnet, with distance higher than 0, makes sure that this does not happen.
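The route selection behind this can be sketched with a small longest-prefix-match lookup in Python. This is a simplified model, not RouterOS code: the pool 10.10.0.0/24, the client address 10.10.0.5 and the names `lookup` and `table` are hypothetical, and route distance is ignored because it only decides between routes with the same prefix.

```python
import ipaddress


def lookup(routes, destination):
    """Return the target of the most specific route matching `destination`."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, target) for net, target in routes if addr in net]
    if not matches:
        return None
    # Longest prefix wins, as in ordinary IP routing.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def table(client_up, with_blackhole):
    """Build a toy routing table for the PPPoE server (hypothetical addresses)."""
    routes = [(ipaddress.ip_network("0.0.0.0/0"), "upstream")]
    if with_blackhole:
        # Covers the whole client pool; only used when no /32 exists.
        routes.append((ipaddress.ip_network("10.10.0.0/24"), "blackhole"))
    if client_up:
        # Connected /32 route that exists only while the session is up.
        routes.append((ipaddress.ip_network("10.10.0.5/32"), "pppoe-client"))
    return routes


print(lookup(table(client_up=True, with_blackhole=True), "10.10.0.5"))
print(lookup(table(client_up=False, with_blackhole=False), "10.10.0.5"))
print(lookup(table(client_up=False, with_blackhole=True), "10.10.0.5"))
```

While the client is connected, its /32 is the most specific match and traffic is delivered normally. Once the session is gone, the packet either falls through to the default route (and goes back upstream) or, with the pool-wide blackhole route present, is dropped on the spot.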

Yelyah · Fri Sep 08, 2023 4:47 pm

Hi, I have spent a month looking for answers to this problem.

Has anyone found a solution for this problem?

Need help
