I run a network with nearly 350 clients connected via PPPoE over EoIP.
The underlying network is fully routed and carries a bunch of EoIP tunnels, one to each AP.
These tunnels are bridged to the customer wireless interfaces on the APs, and on the server side they are bridged together in groups based on area, with one PPPoE server per bridge, a total of 8 servers.
About half of the clients are connected through a single licensed link to a second “main tower”, from which the connections are spread out to the APs.
Over the last few days, and especially tonight, I have experienced sudden disconnects of all clients connected over this link, about 150 clients in total.
They disconnect with the message “terminating..peer is not responding”. No other clients go down, i.e. all clients connected through other links from the server to other areas stay connected.
I have a lot of watchdogs for monitoring power. None of them go down along with the clients, so the underlying routed network must be OK.
Can you please help me on this one?
The problem must be in either the EoIP tunnels or PPPoE.
Just a blind guess: do you use connection tracking / stateful firewalling on your underlying network together with dynamic routing?
I ask because dynamic routing is agnostic of the connection states in the firewall. Think about this scenario: the EoIP tunnel gets established over a specific route, the devices in between that track the connection recognize it as a “new” connection, and everything is fine. Suddenly this route is no longer available and the EoIP packets get routed over a different route. Because these packets are not “new” in the sense of connection tracking, they will probably be dropped as “invalid”, so your EoIP tunnel (and of course your PPPoE sessions) will drop.
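To make the scenario concrete, here is a minimal toy model in Python. It is only a sketch of the reasoning above, not of how RouterOS actually tracks connections: the `StatefulHop` class, the `send` helper, the router names and the loopback addresses are all made up for illustration, and whether a real firewall drops rerouted tunnel packets depends on its rules and on how it tracks the protocol.

```python
from dataclasses import dataclass, field

@dataclass
class StatefulHop:
    """Toy stateful firewall on one router of the routed underlay (hypothetical)."""
    name: str
    flows: set = field(default_factory=set)  # flows this hop saw from their first packet

    def forward(self, flow, is_first_packet):
        if is_first_packet:
            self.flows.add(flow)      # classified as "new", state entry created
            return True
        # Mid-stream packet: passes only if this hop already tracks the flow;
        # otherwise this toy model treats it as "invalid" and drops it.
        return flow in self.flows

def send(path, flow, is_first_packet=False):
    """Return True if every hop on the path forwards the packet."""
    return all(hop.forward(flow, is_first_packet) for hop in path)

# EoIP is carried in GRE (IP protocol 47); the addresses are made-up loopbacks.
tunnel = ("10.255.0.1", "10.255.0.2", "gre")
primary = [StatefulHop("r1")]   # route used when the tunnel came up
backup = [StatefulHop("r2")]    # route chosen after a routing change

results = {
    "setup via primary": send(primary, tunnel, is_first_packet=True),
    "steady state via primary": send(primary, tunnel),
    "after reroute via backup": send(backup, tunnel),
}
for step, passed in results.items():
    print(f"{step}: {'forwarded' if passed else 'dropped'}")
```

In this model only the last step is dropped, because `r2` never saw the start of the flow. That would match the symptom of every tunnel behind one link going down at once while the routed network itself stays reachable.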
In any case, I would recommend using MPLS/VPLS instead of EoIP here.
Ape:
Thanks for your reply, and sorry for my late answer.
Somehow, rebooting several RouterBOARDs in the “path” solved the problem, and the PPPoE connections are now rock solid.
I don’t know why. Would I be better off using static routing?
The problem is that there would be a very large number of static routes for the /32 loopback addresses, which is why I tried OSPF.
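As a rough illustration of how quickly the static routes add up, here is a small Python sketch. It assumes the worst case, where every router needs its own /32 route to every other router's loopback and nothing is summarized or covered by a default route. The router counts are hypothetical and not taken from my network.

```python
def static_route_count(routers: int) -> int:
    """Total static /32 routes if each router has one route per remote loopback."""
    return routers * (routers - 1)

# Hypothetical network sizes, for illustration only.
for n in (5, 20, 50):
    print(f"{n} routers -> {static_route_count(n)} static routes")
# 5 routers -> 20 static routes
# 20 routers -> 380 static routes
# 50 routers -> 2450 static routes
```

The total grows roughly with the square of the number of routers, and every new AP means touching every existing router, which is the maintenance burden a dynamic protocol like OSPF avoids.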