I have a CCR1036-8G-2S+ running as a PPPoE server for ±700 customers with external RADIUS authentication. It is currently running v6.46.7.
The majority of my customers are on VLAN 106, with a PPPoE server configured on that interface. VLAN 106 designates FNO1.
I have a second PPPoE server configured on VLAN 3638 currently servicing new customers on FNO2.
At random, customers on V3638 disconnect and then fail to reconnect. Nothing in the logs shows login attempts or even failed login attempts; the only way to reconnect the customers is to disable and re-enable the PPPoE server on V3638. This can happen multiple times a day, or only once every two weeks. It seems to be random.
Customers on V106 with FNO1 do not have this problem at all; it affects only customers on V3638.
The only differences in configuration between the two PPPoE servers are:
The VLAN
The V106 PPPoE server is configured on a bridge interface, while the V3638 PPPoE server is configured on the VLAN interface itself.
I’m not sure I understand the whole topology, i.e. how the physical interface(s), the bridge, the two VLANs, and the two PPPoE servers are linked together. Can you post an export of the configuration?
I have two VLANs running on the SFPPLUS1 interface, one for FNO1 and one for FNO2. Each VLAN has its own PPPoE server.
FNO1 is on V106 and FNO2 is on V3463.
V106 is bridged on its own with no other interface, and the PPPoE server is attached to the bridge.
V3463 has the PPPoE server attached directly to it.
The PPPoE server on V3463 needs to be disabled and then re-enabled any time a client disconnects (router reboot, etc.) in order for the client to re-authenticate. No authentication failures or issues appear in the logs when this happens.
cleanconfig.txt (14.4 KB)
If no failures appear in the log, it looks as if the PPPoE-discovery and PPPoE frames from the client didn’t reach the PPPoE server process; on the other hand, if disabling and re-enabling the server makes things work again, it seems that the issue is the process itself.
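One way to check whether the server process sees the discovery frames at all is to enable debug logging for PPPoE before the next incident. This is only a sketch: it assumes the default `memory` logging action still exists on the router, and debug logging for ±700 sessions can be noisy, so it is probably best left on only while reproducing the problem.

```
# Log PPPoE debug messages to the in-memory log (assumes the default "memory" action).
/system logging add topics=pppoe,debug action=memory
# After a client has failed to reconnect, look at what the server logged.
/log print where topics~"pppoe"
```

If nothing at all shows up for the affected client while V106 clients are logged normally, that would point towards the frames not reaching the server process.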
I cannot see any obvious mistakes, like sfp-sfpplus1 itself being a member port of a bridge while also serving as the carrier interface for VLANs. So as a quick test I’d suggest you replicate the V106 topology for V3463 as well, i.e. insert a dedicated bridge between the /interface vlan and the server.
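A rough sketch of that change, assuming the VLAN interface is named SFP1+FF_PPPOE_V3463 and that the PPPoE server is currently bound directly to it. The bridge name bridge-V3463 is made up for illustration, and the bridge settings (protocol mode etc.) should probably be copied from the existing V106 bridge rather than left at defaults. Expect the sessions on that server to drop when the interface is changed.

```
# Hypothetical bridge name; use whatever fits your naming scheme.
/interface bridge add name=bridge-V3463
# Make the existing VLAN interface the only port of the new bridge.
/interface bridge port add bridge=bridge-V3463 interface="SFP1+FF_PPPOE_V3463"
# Re-point the PPPoE server from the VLAN interface to the bridge.
/interface pppoe-server server set [find interface="SFP1+FF_PPPOE_V3463"] interface=bridge-V3463
```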
The longer path would be to run /tool sniffer quick interface="sfp-sfpplus1-LAN 10G",SFP1+FF_PPPOE_V3463 mac-address=mac:of:a:test:client while the test client is trying to connect, and see whether any PADI frames actually arrive from the client and whether any PADO responses leave towards it.
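If the quick view scrolls by too fast, the same capture can be saved to a file and opened in Wireshark afterwards. This is a sketch under a few assumptions: AA:BB:CC:DD:EE:FF stands in for the test client's real MAC address, the file name is arbitrary, and the capture is taken on the VLAN interface only, because on the carrier interface the frames are still VLAN-tagged and the protocol filter would not match them.

```
# Capture only PPPoE discovery frames (PADI/PADO/PADR/PADS/PADT) to or from the test client.
# AA:BB:CC:DD:EE:FF is a placeholder MAC; pppoe-v3463.pcap is an arbitrary file name.
/tool sniffer set filter-interface="SFP1+FF_PPPOE_V3463" filter-mac-protocol=pppoe-discovery filter-mac-address=AA:BB:CC:DD:EE:FF/FF:FF:FF:FF:FF:FF file-name=pppoe-v3463.pcap
/tool sniffer start
# Let the test client try to connect, then stop the capture.
/tool sniffer stop
```

PADI frames arriving with no PADO going back would suggest the server process is not answering; no PADI at all would suggest the frames are lost before they reach the VLAN interface.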
The fast path, if it succeeds, will only be a workaround: if the bridged setup works, the current setup should have worked too. So the next step would be to open a support ticket with MikroTik.