PPPoE reconnect script failed

I have a script that is scheduled to reconnect PPPoE every day at 00:05:00.
This script has worked for a long time without any issues, but today for some reason after disconnecting the connection never came back up.

This is the script:

/log info message="pppoe-reconnect-script start"
/interface pppoe-client disable pppoe-isp
:delay 3s
/ip firewall connection remove [find]
/interface pppoe-client enable pppoe-isp
/log info message="pppoe-reconnect-script done"

Here is the relevant log from today, notice the “pppoe-reconnect-script done” message was never logged.

May  1 00:05:00 router script,info pppoe-reconnect-script start
May  1 00:05:00 router pppoe,ppp,debug pppoe-isp: LCP close
May  1 00:05:00 router pppoe,ppp,debug pppoe-isp: LCP closed
May  1 00:05:00 router pppoe,ppp,debug pppoe-isp: CCP lowerdown
May  1 00:05:00 router pppoe,ppp,debug pppoe-isp: BCP lowerdown
May  1 00:05:00 router pppoe,ppp,debug pppoe-isp: BCP down event in starting state
May  1 00:05:00 router pppoe,ppp,debug pppoe-isp: IPCP lowerdown
May  1 00:05:00 router pppoe,ppp,debug pppoe-isp: IPCP closed
May  1 00:05:00 router pppoe,ppp,debug pppoe-isp: IPV6CP lowerdown
May  1 00:05:00 router pppoe,ppp,debug pppoe-isp: IPV6CP down event in starting state
May  1 00:05:00 router pppoe,ppp,debug pppoe-isp: MPLSCP lowerdown
May  1 00:05:00 router pppoe,ppp,debug,packet  pppoe-isp: sent LCP TermReq id=0x2e
May  1 00:05:00 router pppoe,ppp,debug,packet     administrator request
May  1 00:05:00 router pppoe,ppp,debug,packet  pppoe-isp: rcvd LCP TermAck id=0x2e
May  1 00:05:00 router pppoe,ppp,debug pppoe-isp: LCP lowerdown
May  1 00:05:00 router pppoe,ppp,debug pppoe-isp: CCP close
May  1 00:05:00 router pppoe,ppp,debug pppoe-isp: BCP close
May  1 00:05:00 router pppoe,ppp,debug pppoe-isp: IPCP close
May  1 00:05:00 router pppoe,ppp,debug pppoe-isp: IPV6CP close
May  1 00:05:00 router pppoe,ppp,debug pppoe-isp: MPLSCP close
May  1 00:05:00 router pppoe,ppp,info pppoe-isp: terminating... - administrator request
May  1 00:05:00 router pppoe,debug,packet ether5-gateway: sent PADT to <filtered>
May  1 00:05:00 router pppoe,debug,packet     session-id=0x1a1d
May  1 00:05:00 router pppoe,ppp,debug pppoe-isp: LCP lowerdown
May  1 00:05:00 router system,info device changed by admin
May  1 00:05:00 router pppoe,ppp,debug pppoe-isp: LCP down event in initial state
May  1 00:05:00 router pppoe,ppp,info pppoe-isp: disabled
May  1 00:05:00 router script,info pppoe interface down

Here is what should happen after this, but didn’t:

Apr 30 00:05:00 router system,info device changed by admin
Apr 30 00:05:04 router system,info device changed by admin
Apr 30 00:05:04 router script,info pppoe-reconnect-script done
Apr 30 00:05:04 router pppoe,ppp,info pppoe-isp: initializing...
Apr 30 00:05:04 router pppoe,ppp,info pppoe-isp: connecting...
...

Was there some kind of problem while clearing connections? Because there should be another “device changed by admin” message if that was completed.
Is there some way I should improve this script?
I ran the script manually a few times after this and every time the connection came back up.

RouterOS version was 6.46.5

That’s the most likely explanation. When printing the connection list periodically (/ip firewall connection print interval=1s), I sometimes get a message no such item(4) and the printing stops.


Consider running the connection cleanup this way or, if you use NAT on that interface anyway, use action=masquerade rather than action=src-nat. Whilst the popular belief is that the only purpose of masquerade is to automatically use the IP address dynamically assigned to the interface as to-addresses, there is also another feature associated - whenever that address is de-assigned (because the interface goes down) or changed, the tracked connections which use that address get removed automatically, i.e. you don’t need to do that using the script.

Or, if you actually do use action=masquerade, your script may have failed because the connections got removed automatically before your script could do it.

Just to illustrate what else can go wrong (I think in your case the diagnosis by sindy is likely correct), I have a problem in my router at home:

I am connected to a VDSL network which is provided by a telecom provider, and which connects to several ISPs.
Apparently inside this VDSL network there are routers that add identifying information about my connection, lookup what ISP I have subscribed to, and send the data to the correct ISP.
These routers keep state information. When there is some interruption in the internal network of that telecom provider, it can happen that the PPPoE connection gets disconnected but those routers do not understand that. From then on, it is impossible to re-establish PPPoE. It will just go on and try and always fail.
When I disable the connection for a couple of minutes, it will work again. It will also work again when I reset the VDSL modem (not due to the time it takes to train, but this VDSL carrier loss is correctly communicated to the telecom provider network and they reset their session state).
So as a workaround I have made a script like yours, but it only becomes active when the PPPoE already has failed, and it disables it for 5 minutes and re-enables it.
This triggers maybe once in 3 months but at least my connection always comes back up without manual intervention.