PPPoE Failover Script Enhancement - Adding Internet Connectivity Verification

majestic · December 4, 2024, 9:37pm

Hello MikroTik Community,

I’ve developed a failover script for PPPoE connections that uses detect-internet, but I’m looking to enhance its reliability. Currently, the script checks if the PPPoE connection is established and uses detect-internet state, but I’ve identified a scenario where the interface shows as connected yet no traffic can pass through.

Current Script:

:local primaryState [/interface detect-internet state get [find name="pppoe0"] state]
:local backupState [/interface detect-internet state get [find name="ether2"] state]
:local powerState [/interface ethernet poe get ether2 poe-out]

# When primary WAN is down
:if ($primaryState != "internet") do={
    # Check if ether2 is not powered on
    :if ($powerState != "auto-on") do={
        /interface ethernet poe set ether2 poe-out=auto-on
        /ip firewall connection remove [ find ]
        /log info "IntelliFailover: PPPoE interface (pppoe0) is DOWN"
        /log info "IntelliFailover: Switching to backup (ether2) connection"
    }
} else={
    # When primary WAN is restored, check if both interfaces have internet
    :if (($primaryState = "internet") && ($backupState = "internet")) do={
        # Check if ether2 is powered on
        :if ($powerState != "off") do={
            /interface ethernet poe set ether2 poe-out=off
            /ip firewall connection remove [ find ]
            /log info "IntelliFailover: PPPoE interface (pppoe0) is RESTORED"
            /log info "IntelliFailover: Restoring primary (pppoe0) connection"
        }
    }
}

Current Behavior:

Script uses detect-internet state to check interface status
Manages PoE power for backup interface (ether2)
Clears existing connections when switching
Logs failover events

Issue: While the script works well in most scenarios, there are cases where the PPPoE interface might show as having internet (via detect-internet) but actually can’t pass traffic. I’m looking to add additional verification methods to make the failover more robust.

Desired Enhancements:

Additional connectivity checks beyond detect-internet state
Verification of actual traffic flow through the PPPoE interface
Potential implementation of multiple check methods before triggering failover

Questions:

What’s the most reliable method to verify actual internet connectivity through the PPPoE interface beyond detect-internet?
Would adding ping tests to multiple hosts (like 1.1.1.1 and 8.8.8.8) through the specific interface be beneficial?
Are there any recommended timings or thresholds for these additional checks to avoid false positives?
How can we implement these checks while maintaining the script’s current PoE management functionality?

Environment:

RouterOS version: 7.16.2
Hardware model: RB5009

Any guidance would be greatly appreciated, especially regarding:

Implementation of multiple check methods
Best practices for timing and thresholds
Script optimization
Additional safeguards against false positives

Thank you in advance!

rextended · December 9, 2024, 9:34am

Using detect-internet makes this a failure from the start, so I didn’t finish reading it; I don’t waste time on it.
Even when detect-internet works for this purpose, it ruins the functioning of other parts.
Source: the forum.

majestic · December 9, 2024, 9:42am

Hi @rextended,

Thank you for mentioning this. To be honest, I wasn’t sure it was the best way to go. My first attempt used a ping check inside the script, but I had problems forcing the ICMP out of the right connection via the respective interfaces. I will do some more testing. Thanks for your input.

rextended · December 9, 2024, 9:47am

Hypothesis 1:
The pppoe-client does not work. Fine, what do you have in reserve?
If configured, it is all automatic, whether the reserve is a DHCP client, a static route, or another pppoe-client.
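
A minimal sketch of what "all automatic" can look like, assuming the reserve on ether2 is a DHCP client (the thread does not say what the reserve actually is). The distances 1 and 2 are illustrative values, not taken from the original post:

```
# Primary: the PPPoE client installs its own default route at distance 1
/interface pppoe-client set [find name="pppoe0"] add-default-route=yes default-route-distance=1

# Backup (assumed to be a DHCP client on ether2): default route at distance 2
/ip dhcp-client set [find interface=ether2] add-default-route=yes default-route-distance=2
```

With this layout, when the PPPoE session drops its route becomes inactive and the distance-2 route takes over without any script. It only covers the case where the pppoe-client itself is down.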

Hypothesis 2:
The pppoe-client works, but the ISP does not. What do you have in reserve?
As above, but with netwatch and a route, force an IP (as a start) to be reachable only through the pppoe-client, and deactivate it if the remote IP no longer responds; the rest is automatic.
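
One possible reading of this, sketched for RouterOS 7 under loud assumptions: the probe address 1.1.1.1, the comment tag `wan-primary`, the timings, and the choice to disable the primary default route (rather than the client itself) are all illustrative and not from the post. It also assumes the primary default route is a static entry, since a route added dynamically by the pppoe-client cannot be disabled this way:

```
# Static primary default route, tagged so netwatch can find it (hypothetical tag)
/ip route add dst-address=0.0.0.0/0 gateway=pppoe0 distance=1 comment=wan-primary

# Pin the probe address to the PPPoE link
/ip route add dst-address=1.1.1.1/32 gateway=pppoe0 comment="probe via pppoe0"
# If pppoe0 is down, drop the probe instead of letting it follow the backup route
/ip route add dst-address=1.1.1.1/32 blackhole distance=250 comment="probe must not use backup"

# Disable the primary default route while the probe is unreachable, re-enable on recovery
/tool netwatch add host=1.1.1.1 interval=10s timeout=1s comment="pppoe0 probe" \
    down-script="/ip route disable [find comment=wan-primary]" \
    up-script="/ip route enable [find comment=wan-primary]"
```

Because the probe route stays pinned to pppoe0, netwatch keeps testing the primary link even while traffic flows over the backup. The PoE toggle and connection flush from the original script could be appended to the same up and down scripts.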

Hypothesis 3:
Everything works, except the method of checking if the ISP is working.
Do not check only one IP; check more than one, to be sure…
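
A hedged sketch of a multi-host check as a scheduled script. The probe list, the count of 3 pings, and the "all hosts must fail" threshold are assumptions chosen for illustration; each probe address is assumed to be pinned to pppoe0 with its own /32 route so the pings cannot leave via the backup:

```
# Hypothetical probe list; each address is assumed to be routed via pppoe0 only
:local probes {1.1.1.1; 8.8.8.8}
:local reachable 0

:foreach host in=$probes do={
    # /ping returns the number of replies received
    :if ([/ping address=$host count=3 interval=1s] > 0) do={
        :set reachable ($reachable + 1)
    }
}

# Treat the ISP as down only when no probe host answered at all
:if ($reachable = 0) do={
    /log warning "IntelliFailover: no probe host answered via pppoe0"
    # failover action (for example the PoE switch from the original script) would go here
}
```

Requiring every host to fail before acting means a single unresponsive target does not trigger a false failover.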

Hypothesis 4:
etc.
You didn’t specify what alternatives you have for failover…

majestic · December 9, 2024, 9:53am

Got ya. I will rework this and also look more closely at netwatch, which I must admit I overlooked.