Script for recursive failover based on packet loss

Recursive failover has been working perfectly for my connection with 2 ISPs. Last week my main ISP had intermittent issues: pings still got through, but only intermittently and with latency spikes. I ended up having to disable (renumber) the route manually.

How would I apply recursive failover by checking for packet loss?

I suppose the script logic would be:

  1. Check 8.8.8.8 (recursive for ISP1) & 8.8.4.4 (recursive for ISP2)
  2. Ping 8.8.8.8 or 8.8.4.4 ten times
  3. If packet loss >=20% then find route by comment and disable
  4. If packet loss <20% then find route by comment and enable

Are there past topics that you can refer me to?

Thank you.

That logic would not work. Once you disable a route, you can no longer check whether the link behind it is back to normal.

Updated logic:

  1. Ping 8.8.8.8 (recursive for ISP-1) 20 times

  2. If packet loss = 0% and route to ISP-1 distance = 1,
    do nothing

  3. Else If packet loss >= 20% and route to ISP-1 (find by comment) distance = 1,
    find route to ISP-1 (find by comment) and change distance to 91

  4. Else if packet loss = 0% and route to ISP-1 distance != 1,
    find route to ISP-1 and change distance to 1

I would then have a similar script for ISP-2 except the distances would be different.
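
A minimal sketch of the ISP-1 logic above, assuming the ISP-1 default route carries the comment "ISP-1" (a placeholder) and that 8.8.8.8 is reachable only through ISP-1 via the recursive host route. /ping returns the number of replies received, so the loss percentage is derived from that. Treat it as a starting point rather than a tested implementation:

```
# Number of pings to send and the loss percentage that triggers failover
:local pingCount 20
:local lossThreshold 20

# /ping returns the number of replies received
:local received [/ping 8.8.8.8 count=$pingCount]
:local lossPct ((($pingCount - $received) * 100) / $pingCount)

# Current distance of the route tagged with the (assumed) comment "ISP-1"
:local dist [/ip route get [find comment="ISP-1"] distance]

# Demote the route when loss is high and it is still primary
:if (($lossPct >= $lossThreshold) && ($dist = 1)) do={
    /ip route set [find comment="ISP-1"] distance=91
    :log warning ("ISP-1 demoted, packet loss " . $lossPct . "%")
}

# Restore the route only after a completely clean run
:if (($lossPct = 0) && ($dist != 1)) do={
    /ip route set [find comment="ISP-1"] distance=1
    :log info "ISP-1 restored"
}
```

Loss between 0% and 20% leaves the route as it is, which gives some hysteresis and avoids flapping. The script would be run periodically from /system scheduler.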

Do you agree with the script logic?

FYI, this is the relevant screen in FortiGate for achieving the above.
Fortigate SLA Config.jpeg

You could always do something like this:

:if ([/ping XXX.XXX.XXX.XXX count=100] = 0) do={ :log warning "no ping replies received" }

/ping returns the number of replies received, so you can change the comparison to greater than or less than. For example, if you send 100 pings and 95 or fewer replies come back, you can act on it.
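
For instance, a small sketch of the "95 or less" case, using 8.8.8.8 from earlier in the thread as an assumed target and only logging the result (the action to take is up to you):

```
:local sent 100
# Number of replies that came back out of $sent pings
:local replies [/ping 8.8.8.8 count=$sent]

:if ($replies <= 95) do={
    :log warning ("lost " . ($sent - $replies) . " of " . $sent . " pings")
}
```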

I think I saw something about ROS v7 having a new Netwatch tool that can do packet loss detection.

# Sampling window in seconds; one sample is taken per second.
:local "stat_period" (5*60);
:local "traffic";
# Running sums, initialised to 0 so the additions below are well defined.
:local "tx-packets-per-second" 0;
:local "rx-packets-per-second" 0;
:local "packet_tx_drops_rate" 0;
:local "packet_rx_drops_rate" 0;
#:local "packet_tx_errors_rate" 0;
#:local "packet_rx_errors_rate" 0;

# current_wan is a placeholder for the name of the WAN interface.
:for "count" from=1 to=$"stat_period" do={
 :set $"traffic" [/interface/monitor-traffic interface=current_wan as-value once];
 :set $"packet_tx_drops_rate" ($"packet_tx_drops_rate"+($"traffic"->"tx-queue-drops-per-second"));
 :set $"packet_rx_drops_rate" ($"packet_rx_drops_rate"+($"traffic"->"rx-drops-per-second"));
# :set $"packet_tx_errors_rate" ($"packet_tx_errors_rate"+($"traffic"->"tx-errors-per-second"));
# :set $"packet_rx_errors_rate" ($"packet_rx_errors_rate"+($"traffic"->"rx-errors-per-second"));
 :set $"tx-packets-per-second" ($"tx-packets-per-second"+($"traffic"->"tx-packets-per-second"));
 :set $"rx-packets-per-second" ($"rx-packets-per-second"+($"traffic"->"rx-packets-per-second"));

 :delay 1s;
}

# Turn the sums into drop percentages (dropped packets * 100 / total packets).
# Multiply before dividing to limit integer truncation, and skip idle interfaces.
:if ($"tx-packets-per-second" > 0) do={
 :set $"packet_tx_drops_rate" (($"packet_tx_drops_rate"*100)/$"tx-packets-per-second");
}
:if ($"rx-packets-per-second" > 0) do={
 :set $"packet_rx_drops_rate" (($"packet_rx_drops_rate"*100)/$"rx-packets-per-second");
}
#:set $"packet_tx_errors_rate" ($"packet_tx_errors_rate"/$"stat_period");
#:set $"packet_rx_errors_rate" ($"packet_rx_errors_rate"/$"stat_period");

Then evaluate the failover criteria based on the collected values and switch the gateway by manipulating the routing table…

I think the new implementation of Netwatch in RouterOS 7.4 and later can help:

https://help.mikrotik.com/docs/display/ROS/Netwatch
netwatch.png
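
A rough sketch of what that could look like, assuming the ICMP probe options described on the page above (packet-count, thresh-loss-percent) and reusing the "ISP-1" route comment and the distances 1 and 91 from earlier in the thread. The interval, packet count, and threshold are illustrative values, and the exact value format may differ between versions, so check it against the documentation for your release:

```
/tool netwatch add host=8.8.8.8 type=icmp interval=30s \
    packet-count=20 thresh-loss-percent=20% \
    down-script="/ip route set [find comment=\"ISP-1\"] distance=91" \
    up-script="/ip route set [find comment=\"ISP-1\"] distance=1" \
    comment="ISP-1 packet loss check"
```

Unlike the scripted logic, this restores the route as soon as a probe falls back under the loss threshold rather than waiting for 0% loss, so a marginal link may flap.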

ashpri, take a look: https://github.com/vikilpet/mikrotik-interface-check