VRRP and ISP Failover

Good morning,
I need help from your expertise.
I’m setting up two RB5009 routers in a VRRP load-sharing configuration, meaning both of them are active at the same time.
Here is a summary of the setup (currently running ROS 7.8):

router 1:

  • ether1 pppoe ISP1 (2.5 gbps) (distance 1)
  • ether2: backup LTE connection (dhcp client with an LTE modem) (distance 3)
  • 4 VLANS (vlans 1 & 2 acting as VRRP master and vlans 3 & 4 as backup with router 2) all 4 have dhcp servers
  • sfp-sfpplus1 trunk with the 4 VLANS → mikrotik switch CRS305 (sfp-sfpplus1 in trunk)

router 2:

  • ether1 pppoe ISP2 (2.5 gbps) (distance 1)
  • ether2: backup LTE connection (dhcp client with the same LTE modem, as above) (distance 3)
  • 4 VLANS (vlans 1 & 2 acting as VRRP backup and vlans 3 & 4 as master with router 1) all 4 have dhcp servers
  • sfp-sfpplus1 trunk with the 4 VLANS → mikrotik switch CRS305 (sfp-sfpplus2 in trunk)

For the firewall filter rules I followed this guide (https://forum.mikrotik.com/viewtopic.php?t=180838). As it describes, I accept only what I need to accept and drop everything else at the end of both the input and forward chains.

For testing, I connect test PCs in access mode to the other available SFP+ ports on the CRS305.
The setup described above seems to work properly. If I switch off one of the two routers, the other one promptly takes over all 4 VLANs, and when the first router comes back the 4 VLANs are again shared between the two routers as described above.

I’m currently struggling with failover of the 2 ISP lines between the two routers.
Following this guide (https://forum.mikrotik.com/viewtopic.php?t=182373), I set up dual-WAN recursive routes (the flat variant with 2 recursive routes) to manage the ISP failover.
That means that if ISP1 goes down, router 1 should use ISP2 through router 2 (distance 3 in the routes below), and vice versa. See below.

router #01 routes definition

/ip route
add check-gateway=ping distance=1 dst-address=0.0.0.0/0 gateway=1.0.0.1 scope=10 target-scope=12
add distance=1 dst-address=1.0.0.1/32 gateway=pppoe-out scope=10 target-scope=11 comment="WAN1 ISP1 via PPPoE - ping host 1"

+++++++++++++++++++

add check-gateway=ping distance=2 dst-address=0.0.0.0/0 gateway=9.9.9.9 scope=10 target-scope=12
add distance=2 dst-address=9.9.9.9/32 gateway=pppoe-out scope=10 target-scope=11 comment="WAN1 ISP1 via PPPoE - ping host 2"

+++++++++++++++++++

add distance=3 dst-address=0.0.0.0/0 gateway=ROUTER2-GW-LAN-IP scope=3 target-scope=30 comment="ISP2 via Backup Router"

+++++++++++++++++++

add disabled=no distance=4 dst-address=0.0.0.0/0 gateway=ether2 pref-src="" routing-table=main suppress-hw-offload=no comment="4G/LTE ISP via ether2"

router #02 routes definitions

/ip route
add check-gateway=ping distance=1 dst-address=0.0.0.0/0 gateway=1.0.0.1 scope=10 target-scope=12
add distance=1 dst-address=1.0.0.1/32 gateway=pppoe-out scope=10 target-scope=11 comment="WAN1 ISP2 via PPPoE - ping host 1"

+++++++++++++++++++

add check-gateway=ping distance=2 dst-address=0.0.0.0/0 gateway=9.9.9.9 scope=10 target-scope=12
add distance=2 dst-address=9.9.9.9/32 gateway=pppoe-out scope=10 target-scope=11 comment="WAN1 ISP2 via PPPoE - ping host 2"

+++++++++++++++++++

add distance=3 dst-address=0.0.0.0/0 gateway=ROUTER1-GW-LAN-IP scope=3 target-scope=30 comment="ISP1 via Backup Router"

+++++++++++++++++++

add disabled=no distance=4 dst-address=0.0.0.0/0 gateway=ether2 pref-src="" routing-table=main suppress-hw-offload=no comment="4G/LTE ISP via ether2"

The problem is that when router 1 has to use the ISP2 connection established by router 2 as failover (and the same is true vice versa), I discovered that the forward-chain firewall rule that drops invalid packets (chain=forward action=drop connection-state=invalid) drops all the packets of connections going via the other router. Probably these packets are considered invalid because they do not belong to a connection that the router saw being properly established. To avoid this, I have to put a couple of rules before the drop-invalid rule to accept such packets (in-interface-list=VRRP out-interface=VLAN and out-interface=pppoe-out). Nevertheless, I’m not sure this is the proper way to do it, or whether I’m creating loops, traffic overhead, or other unwanted side effects.

I’ve read that the right thing to do is probably to use a src-nat or masquerade NAT rule, but I have no idea how to do this. I searched the forum posts a lot, but I was not able to understand how to do it.

So I would greatly appreciate your precious help.
Thank you in advance.

Could someone please try to help me?
Thank you.

The quick suggestion is to make sure the VRRP interfaces are in the LAN and the VRRP ip address is a /32. As to what you think WAN/failover and VRRP will do with PPPoE and LTE, I’m not sure.

You should post your entire config as VRRP and ISP/LTE failover are exactly related.

For each VRRP interface one is always active and the others are backup. There is no “both of them are active” with VRRP. Now each VRRP interface can live on different routers.

Here is the config for router 01:

# apr/26/2023 19:16:30 by RouterOS 7.8
# software id = XXX
#
# model = RB5009UG+S+

/interface bridge
add admin-mac=XX:XX:XX:XX:XX:XX auto-mac=no comment=defconf frame-types=
admit-only-vlan-tagged name=bridge protocol-mode=none pvid=20
vlan-filtering=yes
/interface vlan
add interface=bridge name=GUEST_VLAN vlan-id=3090
add interface=bridge name=MGMT_VLAN vlan-id=20
add interface=bridge name=VLAN5 vlan-id=5
add interface=bridge name=VLAN10 vlan-id=10
add interface=ether1 name=VLAN835 vlan-id=835
/interface pppoe-client
add disabled=no interface=VLAN835 name=pppoe-out user=XXX
/interface vrrp
add interface=VLAN5 name=vrrp5 vrid=5
add interface=VLAN5 name=vrrp6 priority=254 vrid=6
add interface=VLAN10 name=vrrp10 vrid=10
add interface=VLAN10 name=vrrp11 priority=200 vrid=11
add interface=MGMT_VLAN name=vrrp20 vrid=20
add interface=MGMT_VLAN name=vrrp21 priority=200 vrid=21
add interface=GUEST_VLAN name=vrrp3090 vrid=90
add interface=GUEST_VLAN name=vrrp3091 priority=254 vrid=91
/interface list
add comment=defconf name=WAN
add comment=defconf name=LAN
add name=VRRP
/interface wireless security-profiles
set [ find default=yes ] supplicant-identity=MikroTik
/ip pool
add name=MGMT_POOL ranges=XX.YY.72.201-XX.YY.72.250
add name=VLAN5_POOL ranges=XX.YY.70.201-XX.YY.70.250
add name=VLAN10_POOL ranges=XX.YY.71.231-XX.YY.71.250
add name=GUEST_POOL ranges=192.168.ZZ.101-192.168.ZZ.150
/ip dhcp-server
add address-pool=MGMT_POOL comment=“DHCP Server”
interface=MGMT_VLAN name=MGMT_DHCP
add address-pool=VLAN5_POOL comment=“DHCP Server” interface=VLAN5
name=VLAN5_DHCP
add address-pool=VLAN10_POOL comment=“DHCP Server” interface=VLAN10 name=
VLAN10_DHCP
add address-pool=GUEST_POOL comment=“Guest DHCP Server” interface=GUEST_VLAN
name=GUEST_DHCP
/user group
add name=remote policy="ssh,read,write,!local,!telnet,!ftp,!reboot,!policy,!test,!winbox,!password,!web,!sniff,!sensitive,!api,!romon,!rest-api"
/interface bridge port
add bridge=bridge comment=defconf frame-types=
admit-only-untagged-and-priority-tagged interface=ether3 pvid=5
add bridge=bridge comment=defconf frame-types=
admit-only-untagged-and-priority-tagged interface=ether4 pvid=10
add bridge=bridge comment=defconf frame-types=
admit-only-untagged-and-priority-tagged interface=ether5 pvid=3090
add bridge=bridge comment=defconf interface=ether6
add bridge=bridge comment=defconf interface=ether7
add bridge=bridge comment=“Management Access Port” frame-types=
admit-only-untagged-and-priority-tagged interface=ether8 pvid=20
add bridge=bridge comment=defconf frame-types=admit-only-vlan-tagged
interface=sfp-sfpplus1
/ip neighbor discovery-settings
set discover-interface-list=LAN
/ip settings
set rp-filter=loose
/ipv6 settings
set disable-ipv6=yes
/interface bridge vlan
add bridge=bridge comment=“VLAN 5” tagged=bridge,sfp-sfpplus1
untagged=ether3 vlan-ids=5
add bridge=bridge comment=“VLAN 10” tagged=bridge,sfp-sfpplus1 untagged=
ether4 vlan-ids=10
add bridge=bridge comment=“VLAN 20” tagged=
bridge,sfp-sfpplus1 untagged=ether8 vlan-ids=20
add bridge=bridge comment=“VLAN Guest” tagged=bridge,sfp-sfpplus1 untagged=
ether5 vlan-ids=3090
/interface list member
add comment=defconf interface=bridge list=LAN
add comment=defconf interface=ether1 list=WAN
add interface=ether2 list=WAN
add interface=pppoe-out list=WAN
add interface=vrrp5 list=VRRP
add interface=vrrp6 list=VRRP
add interface=vrrp10 list=VRRP
add interface=vrrp11 list=VRRP
add interface=vrrp20 list=VRRP
add interface=vrrp21 list=VRRP
add interface=vrrp3090 list=VRRP
add interface=vrrp3091 list=VRRP
/ip address
add address=XX.YY.72.111/24 comment=“VLAN Gateway” interface=
MGMT_VLAN network=XX.YY.72.0
add address=XX.YY.70.111/24 comment=“VLAN Gateway” interface=VLAN5
network=XX.YY.70.0
add address=XX.YY.71.111/24 comment=“VLAN Gateway” interface=VLAN10
network=XX.YY.71.0
add address=192.168.ZZ.253/24 comment=“VLAN Guest Gateway” interface=
GUEST_VLAN network=192.168.ZZ.0
add address=XX.YY.72.115 interface=vrrp20 network=XX.YY.72.115
add address=XX.YY.72.116 interface=vrrp21 network=XX.YY.72.116
add address=XX.YY.70.115 interface=vrrp5 network=XX.YY.70.115
add address=XX.YY.70.116 interface=vrrp6 network=XX.YY.70.116
add address=XX.YY.71.115 interface=vrrp10 network=XX.YY.71.115
add address=XX.YY.71.116 interface=vrrp11 network=XX.YY.71.116
add address=192.168.ZZ.1 interface=vrrp3090 network=192.168.ZZ.1
add address=192.168.ZZ.2 interface=vrrp3091 network=192.168.ZZ.2
/ip dhcp-client
add interface=ether2
/ip dhcp-server network
add address=XX.YY.70.0/24 dns-server=XX.YY.70.111,1.1.1.1,8.8.8.8 gateway=
XX.YY.70.116
add address=XX.YY.71.0/24 dns-server=XX.YY.71.111,1.1.1.1,8.8.8.8 gateway=
XX.YY.71.115
add address=XX.YY.72.0/24 dns-server=XX.YY.72.111,1.1.1.1,8.8.8.8 gateway=
XX.YY.72.115
add address=192.168.ZZ.0/24 dns-server=192.168.ZZ.253,1.1.1.1,8.8.8.8
gateway=192.168.ZZ.1
/ip dns
set allow-remote-requests=yes servers=1.1.1.1,8.8.8.8
/ip dns static
add address=XX.YY.72.115 comment=“Secured / Management Network” name=
router.lan
add address=159.148.172.226 name=upgrade.mikrotik.com
/ip firewall filter
add action=accept chain=input comment=
“defconf: accept established,related,untracked” connection-state=
established,related,untracked
add action=drop chain=input comment="defconf: drop invalid" connection-state=invalid log=yes log-prefix="*** drop invalids ***"
add action=accept chain=input comment=“accept vrrp packets” protocol=vrrp
add action=accept chain=input comment=“defconf: accept ICMP” disabled=yes
protocol=icmp
add action=accept chain=input comment=
“allow VLAN 5 only (inter-vlan is blocked)” dst-address=XX.YY.70.0/24
src-address=XX.YY.70.0/24
add action=accept chain=input comment=
“allow VLAN 10 only (inter-vlan is blocked)” dst-address=XX.YY.71.0/24
src-address=XX.YY.71.0/24
add action=accept chain=input comment=
“allow VLAN 20 only (inter-vlan is blocked)” dst-address=
XX.YY.72.0/24 src-address=XX.YY.72.0/24
add action=accept chain=input comment=
“allow GUEST VLAN 3090 only (inter-vlan is blocked)” disabled=yes
dst-address=192.168.ZZ.0/24 src-address=192.168.ZZ.0/24
add action=accept chain=input comment="defconf: accept local loopback (for Dude, RADIUS, user-manager, CAPsMAN, Wireguard) (https://forum.mikrotik.com/viewtopic.php?t=180838)" dst-address=127.0.0.1
add action=reject chain=input comment="*** TBC LOGGING *** optional -> useful but only if interested in tracking LAN issues (https://forum.mikrotik.com/viewtopic.php?t=180838) - The purpose of the action=reject rule is to prevent users in LAN from waiting for tens of seconds to get a timeout if they are trying to connect to forbidden destinations, and of course for the admin to be aware of traffic that has the potential to be a problem (aka pinpoint device with issues)." in-interface-list=LAN log=yes log-prefix="*** TRACKING LAN ISSUES ***" reject-with=icmp-admin-prohibited
add action=drop chain=input comment=“block everything else”
add action=drop chain=input comment=“defconf: drop all not coming from LAN”
disabled=yes in-interface-list=!LAN
add action=accept chain=forward comment=“defconf: accept in ipsec policy”
ipsec-policy=in,ipsec
add action=accept chain=forward comment=“defconf: accept out ipsec policy”
ipsec-policy=out,ipsec
add action=fasttrack-connection chain=forward comment=“defconf: fasttrack”
connection-state=established,related hw-offload=yes
add action=accept chain=forward comment=
“defconf: accept established,related, untracked” connection-state=
established,related,untracked
add action=accept chain=forward comment="need this rule to manage the ISP failover on the other VRRP router, otherwise these packets will be discarded as invalid by the next rule." in-interface-list=VRRP out-interface=MGMT_VLAN
add action=accept chain=forward comment="need this rule to manage the ISP failover on the other VRRP router, otherwise these packets will be discarded as invalid by the next rule." in-interface-list=VRRP out-interface=pppoe-out
add action=drop chain=forward comment="defconf: drop invalid" connection-state=invalid log=yes log-prefix="*** drop invalid ***"
add action=accept chain=forward comment="allow internet traffic (all vrrp interfaces) - not present in the RB5009 default, added from the CCR2216 (which used all-vlan instead)." in-interface=all-vlan out-interface-list=WAN
add action=accept chain=forward comment="allow port forwarding (https://forum.mikrotik.com/viewtopic.php?t=180838)" connection-nat-state=dstnat disabled=yes
add action=reject chain=forward comment="*** TBC LOGGING *** optional -> useful for tracking LAN issues - in most installations the rule doesn't have to care about multicast traffic because it never sees it (https://forum.mikrotik.com/viewtopic.php?t=180838) - The purpose of the action=reject rule is to prevent users in LAN from waiting for tens of seconds to get a timeout if they are trying to connect to forbidden destinations, and of course for the admin to be aware of traffic that has the potential to be a problem (aka pinpoint device with issues)." dst-address=!0.0.0.0/0 in-interface-list=LAN log=yes log-prefix="*** TRACK LAN ISSUES ***" reject-with=icmp-admin-prohibited
add action=drop chain=forward comment="defconf: drop all from WAN not DSTNATed - drop access to clients behind NAT from WAN - drops all new connection attempts from the WAN port to our LAN network (unless DstNat is used). Without this rule, if an attacker knows or guesses your local subnet, he/she can establish connections directly to local hosts and cause a security threat." connection-nat-state=!dstnat connection-state=new in-interface-list=WAN
add action=drop chain=forward comment="block everything else - not present in the RB5009 default" log-prefix="*** blocked fwd ***"
/ip firewall nat
add action=masquerade chain=srcnat comment=“defconf: masquerade”
ipsec-policy=out,none out-interface-list=WAN
/ip route
add comment=“WAN1 ISP1 via PPPoE” disabled=yes distance=1 dst-address=
0.0.0.0/0 gateway=pppoe-out pref-src=“” routing-table=main scope=30
suppress-hw-offload=no target-scope=10
add comment=“4G/LTE ISP via ether2” disabled=yes distance=2 dst-address=
0.0.0.0/0 gateway=ether2 pref-src=“” routing-table=main scope=30
suppress-hw-offload=no target-scope=10
add check-gateway=ping distance=1 dst-address=0.0.0.0/0 gateway=1.0.0.1
scope=10 target-scope=12
add comment=“WAN1 ISP1 via PPPoE - ping host 1” distance=1 dst-address=
1.0.0.1/32 gateway=pppoe-out scope=10 target-scope=11
add check-gateway=ping distance=2 dst-address=0.0.0.0/0 gateway=9.9.9.9
scope=10 target-scope=12
add comment=“WAN1 ISP1 via PPPoE - ping host 2” distance=2 dst-address=
9.9.9.9/32 gateway=pppoe-out scope=10 target-scope=11
add comment=“ISP2 via Backup Router” disabled=no distance=3 dst-address=
0.0.0.0/0 gateway=XX.YY.72.112 pref-src=“” routing-table=main scope=3
suppress-hw-offload=no target-scope=30
add comment=“4G/LTE ISP via ether2” disabled=no distance=4 dst-address=
0.0.0.0/0 gateway=ether2 pref-src=“” routing-table=main
suppress-hw-offload=no
/ip service
set telnet disabled=yes
set ftp disabled=yes
set www disabled=yes
set www-ssl certificate=Webfig disabled=no
set api disabled=yes
/system clock
set time-zone-name=Europe/Rome
/system identity
set name="MikroTik RB5009 #01"
/system ntp client
set enabled=yes
/system ntp client servers
add address=194.0.5.123
add address=216.239.32.15
/tool mac-server
set allowed-interface-list=none
/tool mac-server mac-winbox
set allowed-interface-list=LAN


And here is the config for router 02:

# apr/26/2023 19:17:17 by RouterOS 7.8
# software id = XXX
#
# model = RB5009UG+S+

/interface bridge
add admin-mac=XX:XX:XX:XX:XX:XX auto-mac=no comment=defconf frame-types=
admit-only-vlan-tagged name=bridge protocol-mode=none pvid=20
vlan-filtering=yes
/interface vlan
add interface=bridge name=GUEST_VLAN vlan-id=3090
add interface=bridge name=MGMT_VLAN vlan-id=20
add interface=bridge name=VLAN5 vlan-id=5
add interface=bridge name=VLAN10 vlan-id=10
add interface=ether1 name=VLAN835 vlan-id=835
/interface pppoe-client
add disabled=no interface=VLAN835 name=pppoe-out user=XXX
/interface vrrp
add interface=VLAN5 name=vrrp5 priority=200 vrid=5
add interface=VLAN5 name=vrrp6 vrid=6
add interface=VLAN10 name=vrrp10 priority=254 vrid=10
add interface=VLAN10 name=vrrp11 vrid=11
add interface=MGMT_VLAN name=vrrp20 priority=254 vrid=20
add interface=MGMT_VLAN name=vrrp21 vrid=21
add interface=GUEST_VLAN name=vrrp3090 priority=200 vrid=90
add interface=GUEST_VLAN name=vrrp3091 vrid=91
/interface list
add comment=defconf name=WAN
add comment=defconf name=LAN
add name=VRRP
/interface wireless security-profiles
set [ find default=yes ] supplicant-identity=MikroTik
/ip pool
add name=MGMT_POOL ranges=XX.YY.72.201-XX.YY.72.250
add name=VLAN5_POOL ranges=XX.YY.70.201-XX.YY.70.250
add name=VLAN10_POOL ranges=XX.YY.71.231-XX.YY.71.250
add name=GUEST_POOL ranges=192.168.ZZ.101-192.168.ZZ.150
/ip dhcp-server
add address-pool=MGMT_POOL comment=“DHCP Server”
interface=MGMT_VLAN name=MGMT_DHCP
add address-pool=VLAN5_POOL comment=“DHCP Server” interface=VLAN5
name=VLAN5_DHCP
add address-pool=VLAN10_POOL comment=“DHCP Server” interface=VLAN10 name=
VLAN10_DHCP
add address-pool=GUEST_POOL comment=“Guest DHCP Server” interface=GUEST_VLAN
name=GUEST_DHCP
/interface bridge port
add bridge=bridge comment=defconf frame-types=
admit-only-untagged-and-priority-tagged interface=ether3 pvid=5
add bridge=bridge comment=defconf frame-types=
admit-only-untagged-and-priority-tagged interface=ether4 pvid=10
add bridge=bridge comment=defconf frame-types=
admit-only-untagged-and-priority-tagged interface=ether5 pvid=3090
add bridge=bridge comment=defconf interface=ether6
add bridge=bridge comment=defconf interface=ether7
add bridge=bridge comment=defconf frame-types=
admit-only-untagged-and-priority-tagged interface=ether8 pvid=20
add bridge=bridge comment=defconf frame-types=admit-only-vlan-tagged
interface=sfp-sfpplus1
/ip neighbor discovery-settings
set discover-interface-list=LAN
/ip settings
set rp-filter=loose
/ipv6 settings
set disable-ipv6=yes
/interface bridge vlan
add bridge=bridge comment=“VLAN 5” tagged=bridge,sfp-sfpplus1
untagged=ether3 vlan-ids=5
add bridge=bridge comment=“VLAN 10” tagged=bridge,sfp-sfpplus1 untagged=
ether4 vlan-ids=10
add bridge=bridge comment=“VLAN 20” tagged=
bridge,sfp-sfpplus1 untagged=ether8 vlan-ids=20
add bridge=bridge comment=“VLAN Guest” tagged=bridge,sfp-sfpplus1 untagged=
ether5 vlan-ids=3090
/interface list member
add comment=defconf interface=bridge list=LAN
add comment=defconf interface=ether1 list=WAN
add interface=ether2 list=WAN
add interface=pppoe-out list=WAN
add interface=vrrp5 list=VRRP
add interface=vrrp6 list=VRRP
add interface=vrrp10 list=VRRP
add interface=vrrp11 list=VRRP
add interface=vrrp20 list=VRRP
add interface=vrrp21 list=VRRP
add interface=vrrp3090 list=VRRP
add interface=vrrp3091 list=VRRP
/ip address
add address=XX.YY.72.112/24 comment=“VLAN Gateway”
interface=MGMT_VLAN network=XX.YY.72.0
add address=XX.YY.70.112/24 comment=“VLAN Gateway” interface=VLAN5
network=XX.YY.70.0
add address=XX.YY.71.112/24 comment=“VLAN Gateway” interface=VLAN10
network=XX.YY.71.0
add address=192.168.ZZ.254/24 comment=“VLAN Guest Gateway” interface=
GUEST_VLAN network=192.168.ZZ.0
add address=XX.YY.72.115 interface=vrrp20 network=XX.YY.72.115
add address=XX.YY.72.116 interface=vrrp21 network=XX.YY.72.116
add address=XX.YY.70.115 interface=vrrp5 network=XX.YY.70.115
add address=XX.YY.70.116 interface=vrrp6 network=XX.YY.70.116
add address=XX.YY.71.115 interface=vrrp10 network=XX.YY.71.115
add address=XX.YY.71.116 interface=vrrp11 network=XX.YY.71.116
add address=192.168.ZZ.1 interface=vrrp3090 network=192.168.ZZ.1
add address=192.168.ZZ.2 interface=vrrp3091 network=192.168.ZZ.2
/ip dhcp-client
add interface=ether2
/ip dhcp-server network
add address=XX.YY.70.0/24 dns-server=XX.YY.70.112,1.1.1.1,8.8.8.8 gateway=
XX.YY.70.116
add address=XX.YY.71.0/24 dns-server=XX.YY.71.112,1.1.1.1,8.8.8.8 gateway=
XX.YY.71.115
add address=XX.YY.72.0/24 dns-server=XX.YY.72.112,1.1.1.1,8.8.8.8 gateway=
XX.YY.72.115
add address=192.168.ZZ.0/24 dns-server=192.168.ZZ.254,1.1.1.1,8.8.8.8
gateway=192.168.ZZ.1
/ip dns
set allow-remote-requests=yes servers=1.1.1.1,8.8.8.8
/ip dns static
add address=XX.YY.72.116 comment=“Secured / Management Network Gateway” name=
router.lan
add address=159.148.172.226 name=upgrade.mikrotik.com
/ip firewall filter
add action=accept chain=input comment=
“defconf: accept established,related,untracked” connection-state=
established,related,untracked
add action=drop chain=input comment=“defconf: drop invalid” connection-state=
invalid
add action=accept chain=input comment=“accept vrrp packets” protocol=vrrp
add action=accept chain=input comment=“defconf: accept ICMP” disabled=yes
protocol=icmp
add action=accept chain=input comment=
“allow VLAN 5 only (inter-vlan is blocked)” dst-address=XX.YY.70.0/24
src-address=XX.YY.70.0/24
add action=accept chain=input comment=
“allow VLAN 10 only (inter-vlan is blocked)” dst-address=XX.YY.71.0/24
src-address=XX.YY.71.0/24
add action=accept chain=input comment=
“allow VLAN 20 only (inter-vlan is blocked)” dst-address=
XX.YY.72.0/24 src-address=XX.YY.72.0/24
add action=accept chain=input comment=
“allow GUEST VLAN 3090 only (inter-vlan is blocked)” disabled=yes
dst-address=192.168.ZZ.0/24 src-address=192.168.ZZ.0/24
add action=accept chain=input comment="defconf: accept local loopback (for Dude, RADIUS, user-manager, CAPsMAN, Wireguard) (https://forum.mikrotik.com/viewtopic.php?t=180838)" dst-address=127.0.0.1
add action=reject chain=input comment="*** TBC LOGGING *** optional -> useful but only if interested in tracking LAN issues (https://forum.mikrotik.com/viewtopic.php?t=180838) - The purpose of the action=reject rule is to prevent users in LAN from waiting for tens of seconds to get a timeout if they are trying to connect to forbidden destinations, and of course for the admin to be aware of traffic that has the potential to be a problem (aka pinpoint device with issues)." in-interface-list=LAN log=yes log-prefix="*** TRACKING LAN ISSUES ***" reject-with=icmp-admin-prohibited
add action=drop chain=input comment=“block everything else”
add action=drop chain=input comment=“defconf: drop all not coming from LAN”
disabled=yes in-interface-list=!LAN
add action=accept chain=forward comment=“defconf: accept in ipsec policy”
ipsec-policy=in,ipsec
add action=accept chain=forward comment=“defconf: accept out ipsec policy”
ipsec-policy=out,ipsec
add action=fasttrack-connection chain=forward comment=“defconf: fasttrack”
connection-state=established,related hw-offload=yes
add action=accept chain=forward comment=
“defconf: accept established,related, untracked” connection-state=
established,related,untracked
add action=accept chain=forward comment="need this rule to manage the ISP failover on the other VRRP router, otherwise these packets will be discarded as invalid by the next rule." in-interface-list=VRRP out-interface=MGMT_VLAN
# pppoe-out not ready
add action=accept chain=forward comment="need this rule to manage the ISP failover on the other VRRP router, otherwise these packets will be discarded as invalid by the next rule." in-interface-list=VRRP out-interface=pppoe-out
add action=drop chain=forward comment="defconf: drop invalid" connection-state=invalid log=yes log-prefix="*** invalid ***"
add action=accept chain=forward comment="allow internet traffic (all vrrp interfaces) - not present in the RB5009 default, added from the CCR2216 (which used all-vlan instead)." in-interface=all-vlan out-interface-list=WAN
add action=accept chain=forward comment="allow port forwarding (https://forum.mikrotik.com/viewtopic.php?t=180838)" connection-nat-state=dstnat disabled=yes
add action=reject chain=forward comment="*** TBC LOGGING *** optional -> useful for tracking LAN issues - in most installations the rule doesn't have to care about multicast traffic because it never sees it (https://forum.mikrotik.com/viewtopic.php?t=180838) - The purpose of the action=reject rule is to prevent users in LAN from waiting for tens of seconds to get a timeout if they are trying to connect to forbidden destinations, and of course for the admin to be aware of traffic that has the potential to be a problem (aka pinpoint device with issues)." dst-address=!0.0.0.0/0 in-interface-list=LAN log=yes log-prefix="*** TRACK LAN ISSUES ***" reject-with=icmp-admin-prohibited
add action=drop chain=forward comment="defconf: drop all from WAN not DSTNATed - drop access to clients behind NAT from WAN - drops all new connection attempts from the WAN port to our LAN network (unless DstNat is used). Without this rule, if an attacker knows or guesses your local subnet, he/she can establish connections directly to local hosts and cause a security threat." connection-nat-state=!dstnat connection-state=new in-interface-list=WAN
add action=drop chain=forward comment="block everything else" log-prefix="*** blocked by fwd ***"
/ip firewall nat
add action=masquerade chain=srcnat comment=“defconf: masquerade”
ipsec-policy=out,none out-interface-list=WAN
/ip route
add comment=“WAN1 ISP2 via PPPoE” disabled=yes distance=1 dst-address=
0.0.0.0/0 gateway=pppoe-out pref-src=“” routing-table=main scope=30
suppress-hw-offload=no target-scope=10
add comment=“4G/LTE ISP via ether2” disabled=yes distance=2 dst-address=
0.0.0.0/0 gateway=ether2 pref-src=“” routing-table=main scope=30
suppress-hw-offload=no target-scope=10
add check-gateway=ping distance=1 dst-address=0.0.0.0/0 gateway=1.0.0.1
scope=10 target-scope=12
add comment=“WAN1 ISP2 via PPPoE - ping host 1” distance=1 dst-address=
1.0.0.1/32 gateway=pppoe-out scope=10 target-scope=11
add check-gateway=ping distance=2 dst-address=0.0.0.0/0 gateway=9.9.9.9
scope=10 target-scope=12
add comment=“WAN1 ISP2 via PPPoE - ping host 2” distance=2 dst-address=
9.9.9.9/32 gateway=pppoe-out scope=10 target-scope=11
add comment=“ISP1 via Backup Router” disabled=no distance=3 dst-address=
0.0.0.0/0 gateway=XX.YY.72.111 pref-src=“” routing-table=main scope=3
suppress-hw-offload=no target-scope=30
add comment=“4G/LTE ISP via ether2” disabled=no distance=4 dst-address=
0.0.0.0/0 gateway=ether2 pref-src=“” routing-table=main
suppress-hw-offload=no
/ip service
set telnet disabled=yes
set ftp disabled=yes
set www disabled=yes
set www-ssl certificate=Webfig disabled=no
set api disabled=yes
/system clock
set time-zone-name=Europe/Rome
/system identity
set name="MikroTik RB5009 #02"
/system ntp client
set enabled=yes
/system ntp client servers
add address=194.0.5.123
add address=216.239.32.15
/tool mac-server
set allowed-interface-list=none
/tool mac-server mac-winbox
set allowed-interface-list=LAN

The VRRP interfaces should be OK; I used /32 addresses. I confirm that from the VRRP standpoint everything seems to be working. If I switch off one of the two routers, the other takes over, and when both are running each of them is master for 2 VLANs and backup for the other 2.

I understand that VRRP does not take care of ISP failover; that’s why I defined the routes to manage it. The problem, I believe, is that the failover route points to the LAN-side gateway IP of the other router, and since both routers sit in the same subnets (with VRRP), the firewall filter considers some packets invalid and discards them. I believe that to avoid this, the failover route should reach the other router from its WAN side and not from its LAN side. To do this I believe I should use srcnat, but I do not know how.

Finally, the backup LTE line is the last resort: it has the highest distance, so it takes over the internet connection only if both ISP1 and ISP2 are down.

I hope I was clear.

I fully agree with you. When I say that both are active, I mean that both routers are running. Router 1 has two VRRP interfaces (with the VLANs attached to them) running as master, with their corresponding backups on router 2, and its other two VRRP interfaces are in backup mode, since their corresponding masters are running on router 2.

Again, I confirm that the VRRP mechanism is working: DHCP clients connect properly to the respective router (based on the VLAN they belong to), and if one of the routers goes offline, the other takes over everything. When it comes back, both work together again (the master and backup states of the VRRP interfaces are shown correctly in the interface list).

I believe my problem is only about properly managing the ISP failover through the recursive routes, which should point to the WAN side of the failover router and not to its LAN side.

Thanks for your support.

The routing solution I’m trying to implement is mentioned by Sindy in this conversation, response #11: http://forum.mikrotik.com/t/how-to-connect-vrrped-routers-to-wan-isp/146850/1

Here Sindy is clearly mentioning: “you can use src-nat when routing via the other router; this will make the other router see that the packet as coming from the local router’s own address, so it can use e.g. an /ip route rule row matching on a particular src-address to choose a routing table that only contains a default route via its own WAN, and it will automatically deliver the response back to that address. The address used for the src-nat must not be from the LAN subnet to avoid ICMP redirect to be sent to the sender, but you can use the same link which the LAN subnet is using for the interconnection of the routers”.

Unfortunately, I do not understand how to specify this route referring to the WAN address of the other router.
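
For what it’s worth, here is a minimal, untested sketch of how the quoted suggestion might translate into RouterOS 7 commands. Several things in it are assumptions and not part of the original configs: the 10.255.255.0/30 link subnet, the routing table name own-wan-only, and the rule comments are hypothetical, and XX.YY.72.0/24 is the masked MGMT subnet from the exports above. The extra /30 sits on the existing MGMT_VLAN so that the src-nat address is not taken from a LAN subnet, as the quote requires.

```routeros
# --- router 1 (sketch; 10.255.255.0/30 is a hypothetical link subnet) ---
/ip address
add address=10.255.255.1/30 interface=MGMT_VLAN comment="inter-router link (assumed)"

/ip route
# would replace the existing distance=3 route pointing at XX.YY.72.112
add distance=3 dst-address=0.0.0.0/0 gateway=10.255.255.2 check-gateway=ping comment="ISP2 via Backup Router"

/ip firewall nat
# present forwarded LAN traffic to router 2 as coming from the link address
add chain=srcnat action=src-nat to-addresses=10.255.255.1 out-interface=MGMT_VLAN dst-address=!XX.YY.72.0/24 comment="src-nat towards backup router (assumed)"

# --- router 2 ---
/ip address
add address=10.255.255.2/30 interface=MGMT_VLAN comment="inter-router link (assumed)"

/routing table
add name=own-wan-only fib

/ip route
# the only route in this table: router 2's own WAN
add dst-address=0.0.0.0/0 gateway=pppoe-out routing-table=own-wan-only

/routing rule
# traffic src-nat'ed by router 1 may only leave via router 2's own WAN
add src-address=10.255.255.1/32 action=lookup-only-in-table table=own-wan-only
```

For failover in the other direction the same pieces would be mirrored (src-nat to 10.255.255.2 on router 2, routing rule for 10.255.255.2/32 on router 1). The forward chains on both routers would still have to accept this traffic, and the existing masquerade rule on the WAN list stays as it is.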

Now I get it better. VRRP is for your 4 LANs. On the WAN side, each router has a VLAN 835 carrying its PPPoE WAN connection, with a PPPoE client on each router. LTE should be the last choice.

At a high level, I think I’d just use an additional VLAN, without VRRP and separate from the PPPoE one, for WAN routing between the two routers, e.g. use the IP of the other router on that new VLAN as the gateway for the backup 0.0.0.0/0 route. Let PPPoE get a WAN address on each router, and that’s it for VLAN835. The new router-to-router VLAN can have a recursive route on it to check that it’s valid (thus no need for VRRP on this particular VLAN). I suppose you could also use the MGMT IP of the far-end router as the gateway for the 2nd route, but it would likely be cleaner to create a new “WAN routing” VLAN.
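
As a rough illustration of that extra VLAN, something along these lines could be added on router 1. This is only a sketch under assumptions: VLAN ID 99, the name XCONN_VLAN, and the 10.99.99.0/30 subnet are made-up values, and the CRS305 would also need to carry VLAN 99 tagged on both trunk ports.

```routeros
# --- router 1 (VLAN 99, XCONN_VLAN and 10.99.99.0/30 are hypothetical) ---
/interface vlan
add interface=bridge name=XCONN_VLAN vlan-id=99

/interface bridge vlan
add bridge=bridge tagged=bridge,sfp-sfpplus1 vlan-ids=99 comment="router-to-router WAN routing"

/ip address
add address=10.99.99.1/30 interface=XCONN_VLAN

/ip route
# would replace the distance=3 route via XX.YY.72.112
add distance=3 dst-address=0.0.0.0/0 gateway=10.99.99.2 check-gateway=ping comment="ISP2 via Backup Router"
```

Router 2 would mirror this with 10.99.99.2/30 and gateway=10.99.99.1. Note that check-gateway=ping here only verifies that the other router answers, not that its ISP is up, so a recursive check may still be wanted. Also, since router 2 is directly connected to the client subnets, replies may still return to clients without passing through router 1, so this alone may not remove the invalid-state drops.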

Otherwise, re-using the PPPoE VLAN for traffic between the routers requires more extensive firewall treatment. Perhaps the src-nat as suggested by @sindy in the other thread would do, but I suspect that’s not the entire story. Personally, I’d avoid additional NAT’ing since it really shouldn’t be needed. If router1’s internet is down, you want to route to router2 first, then get NAT’ed there. You still need a recursive route for it, since if the other router is down you do want to fall through to the 3rd (LTE) route on the same router.

I’m not sure this is an issue yet, but for the ether2/LTE modem you have both a /ip/dhcp-client and a static /ip/route for ether2. You likely just need to set default-route-distance on the DHCP client and a default route will be created automatically, so there is no need for the static route on ether2. As the “last chance” internet, you likely don’t want any recursive routes there, and you’re not doing that, which seems right: there isn’t another choice after it, and it adds complexity (it requires a script on the dhcp-client for LTE and additional “canary addresses”, e.g. 8.8.8.8, if you do want recursive routes).
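As a minimal sketch of that change, assuming a DHCP client already exists on ether2 and that distance 3 is still the value wanted for LTE (as in the setup summary):

```
# Let the existing DHCP client on ether2 install the LTE default route
# itself, with distance 3 so it stays the last resort.
/ip dhcp-client
set [find interface=ether2] add-default-route=yes default-route-distance=3
```

The hand-made static default route for ether2 can then be removed, otherwise two routes compete for the same job.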

One more thing: for the /ip/dhcp-server, I normally have them listen on the VRRP interface. This is just a preference, since I like seeing the leases for a VRRP’d VLAN on one router (i.e. the VRRP master). In theory, it doesn’t matter who provides the DHCP address for a VRRP VLAN, but it does get annoying to track down a lease down the road since you have to look in two places…
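A sketch of what that could look like for one VLAN. Both names are hypothetical (they are not from the posted config), and it has not been checked against the actual export, so treat it as an illustration only:

```
# Hypothetical names: dhcp-vlan1 is the existing DHCP server for VLAN 1,
# vrrp-vlan1 is the VRRP interface running on top of that VLAN.
/ip dhcp-server
set [find name=dhcp-vlan1] interface=vrrp-vlan1
```

The same change would be repeated for each of the 4 VLANs on both routers.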

@ rikpal

router 1:

  • ether1 pppoe ISP1 (2.5 gbps) (distance 1)
  • ether2: backup LTE connection (dhcp client with an LTE modem) (distance 3)
  • 4 VLANS (vlans 1 & 2 acting as VRRP master and vlans 3 & 4 as backup with router 2) all 4 have dhcp servers
  • sfp-sfpplus1 trunk with the 4 VLANS → mikrotik switch CRS305 (sfp-sfpplus1 in trunk)

router 2:

  • ether1 pppoe ISP2 (2.5 gbps) (distance 1)
  • ether2: backup LTE connection (dhcp client with the same LTE modem, as above) (distance 3)
  • 4 VLANS (vlans 1 & 2 acting as VRRP backup and vlans 3 & 4 as master with router 1) all 4 have dhcp servers
  • sfp-sfpplus1 trunk with the 4 VLANS → mikrotik switch CRS305 (sfp-sfpplus2 in trunk)

are you saying that you have 4 internet links, i.e. 2 internet links for each router?

— edit

i am not saying that your vrrp setup is not doable. it’s really nice to have everything in place - but i just thought that maybe you could make your homework easier by doing better design, better assessment? I’m sure that there are many other things to do with your network?:thinking:

Amm0, first I have to thank you for your precious support.
Let me go through your different points / suggestions:

Can you kindly elaborate a bit more on this “WAN routing” VLAN (I’m quite new to MikroTik and this is one of my first router setups)? Should I create it the same way I did for VLAN 835 (required by the ISP), but separate from it? On which interface? And how can I use it on both routers, with 2 IPs on it, so that each router can use the other one as its 2nd route (getting internet access through the other router’s PPPoE connection)? You also mention using the MGMT IP of the far-end router as the 2nd route, but this is exactly what I’m doing today:

/ip route
add comment="ISP2 via Backup Router" disabled=no distance=3 dst-address=\
    0.0.0.0/0 gateway=XX.YY.72.112 pref-src="" routing-table=main scope=3 \
    suppress-hw-offload=no target-scope=30

But the problem I’m having with that is that the firewall filter rule in the forward chain (chain=forward action=drop connection-state=invalid) drops all the packets of connections going via the other router. I believe this is because these packets are not passing through the NAT masquerade. To keep these packets from being dropped, I added 2 accept rules before the drop-invalid rule.

So, will I have the same issue using this new “WAN routing” VLAN? I also need your help to understand how to create it between the two routers, with an IP for each router, and still have internet connectivity.


The @Sindy suggestion is the only one I’ve found on the forum that could address my issue of having VRRP plus a 2nd route for ISP failover via a far-end router. I suppose it could avoid the invalid packets being dropped, by passing them to the WAN of the other router.


With regard to the ether2/LTE modem, I’m still in the configuration process, so it’s simpler for me to test it with a DHCP client getting an IP over an ethernet cable that just provides internet access. The final setup will use a static address, e.g. 192.168.1.2 with gateway 192.168.1.1 on the LAN side of the LTE modem. I created a static route just to be sure it has the highest distance and is used only as a last resort (without the risk of using up the LTE connection without knowing it). I will use this LTE connection for both routers, so I’ll put a dumb switch on the LAN side of the LTE modem and connect it with 2 ethernet cables to ether2 of both routers.
BTW, I still have a doubt here: when ISP1 goes down, router 1 will use the 2nd route to get internet from ISP2 on router 2; if ISP2 is down too, router 1 will get internet through ether2 of router 2 (and exactly the same would happen vice versa, router 2 -> router 1). I do not know whether your “WAN routing” VLAN solution can address this issue too, making the 2nd route work only with the PPPoE connections (maybe yes).


To implement VRRP I googled around the MikroTik forums and manuals (and honestly did not find much), so I worked out how it should work and tried on my own. I started by creating the VLANs on each router and got them working, then I added VRRP on top of them. Now, if you can suggest a different way that leaves me with only 4 active DHCP servers instead of 8, I’d be happier. Keep in mind that I split master and backup between the 2 routers, and the only way I could think of to survive a crash of one of the 2 routers was for the remaining one to have all 4 DHCP servers, but this is my first VRRP experience, so the solution could be different. I believe you are suggesting that putting the DHCP server on the VRRP interface (instead of the VLAN interface) will leave only 4 DHCP servers active at any time. If you could tell me how to implement this (maybe just by changing the interface reference in the DHCP server), I’d be very happy.

Finally, I’m so grateful to you for spending your time helping me with these issues, which I would otherwise not know how to solve.
Thanks again.

I have three: 1 ISP for each router (ISP1, ISP2) and a third link, a 4G/LTE modem shared between the 2 routers


I really appreciate your help. If you have any suggestions, please let me know. I’m quite new to MikroTik, but I love it. I started with switches and I’m currently trying to put the routers in place too.

@ rikpal

I have three: 1 ISP for each router (ISP1, ISP2) and a third link, a 4G/LTE modem shared between the 2 routers

how about putting the lte modem in the drawer first, so that you can focus on the vrrp?

ok. so you have 2 vrrp routers in active - active mode?

r1 master for wan1,
r2 master for wan2,

am i correct?

@wiseroute
yes

hello rikpal,

ok. since I have read that your vrrp has worked perfectly, then this is your next step, right?

I’m currently having issues and struggling in managing failover of the 2 ISPs lines between the two routers.
Following this guide (viewtopic.php?t=182373), I setup dual wan recursive (using 2 recursive routes - flat) rules to manage the two ISP failover.
That means that if ISP1 goes down, router 1 should use (distance 2) the ISP2 from the router 2 and viceversa. See below.

well, it’s not that easy to separate your 2 isp fail over from your vrrp active - active setup, since each vrrp router is actually in active forwarding state.

i am sorry, my eyes couldn’t read long config any longer, so i better ask a simple one.

the question is: when designing this vrrp active - active mode, did you plan any LAN gateway separation? i.e. vlan 1 & 2 mainly for wan1, 3 & 4 going to wan2? do those vlans exist on each switch?

if each pair of vlans is allocated on a single switch, then it should be simple enough to do

  1. gateway detection for each one. distance will have no effect in the wan transition process, it gets overridden by your gateway detection and the script.

the problem is if those vlans are spread across those 2 switches.

  2. if wan1 is offline, kick a script to re-route to wan2, and vice versa.

make new routing tables for fail over, for example: wan1_active, wan1_backup. because each wan represents a different vlan.

so there will be 2 routing tables besides the main table on each router.

  3. just don’t look at the vrrp setup. it’s physical. any device connected to either one of the switches will be lost if that switch is offline. so stay focused on your gateway re-route.

ok. this will be great :+1:t2: good luck :+1:t2:

@wiseroute
Thank you for your suggestions and advice.
I would prefer avoiding scripts, whenever possible.
I understood from @Amm0 that a dedicated VLAN for the WAN (ISP1 and ISP2) could be the right way, but I do not know how to implement it, since my PPPoE interfaces on ether1 are already associated with VLAN 835, as required by my ISPs. Moreover, can I trunk this WAN VLAN alongside the other VLANs on the bridges to share it between the two routers? How will this WAN VLAN get access to the internet? I hope that somebody can help me with this.
The way depicted by @Amm0 seems to answer most of my questions and doubts.
Tnx.

hello rikpal

I would prefer avoiding scripts, whenever possible.

ok. here is the thing - in every fail over scheme there should be some kind of mechanism to move the failing routes to a working one. be it static by script or dynamic using a routing protocol.

are you familiar with it? at least rip should be enough.

and, maybe now you have learned that any vrrp deployment needs thorough assessment.

each vrrp link should be independent from its gateway, i.e. they should be different physical devices, so that there is no split brain when the fail over happens.

ok. good luck :+1:t2:

I was suggesting that you can create a new VLAN network, call it VLAN 100 with 172.22.1.0/24. Same steps as creating any VLAN…
/interface/vlan with vlan-id=100 and an /ip/address of 172.22.1.1/24 and 172.22.1.2/24 for your two routers.
Then for the “2nd ISP” use 172.22.1.2 as the default route gateway on the 1st router, and 172.22.1.1 on the 2nd router. The idea is that each router just routes to its brother. You’d also need to tag the new VLAN 100 in the bridges and on your switch. All the recursive routing stays the same, just using the 172.22.1.x address as the gateway for the 8.8.4.4 (or whatever) route.
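A rough sketch of that inter-router VLAN on router 1, as RouterOS v7 commands. The bridge name `bridge1`, the interface name `vlan100-interlink` and the use of bridge VLAN filtering are assumptions, not taken from the posted config; router 2 would mirror it with 172.22.1.2/24.

```
# Router 1 (sketch). bridge1 and vlan100-interlink are hypothetical names.
/interface vlan
add name=vlan100-interlink vlan-id=100 interface=bridge1
/ip address
add address=172.22.1.1/24 interface=vlan100-interlink
# Tag VLAN 100 on the bridge itself and on the trunk port towards the CRS305.
/interface bridge vlan
add bridge=bridge1 vlan-ids=100 tagged=bridge1,sfp-sfpplus1
```

On the CRS305, VLAN 100 would also have to be tagged on both trunk ports (sfp-sfpplus1 and sfp-sfpplus2), and no DHCP server or VRRP instance is needed on it.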

For failure modes: If one router is physically off, VRRP gets the LANs to the working one, and the recursive route would kill the route to the 2nd router since 8.8.4.4 wouldn’t be pingable. If ISP1 died on one of the routers while the router itself is still working locally (e.g. the 8.8.8.8 primary recursive route fails its ping check), then that router routes to the 2nd router.
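Those failure modes could translate into routes like these on router 1. This is only a sketch: `pppoe-out1` is an assumed name for the ISP1 PPPoE client, the PPPoE client is assumed not to add its own default route, and 8.8.8.8 / 8.8.4.4 are just the example canaries used in this thread. It covers routing only, not the firewall side (the invalid packets mentioned earlier).

```
# Router 1 (sketch). pppoe-out1 is a hypothetical interface name.
/ip route
# Canary host routes: each probe address is reachable through exactly one path.
add dst-address=8.8.8.8/32 gateway=pppoe-out1 scope=10 comment="canary via ISP1"
add dst-address=8.8.4.4/32 gateway=172.22.1.2 scope=10 comment="canary via router 2"
# Recursive default routes, resolved through the canary routes above.
add dst-address=0.0.0.0/0 gateway=8.8.8.8 distance=1 target-scope=11 \
    check-gateway=ping comment="ISP1"
add dst-address=0.0.0.0/0 gateway=8.8.4.4 distance=2 target-scope=11 \
    check-gateway=ping comment="ISP2 via router 2"
```

Router 2 would probably need the canaries swapped (8.8.4.4 via its own PPPoE, 8.8.8.8 via 172.22.1.1), so that each probe leaves through the ISP it is meant to test instead of bouncing between the two routers. The LTE route at distance 3 stays as the last resort.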

With @wiseroute here… leave LTE out initially (although it could be 172.22.1.3 as a 3rd IP on VLAN 100 that you also use in a recursive route). VRRP is totally the right approach for LANs, but the WAN recursive routes part is just tricky, and with two routers even more tricky. Not sure about dynamic routing protocols (RIP or OSPF) here: with only two routes and a lot going on already, I’m not sure that helps. It still needs a liveness check someplace, which unfortunately involves recursive routes to do it well today :wink:

@Amm0
Tnx a lot, I’ll try in the next days and I’ll let you know.
I hope I’ll not have the same issue with the invalid packets being discarded.
Cheers

@ amm0 @ rikpal

very nice discussion :+1:t2:

For failure modes: If one router is physically off, VRRP gets the LANs to the working one, and the recursive route would kill the route to the 2nd router since 8.8.4.4 wouldn’t be pingable. If ISP1 died on one of the routers while the router itself is still working locally (e.g. the 8.8.8.8 primary recursive route fails its ping check), then that router routes to the 2nd router.

well, probably this vrrp discussion can’t take a short answer :joy:

ok. let us observe this example from the wiki.

https://wiki.mikrotik.com/wiki/File:Vrrp-load-sharing.png

i think that is what @ rikpal has in mind.

now, from that picture, do you notice that shared media marked as LAN??

well, in the real world - that shared media is supposed to be the real vrrp switches, which are then attached to the upstream real routers. so, let’s say many of us got confused in understanding how vrrp works.

just don’t think that the vrrp switches are the real router. no. they still need a real gateway to the internet so that those vrrp routers don’t have split brain, and the most important thing is that vrrp only reacts to a physical hardware failure.

question:
so, what is the point of having another router attached to the vrrp switch before going to the internet?

answer: the wan detection should shut down the interface connected to the vrrp switch when the wan fails. the vrrp switches then sense that as a physical failure, which triggers the re-route to wan2.

but.. as for @ rikpal vlans.. i think maybe he got the idea :light_bulb:

hope this helps.