rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

VRRP and ISP Failover

Thu Apr 27, 2023 8:55 am

Good morning,
I would appreciate help from your expertise.
I'm setting up two RB5009 routers in a VRRP active/balanced configuration, meaning both of them are active at the same time.
This is a summary of the setup (currently running ROS 7.8):

Router 1:
- ether1: PPPoE to ISP1 (2.5 Gbps) (distance 1)
- ether2: backup LTE connection (DHCP client with an LTE modem) (distance 3)
- 4 VLANs (VLANs 1 & 2 acting as VRRP master and VLANs 3 & 4 as backup with router 2); all 4 have DHCP servers
- sfp-sfpplus1: trunk with the 4 VLANs -> MikroTik switch CRS305 (sfp-sfpplus1 in trunk)

Router 2:
- ether1: PPPoE to ISP2 (2.5 Gbps) (distance 1)
- ether2: backup LTE connection (DHCP client with the same LTE modem as above) (distance 3)
- 4 VLANs (VLANs 1 & 2 acting as VRRP backup and VLANs 3 & 4 as master with router 1); all 4 have DHCP servers
- sfp-sfpplus1: trunk with the 4 VLANs -> MikroTik switch CRS305 (sfp-sfpplus2 in trunk)

For the firewall filter rules I followed this guide (viewtopic.php?t=180838). The approach, as described there, is to accept what I need to accept and drop everything else at the end of both the input and forward chains.

For testing, I connect test PCs in access mode to the other available SFP+ ports on the CRS305.
The setup described above seems to work properly. If I switch off one of the two routers, the other promptly takes over all 4 VLANs, and when the first one comes back, the 4 VLANs are again shared between the two routers as described above.

I'm currently struggling with failover of the two ISP lines between the two routers.
Following this guide (viewtopic.php?t=182373), I set up dual-WAN recursive routes (using 2 recursive routes - flat) to manage the ISP failover.
That means that if ISP1 goes down, router 1 should use (distance 2) ISP2 via router 2, and vice versa. See below.

# router #01 routes definition
/ip route
add check-gateway=ping distance=1 dst-address=0.0.0.0/0 gateway=1.0.0.1 scope=10 target-scope=12
add distance=1 dst-address=1.0.0.1/32 gateway=pppoe-out scope=10 target-scope=11 comment="WAN1 ISP1 via PPPoE - ping host 1"
# +++++++++++++++++++
add check-gateway=ping distance=2 dst-address=0.0.0.0/0 gateway=9.9.9.9 scope=10 target-scope=12
add distance=2 dst-address=9.9.9.9/32 gateway=pppoe-out scope=10 target-scope=11 comment="WAN1 ISP1 via PPPoE - ping host 2"
# +++++++++++++++++++
add distance=3 dst-address=0.0.0.0/0 gateway=ROUTER2-GW-LAN-IP scope=3 target-scope=30 comment="ISP2 via Backup Router"
# +++++++++++++++++++
add disabled=no distance=4 dst-address=0.0.0.0/0 gateway=ether2 pref-src="" \
routing-table=main suppress-hw-offload=no comment="4G/LTE ISP via ether2"

# router #02 routes definitions
/ip route
add check-gateway=ping distance=1 dst-address=0.0.0.0/0 gateway=1.0.0.1 scope=10 target-scope=12
add distance=1 dst-address=1.0.0.1/32 gateway=pppoe-out scope=10 target-scope=11 comment="WAN1 ISP2 via PPPoE - ping host 1"
# +++++++++++++++++++
add check-gateway=ping distance=2 dst-address=0.0.0.0/0 gateway=9.9.9.9 scope=10 target-scope=12
add distance=2 dst-address=9.9.9.9/32 gateway=pppoe-out scope=10 target-scope=11 comment="WAN1 ISP2 via PPPoE - ping host 2"
# +++++++++++++++++++
add distance=3 dst-address=0.0.0.0/0 gateway=ROUTER1-GW-LAN-IP scope=3 target-scope=30 comment="ISP1 via Backup Router"
# +++++++++++++++++++
add disabled=no distance=4 dst-address=0.0.0.0/0 gateway=ether2 pref-src="" \
routing-table=main suppress-hw-offload=no comment="4G/LTE ISP via ether2"

The problem is that when router 1 has to use the ISP2 connection established by router 2 as failover (and vice versa), the firewall filter rule that drops invalid packets in the forward chain (chain=forward action=drop connection-state=invalid) drops all packets belonging to connections routed via the other router. Presumably these packets are classified as invalid because they do not belong to properly established new connections. To avoid this, I have to place a couple of accept rules before the drop-invalid rule (in-interface-list=VRRP out-interface=VLAN and out-interface=pppoe-out). However, I'm not sure this is the proper way to do it, or whether it creates loops, traffic overhead, or other unwanted side effects.

I've read that the right approach is probably a src-nat (masquerade) NAT rule, but I have no idea how to set it up. I searched the forum posts extensively but could not work it out.

So I would greatly appreciate your precious help.
Thank you in advance.
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Fri Apr 28, 2023 6:04 am

Could someone please try to help me?
Thank you.
 
Amm0
Forum Guru
Posts: 3169
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: VRRP and ISP Failover

Fri Apr 28, 2023 6:52 am

The quick suggestion is to make sure the VRRP interfaces are in the LAN and the VRRP ip address is a /32. As to what you think WAN/failover and VRRP will do with PPPoE and LTE, I'm not sure.

You should post your entire config as VRRP and ISP/LTE failover are exactly related.
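
As a rough illustration of the first suggestion, a minimal sketch might look like the following. The interface name vrrp5 and the address 192.0.2.115 are placeholders chosen for illustration, not values confirmed at this point in the thread.

```routeros
# Sketch only: vrrp5 and 192.0.2.115 are assumed placeholder values.
# Put the VRRP interface into the LAN interface list so LAN firewall rules apply to it.
/interface list member
add interface=vrrp5 list=LAN
# Assign the virtual IP to the VRRP interface as a /32.
/ip address
add address=192.0.2.115/32 interface=vrrp5
```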
 
Amm0
Forum Guru
Posts: 3169
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: VRRP and ISP Failover

Fri Apr 28, 2023 6:56 am

rikpal wrote: "I'm setting-up two RB5009 routers in VRRP active balanced configuration, so that means both of them are active at the same time."
For each VRRP interface one is always active and the others are backup. There is no "both of them are active" with VRRP. Now each VRRP interface can live on different routers.
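
The arrangement described here, where each VRRP instance has exactly one master but different instances are mastered on different routers, can be sketched roughly as below. The names vlan-example, vrrp-a and vrrp-b and the priority values are illustrative assumptions; for a given VRID, the router with the higher priority becomes master.

```routeros
# Sketch only: interface names, VRIDs and priorities are assumptions.
# Router A: master for VRID 1, backup for VRID 2
/interface vrrp
add interface=vlan-example name=vrrp-a vrid=1 priority=200
add interface=vlan-example name=vrrp-b vrid=2 priority=100

# Router B: mirrored priorities, so it is backup for VRID 1 and master for VRID 2
/interface vrrp
add interface=vlan-example name=vrrp-a vrid=1 priority=100
add interface=vlan-example name=vrrp-b vrid=2 priority=200
```

If one router goes down, the other becomes master for both VRIDs, so each virtual IP stays reachable.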
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Fri Apr 28, 2023 8:34 am

Here is the config for router 01:

# apr/26/2023 19:16:30 by RouterOS 7.8
# software id = XXX
#
# model = RB5009UG+S+
/interface bridge
add admin-mac=XX:XX:XX:XX:XX:XX auto-mac=no comment=defconf frame-types=\
admit-only-vlan-tagged name=bridge protocol-mode=none pvid=20 \
vlan-filtering=yes
/interface vlan
add interface=bridge name=GUEST_VLAN vlan-id=3090
add interface=bridge name=MGMT_VLAN vlan-id=20
add interface=bridge name=VLAN5 vlan-id=5
add interface=bridge name=VLAN10 vlan-id=10
add interface=ether1 name=VLAN835 vlan-id=835
/interface pppoe-client
add disabled=no interface=VLAN835 name=pppoe-out user=XXX
/interface vrrp
add interface=VLAN5 name=vrrp5 vrid=5
add interface=VLAN5 name=vrrp6 priority=254 vrid=6
add interface=VLAN10 name=vrrp10 vrid=10
add interface=VLAN10 name=vrrp11 priority=200 vrid=11
add interface=MGMT_VLAN name=vrrp20 vrid=20
add interface=MGMT_VLAN name=vrrp21 priority=200 vrid=21
add interface=GUEST_VLAN name=vrrp3090 vrid=90
add interface=GUEST_VLAN name=vrrp3091 priority=254 vrid=91
/interface list
add comment=defconf name=WAN
add comment=defconf name=LAN
add name=VRRP
/interface wireless security-profiles
set [ find default=yes ] supplicant-identity=MikroTik
/ip pool
add name=MGMT_POOL ranges=XX.YY.72.201-XX.YY.72.250
add name=VLAN5_POOL ranges=XX.YY.70.201-XX.YY.70.250
add name=VLAN10_POOL ranges=XX.YY.71.231-XX.YY.71.250
add name=GUEST_POOL ranges=192.168.ZZ.101-192.168.ZZ.150
/ip dhcp-server
add address-pool=MGMT_POOL comment="DHCP Server" \
interface=MGMT_VLAN name=MGMT_DHCP
add address-pool=VLAN5_POOL comment="DHCP Server" interface=VLAN5 \
name=VLAN5_DHCP
add address-pool=VLAN10_POOL comment="DHCP Server" interface=VLAN10 name=\
VLAN10_DHCP
add address-pool=GUEST_POOL comment="Guest DHCP Server" interface=GUEST_VLAN \
name=GUEST_DHCP
/user group
add name=remote policy="ssh,read,write,!local,!telnet,!ftp,!reboot,!policy,!te\
st,!winbox,!password,!web,!sniff,!sensitive,!api,!romon,!rest-api"
/interface bridge port
add bridge=bridge comment=defconf frame-types=\
admit-only-untagged-and-priority-tagged interface=ether3 pvid=5
add bridge=bridge comment=defconf frame-types=\
admit-only-untagged-and-priority-tagged interface=ether4 pvid=10
add bridge=bridge comment=defconf frame-types=\
admit-only-untagged-and-priority-tagged interface=ether5 pvid=3090
add bridge=bridge comment=defconf interface=ether6
add bridge=bridge comment=defconf interface=ether7
add bridge=bridge comment="Management Access Port" frame-types=\
admit-only-untagged-and-priority-tagged interface=ether8 pvid=20
add bridge=bridge comment=defconf frame-types=admit-only-vlan-tagged \
interface=sfp-sfpplus1
/ip neighbor discovery-settings
set discover-interface-list=LAN
/ip settings
set rp-filter=loose
/ipv6 settings
set disable-ipv6=yes
/interface bridge vlan
add bridge=bridge comment="VLAN 5" tagged=bridge,sfp-sfpplus1 \
untagged=ether3 vlan-ids=5
add bridge=bridge comment="VLAN 10" tagged=bridge,sfp-sfpplus1 untagged=\
ether4 vlan-ids=10
add bridge=bridge comment="VLAN 20" tagged=\
bridge,sfp-sfpplus1 untagged=ether8 vlan-ids=20
add bridge=bridge comment="VLAN Guest" tagged=bridge,sfp-sfpplus1 untagged=\
ether5 vlan-ids=3090
/interface list member
add comment=defconf interface=bridge list=LAN
add comment=defconf interface=ether1 list=WAN
add interface=ether2 list=WAN
add interface=pppoe-out list=WAN
add interface=vrrp5 list=VRRP
add interface=vrrp6 list=VRRP
add interface=vrrp10 list=VRRP
add interface=vrrp11 list=VRRP
add interface=vrrp20 list=VRRP
add interface=vrrp21 list=VRRP
add interface=vrrp3090 list=VRRP
add interface=vrrp3091 list=VRRP
/ip address
add address=XX.YY.72.111/24 comment="VLAN Gateway" interface=\
MGMT_VLAN network=XX.YY.72.0
add address=XX.YY.70.111/24 comment="VLAN Gateway" interface=VLAN5 \
network=XX.YY.70.0
add address=XX.YY.71.111/24 comment="VLAN Gateway" interface=VLAN10 \
network=XX.YY.71.0
add address=192.168.ZZ.253/24 comment="VLAN Guest Gateway" interface=\
GUEST_VLAN network=192.168.ZZ.0
add address=XX.YY.72.115 interface=vrrp20 network=XX.YY.72.115
add address=XX.YY.72.116 interface=vrrp21 network=XX.YY.72.116
add address=XX.YY.70.115 interface=vrrp5 network=XX.YY.70.115
add address=XX.YY.70.116 interface=vrrp6 network=XX.YY.70.116
add address=XX.YY.71.115 interface=vrrp10 network=XX.YY.71.115
add address=XX.YY.71.116 interface=vrrp11 network=XX.YY.71.116
add address=192.168.ZZ.1 interface=vrrp3090 network=192.168.ZZ.1
add address=192.168.ZZ.2 interface=vrrp3091 network=192.168.ZZ.2
/ip dhcp-client
add interface=ether2
/ip dhcp-server network
add address=XX.YY.70.0/24 dns-server=XX.YY.70.111,1.1.1.1,8.8.8.8 gateway=\
XX.YY.70.116
add address=XX.YY.71.0/24 dns-server=XX.YY.71.111,1.1.1.1,8.8.8.8 gateway=\
XX.YY.71.115
add address=XX.YY.72.0/24 dns-server=XX.YY.72.111,1.1.1.1,8.8.8.8 gateway=\
XX.YY.72.115
add address=192.168.ZZ.0/24 dns-server=192.168.ZZ.253,1.1.1.1,8.8.8.8 \
gateway=192.168.ZZ.1
/ip dns
set allow-remote-requests=yes servers=1.1.1.1,8.8.8.8
/ip dns static
add address=XX.YY.72.115 comment="Secured / Management Network" name=\
router.lan
add address=159.148.172.226 name=upgrade.mikrotik.com
/ip firewall filter
add action=accept chain=input comment=\
"defconf: accept established,related,untracked" connection-state=\
established,related,untracked
add action=drop chain=input comment="defconf: drop invalid" connection-state=\
invalid log=yes log-prefix="*** drop invalids ***"
add action=accept chain=input comment="accept vrrp packets" protocol=vrrp
add action=accept chain=input comment="defconf: accept ICMP" disabled=yes \
protocol=icmp
add action=accept chain=input comment=\
"allow VLAN 5 only (inter-vlan is blocked)" dst-address=XX.YY.70.0/24 \
src-address=XX.YY.70.0/24
add action=accept chain=input comment=\
"allow VLAN 10 only (inter-vlan is blocked)" dst-address=XX.YY.71.0/24 \
src-address=XX.YY.71.0/24
add action=accept chain=input comment=\
"allow VLAN 20 only (inter-vlan is blocked)" dst-address=\
XX.YY.72.0/24 src-address=XX.YY.72.0/24
add action=accept chain=input comment=\
"allow GUEST VLAN 3090 only (inter-vlan is blocked)" disabled=yes \
dst-address=192.168.ZZ.0/24 src-address=192.168.ZZ.0/24
add action=accept chain=input comment="\"defconf: accept local loopback (for D\
ude, RADIUS, user-manager, CAPsMAN, Wireguard) (https://forum.mikrotik.com\
/viewtopic.php\?t=180838)" dst-address=127.0.0.1
add action=reject chain=input comment="*** TBC LOGGING *** optional --> useful\
\_but only if interested in tracking LAN issues (https://forum.mikrotik.co\
m/viewtopic.php\?t=180838) - The purpose of the action=reject rule is to p\
revent users in LAN from waiting for tens of seconds to get a timeout if t\
hey are trying to connect to forbidden destinations, and of course for the\
\_admin to be aware of traffic that has the potential to be a problem (aka\
\_pinpoint device with issues)." in-interface-list=LAN log=yes \
log-prefix="*** TRACKING LAN ISSUES ***" reject-with=\
icmp-admin-prohibited
add action=drop chain=input comment="block everything else"
add action=drop chain=input comment="defconf: drop all not coming from LAN" \
disabled=yes in-interface-list=!LAN
add action=accept chain=forward comment="defconf: accept in ipsec policy" \
ipsec-policy=in,ipsec
add action=accept chain=forward comment="defconf: accept out ipsec policy" \
ipsec-policy=out,ipsec
add action=fasttrack-connection chain=forward comment="defconf: fasttrack" \
connection-state=established,related hw-offload=yes
add action=accept chain=forward comment=\
"defconf: accept established,related, untracked" connection-state=\
established,related,untracked
add action=accept chain=forward comment="need this rule to manage the ISP fail\
over on the other VRRP router, otherwise these packets will be discarded a\
s invalid by the next rule." in-interface-list=VRRP out-interface=\
MGMT_VLAN
add action=accept chain=forward comment="need this rule to manage the ISP fail\
over on the other VRRP router, otherwise these packets will be discarded a\
s invalid by the next rule." in-interface-list=VRRP out-interface=\
pppoe-out
add action=drop chain=forward comment="defconf: drop invalid" \
connection-state=invalid log=yes log-prefix="*** drop invalid ***"
add action=accept chain=forward comment="allow internet traffic (all vrrp inte\
rfaces) - not present in RB5009 default, added from CCR2216 (which used al\
l-vlan instead)." in-interface=all-vlan out-interface-list=WAN
add action=accept chain=forward comment="allow port forwarding \
\ (viewtopic.php\?t=180838)\
" connection-nat-state=dstnat disabled=yes
add action=reject chain=forward comment="*** TBC LOGGING *** optional --> usef\
ul for tracking LAN issues - in most installations the rule doesn't have t\
o care about multicast traffic because it never sees it (https://forum.mik\
rotik.com/viewtopic.php\?t=180838) - The purpose of the action=reject rule\
\_is to prevent users in LAN from waiting for tens of seconds to get a tim\
eout if they are trying to connect to forbidden destinations, and of cours\
e for the admin to be aware of traffic that has the potential to be a prob\
lem (aka pinpoint device with issues)." dst-address=!0.0.0.0/0 \
in-interface-list=LAN log=yes log-prefix="*** TRACK LAN ISSUES ***" \
reject-with=icmp-admin-prohibited
add action=drop chain=forward comment="defconf: drop all from WAN not DSTNATed\
\_- drop access to clients behind NAT from WAN - drops all new connection \
attempts from the WAN port to our LAN network (unless DstNat is used). Wit\
hout this rule, if an attacker knows or guesses your local subnet, he/she \
can establish connections directly to local hosts and cause a security thr\
eat." connection-nat-state=!dstnat connection-state=new \
in-interface-list=WAN
add action=drop chain=forward comment="block everything else - not present in\
\_RB5009 default" log-prefix="*** blocked fwd ***"
/ip firewall nat
add action=masquerade chain=srcnat comment="defconf: masquerade" \
ipsec-policy=out,none out-interface-list=WAN
/ip route
add comment="WAN1 ISP1 via PPPoE" disabled=yes distance=1 dst-address=\
0.0.0.0/0 gateway=pppoe-out pref-src="" routing-table=main scope=30 \
suppress-hw-offload=no target-scope=10
add comment="4G/LTE ISP via ether2" disabled=yes distance=2 dst-address=\
0.0.0.0/0 gateway=ether2 pref-src="" routing-table=main scope=30 \
suppress-hw-offload=no target-scope=10
add check-gateway=ping distance=1 dst-address=0.0.0.0/0 gateway=1.0.0.1 \
scope=10 target-scope=12
add comment="WAN1 ISP1 via PPPoE - ping host 1" distance=1 dst-address=\
1.0.0.1/32 gateway=pppoe-out scope=10 target-scope=11
add check-gateway=ping distance=2 dst-address=0.0.0.0/0 gateway=9.9.9.9 \
scope=10 target-scope=12
add comment="WAN1 ISP1 via PPPoE - ping host 2" distance=2 dst-address=\
9.9.9.9/32 gateway=pppoe-out scope=10 target-scope=11
add comment="ISP2 via Backup Router" disabled=no distance=3 dst-address=\
0.0.0.0/0 gateway=XX.YY.72.112 pref-src="" routing-table=main scope=3 \
suppress-hw-offload=no target-scope=30
add comment="4G/LTE ISP via ether2" disabled=no distance=4 dst-address=\
0.0.0.0/0 gateway=ether2 pref-src="" routing-table=main \
suppress-hw-offload=no
/ip service
set telnet disabled=yes
set ftp disabled=yes
set www disabled=yes
set www-ssl certificate=Webfig disabled=no
set api disabled=yes
/system clock
set time-zone-name=Europe/Rome
/system identity
set name="MikroTik RB5009 #01"
/system ntp client
set enabled=yes
/system ntp client servers
add address=194.0.5.123
add address=216.239.32.15
/tool mac-server
set allowed-interface-list=none
/tool mac-server mac-winbox
set allowed-interface-list=LAN

-------------------------------------------------------------------------------------------

And here is the config for router 02:

# apr/26/2023 19:17:17 by RouterOS 7.8
# software id = XXX
#
# model = RB5009UG+S+
/interface bridge
add admin-mac=XX:XX:XX:XX:XX:XX auto-mac=no comment=defconf frame-types=\
admit-only-vlan-tagged name=bridge protocol-mode=none pvid=20 \
vlan-filtering=yes
/interface vlan
add interface=bridge name=GUEST_VLAN vlan-id=3090
add interface=bridge name=MGMT_VLAN vlan-id=20
add interface=bridge name=VLAN5 vlan-id=5
add interface=bridge name=VLAN10 vlan-id=10
add interface=ether1 name=VLAN835 vlan-id=835
/interface pppoe-client
add disabled=no interface=VLAN835 name=pppoe-out user=XXX
/interface vrrp
add interface=VLAN5 name=vrrp5 priority=200 vrid=5
add interface=VLAN5 name=vrrp6 vrid=6
add interface=VLAN10 name=vrrp10 priority=254 vrid=10
add interface=VLAN10 name=vrrp11 vrid=11
add interface=MGMT_VLAN name=vrrp20 priority=254 vrid=20
add interface=MGMT_VLAN name=vrrp21 vrid=21
add interface=GUEST_VLAN name=vrrp3090 priority=200 vrid=90
add interface=GUEST_VLAN name=vrrp3091 vrid=91
/interface list
add comment=defconf name=WAN
add comment=defconf name=LAN
add name=VRRP
/interface wireless security-profiles
set [ find default=yes ] supplicant-identity=MikroTik
/ip pool
add name=MGMT_POOL ranges=XX.YY.72.201-XX.YY.72.250
add name=VLAN5_POOL ranges=XX.YY.70.201-XX.YY.70.250
add name=VLAN10_POOL ranges=XX.YY.71.231-XX.YY.71.250
add name=GUEST_POOL ranges=192.168.ZZ.101-192.168.ZZ.150
/ip dhcp-server
add address-pool=MGMT_POOL comment="DHCP Server" \
interface=MGMT_VLAN name=MGMT_DHCP
add address-pool=VLAN5_POOL comment="DHCP Server" interface=VLAN5 \
name=VLAN5_DHCP
add address-pool=VLAN10_POOL comment="DHCP Server" interface=VLAN10 name=\
VLAN10_DHCP
add address-pool=GUEST_POOL comment="Guest DHCP Server" interface=GUEST_VLAN \
name=GUEST_DHCP
/interface bridge port
add bridge=bridge comment=defconf frame-types=\
admit-only-untagged-and-priority-tagged interface=ether3 pvid=5
add bridge=bridge comment=defconf frame-types=\
admit-only-untagged-and-priority-tagged interface=ether4 pvid=10
add bridge=bridge comment=defconf frame-types=\
admit-only-untagged-and-priority-tagged interface=ether5 pvid=3090
add bridge=bridge comment=defconf interface=ether6
add bridge=bridge comment=defconf interface=ether7
add bridge=bridge comment=defconf frame-types=\
admit-only-untagged-and-priority-tagged interface=ether8 pvid=20
add bridge=bridge comment=defconf frame-types=admit-only-vlan-tagged \
interface=sfp-sfpplus1
/ip neighbor discovery-settings
set discover-interface-list=LAN
/ip settings
set rp-filter=loose
/ipv6 settings
set disable-ipv6=yes
/interface bridge vlan
add bridge=bridge comment="VLAN 5" tagged=bridge,sfp-sfpplus1 \
untagged=ether3 vlan-ids=5
add bridge=bridge comment="VLAN 10" tagged=bridge,sfp-sfpplus1 untagged=\
ether4 vlan-ids=10
add bridge=bridge comment="VLAN 20" tagged=\
bridge,sfp-sfpplus1 untagged=ether8 vlan-ids=20
add bridge=bridge comment="VLAN Guest" tagged=bridge,sfp-sfpplus1 untagged=\
ether5 vlan-ids=3090
/interface list member
add comment=defconf interface=bridge list=LAN
add comment=defconf interface=ether1 list=WAN
add interface=ether2 list=WAN
add interface=pppoe-out list=WAN
add interface=vrrp5 list=VRRP
add interface=vrrp6 list=VRRP
add interface=vrrp10 list=VRRP
add interface=vrrp11 list=VRRP
add interface=vrrp20 list=VRRP
add interface=vrrp21 list=VRRP
add interface=vrrp3090 list=VRRP
add interface=vrrp3091 list=VRRP
/ip address
add address=XX.YY.72.112/24 comment="VLAN Gateway" \
interface=MGMT_VLAN network=XX.YY.72.0
add address=XX.YY.70.112/24 comment="VLAN Gateway" interface=VLAN5 \
network=XX.YY.70.0
add address=XX.YY.71.112/24 comment="VLAN Gateway" interface=VLAN10 \
network=XX.YY.71.0
add address=192.168.ZZ.254/24 comment="VLAN Guest Gateway" interface=\
GUEST_VLAN network=192.168.ZZ.0
add address=XX.YY.72.115 interface=vrrp20 network=XX.YY.72.115
add address=XX.YY.72.116 interface=vrrp21 network=XX.YY.72.116
add address=XX.YY.70.115 interface=vrrp5 network=XX.YY.70.115
add address=XX.YY.70.116 interface=vrrp6 network=XX.YY.70.116
add address=XX.YY.71.115 interface=vrrp10 network=XX.YY.71.115
add address=XX.YY.71.116 interface=vrrp11 network=XX.YY.71.116
add address=192.168.ZZ.1 interface=vrrp3090 network=192.168.ZZ.1
add address=192.168.ZZ.2 interface=vrrp3091 network=192.168.ZZ.2
/ip dhcp-client
add interface=ether2
/ip dhcp-server network
add address=XX.YY.70.0/24 dns-server=XX.YY.70.112,1.1.1.1,8.8.8.8 gateway=\
XX.YY.70.116
add address=XX.YY.71.0/24 dns-server=XX.YY.71.112,1.1.1.1,8.8.8.8 gateway=\
XX.YY.71.115
add address=XX.YY.72.0/24 dns-server=XX.YY.72.112,1.1.1.1,8.8.8.8 gateway=\
XX.YY.72.115
add address=192.168.ZZ.0/24 dns-server=192.168.ZZ.254,1.1.1.1,8.8.8.8 \
gateway=192.168.ZZ.1
/ip dns
set allow-remote-requests=yes servers=1.1.1.1,8.8.8.8
/ip dns static
add address=XX.YY.72.116 comment="Secured / Management Network Gateway" name=\
router.lan
add address=159.148.172.226 name=upgrade.mikrotik.com
/ip firewall filter
add action=accept chain=input comment=\
"defconf: accept established,related,untracked" connection-state=\
established,related,untracked
add action=drop chain=input comment="defconf: drop invalid" connection-state=\
invalid
add action=accept chain=input comment="accept vrrp packets" protocol=vrrp
add action=accept chain=input comment="defconf: accept ICMP" disabled=yes \
protocol=icmp
add action=accept chain=input comment=\
"allow VLAN 5 only (inter-vlan is blocked)" dst-address=XX.YY.70.0/24 \
src-address=XX.YY.70.0/24
add action=accept chain=input comment=\
"allow VLAN 10 only (inter-vlan is blocked)" dst-address=XX.YY.71.0/24 \
src-address=XX.YY.71.0/24
add action=accept chain=input comment=\
"allow VLAN 20 only (inter-vlan is blocked)" dst-address=\
XX.YY.72.0/24 src-address=XX.YY.72.0/24
add action=accept chain=input comment=\
"allow GUEST VLAN 3090 only (inter-vlan is blocked)" disabled=yes \
dst-address=192.168.ZZ.0/24 src-address=192.168.ZZ.0/24
add action=accept chain=input comment="\"defconf: accept local loopback (for D\
ude, RADIUS, user-manager, CAPsMAN, Wireguard) (https://forum.mikrotik.com\
/viewtopic.php\?t=180838)" dst-address=127.0.0.1
add action=reject chain=input comment="*** TBC LOGGING *** optional --> useful\
\_but only if interested in tracking LAN issues (https://forum.mikrotik.co\
m/viewtopic.php\?t=180838) - The purpose of the action=reject rule is to p\
revent users in LAN from waiting for tens of seconds to get a timeout if t\
hey are trying to connect to forbidden destinations, and of course for the\
\_admin to be aware of traffic that has the potential to be a problem (aka\
\_pinpoint device with issues)." in-interface-list=LAN log=yes \
log-prefix="*** TRACKING LAN ISSUES ***" reject-with=\
icmp-admin-prohibited
add action=drop chain=input comment="block everything else"
add action=drop chain=input comment="defconf: drop all not coming from LAN" \
disabled=yes in-interface-list=!LAN
add action=accept chain=forward comment="defconf: accept in ipsec policy" \
ipsec-policy=in,ipsec
add action=accept chain=forward comment="defconf: accept out ipsec policy" \
ipsec-policy=out,ipsec
add action=fasttrack-connection chain=forward comment="defconf: fasttrack" \
connection-state=established,related hw-offload=yes
add action=accept chain=forward comment=\
"defconf: accept established,related, untracked" connection-state=\
established,related,untracked
add action=accept chain=forward comment="need this rule to manage the ISP fail\
over on the other VRRP router, otherwise these packets will be discarded a\
s invalid by the next rule." in-interface-list=VRRP out-interface=\
MGMT_VLAN
# pppoe-out not ready
add action=accept chain=forward comment="need this rule to manage the ISP fail\
over on the other VRRP router, otherwise these packets will be discarded a\
s invalid by the next rule." in-interface-list=VRRP out-interface=\
pppoe-out
add action=drop chain=forward comment="defconf: drop invalid" \
connection-state=invalid log=yes log-prefix="*** invalid ***"
add action=accept chain=forward comment="allow internet traffic (all vrrp inte\
rfaces) - not present in RB5009 default, added from CCR2216 (which used al\
l-vlan instead)." in-interface=all-vlan out-interface-list=WAN
add action=accept chain=forward comment="allow port forwarding \
\ (viewtopic.php\?t=180838)\
" connection-nat-state=dstnat disabled=yes
add action=reject chain=forward comment="*** TBC LOGGING *** optional --> usef\
ul for tracking LAN issues - in most installations the rule doesn't have t\
o care about multicast traffic because it never sees it (https://forum.mik\
rotik.com/viewtopic.php\?t=180838) - The purpose of the action=reject rule\
\_is to prevent users in LAN from waiting for tens of seconds to get a tim\
eout if they are trying to connect to forbidden destinations, and of cours\
e for the admin to be aware of traffic that has the potential to be a prob\
lem (aka pinpoint device with issues)." dst-address=!0.0.0.0/0 \
in-interface-list=LAN log=yes log-prefix="*** TRACK LAN ISSUES ***" \
reject-with=icmp-admin-prohibited
add action=drop chain=forward comment="defconf: drop all from WAN not DSTNATed\
\_- drop access to clients behind NAT from WAN - drops all new connection \
attempts from the WAN port to our LAN network (unless DstNat is used). Wit\
hout this rule, if an attacker knows or guesses your local subnet, he/she \
can establish connections directly to local hosts and cause a security thr\
eat." connection-nat-state=!dstnat connection-state=new \
in-interface-list=WAN
add action=drop chain=forward comment="block everything else" log-prefix=\
"*** blocked by fwd ***"
/ip firewall nat
add action=masquerade chain=srcnat comment="defconf: masquerade" \
ipsec-policy=out,none out-interface-list=WAN
/ip route
add comment="WAN1 ISP2 via PPPoE" disabled=yes distance=1 dst-address=\
0.0.0.0/0 gateway=pppoe-out pref-src="" routing-table=main scope=30 \
suppress-hw-offload=no target-scope=10
add comment="4G/LTE ISP via ether2" disabled=yes distance=2 dst-address=\
0.0.0.0/0 gateway=ether2 pref-src="" routing-table=main scope=30 \
suppress-hw-offload=no target-scope=10
add check-gateway=ping distance=1 dst-address=0.0.0.0/0 gateway=1.0.0.1 \
scope=10 target-scope=12
add comment="WAN1 ISP2 via PPPoE - ping host 1" distance=1 dst-address=\
1.0.0.1/32 gateway=pppoe-out scope=10 target-scope=11
add check-gateway=ping distance=2 dst-address=0.0.0.0/0 gateway=9.9.9.9 \
scope=10 target-scope=12
add comment="WAN1 ISP2 via PPPoE - ping host 2" distance=2 dst-address=\
9.9.9.9/32 gateway=pppoe-out scope=10 target-scope=11
add comment="ISP1 via Backup Router" disabled=no distance=3 dst-address=\
0.0.0.0/0 gateway=XX.YY.72.111 pref-src="" routing-table=main scope=3 \
suppress-hw-offload=no target-scope=30
add comment="4G/LTE ISP via ether2" disabled=no distance=4 dst-address=\
0.0.0.0/0 gateway=ether2 pref-src="" routing-table=main \
suppress-hw-offload=no
/ip service
set telnet disabled=yes
set ftp disabled=yes
set www disabled=yes
set www-ssl certificate=Webfig disabled=no
set api disabled=yes
/system clock
set time-zone-name=Europe/Rome
/system identity
set name="MikroTik RB5009 #02"
/system ntp client
set enabled=yes
/system ntp client servers
add address=194.0.5.123
add address=216.239.32.15
/tool mac-server
set allowed-interface-list=none
/tool mac-server mac-winbox
set allowed-interface-list=LAN
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Fri Apr 28, 2023 8:45 am

Amm0 wrote: "The quick suggestion is to make sure the VRRP interfaces are in the LAN and the VRRP ip address is a /32. As to what you think WAN/failover and VRRP will do with PPPoE and LTE, I'm not sure.

You should post your entire config as VRRP and ISP/LTE failover are exactly related."

VRRP interfaces should be OK. I used the /32 address. I confirm that from a VRRP standpoint everything seems to be working. If I switch off one of the two routers, the other takes over, and when both are working each has 2 VLANs running as master and 2 VLANs as backup for the other router.

I understand that VRRP does not handle ISP failover, which is why I defined the routes to manage it. The problem, I believe, is that the failover route points to the gateway IP of the other router on the LAN side, and since both routers share the same subnets (because of VRRP), this causes issues: the firewall filter treats some packets as invalid and discards them. To avoid this, I believe the failover route should reach the other router from its WAN side rather than from its LAN side. I believe I should use srcnat for this, but I do not know how.

Finally, the backup LTE line is the last resort: it has the highest distance and takes over the internet connection only if both ISP1 and ISP2 are down.

I hope I was clear.
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Fri Apr 28, 2023 8:59 am

rikpal wrote: "I'm setting-up two RB5009 routers in VRRP active balanced configuration, so that means both of them are active at the same time."
Amm0 wrote: "For each VRRP interface one is always active and the others are backup. There is no "both of them are active" with VRRP. Now each VRRP interface can live on different routers."

I fully agree with you. When I say both are active, I mean that both routers are running. Router 1 has two VRRP interfaces (with VLANs attached) running as master (with their corresponding backups on router 2), and its other two VRRP interfaces are in backup mode, since their corresponding masters run on router 2.

Again, I confirm that the VRRP mechanism is working: DHCP clients connect properly to the respective router (based on the VLAN they belong to), and if one of the routers goes offline, the other takes over everything. When it comes back, both work together again (the master and backup states of the VRRP interfaces are shown correctly in the interface list).

My problem is (I believe) only about managing ISP failover properly through the recursive routes, which should point to the WAN side of the failover router and not to its LAN side.

Thanks for your support.
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Fri Apr 28, 2023 12:48 pm

The routing solution I'm trying to implement is mentioned by Sindy in this conversation, response #11: viewtopic.php?t=172532

Here Sindy is clearly mentioning: "you can use src-nat when routing via the other router; this will make the other router see that the packet as coming from the local router's own address, so it can use e.g. an /ip route rule row matching on a particular src-address to choose a routing table that only contains a default route via its own WAN, and it will automatically deliver the response back to that address. The address used for the src-nat must not be from the LAN subnet to avoid ICMP redirect to be sent to the sender, but you can use the same link which the LAN subnet is using for the interconnection of the routers".

Unfortunately, I do not understand how to specify this route referring to the WAN address of the other router.
 
Amm0
Forum Guru
Posts: 3169
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: VRRP and ISP Failover

Fri Apr 28, 2023 5:25 pm

Now I get it more. VRRP is for your 4 LANs. On WAN, you have a VLAN 835 with the PPPoE WAN connection and the two routers, with a PPPoE client on each router. LTE should be the last choice.

At a high level, I think I'd just use an additional VLAN, without VRRP and separate from the PPPoE one, for WAN routing between the two routers, e.g. use the IP of the other router on that new VLAN as the gateway for the backup 0.0.0.0/0 route. Let PPPoE get a WAN address on each router, and that's it for VLAN 835. The new router-to-router VLAN can have a recursive route on it to check it's valid (thus no need for VRRP on this particular VLAN). I also suppose you could use the MGMT IP of the far-end router as the 2nd route, but it would likely be cleaner to create a new "WAN routing" VLAN.

Otherwise, re-using the PPPoE VLAN for traffic between the routers requires more extensive firewall treatment. Perhaps the src-nat as suggested by @sindy in the other thread, but that's not the entire story, I suspect. Personally, I'd avoid additional NAT'ing since it really shouldn't be needed. If router1's internet is down, you want to route to router2 first, then get NAT'ed there. You still need a recursive route on it, since if the other router was down, you do want to go to the 3rd (LTE) route on the same router.

I'm not sure this is an issue yet. But Ether2/LTE modem, you have a /ip/dhcp-client and a static /ip/route for ether2... likely just need to set the default-route-distance and a default route will be created automatically... so no need for the static route for ether2 since dhcp-client should take care of it. As the "last chance" internet, you likely don't want any recursive routes and you're not doing this which seems right — both since there isn't another choice & also it adds more complexity (requires script on dhcp-client for LTE and additional "canary addresses" (e.g. 8.8.8.8, etc.) if you do want recursive routes).
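Roughly what I mean, as an untested sketch: let the DHCP client on ether2 install the LTE default route and drop the static one. The distance value 4 is an assumption (it only has to be higher than every other default route), and the `find` on the route comment assumes the static route carries the comment "4G/LTE ISP via ether2"; adjust both to your config.

```
# Same on router 1 and router 2: let the DHCP client create the LTE default route.
# distance 4 is an assumed value; it just has to keep LTE as the last choice.
/ip dhcp-client set [find interface=ether2] add-default-route=yes default-route-distance=4

# Remove the now-redundant static route (the comment string is an assumption about your config).
/ip route remove [find comment="4G/LTE ISP via ether2"]
```

If you later move ether2 to a static address instead of DHCP, the static route comes back and this no longer applies.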

One more thing: for the /ip/dhcp-server, I normally have them listen on the VRRP interface. But this is just preference, since I like seeing the leases for a VRRP'd VLAN on one router (e.g. the VRRP master). In theory, it doesn't matter who provides the DHCP address for a VRRP VLAN, but it does get annoying to track down a lease down the road since you have to look in two places...
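As a sketch of that preference (untested, and the interface names are hypothetical: `vlan10` stands for one of your VLAN interfaces and `vrrp-vlan10` for the VRRP interface on top of it), moving an existing DHCP server would be something like:

```
# Hypothetical names: vlan10 = VLAN interface, vrrp-vlan10 = VRRP interface running on it.
# Point the existing DHCP server at the VRRP interface instead of the VLAN interface.
/ip dhcp-server set [find interface=vlan10] interface=vrrp-vlan10
```

You would repeat this for each VLAN on both routers, keeping the pools and networks as they are.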
 
wiseroute
Member
Posts: 352
Joined: Sun Feb 05, 2023 11:06 am

Re: VRRP and ISP Failover

Fri Apr 28, 2023 6:37 pm

@ rickpal
router 1:
- ether1 pppoe ISP1 (2.5 gbps) (distance 1)
- ether2: backup LTE connection (dhcp client with an LTE modem) (distance 3)
- 4 VLANS (vlans 1 & 2 acting as VRRP master and vlans 3 & 4 as backup with router 2) all 4 have dhcp servers
- sfp-sfpplus1 trunk with the 4 VLANS -> mikrotik switch CRS305 (sfp-sfpplus1 in trunk)

router 2:
- ether1 pppoe ISP2 (2.5 gbps) (distance 1)
- ether2: backup LTE connection (dhcp client with the same LTE modem, as above) (distance 3)
- 4 VLANS (vlans 1 & 2 acting as VRRP backup and vlans 3 & 4 as master with router 1) all 4 have dhcp servers
- sfp-sfpplus1 trunk with the 4 VLANS -> mikrotik switch CRS305 (sfp-sfpplus2 in trunk)
are you saying that you have 4 internet links, which are 2 internet for each router?

--- edit

i am not saying that your vrrp setup is not doable. it's really nice to have everything in place - but i just thought that maybe you could make your homework easier by doing better design, better assessment? I'm sure that there are many other things to do with your network?🤔
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Fri Apr 28, 2023 7:06 pm

Amm0, first I've to thank you for your precious support.
Let me go through your different points / suggestions:
At a high level, I think I'd just use an additional VLAN, without VRRP and separate from the PPPoE one, for WAN routing between the two routers, e.g. use the IP of the other router on that new VLAN as the gateway for the backup 0.0.0.0/0 route. Let PPPoE get a WAN address on each router, and that's it for VLAN 835. The new router-to-router VLAN can have a recursive route on it to check it's valid (thus no need for VRRP on this particular VLAN). I also suppose you could use the MGMT IP of the far-end router as the 2nd route, but it would likely be cleaner to create a new "WAN routing" VLAN.
Can you kindly elaborate a bit more (I'm quite new to mikrotik and this is one of my first router setup) about this "WAN Routing" vlan? Should I create it in the same way I did for the 835 (required by the ISP), separate from this one(?), on which interface(?) and how can I use it on both routers and have 2 IPs on it to use on the other router as 2nd route (getting internet access from each of the pppoe connections)? You mention also to use the MGMT IP of the far-end router as the 2nd route, but this is exactly what I'm doing today:

/ip route
add comment="ISP2 via Backup Router" disabled=no distance=3 dst-address=\
0.0.0.0/0 gateway=XX.YY.72.112 pref-src="" routing-table=main scope=3 \
suppress-hw-offload=no target-scope=30

but the problem I have with this is that the firewall filter rule in the forward chain (chain=forward action=drop connection-state=invalid) drops all the packets of connections going via the other router. I believe this is because these packets are not passing through the NAT masquerade. So, to avoid these packets being dropped, I added 2 accept rules before the drop-invalid rule.

So, will I have the same issue using this new "WAN Routing" VLAN? I also need your help to understand how to create it between the two routers, with the 2 IPs for the 2 routers and an internet connection too.

Otherwise, re-using the PPPoE VLAN for traffic between the routers requires more extensive firewall treatment. Perhaps the src-nat as suggested by @sindy in the other thread, but that's not the entire story, I suspect. Personally, I'd avoid additional NAT'ing since it really shouldn't be needed. If router1's internet is down, you want to route to router2 first, then get NAT'ed there. You still need a recursive route on it, since if the other router was down, you do want to go to the 3rd (LTE) route on the same router.
The @Sindy suggestion is the only one I've found on the forum that could address my issue of having VRRP plus a 2nd route for ISP failover on a far-end router. I suppose it could avoid having the invalid packets dropped, by passing them to the WAN side of the other router.

I'm not sure this is an issue yet. But Ether2/LTE modem, you have a /ip/dhcp-client and a static /ip/route for ether2... likely just need to set the default-route-distance and a default route will be created automatically... so no need for the static route for ether2 since dhcp-client should take care of it. As the "last chance" internet, you likely don't want any recursive routes and you're not doing this which seems right — both since there isn't another choice & also it adds more complexity (requires script on dhcp-client for LTE and additional "canary addresses" (e.g. 8.8.8.8, etc.) if you do want recursive routes).
With regard to the ether2/LTE modem, I'm still in the configuration process, so it's simpler for me to test it with a dhcp client getting an IP from an ether cable with just internet access. The final situation will be a static address e.g. 192.168.1.2 with a GW 192.168.1.1 from the LAN side of the LTE modem. I created a static route just to be sure to have the highest distance and use it only as last resort (without risk to dragging the LTE connection without knowing it). I will use this LTE connection for both routers, so I'll put a dumb switch on the LAN LTE modem and I'll connect it with 2 ethernet cables to the ether2 of both routers.
BTW, in this regard I still have a doubt: when ISP1 goes down and router 1 uses the 2nd route to get internet from router 2's ISP2, if that ISP is down too, router 1 will get internet through ether2 of router 2 (and exactly the same would happen vice versa, router 2 -> router 1). I do not know if your solution of using the "WAN Routing" VLAN will be able to address this issue too, making the 2nd route work only with the PPPoE connections (maybe yes).

One more thing: for the /ip/dhcp-server, I normally have them listen on the VRRP interface. But this is just preference, since I like seeing the leases for a VRRP'd VLAN on one router (e.g. the VRRP master). In theory, it doesn't matter who provides the DHCP address for a VRRP VLAN, but it does get annoying to track down a lease down the road since you have to look in two places...
To implement VRRP I googled around the MikroTik forums and manuals (and honestly did not find much), so I worked out how it should work and tried on my own. I started by creating the VLANs on each router and got them working, then I added VRRP on top of them. Now, if you could suggest a different way so that I have only 4 active DHCP servers instead of 8, I'd be happier. Consider anyway that I split master and backup between the 2 routers, and the only way I thought it would work if one of the 2 routers crashes is for the remaining one to have all 4 DHCP servers; but this is my first VRRP experience, so the solution could be different. I believe you are suggesting that putting the VRRP interface (instead of the VLAN) on the DHCP server will leave only 4 DHCP servers active at any time. If you could suggest how to implement this (maybe just changing the interface reference in the DHCP server), I'd be very happy.

Finally, I'm so grateful to you, since you are spending your time helping me in addressing these issues, otherwise I'll not know how to solve.
Thanks again.
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Fri Apr 28, 2023 7:13 pm

@ rickpal
...
are you saying that you have 4 internet links, which are 2 internet for each router?
I've three: 1 ISP for each router (ISP1, ISP2) and a third 4G/LTE modem shared between the 2 routers

i am not saying that your vrrp setup is not doable. it's really nice to have everything in place - but i just thought that maybe you could make your homework easier by doing better design, better assessment? I'm sure that there are many other things to do with your network?🤔
I really appreciate your help. If you have any suggestions, please let me know. I'm quite new to MikroTik, but I love it. I started with switches and I'm currently trying to set up the router too.
 
wiseroute
Member
Posts: 352
Joined: Sun Feb 05, 2023 11:06 am

Re: VRRP and ISP Failover

Fri Apr 28, 2023 7:42 pm

@ rikpal
I've three: 1 ISP for each router (ISP1, ISP2) and a third 4G/LTE modem shared between the 2 routers
how about to put the lte modem in the drawer first, so that you can focus on the vrrp ?

ok. so you have 2 vrrp routers in active - active mode?

r1 master for wan1,
r2 master for wan2,

am i correct?
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Fri Apr 28, 2023 7:51 pm

@wiseroute
yes
 
wiseroute
Member
Posts: 352
Joined: Sun Feb 05, 2023 11:06 am

Re: VRRP and ISP Failover

Fri Apr 28, 2023 8:29 pm

hello rikpal,

ok. since I have read that your vrrp have worked perfectly, then this is your next step right?
I'm currently having issues and struggling in managing failover of the 2 ISPs lines between the two routers.
Following this guide (viewtopic.php?t=182373), I setup dual wan recursive (using 2 recursive routes - flat) rules to manage the two ISP failover.
That means that if ISP1 goes down, router 1 should use (distance 2) the ISP2 from the router 2 and viceversa. See below.
well, it's not that easy to separate your 2 isp fail over from your vrrp active - active setup, since each vrrp router is actually in active forwarding state.

i am sorry, my eyes couldn't read long config any longer, so i better ask a simple one.

the question is: when designing that vrrp active - active mode, do you have any LAN gateway separation? i.e. vlan 1 & 2 mainly go to wan1, 3 & 4 go to wan2? do those vlans exist on each switch?

if each 2 vlans are allocated on single switch, then it should be simple enough to do
1. gateway detection for each one. distance will have no effect in the wan transition process, it got overridden by your gateway detection and the script.

the problem is if those vlans are spread across those 2 switches.

2. if wan1 offline, kick script to re-route to wan2, and vice versa.

make new routing tables for fail over, examples : wan1_active, wan1_backup. because each wan represent different vlan.

so there will be 2 routing tables besides main table on each router.

3. just don't look at the vrrp setup. it's physical. any device connected to either one of the switches will be lost if that switch is offline. so stay focused on your gateway re-route.

ok. this will be great 👍🏻 good luck 👍🏻
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Sat Apr 29, 2023 8:27 am

@wiseroute
Thank you for your suggestions and advices.
I would prefer avoiding scripts, whenever possible.
I understood from @Amm0 that a dedicated VLAN for the WAN (ISP1 and ISP2) could be the right way, but I do not know how to implement it, since my PPPoE interfaces on ether1 are already associated with the VLAN 835 required by my ISPs. Moreover, can I trunk this WAN VLAN into the other bridges' VLANs to share it between the two routers? How will this WAN VLAN have access to the internet? I hope that somebody can help me with this.
The way depicted by @Amm0 seems to answer to most of my questions and doubts.
Tnx.
 
wiseroute
Member
Posts: 352
Joined: Sun Feb 05, 2023 11:06 am

Re: VRRP and ISP Failover

Sat Apr 29, 2023 10:44 am

hello rikpal
I would prefer avoiding scripts, whenever possible.
ok. here is the thing - in every fail over schema there should be some kind of mechanism to move the failing routes to working one. be it static by script or dynamic using routing protocol.

are you familiar with it? at least rip should be enough.

and, maybe now you have learned that any vrrp deployment needs thorough assessment.

for each vrrp link, they should be independent from their gateways. different physical devices. so they don't have split brain when the fail over happens.

ok. good luck 👍🏻
 
Amm0
Forum Guru
Posts: 3169
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: VRRP and ISP Failover

Sat Apr 29, 2023 5:07 pm

I was suggesting that you can create a new VLAN network, call it VLAN 100, with 172.22.1.0/24. Same steps as creating any VLAN...
/interface/vlan with vlan-id=100, and /ip/address of 172.22.1.1/24 and 172.22.1.2/24 for your two routers.
Then for the "2nd ISP", use 172.22.1.2 as the default route gateway on the 1st router, and 172.22.1.1 on the 2nd router. The idea is that each router just routes to its brother. You'd also need tagging of the new VLAN 100 in the bridges and your switch. All the recursive routing stays the same, just using the 172.22.1.x as the gateway for the 8.8.4.4 (or whatever) route.
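A rough, untested sketch of those steps for router 1 (router 2 mirrors it with 172.22.1.2/24 locally and 172.22.1.1 as the gateway). The bridge name `bridge`, the VLAN interface name `vlan100-r2r` and the use of sfp-sfpplus1 as the tagged trunk port are assumptions about your config; 8.8.4.4 is just an example canary address that must not already be used by the PPPoE recursive routes, and the scope values simply follow the pattern of your existing routes.

```
# Router 1: router-to-router VLAN (names "bridge" and "vlan100-r2r" are assumptions).
/interface vlan add name=vlan100-r2r vlan-id=100 interface=bridge
/interface bridge vlan add bridge=bridge vlan-ids=100 tagged=bridge,sfp-sfpplus1
/ip address add address=172.22.1.1/24 interface=vlan100-r2r

# Recursive pair: the canary 8.8.4.4 is reached only via router 2 ...
/ip route add dst-address=8.8.4.4/32 gateway=172.22.1.2 scope=10 target-scope=11 comment="canary via router 2"
# ... and the backup default route stays active only while that canary answers pings.
/ip route add dst-address=0.0.0.0/0 gateway=8.8.4.4 check-gateway=ping distance=3 scope=10 target-scope=12 comment="ISP2 via Backup Router"
```

On the receiving router, the forward chain still has to accept traffic arriving on the new VLAN and going out via PPPoE, and the masquerade rule has to cover it; how exactly depends on your interface lists.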

For failure modes: If one router is physically off, VRRP gets the LANs to the working one, and the recursive route would kill the route to the 2nd router since 8.8.4.4 wouldn't be pingable. If ISP1 died on one of the routers (but the router is still working locally), e.g. the 8.8.8.8 primary recursive route fails its ping check, then it routes to the 2nd router.

With @wiseroute here...leave LTE out initially (although it could be 172.22.1.3 as a 2nd IP on VLAN 100 that you could also use in a recursive route). VRRP is totally the right approach for LANs, but the WAN recursive routes part is just tricky, and with two routers even more tricky. Not sure about dynamic routing protocols (RIP or OSPF) here: with only two routes and a lot going on already, I'm not sure they'd help. You still need a liveness check someplace, which unfortunately involves recursive routes to do it well today ;)
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Sat Apr 29, 2023 8:03 pm

@Ammo
Tnx a lot, I'll try in the next days and I'll let you know.
I hope I'll not have the same issue with the invalid packets being discarded.
Cheers
 
wiseroute
Member
Posts: 352
Joined: Sun Feb 05, 2023 11:06 am

Re: VRRP and ISP Failover

Sat Apr 29, 2023 8:09 pm

@ amm0 @ rikpal

very nice discussion 👍🏻
for failure modes: If one router is physically off, VRRP gets the LANs to the working one, and the recursive route would kill the route to the 2nd router since 8.8.4.4 wouldn't be pingable. If ISP1 died on one of the routers (but the router is still working locally), e.g. the 8.8.8.8 primary recursive route fails its ping check, then it routes to the 2nd router.
well, probably this vrrp discussion can't take a short answer 😂

ok. let us observe this example from the wiki.

https://wiki.mikrotik.com/wiki/File:Vrr ... haring.png

i think that is what @ rikpal has in mind.

now, from that picture, do you notice that shared media marked as LAN??

well, in real world - that shared media is supposed to be the real vrrp switches, which then attached to upper real routers. so, let's say many of us got confused in understanding how the vrrp works.

just don't think that vrrp switches are the real router. no. they still need the real gateway to the internet so that those vrrp routers don't have split brain, and the most important thing is that vrrp only works in physical hardware error.

question:
so, what is the point of having another router attached to the vrrp switch before going to the internet?

answer: the wan detection should shut down the interface connected to the vrrp switch when the wan fails. hence the vrrp switches sense that physical failure which then triggers to re-route to wan2.

but.. as for @ rikpal vlans.. i think maybe he got the idea 💡

hope this helps.
 
Amm0
Forum Guru
Posts: 3169
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: VRRP and ISP Failover

Sat Apr 29, 2023 11:26 pm

I'd listen to @wiseroute. He seems to have a better handle on what's going on here.

On the specific issue,
I hope I'll not have the same issue with the invalid packets being discarded.
It might be easy to try disabling the rp-filter, at least temporarily. That may be causing the invalid packets, but hard to know.
/ip settings set rp-filter=none

You can check the log checkbox in the firewall rule for drop invalid, and see what packets are "invalid".
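From the CLI, the same logging toggle might look like the sketch below (untested). It assumes your drop rule carries the comment "drop invalid", which is only a guess at your config; the log prefix "fwd-invalid" is an arbitrary label.

```
# Assumption: the forward-chain drop rule is commented "drop invalid"; adjust the find to match yours.
/ip firewall filter set [find comment="drop invalid"] log=yes log-prefix="fwd-invalid"

# Then watch which packets get classified as invalid.
/log print where message~"fwd-invalid"
```

The in-interface and out-interface shown in those log lines should tell you whether the dropped packets belong to the flows crossing between the two routers.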

Also, I'm not sure the recursive routes are set up right, but I didn't check the whole config.
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Sun Apr 30, 2023 1:05 am

With regard to the rp-filter, it was originally set to none, and I tried setting it to loose following the router setup from anav (see the link in my first post), which I also used for the recursive route definition.

If I remember correctly, I was getting the invalid packets in both cases, with rp-filter set to none and to loose.

I believe (but I could be wrong) that the invalid packets are due to the fact that when the route points to the real LAN gateway IP of the other router, the packets flow across the routers the wrong way (LAN side instead of WAN side). So when they are checked in the forward chain, since they do not belong to an established or related connection, they are classified as invalid and dropped. That's why I started this thread mentioning the sindy thread that was talking about accessing the other router from the wan side instead of the lan side.
 
Amm0
Forum Guru
Posts: 3169
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: VRRP and ISP Failover

Sun Apr 30, 2023 3:47 am

With regard to the rp-filter, it was originally set to none, and I tried setting it to loose following the router setup from anav (see the link in my first post), which I also used for the recursive route definition.
Yes, you want "loose". I thought it'd be a quick test to see if it had an effect.
That's why I started this thread mentioning the sindy thread that was talking about accessing the other router from the wan side instead of the lan side.
And that's why I suggested a new VLAN, e.g. a "router-to-router" one, as this lets you keep the PPPoE one as "WAN" in your firewall.
 
anav
Forum Guru
Posts: 18959
Joined: Sun Feb 18, 2018 11:28 pm
Location: Nova Scotia, Canada

Re: VRRP and ISP Failover

Mon May 01, 2023 4:21 am

What's missing is a detailed network diagram, it may help!!

( besides that I am no good at VRRP either, I can help with your failover and routes within a router only ) VRRP adds another dimension that I would hope is separate from each router's own individual failover setup between the WANs on the router. AKA it should be an isolated thing from VRRP takeover...........?????
 
Amm0
Forum Guru
Posts: 3169
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: VRRP and ISP Failover

Mon May 01, 2023 8:27 pm

Good point. I'm not sure VRRP has much to do with underlying issue here. I think the underlying issue is the recursive routing and firewall stuff isn't quite right.

@anav, the VRRP part is pretty easy in an "all VLAN" world...
- add a VRRP interface linked to a VLAN
- give it an /ip/address with /32 in same subnet as VLAN
- add the same interface-list as the associated VLAN
- repeat same steps above on the 2nd router, using same IP address of VRRP interface on 2nd.
- finally use same VRRP IP address in the VLAN's DHCP server network's gateway for the VLAN.
- If you get fancy... you can change the priority of the VRRP interface to control which router "wins", or use the new "sync connection tracking" so that on failover all the connections remain tracked (but this requires using the same upstream WAN, so it's not applicable to OP's case here).
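As an untested sketch of the steps above for one VLAN: the names `vlan10`, `vrrp-vlan10`, the list `LAN`, the 192.168.10.0/24 subnet and the virtual address 192.168.10.254 are all made-up placeholders, not taken from OP's config.

```
# Router 1 (higher priority, so it becomes master for this VLAN). All names/addresses are placeholders.
/interface vrrp add name=vrrp-vlan10 interface=vlan10 vrid=10 priority=200
/ip address add address=192.168.10.254/32 interface=vrrp-vlan10
/interface list member add list=LAN interface=vrrp-vlan10
/ip dhcp-server network set [find address="192.168.10.0/24"] gateway=192.168.10.254

# Router 2: same commands, same vrid and same /32 address, but a lower priority (e.g. priority=100).
```

Swapping the priorities per VLAN is what gives the split where each router is master for some VLANs and backup for the others.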
 
anav
Forum Guru
Posts: 18959
Joined: Sun Feb 18, 2018 11:28 pm
Location: Nova Scotia, Canada

Re: VRRP and ISP Failover

Mon May 01, 2023 8:33 pm

Nope, he is mixing up VRRP and failover then..........

First, he is stating the vlans are for VRRP; I don't see any regular vlans, only VRRP vlans and what he calls backup vlans, so that is really hosed in my opinion.
Second, and the worst part, is that he is trying to do VRRP in the recursive failover --- what a dog's breakfast.
So either VRRP is clean or it's a messy pile of doo doo to be avoided.

To me, it should be:
Admin has regular vlans on router1 and regular vlans on router2 (identical probably)
Then each router has one vlan for VRRP ( some sort of communication between the two routers )

Each router should have identical failover recursive routes for the two ISPs each router is dealing with - again probably identical setups.
There should not be failover to the other router (ONLY from one ISP to the other); that router turnover should be handled by VRRP........

In other words, if the router is not available then switch to other router, this turnover is independent of individual ISP connectivity and is only for router redundancy (assuming).
Last edited by anav on Mon May 01, 2023 9:58 pm, edited 1 time in total.
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Mon May 01, 2023 8:49 pm

What's missing is a detailed network diagram, it may help!!

( besides that I am no good at VRRP either, I can help with your failover and routes within a router only ) VRRP adds another dimension that I would hope is separate from each router's own individual failover setup between the WANs on the router. AKA it should be an isolated thing from VRRP takeover...........?????
Anav, thank you answering too.
I'm coming back after three days off and I'm going to test the Amm0 suggestion.

The network diagram is very simple at this stage since I'm in the process to configuring the 2 routers in vrrp and both of them are connected to a switch (CRS305) in trunk on the sfp+1 (router 1) and sfp+2 (router 2). The PCs to test that everything is working properly, are connected to the CRS305 to sfp+3 and 4. Once everything will work properly, I'll test it in the full network.
With regard to the routes, I followed completely your suggestions (see my first post routes and the link of your guide - but ... I believe, these are for multi wan connections managed by the same router ...).

Now, if we forget for a moment about the VRRP setup (that does not take care at all about ISP failover issues), I've simply 2 routers connected each one with its own ISP (ether 1 via pppoe - route to be followed are distance 1 & 2 and I'm using a 2 way recursive flat).
What I need to do is, in case ISP1 will fail, router 1 needs to connect with ISP2 (route distance 3). The same should be in case ISP2 will fail for router 2, that will try using ISP1.
Amm0's suggestion is to create a "VLAN for WAN routing", VLAN 100 (see post #18 of this discussion), that is independent from VRRP, and to use the 2 IP addresses created for this VLAN in the recursive routes to reach the far-end router: router 1's route towards router 2 will reference 172.22.1.2, and vice versa router 2 will use 172.22.1.1 to reach router 1. This is what I'm going to test tomorrow.
I believe this type of need I've should be related to something already addressed by others and it should be independent from VRRP and should work in any way.

As of today, I was trying to use (instead of VLAN 100) one of my four VLAN I created under VRRP (more precisely the mgmt vlan). So, I was experiencing the issue above described about invalid packets dropped by the firewall filter rule (chain=forward action=drop connection-state=invalid) and that's why I created two additional rules to accept them before the drop invalids (see my configs above posted for the two routers). I do not know exactly why these packets are dropped (I made some guess but I could be wrong). Maybe because I was using one VLAN under VRRP or for other reasons. Now with the Amm0's suggestion to use this new VLAN for WAN Routing I hope it will address any issue related to VRRP and it will work. But again, I think my need is independent from VRRP (that does not address ISP failover issues) and I need simply to manage routing btw two different routers each one connected to one ISP.

Finally, in addition to all the above (ISP1 and ISP2), each router has a second backup line on ether2, connected to the LAN side of a 4G/LTE modem. It has the highest distance in the routes, so it is the last chance for routers 1 & 2 to connect to the internet in case both ISPs are down. But this 4G/LTE connection is working OK and it is not an issue for me.

I hope this clarifies.

P.S.: I hope that I'm managing my ip firewall filter rules in the right way and that they are not creating the troubles I'm experiencing above... Here too I followed your suggestion to drop everything not expressly allowed. I initially had some difficulties segregating the VLANs, so I solved this in chain=input with one rule per VLAN that accepts packets whose src-address and dst-address both belong to that VLAN's subnet.
 
anav
Forum Guru
Posts: 18959
Joined: Sun Feb 18, 2018 11:28 pm
Location: Nova Scotia, Canada

Re: VRRP and ISP Failover

Mon May 01, 2023 10:03 pm

Nope completely mixed up for me.
A diagram will help immensely.
You keep stating a confusing picture.
What I need to do is, in case ISP1 will fail, router 1 needs to connect with ISP2 (route distance 3). The same should be in case ISP2 will fail for router 2, that will try using ISP1.

Are they the same ISPs? Why do you reverse them on the different routers......... makes no sense to me.
Are you trying to engineer a router replacement or NOT ?? It's starting to sound like a very fancy ISP failover mechanism

where if ISP1 goes down on the primary, you switch all traffic to the other router, which has ISP2 already up and running,
INSTEAD of changing to ISP2 on the first router.

So whatever concept you are trying to achieve, I dont get it.
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Tue May 02, 2023 12:05 am

@Anav,
Tomorrow morning before going to work I'll draft a diagram and better explain what I'm trying to do, hoping it will make things clearer for you.
Now it's quite late time here.
Tnx for your help.
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Tue May 02, 2023 8:40 am

@Anav,
attached is the network diagram, which should help to better understand what I'm trying to do.

NETWORK DIAGRAM 2 ROUTERS VRRP - ISP FAILOVER.pdf

1) The two routers are active together and VRRP protects both of them from hw failure. In case one of the 2 routers goes down, the other is able to take over all the VLANs and the devices connected to them in a transparent way. (https://wiki.mikrotik.com/wiki/Manual:V ... ad_sharing)
2) Each router is connected to its own ISP (ISP1 and ISP2). Of course, in case one of the routers will crash, the other will take over and only one of the ISP will be active.
3) What I would like to do is simply manage the ISPs failover, since VRRP does not take care about it. In order to do this, I followed this guide (viewtopic.php?t=182373). I setup dual wan recursive (using 2 recursive routes - flat) rules to manage the two ISP failover.
What I would like is that if ISP1 goes down, router 1 uses ISP2 via router 2 (distance 3), and vice versa. See below.

# router #01 routes definition
/ip route
add check-gateway=ping distance=1 dst-address=0.0.0.0/0 gateway=1.0.0.1 scope=10 target-scope=12
add distance=1 dst-address=1.0.0.1/32 gateway=pppoe-out scope=10 target-scope=11 comment="WAN1 ISP1 via PPPoE - ping host 1"
# +++++++++++++++++++
add check-gateway=ping distance=2 dst-address=0.0.0.0/0 gateway=9.9.9.9 scope=10 target-scope=12
add distance=2 dst-address=9.9.9.9/32 gateway=pppoe-out scope=10 target-scope=11 comment="WAN1 ISP1 via PPPoE - ping host 2"
# +++++++++++++++++++
add distance=3 dst-address=0.0.0.0/0 gateway=ROUTER2-GW-LAN-IP scope=3 target-scope=30 comment="ISP2 via Backup Router"
# +++++++++++++++++++
add disabled=no distance=4 dst-address=0.0.0.0/0 gateway=ether2 pref-src="" \
routing-table=main suppress-hw-offload=no comment="4G/LTE ISP via ether2"

# router #02 routes definitions
/ip route
add check-gateway=ping distance=1 dst-address=0.0.0.0/0 gateway=1.0.0.1 scope=10 target-scope=12
add distance=1 dst-address=1.0.0.1/32 gateway=pppoe-out scope=10 target-scope=11 comment="WAN1 ISP2 via PPPoE - ping host 1"
# +++++++++++++++++++
add check-gateway=ping distance=2 dst-address=0.0.0.0/0 gateway=9.9.9.9 scope=10 target-scope=12
add distance=2 dst-address=9.9.9.9/32 gateway=pppoe-out scope=10 target-scope=11 comment="WAN1 ISP2 via PPPoE - ping host 2"
# +++++++++++++++++++
add distance=3 dst-address=0.0.0.0/0 gateway=ROUTER1-GW-LAN-IP scope=3 target-scope=30 comment="ISP1 via Backup Router"
# +++++++++++++++++++
add disabled=no distance=4 dst-address=0.0.0.0/0 gateway=ether2 pref-src="" \
routing-table=main suppress-hw-offload=no comment="4G/LTE ISP via ether2"

So far I have not been able to get these routes fully working: as already described, the routed packets are dropped by the firewall (chain=forward action=drop connection-state=invalid).
Amm0's suggestion was to add a new VLAN dedicated to routing (as described in the previous posts) and to use the two VLAN IPs in the routes above.
I should be able to test Amm0's suggestion this evening, or tomorrow morning at the latest.

I hope this clarifies things.
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Tue May 02, 2023 1:53 pm

BTW, I also found this other discussion about VRRP and ISP failover, dated 2021.

link: viewtopic.php?t=180907

The principle is the same: two routers in VRRP load sharing, two ISPs (one per router), two subnets (one per router, but no VLANs) and ISP failover (handled separately from VRRP).

I do not know whether it helps or not.
 
anav
Forum Guru
Posts: 18959
Joined: Sun Feb 18, 2018 11:28 pm
Location: Nova Scotia, Canada

Re: VRRP and ISP Failover

Tue May 02, 2023 2:27 pm

Thanks for the diagram.
I do not understand the requirement.
Or, more precisely, why do you have two routers?
Why not host ISP1 and ISP2 on a single router?

Are you afraid of router failure? Or what is the driving factor here?
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Tue May 02, 2023 2:46 pm

To handle router failure. I have two RB5009s available and 2 ISP connections.
Currently I have a DrayTek Vigor 3910, and I recently had a failure (now solved) that blocked everything for some time. So I want to have 2 routers in HA, and VRRP should address this (it seems to be very fast and reliable).
VRRP is configured and switches over properly when one of the two routers goes down, but it does not handle internet failover at all.
 
anav
Forum Guru
Posts: 18959
Joined: Sun Feb 18, 2018 11:28 pm
Location: Nova Scotia, Canada

Re: VRRP and ISP Failover

Tue May 02, 2023 7:08 pm

Here is a question.
If ISP1 is for Router Master and ISP2 is for ROUTER slave, what happens when ISP1 goes down from a VRRP perspective?
In other words, does this constitute or mimic router failure and thus the Slave then becomes the Master?

Assuming the two routers are connected over ether2, let's say...
Both routers would still be up and running, except there would be no internet access on Router Master.
Is VRRP smart enough to recognize that Router Master has no working gateway and thus should no longer be the Master?
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Tue May 02, 2023 8:20 pm

Here is a question.
If ISP1 is for Router Master and ISP2 is for ROUTER slave, what happens when ISP1 goes down from a VRRP perspective?
In other words, does this constitute or mimic router failure and thus the Slave then becomes the Master?
I'll try to answer your question based on my understanding.
I applied VRRP only on the LAN side, not on the WAN side. I do not currently know how to do it on the WAN side, but if I had done so, one router would have a VRRP master interface (up and running) and the other a backup interface (idle, waiting to replace the master if needed). That is not what I want, since I would like to use both ISPs together and, if one is down, use the working one.
In the current configuration, when ISP1 (or ISP2) goes down, VRRP is unaffected.

Assuming the two routers are connected over ether2, let's say...
Both routers would still be up and running, except there would be no internet access on Router Master.
Is VRRP smart enough to recognize that Router Master has no working gateway and thus should no longer be the Master?
Again, VRRP is not on the WAN side.
What do you mean by both being connected over ether2? Directly connected? Are you thinking of a single WAN that both routers use to reach the internet? (PLEASE SEE THE NOTE AT THE END OF THIS RESPONSE.)

Anyway, I just applied Amm0's suggestion and the situation is the same as before (when the routes used one of the VRRP'ed VLAN interfaces).
To explain better what I did to test the alternative route via the far-end router:
1) I removed the cable for ISP2, forcing router 2 to look for connectivity via 172.22.1.1 (on router 1); the route is shown in blue.
2) VRRP does not detect any hardware failure and does not change anything.
3) The PC connected to router 2 no longer has connectivity (ISP2 is down, but it is not getting connectivity from ISP1).
4) The log on router 2 shows that the chain=forward drop-invalid rule was dropping all the packets with in:VRRP out:VLAN 100. To avoid this, I added a chain=forward rule accepting this traffic before the drop-invalid rule, as follows:

/ip firewall filter
...
add action=fasttrack-connection chain=forward comment="defconf: fasttrack" \
connection-state=established,related hw-offload=yes
add action=accept chain=forward comment=\
"defconf: accept established,related, untracked" connection-state=\
established,related,untracked
add action=accept chain=forward comment="" \
in-interface-list=VRRP out-interface=VLAN100
add action=drop chain=forward comment="defconf: drop invalid" \
connection-state=invalid log=yes log-prefix="*** invalid ***"
...

5) The log on router 1 shows that the chain=forward drop-invalid rule is dropping all the packets with in:VLAN100 out:pppoe-out. To avoid this, I added a chain=forward rule accepting VLAN100 traffic before the drop-invalid rule, as follows:

/ip firewall filter
...
add action=fasttrack-connection chain=forward comment="defconf: fasttrack" \
connection-state=established,related hw-offload=yes
add action=accept chain=forward comment=\
"defconf: accept established,related, untracked" connection-state=\
established,related,untracked
add action=accept chain=forward comment="allow internet traffic for VLAN 100 \
for WAN Routing for ISP failover." in-interface=WAN_ROUTING_VLAN out-interface-list=WAN
add action=drop chain=forward comment="defconf: drop invalid" \
connection-state=invalid log=yes log-prefix="*** invalid ***"
...

With these rules in place, ISP failover works and I have internet access.
When I reconnect the cable for ISP2, everything goes back to normal.

I really do not know why these packets are dropped as invalid.

NOTE: I'VE SEEN THAT IN THE MIKROTIK EXAMPLE OF VRRP LOAD SHARING, THE ISP CONNECTION(S) ARE REPRESENTED AS A WAN TO WHICH BOTH ROUTERS CONNECT (https://wiki.mikrotik.com/wiki/Manual:V ... ad_sharing), BUT I DO NOT KNOW WHAT THAT MEANS.
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Thu May 04, 2023 5:23 pm

@Amm0, @Anav,

I'd like to summarize the status of this issue as I understand it (I spent the last day doing some more tests):

1) VRRP (implemented on the VLANs) seems to work fine, no problems (if one router is off, the other takes over everything, and once the first is back both work in load sharing).
2) For ISP failover, I defined the routes (summarized below for both routers) and, as suggested by @Amm0, implemented a VLAN 100 for ISP failover routing (listed below).
3) I then tested ISP failover (removing the ISP1 cable from ether1 on router 1 and checking that traffic is routed to ISP2, and vice versa). The only way to make it work is to add 2 separate firewall filter rules in chain=forward, before the drop-invalid rule, accepting packets with in-interface=VLAN100 and with out-interface=VLAN100 respectively, on each router (the 2 rules are posted below for clarity). With them, routing for ISP failover seems to work fine. But I do not know why packets going from/to the VLAN100 interfaces are considered invalid.

So I would like to understand whether this situation is normal and acceptable. In other words: IS IT OK TO HAVE THE ISP FAILOVER ROUTING RELY ON THESE 2 ACCEPT RULES (IS THERE AN EXPLANATION FOR THEM), OR ARE THEY A DIRTY PATCH THAT WORKS JUST BY COINCIDENCE?
I cannot understand why it works this way, or whether it is a proper implementation.

A few points that may help to understand better, or pre-empt some questions:
A) The firewall filter rules should be fine (I previously posted the configs for both routers in this thread): as a test I removed all the chain=input rules and nothing changed. The invalid packets are still dropped. To me this means the chain=input rules are not causing the problem (with or without them it's the same). In chain=forward, the rules are: fasttrack, accept established/related, drop invalid, then all the rest. If I insert the 2 rules mentioned in 3) above before drop invalid, the VLAN100 packets are not dropped and routing to the alternate ISP works.
B) For the IP routes, I also tried the suggestions from @Anav in another thread (viewtopic.php?t=188388), using an additional routing table as explained there (/routing table add name=useWAN2 fib), but nothing changed and the VLAN100 packets are dropped anyway.

------------------------------------------------------------------------------

WAN ROUTING VLAN100 DEFINITIONS

# ROUTER #01
/interface vlan
add interface=bridge name=WAN_ROUTING_VLAN vlan-id=100 comment="WAN Routing VLAN 100"
/interface bridge vlan
add bridge=bridge comment="WAN Routing VLAN 100" tagged=bridge,sfp-sfpplus1 \
untagged=ether6 vlan-ids=100
/interface bridge port
add bridge=bridge comment=defconf frame-types=\
admit-only-untagged-and-priority-tagged interface=ether6 pvid=100
/ip address
add address=172.22.1.1/24 comment="WAN Routing VLAN Gateway #01" interface=\
WAN_ROUTING_VLAN network=172.22.1.0
# this rule should not be relevant: as explained in point A above, even if I remove the entire chain=input, the VLAN100 packets are still dropped in chain=forward
/ip firewall filter
add action=accept chain=input comment=\
"allow WAN Routing VLAN 100 only (inter-vlan is blocked)" dst-address=172.22.1.0/24 \
src-address=172.22.1.0/24
add action=accept chain=forward comment="allow internet traffic for VLAN 100 \
for WAN Routing for ISP failover." in-interface=WAN_ROUTING_VLAN out-interface-list=WAN

# ROUTER #02
/interface vlan
add interface=bridge name=WAN_ROUTING_VLAN vlan-id=100 comment="WAN Routing VLAN 100"
/interface bridge vlan
add bridge=bridge comment="WAN Routing VLAN 100" tagged=bridge,sfp-sfpplus1 \
untagged=ether6 vlan-ids=100
/interface bridge port
add bridge=bridge comment=defconf frame-types=\
admit-only-untagged-and-priority-tagged interface=ether6 pvid=100
/ip address
add address=172.22.1.2/24 comment="WAN Routing VLAN Gateway #02" interface=\
WAN_ROUTING_VLAN network=172.22.1.0
# this rule should not be relevant: as explained in point A above, even if I remove the entire chain=input, the VLAN100 packets are still dropped in chain=forward
/ip firewall filter
add action=accept chain=input comment=\
"allow WAN Routing VLAN 100 only (inter-vlan is blocked)" dst-address=172.22.1.0/24 \
src-address=172.22.1.0/24
add action=accept chain=forward comment="allow internet traffic for VLAN 100 \
for WAN Routing for ISP failover." in-interface=WAN_ROUTING_VLAN out-interface-list=WAN

------------------------------------------------------------------------------

IP ROUTES DEFINITIONS:

# router #01 routes definition
/ip route
add check-gateway=ping distance=1 dst-address=0.0.0.0/0 gateway=1.0.0.1 scope=10 target-scope=12
add distance=1 dst-address=1.0.0.1/32 gateway=pppoe-out scope=10 target-scope=11 comment="WAN1 ISP1 via PPPoE - ping host 1"
# +++++++++++++++++++
add check-gateway=ping distance=2 dst-address=0.0.0.0/0 gateway=9.9.9.9 scope=10 target-scope=12
add distance=2 dst-address=9.9.9.9/32 gateway=pppoe-out scope=10 target-scope=11 comment="WAN1 ISP1 via PPPoE - ping host 2"
# +++++++++++++++++++
add distance=3 dst-address=0.0.0.0/0 gateway=172.22.1.2 scope=3 target-scope=30 comment="ISP2 via Backup Router"
# +++++++++++++++++++
add disabled=no distance=4 dst-address=0.0.0.0/0 gateway=ether2 pref-src="" \
routing-table=main suppress-hw-offload=no comment="4G/LTE ISP via ether2"

# router #02 routes definitions
/ip route
add check-gateway=ping distance=1 dst-address=0.0.0.0/0 gateway=1.0.0.1 scope=10 target-scope=12
add distance=1 dst-address=1.0.0.1/32 gateway=pppoe-out scope=10 target-scope=11 comment="WAN1 ISP2 via PPPoE - ping host 1"
# +++++++++++++++++++
add check-gateway=ping distance=2 dst-address=0.0.0.0/0 gateway=9.9.9.9 scope=10 target-scope=12
add distance=2 dst-address=9.9.9.9/32 gateway=pppoe-out scope=10 target-scope=11 comment="WAN1 ISP2 via PPPoE - ping host 2"
# +++++++++++++++++++
add distance=3 dst-address=0.0.0.0/0 gateway=172.22.1.1 scope=3 target-scope=30 comment="ISP1 via Backup Router"
# +++++++++++++++++++
add disabled=no distance=4 dst-address=0.0.0.0/0 gateway=ether2 pref-src="" \
routing-table=main suppress-hw-offload=no comment="4G/LTE ISP via ether2"

------------------------------------------------------------------------------

THE TWO IP FIREWALL FILTER RULES THAT PREVENT VLAN100 PACKETS FROM BEING DROPPED IN CHAIN=FORWARD

/ip firewall filter
add action=accept chain=forward comment="need this rule to manage the ISP fail\
over on the other VRRP router, otherwise these packets will be discarded a\
s invalid by the next rule." out-interface=WAN_ROUTING_VLAN

add action=accept chain=forward comment="need this rule to manage the ISP fail\
over on the other VRRP router, otherwise these packets will be discarded a\
s invalid by the next rule." in-interface=WAN_ROUTING_VLAN

add action=drop chain=forward comment="defconf: drop invalid" \
connection-state=invalid log=yes log-prefix="*** invalid ***"

Thanks in advance for your patience and help.
 
Amm0
Forum Guru
Posts: 3169
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: VRRP and ISP Failover

Thu May 04, 2023 5:46 pm

I think the accept rules are okay for "router-to-router" traffic. At this point the first router's firewall has already decided it's WAN traffic for the default route, and thus NOT the inter-VLAN client traffic you're trying to block. Really no difference in security, IMO. The only case where you'd run into trouble is if both routers didn't have identical routing tables and firewall rules, but your LAN VRRP scheme ensures that can't happen (otherwise VRRP wouldn't work ;))

You keep mentioning "invalid packets", but I'm just not sure what you mean specifically. You can add some firewall rules with a "log" action that match invalid packets, to see the src and dst IP and understand what exactly is getting flagged as "invalid".
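
As a minimal sketch of such a logging rule (the log prefix and comment are placeholders, and it assumes the rule ends up above the existing "defconf: drop invalid" rule, e.g. by adding it with place-before= or by moving it afterwards):

/ip firewall filter
# log-only rule: action=log does not terminate processing, so matching packets continue to the following rules
add action=log chain=forward connection-state=invalid log-prefix="fwd-invalid" \
comment="hypothetical: log invalid forward packets before they are dropped"

The resulting log entries show the in/out interface, MAC and src/dst address of each packet flagged as invalid.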
 
Amm0
Forum Guru
Posts: 3169
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: VRRP and ISP Failover

Thu May 04, 2023 5:50 pm

You asked about WANs and VRRP: that's more useful when you have a single upstream connection and you want either router to be able to use it (active/passive on WAN). In your case you have two independent connections, so recursive routes are for sure what you need on the WAN, not VRRP.
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Thu May 04, 2023 6:26 pm

Thank you Amm0,
I call them invalid packets because their connection-state is invalid and they are dropped in chain=forward, unless I accept them with these 2 rules on both routers before they reach the drop-invalid rule:

/ip firewall filter
add action=accept chain=forward comment="need this rule to manage the ISP fail\
over on the other VRRP router, otherwise these packets will be discarded a\
s invalid by the next rule." out-interface=WAN_ROUTING_VLAN

add action=accept chain=forward comment="need this rule to manage the ISP fail\
over on the other VRRP router, otherwise these packets will be discarded a\
s invalid by the next rule." in-interface=WAN_ROUTING_VLAN

add action=drop chain=forward comment="defconf: drop invalid" \
connection-state=invalid log=yes log-prefix="*** invalid ***"

As you can see above, the rule that drops invalid packets already has log=yes and log-prefix="*** invalid ***".
What I've seen is that if ISP2 goes down, router 2 routes its traffic towards ISP1.
In the router 1 log (router 1 is then providing ISP1 as the failover connection for router 2) I see invalid forward packets with in-if=WAN_ROUTING_VLAN out-if=pppoe-out, where the MAC address is that of the VLAN100 interface on router 1.
In the router 2 log I see invalid forward packets with in-if=VRRP (the VLAN my laptop is connected to) out-if=WAN_ROUTING_VLAN, where the MAC address is my laptop's.

Is it normal that this is solved by accepting packets that would normally be dropped as invalid, before they reach the drop rule?

tnx.
Last edited by rikpal on Thu May 04, 2023 6:35 pm, edited 1 time in total.
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Thu May 04, 2023 6:28 pm

You asked about WANs and VRRP: that's more useful when you have a single upstream connection and you want either router to be able to use it (active/passive on WAN). In your case you have two independent connections, so recursive routes are for sure what you need on the WAN, not VRRP.
Yes, I agree that this is the real question: recursive routes between two different routers with independent connections.
Nevertheless, I need VRRP for HW redundancy.
 
Amm0
Forum Guru
Posts: 3169
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: VRRP and ISP Failover

Thu May 04, 2023 6:33 pm

Basically I'm saying you got this right for what you're trying to do: VRRP for LAN, recursive routes for WANs. You need VLAN100 because the recursive routes need a "connected route" to the other router, not the second ISP's far-end IP.
 
Amm0
Forum Guru
Posts: 3169
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: VRRP and ISP Failover

Thu May 04, 2023 6:42 pm

Nevertheless, I need VRRP for HW redundancy.
Or if you like that a router reboot/upgrade/etc. causes no service outage. In all honesty, VRRP has never kicked in because of an actual hardware failure (sure, VRRP works, it just doesn't happen)... now frequent RouterOS v7 updates, that's a different story.
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Thu May 04, 2023 6:50 pm

@Amm0, does this clarify how / why this is happening?
So, is this the right way to manage this?
Or is there another way that avoids this patch of accepting the VLAN100 routing traffic between the two routers?
tnx.
 
Amm0
Forum Guru
Posts: 3169
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: VRRP and ISP Failover

Thu May 04, 2023 7:03 pm

Ahh, it's just a few invalid packets at failover... I think that happens while connection tracking is figuring out the failover.

There is the newer "Sync Connection Tracking" option that keeps the firewall connection-tracking state in sync on both routers. But you can't use it with your "dual active-active WAN": since the WAN IP changes, you do NOT want the connection tracking to be the same. So as TCP etc. sessions figure out the new router, you'll get a few invalid packets.

If these invalid packets persist long after the failover, that's a different story...
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Thu May 04, 2023 7:26 pm

Ahh, it's just a few invalid packets at failover... I think that happens while connection tracking is figuring out the failover.

There is the newer "Sync Connection Tracking" option that keeps the firewall connection-tracking state in sync on both routers. But you can't use it with your "dual active-active WAN": since the WAN IP changes, you do NOT want the connection tracking to be the same. So as TCP etc. sessions figure out the new router, you'll get a few invalid packets.

If these invalid packets persist long after the failover, that's a different story...
Could you please elaborate?
Anyway, I'm not using the VRRP "Sync Connection Tracking" feature. I do not need it, and it would be an over-complication, since I have 2 VLANs on router 1 and 2 VLANs on router 2 and I do not need to keep track of the connections.
Just for your information, I have static IPs on both lines, so the 2 WAN IPs never change.
 
Amm0
Forum Guru
Posts: 3169
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: VRRP and ISP Failover

Thu May 04, 2023 7:54 pm

[removed by accident - meant to edit most recent post]
Last edited by Amm0 on Sat May 06, 2023 5:12 am, edited 2 times in total.
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Thu May 04, 2023 8:20 pm

Are you talking about this setting?

/ip/firewall/connection/tracking/set enabled=auto

For me it is set to auto.
I believe it is not related to VRRP.
I do not know exactly what it means or what its purpose is (really sorry for my ignorance).
Should I turn it off, or should I just ignore it?
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Thu May 04, 2023 8:25 pm

Both routers need to have one common public IP for VRRP on WAN with sync connections - you have two IP - which is why this cannot work.
Do you mean that, from a VRRP standpoint, it works when both routers are connected to the same WAN IP? (https://wiki.mikrotik.com/wiki/File:Vrr ... haring.png)
So is my configuration, where each router has its own WAN IP, wrong and not supported?
I saw this in the thread I mentioned earlier, where Sindy proposes a src-nat solution in post #11: viewtopic.php?t=172532
 
Amm0
Forum Guru
Posts: 3169
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: VRRP and ISP Failover

Thu May 04, 2023 10:10 pm

There is no single "right way". I was just explaining why you cannot use VRRP on the WAN.

What is not working now? If it's just a few invalid packets that you see when failing over, there is no issue. That's expected, since new connections have to be formed after the routers swap, as the outbound public IP changes.
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Fri May 05, 2023 9:47 am

Thank you Amm0 for your help.
I'll work a little more over the weekend to clean up and fine-tune the rules.
But this should be the 'right' solution to manage 2 routers in VRRP and 2 ISPs with failover.

One more thing, for the /ip/dhcp-server, I normally have them listen on the VRRP interface. But this is just a preference, since I like seeing the leases for a VRRP'd VLAN on one router (e.g. the VRRP master). In theory, it doesn't matter who provides the DHCP address for a VRRP VLAN, but it does get annoying to trace a lease down the road, since you have to look in two places...
One more thing: I understand from your suggestion that it is better to move the DHCP server interface from the VLAN to the VRRP interface. If I understand correctly, only 4 DHCP servers will then be active at the same time, namely the ones on the master interfaces, and only those will hand out DHCP addresses to clients. Am I right?
Tnx.
 
anav
Forum Guru
Posts: 18959
Joined: Sun Feb 18, 2018 11:28 pm
Location: Nova Scotia, Canada

Re: VRRP and ISP Failover

Fri May 05, 2023 6:23 pm

When it's all working, I would love to see the config, so that I can understand what was done (including a JPEG of the routes, edited for security of course)!
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Fri May 05, 2023 8:25 pm

Sure
 
Amm0
Forum Guru
Posts: 3169
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: VRRP and ISP Failover

Sat May 06, 2023 4:34 am

If I understand correctly, only 4 DHCP servers will then be active at the same time, namely the ones on the master interfaces, and only those will hand out DHCP addresses to clients. Am I right?
Correct. You still have 8 DHCP servers, but the 4 on the slave VRRP VLANs will be red/inactive until the slave becomes the master. If a VRRP switch happens, the clients keep the lease from the old router until it expires, but upon renewal the lease will "move". Since clients request their old address again, it should be seamless.

In your current setup, with all DHCP servers listening on the VLAN interface (instead of the VRRP interface), you'll have 2 active DHCP servers per VLAN, and both will reply. So it will be semi-random which one "wins", and the client list will be spread across both routers. That's why listening on the VRRP interface makes sure exactly one DHCP server is active per VLAN.
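
A rough sketch of that change, assuming a VRRP interface named vrrp-vlan10 running on top of a VLAN interface named vlan10 (both names are placeholders, not taken from this thread):

# hypothetical names: vlan10 is the VLAN interface, vrrp-vlan10 the VRRP interface on top of it
/ip dhcp-server
set [find interface=vlan10] interface=vrrp-vlan10

This would be repeated per VLAN on both routers, so that the server on the router that is currently VRRP backup for a VLAN stays inactive until that router becomes master.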
 
Amm0
Forum Guru
Posts: 3169
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: VRRP and ISP Failover

Sat May 06, 2023 5:13 am

FWIW, this older thread has more background/commentary on VRRP:
viewtopic.php?p=972296&hilit=vrrp#p855509

There is a discussion of "dual WAN" where @raimondsp states the same about VRRP+WAN...
Of course, it does not help if you have two different ISP (e.g., ethernet + 4G). But with a single WAN connection and two routers, that is a way to do.
That's why using recursive routes on the WAN makes sense. And the VLAN's VRRP master selection controls which routing table is used... so if you want a VLAN to prefer WAN1 or WAN2, you set the VRRP priority so that the master is the router with the desired primary WAN.

It's worth doing a diagram, since your setup solves the limitation of dual-WAN VRRP (e.g. VRRP cannot tell you that the "internet is up", and that's what your recursive routes fix).
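
A minimal sketch of steering a VLAN towards one WAN through VRRP priority, assuming VLAN interfaces vlan10 and vlan30 and VRIDs 10 and 30 (all names and numbers here are placeholders):

# on router 1 (ISP1): preferred master for vlan10, backup for vlan30
/interface vrrp
add name=vrrp-vlan10 interface=vlan10 vrid=10 priority=200
add name=vrrp-vlan30 interface=vlan30 vrid=30 priority=100

# on router 2 (ISP2): the priorities are mirrored
/interface vrrp
add name=vrrp-vlan10 interface=vlan10 vrid=10 priority=100
add name=vrrp-vlan30 interface=vlan30 vrid=30 priority=200

The higher priority wins the master election, so clients on vlan10 would normally leave via router 1 and ISP1, and clients on vlan30 via router 2 and ISP2.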
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Sat May 06, 2023 9:21 am

@Amm0,
Let me finalize and test the config, then I'll post the diagram (I'll update the one I already sent at @anav's request) and the key snippets of the config, as requested by @anav.
Thanks, I'll keep you posted.
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Sat May 06, 2023 2:11 pm

@anav and @Amm0,
I did some fine-tuning and tests, and ISP1/ISP2 failover on the 2 VRRP routers seems to work properly using the recursive routes (plus, on each router, just one accept rule in the forward chain before the drop-invalid rule, with in-interface-list=VRRP and out-interface=VLAN100; nothing more is needed).

I also moved the DHCP servers from the VLAN interfaces to the VRRP interfaces and it works fine. Only 4 DHCP servers are active, and when I power off one of the routers the other takes over everything in a couple of seconds.

Then I moved on to testing the ether2 connection (DHCP client towards the 4G/LTE modem) and, to my surprise, this connection takes precedence over the ISP one (I tested on both routers) even though its distance is the highest. It worked fine in the past, when I tested it without recursive routes for checking the ISP: I just specified distance=1 for the ISP and distance=2 for ether2, and everything was fine. Now, with the recursive routes, ether2 takes precedence whenever it has a connection. Why? What am I doing wrong?

My goal is for ether2 to stay idle and take over only when both ISPs are down, not before.

Here are my recursive routes:

IP ROUTES DEFINITIONS:

# router #01 routes definition
/ip route
add check-gateway=ping distance=1 dst-address=0.0.0.0/0 gateway=1.0.0.1 scope=10 target-scope=12
add distance=1 dst-address=1.0.0.1/32 gateway=pppoe-out scope=10 target-scope=11 comment="WAN1 ISP1 via PPPoE - ping host 1"
# +++++++++++++++++++
add check-gateway=ping distance=2 dst-address=0.0.0.0/0 gateway=9.9.9.9 scope=10 target-scope=12
add distance=2 dst-address=9.9.9.9/32 gateway=pppoe-out scope=10 target-scope=11 comment="WAN1 ISP1 via PPPoE - ping host 2"
# +++++++++++++++++++
add distance=3 dst-address=0.0.0.0/0 gateway=172.22.1.2 scope=10 target-scope=30 comment="ISP2 via Backup Router"
# +++++++++++++++++++
add disabled=no distance=4 dst-address=0.0.0.0/0 gateway=ether2 pref-src="" \
routing-table=main suppress-hw-offload=no comment="4G/LTE ISP via ether2"

# router #02 routes definitions
/ip route
add check-gateway=ping distance=1 dst-address=0.0.0.0/0 gateway=1.0.0.1 scope=10 target-scope=12
add distance=1 dst-address=1.0.0.1/32 gateway=pppoe-out scope=10 target-scope=11 comment="WAN1 ISP2 via PPPoE - ping host 1"
# +++++++++++++++++++
add check-gateway=ping distance=2 dst-address=0.0.0.0/0 gateway=9.9.9.9 scope=10 target-scope=12
add distance=2 dst-address=9.9.9.9/32 gateway=pppoe-out scope=10 target-scope=11 comment="WAN1 ISP2 via PPPoE - ping host 2"
# +++++++++++++++++++
add distance=3 dst-address=0.0.0.0/0 gateway=172.22.1.1 scope=10 target-scope=30 comment="ISP1 via Backup Router"
# +++++++++++++++++++
add disabled=no distance=4 dst-address=0.0.0.0/0 gateway=ether2 pref-src="" \
routing-table=main suppress-hw-offload=no comment="4G/LTE ISP via ether2"

Thank you.
 
anav
Forum Guru
Posts: 18959
Joined: Sun Feb 18, 2018 11:28 pm
Location: Nova Scotia, Canada

Re: VRRP and ISP Failover

Sat May 06, 2023 7:45 pm

For a bottle of red, anything is possible...
I never like to paint myself into a corner, so I always space my route distances to leave room to insert routes before or after any existing route.
The only thing I can think of is to add check-gateway=ping to the first backup route. Not that it should prevent reaching the third ISP...

More than likely, specifying the interface ether2 as the gateway for your LTE connection is wrong.

# router #01 routes definition
/ip route
add check-gateway=ping distance=5 dst-address=0.0.0.0/0 gateway=1.0.0.1 scope=10 target-scope=12
add distance=5 dst-address=1.0.0.1/32 gateway=pppoe-out scope=10 target-scope=11 comment="WAN1 ISP1 via PPPoE - ping host 1"
# +++++++++++++++++++
add check-gateway=ping distance=10 dst-address=0.0.0.0/0 gateway=9.9.9.9 scope=10 target-scope=12
add distance=10 dst-address=9.9.9.9/32 gateway=pppoe-out scope=10 target-scope=11 comment="WAN1 ISP1 via PPPoE - ping host 2"
# +++++++++++++++++++
add check-gateway=ping distance=15 dst-address=0.0.0.0/0 gateway=172.22.1.2 scope=10 target-scope=30 comment="ISP2 via Backup Router"
# +++++++++++++++++++
add disabled=no distance=20 dst-address=0.0.0.0/0 gateway=ether2 pref-src="" \
routing-table=main suppress-hw-offload=no comment="4G/LTE ISP via ether2"

# router #02 routes definitions
/ip route
add check-gateway=ping distance=5 dst-address=0.0.0.0/0 gateway=1.0.0.1 scope=10 target-scope=12
add distance=5 dst-address=1.0.0.1/32 gateway=pppoe-out scope=10 target-scope=11 comment="WAN1 ISP2 via PPPoE - ping host 1"
# +++++++++++++++++++
add check-gateway=ping distance=10 dst-address=0.0.0.0/0 gateway=9.9.9.9 scope=10 target-scope=12
add distance=10 dst-address=9.9.9.9/32 gateway=pppoe-out scope=10 target-scope=11 comment="WAN1 ISP2 via PPPoE - ping host 2"
# +++++++++++++++++++
add check-gateway=ping distance=15 dst-address=0.0.0.0/0 gateway=172.22.1.1 scope=10 target-scope=30 comment="ISP1 via Backup Router"
# +++++++++++++++++++
add disabled=no distance=20 dst-address=0.0.0.0/0 gateway=ether2 pref-src="" \
routing-table=main suppress-hw-offload=no comment="4G/LTE ISP via ether2"
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Sat May 06, 2023 8:13 pm

Tnx @anav, I'd love a bottle of good Italian red, but we are probably quite far apart ;-).
Btw, I'll follow your suggestions.
LTE/4G uses ether2 because ether2 runs a DHCP client towards the router that manages the 4G connection; that's why.
Tnx
 
anav
Forum Guru
Posts: 18959
Joined: Sun Feb 18, 2018 11:28 pm
Location: Nova Scotia, Canada

Re: VRRP and ISP Failover

Sat May 06, 2023 8:33 pm

I am not convinced LOL.
For example, if you disable all routes except the one to LTE, do you get internet?

Since we have no information on this connection (no PPPoE information, etc.), I cannot visualize what the gateway IP address might be.
Does your LTE interface have an assigned name?
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Sat May 06, 2023 8:45 pm

anav wrote:
> I am not convinced LOL.
> For example, if you disable all routes except the one to LTE, do you get internet?
>
> Since we have no information on this connection (no PPPoE information, etc.), I cannot visualize what the gateway IP address might be.
> Does your LTE interface have an assigned name?

OK, let me explain better.
I have a 4G TP-Link modem/router with a SIM inside.
On its LAN side there is a DHCP server on 192.168.3.1/24. So the TP-Link gateway is 192.168.3.1, and it is connected to ether2 of the MikroTik RB5009. ether2 on the MikroTik has a DHCP client and gets assigned the address 192.168.3.2. ether2 is not part of the bridge and it is assigned to the WAN list, like the other (PPPoE on ether1) interface. I did not change the ether2 name.
Does this clarify?
I'm going to try your suggestions right now.
 
Amm0
Forum Guru
Posts: 3169
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: VRRP and ISP Failover

Sat May 06, 2023 9:03 pm

rikpal wrote:
> I have a 4G TP-Link modem/router with a SIM inside.
> On its LAN side there is a DHCP server on 192.168.3.1/24. So the TP-Link gateway is 192.168.3.1, and it is connected to ether2 of the MikroTik RB5009. ether2 on the MikroTik has a DHCP client and gets assigned the address 192.168.3.2. ether2 is not part of the bridge and it is assigned to the WAN list, like the other (PPPoE on ether1) interface. I did not change the ether2 name.

I wouldn't use the DHCP client, i.e. disable it. Add 192.168.3.2/24 as the IP address on the MikroTik for ether2, and keep using your existing static route. You already have a route for LTE, just include a higher target-scope*:
/ip route add disabled=no distance=20 dst-address=0.0.0.0/0 gateway=ether2 scope=10 target-scope=35

The issue is that the DHCP client is inserting a route with a lower scope/target-scope than your recursive routes, so it becomes primary. In theory, you could lower the scopes of the existing recursive routes so they're lower than what the DHCP client uses.

*I think these are right, but if not, adjust target-scope.
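
To check what the DHCP client is actually installing, the dynamic default route can be listed next to the static ones. A minimal sketch using standard RouterOS print commands, assuming the client runs on ether2 as described above:

```
# show the DHCP client settings, including add-default-route and default-route-distance
/ip dhcp-client print detail where interface=ether2
# list every default route with its distance and scope; dynamic routes carry the D flag
/ip route print detail where dst-address=0.0.0.0/0
```

Comparing the distance of the dynamic route with the distances of the recursive routes should show which default route the router prefers.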
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Sat May 06, 2023 9:12 pm

Amm0 wrote:
> rikpal wrote:
>> I have a 4G TP-Link modem/router with a SIM inside.
>> On its LAN side there is a DHCP server on 192.168.3.1/24. So the TP-Link gateway is 192.168.3.1, and it is connected to ether2 of the MikroTik RB5009. ether2 on the MikroTik has a DHCP client and gets assigned the address 192.168.3.2. ether2 is not part of the bridge and it is assigned to the WAN list, like the other (PPPoE on ether1) interface. I did not change the ether2 name.
>
> I wouldn't use the DHCP client, i.e. disable it. Add 192.168.3.2/24 as the IP address on the MikroTik for ether2, and keep using your existing static route. You already have a route for LTE, just include a higher target-scope*:
> /ip route add disabled=no distance=20 dst-address=0.0.0.0/0 gateway=ether2 scope=10 target-scope=35
>
> The issue is that the DHCP client is inserting a route with a lower scope/target-scope than your recursive routes, so it becomes primary. In theory, you could lower the scopes of the existing recursive routes so they're lower than what the DHCP client uses.
>
> *I think these are right, but if not, adjust target-scope.

@Amm0, see my response: looking at the interface, I found that the DHCP client had distance=1 specified (I never knew this) and I changed it to 15.
 
anav
Forum Guru
Posts: 18959
Joined: Sun Feb 18, 2018 11:28 pm
Location: Nova Scotia, Canada

Re: VRRP and ISP Failover

Sat May 06, 2023 9:12 pm

AMMO is on to something, maybe.
Don't worry about scope shit, it's too far in the weeds and not required.


If in the DHCP client you didn't select add-default-route (which you shouldn't, as you are trying to handle everything with manual routes),
then the error is fixed with
add distance=20 dst-address=0.0.0.0/0 gateway=192.168.3.1 routing-table=main

instead of this one above
add disabled=no distance=20 dst-address=0.0.0.0/0 gateway=ether2 pref-src="" \
routing-table=main suppress-hw-offload=no comment="4G/LTE ISP via ether2"

If you don't use the DHCP client, then you need to add the address (one or the other, not both):

/ip address
add address=192.168.3.2/24 interface=ether2 network=192.168.3.0
 
Amm0
Forum Guru
Posts: 3169
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: VRRP and ISP Failover

Sat May 06, 2023 9:18 pm

anav wrote:
> AMMO is on to something maybe,
> Dont worry about scope shit, its too in the weeds, and not required.

Likely true.

@anav is right, the core problem is that the DHCP client is likely adding a default route with a lower distance=1. If you uncheck "add-default-route", then your existing static route should work (as noted; if not, it's the scopes, someplace).
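
The two ways out discussed here can be sketched on the command line as follows. This assumes the DHCP client entry is the one on ether2; the distance value 20 is only an example that sits behind the other default routes in this thread:

```
# option 1: stop the DHCP client from adding its own default route
# (the static route via the modem's gateway address is then the only LTE route)
/ip dhcp-client set [find interface=ether2] add-default-route=no

# option 2: keep the dynamic default route but push it behind the other WANs
/ip dhcp-client set [find interface=ether2] default-route-distance=20
```

Only one of the two is needed. With option 1 the static LTE route has to point at the modem's gateway address, as anav describes above.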
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Sun May 07, 2023 3:36 pm

@Amm0, @Anav,

Here are the routes for router 1 (I'm not posting router 2's as well, since they are similar):

/ip route
add check-gateway=ping distance=1 dst-address=0.0.0.0/0 gateway=1.0.0.1 scope=10 target-scope=12
add comment="WAN1 ISP1 via PPPoE - ping host 1" distance=1 dst-address=1.0.0.1/32 gateway=pppoe-out scope=10 target-scope=11
add check-gateway=ping distance=2 dst-address=0.0.0.0/0 gateway=9.9.9.9 scope=10 target-scope=12
add comment="WAN1 ISP1 via PPPoE - ping host 2" distance=2 dst-address=9.9.9.9/32 gateway=pppoe-out scope=10 target-scope=11
add comment="ISP2 via Backup Router" disabled=no distance=3 dst-address=0.0.0.0/0 gateway=172.22.1.2 routing-table=main scope=10 target-scope=30
add comment="4G/LTE ISP via ether2" disabled=no distance=2 dst-address=0.0.0.0/0 gateway=192.168.4.1 routing-table=main scope=10 target-scope=35

This is the only way I found to make them work (and not completely).
The open issue is:
1 - the maximum distance that works for ISP3 (4G/LTE) is 2, which has to be the same as the distance for ISP2 (routing via VLAN100 to the other router) - WHY? My understanding: the route to ISP2 via the backup router (VLAN100) is always shown in black (even when there is no connection, it never becomes red), so it will never let the ISP3 route work if its distance is higher than 2 (I believe the far end is not pingable, but the route still has a connection to the far-end gateway IP address in VLAN100, 172.22.1.2).

Based on this, I redefined the routes from scratch for three WANs, as below:

TRIPLE WAN RECURSIVE ROUTES (new definitions only router 1)

/ip route
add check-gateway=ping dst-address=0.0.0.0/0 gateway=9.9.9.9 scope=10 target-scope=13 distance=3
add check-gateway=ping dst-address=0.0.0.0/0 gateway=1.0.0.1 scope=10 target-scope=13 distance=3
add check-gateway=ping dst-address=0.0.0.0/0 gateway=8.8.8.8 scope=10 target-scope=13 distance=10
add check-gateway=ping dst-address=0.0.0.0/0 gateway=208.67.222.222 scope=10 target-scope=13 distance=10
add check-gateway=ping dst-address=0.0.0.0/0 gateway=8.8.4.4 scope=10 target-scope=13 distance=15
add check-gateway=ping dst-address=0.0.0.0/0 gateway=1.1.1.1 scope=10 target-scope=13 distance=15

add dst-address=9.9.9.9/32 gateway=pppoe-out scope=10 target-scope=12 comment="WAN1 ISP1 via PPPoE - ping Host 1" distance=3
add dst-address=1.0.0.1/32 gateway=pppoe-out scope=10 target-scope=12 comment="WAN1 ISP1 via PPPoE - ping Host 2" distance=3
add dst-address=8.8.8.8/32 gateway=172.22.1.2 scope=10 target-scope=12 comment="ISP2 via Backup Router - ping Host 1" distance=10
add dst-address=208.67.222.222/32 gateway=172.22.1.2 scope=10 target-scope=12 comment="ISP2 via Backup Router - ping Host 2" distance=10
add dst-address=8.8.4.4/32 gateway=192.168.3.1 scope=10 target-scope=12 comment="ether2 ISP3 via GW 4G/LTE - ping Host 1" distance=15
add dst-address=1.1.1.1/32 gateway=192.168.3.1 scope=10 target-scope=12 comment="ether2 ISP3 via GW 4G/LTE - ping Host 2" distance=15

Results:
1) the ISP1 direct connection via PPPoE on the router (ether1) works OK, as before (distance 3)
2) the ISP2 indirect connection via VLAN100 through router 2 does not respond to ping, so it is not reachable, despite the fact that it is shown in black and not in red (some screenshots below) (distance 10)
3) the ISP3 direct connection via GW 192.168.3.1 (4G/LTE, ether2) works properly: it takes over when (1) drops, while (2) is ignored, and when (1) is back the router switches back to it (distance 15)

So the failover route to ISP2 via the other router SHOULD provide a connection in case ISP1 drops, but it is not pingable. It was working with the previous route definitions, since those did not ping an external host (e.g. 8.8.8.8) through the gateway (172.22.1.2), but they did not allow ISP3, with its higher distance, to take over if both ISP1 and ISP2 are off.

I want to clarify that I also tried disabling the firewall filter completely, but nothing changed, so the problem is not the firewall blocking something the routes need to work properly.

Below are some screenshots of the state of the routes from WinBox.
I hope they are readable, since this is the first time I post images in the forum.
BTW, the red arrows show that the host address for the failover route to the other router is red (not reachable) while the route itself is still black.
router 1 - pppoe disconnected & ether2 connected - 2023-05-07.png
router 1 - pppoe + ether2 connected - 2023-05-07.png
 
anav
Forum Guru
Posts: 18959
Joined: Sun Feb 18, 2018 11:28 pm
Location: Nova Scotia, Canada

Re: VRRP and ISP Failover

Sun May 07, 2023 6:38 pm

Okay, but I sense that you have changed the requirements on us again LOL.

So on Router 1 you want:
a. if ISP1 fails, go to 4G/LTE
b. if ISP1 fails and 4G/LTE fails, go to Router 2

And on Router 2:
c. if ISP2 fails, go to 4G/LTE
d. if ISP2 fails and 4G/LTE fails, go to Router 1

Is that what you want? Stated clearly, any setup is usually possible. A muddled request will get muddled answers in the long run anyway.
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Sun May 07, 2023 6:45 pm

Hi anav,
My requirements are always the same and they have not changed. It's the configuration I'm building that does not match them :-).
1) router 1: ISP1 first, then ISP2 (failover via router 2), then finally 4G (ISP3)
2) router 2: ISP2 first, then ISP1 (failover via router 1), then finally 4G (ISP3)
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Sun May 07, 2023 8:36 pm

@anav,
maybe your questions/doubts came up while reading the routes I posted, but I can reassure you that I did not change my requirements.

In the old (yesterday's) configuration, the ISP2 routing (through VLAN100) was working OK, since I was not pinging any host (8.8.8.8) through the GW (172.22.1.2). Nevertheless, that old route definition was probably not correct for also managing the routing to ISP3 (4G/LTE) as a third option after ISP1 and ISP2.

The new three-WAN route definition is better and clearer. The distances (3 for ISP1, 10 for ISP2, 15 for ISP3) clearly reflect my requirements. Nevertheless, while ISP1 and ISP3 are working OK (pingable), ISP2 is not: pinging 8.8.8.8 through 172.22.1.1 is not working, so route 2 is not reachable and I cannot go to ISP2 on router 2.

I do not know why the route through VLAN100 is not able to ping an external host (8.8.8.8). I confirm the firewall is not blocking anything. Moreover, I did another test: I created an access port on router 1 for VLAN100 and connected my laptop with a fixed IP address (172.22.1.3, 172.22.1.0/24, GW/DNS 172.22.1.1). I have full internet connectivity, yet I'm not able to ping the host 8.8.8.8. So this is for sure the issue that keeps the route on VLAN100 from working, and I believe it is the KEY to moving forward and making everything work (I really hope).

Tnx for your patience; I hope I was clear enough in describing the problem.

EDIT: maybe the problem in the route definition to ISP2 is that ISP1 (pppoe-out on ether1) and ISP3 (fixed IP on ether2) are both associated with interfaces that are part of the WAN list, while ISP2 is on the bridge (VLAN100)?
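
A quick way to see where the probe packets actually go is to trace them from the router itself. A minimal sketch using standard RouterOS tools; the address is the ISP2 probe host from the routes above:

```
# which route is currently used for the probe host?
/ip route print detail where dst-address=8.8.8.8/32
# follow the path hop by hop; the same two hops repeating would point to a routing loop
/tool traceroute 8.8.8.8
# plain reachability test
/ping 8.8.8.8 count=5
```

Running the same three commands on both routers makes it easier to tell whether the packets ever leave through a WAN interface.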
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Mon May 08, 2023 10:21 pm

@anav, @Amm0,
I believe I'm not far from having this failover solution working with routing.
But without your experience and suggestions I may not succeed. I'm currently running out of ideas.

I really do not want to give up, since I have learned a lot with your help, and this solution, if it works, could help somebody else too.

Otherwise, I'd move to a netwatch/scripting solution based on VRRP, for which I have already drafted something that may also work.

Tnx in advance.
 
anav
Forum Guru
Posts: 18959
Joined: Sun Feb 18, 2018 11:28 pm
Location: Nova Scotia, Canada

Re: VRRP and ISP Failover

Tue May 09, 2023 12:05 am

Okay please post your latest config so I can take a fresh look
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Tue May 09, 2023 8:15 am

Here it is, tnx.
# may/09/2023 06:41:16 by RouterOS 7.8
# software id = 
#
# model = RB5009UG+S+
# serial number = 
/interface bridge
add admin-mac=XX:XX:XX:XX:XX:XX auto-mac=no comment=defconf frame-types=\
    admit-only-vlan-tagged name=bridge protocol-mode=none pvid=20 \
    vlan-filtering=yes
/interface vlan
add interface=bridge name=GUEST_VLAN vlan-id=3090
add interface=bridge name=MGMT_VLAN vlan-id=20
add interface=bridge name=VLAN5 vlan-id=5
add interface=bridge name=VLAN10 vlan-id=10
add interface=ether1 name=VLAN835 vlan-id=835
add comment="WAN Routing VLAN 100" interface=bridge name=WAN_ROUTING_VLAN \
    vlan-id=100
/interface pppoe-client
add disabled=no interface=VLAN835 name=pppoe-out user=XXX
/interface vrrp
add interface=VLAN5 name=vrrp5 priority=200 vrid=5
add interface=VLAN5 name=vrrp6 vrid=6
add interface=VLAN10 name=vrrp10 priority=254 vrid=10
add interface=VLAN10 name=vrrp11 vrid=11
add interface=MGMT_VLAN name=vrrp20 priority=254 vrid=20
add interface=MGMT_VLAN name=vrrp21 vrid=21
add interface=GUEST_VLAN name=vrrp3090 priority=200 vrid=90
add interface=GUEST_VLAN name=vrrp3091 vrid=91
/interface list
add comment=defconf name=WAN
add comment=defconf name=LAN
add name=VRRP
/interface wireless security-profiles
set [ find default=yes ] supplicant-identity=MikroTik
/ip pool
add name=MGMT_POOL ranges=XX.YY.72.201-XX.YY.72.250
add name=VLAN5_POOL ranges=XX.YY.70.201-XX.YY.70.250
add name=VLAN10_POOL ranges=XX.YY.71.231-XX.YY.71.250
add name=GUEST_POOL ranges=192.KK.ZZ.101-192.KK.ZZ.150
/ip dhcp-server
add address-pool=MGMT_POOL comment="Secured / Management DHCP Server" \
    interface=vrrp20 name=MGMT_DHCP
add address-pool=VLAN5_POOL comment="Standard DHCP Server" interface=vrrp6 \
    name=VLAN5_DHCP
add address-pool=VLAN10_POOL comment="IoT DHCP Server" interface=vrrp10 name=\
    VLAN10_DHCP
add address-pool=GUEST_POOL comment="Guest DHCP Server" interface=vrrp3091 \
    name=GUEST_DHCP
/interface bridge port
add bridge=bridge comment=defconf frame-types=\
    admit-only-untagged-and-priority-tagged interface=ether3 pvid=5
add bridge=bridge comment=defconf frame-types=\
    admit-only-untagged-and-priority-tagged interface=ether4 pvid=10
add bridge=bridge comment=defconf frame-types=\
    admit-only-untagged-and-priority-tagged interface=ether5 pvid=3090
add bridge=bridge comment=defconf frame-types=\
    admit-only-untagged-and-priority-tagged interface=ether6 pvid=100
add bridge=bridge comment=defconf interface=ether7
add bridge=bridge comment=defconf frame-types=\
    admit-only-untagged-and-priority-tagged interface=ether8 pvid=20
add bridge=bridge comment=defconf frame-types=admit-only-vlan-tagged \
    interface=sfp-sfpplus1
/ip neighbor discovery-settings
set discover-interface-list=LAN
/ip settings
set rp-filter=loose
/ipv6 settings
set disable-ipv6=yes
/interface bridge vlan
add bridge=bridge comment="VLAN Standard" tagged=bridge,sfp-sfpplus1 \
    untagged=ether3 vlan-ids=5
add bridge=bridge comment="VLAN IoT" tagged=bridge,sfp-sfpplus1 untagged=\
    ether4 vlan-ids=10
add bridge=bridge comment="VLAN Secured / Management" tagged=\
    bridge,sfp-sfpplus1 untagged=ether8 vlan-ids=20
add bridge=bridge comment="VLAN Guest" tagged=bridge,sfp-sfpplus1 untagged=\
    ether5 vlan-ids=3090
add bridge=bridge comment="WAN Routing VLAN 100" tagged=bridge,sfp-sfpplus1 \
    untagged=ether6 vlan-ids=100
/interface list member
add comment=defconf interface=bridge list=LAN
add comment=defconf interface=ether1 list=WAN
add interface=ether2 list=WAN
add interface=pppoe-out list=WAN
add interface=vrrp5 list=VRRP
add interface=vrrp6 list=VRRP
add interface=vrrp10 list=VRRP
add interface=vrrp11 list=VRRP
add interface=vrrp20 list=VRRP
add interface=vrrp21 list=VRRP
add interface=vrrp3090 list=VRRP
add interface=vrrp3091 list=VRRP
/ip address
add address=XX.YY.72.112/24 comment="Secure / Management Network Gateway" \
    interface=MGMT_VLAN network=XX.YY.72.0
add address=XX.YY.70.112/24 comment="VLAN Standard Gateway" interface=VLAN5 \
    network=XX.YY.70.0
add address=XX.YY.71.112/24 comment="VLAN IoT Gateway" interface=VLAN10 \
    network=XX.YY.71.0
add address=192.KK.ZZ.254/24 comment="VLAN Guest Gateway" interface=\
    GUEST_VLAN network=192.KK.ZZ.0
add address=XX.YY.72.115 interface=vrrp20 network=XX.YY.72.115
add address=XX.YY.72.116 interface=vrrp21 network=XX.YY.72.116
add address=XX.YY.70.115 interface=vrrp5 network=XX.YY.70.115
add address=XX.YY.70.116 interface=vrrp6 network=XX.YY.70.116
add address=XX.YY.71.115 interface=vrrp10 network=XX.YY.71.115
add address=XX.YY.71.116 interface=vrrp11 network=XX.YY.71.116
add address=192.KK.ZZ.1 interface=vrrp3090 network=192.KK.ZZ.1
add address=192.KK.ZZ.2 interface=vrrp3091 network=192.KK.ZZ.2
add address=172.22.1.2/24 comment="WAN Routing VLAN Gateway #02" interface=\
    WAN_ROUTING_VLAN network=172.22.1.0
add address=192.168.3.4/24 comment=\
    "ether2 fixed ip address to the 4G/LTE modem for the backup connection" \
    interface=ether2 network=192.168.3.0
/ip dhcp-client
add default-route-distance=2 disabled=yes interface=ether2
/ip dhcp-server network
add address=XX.YY.70.0/24 dns-server=XX.YY.70.112,1.1.1.1,8.8.8.8 gateway=\
    XX.YY.70.116
add address=XX.YY.71.0/24 dns-server=XX.YY.71.112,1.1.1.1,8.8.8.8 gateway=\
    XX.YY.71.115
add address=XX.YY.72.0/24 dns-server=XX.YY.72.112,1.1.1.1,8.8.8.8 gateway=\
    XX.YY.72.115
add address=192.KK.ZZ.0/24 dns-server=192.KK.ZZ.254,1.1.1.1,8.8.8.8 \
    gateway=192.KK.ZZ.1
/ip dns
set allow-remote-requests=yes servers=1.1.1.1,8.8.8.8
/ip dns static
add address=XX.YY.72.116 comment="Secured / Management Network Gateway" name=\
    router.lan
add address=159.148.172.226 name=upgrade.mikrotik.com
add address=173.194.76.108 name=smtp..gmail.com
/ip firewall address-list
add address=XX.YY.72.0/24 comment="Admin List Address" list=Admin
/ip firewall filter
add action=accept chain=input comment=\
    "defconf: accept established,related,untracked" connection-state=\
    established,related,untracked
add action=drop chain=input comment="defconf: drop invalid" connection-state=\
    invalid
add action=accept chain=input comment="accept vrrp packets" protocol=vrrp
add action=accept chain=input comment="defconf: accept ICMP" protocol=icmp
add action=accept chain=input comment=\
    "allow VLAN 5 only (inter-vlan is blocked)" dst-address=XX.YY.70.0/24 \
    src-address=XX.YY.70.0/24
add action=accept chain=input comment=\
    "allow VLAN 10 only (inter-vlan is blocked)" dst-address=XX.YY.71.0/24 \
    src-address=XX.YY.71.0/24
add action=accept chain=input comment=\
    "allow MANAGEMENT VLAN 20 only (inter-vlan is blocked)" dst-address=\
    XX.YY.72.0/24 src-address=XX.YY.72.0/24
add action=accept chain=input comment=\
    "allow GUEST VLAN 3090 only (inter-vlan is blocked)" disabled=yes \
    dst-address=192.KK.ZZ.0/24 src-address=192.KK.ZZ.0/24
add action=accept chain=input comment="\"defconf: accept local loopback (for D\
    ude, RADIUS, user-manager, CAPsMAN, Wireguard) (https://forum.mikrotik.com\
    /viewtopic.php\?t=180838)" dst-address=127.0.0.1
add action=reject chain=input comment="*** TBC LOGGING *** optional --> useful\
    \_but only if interested in tracking LAN issues (https://forum.mikrotik.co\
    m/viewtopic.php\?t=180838) - The purpose of the action=reject rule is to p\
    revent users in LAN from waiting for tens of seconds to get a timeout if t\
    hey are trying to connect to forbidden destinations, and of course for the\
    \_admin to be aware of traffic that has the potential to be a problem (aka\
    \_pinpoint device with issues)." in-interface-list=LAN log=yes \
    log-prefix="*** TRACKING LAN ISSUES ***" reject-with=\
    icmp-admin-prohibited
add action=drop chain=input comment=\
    "block everything else - not present in RB5009 default, added from CCR2216."
add action=accept chain=forward comment="defconf: accept in ipsec policy" \
    ipsec-policy=in,ipsec
add action=accept chain=forward comment="defconf: accept out ipsec policy" \
    ipsec-policy=out,ipsec
add action=fasttrack-connection chain=forward comment="defconf: fasttrack" \
    connection-state=established,related hw-offload=yes
add action=accept chain=forward comment=\
    "defconf: accept established,related, untracked" connection-state=\
    established,related,untracked
add action=accept chain=forward comment="need this rule to manage the ISP2 fai\
    lover (on this router), using the ISP1 from the other VRRP router, otherwi\
    se these packets will be dropped as invalid by the 'drop invalid' rule." \
    in-interface-list=VRRP out-interface=WAN_ROUTING_VLAN
add action=accept chain=forward comment="need this rule to manage the ISP fail\
    over on the other VRRP router, otherwise these packets will be discarded a\
    s invalid by the next rule." disabled=yes in-interface-list=VRRP \
    out-interface=pppoe-out
add action=drop chain=forward comment="defconf: drop invalid" \
    connection-state=invalid log=yes log-prefix="*** invalid ***"
add action=accept chain=forward comment=\
    "allow internet traffic for VLAN 100 for WAN Routing for ISP failover." \
    in-interface=WAN_ROUTING_VLAN out-interface-list=WAN
add action=accept chain=forward comment="allow internet traffic (all vrrp inte\
    rfaces) - not present in RB5009 default, added by xxx (who used all-vlan i\
    nstead)." in-interface-list=VRRP out-interface-list=WAN
add action=accept chain=forward comment="allow port forwarding - TO BE VERIFIE\
    D WHETHER REALLY NEEDED (https://forum.mikrotik.com/viewtopic.php\?t=180838)\
    " connection-nat-state=dstnat disabled=yes
add action=reject chain=forward comment="*** TBC LOGGING *** optional --> usef\
    ul for tracking LAN issues - in most installations the rule doesn't have t\
    o care about multicast traffic because it never sees it (https://forum.mik\
    rotik.com/viewtopic.php\?t=180838) - The purpose of the action=reject rule\
    \_is to prevent users in LAN from waiting for tens of seconds to get a tim\
    eout if they are trying to connect to forbidden destinations, and of cours\
    e for the admin to be aware of traffic that has the potential to be a prob\
    lem (aka pinpoint device with issues)." dst-address=!0.0.0.0/0 \
    in-interface-list=LAN log=yes log-prefix="*** TRACK LAN ISSUES ***" \
    reject-with=icmp-admin-prohibited
add action=drop chain=forward comment="defconf: drop all from WAN not DSTNATed\
    \_- drop access to clients behind NAT from WAN - drops all new connection \
    attempts from the WAN port to our LAN network (unless DstNat is used). Wit\
    hout this rule, if an attacker knows or guesses your local subnet, he/she \
    can establish connections directly to local hosts and cause a security thr\
    eat." connection-nat-state=!dstnat connection-state=new \
    in-interface-list=WAN
add action=drop chain=forward comment="block everything else - not present in\
    \_RB5009 default." log-prefix=\
    "*** blocked by fwd ***"
/ip firewall nat
add action=src-nat chain=srcnat ipsec-policy=out,none out-interface=pppoe-out \
    to-addresses=xx.xx.xx.xx
add action=masquerade chain=srcnat ipsec-policy=out,none out-interface=ether2
add action=masquerade chain=srcnat comment="defconf: masquerade" disabled=yes \
    ipsec-policy=out,none out-interface-list=WAN
/ip route
add check-gateway=ping distance=3 dst-address=0.0.0.0/0 gateway=9.9.9.9 \
    scope=10 target-scope=13
add check-gateway=ping distance=3 dst-address=0.0.0.0/0 gateway=1.0.0.1 \
    scope=10 target-scope=13
add check-gateway=ping disabled=no distance=10 dst-address=0.0.0.0/0 gateway=\
    8.8.8.8 pref-src="" routing-table=main scope=10 suppress-hw-offload=no \
    target-scope=13
add check-gateway=ping disabled=no distance=10 dst-address=0.0.0.0/0 gateway=\
    208.67.222.222 pref-src="" routing-table=main scope=10 \
    suppress-hw-offload=no target-scope=13
add check-gateway=ping distance=15 dst-address=0.0.0.0/0 gateway=8.8.4.4 \
    scope=10 target-scope=13
add check-gateway=ping distance=15 dst-address=0.0.0.0/0 gateway=1.1.1.1 \
    scope=10 target-scope=13
add comment="WAN1 ISP2 via PPPoE - ping Host 1" distance=3 dst-address=\
    9.9.9.9/32 gateway=pppoe-out scope=10 target-scope=12
add comment="WAN1 ISP2 via PPPoE - ping Host 2" distance=3 dst-address=\
    1.0.0.1/32 gateway=pppoe-out scope=10 target-scope=12
add comment="ISP1 via Backup Router - ping Host 1" disabled=no distance=10 \
    dst-address=8.8.8.8/32 gateway=172.22.1.1 pref-src="" routing-table=main \
    scope=10 suppress-hw-offload=no target-scope=12
add comment="ISP1 via Backup Router - ping Host 2" disabled=no distance=10 \
    dst-address=208.67.222.222/32 gateway=172.22.1.1 pref-src="" \
    routing-table=main scope=10 suppress-hw-offload=no target-scope=12
add comment="ether2 ISP3 via GW 4G/LTE - ping Host 1" distance=15 \
    dst-address=8.8.4.4/32 gateway=192.168.3.1 scope=10 target-scope=12
add comment="ether2 ISP3 via GW 4G/LTE - ping Host 2" distance=15 \
    dst-address=1.1.1.1/32 gateway=192.168.3.1 scope=10 target-scope=12
/ip service
set telnet disabled=yes
set ftp disabled=yes
set www disabled=yes
set www-ssl certificate=Webfig disabled=no
set api disabled=yes
/system clock
set time-zone-name=XXX
/system identity
set name="MikroTik RB5009 #02"
/system ntp client
set enabled=yes
/system ntp client servers
add address=194.0.5.123
add address=216.239.32.15
/tool mac-server
set allowed-interface-list=none
/tool mac-server mac-winbox
set allowed-interface-list=LAN
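
One thing that may be worth checking when comparing this export (router #02) with the triple-WAN routes posted for router 1 on May 07: both routers appear to pin the same probe hosts (8.8.8.8 and 208.67.222.222) to each other. Router 1 sends them to 172.22.1.2 and router #02 sends them back to 172.22.1.1, so a ping to 8.8.8.8 would bounce between the two routers until its TTL expires. That would also match the laptop test on VLAN100. Below is a minimal sketch of one possible way around it, assuming each router uses its own pair of probe hosts for the inter-router path and the peer pins that pair to its PPPoE link. The hosts 149.112.112.112 and 208.67.220.220 are only illustrative choices, not taken from the thread:

```
# router #02: probe ISP1 through router 1 with hosts that router 1 does not send back
# (these would replace the existing 8.8.8.8 and 208.67.222.222 routes with distance=10)
/ip route
add check-gateway=ping distance=10 dst-address=0.0.0.0/0 gateway=149.112.112.112 scope=10 target-scope=13
add check-gateway=ping distance=10 dst-address=0.0.0.0/0 gateway=208.67.220.220 scope=10 target-scope=13
add distance=10 dst-address=149.112.112.112/32 gateway=172.22.1.1 scope=10 target-scope=12
add distance=10 dst-address=208.67.220.220/32 gateway=172.22.1.1 scope=10 target-scope=12

# router #02: pin router 1's probe hosts to the local PPPoE link
add distance=3 dst-address=8.8.8.8/32 gateway=pppoe-out scope=10 target-scope=12
add distance=3 dst-address=208.67.222.222/32 gateway=pppoe-out scope=10 target-scope=12

# router 1: pin router #02's probe hosts to the local PPPoE link
/ip route
add distance=3 dst-address=149.112.112.112/32 gateway=pppoe-out scope=10 target-scope=12
add distance=3 dst-address=208.67.220.220/32 gateway=pppoe-out scope=10 target-scope=12
```

The trade-off of this kind of pinning is that a pinned host is only reachable through that one link, so it is better not to hand the same addresses to clients as DNS servers.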
 
anav
Forum Guru
Posts: 18959
Joined: Sun Feb 18, 2018 11:28 pm
Location: Nova Scotia, Canada

Re: VRRP and ISP Failover

Tue May 09, 2023 3:51 pm

I can only go with what I know works.

(1) From this
/interface bridge
add admin-mac=XX:XX:XX:XX:XX:XX auto-mac=no comment=defconf frame-types=\
admit-only-vlan-tagged name=bridge protocol-mode=none pvid=20 \
vlan-filtering=yes

TO
/interface bridge
add admin-mac=XX:XX:XX:XX:XX:XX auto-mac=no comment=defconf frame-types=admit-all \
name=bridge protocol-mode=none vlan-filtering=yes


(2) Add ingress-filtering=yes to all bridge ports listed

(3) The obfuscation of private IPs is not required: they are private, not public, so it matters zilch (aka ridonkulous).

(4) The firewall rules are not efficient in the least, and some are just wrong.
For example, these have no business in the input chain.
add action=accept chain=input comment=\
"allow VLAN 5 only (inter-vlan is blocked)" dst-address=XX.YY.70.0/24 \
src-address=XX.YY.70.0/24
add action=accept chain=input comment=\
"allow VLAN 10 only (inter-vlan is blocked)" dst-address=XX.YY.71.0/24 \
src-address=XX.YY.71.0/24
add action=accept chain=input comment=\
"allow MANAGEMENT VLAN 20 only (inter-vlan is blocked)" dst-address=\
XX.YY.72.0/24 src-address=XX.YY.72.0/24
add action=accept chain=input comment=\
"allow GUEST VLAN 3090 only (inter-vlan is blocked)" disabled=yes \
dst-address=192.KK.ZZ.0/24 src-address=192.KK.ZZ.0/24


I suggest you do not add any firewall rules beyond the defaults until you understand how to use them.
In other words, start the firewall rules from scratch and redo them, with the intent of making them leaner.

(5) Your source nat rules seem confused.
/ip firewall nat
add action=src-nat chain=srcnat ipsec-policy=out,none out-interface=pppoe-out \
to-addresses=xx.xx.xx.xx

If it's a dynamic WAN IP, to-addresses is not required and the action is masquerade.
If it's a static WAN IP, then the format is correct, but PPPoE is dynamic, is it not?

The ether2 rule with a fixed WAN IP should be
add action=src-nat chain=srcnat out-interface=ether2 to-addresses=192.168.3.4

The default (third) rule you can get rid of.

(6) Working on routes section separately.........
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Tue May 09, 2023 6:44 pm

@anav, first of all thank you for your help on this.

I can only go with what I know works.

(1) From this
/interface bridge
add admin-mac=XX:XX:XX:XX:XX:XX auto-mac=no comment=defconf frame-types=\
admit-only-vlan-tagged name=bridge protocol-mode=none pvid=20 \
vlan-filtering=yes

To
/interface bridge
add admin-mac=XX:XX:XX:XX:XX:XX auto-mac=no comment=defconf frame-types=admit-all \
name=bridge protocol-mode=none vlan-filtering=yes
OK, I'll do this change.
The reason for the orange part is to access the bridge only through the MGMT VLAN in a tagged way. Is it wrong to do this?


(2) Add ingress-filtering=yes to all bridge ports listed

(3) The obfuscation of private IPs is not required, they are private not public, matters zilcho ( aka ridonkulous)
Ingress filtering is already yes on all ports. I believe it's the default.
OK points taken.

(4) The firewall rules are not efficient in the least and some are just wrong.......
For example these have no business in the input chain.
add action=accept chain=input comment=\
"allow VLAN 5 only (inter-vlan is blocked)" dst-address=XX.YY.70.0/24 \
src-address=XX.YY.70.0/24
add action=accept chain=input comment=\
"allow VLAN 10 only (inter-vlan is blocked)" dst-address=XX.YY.71.0/24 \
src-address=XX.YY.71.0/24
add action=accept chain=input comment=\
"allow MANAGEMENT VLAN 20 only (inter-vlan is blocked)" dst-address=\
XX.YY.72.0/24 src-address=XX.YY.72.0/24
add action=accept chain=input comment=\
"allow GUEST VLAN 3090 only (inter-vlan is blocked)" disabled=yes \
dst-address=192.KK.ZZ.0/24 src-address=192.KK.ZZ.0/24


Suggest you have no business making any firewall rules other than the defaults until you understand how to use them.
In other words, start the firewall rules from scratch and redo, with the intent to make it leaner.
OK, these rules in the input chain are meant to keep the VLANs separate (block inter-VLAN traffic). How can I do this the right way in the forward chain? I was not able to make it work there (shame on me :-(). Can you help me with this? Thanks.

(5) Your source nat rules seem confused.
/ip firewall nat
add action=src-nat chain=srcnat ipsec-policy=out,none out-interface=pppoe-out \
to-addresses=xx.xx.xx.xx

If it's a dynamic WAN IP, to-addresses is not required and the action is masquerade.
If it's a static WAN IP, then the format is correct, but PPPoE is dynamic, is it not?
I have a fixed public IP address.

The ether2 rule with a fixed WAN IP should be
add action=src-nat chain=srcnat out-interface=ether2 to-addresses=192.168.3.4

The default (third) rule you can get rid of.
OK, clear, I'll follow your suggestions.

Thank you.
 
anav
Forum Guru
Forum Guru
Posts: 18959
Joined: Sun Feb 18, 2018 11:28 pm
Location: Nova Scotia, Canada

Re: VRRP and ISP Failover

Tue May 09, 2023 7:45 pm

Yes, the best way to block inter-VLAN traffic is to put this at the end of the forward chain:

add chain=forward action=drop comment="drop all else"

Then, above this line and below most of the default rules, you put the rules for the traffic you actually need.

/ip firewall filter
# {Input Chain}
# (default rules)
add action=accept chain=input comment="defconf: accept established,related,untracked" connection-state=established,related,untracked
add action=drop chain=input comment="defconf: drop invalid" connection-state=invalid
add action=accept chain=input comment="defconf: accept ICMP" protocol=icmp
add action=accept chain=input comment="defconf: accept to local loopback (for router uses)" dst-address=127.0.0.1
# (admin rules)
add action=accept chain=input comment="only allow some IP addresses to configure the router" in-interface=MGMT src-address-list=ADMIN
add action=accept chain=input comment="allow users to DNS and some devices to NTP services" dst-port=53,123 in-interface-list=LAN protocol=udp
add action=accept chain=input comment="allow users to DNS services" dst-port=53 in-interface-list=LAN protocol=tcp
# ***** (see the note at the end)
add action=drop chain=input comment="drop all else"
# {Forward Chain}
# (default rules)
add action=fasttrack-connection chain=forward comment="defconf: fasttrack" connection-state=established,related
add action=accept chain=forward comment="defconf: accept established,related,untracked" connection-state=established,related,untracked
add action=drop chain=forward comment="defconf: drop invalid" connection-state=invalid
# (admin rules)
add action=accept chain=forward comment="allow internet traffic" in-interface-list=LAN out-interface-list=WAN
add action=accept chain=forward comment="allow port forwarding" connection-nat-state=dstnat
# ----> any other needed traffic rules you can put here, before the drop rule <----
add action=drop chain=forward comment="drop all else"


*****
Put this rule in last, after the admin access rule is in place, so you don't lock yourself out of the router.
 
anav
Forum Guru
Forum Guru
Posts: 18959
Joined: Sun Feb 18, 2018 11:28 pm
Location: Nova Scotia, Canada

Re: VRRP and ISP Failover

Tue May 09, 2023 7:46 pm

There is no reason to access the bridge. There is nothing on the bridge, so it's a meaningless and confusing statement.
The only things one accesses are:

users/devices through VLANs (forward chain rules)
the router, for configuration purposes (input chain rule).
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Tue May 09, 2023 7:57 pm

Thank you, clear.
 
sindy
Forum Guru
Forum Guru
Posts: 10205
Joined: Mon Dec 04, 2017 9:19 pm

Re: VRRP and ISP Failover

Tue May 09, 2023 11:06 pm

  1. You can use a VLAN on the bridge as the WAN interconnection between routers A and B (where router A uses router B as the secondary WAN uplink and vice versa), but if the physical connection between the bridges breaks, each of the two routers will serve the LAN clients that can see it via its own WAN uplink because it will be unable to send the traffic via the other router. So to provide redundancy at this level, the routers should be connected using two cables, i.e. you either have to use bonding (so the two physical paths are treated as a single logical one by both bridges) or some flavor of the spanning tree protocol must run on the bridges so that you could use a ring topology. Is this currently the case?

    If the bandwidth requirements are so huge that passing each packet through the same physical path twice would cause packet loss, you have to use a third physical link for the WAN interconnection, and use MSTP to make sure that each VLAN will prefer another physical link whereas the third one will only be used if one of the other two fails (it needs awful hacks to do that using bonding).

  2. What I am missing there is some way of making sure that the WAN->LAN responses will follow the same path as the respective LAN->WAN requests, i.e. that responses to requests that got from a client to router B's primary WAN via router A because router A's primary WAN was down will not take a shortcut from router B directly to the client, bypassing router A. Because if these responses do bypass router A, the firewall on router A will not allow TCP sessions to establish as it will never see the SYN,ACK packet from the server.
    • the simplest possibility would be to use src-nat (or masquerade) when sending traffic to the other router via the WAN interconnect, but the fact that you bothered to obfuscate the first two bytes of the IP addresses on most of the VRRP interfaces suggests that these are public addresses, so you may not want to use NAT for them.
    • another possibility would be to use some auxiliary subnet to deliver the VRRP traffic between the devices, attach the actual XX.YY.7N.1/24 address to the VRRP interface, and add a route to XX.YY.7N.0/24 via the WAN interconnect link. So on the router where the VRRP interface is in backup mode, XX.YY.7N.0/24 would not exist at all, and that router would send packets for that subnet via the WAN interconnection link. But you seem to have two VRRP interfaces in each VLAN, each of which is a master on another router if everything works, so this method is not applicable either.
    • so the last possibility that works always is to use connection marking and corresponding routing marking - connections that come in via the WAN interconnect get a connection mark in mangle, and the responses within these marked connections get a routing mark in mangle so that they would be routed towards the client via the WAN interconnect rather than directly.
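
To illustrate the last possibility, here is a minimal sketch for RouterOS 7 as it might look on router 2. It is not taken from the posted configs: the mark name from-peer and the table name via-peer are made up for illustration, and I assume the interface names pppoe-out and WAN_ROUTING_VLAN and the peer's interconnect address 172.22.1.1 as used elsewhere in this topic. Router 1 would need the mirror image with 172.22.1.2.

```
# Hypothetical names: from-peer (connection mark), via-peer (routing table)
/routing table
add name=via-peer fib

/ip firewall mangle
# mark connections that the other router sends us over the interconnect
add chain=prerouting in-interface=WAN_ROUTING_VLAN connection-state=new \
    action=mark-connection new-connection-mark=from-peer passthrough=yes
# responses arriving on our own WAN within those connections get a routing mark
add chain=prerouting in-interface=pppoe-out connection-mark=from-peer \
    action=mark-routing new-routing-mark=via-peer passthrough=no

/ip route
# in the marked table everything goes back via the other router
add dst-address=0.0.0.0/0 gateway=172.22.1.1 routing-table=via-peer
```

Fasttracked packets skip mangle, so with this approach the fasttrack rule would probably have to be limited to unmarked connections (connection-mark=no-mark), otherwise the responses may still take the shortcut.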
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Wed May 10, 2023 12:08 am

@Sindy,
Thanks for helping me.

I already have a VLAN for the routing, created as suggested by Amm0 (VLAN 100).
You can see my config for router 2 above (router 1 is similar).
My issue is that, with the routes I've defined as per my post #65 (triple WAN recursive), the second route via the other router is not working: from vlan100 (subnet 172.22.0.x) I cannot ping a host (8.8.8.8), so it is not reachable. I do not know why. You can see a better description of the issue in my posts #65 and #68.

The 4 VLANs that are in VRRP are not public; I was only trying to obfuscate my subnet numbers (maybe too paranoid, sorry).
I would prefer the solution src-nat you mention, when you say:
"the simplest possibility would be to use src-nat (or masquerade) when sending traffic to the other router via the WAN interconnect, but the fact that you bothered to obfuscate the first two bytes of the IP addresses on most of the VRRP interfaces suggests that these are public addresses".

I believe this is what I'm trying to do with vlan 100, but it is not yet working.

I'm learning, so I'd appreciate it if you could give me more detail on how to implement this and make it work. Is vlan100 implemented properly to be used for failover routing from one router to the other? I'm not really concerned about having a redundant connection between the two routers, since they are not far from each other. I was also thinking about src-nat since you mentioned it in another thread (the link is in my post #8 of this thread), but I do not know exactly how to implement it. BTW, I'm also sure that the requests and the responses are not following the same path, since the firewall is dropping invalid packets in the forward chain (I've created rules to recover them before they are dropped).
Tnx.
 
sindy
Forum Guru
Forum Guru
Posts: 10205
Joined: Mon Dec 04, 2017 9:19 pm

Re: VRRP and ISP Failover

Wed May 10, 2023 10:01 am

I already have a vlan for the routing created as suggested by Amm0 (VLAN 100).
That's clear; all I was trying to say was that choosing a VLAN on the same bridge where your LANs are located as the WAN-side interconnect has some implications you have to take into account.

My issue is that the routes I've defined the as per my post #65 (triple wan recursive), the second route via the other router is not working since the vlan100 (subnet 172.22.0.x) is not able to ping an host (8.8.8.8), so it is not reachable. I do not know why.
When looking into this, I've noticed that the scope and target-scope parameters of your routes are not configured properly. The purpose of these parameters is to restrict the choice of "serving" routes for a "client" route - a client route with target-scope N can only use serving routes whose scope is not greater than N. All your routes have the same scope of 10, which means that if the route to 8.8.8.8 via 172.22.1.1 is inactive for some reason, the check-gateway process of the default route via 8.8.8.8 as gateway can use any of the other default routes.

However, this should cause a different misbehaviour than the one you observe - it should mean that if WAN 1 is unavailable (PPPoE down), the check-gateway processes of the default routes via 9.9.9.9 and 1.0.0.1 can still ping 9.9.9.9 and 1.0.0.1 using the other default routes, keeping these default routes active. So definitely fix that by setting the scope of all the default routes to something higher than 13, so that no default route could use another default route as its serving one, but I currently cannot see how this fix should resolve the issue that you cannot ping 8.8.8.8 via the WAN_ROUTING_VLAN.

On the other hand, I cannot see anything else to be wrong - even with the firewall rules enabled, the rule chain=forward in-interface=WAN_ROUTING_VLAN out-interface-list=WAN action=accept is enough to let the check-gateway pings from the other router pass through to the primary WAN.

So if it doesn't start working after you fix the target-scope values, you'll have to sniff on both routers and see what actual path the check-gateway pings and the responses to them take.
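
For the sniffing, something along these lines should be enough; it is only a sketch, assuming the interface names WAN_ROUTING_VLAN and pppoe-out from your config and 8.8.8.8 as the canary address under test. Run one command at a time on each router and compare where the echo requests and the replies show up.

```
# watch the check-gateway pings on the interconnect VLAN
/tool sniffer quick interface=WAN_ROUTING_VLAN ip-protocol=icmp ip-address=8.8.8.8
# then watch the same traffic on the primary WAN of the router that should forward it
/tool sniffer quick interface=pppoe-out ip-protocol=icmp ip-address=8.8.8.8
```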

But maybe it is just a misinterpretation of what the route state indicator is showing you? Of all routes with the same dst-address and different distance values in the same routing table, only those with the lowest distance value can be Active. So as long as the PPPoE client interfaces are up, the default routes via WAN_ROUTING_VLAN will stay inactive even though their gateway addresses are reachable, like in this example from elsewhere:

[me@myTik] > /ip route print detail where target-scope=11
0 A S dst-address=0.0.0.0/0 gateway=8.8.8.8 gateway-status=8.8.8.8 recursive via 10.25.13.1 ether1 check-gateway=ping distance=1 scope=30 target-scope=11
1 . . S dst-address=0.0.0.0/0 gateway=9.9.9.9 gateway-status=9.9.9.9 recursive via 172.16.1.1 ether2 check-gateway=ping distance=5 scope=30 target-scope=11


The 4 vlan that are in vrrp are not public, I was only trying to obfuscate my subnet numbers (maybe too much paranoid, sorry).
This shows why you have to take a lot of care when obfuscating - to avoid removing any information that is necessary for proper understanding.

I would prefer the solution src-nat you mention, when you say:
"the simplest possibility would be to use src-nat (or masquerade) when sending traffic to the other router via the WAN interconnect ..."

I believe this is what I'm trying to do with vlan 100, but it is not yet working.
I'd say it is enough to add a rule chain=srcnat ipsec-policy=out,none out-interface=WAN_ROUTING_VLAN action=masquerade into /ip firewall nat. You can optimize the NAT rules a bit afterwards, but first we need to make it work, therefore I suggest only the minimum changes needed.
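
Written out as commands, the rule described above would look like this (the comment text is just my wording; the interface name is assumed to be the one from your config):

```
/ip firewall nat
add chain=srcnat action=masquerade out-interface=WAN_ROUTING_VLAN \
    ipsec-policy=out,none comment="hide LAN sources behind the interconnect address"
```

With this in place the other router only ever sees the interconnect address as the source, so it sends the responses back over the interconnect and both directions pass through the same connection tracking.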

I'm not really concerned about having a redundant connection btw the two routers, since they are not so far one from each other.
It's not a matter of physical distance, it's a matter of how the whole redundancy concept is designed. As you bother to implement an active-active redundancy with two VRRP gateways per VLAN, I suppose that you want one group of devices in your LAN to prefer the uplink connected to WAN 1 of router A and another group of devices to prefer the uplink connected to WAN 1 of router B. But if the only interface connecting one of the routers to the LAN breaks, the outcome will be the same as if that router broke completely, because the WAN_ROUTING_VLAN goes via that same physical interface so all traffic of the LAN devices will use solely WAN 1 (or WAN 3) of the other router if that port breaks.

Of course, both theoretically and by practical experience, a temporary outage of an ISP uplink happens way more often than a breakdown of an interface of a router or of a router as a whole, but it costs you just two physical ports and a patchcord to implement L2 redundancy on LAN. For critical applications, it is a standard that even individual devices in LAN are connected to two different switches that are connected to two different routers.

I'm also sure I'm having issues that the request and the response are not following the same path, since in the firewall I've invalid packets dropped in the forward chain (for which I've created rules to recover them before being dropped).
Try the above and the extra firewall rules should not be necessary any more.
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Wed May 10, 2023 10:48 am

I solved the issue with pinging the route host(s). It was due to the fact that I was pinging the same hosts from the 2 different routers for the ISP failover route. So I specified different hosts for router 1 and router 2, and now the failover routing across the three WANs is working perfectly. Below are the working routes for one of the two routers, for the benefit of anybody who is interested; the lines in red (the distance=10 routes via the other router) have to use different hosts on the two routers to work properly.

@Sindy: the issue with VLAN 100 for ISP failover between the 2 routers remains. As you said, we have to be "sure that the WAN->LAN responses will follow the same path as the respective LAN->WAN requests" between the two routers. As of today, as I was saying before, in case of failover routing there are invalid packets from the VRRP interfaces that the routers would drop, and I have to recover them before they are dropped (viewtopic.php?t=195726). I strongly believe this is happening because the request and the response are not following the same path.

I would prefer to go the src-nat way, but I do not know how to implement it.

tnx.

add check-gateway=ping dst-address=0.0.0.0/0 gateway=9.9.9.9 scope=10 target-scope=13 distance=5
add check-gateway=ping dst-address=0.0.0.0/0 gateway=1.0.0.1 scope=10 target-scope=13 distance=5
add check-gateway=ping dst-address=0.0.0.0/0 gateway=8.8.8.8 scope=10 target-scope=13 distance=10
add check-gateway=ping dst-address=0.0.0.0/0 gateway=208.67.222.220 scope=10 target-scope=13 distance=10

add check-gateway=ping dst-address=0.0.0.0/0 gateway=8.8.4.4 scope=10 target-scope=13 distance=15
add check-gateway=ping dst-address=0.0.0.0/0 gateway=1.1.1.1 scope=10 target-scope=13 distance=15

add dst-address=9.9.9.9/32 gateway=pppoe-out scope=10 target-scope=12 comment="WAN1 ISP1 via PPPoE - ping Host 1" distance=5
add dst-address=1.0.0.1/32 gateway=pppoe-out scope=10 target-scope=12 comment="WAN1 ISP1 via PPPoE - ping Host 2" distance=5
add dst-address=8.8.8.8/32 gateway=172.22.1.2 scope=10 target-scope=12 comment="ISP2 via Backup Router - ping Host 1" distance=10
add dst-address=208.67.222.220/32 gateway=172.22.1.2 scope=10 target-scope=12 comment="ISP2 via Backup Router - ping Host 2" distance=10

add dst-address=8.8.4.4/32 gateway=192.168.4.1 scope=10 target-scope=12 comment="ether2 ISP3 via GW 4G/LTE - ping Host 1" distance=15
add dst-address=1.1.1.1/32 gateway=192.168.4.1 scope=10 target-scope=12 comment="ether2 ISP3 via GW 4G/LTE - ping Host 2" distance=15
 
sindy
Forum Guru
Forum Guru
Posts: 10205
Joined: Mon Dec 04, 2017 9:19 pm

Re: VRRP and ISP Failover

Wed May 10, 2023 11:07 am

I solved the route host(s) pinging issue. It was due to the fact I was pinging the same hosts from the 2 different routers for the ISP failover route.
Ah, so you had routed 8.8.8.8 via WAN_ROUTING_VLAN on both routers, so the request packets were circulating between the routers until their TTL finally expired.

You don't need to use different canary (virtual gateway) IPs at both routers but you must make sure that if a given canary address is routed via WAN_ROUTING_VLAN on one router, it is routed via the actual WAN 1 on the other. So e.g. 9.9.9.9 and 1.0.0.1 via WAN 1 on router A and via WAN_ROUTING_VLAN on router B, and 8.8.8.8 and 208.67.222.222 via WAN 1 on router B and via WAN_ROUTING_VLAN on router A.

@Sindy: it remains the issue about the VLAN 100 for ISP failover btw the 2 routers.
I assume you have written this before reading my post #79?
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Wed May 10, 2023 11:27 am

Thank you Sindy,
If you see my last posting, I solved the ping issue (I was pinging same hosts on both routers for the failover route, so I specified different ones only for this). Now all the routes are working and I tried all the possible failover combinations.

With regard to the scope parameter in the routes, if I understand properly, I just change all the scope=10 to scope=13? Am I right?

With regard to the src-nat rule, I'll try this today and let you know. First I'll try masquerade and then src-nat specifying the target IP address (I believe the same one I have on the router itself, not inverted as I do in the routes, e.g. 172.22.1.1 for router 1).
So, in this case, no marking or new routing tables will be necessary?

Thank you again.
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Wed May 10, 2023 11:38 am

We are crossing our messages. :-)

You don't need to use different canary (virtual gateway) IPs at both routers but you must make sure that if a given canary address is routed via WAN_ROUTING_VLAN on one router, it is routed via the actual WAN 1 on the other. So e.g. 9.9.9.9 and 1.0.0.1 via WAN 1 on router A and via WAN_ROUTING_VLAN on router B, and 8.8.8.8 and 208.67.222.222 via WAN 1 on router B and via WAN_ROUTING_VLAN on router A.
Sorry, I do not understand if I've to do anything else and make other changes to the routes based on this comment (apart from changing the scope from 10 to 13).

@Sindy: it remains the issue about the VLAN 100 for ISP failover btw the 2 routers.
I assume you have written this before reading my post #79?
Yes, tnx ;-)
 
sindy
Forum Guru
Forum Guru
Posts: 10205
Joined: Mon Dec 04, 2017 9:19 pm

Re: VRRP and ISP Failover

Wed May 10, 2023 12:17 pm

With regard to the scope parameter in the routes, if I properly understand, I just change all the scope=10 with scope=13? Am I right?
Not for all routes, only for those with dst-address=0.0.0.0/0.

With regard to the src-nat rule, I'll try this today, and I'll let you know.
...
So, in this case, will not be necessary any marking and creating new routes tables?
Correct, this way you won't need any additional routing tables and associated connection&route marking to ensure that the responses take the "proper" path.

You don't need to use different canary (virtual gateway) IPs at both routers...
Sorry, I do not understand if I've to do anything else and make other changes to the routes based on this comment (apart from changing the scope from 10 to 13).
You do not *have* to do anything else, but you *can* reduce the total number of canary IPs in use because you can use the same address as a canary one for both routers if you do it "properly".
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Wed May 10, 2023 12:44 pm

With regard to the scope parameter in the routes, if I properly understand, I just change all the scope=10 with scope=13? Am I right?
Not for all routes, only for those with dst-address=0.0.0.0/0.
OK, crystal clear.

With regard to the src-nat rule, I'll try this today, and I'll let you know.
...
So, in this case, will not be necessary any marking and creating new routes tables?
Correct, this way you won't need any additional routing tables and associated connection&route marking to ensure that the responses take the "proper" path.
OK, it should be simple and straightforward. I'll test the src-nat later on today and I'll let you know.


Sorry, I do not understand if I've to do anything else and make other changes to the routes based on this comment (apart from changing the scope from 10 to 13).
You do not *have* to do anything else, but you *can* reduce the total number of canary IPs in use because you can use the same address as a canary one for both routers if you do it "properly".
Sorry, what do you mean by "if you do it properly"?
Considering the 6 rows:
1) With the first 2 (distance=5) I'm checking ISP1, so I'm pointing to the WAN of the router itself (pppoe-out). Here there are no doubts about the path: everything is on the same router, I have exactly the same 2 rows on both routers, and I can use the same 2 canary IPs without problems.
2) With rows 3 & 4 (distance=10) (in red in one of my previous posts), I'm pointing to the other router's IP (so router 1 points to router 2's IP 172.22.1.2). These were not working since I was using the same canary IPs on both routers. As soon as I changed them (so I'm using 2 different ones for router 1 and another 2 for router 2, 4 different in total), the route started to work. Is there a different way to write it, to avoid using 4 different canary IPs?
3) It's the same story as point 1 (distance=15).

Tnx.
 
sindy
Forum Guru
Forum Guru
Posts: 10205
Joined: Mon Dec 04, 2017 9:19 pm

Re: VRRP and ISP Failover

Wed May 10, 2023 1:34 pm

Sorry, what do you mean, with "if you do it properly"?
Let's use just 4 rows per router (A, B) for simplicity of the illustration.

Router A:
dst-address=0.0.0.0/0 gateway=8.8.8.8 scope=13 target-scope=11 distance=1 check-gateway=ping
dst-address=0.0.0.0/0 gateway=1.1.1.1 scope=13 target-scope=11 distance=5 check-gateway=ping
dst-address=8.8.8.8 gateway=gw.of.primary.wan.A scope=10 target-scope=11
dst-address=1.1.1.1 gateway=interconnect.ip.of.B scope=10 target-scope=11


Router B:
dst-address=0.0.0.0/0 gateway=1.1.1.1 scope=13 target-scope=11 distance=1 check-gateway=ping
dst-address=0.0.0.0/0 gateway=8.8.8.8 scope=13 target-scope=11 distance=5 check-gateway=ping
dst-address=1.1.1.1 gateway=gw.of.primary.wan.B scope=10 target-scope=11
dst-address=8.8.8.8 gateway=interconnect.ip.of.A scope=10 target-scope=11


This "proper" way, router A sends the check ping packet to 1.1.1.1 from its own address interconnect.ip.of.A using the interconnect.ip.of.B as a gateway; once that ping packet arrives to router B, router B forwards it to the internet using gw.of.primary.wan.B as a gateway.

When you previously had the same routing on B as on A (the "wrong" way), the check ping from router A to 1.1.1.1 arrived to router B, but instead of forwarding it to internet via gw.of.primary.wan.B, router B forwarded it back to router A via interconnect.ip.of.A as a gateway because there was the route dst-address=1.1.1.1 gateway=interconnect.ip.of.A scope=10 target-scope=11.

Here, it can be done this "proper" way because if the primary WAN of router B fails, you do not want router A to use the LTE WAN of router B - instead, you want it to use its own LTE WAN. But in other situations the requirement may be different, so the setup would have to be adjusted accordingly, either by using unique canary addresses or by using not only the canary addresses of the primary uplink of router B but also those of router B's LTE uplink for the server routes at router A.

Yet another point for those who come here for inspiration - there is little point in using recursive routing for the WAN of the last resort. As you only ever use it if the other WANs have failed, there is nothing you can do if it fails as well, so there is no reason to monitor its state.
 
anav
Forum Guru
Forum Guru
Posts: 18959
Joined: Sun Feb 18, 2018 11:28 pm
Location: Nova Scotia, Canada

Re: VRRP and ISP Failover

Wed May 10, 2023 4:43 pm

Progress.
Things that are bugging me

(1) How does one LTE backup become available on two routers?
How does LTE work (one sim card?) Chateau router or MT LTE device with two ports, feeding both routers?

(2) What do you mean by interconnect link, or interconnect IP of A or B?
I will scope your choices :-)
Typically VRRP is
a. Decide on a subnet ( prefer vlans for everything, like bacon goes with everything)
b. One common IP is used as the VRRP address 172.22.0.1 interface VRRP (both routers)
c. One vrrp IP address on interface etherX for master Router1 - 172.22.0.2/24
d. One vrrp IP address on interface etherX for slave Router2 - 172.22.0.3/24
e. IP routes common to both add dst-address=0.0.0.0/0 gateway=172.22.0.1 table=main

Note: To check which router is live, look at the IP addresses: on the router that is master, 172.22.0.1 is live; on the other router the floating IP is shown in red.

(3) How do we deal with two sources of dhcp servers?
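
For point (2), a minimal sketch of what I mean, keeping the etherX placeholder and the 172.22.0.x addresses from steps a to e. The interface name vrrp1, the vrid and the priority values are only assumptions for illustration; adjust them to the real setup.

```
# Router1 (intended master: higher priority)
/interface vrrp
add name=vrrp1 interface=etherX vrid=1 priority=200
/ip address
add address=172.22.0.2/24 interface=etherX
add address=172.22.0.1/32 interface=vrrp1

# Router2 (backup: lower priority, same vrid and same virtual IP)
/interface vrrp
add name=vrrp1 interface=etherX vrid=1 priority=100
/ip address
add address=172.22.0.3/24 interface=etherX
add address=172.22.0.1/32 interface=vrrp1
```

The virtual address sits on the vrrp interface as a /32, while each router keeps its own /24 address on the underlying interface.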
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Wed May 10, 2023 5:21 pm

@Sindy,
Sorry, what do you mean, with "if you do it properly"?
Let's use just 4 rows per router (A, B) for simplicity of the illustration.

....

This "proper" way, router A sends the check ping packet to 1.1.1.1 from its own address interconnect.ip.of.A using the interconnect.ip.of.B as a gateway; once that ping packet arrives to router B, router B forwards it to the internet using gw.of.primary.wan.B as a gateway.

When you previously had the same routing on B as on A (the "wrong" way), the check ping from router A to 1.1.1.1 arrived to router B, but instead of forwarding it to internet via gw.of.primary.wan.B, router B forwarded it back to router A via interconnect.ip.of.A as a gateway because there was the route dst-address=1.1.1.1 gateway=interconnect.ip.of.A scope=10 target-scope=11.
Do you mean something like this, if I understood you properly?
# router #01
/ip route
add check-gateway=ping dst-address=0.0.0.0/0 gateway=9.9.9.9 scope=13 target-scope=13 distance=5
add check-gateway=ping dst-address=0.0.0.0/0 gateway=8.8.8.8 scope=13 target-scope=13 distance=10
add check-gateway=ping dst-address=0.0.0.0/0 gateway=1.0.0.1 scope=13 target-scope=13 distance=5
add check-gateway=ping dst-address=0.0.0.0/0 gateway=208.67.222.220 scope=13 target-scope=13 distance=10

add dst-address=9.9.9.9/32 gateway=pppoe-out scope=10 target-scope=12 comment="WAN1 ISP1 via PPPoE - ping Host 1" distance=5
add dst-address=8.8.8.8/32 gateway=172.22.1.2 scope=10 target-scope=12 comment="ISP2 via Backup Router - ping Host 1" distance=10
add dst-address=1.0.0.1/32 gateway=pppoe-out scope=10 target-scope=12 comment="WAN1 ISP1 via PPPoE - ping Host 2" distance=5
add dst-address=208.67.222.220/32 gateway=172.22.1.2 scope=10 target-scope=12 comment="ISP2 via Backup Router - ping Host 2" distance=10

add dst-address=0.0.0.0/0 gateway=192.168.4.1 scope=10 target-scope=12 comment="ether2 ISP3 via GW 4G/LTE" distance=15


# router #02
/ip route
add check-gateway=ping dst-address=0.0.0.0/0 gateway=8.8.8.8 scope=13 target-scope=13 distance=5
add check-gateway=ping dst-address=0.0.0.0/0 gateway=9.9.9.9 scope=13 target-scope=13 distance=10
add check-gateway=ping dst-address=0.0.0.0/0 gateway=208.67.222.220 scope=13 target-scope=13 distance=5
add check-gateway=ping dst-address=0.0.0.0/0 gateway=1.0.0.1 scope=13 target-scope=13 distance=10

add dst-address=8.8.8.8/32 gateway=pppoe-out scope=10 target-scope=12 comment="WAN1 ISP2 via PPPoE - ping Host 1" distance=5
add dst-address=9.9.9.9/32 gateway=172.22.1.1 scope=10 target-scope=12 comment="ISP1 via Backup Router - ping Host 1" distance=10
add dst-address=208.67.222.220/32 gateway=pppoe-out scope=10 target-scope=12 comment="WAN1 ISP2 via PPPoE - ping Host 2" distance=5
add dst-address=1.0.0.1/32 gateway=172.22.1.1 scope=10 target-scope=12 comment="ISP1 via Backup Router - ping Host 2" distance=10

add dst-address=0.0.0.0/0 gateway=192.168.4.1 scope=10 target-scope=12 comment="ether2 ISP3 via GW 4G/LTE" distance=15


Here, it can be done this "proper" way because if the primary WAN of router B fails, you do not want router A to use the LTE WAN of router B - instead, you want it to use its own LTE WAN. But in other situations the requirement may be different, so the setup would have to be adjusted accordingly, either by using unique canary addresses or by using not only the canary addresses of the primary uplink of router B but also those of router B's LTE uplink for the server routes at router A.

Yet another point for those who come here for inspiration - there is little point in using recursive routing for the WAN of the last resort. As you only ever use it if the other WANs have failed, there is nothing you can do if it fails as well, so there is no reason to monitor its state.
do you mean that:
1) I can solve the issue that for the failover back it will check only for remote WAN working ISP1 / ISP2 and not LTE. Also if from the tests I did, I thought I already solved this issue ...
2) the route for ISP3 (LTE) is just what I proposed in this new code? Is this right?

I still haven't had time to check the src-nat.

Tnx again, if this works it will be terrific.
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Wed May 10, 2023 5:41 pm

Progress.
Things that are bugging me

(1) How does one LTE backup become available on two routers?
How does LTE work (one sim card?) Chateau router or MT LTE device with two ports, feeding both routers?
I have one TP-Link 4G modem/router with one SIM card.
For now, during configuration and testing, I move the Ethernet cable coming from the TP-Link LAN side between ether2 of router 1 and ether2 of router 2.
In the end, I'll have 2 LAN cables from the TP-Link LAN side connected to ether2 of both routers, so both of them will have internet access over 4G.



(2) What do you mean by interconnect link.... or interconnect,ip of a or b ?????
I will scope your choices :-)
Typically VRRP is
a. Decide on a subnet ( prefer vlans for everything, like bacon goes with everything)
b. One common IP is used as the VRRP address 172.22.0.1 interface VRRP (both routers)
c. One vrrp IP address on interface etherX for master Router1 - 172.22.0.2/24
d. One vrrp IP address on interface etherX for slave Router2 - 172.22.0.3/24
e. IP routes common to both add dst-address=0.0.0.0/0 gateway=172.22.0.1 table=main

Note: To check which router is live, look at the IP addresses: on the master, 172.22.0.1 is live; on the other router, the floating IP is shown in red.
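
For readers following along, the a-e list above could be sketched roughly as below. This is only an illustration, not the configuration used in this thread (where VRRP runs on several VLAN interfaces in load sharing): the names vrrp1 and vrid=1 and the priority values are assumptions, and etherX is the placeholder from the list.

```routeros
# Router 1 (intended master): real address on the physical/VLAN interface,
# floating address as a /32 on the VRRP interface
/ip address
add address=172.22.0.2/24 interface=etherX
/interface vrrp
add name=vrrp1 interface=etherX vrid=1 priority=200
/ip address
add address=172.22.0.1/32 interface=vrrp1

# Router 2 (backup): same vrid, lower priority, its own real address
/ip address
add address=172.22.0.3/24 interface=etherX
/interface vrrp
add name=vrrp1 interface=etherX vrid=1 priority=100
/ip address
add address=172.22.0.1/32 interface=vrrp1
```

The clients in the subnet then use 172.22.0.1 as their gateway, whichever router currently holds it.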

(3) How do we deal with two sources of dhcp servers?
I believe the IPs to be used are the same already specified in the previous routes (and reused in the new ones).
What I understood about the route definitions as reshaped by Sindy is what I proposed in my response to his post.
I hope this is correct and that it will solve the issues.

tnx.
 
sindy
Forum Guru
Forum Guru
Posts: 10205
Joined: Mon Dec 04, 2017 9:19 pm

Re: VRRP and ISP Failover

Wed May 10, 2023 5:42 pm

do you mean something like this, if I properly understood what you mean?
Yes, exactly.

do you mean that:
1) I can solve the issue that for the failover back it will check only for remote WAN working ISP1 / ISP2 and not LTE. Also if from the tests I did, I thought I already solved this issue ...
2) the route for ISP3 (LTE) is just what I proposed in this new code? Is this right?
1) I am not sure I understand the question. I did not read the complete history of 70 posts in detail so I may have missed something - if you were concerned about how to prevent the backup via the other router from using LTE on that other router if its primary (PPPoE) WAN is down, then yes, this is the solution.
2) yes
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Wed May 10, 2023 5:57 pm

do you mean something like this, if I properly understood what you mean?
Yes, exactly.

do you mean that:
1) I can solve the issue that for the failover back it will check only for remote WAN working ISP1 / ISP2 and not LTE. Also if from the tests I did, I thought I already solved this issue ...
2) the route for ISP3 (LTE) is just what I proposed in this new code? Is this right?
1) I am not sure I understand the question. I did not read the complete history of 70 posts in detail so I may have missed something - if you were concerned about how to prevent the backup via the other router from using LTE on that other router if its primary (PPPoE) WAN is down, then yes, this is the solution.
2) yes
WOW, very excited, let me test the src-nat and then change the routes as per your suggestion ;-)
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Wed May 10, 2023 6:25 pm

@Sindy,
I implemented the src-nat masquerade on both routers as per your instructions, with VLAN100 as the out-interface.
Nevertheless, it does not work unless I include this rule in the forward chain of router 2 (previously it only worked when placed before the drop-invalid rule; now it is placed after it):
add action=accept chain=forward comment="need this rule to manage the ISP2 fai\
    lover (on this router), using the ISP1 from the other VRRP router, otherwi\
    se these packets will be dropped as invalid by the 'drop invalid' rule." \
    in-interface-list=VRRP out-interface=WAN_ROUTING_VLAN
I did this test: on router 2 I disconnected the ether1 cable (pppoe-out), and traffic is then routed via router 1's ISP (I verified this by checking my public IP). Nevertheless, internet access does not work unless I add this accept rule on router 2.

I'm not sure why this rule is necessary ...
 
sindy
Forum Guru
Forum Guru
Posts: 10205
Joined: Mon Dec 04, 2017 9:19 pm

Re: VRRP and ISP Failover

Wed May 10, 2023 7:03 pm

I'm not sure why this rule is necessary ...
This is an expected behaviour. Do you remember I was talking about simplification of the firewall rules once the basics start working? This is one of such potential simplifications - in fact, on each of the two routers, WAN_ROUTING_VLAN basically plays a role of yet another WAN (and also a role of yet another LAN). But as it is currently not a member of interface list WAN, the rule action=accept chain=forward in-interface-list=VRRP out-interface-list=WAN ignores packets that arrive at one of the VRRP interfaces and get forwarded via WAN_ROUTING_VLAN, so you need a separate rule (the action=accept chain=forward in-interface-list=VRRP out-interface=WAN_ROUTING_VLAN one) to allow them to pass. If you add WAN_ROUTING_VLAN as a member of the interface list WAN, this extra rule in filter will not be necessary any more, and you will also be able to use a single common action=masquerade rule matching on out-interface-list=WAN instead of multiple src-nat/masquerade rules matching on individual out-interface names. Similarly, if you also add WAN_ROUTING_VLAN as a member of interface list VRRP (and indeed, doing so will make the name of the list slightly misleading), you will not need the separate filter rule action=accept chain=forward in-interface=WAN_ROUTING_VLAN out-interface-list=WAN either.

But all of these are only optimisations for the sake of brevity, their contribution to throughput is negligible because all of those rules are only matched once per each new connection. So it doesn't make a noticeable difference in CPU load whether the first packet of a connection has to pass through three rules or only one in filter and then in nat. The clarity is more important - if a set of individual rules matching on individual interface names is more "readable" for you than a single "aggregate" rule matching on interface lists, keep it that way.
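
For anyone who does want the shorter variant described above, a minimal sketch could look like this. It assumes the interface lists WAN and VRRP already exist under those names, as in the rules quoted above; the per-interface NAT rules and the extra accept rules would then have to be removed by hand.

```routeros
# Let the interconnect VLAN be matched by the existing list-based rules
/interface list member
add list=WAN interface=WAN_ROUTING_VLAN
add list=VRRP interface=WAN_ROUTING_VLAN

# One common NAT rule instead of one rule per out-interface
/ip firewall nat
add chain=srcnat out-interface-list=WAN action=masquerade
```

Whether this is preferable to the explicit per-interface rules is a matter of readability only, as noted above.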
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Wed May 10, 2023 8:07 pm

OK Sindy, it's clear.
I'm a little bit scared to include WAN_ROUTING_VLAN in the WAN interface list, since it is also playing the role of a LAN, and, unlike all the other WAN interfaces, it is defined on the bridge too. So I'm worried about unexpected side effects of this inclusion.
From a VRRP standpoint, I started out using one of the 4 existing (and VRRP'ed) VLANs, and then discussed with Amm0 that it is cleaner to have a separate VLAN only for failover routing. Moreover, this VLAN does not need to be part of VRRP, since if one of the routers crashes, this VLAN is not needed.
In the end, to me it seems better to keep WAN_ROUTING_VLAN as a separate VLAN that manages the failover routing and to fine-tune the firewall rules. Now I understand better why this rule is needed (together with the internet access one that I already added to the rules when Amm0 suggested using a separate VLAN for routing).
@anav, what do you think?
@sindy, tomorrow I'll change the routing rules the "proper way" as per your suggestions and I'll do more tests and some cleaning, but it seems we are getting close to the end of this. I'll let you know the results.
Tnx a lot also to anav for helping me with the firewall rules and clarifying the underlying mechanisms; now I understand better how to work on it. I just started with Mikrotik a couple of months ago and I'm learning a lot from you.
You are very kind and willing to help others.
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Wed May 10, 2023 8:15 pm

One more question about masquerade vs srcnat:
I understand that if I have fixed IP addresses I should use srcnat for everything, but, if I'm not wrong, I also read somewhere that masquerade clears all the pending old sessions more quickly. So, in my case, is it indeed better to use srcnat rather than masquerade?
Tnx.
 
anav
Forum Guru
Forum Guru
Posts: 18959
Joined: Sun Feb 18, 2018 11:28 pm
Location: Nova Scotia, Canada

Re: VRRP and ISP Failover

Wed May 10, 2023 8:17 pm

Glad you are understanding VRRP LOL. I have a ways to go but thanks for the motivation to do so!

It appears as though your approach is something like having two different VRRPs live at the same time (so each router is concurrently a master and a slave), which is too much for me to grasp at the moment. It may be a stroke of genius or a needless complication, but I don't know enough to state one way or another. Lots of ways to skin a cat in RoS (apologies to rextended).
I am having trouble grasping using one VRRP LOL.
 
sindy
Forum Guru
Forum Guru
Posts: 10205
Joined: Mon Dec 04, 2017 9:19 pm

Re: VRRP and ISP Failover

Wed May 10, 2023 9:15 pm

I'm a little bit scared to include the WAN_ROUTING_VLAN in the WAN interfaces list,
...,
I already added to the rules, when Amm0 suggested to use a separate VLAN for routing).
That's what I had in mind when mentioning readability. There are actually no "side effects", but until you understand perfectly what each of your firewall rules does, it may seem to you as if there were :)

One more question about masquerade vs srcnat,
Both action=src-nat (as well as action=netmap in the srcnat chain) and action=masquerade activate a src-nat behaviour for matching connections. The difference is that masquerade is designed specifically for dynamically assigned addresses, which requires two things:
  • to use the address assigned to the out-interface as reply-dst-address, rather than to use one specified manually using to-addresses,
  • to remove all the connections using that dynamically assigned address from the list of tracked connections if the interface loses that dynamically assigned address.
So it's not that masquerade removes the old sessions "faster" - in fact, sessions whose srcnat behaviour has been activated using action=src-nat or action=netmap are not automatically removed at all.
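
As an illustration of the distinction, the two kinds of rule might sit side by side on router 2 like this. It is only a sketch under assumptions: that the PPPoE address is assigned dynamically, and that 172.22.1.2 is router 2's static address on WAN_ROUTING_VLAN (taken from the gateway used in the routes earlier in the thread).

```routeros
/ip firewall nat
# Dynamic address: masquerade picks up whatever address pppoe-out currently has
# and drops the tracked connections when that address goes away
add chain=srcnat out-interface=pppoe-out action=masquerade
# Static address: src-nat with an explicit to-addresses; connections are never
# removed automatically
add chain=srcnat out-interface=WAN_ROUTING_VLAN action=src-nat to-addresses=172.22.1.2
```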

This makes a difference if the address is assigned dynamically (some ISPs do that to discourage users from running servers on home grade connections or simply to squeeze more money from them for the luxury of having a stable address, some do that even for IPv6 assignments which cannot be technically justified at all) or if the interface physically fails. So for static address assignments, use of masquerade is not recommended because the CPU load caused by removal of hundreds or even thousands of connections when the physical link fails is enormous and can cause trouble also to connections not affected by that link failure directly.

For failover scenarios, removal of tracked connections that use the failed uplink is necessary, because otherwise they can never successfully migrate to the backup link as the packets coming from the LAN side keep being src-nated to the address of the failed link, so even if they make it to the remote server (often they don't as the backup ISP drops them as they come from an incorrect address), the server sends the responses to that address so they never reach the router. This is not important for TCP connections as the clients drop them and establish new ones from a different port at client side, but it is fatal for those UDP connections where the client uses the same source port for all connections, like SIP or IPsec ones. However, masquerade only helps with this if the primary link goes physically down or loses its address, not when it stays up and only becomes opaque for traffic due to an issue further in the ISP network. Nor does that help when the primary link becomes transparent again - in that case, these connections keep being src-nated to the address of the backup link although they are routed via the primary one.

So if you use this type of connections (SIP phones, mobile phones that use VoWiFi/Wifi calling, bare IPsec or L2TP VPNs), you need a housekeeping script that watches the state of the WANs and removes connections that got trapped this way from the connection tracking table, to allow them to get re-created according to the current state of the WAN uplinks. In your particular scenario, such a housekeeping script would have to remove all UDP connections whose reply-dst-address matches the one assigned to ether2 if the virtual gateway of at least one of the default routes via the other router is reachable, and remove all UDP connections whose reply-dst-address matches the one assigned to WAN_ROUTING_VLAN if the virtual gateway of at least one of the default routes via the primary uplink is reachable.
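
A rough, untested sketch of such a housekeeping script for router 2 is shown below, to be run periodically from /system scheduler. Several things in it are assumptions rather than part of the advice above: it treats "the virtual gateway is reachable" as "the corresponding default route is currently active", it takes the canary split from the route list earlier in the thread, and it also flushes LTE-bound connections when the primary uplink (not only the path via the other router) is back. The match on reply-dst-address uses a simple regular expression in which the dots are not escaped.

```routeros
# Canary split assumed for router 2:
#   8.8.8.8, 208.67.222.220 -> local PPPoE uplink
#   9.9.9.9, 1.0.0.1        -> via router 1 over WAN_ROUTING_VLAN
:local primaryUp (([:len [/ip route find dst-address=0.0.0.0/0 gateway=8.8.8.8 active=yes]] + [:len [/ip route find dst-address=0.0.0.0/0 gateway=208.67.222.220 active=yes]]) > 0)
:local peerUp (([:len [/ip route find dst-address=0.0.0.0/0 gateway=9.9.9.9 active=yes]] + [:len [/ip route find dst-address=0.0.0.0/0 gateway=1.0.0.1 active=yes]]) > 0)

# Remove tracked UDP connections still src-nated to the address of $iface
:local flush do={
  :do {
    :local cidr [:tostr [/ip address get [find interface=$iface] address]]
    :local addr [:pick $cidr 0 [:find $cidr "/"]]
    /ip firewall connection remove [find protocol=udp reply-dst-address~("^" . $addr . ":")]
  } on-error={ :log warning ("connection flush skipped for " . $iface) }
}

# LTE-bound connections are stale as soon as any better uplink is usable
:if ($primaryUp || $peerUp) do={ $flush iface="ether2" }
# Connections bound to the interconnect are stale once the primary is back
:if ($primaryUp) do={ $flush iface="WAN_ROUTING_VLAN" }
```

On router 1 the two groups of canary addresses would be swapped.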
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Wed May 10, 2023 9:34 pm

Tnx for the great and exhaustive answer.
 
emunt6
Frequent Visitor
Frequent Visitor
Posts: 87
Joined: Fri Feb 02, 2018 7:00 pm

Re: VRRP and ISP Failover

Thu May 11, 2023 12:28 am

Hi!

The solution to your problem is 2 routers with the same configuration: when the 1st is down, the 2nd takes over (or vice versa). Any other topology just adds complexity. You can select which gateway the traffic uses by means of "fwmark".
There is another drawback, something Mikrotik currently lacks: "true HA":
> Connection states/NAT sessions/VPN connections are not synchronized between the routers: when a failover happens, the end users will see a 5-10 sec "outage", because the new master needs to reconnect/recreate all the connections/sessions on the WAN side again. (Other vendors solve this problem.)
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Thu May 11, 2023 5:49 am

My configuration is 2 routers active in load sharing.
See in link @ load sharing: https://wiki.mikrotik.com/wiki/Manual:VRRP-examples
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Thu May 11, 2023 10:10 am

@sindy,
Good morning. For ISP3 (4G/LTE), I would need to carry the Ethernet link in a dedicated WAN VLAN (e.g. VLAN 200), transport it together with all my other LAN VLANs and use it on my 2 RB5009 routers, instead of connecting a cable to ether2 as I do today.

My concerns / doubts:
1) this is a WAN signal, so it is unprotected (my current TP-Link 4G modem is in DMZ and the firewall filtering is in the RB5009), so is it safe to transport this together with all the other LAN VLANs?
2) in terms of solution implementation, I imagine I'll have to create a VLAN as I normally do on the 2 routers, and all the switches will transport it. This VLAN200 will be added to the WAN interface list, but it will be a tagged interface associated with the switch. What confuses me is that VLAN200 will be a WAN but at the same time is defined within the switch like all the other LAN VLANs, to be transported and used across the LAN.
3) in terms of utilization of VLAN200, I can use access ports where needed, or alternatively, I can define on the router an address in the subnet of VLAN200 and use this (as I currently do for VLAN100). Am I right?
4) unless the TP-Link already supports VLANs, I have to use a dedicated small physical managed switch that has, as input, an access port on VLAN200 connected to the Ethernet cable from the TP-Link WAN side and, as output, a trunk port carrying VLAN200 to connect to the rest of my LAN. I believe I cannot use the current Mikrotik switch, since I'd have to create a new bridge for WAN for which I'd not have HW offloading, unless I can use the bridge currently defined for the LAN (having a port belonging to WAN within the bridge).
 
anav
Forum Guru
Forum Guru
Posts: 18959
Joined: Sun Feb 18, 2018 11:28 pm
Location: Nova Scotia, Canada

Re: VRRP and ISP Failover

Thu May 11, 2023 2:57 pm

I found the load sharing link on the updated docs site................
https://help.mikrotik.com/docs/display/ ... n+Examples

So you are adding another evil twist, no physical connection?
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Thu May 11, 2023 4:09 pm

I found the load sharing link on the updated docs site................
https://help.mikrotik.com/docs/display/ ... n+Examples

So you are adding another evil twist, no physical connection?
Yes, for one of the two routers I may not be able to run another WAN cable.
Do you think a WAN VLAN is feasible?
 
sindy
Forum Guru
Forum Guru
Posts: 10205
Joined: Mon Dec 04, 2017 9:19 pm

Re: VRRP and ISP Failover

Thu May 11, 2023 5:22 pm

1) this is a WAN signal, so it is unprotected (my current TP-Link 4G Modem is in DMZ and the firewall filtering is in the RB5009), so is it safe to transport this together with all the other LAN VLANs?
The purpose of VLANs is to separate the traffic in the individual VLANs from each other while it passes through the same cable. So unless you connect external equipment to trunk ports (where multiple VLANs are allowed) or your firewall configuration permits routing between the interfaces representing the individual VLANs on the router, there is no problem with having WAN traffic on the same cable as the LAN traffic, in distinct VLANs.

2) in terms of solution implementation, I imagine I'll have to create a VLAN as I normally do on the 2 routers, and all the switches will transport it. This VLAN200 will be added to the WAN interface list, but it will be a tagged interface associated with the switch. What confuses me is that VLAN200 will be a WAN but at the same time is defined within the switch like all the other LAN VLANs, to be transported and used across the LAN.
Nothing to be confused about, just another subnet in its own VLAN. Just think of it as of yet another four pairs in the same cable if that makes you feel better :)

3) in terms of utilization of VLAN200, I can use access ports where needed, or alternatively, I can define on the router an address on the subnet of VLAN200 and use this (as I currently do for VLAN100). Am I right?
The routers themselves will access VLAN200 on the bridge by means of an /interface vlan with vlan-id=200 attached to the bridge. "Untrusted" external devices should be connected to access ports with ingress filtering enabled.

4) unless the TP-Link already supports VLANs, I have to use a dedicated small physical managed switch that has, as input, an access port on VLAN200 connected to the Ethernet cable from the TP-Link WAN side and, as output, a trunk port carrying VLAN200 to connect to the rest of my LAN. I believe I cannot use the current Mikrotik switch, since I'd have to create a new bridge for WAN for which I'd not have HW offloading, unless I can use the bridge currently defined for the LAN (having a port belonging to WAN within the bridge).
The latter approach (an additional VLAN on the single common bridge) is the only possible one if VLAN 200 shares the same bridge with the LAN VLANs on the routers. Each physical interface can belong to at most one bridge; you could use a separate bridge for each VLAN instead (where the physical interface would not be a member port of any bridge and the VLAN interfaces would be attached to it directly), but you cannot use a mixed approach where some VLANs would share the common bridge and some would have their own.


So is there a single switch to which both the routers are connected, and you would connect the TP-Link LTE router to that switch?
 
metrotyranno
just joined
Posts: 14
Joined: Fri Mar 24, 2017 12:21 pm

Re: VRRP and ISP Failover

Thu May 11, 2023 5:28 pm

Hi,

I'm not sure if someone mentioned this yet, but starting from ROS 7.7 there is a VRRP issue where failover breaks quite easily. If you are still having issues with failovers, I recommend downgrading to ROS 7.6. This is a known issue and should be fixed in the upcoming 7.10 betas / 7.10 stable.

Cheers
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Thu May 11, 2023 5:38 pm

So is there a single switch to which both the routers are connected, and you would connect the TP-Link LTE router to that switch?
Yes, the same switch for the 2 routers, but the TP-Link will most probably be connected to a separate one. Does this matter?
My question is how to handle the connection of the cable coming from the TP-Link, which as of today is connected to ether2 of each of the two routers.
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Thu May 11, 2023 5:56 pm

Another, simpler option would be to buy a Mikrotik 4G/LTE router whose LAN port is a trunk carrying VLAN 200. Will this work?
My doubt about using the current 4G modem is that I need to plug its LAN cable into a switch (it can be the same one to which the 2 routers are connected or a different one), into a port that will be a WAN interface of that switch, and then I'll do a src-nat, but I'm not sure how to link it to VLAN 200.
 
sindy
Forum Guru
Forum Guru
Posts: 10205
Joined: Mon Dec 04, 2017 9:19 pm

Re: VRRP and ISP Failover

Thu May 11, 2023 7:31 pm

A switch itself has no "WAN" or "LAN" interfaces as far as it is concerned - it only has trunk ports and access ports to particular VLANs. The roles of those VLANs are determined by the routers. So you choose a port of the switch, make it an access port to VLAN 200, and connect the existing LTE router to it. That's it on the switch.

On each of the routers, an /interface vlan attached to the bridge represents the WAN 3 interface, to which you attach the DHCP client and for which you add a masquerade rule:

/interface bridge vlan
add vlan-ids=200 bridge=bridge tagged=bridge,etherXY
/interface vlan
add vlan-id=200 interface=bridge name=bridge.200.wan
/ip dhcp-client
add disabled=no interface=bridge.200.wan default-route-distance=20
/ip firewall nat
...
add chain=srcnat out-interface=bridge.200.wan action=masquerade


Even on the routers, "LAN" interfaces differ from "WAN" ones only in how you treat and use them. And the interfaces connected using the interconnect link (WAN_ROUTING_VLAN) even behave both ways - as WANs for outgoing connections from the LAN but at the same time they accept those incoming connections that end up routed out via the primary WAN.
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Thu May 11, 2023 7:46 pm

Perfect, now it's more clear.
I'll do my tests.
In these days I'm completely full with my job engagements.
Tnx a lot.
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Thu May 11, 2023 9:07 pm

A switch itself has no "WAN" or "LAN" interfaces as far as it is concerned - it only has trunk ports and access ports to particular VLANs. The roles of those VLANs are determined by the routers. So you choose a port of the switch, make it an access port to VLAN 200, and connect the existing LTE router to it. That's it on the switch.

On each of the routers, an /interface vlan attached to the bridge represents the WAN 3 interface, to which you attach the DHCP client and for which you add a masquerade rule:

/interface bridge vlan
add vlan-ids=200 bridge=bridge tagged=bridge,etherXY
/interface vlan
add vlan-id=200 interface=bridge name=bridge.200.wan
/ip dhcp-client
add disabled=no interface=bridge.200.wan default-route-distance=20
/ip firewall nat
...
add chain=srcnat out-interface=bridge.200.wan action=masquerade


Even on the routers, "LAN" interfaces differ from "WAN" ones only in how you treat and use them. And the interfaces connected using the interconnect link (WAN_ROUTING_VLAN) even behave both ways - as WANs for outgoing connections from the LAN but at the same time they accept those incoming connections that end up routed out via the primary WAN.
In your instructions you are saying that I physically plug the cable coming from the TP-Link into an ether access port of the router (etherXY) and use dhcp-client.
What I would like to do is plug it into a switch (not into one of the two routers since, having chosen a VRRP config, I assume that one of the 2 could be off; it may be the same switch into which one of the routers is plugged with its sfp-sfpplus1 trunk), and for each router I then need to define WAN3 not on a physical port but on the VLAN interface, with a fixed IP.

So, on the bridge:
/interface bridge vlan
add vlan-ids=200 bridge=bridge tagged=bridge,etherXY
/interface vlan
add vlan-id=200 interface=bridge name=bridge.200.wan

and on each router (the only point of difference is the ip address 192.168.3.x):
- /interface vlan add vlan-id=200 interface=bridge name=bridge.200.wan
- /interface bridge vlan add vlan-ids=200 bridge=bridge tagged=bridge
- /ip address add address=192.168.3.x interface=bridge.200.wan network=192.168.3.1
- /ip firewall nat add chain=srcnat out-interface=bridge.200.wan action=src-nat ipsec-policy=out,none to-addresses=192.168.3.1
- the routes will be the same I've already defined for ether2 (ISP3 distance=15), we agreed upon yesterday

But, I feel I'm missing something ...
 
sindy
Forum Guru
Forum Guru
Posts: 10205
Joined: Mon Dec 04, 2017 9:19 pm

Re: VRRP and ISP Failover

Thu May 11, 2023 10:23 pm

In your instructions you are saying that I plug physically the cable coming from tp-link into an ether access port of the router (etherXY)
Nope, I'm not, read the instructions again. In my config excerpt, etherXY is the interface of the router you use to physically connect the router to the switch, acting as a trunk port of the bridge on the router. I did not provide a config excerpt from the switch because it is really simple and because I didn't even know whether it was a Mikrotik one, so I just gave a non-formalized text instruction on how to set it up.

having chosen a VRRP config, I assume that one of the 2 could be off
I can't see why you should run VRRP atop the LTE WAN interfaces on the routers. Just let each Mikrotik get its own IP address from the TP-Link LTE. Or give them static addresses from TP-Link's LAN subnet, whatever you prefer.

So, on the bridge:
/interface bridge vlan
add vlan-ids=200 bridge=bridge tagged=bridge,etherXY
/interface vlan
add vlan-id=200 interface=bridge name=bridge.200.wan
You do not need any /interface vlan for vlan-id=200 on the switch - the switch itself doesn't need to access vlan200, it is enough that it forwards it between its ports. But you do need to tell the switch that the port to which the TP-Link is connected is an access port to VLAN 200, so
/interface bridge vlan
add vlan-ids=200 bridge=bridge tagged=ether-to-R1,ether-to-R2
/interface bridge port
add bridge=bridge interface=ether-to-TP-Link pvid=200

if it is a Mikrotik switch.
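
If the access port should also reject tagged frames from the untrusted side, the same switch configuration could be extended along these lines. This is a sketch only: the interface names are the placeholders used above, and enabling vlan-filtering on a bridge whose VLAN table is incomplete can cut off management access, so it is best done with Safe Mode on.

```routeros
/interface bridge port
# Access port towards the TP-Link: untagged frames land in VLAN 200,
# tagged frames are dropped
add bridge=bridge interface=ether-to-TP-Link pvid=200 frame-types=admit-only-untagged-and-priority-tagged ingress-filtering=yes

/interface bridge vlan
# VLAN 200 leaves tagged on the trunks towards the routers
add bridge=bridge vlan-ids=200 tagged=ether-to-R1,ether-to-R2

/interface bridge
# The VLAN table is only enforced once filtering is enabled on the bridge
set bridge vlan-filtering=yes
```

If the port is already a member of the bridge, the first command would be a set on the existing entry rather than an add.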

and on each router (the only point of difference is the ip address 192.168.3.x):
/interface vlan add vlan-id=200 interface=bridge name=bridge.200.wan
/interface bridge vlan add vlan-ids=200 bridge=bridge tagged=bridge
/ip address add address=192.168.3.x interface=bridge.200.wan network=192.168.3.1
/ip firewall nat add chain=srcnat out-interface=bridge.200.wan action=src-nat ipsec-policy=out,none to-addresses=192.168.3.1

- the routes will be the same I've already defined for ether2 (ISP3 distance=15), we agreed upon yesterday
On the routers, you must add the physical interface that connects each router to the switch to the tagged list for vlan-ids=200 under /interface bridge vlan. Setting the IP address as a /32 one on an L2 interface is possible but unusual. And the to-addresses in the srcnat rule must be 192.168.3.x, not 192.168.3.1.
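
Putting these corrections together, the router side might look roughly like the sketch below. The values are assumptions for illustration: sfp-sfpplus1 stands for the trunk port towards the switch, and 192.168.3.2/24 for the static address chosen for this router in the TP-Link LAN subnet (the other router would use a different 192.168.3.x).

```routeros
/interface bridge vlan
# VLAN 200 must be tagged both on the bridge itself and on the trunk port
add bridge=bridge vlan-ids=200 tagged=bridge,sfp-sfpplus1
/interface vlan
add name=bridge.200.wan interface=bridge vlan-id=200
/ip address
# Usual /24 mask instead of a /32 with a manually set network
add address=192.168.3.2/24 interface=bridge.200.wan
/ip firewall nat
# to-addresses is this router's own address in VLAN 200, not the gateway
add chain=srcnat out-interface=bridge.200.wan action=src-nat to-addresses=192.168.3.2 ipsec-policy=out,none
```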
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Fri May 12, 2023 10:53 am

Nope, I'm not, read the instructions again. In my config excerpt, etherXY is the interface of the router you use to physically connect the router to the switch, acting as a trunk port of the bridge on the router. I did not provide a config excerpt from the switch because it is really simple and because I didn't even know whether it was a Mikrotik one, so I just gave a non-formalized text instruction on how to set it up.
I'm sorry, I misunderstood you.


I can't see why you should run VRRP atop the LTE WAN interfaces on the routers. Just let each Mikrotik get its own IP address from the TP-Link LTE. Or give them static addresses from TP-Link's LAN subnet, whatever you prefer.
No, no, I was not clear here. I won't run VRRP on the LTE WAN; that makes no sense. I just meant that I will connect the LTE LAN cable to a switch rather than to a router, that's it.


You do not need any /interface vlan for vlan-id=200 on the switch - the switch itself doesn't need to access vlan200, it is enough that it forwards it between its ports. But you do need to tell the switch that the port to which the TP-Link is connected is an access port to VLAN 200, so
/interface bridge vlan
add vlan-ids=200 bridge=bridge tagged=ether-to-R1,ether-to-R2
/interface bridge port
add bridge=bridge interface=ether-to-TP-Link pvid=200

if it is a Mikrotik switch.
Yes, my switches are mostly Mikrotik. Nevertheless, I believe that on the switch connected directly to the LTE router (since I've only 1 cable), I need only 1 Ethernet connection (access port) from the LTE LAN side to create VLAN 200, and on the output side only 1 trunk (that will go to the other switches to transport it), tagged=wan-vlan200 (not tagged=ether-to-R1,ether-to-R2), since on the router side I'll assign static IP addresses 192.168.3.x (one for each router). So I really do not understand the purpose of the trunk tagged=ether-to-R1,ether-to-R2. But maybe I'm missing something, sorry.


On the routers, you must add the physical interface that connects each router to the switch to the tagged list for vlan-ids=200 under /interface bridge vlan. Setting the IP address as a /32 one on an L2 interface is possible but unusual.
Here, you are telling me that it is better to exit from the switch using a physical interface. Apart from it being unusual to set the IP address as a /32 on an L2 interface, do you see issues or side effects with this? My idea is that I could just take WAN3 from the VLAN 200 interface; otherwise, I'd have to create an access port on the same switch to which the router is connected and connect it to ether2 of the router. Nothing complicated (maybe more intuitive, more elegant, clearer and more manageable when debugging issues, whereas the VLAN approach is, to use @anav's words, another 'evil twist'), but I could save ports and cables. Nevertheless, if you tell me that it is better to use a physical interface, I'll follow your experience (mine is zero on this).


And the to-addresses in the srcnat rule must be 192.168.3.x, not 192.168.3.1.
Yes, sorry. You're right.


Tnx.
 
sindy
Forum Guru
Posts: 10205
Joined: Mon Dec 04, 2017 9:19 pm

Re: VRRP and ISP Failover

Fri May 12, 2023 12:18 pm

So I really do not understand the purpose of the trunk tagged=ether-to-R1,ether-to-R2. But maybe I'm missing something, sorry.
It's me who is missing something - namely, the network diagram :) From the previous posts my impression was that the network consisted of two routers connected to a single switch, hence I was assuming you need to make VLAN 200 available on two trunk ports of that switch, one for each router (along with the other VLANs used for the local subnets like IoT, guest etc.). Since there are multiple switches, of course the actual paths from each of the two routers to the access port for VLAN 200 on the switch next to the LTE router may look different.

Here, you are telling me that it is better to exit from the switch using a physical interface.
Something got lost in translation, so I'll try to re-word it. What I am only saying is that the VLAN 200 has to be permitted, in tagged mode, on the physical port of each router that connects that router to its adjacent switch, along with the VLANs carrying the various LAN subnets that you have already permitted on that physical port. I do not see any benefit in using a separate pair of ports to carry only the WAN VLAN between the router and the switch, except maybe bandwidth issues, but that's not relevant here because the bottleneck will be the LTE, not the Ethernet wire.

Apart from it being unusual to set the IP address as a /32 one on an L2 interface, do you see issues or side effects for this?
No. The only IP address in VLAN 200 each router will ever be interested in is 192.168.3.1, and the LTE router will likely be configured with a usual /24 mask on its LAN interface, so it will send ARP requests for 192.168.3.x and 192.168.3.y as needed and the two Mikrotik routers will respond. I just cannot see any benefit in doing things in an unusual way if there is no special reason for that, because it adds headache when coming back to the configuration a few months later and looking for that (nonexistent) special reason :)
 
anav
Forum Guru
Posts: 18959
Joined: Sun Feb 18, 2018 11:28 pm
Location: Nova Scotia, Canada

Re: VRRP and ISP Failover

Fri May 12, 2023 6:52 pm

Hi rikpal.

I have a draft article going for VRRP.
I am now up to the point where I have two different ISPs, ISP1 on R1 and ISP2 on R2, with a backup LTE all configured.
They share one common pair of ports to
a. create the VRRP link ( vlan333 )
b. create the cross WAN traffic links ( vlan11 and vlan12 ).

My question now: is it worth exploring the next step, namely TWO VRRPs?
Why did you pursue this added complexity? It makes no sense to me and I need some direction to continue.

What does make sense is that you actually want to use ISP2 (or ISP1 depending on which is master) at ALL times!

1. Want LAN subnet A to use ISP1 and LAN subnet B to use ISP2?
2. Want all LANs to PCC ISP1 and ISP2 traffic?
3. Want specific user(s)/device(s) to use a specific ISP?
4. Have incoming external users coming to a server and they need to come in on a specific ISP?

A complete discussion of the requirements will assist.
It may very well be that there are better ways to achieve some of the above WITHOUT using dual VRRP.
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Fri May 12, 2023 7:31 pm

It's me who is missing something - namely, the network diagram :) From the previous posts my impression was that the network consisted of two routers connected to a single switch, hence I was assuming you need to make VLAN 200 available on two trunk ports of that switch, one for each router (along with the other VLANs used for the local subnets like IoT, guest etc.). Since there are multiple switches, of course the actual paths from each of the two routers to the access port for VLAN 200 on the switch next to the LTE router may look different.
You're right Sindy. I posted a schematic showing both routers on the same switch, but that is the current setup, which is still under development and test, and in it I can wire the wan3 cables straight to the routers without problems (the schematic was requested by anav, and we are lucky it is very simple, as that helped to debug the issue I was having with the routing and invalid packets dropped ...). Later on, in production, the routers will be in two different rooms and one of them will not be reachable by the wan3 ether cable, so that's why I need to virtualise it through VLANs. Of course, in this situation, the routers will be wired to different switches.


Something got lost in translation, so I'll try to re-word it. What I am only saying is that the VLAN 200 has to be permitted, in tagged mode, on the physical port of each router that connects that router to its adjacent switch, along with the VLANs carrying the various LAN subnets that you have already permitted on that physical port. I do not see any benefit in using a separate pair of ports to carry only the WAN VLAN between the router and the switch, except maybe bandwidth issues, but that's not relevant here because the bottleneck will be the LTE, not the Ethernet wire.
OK, I see. It's my fault, your English is perfect. Mine is not. ;-)


No. The only IP address in VLAN 200 each router will ever be interested in is 192.168.3.1, and the LTE router will likely be configured with a usual /24 mask on its LAN interface, so it will send ARP requests for 192.168.3.x and 192.168.3.y as needed and the two Mikrotik routers will respond. I just cannot see any benefit in doing things in an unusual way if there is no special reason for that, because it adds headache when coming back to the configuration a few months later and looking for that (nonexistent) special reason :)
I agree with you. Evil twisting will overcomplicate things. If tomorrow I need to come back and/or change config, I will unplug and re-plug cables. Hidden configurations will give unnecessary headaches.

I hope this weekend I'll be able to have time to put my hands on the routers to move ahead.
Thank you very much for your help.
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Fri May 12, 2023 10:06 pm

Hi rikpal.

I have a draft article going for VRRP.
I am now up to the point where I have two different ISPs, ISP1 on R1 and ISP2 on R2, with a backup LTE all configured.
They share one common pair of ports to
a. create the VRRP link ( vlan333 )
b. create the cross WAN traffic links ( vlan11 and vlan12 ).

My question now: is it worth exploring the next step, namely TWO VRRPs?
Why did you pursue this added complexity? It makes no sense to me and I need some direction to continue.

What does make sense is that you actually want to use ISP2 (or ISP1 depending on which is master) at ALL times!

1. Want LAN subnet A to use ISP1 and LAN subnet B to use ISP2?
2. Want all LANs to PCC ISP1 and ISP2 traffic?
3. Want specific user(s)/device(s) to use a specific ISP?
4. Have incoming external users coming to a server and they need to come in on a specific ISP?

A complete discussion of the requirements will assist.
It may very well be that there are better ways to achieve some of the above WITHOUT using dual VRRP.
ok anav,
let me explain my requirements, in order of priority, that drove me to have two routers in VRRP in load sharing, and see if this makes sense to you.
  • 1. I use VRRP in order to have redundant routers (HW), in case one of them will not be available for whatever reason
  • 2. I have chosen the 'load sharing' configuration vs. the 'basic' one for the following reasons:
    • a. basic setup is a waste of resources (double hw with one router always sleeping) - having 2 routers I prefer use both of them
    • b. possibility to distribute the load between two routers. My choice was by subnet (using vlans)
  • 3. I have the following available connections: 2 ISP fiber lines (both of them @2.5gbps - ISP1, ISP2 PPPoE login) and a 4G/LTE modem (ISP3 static address on modem LAN). What's the best way to use them?
    • a. I connected ISP1 to router1->ether1 and ISP2 to router2->ether1. This is the only way I have, since the RB5009 has only 1 port @2.5Gbps
    • b. ISP3 is connected to ether2 of both routers (192.168.3.2, 192.168.3.3)
    • c. order of priority: for router1 clients = ISP1->ISP2->ISP3; for router2 clients = ISP2->ISP1->ISP3
    • d. for failover I've only 2 choices: #1 routing (preferred one) - #2 Netwatch + scripting (playing with VRRP priorities to 'shutdown' the router that lost ISP connection)

As you may know, I preferred d1 (failover routes) and it seems to be working now. In the next days I just have to complete the configuration, fine-tune and test it, then I'll post it.
With regard to option d2 (Netwatch + scripting), I drafted one config on paper (that I believe should most probably work - it is definitely an easier way vs. the routing one), but I'll implement and test it later on (just to have an alternative, in case the routes highlight any issue in production, but I don't think so). In any case, for the benefit and curiosity of experimenting with it in my lab, I'll do this test.

PCC with 2 RB5009 and 2 ISPs @2.5Gbps is not a workable option. With a different router that has more 2.5Gbps ports, it could be a good option. I also have a CCR2216, and at a later stage I'll test PCC too (but this is a different story ...).

Another benefit for me of using 2 different routers (apart from having HW redundancy) is that I can place each of them closer to the points where the 2 GPONs enter my house, make the PPPoE login there and use the current CAT7e cables crossing my house from one side to the other as LAN cables (10Gbps throughput) instead of WAN, and this is a huge benefit for me. But the downside of having the 2 routers far away from each other is that I need to virtualise (using VLAN) the ISP3 WAN ether cable to reach both routers (but here the current throughput is no more than 100-200Mbps, also considering back/forth traffic). This is what you call another 'evil twist', which is an expression I love ;-).

With regard to your specific questions:

1. Want LAN subnet A to use ISP1 and LAN subnet B to use ISP2? - currently this is my choice and VRRP load sharing is doing exactly this
2. Want all LANS to PCC ISP1 and ISP2 traffic? - as I explained above PCC is not an option for me (2x ISPs @2.5Gbps and 2x RB5009 with only ether1 @2.5Gbps)
3. Want specific user(s)/device(s) to use a specific ISP? - No need as of today, but I can eventually implement it (you are master on this and you helped a lot of users on this)
4. Have incoming external users coming to a server and they need to come in on a specific ISP? - having 2 fixed IPs, I already use as of today the specific one I need to reach the subnet I need. I split the subnets based on a specific logic I need. Anyway some servers already have connection on multiple subnets from where they can be reached.

Pls let me know if you need more clarifications.

P.S.: as I mentioned above after deploying the 2 RB5009, I'll play with CCR2216 with PCC (more similar to my current config I've with Draytek Vigor 3910, ISP1/ISP2 in load balancing/failover and ISP3 as last resort failover in case both ISP1 and ISP2 will not be available).
 
anav
Forum Guru
Posts: 18959
Joined: Sun Feb 18, 2018 11:28 pm
Location: Nova Scotia, Canada

Re: VRRP and ISP Failover

Sat May 13, 2023 12:26 am

Thanks rikpal, that is close to what I was thinking so all good.
PCC does not require you to have equal WAN connections. You can actually assign a larger share of the connections to the higher-throughput ISP and so, in a way, optimize the utilization of both routers.
PCC introduces mangling which does slow things down a bit.
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Sat May 13, 2023 3:18 pm

@sindy,
morning, I've just finished the following:
1. implement the "properly done" last routes you suggested and they work perfectly
2. I cleaned up several small things in the config
3. I added VLAN100 in the WAN interfaces list and this allowed me to simplify and optimise the firewall filtering rules too

Then after an upgrade to RoS 7.9 and a backup I moved to the VLAN200 (the ISP3 WAN VLAN) topic.
It was amazingly simple:
1. add VLAN200 to the switch, add it to the trunk port and finally create an access port for the TP-Link LTE LAN cable
2. on each router I just did the following:
  • /interface vlan add interface=bridge name=WAN_ISP3_VLAN vlan-id=200
  • /interface bridge vlan add bridge=bridge comment="..." tagged=bridge,sfp-sfpplus1 vlan-ids=200
  • add VLAN200 to the list of WAN interfaces
  • change the interface from ether2 to WAN_ISP3_VLAN in the IP address and in the srcnat rule
That's all, then after wiring the TP-Link LAN cable to the access port on the switch I was able on both routers to use instantly the ISP3 internet connection.
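
Put together, the per-router steps above could look roughly like this sketch. The two `add` lines for the VLAN and the bridge VLAN table are the ones listed above; the interface list name WAN, the /24 mask and the address 192.168.3.2 (192.168.3.3 on the other router) are assumptions for illustration, so adapt them to the real configuration.

```routeros
# VLAN 200 interface on top of the bridge (from the steps above)
/interface vlan
add interface=bridge name=WAN_ISP3_VLAN vlan-id=200
# permit VLAN 200, tagged, on the bridge itself and on the trunk port
/interface bridge vlan
add bridge=bridge tagged=bridge,sfp-sfpplus1 vlan-ids=200
# assumption: the WAN interface list is named "WAN"
/interface list member
add list=WAN interface=WAN_ISP3_VLAN
# assumption: usual /24 mask; use 192.168.3.3 on the second router
/ip address
add address=192.168.3.2/24 interface=WAN_ISP3_VLAN
# src-nat to this router's own address in VLAN 200
/ip firewall nat
add chain=srcnat out-interface=WAN_ISP3_VLAN action=src-nat to-addresses=192.168.3.2
```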

The only thing on which I still have to work is the fact that as of today the TP-Link is connected to the Draytek 3910 with the address 192.168.3.2 in DMZ (only 1 single address is possible in DMZ). So now I'll need to have more destination addresses in DMZ (at least .3 and .4 for now).

So now, from wherever I want in my LAN, I can directly use the LTE connection (including adding Wifi APs) simply by creating an access port to get the connection (of course, I'll add VLAN200 to the trunk ports on all my switches).

Thank you very much again, I'm really very grateful to you for your support.
The same warm thanks to @Amm0 and @anav for their help, support and patience.
 
sindy
Forum Guru
Posts: 10205
Joined: Mon Dec 04, 2017 9:19 pm

Re: VRRP and ISP Failover

Sun May 14, 2023 11:42 am

the TP-Link is connected on the Draytek 3910 with the address 192.168.3.2 in DMZ (only 1 single address is possible in DMZ). So now I'll need to have more destination addresses in DMZ (at least .3 and .4 for now).
I am not sure I understand the issue. A "DMZ" is usually a colloquial term for a 1:1 dst-nat of requests incoming to the WAN address of the firewall (in this case, the TP-link) to a single particular IP address in LAN no matter what the destination port is, and it only makes sense if the WAN IP is a public one reachable from the internet (leaving aside exotic scenarios where NAT is used in private networks).

So unless your mobile ISP assigns a public IP to your SIM, you don't need to care about DMZ at all, as no incoming connections from the internet can ever reach the modem, i.e. all the connections will be initiated from the LAN side of your TP-link device. And unless DMZ also means a 1:1 src-nat in the TP-link world, it has no effect on connections initiated from the LAN side.

But if you do have a public IP assigned to your SIM, that's a different story, and there are two ways to deal with that - either you need to set up individual port-forwardings on the TP-link to allow the client in the internet to choose to which of the two Mikrotik routers it wishes to connect (which is quite OK for incoming VPN connections as they are terminated on the routers themselves), or you need to use VRRP also for VLAN 200, so that incoming connections from the internet that need to be port-forwarded further to servers in the LAN could be port-forwarded from the virtual address, that would be configured as the DMZ destination on the TP-link.
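
To illustrate the second option, VRRP on VLAN 200 could look roughly like the sketch below. The interface name WAN_ISP3_VLAN comes from the earlier post; the VRRP interface name, vrid=200, the priorities and the virtual address 192.168.3.254 are assumptions, not values taken from the actual configuration.

```routeros
# Router 1: preferred holder of the virtual address (higher priority)
/interface vrrp
add name=vrrp-wan3 interface=WAN_ISP3_VLAN vrid=200 priority=200
/ip address
add address=192.168.3.254/32 interface=vrrp-wan3

# Router 2: same vrid and virtual address, lower priority
/interface vrrp
add name=vrrp-wan3 interface=WAN_ISP3_VLAN vrid=200 priority=100
/ip address
add address=192.168.3.254/32 interface=vrrp-wan3
```

Each router keeps its own address (192.168.3.x) on WAN_ISP3_VLAN, and the DMZ destination on the TP-link would then be the virtual address rather than the address of either router.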
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Sun May 14, 2023 8:58 pm

@sindy,
My SIM has a public IP address.
The current TP-Link allows to assign only one LAN IP address in DMZ and I use 192.168.3.2 for the Draytek 3910, in order to bypass the TP-Link Firewall and use the one in the 3910.
Replacing the 3910 with the 2 Mikrotik RB5009, I'll have to go through the TP-Link FW and then the 2 MTs FWs, that's my issue.
And anyway, the TP-Link FW is quite basic and not so much flexible. I trust more the Mikrotik one.
I'm really thinking to replace the TP-Link modem (also for other more valid reasons), with a mikrotik 4G LTE router.
In this case I'll simply define standard firewall filtering rules in the fwd chain, like the following ones and I'll have for sure a better control over the LTE router and a better understanding about what should be the expected behaviour.
/ip firewall filter
add action=accept chain=forward connection-state=established,related,untracked
add action=drop chain=forward connection-state=invalid
add action=accept chain=forward in-interface=LAN out-interface=WAN
add action=accept chain=forward connection-nat-state=dstnat
add action=drop chain=forward
Then the src-nat rule.
And finally, I'll define my dstnat fwd rules, just the very few that are really necessary, since this is a backup line that will be used only when ISP1/ISP2 are down.
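
As a rough sketch of those two NAT pieces on a Mikrotik LTE router: the interface name lte1, the port 3333 and the target address 192.168.3.2 are only assumptions for illustration, and the real rules depend on which services have to stay reachable over the backup line.

```routeros
/ip firewall nat
# src-nat for connections initiated from the LAN side (assumed LTE interface name: lte1)
add chain=srcnat out-interface=lte1 action=masquerade
# per-port forwarding: TCP 3333 arriving on the LTE side goes to one of the RB5009s
add chain=dstnat in-interface=lte1 protocol=tcp dst-port=3333 action=dst-nat to-addresses=192.168.3.2
```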

With regard to the point of accessing from outside router 1 or router 2, based on the servers running on one of the two, I hope I shouldn't have such issues, since I expect to access from outside the same services (including vpn) as I'm doing today having only 1 router (draytek 3910). What I have noticed about VRRP today, is that at least from LAN side, the way in which I've configured the virtual GWs addresses for the 4 vlans, they are exactly the same I'm currently running on the 3910, so for me it is totally transparent where are running the active VLANs (router 1 or router 2), so I would expect it will be the same also when I'll access from outside (and firewall filtering rules are exactly the same for both routers).

Interesting your point about having VLAN 200 in VRRP, but I've to better understand this, maybe I'll have a problem (with incoming connections) if both ISP1/ISP2 are down and I've ISP3 serving both routers at the same time. e.g. if I've an external WAN request coming to a specific port (e.g. 3333), that I need to port forward to a server running on 192.168.1.20 (VLAN1 running on router1), I believe it will be reached without problems, but i could be wrong. On this regard, as of today I've noticed that if I connect a PC to a router2 on an access port for VLAN1 (running with VRRP on router1), it works without problems. But maybe I'm underestimating or not properly understanding what will happen when I will access a service from WAN side. Maybe it's the firewall on the 4G LTE modem that has to forward the request to the proper router based on where a service is running or it should be the VLAN 200 VRRP'ed to do this?

One more question, since you mentioned 'the connections that will be initiated from the LAN side'. The Draytek router manages a special port forward feature that is named 'Port Triggering'. That means I do not have to define in advance which ports I have to forward (also because I may not know them) to a specific internal NAT address; instead, I define a port triggering rule, based on which, when a specific NAT address initiates a connection (e.g. 192.168.1.34 TCP port 443), the firewall flexibly manages all the necessary port forwarding for as long as the conversation is active. E.g. currently I have an Ariston boiler that uses the 'port triggering' feature defined in the 3910 firewall. I'm sure that TP-Link does not support this, but I believe that Mikrotik, even if it doesn't have a feature named 'port triggering', can manage the connections that are initiated from the LAN side in a similar way. Am I wrong saying this?

Thank you.
 
sindy
Forum Guru
Forum Guru
Posts: 10205
Joined: Mon Dec 04, 2017 9:19 pm

Re: VRRP and ISP Failover

Sun May 14, 2023 10:34 pm

My SIM has a public IP address.
OK, that explains the need for DMZ.

And anyway, the TP-Link FW is quite basic and not so much flexible. I trust more the Mikrotik one.
...
I'm really thinking to replace the TP-Link modem (also for other more valid reasons), with a mikrotik 4G LTE router.
If the only purpose of the Mikrotik LTE router was to forward the incoming connections landing at the LTE router's WAN further to the "big" Mikrotiks, there would be no point in such replacement - the right way is indeed to use VRRP on VLAN 200 and let the TP-link forward these incoming connections to the virtual address.

What is way more important here is that as soon as you expect connections initiated from the internet and forwarded to the same server in the LAN of the Mikrotik routers to arrive to more than one WAN, you have no choice but to use connection marking as you need to make sure that the "big" Mikrotik will send the responses of that server via the same WAN through which their corresponding client requests have come. When the big Mikrotik itself is the server, it is enough to use routing rules that match on src-address to choose the proper WAN, but if requests that arrive to WAN 1 and requests that arrive to WAN 3 are both forwarded to the same address in one of your LANs, you have to "note down" the in-interface of the initial request of each such session and use that note to route the responses within that session via the same WAN.
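
A minimal sketch of that "note down and reuse" idea in RouterOS 7 syntax follows. The routing table name, the connection mark name and the LAN interface list are assumptions; WAN_ISP3_VLAN, 192.168.3.1 and 192.168.3.2 are taken from earlier in this thread. If fasttrack is in use, the marked connections have to be excluded from it, otherwise the mangle rules are bypassed.

```routeros
# dedicated routing table whose default route points to WAN 3
/routing table
add name=via-wan3 fib
/ip route
add dst-address=0.0.0.0/0 gateway=192.168.3.1 routing-table=via-wan3

/ip firewall mangle
# note down the WAN through which a new connection has arrived
add chain=prerouting in-interface=WAN_ISP3_VLAN connection-state=new action=mark-connection new-connection-mark=in-wan3 passthrough=yes
# responses of the LAN server within that connection leave via the same WAN
add chain=prerouting in-interface-list=LAN connection-mark=in-wan3 action=mark-routing new-routing-mark=via-wan3 passthrough=no

# when the router itself is the server, matching on src-address is enough
/routing rule
add src-address=192.168.3.2/32 action=lookup table=via-wan3
```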

And this does not play well with the VRRP-based load distribution on the LAN side, where you use two VRRP addresses in a LAN and let each of two groups of LAN hosts use a different one of them as its default gateway. If the WAN 3 VRRP address is active on Router A, that router sends the forwarded packet to a server in the LAN and the tracked connection is created on that router. But if the default gateway of that server in the LAN is the LAN VRRP address that prefers router B, that server will send the response through router B, so the response will get lost (router B's firewall will drop a SYN,ACK packet for which it hasn't seen a corresponding SYN one first, and even if it didn't, it would route it via a wrong WAN as it would lack the note made by router A).

To avoid this, you have to create a third VRRP address in the LAN, make it follow the WAN 3 VRRP address, and set that one as the default gateway for all servers in the LAN that are expected to respond to incoming requests from the internet. In your setup, it may not be necessary that this LAN VRRP address follows the WAN 3 VRRP one, as WAN 3 shares the VLAN trunk with the LAN VLANs, so if one of the routers dies or gets disconnected from the rest of the network, both the WAN 3 VRRP address and this "LAN VRRP address for servers reachable from the internet" will move to the remaining router even without any scripting that would otherwise be required to make the latter follow the former.
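
Such a third VRRP address could be sketched as below. The names VLAN1 and vrrp-vlan1-servers, vrid=13 and the virtual address 192.168.1.252 are hypothetical; the only point is that the priorities are ordered the same way as on the WAN 3 VRRP interface, so that both virtual addresses normally sit on the same router.

```routeros
# Router A (the one that normally holds the WAN 3 VRRP address)
/interface vrrp
add name=vrrp-vlan1-servers interface=VLAN1 vrid=13 priority=200
/ip address
add address=192.168.1.252/32 interface=vrrp-vlan1-servers

# Router B: same vrid and address, lower priority
/interface vrrp
add name=vrrp-vlan1-servers interface=VLAN1 vrid=13 priority=100
/ip address
add address=192.168.1.252/32 interface=vrrp-vlan1-servers
```

The servers reachable from the internet would then use 192.168.1.252 as their default gateway.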

Alternatively, you would have to forget the idea of DMZ (port-agnostic dst-nat) and choose the WAN 3 address of the Mikrotik router depending on what is the preferred ISP of the server to which the incoming request is finally forwarded.

RouterOS does support synchronization of connection tracking data from the router where a particular VRRP interface is active to the one where that same VRRP runs in backup, but it seems it only works for a single VRRP interface and in one direction, so it is not compatible with the load distribution scheme where in the "all good" state each of the two VRRP addresses in the same LAN is up on another router.

The Draytek router manages a special port forward feature that is named 'Port Triggering'
Port triggering is at best a weird and unreliable bandaid for old services that were designed before NAT came into wide use. The only advantage as compared to static port forwarding is that it can make a kind of time-based multiplexing of the same port, but if two devices on LAN need to connect to the same service at the same time and that service needs the same port on the same WAN IP address to be forwarded to both of them, it won't work anyway as the last LAN device to connect to the server will always redirect that port to its own address. All the cloud based services use a single TCP connection initiated from the LAN side so even if tens of devices connect to the same server, a normal src-nat handles that and no exotic port triggering is necessary.
 
rikpal
Frequent Visitor
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Mon May 15, 2023 8:44 am

@sindy,
I'm a little bit confused about how incoming requests should work.
Let me explain which type of services I'm currently exposing to internet:
1. wireguard on draytek - I currently have only one on draytek router, replacing it with 2xRB5009, I'll set WG on both routers, in order to access a WG tunnel from ISP1 (router1) or ISP2 (router2) indifferently
2. WG on a server running on VLAN1 (router1) (currently this is an alternative to WG running on draytek mentioned in bullet 1)
3. TCP server on specific port (e.g. 3333) through cloudflare proxy servers on VLAN1 (router1)
4. TCP server reachable on specific port (e.g. 2222) on VLAN2 (router2)
5. UDP VPN servers (NAS) running on VLAN1 (router1) and VLAN2 (router2)

Now let me make some use cases to understand if I could have problems. As we discussed before, ISP1 is on router1 and ISP2 is on router2:
1. I access from outside using ISP1 a service provided by VLAN1 (router1) - I believe here no problems (the same if ISP2->router2)
2. I access from outside using ISP2 a service provided by VLAN1 (router1) - will I have problems in this case? I expect routing will work ok and no headaches (it could be also vice versa, ISP1->router2)

With regard to ISP3, assuming to replace the 4G/LTE modem with a mikrotik one, I would like to address the incoming requests to the specific internal servers IP addresses using dstnat rules to the proper router1/2 (so I forget about DMZ). In this case, the third use case will be:

3. I access a service from outside using ISP3 and the service could be on VLAN1 (router1) or VLAN2 (router2) - I'll use specific dstnat rules in the Mikrotik LTE router, pointing to the right server.

I'm not sure if in this case, I'll have to run VLAN 200 on VRRP, or it is not necessary.
Anyway, here I would like to choose the simplest way that will allow me to avoid complications and connection marking, whenever possible.

All the above assumes that I'm using the VRRP virtual GWs, so for example I'll continue to run my services with the same GW I have today on the Draytek (e.g. 192.168.1.253), and I do not know whether VRRP is referring to a physical GW on router1 (192.168.1.1) or on router2 (192.168.1.2). In this way, I'll replace the Draytek with the 2xRB5009 without changing anything in my network definitions.

Finally, with regard to 'port triggering' my understanding is that moving to mikrotik, I do not have to do anything (no need to define any dstnat rule), since if the connection is originated from a nat address, it will be allowed. Am I right?

Thank you.
 
sindy
Forum Guru
Forum Guru
Posts: 10205
Joined: Mon Dec 04, 2017 9:19 pm

Re: VRRP and ISP Failover

Mon May 15, 2023 9:58 am

3. TCP server on specific port (e.g. 3333) through cloudflare proxy servers on VLAN1 (router1)
4. TCP server reachable on specific port (e.g. 2222) on VLAN2 (router2)
I am confused. Your configuration shows that you have two VRRP interfaces in each LAN-side VLAN, and that the priorities of these two VRRP interfaces are set in such a way that when both routers are alive, each of these two VRRP interfaces is active on another router. The only purpose of such a setup is to have a possibility to make each host in the same VLAN prefer either ISP 1 or ISP 2 by setting one or the other VRRP address as the default gateway in the configuration of that host. But what you wrote now looks as if you actually wanted to place hosts that should prefer ISP 1 into a different VLAN (and subnet) than those that should prefer ISP 2. Both approaches are possible, both have their pros and cons, but you cannot mix them together. So what is the actual intention?

2. I access from outside using ISP2 a service provided by VLAN1 (router1) - will I have problems in this case? I expect routing will work ok and no headaches (it could be also vice versa, ISP1->router2)
If a server in LAN uses an IP address that is up on router 2 (no matter whether it is a "physical" one or a VRRP-controlled one) as a default gateway, it sends all packets via router 2. So if such packet is a response to a request packet that came to the server via router 1, you have the problem with firewall (non-symmetric routing) because router 2 has never seen the request packet so it will mishandle the response (drop it if it is a TCP packet, and treat it as an initial request of a new connection if it is a UDP packet).

For connections to servers in the internet initiated by clients in LAN, the src-nat on WAN side makes sure that the response of the server will always come to the correct router, because the server gets the WAN IP address of that router as the source address of the request, so it sends the response to that address. But a server in LAN will see the public address of the client in the internet as the source of the request/destination of the response, and this address carries no information about the local router it has arrived through. To make sure that a LAN server would respond via the same router through which the request has come to it, the router forwarding the request would have to src-nat the request to its own LAN address, so that the server would see it as coming from the router itself and would thus send the response to that address - if you send to any address in your own subnet, you don't need a gateway. Doing this would make sure that the responses always go through the same router like the requests, but the servers would lose information about the actual address of the client in the internet - this may or may not be a problem depending on your use case.
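
As a sketch of that src-nat on router 1: the server 192.168.1.20 and port 3333 are the example values mentioned earlier in the thread, 192.168.1.1 is router 1's own address in that VLAN, and the interface name VLAN1 plus the 192.168.0.0/16 exclusion (so that local clients are not affected) are assumptions.

```routeros
/ip firewall nat
# requests from the internet forwarded to the server leave router 1 with
# router 1's own LAN address as source, so the server answers to router 1
add chain=srcnat out-interface=VLAN1 src-address=!192.168.0.0/16 dst-address=192.168.1.20 protocol=tcp dst-port=3333 action=src-nat to-addresses=192.168.1.1
```

With this rule in place, the server logs show 192.168.1.1 as the client for all such requests, which is the loss of information mentioned above.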

With regard to ISP3, assuming to replace the 4G/LTE modem with a mikrotik one, I would like to address the incoming requests to the specific internal servers IP addresses using dstnat rules to the proper router1/2 (so I forget about DMZ).
If you add a LTE-capable Mikrotik into the mix, you can simplify the overall setup a lot by removing the interconnections allowing Router 1 to use ISP 2 via Router 2 and vice versa. Instead, you can make all 3 routers use only their own ISP, and do all the traffic distribution by means of the VRRP addresses - you would use one VRRP address in each LAN VLAN to represent a distinct ISP. So as long as everything works, each host in the LAN sends its traffic through one of the three VRRP addresses in that LAN, which is up on the router that is a gateway to that ISP. If an ISP becomes unreachable, the gateway router to that ISP must lower the priority of the corresponding VRRP interfaces so that they would go to backup mode and the traffic from hosts that prefer that ISP would start flowing through another router. I don't know any way how to do this without scripting, though.
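
One possible shape of such scripting, using Netwatch on the router that is the gateway to ISP 1, is sketched below. The VRRP interface name, the vrid, the comment used to find the interfaces, the probe host 1.1.1.1 and the priority values are all assumptions; the other routers would need a priority between the two values (e.g. 100) on the same vrid, and the probe host has to be reachable exclusively through this router's own ISP for the check to mean anything.

```routeros
# VRRP interface representing "ISP 1" in this VLAN, tagged by a comment
/interface vrrp
add name=vrrp-vlan1-isp1 interface=VLAN1 vrid=11 priority=200 comment=ISP1

# lower the priority while the ISP is unreachable, restore it afterwards
/tool netwatch
add host=1.1.1.1 interval=10s down-script="/interface vrrp set [find comment=ISP1] priority=50" up-script="/interface vrrp set [find comment=ISP1] priority=200"
```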

3. I access a service from outside using ISP3 and the service could be on VLAN1 (router1) or VLAN2 (router2) - I'll use specific dstnat rules in the Mikrotik LTE router, pointing to the right server.
That's what I have suggested above - if you know through which router the server in LAN will respond, you can use per-port dst-nat rules (aka port forwarding) rather than a port-agnostic dst-nat (aka DMZ).

All the above assumes that I'm using the VRRP virtual GWs, so for example I'll continue to run my services with the same GW I have today on the Draytek (e.g. 192.168.1.253), and I do not know whether VRRP is referring to a physical GW on router1 (192.168.1.1) or on router2 (192.168.1.2). In this way, I'll replace the Draytek with the 2xRB5009 without changing anything in my network definitions.
You can indeed use the current address of the Draytek as one of the VRRP addresses, but for connections initiated from the outside, you'll still have the issue described above - how to make the response pass the same router(s) like the request.

Finally, with regard to 'port triggering' my understanding is that moving to mikrotik, I do not have to do anything (no need to define any dstnat rule), since if the connection is originated from a nat address, it will be allowed. Am I right?
There is no functional equivalent of Draytek's port triggering in RouterOS. You can use an address-list to ensure that the port-forwarding on the controlled port is only enabled for some time after some communication on the controlling one took place (which is what Draytek presents as a security improvement, but it's disputable), but there is no way to change the destination address in the port-forwarding rule except scripting, which may not be responsive enough.
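
The address-list approach mentioned above could look roughly like this. It is only a sketch under assumptions: TCP port 5000 as the controlling port, TCP port 6000 as the controlled port, 192.168.1.20 as the LAN device, and the list name "triggered" are all invented for illustration; pppoe-out is the WAN interface name used elsewhere in this topic.

```
# Hypothetical values: 5000 = controlling port, 6000 = controlled port, 192.168.1.20 = LAN device.
# Step 1: remember, for 5 minutes, every remote host the device contacts on the controlling port.
/ip firewall mangle
add chain=prerouting action=add-dst-to-address-list address-list=triggered \
    address-list-timeout=5m src-address=192.168.1.20 protocol=tcp dst-port=5000 \
    comment="record remote hosts contacted on the controlling port"
# Step 2: forward the controlled port only for remote hosts currently on that list.
/ip firewall nat
add chain=dstnat action=dst-nat to-addresses=192.168.1.20 \
    in-interface=pppoe-out protocol=tcp dst-port=6000 src-address-list=triggered \
    comment="port forward enabled only after recent outbound contact"
```

The destination of the forward stays fixed at one LAN device here, which is exactly the limitation described above.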

I cannot answer the question whether you have to do anything as I don't know whether you actually need a dst-nat rule for the Ariston boiler or not. It just feels that NAT has been hanging around for 20+ years before boilers started having IP addresses, so there should be no need for the "port triggering". But experience suggests to never say something is impossible.


A point we haven't mentioned throughout the whole conversation is whether you need hosts in different LAN VLANs to talk to each other and if yes, whether you want the firewall to control that communication (i.e. allowing your PC to initiate a connection to the boiler but not allowing the boiler to initiate a connection to the PC) as that puts additional requirements on the configuration.
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Mon May 15, 2023 3:43 pm

I am confused. Your configuration shows that you have two VRRP interfaces in each LAN-side VLAN, and that the priorities of these two VRRP interfaces are set in such a way that when both routers are alive, each of these two VRRP interfaces is active on another router. The only purpose of such a setup is to have a possibility to make each host in the same VLAN prefer either ISP 1 or ISP 2 by setting one or the other VRRP address as the default gateway in the configuration of that host. But what you wrote now looks as if you actually wanted to place hosts that should prefer ISP 1 into a different VLAN (and subnet) than those that should prefer ISP 2. Both approaches are possible, both have their pros and cons, but you cannot mix them together. So what is the actual intention?
VRRP configuration is done in a way that 2 VLANs run completely on router1 and 2 VLANs completely on router2.
Virtual GWs are always the same for each VLAN/subnet and are exactly the same ones I currently have on the Draytek, to avoid changing the client definitions on my network. I don't refer to the backup virtual GWs, to avoid confusion. For the sake of clarity, I just did several tests and when I power off one of the 2 routers all the 4 VLANs move to one single router and the GWs remain the same.


If a server in LAN uses an IP address that is up on router 2 (no matter whether it is a "physical" one or a VRRP-controlled one) as a default gateway, it sends all packets via router 2. So if such packet is a response to a request packet that came to the server via router 1, you have the problem with firewall (non-symmetric routing) because router 2 has never seen the request packet so it will mishandle the response (drop it if it is a TCP packet, and treat it as an initial request of a new connection if it is a UDP packet).
How can I mark connections initiated from ISP1 or ISP2?
I believe I should just mark them and define new routing tables and FIB, but not use any mangling ...


If you add a LTE-capable Mikrotik into the mix, you can simplify the overall setup a lot by removing the interconnections allowing Router 1 to use ISP 2 via Router 2 and vice versa. Instead, you can make all 3 routers use only their own ISP, and do all the traffic distribution by means of the VRRP addresses - you would use one VRRP address in each LAN VLAN to represent a distinct ISP. So as long as everything works, each host in the LAN sends its traffic through one of the three VRRP addresses in that LAN, which is up on the router that is a gateway to that ISP. If an ISP becomes unreachable, the gateway router to that ISP must lower the priority of the corresponding VRRP interfaces so that they would go to backup mode and the traffic from hosts that prefer that ISP would start flowing through another router. I don't know any way how to do this without scripting, though.
Are you saying that it would be better doing the following:
- remove the cross router routes (including VLAN 100): each router uses only ISP1 or ISP2 that is directly connected to its pppoe interface (distance=1) + ISP3 (distance=2)
- use netwatch and define scripts (for up / down) that change the priorities of the VRRP interfaces - in other words, if ISP1 is down everything is managed by ISP2/router2 and vice versa, and in case both ISP1 & ISP2 are down use ISP3

But doing in this way, for connections initiated from the outside, I believe that I will still have the issue described above and I've to mark incoming connections. E.g. in case I'll use ISP2 to access router1. Am I right?


3. I access a service from outside using ISP3 and the service could be on VLAN1 (router1) or VLAN2 (router2) - I'll use specific dstnat rules in the mikrotik LTE router, pointing to the right server.
That's what I have suggested above - if you know through which router the server in LAN will respond, you can use per-port dst-nat rules (aka port forwarding) rather than a port-agnostic dst-nat (aka DMZ).
But should I need anyway to define VRRP on VLAN 200 also if I do use port forwarding knowing on which router the server in lan will respond?
In case I've to use VRRP for VLAN 200, I do not understand how could help having the third (virtual) address for VLAN 200.


I cannot answer the question whether you have to do anything as I don't know whether you actually need a dst-nat rule for the Ariston boiler or not. It just feels that NAT has been hanging around for 20+ years before boilers started having IP addresses, so there should be no need for the "port triggering". But experience suggests to never say something is impossible.
I'll do a test about this, tnx.

A point we haven't mentioned throughout the whole conversation is whether you need hosts in different LAN VLANs to talk to each other and if yes, whether you want the firewall to control that communication (i.e. allowing your PC to initiate a connection to the boiler but not allowing the boiler to initiate a connection to the PC) as that puts additional requirements on the configuration.
If you refer to 'inter-vlan' connections, I want to have all the VLANs completely segregated from each other. So no communications between two VLANs, except in special cases, but I'll manage this if needed (address lists?).
 
sindy
Forum Guru
Posts: 10205
Joined: Mon Dec 04, 2017 9:19 pm

Re: VRRP and ISP Failover

Mon May 15, 2023 4:41 pm

VRRP configuration is done in a way that 2 VLANs run completely on router1 and 2 VLANs completely on router2.
Virtual GWs are always the same for each VLAN/subnet and are exactly the same ones I currently have on the Draytek, to avoid changing the client definitions on my network. I don't refer to the backup virtual GWs, to avoid confusion. For the sake of clarity, I just did several tests and when I power off one of the 2 routers all the 4 VLANs move to one single router and the GWs remain the same.
If so, I don't understand why you have two VRRP interfaces per VLAN per router if you actually use only one of them.

How can I mark connections initiated from ISP1 or ISP2?
I believe I should just mark them and define new routing tables and FIB, but not use any mangling ...
You can only mark connections in mangle, nowhere else. Also translation of connection-mark to routing-mark only happens in mangle. The connection-mark value is not added to the actual packet, it is stored in the connection tracking "database" on the router and assigned to each packet belonging to the connection as it passes through the connection tracking module. So if the response goes via another router than the request, it will not get the connection mark (except if the connection tracking is synchronized between the routers, but that is not compatible with the load distribution mode, even with the one you actually use).

But doing in this way, for connections initiated from the outside, I believe that I will still have the issue described above and I've to mark incoming connections. E.g. in case I'll use ISP2 to access router1. Am I right?
Worse than that, connection marking only works inside the same router. There is no way to add something to the packet being forwarded from WAN to the server that would tell the server "use another routing table to send responses to this packet". The server would have to have multiple VLAN interfaces with individual IP addresses and multiple routing tables so that it would respond via the same VLAN via which the request came in, which may not be possible with the "blackboxes" you have (NAS etc). If you cannot implement this on a server because it can only listen on a single address, you would have to override the use of default gateway by means of src-nat as I've suggested before.

But should I need anyway to define VRRP on VLAN 200 also if I do use port forwarding knowing on which router the server in lan will respond?
In case I've to use VRRP for VLAN 200, I do not understand how could help having the third (virtual) address for VLAN 200.
This would help only if each server in LAN could only be contacted via one WAN. So in your "each server in its own VLAN" approach, this would mean you would have dedicated VLANs for servers that have to be contacted via WAN 3.

So no communications between two VLANs, except in special cases, but I'll manage this if needed (address lists?).
A single special case is enough to require the complete handling of inter-vlan traffic. If you want to use a stateful firewall, you get the same problem with non-symmetric routing like for the WAN<->LAN traffic. Each endpoint device will send the traffic for the other endpoint device through its default gateway; if these default gateways are on different routers, each direction will be handled by another router and connection tracking will get confused.
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Mon May 15, 2023 8:09 pm

If so, I don't understand why you have two VRRP interfaces per VLAN per router if you actually use only one of them.
OK, you're right, let me work on this.


You can only mark connections in mangle, nowhere else. Also translation of connection-mark to routing-mark only happens in mangle. The connection-mark value is not added to the actual packet, it is stored in the connection tracking "database" on the router and assigned to each packet belonging to the connection as it passes through the connection tracking module. So if the response goes via another router than the request, it will not get the connection mark (except if the connection tracking is synchronized between the routers, but that is not compatible with the load distribution mode, even with the one you actually use).
OK, can you help me with this config, pls? I found some partial examples but they are not for RoS7
Moreover, I'll have the problem that If I do this, marking all incoming connections, then I will lose the possibility to have the fast-forward for all what is in input and output from the WAN ...
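
On the fast-forward concern: one commonly used pattern is to restrict FastTrack to connections that carry no connection mark, so that only the marked (incoming) connections take the slow path through mangle. The following is a sketch only, and it assumes the forward chain otherwise follows the usual default layout with a FastTrack rule followed by an accept rule for established/related traffic; it is not taken from the configuration discussed in this topic.

```
# Assumption: these two rules replace the default fasttrack/accept pair in the forward chain.
/ip firewall filter
# FastTrack only connections without a connection mark; marked ones keep passing through mangle.
add chain=forward action=fasttrack-connection connection-state=established,related \
    connection-mark=no-mark comment="fasttrack unmarked connections only"
# Packets that are not fasttracked (including all marked connections) are accepted normally.
add chain=forward action=accept connection-state=established,related \
    comment="accept established/related"
```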


Worse than that, connection marking only works inside the same router. There is no way to add something to the packet being forwarded from WAN to the server that would tell the server "use another routing table to send responses to this packet". The server would have to have multiple VLAN interfaces with individual IP addresses and multiple routing tables so that it would respond via the same VLAN via which the request came in, which may not be possible with the "blackboxes" you have (NAS etc). If you cannot implement this on a server because it can only listen on a single address, you would have to override the use of default gateway by means of src-nat as I've suggested before.
This should not be my case: the most important server, as well as all the NAS units, have network cards that are connected to multiple subnets.
What I can do is customise the dstnat rules in the 2 routers to access the specific address belonging to the VLAN.
But at the end, probably the best way is (I repeat again what I asked before):
- remove the cross router routes (including VLAN 100): each router uses only ISP1 or ISP2 that is directly connected to its pppoe interface (distance=1) + ISP3 (distance=2)
- use netwatch and define scripts (for up / down) that change the priorities of the VRRPs interfaces - in other words, if ISP1 is down everything is managed by ISP2/router2 and vice versa and in case both ISP1 & ISP2 are down use ISP3

In the end, what I understand is that it is crucial that incoming connections do not cross routers: e.g. ISP1->router1 and ISP2->router2.


This would help only if each server in LAN could only be contacted via one WAN. So in your "each server in its own VLAN" approach, this would mean you would have dedicated VLANs for servers that have to be contacted via WAN 3.
I'm sorry, but I do not understand this point. Can you elaborate a little bit more?
What I do not understand is whether WAN 3 (that is, the last backup connection I have in case both ISP1 and ISP2 stop working), which serves both routers, can work as it is today, or whether I should implement VRRP on VLAN 200.


A single special case is enough to require the complete handling of inter-vlan traffic. If you want to use a stateful firewall, you get the same problem with non-symmetric routing like for the WAN<->LAN traffic. Each endpoint device will send the traffic for the other endpoint device through its default gateway; if these default gateways are on different routers, each direction will be handled by another router and connection tracking will get confused.
As of today, I do not need inter-vlan routing.
 
sindy
Forum Guru
Posts: 10205
Joined: Mon Dec 04, 2017 9:19 pm

Re: VRRP and ISP Failover

Mon May 15, 2023 9:23 pm

Rather than answering individual points, let me give you a summary suggestion this time.

First, since all your internal addresses are private ones and since you don't have BGP on your uplinks, no matter how many connection marks you use and how complicated the setup will be, if an ISP uplink, or the router it is connected to, dies, all connections through that uplink/router will die as well and will have to be re-established.

So if you don't need any load distribution except statical placement of the individual LAN hosts to different (V)LANs, which will make each of them always use the same ISP uplink as long as it is available, there is indeed no need for any forwarding path from one router to another - it is enough to have the VRRP interfaces on each router track the availability of the primary uplink of that router and lower their priority if it becomes unavailable. This holds if you don't need any inter-VLAN traffic.

If you accept the above, it is much better to replace the TP-Link with a Mikrotik LTE router and include it into the scheme exactly the same way like the other two, with the lowest priority on all VRRP interfaces. So if Router 1 dies or its uplink becomes unusable, all the traffic that would normally use Router 1 / ISP 1 will start using Router 2 / ISP 2; only if both Router 1 and Router 2 die, all traffic will start using Router 3 / LTE.

In such a setup, you don't need any connection marking, nor a VRRP interface in VLAN 200 - in fact, you don't need the VLAN 200 at all. Nor do you need multiple interfaces on the servers if you don't mind that each server will only be reachable via a single uplink at a time.
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Tue May 16, 2023 7:17 am

Thank you @sindy for your suggestions and your patience.
I'll work on this.
The only point to better think about for me it's the integration in this schema about Router 3 / LTE.
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Mon May 22, 2023 11:41 am

@sindy,
I believe I've reached a good consistency point:

1) I've implemented the 'ISP failover' using netwatch + script to change VRRP priorities. It works OK (I have to ping the public IPs of the 2 lines, which for me are static). This mechanism is an alternative to the failover routes implemented before, and both work with similar speed. I'm not sure which is better, but at this stage I chose to use netwatch and I just disabled the route definitions that are no longer applicable (the ones that use the ISP of the other router).

2) I cleaned up the VRRP definitions, since at the end it is simpler for my needs to use the basic VRRP schema instead of the load sharing, so I removed the VRRP interfaces not used

3) I replaced the 4G / LTE TP-Link router with a Mikrotik ATL LTE18 Kit and I decided to use it with its own firewall, which implements the same filtering rules as the other 2 routers; its LAN side is connected through VLAN200 to both routers (1 & 2). Everything is working OK and it is much more flexible than the previous TP-Link.

4) I implemented the mangling for the incoming connections for both routers on WAN1 (the main ISP) and WAN3 (the backup 4G/LTE), in order to ensure that what comes in from outside on WANx goes out on the same WANx. It seems to work OK. However, unless an internal server is able to respond from more than one VLAN/subnet, when I try to access from ISP1/router1 a server running on a VLAN where the VRRP is master on router2, I do not get any response. So it seems that VRRP inside the LAN is able to direct requests to the right router (in other words, I see all the 4 VLANs as if they were running on the same router), but from outside, unless I point to the right router through its ISPx, the server that is running on the other router is not reachable.

Based on this, I'm not really sure I need to do this mangling ...
And I do not believe I'll be able to solve this issue also going back to the first method (failover on the other router ISPx using routes) ...

For users in the forum that are interested, below some definitions.
Netwatch code:
# ROUTER #01

/tool netwatch
add comment="*** check ISP1 for failover ***" disabled=no down-script="/interf\
    ace vrrp set [find name=vrrp5] priority=150\r\
    \n/interface vrrp set [find name=vrrp10] priority=120\r\
    \n/interface vrrp set [find name=vrrp20] priority=120\r\
    \n/interface vrrp set [find name=vrrp3090] priority=150\r\
    \n" host=<PUBLIC_IP_ISP2> http-codes="" interval=30s name=ISP1_status \
    packet-count=3 startup-delay=1m test-script="" type=icmp up-script="/inter\
    face vrrp set [find name=vrrp5] priority=250\r\
    \n/interface vrrp set [find name=vrrp10] priority=200\r\
    \n/interface vrrp set [find name=vrrp20] priority=200\r\
    \n/interface vrrp set [find name=vrrp3090] priority=250\r\
    \n"

# ROUTER #02

/tool netwatch
add comment="*** check ISP2 for failover ***" disabled=no down-script="/interf\
    ace vrrp set [find name=vrrp5] priority=120\r\
    \n/interface vrrp set [find name=vrrp10] priority=150\r\
    \n/interface vrrp set [find name=vrrp20] priority=150\r\
    \n/interface vrrp set [find name=vrrp3090] priority=120\r\
    \n" host=<PUBLIC_IP_ISP2> http-codes="" interval=30s name=ISP2_status \
    packet-count=3 startup-delay=1m test-script="" type=icmp up-script="/inter\
    face vrrp set [find name=vrrp5] priority=200\r\
    \n/interface vrrp set [find name=vrrp10] priority=250\r\
    \n/interface vrrp set [find name=vrrp20] priority=250\r\
    \n/interface vrrp set [find name=vrrp3090] priority=200\r\
    \n"

IP Routes and incoming connections mangling.
Please note that the route definitions that are marked with '#' are the ones (with distance=10) I've disabled. They were used in the first approach, which managed the ISPx failover by taking the connection from the other router, so I had on router1 ISP1->ISP2->ISP3 and on router2 ISP2->ISP1->ISP3. With the NetWatch method, by contrast, I have only ISP1/2->ISP3: e.g. if ISP1 is down, router2 takes over all the 4 VRRP interfaces as master (and vice versa). So there is no need to cross the connections between the 2 routers as I was doing initially.
# router #01

# YOU HAVE THESE ONLY IF YOU NEED TO MANAGE INCOMING CONNECTIONS incoming WANx -> out WANx
/routing table
add name=ISP1_route fib
# add name=ISP2_route fib
add name=ISP3_route fib

/ip route
add check-gateway=ping dst-address=0.0.0.0/0 gateway=9.9.9.9 scope=13 target-scope=13 distance=5 routing-table=main
# add check-gateway=ping dst-address=0.0.0.0/0 gateway=8.8.8.8 scope=13 target-scope=13 distance=10 routing-table=main
add check-gateway=ping dst-address=0.0.0.0/0 gateway=1.0.0.1 scope=13 target-scope=13 distance=5 routing-table=main
# add check-gateway=ping dst-address=0.0.0.0/0 gateway=208.67.222.220 scope=13 target-scope=13 distance=10 routing-table=main

# YOU HAVE THESE ONLY IF YOU NEED TO MANAGE INCOMING CONNECTIONS incoming WANx -> out WANx
add check-gateway=ping dst-address=0.0.0.0/0 gateway=9.9.9.9 scope=13 target-scope=13 distance=5 routing-table=ISP1_route
# add check-gateway=ping dst-address=0.0.0.0/0 gateway=8.8.8.8 scope=13 target-scope=13 distance=10 routing-table=ISP2_route
add check-gateway=ping dst-address=0.0.0.0/0 gateway=1.0.0.1 scope=13 target-scope=13 distance=5 routing-table=ISP1_route
# add check-gateway=ping dst-address=0.0.0.0/0 gateway=208.67.222.220 scope=13 target-scope=13 distance=10 routing-table=ISP2_route

add dst-address=9.9.9.9/32 gateway=pppoe-out scope=10 target-scope=12 comment="WAN1 ISP1 via PPPoE - ping Host 1" distance=5
# add dst-address=8.8.8.8/32 gateway=172.22.1.2 scope=10 target-scope=12 comment="ISP2 via Backup Router - ping Host 1" distance=10
add dst-address=1.0.0.1/32 gateway=pppoe-out scope=10 target-scope=12 comment="WAN1 ISP1 via PPPoE - ping Host 2" distance=5
# add dst-address=208.67.222.220/32 gateway=172.22.1.2 scope=10 target-scope=12 comment="ISP2 via Backup Router - ping Host 2" distance=10

add dst-address=0.0.0.0/0 gateway=192.168.4.1 scope=10 target-scope=12 comment="ether2 ISP3 via GW 4G/LTE" distance=15 routing-table=main

# YOU HAVE THIS ONLY IF YOU NEED TO MANAGE INCOMING CONNECTIONS incoming WAN3 -> out WAN3
add dst-address=0.0.0.0/0 gateway=192.168.4.1 scope=10 target-scope=12 comment="incoming ISP3 via GW 4G/LTE" distance=15 routing-table=ISP3_route

# YOU HAVE THESE ONLY IF YOU NEED TO MANAGE INCOMING CONNECTIONS incoming WANx -> out WANx
# MANGLING TO MANAGE incoming WANx -> out WANx
/ip firewall mangle
add chain=prerouting in-interface=pppoe-out connection-state=new action=mark-connection new-connection-mark=ISP1_conn
add chain=prerouting in-interface-list=VRRP connection-mark=ISP1_conn action=mark-routing new-routing-mark=ISP1_route
# add chain=prerouting in-interface=WAN_ROUTING_VLAN connection-state=new action=mark-connection new-connection-mark=ISP2_conn
# add chain=prerouting in-interface-list=VRRP connection-mark=ISP2_conn action=mark-routing new-routing-mark=ISP2_route
add chain=prerouting in-interface=WAN_ISP3_VLAN connection-state=new action=mark-connection new-connection-mark=ISP3_conn
add chain=prerouting in-interface-list=VRRP connection-mark=ISP3_conn action=mark-routing new-routing-mark=ISP3_route

# router #02

# YOU HAVE THESE ONLY IF YOU NEED TO MANAGE INCOMING CONNECTIONS incoming WANx -> out WANx
/routing table
# add name=ISP1_route fib
add name=ISP2_route fib
add name=ISP3_route fib

/ip route
add check-gateway=ping dst-address=0.0.0.0/0 gateway=8.8.8.8 scope=13 target-scope=13 distance=5 routing-table=main
# add check-gateway=ping dst-address=0.0.0.0/0 gateway=9.9.9.9 scope=13 target-scope=13 distance=10 routing-table=main
add check-gateway=ping dst-address=0.0.0.0/0 gateway=208.67.222.220 scope=13 target-scope=13 distance=5 routing-table=main
# add check-gateway=ping dst-address=0.0.0.0/0 gateway=1.0.0.1 scope=13 target-scope=13 distance=10 routing-table=main

# YOU HAVE THESE ONLY IF YOU NEED TO MANAGE INCOMING CONNECTIONS incoming WANx -> out WANx
add check-gateway=ping dst-address=0.0.0.0/0 gateway=8.8.8.8 scope=13 target-scope=13 distance=5 routing-table=ISP2_route
# add check-gateway=ping dst-address=0.0.0.0/0 gateway=9.9.9.9 scope=13 target-scope=13 distance=10 routing-table=ISP1_route
add check-gateway=ping dst-address=0.0.0.0/0 gateway=208.67.222.220 scope=13 target-scope=13 distance=5 routing-table=ISP2_route
# add check-gateway=ping dst-address=0.0.0.0/0 gateway=1.0.0.1 scope=13 target-scope=13 distance=10 routing-table=ISP1_route

add dst-address=8.8.8.8/32 gateway=pppoe-out scope=10 target-scope=12 comment="WAN1 ISP2 via PPPoE - ping Host 1" distance=5
# add dst-address=9.9.9.9/32 gateway=172.22.1.1 scope=10 target-scope=12 comment="ISP1 via Backup Router - ping Host 1" distance=10
add dst-address=208.67.222.220/32 gateway=pppoe-out scope=10 target-scope=12 comment="WAN1 ISP2 via PPPoE - ping Host 2" distance=5
# add dst-address=1.0.0.1/32 gateway=172.22.1.1 scope=10 target-scope=12 comment="ISP1 via Backup Router - ping Host 2" distance=10

add dst-address=0.0.0.0/0 gateway=192.168.4.1 scope=10 target-scope=12 comment="ether2 ISP3 via GW 4G/LTE" distance=15 routing-table=main

# YOU HAVE THIS ONLY IF YOU NEED TO MANAGE INCOMING CONNECTIONS incoming WAN3 -> out WAN3
add dst-address=0.0.0.0/0 gateway=192.168.4.1 scope=10 target-scope=12 comment="incoming ISP3 via GW 4G/LTE" distance=15 routing-table=ISP3_route

# YOU HAVE THESE ONLY IF YOU NEED TO MANAGE INCOMING CONNECTIONS incoming WANx -> out WANx
# MANGLING TO MANAGE incoming WANx -> out WANx
/ip firewall mangle
add chain=prerouting in-interface=pppoe-out connection-state=new action=mark-connection new-connection-mark=ISP2_conn
add chain=prerouting in-interface-list=VRRP connection-mark=ISP2_conn action=mark-routing new-routing-mark=ISP2_route
# add chain=prerouting in-interface=WAN_ROUTING_VLAN connection-state=new action=mark-connection new-connection-mark=ISP1_conn
# add chain=prerouting in-interface-list=VRRP connection-mark=ISP1_conn action=mark-routing new-routing-mark=ISP1_route
add chain=prerouting in-interface=WAN_ISP3_VLAN connection-state=new action=mark-connection new-connection-mark=ISP3_conn
add chain=prerouting in-interface-list=VRRP connection-mark=ISP3_conn action=mark-routing new-routing-mark=ISP3_route

In the end, I was able to make all this work, thanks to @sindy, @anav and @Amm0.
 
wiseroute
Member
Posts: 352
Joined: Sun Feb 05, 2023 11:06 am

Re: VRRP and ISP Failover

Mon May 22, 2023 2:04 pm

hello rikpal,

I'm sorry couldn't give you further feedback due to some work.

wow... really nice discussion you have with @ anav @ amm0 and the others 👍🏻

i have just read couple of your last post, and found that you have abandoned your vrrp master-master setup? why?

yes, vrrp is not nat friendly. in case of fail over happens vrrp will break any existing communication sessions.

about the mangle and mark etc. i don't think they are vrrp related.

hmm... i would like to help you with some config examples on some vm, but it's not easy emulating layer 2 setup. it creates loop inside the box.

from your vlan clients view, they only need to know their vrrp gateway, be it master r1, or master r2 for each vlan.

your master -master setup is doable, to gain benefits from those isp links. but you should compensate vrrp for the nat sessions.

as for internet inbound to your network, you can have your dns to have both isp1 isp2 ip to any service you have on the network as usual.

you should choose which ip is the most reliable to have the first option. for example smtp mx 10, mx 20 which 10 is the most reliable ip.

dst nat those ip to your server respectively. no problem except for vrrp taking failover action - it will break existing session, but not for new sessions.

hope this helps.
 
rikpal
Frequent Visitor
Topic Author
Posts: 74
Joined: Tue Mar 07, 2023 2:02 pm
Location: Italy

Re: VRRP and ISP Failover

Mon May 22, 2023 3:51 pm

i have just read couple of your last post, and found that you have abandoned your vrrp master-master setup? why?
Since I do not really need to distribute a single VLAN load on 2 routers, for me it's simpler to use only 1 gateway for my clients.
about the mangle and mark etc. i don't think they are vrrp related.
yes, you're right but I implemented it to be sure the incoming traffic for a WAN will go out for the same WAN, having multiple WANs.
your master -master setup is doable, to gain benefits from those isp links. but you should compensate vrrp for the nat sessions.
as for internet inbound to your network, you can have your dns to have both isp1 isp2 ip to any service you have on the network as usual.
you should choose which ip is the most reliable to have the first option. for example smtp mx 10, mx 20 which 10 is the most reliable ip.
dst nat those ip to your server respectively. no problem except for vrrp taking failover action - it will break existing session, but not for new sessions.
can you better explain this to me? Is there any way for me that I can access from ISP1/router1 servers running on router2? I understand this from your post, but I do not know how it could be feasible (VRRP master-master? DNS?)
thank you
 
wiseroute
Member
Posts: 352
Joined: Sun Feb 05, 2023 11:06 am

Re: VRRP and ISP Failover

Mon May 22, 2023 4:55 pm

hello rikpal,
Since I do not really need to distribute a single VLAN load on 2 routers, for me it's simpler to use only 1 gateway for my clients.
ok. try to observe this example.
isp1 --- r1/vrrp10,11,12 --- server10
                |
isp2 --- r2/vrrp13,14,15 --- server13
on internet dns perspective:

server10 --- isp1 ip, alias isp2 ip
server13 --- isp2 ip, alias isp1 ip

the inbound dst nat perspective:

their respective isp1 or isp2 requests are obviously forwarded to server10, server13.

a request that comes in on wan1 (isp1 ip) for server13 will automatically be routed inter-vlan. and so forth.

will the nat break the incoming session? no.
will the vrrp gateway for each server will lead to wrong forwarding interface? yes.

solutions : routing mark.

netwatch :
use it solely for link test failure - shut the vrrp interface down, and the vrrp backup will take over the forwarding. note : netwatch just to shut down the respective vrrp interface for each router.

of course, netwatch to activate those interface back on if the link comes back normal.
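
a minimal sketch of that netwatch idea, under assumptions: vrrp10 is just one example interface name (repeat the line for every vrrp interface the router owns), and 9.9.9.9 is assumed to be a probe host that is reachable only through this router's own isp link (for example via a dedicated /32 route through the pppoe interface), otherwise the test could succeed through a backup path.

```
# Hypothetical probe host and interface name; adjust to the real setup.
/tool netwatch
add host=9.9.9.9 type=icmp interval=30s comment="isp link test" \
    down-script="/interface vrrp disable [find name=vrrp10]" \
    up-script="/interface vrrp enable [find name=vrrp10]"
```

with the vrrp interface disabled on this router, the backup router stops seeing advertisements and takes over the virtual ip; enabling it again lets the higher priority router reclaim it (when preemption is on).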

example:

isp1 --- r1 with its part of vrrp interface 10,11,12 prio 255.

if those vrrp interface are down, then

isp2 --- r2 with its part of vrrp interface 10,11,12 prio 100 will be activated. forwarding ip will change as well. the previous session are broken.

compensate for new session.

will the vrrp fail over break any existing session (request and reply)? yes. can't be avoided under nat. but will have no effect in full ip routing.

from the lan or servers perspective : the same vrrp virtual ip gateway.

ok. have a try and good luck 👍🏻
