I just got the RB5009UG+S+IN and am trying to make a dual WAN setup work. I’ve spent a whole day reading the docs and reproducing the official examples here. I even tried a few YouTube tutorials, but nothing has worked. Here’s the config I have so far: mikrotik.txt (8.51 KB)
I would very much appreciate it if anyone could help. The things that I’m struggling with so far are:
I’m not sure how to express the IP Firewall mangle rules when one connection is PPPoE and the other is DHCP both with non-static IPs.
I am not sure how to create the correct routes when the IPs are not static and the interfaces are different.
For some reason even when I temporarily enter the correct details for routes as they are in the moment I get ‘unreachable’ errors.
The setup is the following:
WAN1 - ISP1 over PPPoE on ether1 at pppoe-out1
WAN2 - ISP2 over DHCP on ether2
LAN - all ports in bridge, ether1 and ether2 are removed from bridge
WAN list contains ether1 and ether2
LAN list contains bridge
Ideally, I’d like to achieve the Example 3 PCC (Load Balancing With Per Connection Classifier), but so far copying the commands and substituting the fixed IPs has not worked. I’m at a loss. Any help would be much appreciated.
PPPoE can use gateway=pppoe-out1, so that’s fine. For DHCP you need the correct gateway, which can be updated by a lease script (example; the route’s routing-mark is a v6 parameter, v7 has routing-table). And you shouldn’t need the first two mangle rules; their purpose is to skip the rest when the request is from LAN and the destination is a local address (or subnet, when there’s some modem/upstream router you have access to). But local addresses are also excluded by dst-address-type=!local in further rules. And it doesn’t look like you need access to any modem.
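For reference, a lease script along those lines might look like the sketch below on RouterOS v7. It assumes the route to update already exists and carries comment=dhcp-mrealnet (adjust to whatever comment your ISP2 route has); $bound and $"gateway-address" are variables the DHCP client passes to its script.

```
# Paste into the "script" field of the DHCP client on ether2.
:if ($bound = 1) do={
    # Lease obtained or renewed: point the route at the current gateway.
    /ip route set [find comment="dhcp-mrealnet"] gateway=$"gateway-address" disabled=no
} else={
    # Lease lost: disable the route so it is not used with a stale gateway.
    /ip route set [find comment="dhcp-mrealnet"] disabled=yes
}
```

Matching on the comment rather than on the gateway means the script keeps finding the same route after the gateway address changes.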
I corrected the typos and set up the script. Everything should be working, but it’s not. If I remove the default route on the PPPoE and then remove the ISP_2 cable (to simulate ISP2 not working) my whole WAN goes down - nothing on the bridge can connect to the internet. Also, I’ve just noticed that print and export have profoundly different output, where the export command has routes in it that have been removed a very long time ago and are also incorrect. So I am at a loss to figure out the exact state of the router at any moment. See below:
/ip route print detail
Flags: D - dynamic; X - disabled, I - inactive, A - active; c - connect, s - static, r - rip, b - bgp, o - ospf, d - dhcp, v - vpn, m - modem, y - copy; H - hw-offloaded; + - ecmp
0 Xs ;;; dhcp-mrealnet
dst-address=0.0.0.0/0 routing-table=to_ISP2 gateway=1.2.3.4 distance=1 suppress-hw-offload=no
DAv dst-address=0.0.0.0/0 routing-table=main pref-src="" gateway=pppoe-out1 immediate-gw=pppoe-out1 distance=1 scope=30 target-scope=10 vrf-interface=pppoe-out1 suppress-hw-offload=no
DAc dst-address=192.168.152.0/24 routing-table=main gateway=bridge immediate-gw=bridge distance=0 scope=10 suppress-hw-offload=no local-address=192.168.152.1%bridge
DAc dst-address=213.231.129.12/32 routing-table=main gateway=pppoe-out1 immediate-gw=pppoe-out1 distance=0 scope=10 suppress-hw-offload=no local-address=213.231.164.18%pppoe-out1
/ip route export
# jun/21/2022 16:25:16 by RouterOS 7.3.1
# software id = ZHBT-C72F
#
# model = RB5009UG+S+
# serial number = EC190EEB6423
/ip route
add check-gateway=ping disabled=no distance=1 dst-address="" gateway=8.8.8.8 routing-table=*400 suppress-hw-offload=no
add check-gateway=ping disabled=no distance=2 dst-address="" gateway=8.8.4.4 routing-table=*400 suppress-hw-offload=no
add check-gateway=ping disabled=no distance=2 dst-address="" gateway=8.8.8.8 routing-table=*401 suppress-hw-offload=no
add check-gateway=ping disabled=no distance=1 dst-address="" gateway=8.8.4.4 routing-table=*401 suppress-hw-offload=no
add check-gateway=ping disabled=no dst-address="" gateway=ether2 routing-table=to_ISP2 suppress-hw-offload=no
add check-gateway=ping disabled=no dst-address="" gateway=pppoe-out1 routing-table=to_ISP2 suppress-hw-offload=no
add comment=dhcp-mrealnet disabled=yes distance=1 dst-address=0.0.0.0/0 gateway=1.2.3.4 routing-table=to_ISP2 suppress-hw-offload=no
PPPoE is ISP1, isn’t it? So if you remove default route from it, and you disconnect ISP2, then yeah, it won’t work very well, because one ISP will be without route to internet and another completely disconnected. Or did I misunderstand something?
You do want add-default-route=yes for DHCP client (and higher distance if it’s secondary ISP), because it’s used by router itself, e.g. for DNS requests.
I’m not sure (and I can’t test it right now) if check-gateway=ping does anything for PPPoE. Probably not. I don’t use PPPoE often, so I don’t remember it now, but there was some solution/workaround. I’ll try to remember, or maybe someone else will chip in.
Thanks for this, but I think the default route on ISP1 conflicts with the ISP1 route that I set up manually through the ToISP1 routing table. If I don’t remove it, then everything goes through routing-table=main with destination=0.0.0.0/0 and gateway=%pppoe-out1, which is not correct. That will not load balance any connections. After the config is done, I should only have the manual routes: the script-based one on the DHCP ISP2 connection with table=ToISP2, and the static one for ISP1 with table=ToISP1 and gateway=%pppoe-out1. I should never need the default route on any ISP.
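To make the intended end state concrete, here is a rough sketch of per-ISP tables and routes on RouterOS v7. The table names to_ISP1 and to_ISP2, the comments, and the placeholder gateway 1.2.3.4 are assumptions for illustration, not taken from a working config.

```
# In v7 a routing table has to be created explicitly; "fib" makes it usable for forwarding.
/routing table
add name=to_ISP1 fib
add name=to_ISP2 fib

/ip route
# ISP1: the PPPoE interface itself can serve as the gateway.
add dst-address=0.0.0.0/0 gateway=pppoe-out1 routing-table=to_ISP1 comment="ISP1 default"
# ISP2: placeholder gateway, meant to be overwritten by the DHCP lease script.
add dst-address=0.0.0.0/0 gateway=1.2.3.4 routing-table=to_ISP2 comment="dhcp-mrealnet"
```

Only traffic that carries a routing mark is looked up in these tables, so default routes in main should be able to coexist with them and would still serve the router's own traffic, such as DNS requests.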
EDIT
I no longer have an issue with acquiring DHCP IP from ISP2 (ISP2 had sent me wrong details).
The only issue left now is that I cannot seem to balance ISP1 and ISP2 for some reason. As soon as ISP2 comes online, network connectivity slows to a crawl. The connections seem to be marked correctly, and I believe the routes are also correct, but for some reason traffic only occasionally gets through correctly.
Do you mean passthrough? I tried removing that on all the mangle rules and adding back the main table routes, but the situation remains the same - all clients have their traffic slow to a crawl. I also saw a heated discussion on the forum about the passthrough rules, mangle and dual WAN; I have tried following the suggestions there, but still no luck.
I think that this guide is what I’ll refer to next. It says that if the end-users are hitting the router directly, one should use the src-address classifier to avoid HTTP multi-connection-splitting issues.
Check if disabling this rule helps. If it does, you can either keep it disabled (RB5009 should have plenty of power to not really need it), or you’d need to make sure that it doesn’t touch connections using secondary ISP.
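Assuming the rule in question is the default fasttrack rule in the filter table, the two options could be sketched like this. The second one relies on the PCC setup giving balanced connections a connection mark, which is an assumption about your mangle rules.

```
# Option 1: disable the default fasttrack rule entirely.
/ip firewall filter disable [find action=fasttrack-connection]

# Option 2: keep it, but only for connections without a connection mark,
# so that marked (load-balanced) connections still pass through mangle.
/ip firewall filter set [find action=fasttrack-connection] connection-mark=no-mark
```

Pick one or the other; the second command has no practical effect while the rule is disabled.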
As for the best PCC classifier, if you want to avoid some remote servers not liking your connections changing addresses all the time, you want something that does not include local port.
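As a sketch of what that could look like, the rules below use the both-addresses classifier, which hashes only source and destination IP, so every connection between the same client and server leaves through the same ISP. The mark names ISP1_conn/ISP2_conn and the table names to_ISP1/to_ISP2 are assumptions; the tables must already exist under /routing table.

```
/ip firewall mangle
# Split new connections from LAN into two groups, ignoring ports.
add chain=prerouting in-interface=bridge connection-mark=no-mark dst-address-type=!local \
    per-connection-classifier=both-addresses:2/0 action=mark-connection \
    new-connection-mark=ISP1_conn passthrough=yes
add chain=prerouting in-interface=bridge connection-mark=no-mark dst-address-type=!local \
    per-connection-classifier=both-addresses:2/1 action=mark-connection \
    new-connection-mark=ISP2_conn passthrough=yes
# Send each group through its own routing table.
add chain=prerouting in-interface=bridge connection-mark=ISP1_conn \
    action=mark-routing new-routing-mark=to_ISP1 passthrough=no
add chain=prerouting in-interface=bridge connection-mark=ISP2_conn \
    action=mark-routing new-routing-mark=to_ISP2 passthrough=no
```

Using src-address instead would pin each LAN client to one ISP regardless of destination, at the cost of a coarser split.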
So after a bunch of experimenting, I settled on the unsatisfactory setup of just having the different routes set with different distances, so that the “main” WAN is used and, if that fails, the other one takes over: simple failover.
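For anyone wanting to reproduce that, a minimal sketch of distance-based failover, assuming the PPPoE client is named pppoe-out1 and the DHCP client runs on ether2 as described earlier in the thread:

```
# Primary: PPPoE default route with the lower distance.
/interface pppoe-client set [find name=pppoe-out1] add-default-route=yes default-route-distance=1
# Backup: DHCP default route with a higher distance, used only while the primary route is inactive.
/ip dhcp-client set [find interface=ether2] add-default-route=yes default-route-distance=2
```

Note that this only switches over when the primary route actually becomes inactive (for example when the PPPoE session drops), not when the ISP has a problem further upstream.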
Unfortunately, I’ve already tried a bunch of the official guides and some others from the forum here, and none of them worked as expected, nor did they even meet basic network functioning requirements. Disabling the default “fasttrack” rule had a marginal effect given the mangle rules were prerouting and passthrough. Also, I think the named routing tables do not work as advertised, or something in all guides is missing. As soon as I create a route in a different routing table other than “main”, it is deemed unreachable, regardless of “main” table routes. Neither the official guide nor the one posted by @anav (which is for RouterOS 6 and uses routing marks instead of routing tables) helps alleviate this.
I’m now on RouterOS 7.5 and really struggling to find appropriate, applicable and functioning examples or documentation. I’m basically banging my head against the wall and learning the old-fashioned way. It’s very frustrating, but hopefully, when I have more time in the future, I’ll find a way to make it work.