After upgrading to ROSv7, I’ve been looking at WireGuard as an alternative for remote workers and have been doing some testing on it. The configuration is simple and quick, which I like a lot, and I’ve established a connection with my mobile to test things out.
Now for some context:
1. The router is a CCR 1009-8G-1S.
2. We have two ISPs set up for redundancy, one going out of SFP1, another from Ether7.
3. I have created the dst-nat rule and confirmed it works correctly.
4. My phone is able to ping locally and the internet. Also DNS is fine.
I would like some help from a more experienced Mikrotik admin regarding packet flow in cases where there’s a routing change.
So what’s happening currently if I swap ISPs is that WireGuard is able to do the handshake and ping the Mikrotik address it has been assigned, but not the WireGuard tunnel interface, or anything else for that matter.
I’m unsure why the packets are not going out of the backup ISP’s interface whenever this routing change happens and why it affects WireGuard specifically.
What I’ve also tried is adding a separate peer on the WireGuard client configuration pointing to the backup ISP’s public address - and yes, the handshake happens, but nothing else passes through the router.
In a dual-WAN scenario where WAN2 is secondary (say, by distance) and your users connect to the WAN1 address, when WAN1 fails (is no longer available) the router will move WireGuard traffic to WAN2 after a short delay. I haven’t tested that lately, but it used to be the case. You can test by having a WireGuard connection ongoing, then pulling the WAN1 cable from the router and measuring how long it takes for the user to regain access to the subnets or internet on the router, etc.
Not sure what you mean by swap ISPs, but yes, if the primary is still available and users attempt to connect to the WAN2 IP address for WireGuard, it won’t work without some modifications.
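The reply above does not say which modifications are meant. One commonly used pattern for "answer on the WAN the request came in on" is policy routing with mangle marks, sketched below for ROSv7. This is an untested sketch, not the poster's method: the table name via-wan2, the mark name wan2-in and the gateway 198.51.100.1 are all placeholders, and ether7 is assumed to be the backup WAN from the first post. It may not be sufficient on its own for WireGuard in every setup.

```routeros
# ASSUMPTIONS: via-wan2, wan2-in and 198.51.100.1 are placeholders, not values from this thread.
# Separate routing table that always uses the backup ISP's gateway.
/routing table
add name=via-wan2 fib
/ip route
add dst-address=0.0.0.0/0 gateway=198.51.100.1 routing-table=via-wan2

# Mark connections that arrive at the router itself through the backup WAN...
/ip firewall mangle
add chain=input in-interface=ether7 connection-state=new \
    action=mark-connection new-connection-mark=wan2-in passthrough=yes
# ...and send the router's own replies for those connections out via the same WAN.
add chain=output connection-mark=wan2-in \
    action=mark-routing new-routing-mark=via-wan2 passthrough=no
```

The idea is that without such rules the router's replies follow the main table's default route (the distance=1 ISP) regardless of which WAN the request arrived on.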
Interesting, I would expect that same behavior but it’s not the case on my end of things. Probably something else interfering?
I have both default-static routes configured via distance, so one is sitting at 1, the backup ISP is sitting at 2.
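For reference, a distance-based pair of default routes like the one described would look roughly like this. The gateway addresses are documentation placeholders, and check-gateway=ping is an assumption added for illustration; the thread does not say whether it is set.

```routeros
# ASSUMPTIONS: gateway addresses are placeholders; check-gateway is not confirmed by the thread.
/ip route
# Primary ISP (SFP1): lowest distance wins while the route is active.
add dst-address=0.0.0.0/0 gateway=203.0.113.1 distance=1 check-gateway=ping comment="primary ISP"
# Backup ISP (Ether7): only used when the distance=1 route becomes inactive.
add dst-address=0.0.0.0/0 gateway=198.51.100.1 distance=2 check-gateway=ping comment="backup ISP"
```

With this layout, swapping the distances leaves both routes active and only changes which one is preferred, whereas unplugging a WAN makes its route inactive.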
So for testing, I connected my phone, pinged a few internal and external addresses to make sure everything was okay - and then swapped the distances to simulate latency on a bad ISP day. The result was a complete inability to ping anything but the assigned WireGuard address. I reconnected just in case that was necessary, and nothing changed.
Since you mentioned this was working for you before, do you recall what the setup was like so I can simulate it on my end?
Your testing method may be flawed.
If you swap distances on the WANs, do you also change the endpoint address to WAN2 on the device?
You need to NOT change the WAN distance; simply unplug the WAN1 cable from the router.
I don’t understand. Why would that need to be the case? The first WAN IP can still be pinged from the internet, so dst-nat should still be working to bring traffic in? Unless there’s another interaction in place which I’m not Mikrotik-savvy about.
If I unplug the cable, that WAN IP is no longer able to bring traffic in - so how would the WireGuard client be able to find it to establish the handshake?
Or did you phrase that question assuming the WireGuard client has two configured peers like:
[Peer]
PublicKey=keyhere
Endpoint=FirstWANIP:Port
AllowedIPs=0.0.0.0/0
If WAN1 is your primary WAN (and WAN2 is rarely used), then it stands to reason that all your WireGuard users have WAN1 as their endpoint address.
To test if the router will switch to WAN2 automatically, due to distance in route difference, please do not SWAP distances.
To test, simply unplug the internet cable associated with WAN1 from the router and observe the behaviour.
After some time (more than 10 seconds, less than 30 seconds, I imagine) the router should switch WireGuard traffic to WAN2 on its own.
Please attempt and report back. ( see how long it takes…)
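While running that test, a couple of read-only commands on the router can show what is happening. This is only a sketch of what to look at, not a required procedure.

```routeros
# Which default route is currently active (look for the A flag).
/ip route print where dst-address=0.0.0.0/0

# Per-peer state: last-handshake and current-endpoint-address show
# whether handshakes still complete and where the peer is seen from.
/interface wireguard peers print detail
```

Comparing the output before and after pulling the cable gives a rough idea of how long the switch takes.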
I just did this test (with minor regrets as I had a user in a phone call and the great telephony system their client uses bugged out), and traffic immediately jumped to Ether7. However WireGuard handshakes were no longer processing as before. The timer refreshes every 2 minutes by looking at WireGuard Peers, I assume that’s when it does the keep alive check? And this time it went over 2 minutes and didn’t refresh again. To double check it, I did a ping on my mobile as well and there was no reachability anywhere.
My very broscience take, as I’m not networking-savvy enough, is - could it be that WireGuard is expecting a reply from that same Peer address (from WAN1) and, despite Mikrotik trying to route traffic out the WAN2 interface, the WireGuard client is not accepting it and the handshake process stops?
It should work so there may be something else in your config interfering.
/export file=anynameyouwish ( minus router serial number, any public WANIP information, keys )
The config is far too complex for my level of understanding; however, I will say that you give away addresses like candy to kids,
and as far as I understand a single bridge should not have multiple IP addresses, nor probably any single etherport…
It’s just different subnets being assigned their respective gateways on the bridge. What is your configuration like for a network with multiple subnets in it?
Each vlan is created with interface being bridge.
Each vlan gets its own dhcp server, ip pool, dhcp-server network AND its own IP address (not a sniff of bridge on these subnet config lines).
The only other places vlans and bridges are mixed are /interface bridge port and /interface bridge vlan.
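As a rough illustration of that layout for a single VLAN, something like the following could be used. Every name and number here (vlan10, VLAN ID 10, 192.168.10.0/24, ether2, a bridge named bridge) is a made-up example, not taken from the exported config.

```routeros
# ASSUMPTIONS: all names, IDs, ports and subnets below are hypothetical examples.
# The VLAN interface sits on the bridge; everything else hangs off the VLAN, not the bridge.
/interface vlan
add name=vlan10 vlan-id=10 interface=bridge
/ip address
add address=192.168.10.1/24 interface=vlan10
/ip pool
add name=pool-vlan10 ranges=192.168.10.10-192.168.10.254
/ip dhcp-server
add name=dhcp-vlan10 interface=vlan10 address-pool=pool-vlan10 disabled=no
/ip dhcp-server network
add address=192.168.10.0/24 gateway=192.168.10.1 dns-server=192.168.10.1

# The two places where VLANs and the bridge meet.
/interface bridge port
add bridge=bridge interface=ether2 pvid=10
/interface bridge vlan
add bridge=bridge vlan-ids=10 tagged=bridge untagged=ether2
```

None of this takes effect as VLAN separation until vlan-filtering is enabled on the bridge, which is best done from a port that is not on the bridge.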
There is no VLAN filtering configured on the network, it’s purely in one bridge with each subnet only configuring the gateway. Are you thinking that’s the problem in itself or? I’m not following how that could be related to WireGuard traffic not going the way it should.
I’m saying a bridge gets one address. If you want different subnets, you can cover ports A-F with the same subnet on a single bridge and use different addresses for ports G, H, I that are NOT on the bridge, as that will cover three different subnets.
OR
use one bridge and assign as many vlans as you need (subnets) going over the various ports
The point is WireGuard is not the real issue at the moment. Once the config is fixed, we will be able to see what’s going on with WireGuard, if it’s still a problem.
Understood. Well, VLAN filtering is something that we will eventually get down to testing and implementation on maintenance windows as we don’t have many of those due to 24/7 network usage on the company. Besides that, this kind of configuration has worked very well and even when I took MTCNA I was explained it the exact same way for subnetting, so I’m not sure where you’re seeing the problem exactly and how it might be fixed? Bear in mind we already have 7 subnets and will likely expand in the future, so assigning one port for every subnet is not realistic.
Then set up VLAN filtering now and, once it’s smooth, do the WireGuard. It should take me 10 minutes to fix; once you have an initial config it’s like butta.
First, however, it’s best to work on the config from an OFF-the-bridge position.
What I recommend is to create an off-bridge port for local emergency access.
So remove etherX from /interface bridge port settings.
Rename the port:
/interface ethernet
set [ find default-name=etherX ] name=OffBridgeX
Give it an IP address:
/ip address
add address=192.168.77.1/30 interface=OffBridgeX network=192.168.77.0
Add it to the interface lists:
/interface list
add name=LAN
add name=TRUSTED
/interface list member
add interface=vlanManagement list=LAN
add interface=OffBridgeX list=LAN
add interface=vlanManagement list=TRUSTED
add interface=OffBridgeX list=TRUSTED
Now you should be able to plug your laptop into etherX, change the IPv4 settings on the laptop to 192.168.77.2 (netmask 255.255.255.252), then use Winbox to enter the router with your username and password.
Do all the initial config here as well!
Note that the /30 netmask on the address leaves only two usable host addresses in that subnet: .1 (the router) and .2 (your laptop).