samsung172
Forum Guru
Topic Author
Posts: 1191
Joined: Sat Apr 04, 2009 3:45 am
Location: Østfold - Norway

A strange day - VRRP/Wireguard

Mon Mar 18, 2024 3:11 am

Hello! Long time no see. It's been a while since I was last here, but today has been my strangest MikroTik day in some years, so I need to ask you guys for some tips.

I had a major failure at my data center today. It's actually the 3rd time in a year that this has happened, so I was rebuilding my core network a bit. Earlier this Sunday morning (it's always Sundays, right?) the core router connecting to my transit provider went offline, and I had to take a trip to the Orange data center where my equipment is installed. When I arrived, the core router was stuck in a boot loop, rebooting over and over. Last time I suspected a console cable of causing this, but I was not sure. I have an original blue Cisco DB9 (RS232) to RJ45 cable connected directly from my CCR1036 to a Cisco switch as a backdoor for login. I power-cycled the router 4-5 times without success. I then wanted to connect the CCR to my computer with a USB-to-serial cable, but as soon as I unplugged the blue Cisco cable from the CCR, it started normally. This is weird behavior to me, so I plugged the cable back in, but everything seemed fine and there was no problem once the router had already booted.

My plan for a while has been to run VRRP towards the transit, and the transit provider has had the config for this ready for some time. I had a spare CCR (CCR1009) and mounted it for my second internet line. This was fine. I then thought I should test the console cable and plugged it into this router as well. Everything seemed fine until I upgraded the software and rebooted the router. Then the same thing happened to this router: it just hangs during boot, times out after some minutes and tries to boot again, without success. OK, strange. Might it be the ROS upgrade? I tested a bit more, and it seems that every time I tried to reboot the router, it hung before even loading the kernel. This is strange behavior to me. I know there is a boot menu you can reach over a console cable by pressing the right keys during boot, but would the Cisco do this by itself? Does anyone have a clue? Anyhow, I don't need the backdoor to the Cisco switch, so I just unplugged the cable and had no further problems with this. Now it was time to configure VRRP. No problem: it came up like a charm and worked as planned.

Almost. After this job, almost all my WireGuard tunnels stopped working. I had 13 tunnels to different stuff, and 11 of them were down. The tunnel to my home was OK, and so was one more going through an LTE device. The other 11 were not working. I tried everything I know to figure out what was wrong, but nothing helped, and I didn't know why 2 of the tunnels were online. After some research, I figured out that my line, the one that was working, was the only one with a public IP. All the other ones were behind NAT. The 2 devices that were working had an SSTP tunnel, since the provider of the line was blocking all traffic except port 443. So it had WireGuard going via an SSTP tunnel, and it was working.

Now I have broken the problem down a bit. It seems that public IP to public IP works well, and so does direct IP to IP inside SSTP, but as soon as the traffic comes from a NATed device, the tunnel will not come up. One of the NATs is at AWS with a virtual router there, allowing all traffic in without any firewall, and the problem is there as well.
As a quick fix, and to get the lines up (I was running EoIP via the WireGuard tunnels), I quickly changed all lines to SSTP with EoIP directly over it, but due to the SSTP overhead I want to go back to my WireGuard tunnels.

All the tunnels were working before the change to VRRP, and there are actually no other changes to the setup. The public/private keys are the same. IP addresses, port numbers, etc. are the same. The only difference is that the link IP now uses VRRP, still with the same router as the main router and the new one in backup state. The VRRP IP is just a link net IP between my transit provider and me, and is the gateway for my subnets with public IPs. I have one IP from my own net on the router itself, and that is the IP the tunnels connect to. I don't even use the VRRP link IP, but one routed behind it. All other traffic behind the router behaves normally, and nothing else has changed.

In the peers window in WireGuard I see the endpoint address from the client, so there is at least some traffic arriving. The error I get is the "Handshake for peer did not complete after 5 seconds, retrying" message, so something about the handshake is not working. But why does it work public IP to public IP and SSTP to SSTP, but not public to NATed? Does anyone have a clue? I can provide more config, but I don't think it will help, as my config was working earlier and I have made no changes to it.

I tried WireGuard debug logging, but there was really no more info there.

Sorry for my long post about a lot of stuff that I don't need here - but I needed to get some frustration out :)
 
Amm0
Forum Guru
Posts: 3506
Joined: Sun May 01, 2016 7:12 pm
Location: California

Re: A strange day - VRRP/Wireguard

Mon Mar 18, 2024 4:41 am

Quite the tale. So you have a WAN with some subnet of public IPs, 10 WG tunnels, EoIP using WG peers. No BGP? E.g. just a subnet of public IPv4 on some bridge, using VRRP to potentially move one or more to another router on the same "WAN bridge"? And all was working until you added VRRP to the WAN side?

Some diagram would help here for sure.

Your case is a bit similar to one involving route rules and WG's heartbeat, where WG's heart[beat] follows its own beat. See viewtopic.php?t=205278&hilit=wireguard
 
samsung172

Re: A strange day - VRRP/Wireguard

Mon Mar 18, 2024 9:27 am

Thank you for the reply.

The link to the other post gave me some clues - but not the solution itself. I'm wondering if the WireGuard handshake behaves a bit differently than other stuff in the routers regarding egress traffic?

My setup is as simple as possible, with kinda nothing special. No BGP or other routing protocols. It's not multi-homed, and I don't have an AS anymore for my addresses; it's just a subnet provided by my transit provider. I get 2 fibers from the provider, with the same subnet configured and with VRRP. (I also have a third one, 10GbE, providing me L2 VLANs to different sites, but that's not a part of this story.)

Simple setup. Let's say 4.3.2.1/29 as a link net between me and the provider. 1.2.3.1/24 as my net. I have R1 and R2 connected.

The provider then routes 1.2.3.1/24 via 4.3.2.2, which is the "gw" IP on my side. Before the change to VRRP, this was my "external" IP, set directly on the interface pointing to my provider. The second fiber was connected but not in use. After adding one more router, linking the second fiber to it and changing to VRRP, R1 itself has 4.3.2.3 and the second router, R2, has 4.3.2.4, both with 4.3.2.2 as their VRRP IP (the old R1 IP). A pure, simple VRRP setup. On the other side of R1, I have defined 1.2.3.1/24 and 1.2.3.2/24 on a loopback interface. (I have not yet started on redundancy using R2.) A ping/traceroute to 4.3.2.2 shows that everything is fine, and R1 with IPs 4.3.2.3/4.3.2.2 responds as it should to traffic to DST 4.3.2.2. If I, for example, reboot R1, then R2 (4.3.2.4) will be the one to respond to 4.3.2.2. (It does not yet respond for the 1.2.3.1/24 net, but that's next on my to-do list.)
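
To make the addressing above concrete, here is a minimal Python sketch (not RouterOS config) that checks that the three link addresses sit in the same /29 and models which router answers for the VRRP address. The priorities 200 and 100 and the `vrrp_master` helper are assumptions for illustration only; the post does not give the real values, and real VRRP also has advertisement timers and preemption that are left out here.

```python
import ipaddress

# Addresses are the example values from the post; priorities are made up.
LINK_NET = ipaddress.ip_network("4.3.2.0/29")   # link net towards the provider
VIRTUAL_IP = ipaddress.ip_address("4.3.2.2")    # VRRP address (the old R1 IP)

ROUTERS = {
    "R1": {"ip": ipaddress.ip_address("4.3.2.3"), "priority": 200},
    "R2": {"ip": ipaddress.ip_address("4.3.2.4"), "priority": 100},
}


def vrrp_master(routers, down=()):
    """Return the live router that would own the virtual IP, or None.

    The highest priority wins; a tie is broken on the higher interface IP.
    """
    alive = {name: r for name, r in routers.items() if name not in down}
    if not alive:
        return None
    return max(alive, key=lambda n: (alive[n]["priority"], int(alive[n]["ip"])))


# All addresses must share the link net for VRRP to make sense on it.
assert VIRTUAL_IP in LINK_NET
assert all(r["ip"] in LINK_NET for r in ROUTERS.values())

print(vrrp_master(ROUTERS))                 # normal operation
print(vrrp_master(ROUTERS, down={"R1"}))    # R1 rebooting
```

With these assumed priorities the first call picks R1 and the second picks R2, which matches the failover behavior described above.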

Anyway, WireGuard is connecting to the 1.2.3.1 IP, which in my theory should be unaffected by the VRRP interface on the other side of the router. My question now is: how does WireGuard handle this at egress? Does it actually use the VRRP side of the router to send its handshake, even if "clients" are connecting to the inside IP? Might the best solution be to keep R1 and R2 as dumb as possible and add a third router to do the VPNs and stuff like that, so there is no question about what is used for egress/ingress, etc.?

I'll do a bit more research here now trying to figure out about this.
 
samsung172

Re: A strange day - VRRP/Wireguard

Mon Mar 18, 2024 10:19 pm

After some work I was able to conclude that the error is indeed WireGuard not knowing what to do with egress when there is a floating IP from VRRP redundancy. Traffic is coming in via the correct interface and IP, but egress uses the other IP in the VRRP setup.

Problem solved by adding a 3rd router to do the VPN stuff and letting the 2 front routers do the VRRP stuff.
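
One possible reading of why only the NATed peers broke can be sketched in Python. This is a toy model and an assumption, not something traced on the routers: it supposes the client-side NAT only lets a reply back in when it comes from the exact address and port the client sent to, while a peer on a public IP receives the reply whatever its source. The `Nat` class, the `handshake_completes` helper, port 13231 and the choice of 4.3.2.3 as the wrong source address are all hypothetical.

```python
class Nat:
    """Toy NAT that admits inbound packets only from remotes already contacted."""

    def __init__(self):
        self.contacted = set()   # (remote_ip, remote_port) pairs seen outbound

    def outbound(self, dst_ip, dst_port):
        self.contacted.add((dst_ip, dst_port))

    def admits(self, src_ip, src_port):
        return (src_ip, src_port) in self.contacted


def handshake_completes(reply_src_ip, client_nat=None,
                        server_ip="1.2.3.1", port=13231):
    """Client sends an initiation to server_ip; the server replies from reply_src_ip."""
    if client_nat is None:
        # Assumed: a client on a public IP receives the reply regardless of source.
        return True
    client_nat.outbound(server_ip, port)
    return client_nat.admits(reply_src_ip, port)


print(handshake_completes("1.2.3.1", Nat()))   # reply from the address the client dialed
print(handshake_completes("4.3.2.3", Nat()))   # reply from another address on the router
print(handshake_completes("4.3.2.3"))          # public client, no NAT in the path
```

In this model only the middle case fails, which is consistent with public-to-public tunnels staying up while NATed peers never finish the handshake. A packet capture on the provider-facing interface would be needed to confirm which source address the replies really carry.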
 
Amm0

Re: A strange day - VRRP/Wireguard

Mon Mar 18, 2024 10:44 pm

Since you might have traffic using VRRP IP over WG, you cannot just add a /routing/rule to drop it - which was my original thought.

So... a separate VPN router seems like a good call. That simplifies thinking about the WG interactions with VRRP if they are NOT on the same router. E.g. you would need to do a lot of tracing to figure out what was actually going on with VRRP and WG... ONLY to get a slightly better picture and specific IP/interface/etc... BUT not necessarily an immediate solution.
 
anav
Forum Guru
Posts: 19404
Joined: Sun Feb 18, 2018 11:28 pm
Location: Nova Scotia, Canada

Re: A strange day - VRRP/Wireguard

Tue Mar 19, 2024 2:43 am

Without a diagram I have no clue what you are trying to do. Any explanation of requirements to date IS NOT user traffic based only, and is confused with config speak, a no no for communicating requirements. Short story, no diagram no user traffic requirements, no diagram, cannot help.
Furthermore, without evidence, all you say is pure conjecture/opinion......... of what you think you have or are facing.
Bless your heart ( learned that from Alabama ) Ammo! for trying ;-))
