WireGuard Peer not functioning after a router restart

The WireGuard peer does nothing after a reboot until I disable and re-enable it.
Is it a bug, or do I need to enable/configure something?

“Does nothing” means the “Last Handshake” counter is not running and hosts routed via the WireGuard connection are not reachable.
After disabling and re-enabling the peer, the counter starts running and the hosts become reachable.

My WireGuard side is “client.”

I’m using RouterOS 7.1.1 on RB5009.

Maybe setting the persistent keepalive value would help, if you have not already tried that.

The value for keepalive is 20 seconds now. Unfortunately, this doesn’t help.

If your endpoint is a DNS name instead of an IP address, and the DNS subsystem cannot yet resolve that host when WireGuard first tries to connect, the connection fails. WireGuard doesn’t retry resolving the name, but you can script it.
It’s mentioned around the forum in a few places.

Yes, I’m using DNS for the WireGuard endpoint.

That is rather unfortunate, but it seems I have no choice but to script peer reinitialization.

Thanks for the answer.

It has indeed been a problem since WG was integrated into ROS7: it happens when you use DDNS and DNS resolution is not yet active when the WG interface starts.

Netwatch - remote IP on other end.

On Down, something like this:
/interface wireguard peers disable 0
:delay 5s
/interface wireguard peers enable 0
:log info "WG Peer toggled"

Very basic but does what needs to be done.
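For anyone wiring this up, here is a rough sketch of how the Netwatch entry itself might be registered. The host `10.0.0.1` (the remote end's tunnel IP), the 30s interval, and the interface name `wireguard1` are placeholders I made up, not values from this thread. It selects the peer with `find` instead of item number 0, on the assumption that this is more robust inside scripts. Note that it toggles every peer on that interface.

```routeros
# Sketch only: 10.0.0.1, 30s and "wireguard1" are assumed placeholder values.
# Netwatch pings the remote tunnel IP; when it goes down, the peer is toggled.
/tool netwatch add host=10.0.0.1 interval=30s comment="WG peer watchdog" \
    down-script="/interface wireguard peers disable [find where interface=wireguard1]; :delay 5s; /interface wireguard peers enable [find where interface=wireguard1]; :log info \"WG peer toggled\""
```

Adjust the host to whatever address answers pings on the far side of your tunnel.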

Hi Hoelvetn,
Can you describe a bit more what’s going on here?
I don’t quite get the issue. Are we saying that the client MT router, trying to resolve the server’s mynetname name, takes longer than the persistent keepalive, so that fails and the link is broken?

A bit earlier.
When the router starts, if DNS resolution is not yet possible when the WireGuard interface starts, the connection attempt fails and is not retried later on. So the interface never comes up.
Simply disabling and re-enabling the peer retriggers the resolution of the endpoint DNS name.
If you use ip, you will not have this problem.
Only with dns endpoint names.

It has been a known issue ever since WireGuard was integrated into ROS7, but it is easy to solve.
IMHO the solution would be to prevent the WireGuard interface from starting IF a DNS name is used as the endpoint AND DNS resolution is not functional yet.
Or to retry starting the WG interface once DNS resolution is operational again, if the previous attempt failed because of this.
Multiple ways to skin that cat.

So your script is for the client device, to delay the initial WireGuard connection?
Let’s say I am a user on the client; I open my browser and want to go out.
I will get no result during this delay, correct?

It seems the five seconds is rather arbitrary.
Okay, I see what you are doing: as simple as possible but effective.

I was looking at something overly complex like…

If DNS mynetname is NOT resolved, delay the peer 5 secs.
If DNS mynetname is resolved, carry on.

Not sure how to state it, but would you use the firewall address list entry for mynetname?
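That "wait until it resolves, then carry on" idea could be sketched roughly like this, using `:resolve` rather than an address list entry. This is untested on my side and only a sketch: `wireguard1`, `vpn.example.net`, the 5s delay and the limit of 30 attempts are all placeholder assumptions, and it toggles every peer on the named interface.

```routeros
# Sketch only. "wireguard1" and "vpn.example.net" are placeholder names.
:local wgIf "wireguard1"
:local endpointHost "vpn.example.net"
:local attempts 0
:local resolved false

# Try to resolve the endpoint name, waiting 5s after each failure,
# for at most 30 attempts.
:while (($resolved = false) && ($attempts < 30)) do={
    :do {
        :local addr [:resolve $endpointHost]
        :set resolved true
    } on-error={
        :delay 5s
    }
    :set attempts ($attempts + 1)
}

:if ($resolved = true) do={
    # DNS works now, so toggling the peer should retrigger endpoint resolution.
    /interface wireguard peers disable [find where interface=$wgIf]
    :delay 1s
    /interface wireguard peers enable [find where interface=$wgIf]
    :log info "WG peer toggled after DNS became available"
} else={
    :log warning "WG endpoint still not resolvable, peer left untouched"
}
```

Something like this would presumably be run once at boot, for example from a `/system scheduler` entry with `start-time=startup`, instead of (or next to) the Netwatch approach.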

Added some additional info in my post but yes, the 5s is very arbitrary. It’s what worked for me so far.
I’m not the inventor of this procedure (credit where credit is due, but I am sorry for not remembering who the originator was); I saw it being suggested by someone else as a manual intervention to solve this annoyance.

And I simply put it into a very crude yet effective netwatch procedure.
It’s a default thing I add now when using WireGuard on a “client” peer device.

Sounds like an addition to the wireguard article needs to be worked in.

No, this procedure is for that tiny period at power-on when the WireGuard interface with a named endpoint starts in the same timeframe where DNS resolution is possibly not yet functional. This does not apply, to my knowledge, when you use IP addresses as the endpoint.
When the WG start procedure runs in such a situation, ROS does NOT retry starting the interface. It simply will not come up anymore.
Again, toggling the peer status retriggers that start procedure; at that time DNS should work (so we hope :laughing: ) and the start of the WG interface will then succeed (again, so we hope).

In ideal circumstances that netwatch script will NEVER run.
In most cases it will run once per power-on, shortly after boot. And possibly once more when power is lost and comes back.
But ideally never.

Feel free to add it. It’s something I also see being mentioned from time to time.
Until they provide a real fix embedded in ROS, this is the easy workaround.

When WireGuard is configured properly, the issue described is not an issue … WireGuard recommends that each peer interface be assigned an IP address. THAT is the correct way, and if that is done there is no issue … so your WireGuard article should reflect that correct procedure; otherwise a kludge has to be applied, and there is no need for a kludge. :slight_smile: OK, it’s not a kludge but an add-in routine that is simply not necessary when WireGuard is properly configured.

Another very important POINT that many miss about WireGuard:
WireGuard does not focus on obfuscation. Obfuscation, rather, should happen at a layer above WireGuard, with WireGuard focused on providing solid crypto with a simple implementation.
https://encomhat.com/2021/07/obfuscate-wireguard/

@mozerd,
I respectfully disagree.
The problem here is not the WG peer IP address itself.
The problem is the ENDPOINT address when it is specified as a DNS name (which is perfectly possible and is the ONLY option when you have a dynamic IP on the “server” side).

ONE end of the Wireguard interface needs to be publicly available.
Be it a static IP, static DNS or dynamic DNS.
The peer will connect if it can resolve the DNS name. The “server” will then figure out the address of the “client” once the interface comes up.

Oh mighty @mozerd aka the one that can't read labels, please test the mentioned scenario in the first post of this topic and in other places on the forum:
Set up two peers, each on different routers ofc.
Let’s say Peer A sits behind a dynamic public WAN IP, so you have to use DDNS to reach it; let’s say that peer-A-public.wg-ddns.ru points to this peer’s current IP and gets updated properly.
And we have a Peer B that sits behind a CGNAT IP / NAT, whatever, that you have no control over, and this peer wants to connect to our Peer A.
So in Peer B’s config you set the endpoint for our Peer A, which is peer-A-public.wg-ddns.ru:whateverport.
Now reboot Peer B’s router with the WAN cable unplugged, let it boot, and plug the WAN cable in 10 seconds after it has finished booting.
WireGuard will not come up. It attempts to resolve peer-A-public.wg-ddns.ru and fails because there is no internet at the time it tries, AND THAT’S IT: it doesn’t try again, nothing, nada, just like your ordered switch without PoE doesn’t magically have PoE.
Oh, and by all means, please configure your peers properly before the test.
Cheers.
Also, this is not something MikroTik-specific; it’s just how WireGuard works. For other platforms there are scripts provided:
https://github.com/WireGuard/wireguard-tools/tree/master/contrib/reresolve-dns

Out of curiosity: can someone point me to other topics where the same problem is discussed?
I’ve searched for them before creating this topic but had no success.

Someone needs to upgrade some searching skills … :laughing:

1- http://forum.mikrotik.com/t/microtik-wireguard-to-raspberry-pi/152177/1
2- http://forum.mikrotik.com/t/wireguard-use-hostname-in-endpoint/143014/1
3- http://forum.mikrotik.com/t/v7-1rc3-development-is-released/151711/1 (post 135)
4- http://forum.mikrotik.com/t/v7-1rc4-development-is-released/152002/42 (post 13 indicating it was solved but the issue slipped back in)
I’ll stop here …

Thanks for answering the question,
Yes, that may be the case with my search skills )

The search query “wireguard peer not connected after reboot” gave me nothing, and for “wireguard peer” about the first five topics were irrelevant.

@Znevna
Nice sequence to reproduce the behavior! That’s exactly what will trigger it.

Whereas you say it might be “default behavior” for the protocol, I see it as a bug in the integration in ROS.
The OS should take care of solving this issue for the user.

But there are more pressing problems to fix than this little annoyance, which can be circumvented quite easily.
E.g. some devices become inaccessible through WinBox or SSH on 7.2rc1 (been there, done that).
Or a significant increase in CPU load, also on 7.2rc1 (I’ve seen it happen too).
Routing protocols not working properly, … etc. etc.

But then again, ROS 7.1.1 might be called stable; to me the complete ROS7 stream is still testing-quality.
But it works for what I need :smiley:

In all fairness: I more or less knew which terms would give me the needed hits :laughing:

wireguard toggle peer site:forum.mikrotik.com