Hi, I have a WireGuard server in a datacenter (static IP) and a WireGuard client at home on a CCR (dynamic IP, behind NAT). I have enabled persistent keepalive because I’m behind NAT.
As we know, WireGuard is stateless, and I’m experiencing connection issues. Maybe the connection is lost when my home ISP changes my IP.
I have configured a netwatch script that pings the remote tunnel IP and, when it detects the tunnel is down, disables the WireGuard interface, waits 30 seconds and enables it again. It works, but for example, yesterday my home internet connection was down for 2 hours, and when it came back up the script had only reset the interface once. It does not keep retrying until the tunnel comes up.
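The "reset once and give up" behaviour described above can be contrasted with a loop that keeps resetting until the tunnel answers. The following is only a sketch of that logic in Python, not RouterOS script and not the OP's actual netwatch script (which is not shown). The names `recover_tunnel`, `tunnel_is_up` and `reset_interface` are hypothetical placeholders for "ping the remote tunnel IP" and "disable, then enable the WireGuard interface".

```python
import time
from typing import Callable


def recover_tunnel(
    tunnel_is_up: Callable[[], bool],
    reset_interface: Callable[[], None],
    retry_delay: float = 30.0,
    max_attempts: int = 0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Reset the interface repeatedly until the tunnel check succeeds.

    Returns the number of resets performed.
    max_attempts=0 means "keep retrying forever".
    """
    attempts = 0
    while not tunnel_is_up():
        if max_attempts and attempts >= max_attempts:
            break  # give up after the configured number of resets
        reset_interface()  # hypothetical: disable + enable the interface
        attempts += 1
        sleep(retry_delay)  # wait before checking the tunnel again
    return attempts
```

The point of the design is that the check is repeated after every reset, so a 2-hour outage results in repeated attempts rather than a single one.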
The known issue with not reconnecting arises when the ENDPOINT, the SERVER, changes its IP address and a dyndns or mynetname name is used for this server endpoint.
In this case the SERVER is fixed and does not change.
The OP also noted he has enabled keepalive on the client.
THUS, according to WIREGUARD ROAMING, the SERVER and CLIENT keep each other apprised of any changes, each using the source address of the last valid packet it received from the other.
Therefore, although the server may no longer have the correct IP for the client, the CLIENT WG will still have the correct IP and port for the server. Its next keepalive packet will UPDATE the WG server with the correct settings, and FULL connectivity should be re-established.
In other words, the longest delay after a change should be one keepalive cycle, or perhaps two.
Or are we saying that, after a change of ISP address, the router may take so long to acquire a new address that several keepalive cycles pass and the WireGuard protocol stops trying to contact the server?
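To make the roaming argument concrete, here is a toy Python model of the server side: it remembers, per peer, the source address of the last authenticated packet and sends replies there. This is a simplified illustration, not WireGuard's real implementation. The class names, the `authenticated` flag (standing in for the cryptographic check) and the example addresses are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (ip, port)


@dataclass
class Peer:
    public_key: str
    endpoint: Optional[Endpoint] = None  # unknown until the peer speaks first


class Server:
    def __init__(self) -> None:
        self.peers: Dict[str, Peer] = {}

    def add_peer(self, public_key: str) -> None:
        self.peers[public_key] = Peer(public_key)

    def receive(self, public_key: str, source: Endpoint,
                authenticated: bool = True) -> bool:
        """Handle an incoming packet; update the endpoint only if it is valid."""
        peer = self.peers.get(public_key)
        if peer is None or not authenticated:
            return False  # unknown peer or failed check: endpoint unchanged
        peer.endpoint = source  # roaming: follow the latest valid source
        return True

    def reply_target(self, public_key: str) -> Optional[Endpoint]:
        """Where the server would send traffic for this peer right now."""
        return self.peers[public_key].endpoint
```

In this model, a client whose public IP changes only needs to get one valid packet (for example a keepalive) through to the server for `reply_target` to point at the new address, which is why the expected delay is about one keepalive interval.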
As far as I see it, it’s the other way around here but your reasoning is correct.
The “server” is in this case behind a fixed IP. Nothing changes there. It’s sitting nicely on his tower which doesn’t move.
The “client” is behind a dynamic IP and that’s why the keep-alive is needed, to notify the “server” if connection details have been changed.
WireGuard handles that beautifully. I have tested it multiple times with a cell phone moving abroad from ISP to ISP, so a new CGNAT IP each time; bar some minor disruptions during the takeover, it recovers nicely.
But if for one reason or another the “client” gets disconnected and the keepalive no longer gets through, toggling the peer status will make sure it reconnects to the server, which hasn’t moved a bit.
Toggling on the “server” side is pointless, it doesn’t know where to go to (and if it’s a “client” behind CGNAT, it will not even be ABLE to pass) so it simply waits for something to come in.
Then the tunnel is made and we’re back in business.
Similar to dynamic dns and startup but just a bit different
If what you’re saying is true, then crypto roaming is broken on RoS, and that is still wrong. There should be no loss of connectivity due to the peer changing IPs…
or I am out to lunch…
Something is wrong, that’s for sure.
But there might be another reason why the device isn’t able to get the keepalive packets through to the “server”.
The basis for all of this working transparently, as it is supposed to, is a fully functional connection to the internet, so the “server” can be reached. If that’s broken for one reason or another, there is nothing anyone can do.
Except use some workaround to kick things into gear again. Which is not ideal and not a permanent solution, true, but if it works, why not?
This shouldn’t happen. The problem with current RouterOS WG is when the remote endpoint is a hostname: if the hostname can’t be resolved when WG first tries to contact it, WG doesn’t try again later, so the tunnel stays down (because the other side can’t initiate a connection to a dynamic peer). If the tunnel was already up, the client can roam as much as it wants and there shouldn’t be any problem, because it already knows the server’s endpoint. Any communication from client to server will immediately update the client’s endpoint on the server.
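The hostname problem described above comes down to resolving once and never retrying. As a rough illustration of the missing step, the Python sketch below retries the lookup until it succeeds. It says nothing about how RouterOS is implemented. The function name `resolve_with_retry`, its parameters and the idea of injecting the resolver are assumptions made for the example.

```python
import socket
import time
from typing import Callable, Optional, Tuple


def resolve_with_retry(
    hostname: str,
    port: int,
    resolve: Callable = socket.getaddrinfo,
    retry_delay: float = 10.0,
    max_attempts: int = 0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Tuple[str, int]]:
    """Resolve an endpoint hostname, retrying on DNS failure.

    Returns (ip, port), or None if max_attempts lookups all failed.
    max_attempts=0 means "retry forever".
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            infos = resolve(hostname, port, type=socket.SOCK_DGRAM)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # the first two fields of sockaddr are the address and port.
            return infos[0][4][:2]
        except socket.gaierror:
            if max_attempts and attempts >= max_attempts:
                return None
            sleep(retry_delay)  # DNS not reachable yet; try again later
```

With a retry like this, a tunnel whose endpoint name cannot be resolved at boot (for example because the WAN link is not up yet) would still come up once DNS starts answering.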
In my case there is no hostname; we connect directly to the fixed IP address. Before I wrote the scripts to bring the interfaces down and up, the WireGuard connections dropped even when the client’s dynamic address had not changed. I don’t know where the issue comes from, but with the netwatch scripts the tunnel comes “alive” again.
I’m asking here for another solution because I think this is only a temporary one.
More or less. I think it isn’t working well. I have enabled persistent keepalive on both sides (on the Mikrotik and on both VPSs). Keep in mind that I have two VPSs running WireGuard, one in Spain and one in Germany. Connections fail on both VPSs, but not at the same time; it is random: sometimes one first, sometimes the other, sometimes both.
I did the configuration from scratch: upgraded the Mikrotik to 7.x and reset it to factory defaults with no default configuration. I configured it from scratch twice, with the same issue. WireGuard in the VMs is the latest version.
I will leave out routes, IP addresses and so on, because the issue is in WireGuard. I will post the WireGuard configuration and the firewall configuration; I think that is sufficient.