Wireguard connection "issues"

donsergio · March 21, 2022, 8:16am

Hi, I have a WireGuard server in a datacenter (static IP) and a WireGuard client at home on a CCR (dynamic IP, behind NAT)… I have enabled persistent keepalive because I'm behind NAT.

As we know, WireGuard is a stateless connection and I'm experiencing connection issues… maybe the connection is lost when my home ISP changes my IP…

I have configured a netwatch script that pings the remote tunnel IP and, when it detects that it is down, disables the WireGuard interface, waits 30 seconds and enables it again… It works, but for example yesterday my home internet connection was down for 2 hours, and when it came back up the script had only reset the interface once… it did not keep trying until the tunnel came up…

Is there any way to prevent this, or to solve this “issue”?

This is the script:

/tool netwatch
add down-script="/interface/wireguard/disable 0\r\
    \n:delay 30\r\
    \n/interface/wireguard/enable 0\r\
    \nglobal telegramMessage \"Warning: Wireguard tunnel reset\"\r\
    \n/system script run script-telegram-alerts" host=192.168.2.1 timeout=6s

Thanks!!

anav · March 21, 2022, 1:31pm

Do you use a dyndns URL for the server?
It shouldn't matter if the home IP changes, as that is not the server.

Would need to see the wireguard settings for each end…

holvoetn · March 21, 2022, 3:07pm

I trust the IP you’re monitoring is ‘on the other side’ (the side with fixed IP) ?

You should toggle the PEER status, not the Wireguard status for this trick to work.

So that script should be (assuming there is only 1 peer definition):

/tool netwatch
add down-script="/interface wireguard peers disable 0\r\
    \n:delay 30\r\
    \n/interface wireguard peers enable 0\r\
    \nglobal telegramMessage \"Warning: Wireguard tunnel reset\"\r\
    \n/system script run script-telegram-alerts" host=192.168.2.1 timeout=6s

And a 5-second delay is enough.
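
On the "only reset once" part: as far as I know, netwatch runs the down-script on the transition from up to down, not repeatedly while the host stays down, so a single failed reset is never retried. One way around that is a scheduler-driven check that keeps toggling the peer for as long as the ping fails. This is a minimal, untested sketch, not a confirmed fix. The script name wg-watchdog, the scheduler name wg-watchdog-run, the 1m interval and the interface name wg1 are assumptions for illustration; 192.168.2.1 is the tunnel IP used in the thread. It selects the peer with find rather than item number 0, so it does not depend on print order.

```routeros
# Body of a hypothetical script saved as "wg-watchdog" under /system script.
# Assumed name: wg1 (WireGuard interface). 192.168.2.1 is the tunnel IP from the thread.
:local tunnelIp 192.168.2.1
:local wgIface "wg1"

# /ping returns the number of replies received; 0 means nothing came back.
:if ([/ping $tunnelIp count=3] = 0) do={
    # Toggle the peer (not the interface), as suggested above.
    /interface wireguard peers disable [find interface=$wgIface]
    :delay 5s
    /interface wireguard peers enable [find interface=$wgIface]
    :log warning "wg-watchdog: peer on $wgIface toggled"
}

# Hypothetical scheduler entry, entered once from the terminal (not part of the script body):
# /system scheduler add name=wg-watchdog-run interval=1m on-event=wg-watchdog
```

Because the scheduler runs the check every interval, the toggle repeats until the ping succeeds, which is the behaviour the netwatch version lacks after a long outage.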

anav · March 21, 2022, 3:16pm

but why??
This is wrong!!!

The issue with not reconnecting occurs when the ENDPOINT (the SERVER) changes its IP address and a dyndns name or mynetname is used for that server endpoint.

In this case the SERVER is fixed and does not change.
The OP also noted he has selected keepalive on the client…
THUS according to WIREGUARD ROAMING, the SERVER & CLIENT keep each other apprised of any changes based on the last connection.
Therefore, although the server may no longer have the correct IP for the client, the CLIENT WG will have the correct IP and port for the server, and on the next keepalive packet it will UPDATE the WG server with the correct settings, so FULL connectivity should be re-established.

In other words, the longest delay after a change should be one keepalive cycle, or perhaps two.

Or are we saying that after a change of ISP address the router may take so long to acquire a new address that several keepalive cycles pass and the WireGuard protocol stops trying to contact the server ???

holvoetn · March 21, 2022, 3:36pm

As far as I see it, it’s the other way around here but your reasoning is correct.

The “server” is in this case behind a fixed IP. Nothing changes there. It’s sitting nicely on its tower, which doesn’t move.
The “client” is behind a dynamic IP and that’s why the keep-alive is needed, to notify the “server” if connection details have been changed.
Wireguard handles that beautifully (tested it already multiple times using cell phone moving abroad from ISP to ISP, so each time a new CGNAT-IP, bar some minor disruptions to handle the take-over, it recovers nicely).

But if for some reason or the other the “client” gets disconnected and the keep-alive is not passing anymore, toggling the peer status will make sure it makes connection again with the server, which hasn’t moved a bit.
Toggling on the “server” side is pointless, it doesn’t know where to go to (and if it’s a “client” behind CGNAT, it will not even be ABLE to pass) so it simply waits for something to come in.
Then the tunnel is made and we’re back in business.

Similar to dynamic dns and startup but just a bit different

That’s how I see it.

anav · March 21, 2022, 4:58pm

I was referring to (6) MY NETNAME found here https://forum.mikrotik.com/viewtopic.php?t=182340

If what you're saying is true, then crypto roaming is broken on RoS and it's still wrong. There should not be a loss in connectivity due to the peer changing IPs…
or I am out to lunch…

holvoetn · March 21, 2022, 6:43pm

Something is wrong, that’s for sure.
But there might be another reason why the device isn’t able to get the keepalive packets to the “server”.
The basis for all of this doing what it is supposed to do transparently is a fully functional connection to the internet, so the “server” can be reached. If that’s broken for one reason or another, there is nothing anyone can do.
Except use some workaround to kick things into gear again. Which is not ideal and not a permanent solution, true, but if it works, why not ?

Sob · March 21, 2022, 7:03pm

This shouldn’t happen. The problem with the current RouterOS WG is when the remote endpoint is given as a hostname: if the hostname can’t be resolved when WG first tries to contact it, it doesn’t try again later, so the tunnel stays down (because the other side can’t initiate a connection to a dynamic peer). If the tunnel was already up, the client can roam as much as it wants and there shouldn’t be any problem, because it already knows the server’s endpoint. And any communication from client to server will immediately update the client’s endpoint on the server.
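
For the hostname case described here, a workaround that is sometimes used is to resolve the name from a scheduled script and write the result into the peer. A minimal sketch under stated assumptions: the hostname vpn.example.com and the interface name wg1 are hypothetical, and the script is meant to be run periodically from /system scheduler. It does not apply to a setup where the endpoint is already a fixed IP.

```routeros
# Sketch only. Hypothetical values: vpn.example.com, wg1.
:local serverName "vpn.example.com"
:local wgIface "wg1"

:do {
    # :resolve raises an error when the lookup fails, hence the on-error guard.
    :local serverIp [:resolve $serverName]
    /interface wireguard peers set [find interface=$wgIface] endpoint-address=$serverIp
} on-error={
    :log warning "wg-endpoint: could not resolve $serverName, will retry on next run"
}
```

Note that this rewrites endpoint-address on every successful run, even when the address has not changed.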

donsergio · March 24, 2022, 6:23pm

In my case there is no hostname… we connect straight to the fixed IP address… Before I wrote the scripts to take the interfaces down and up, the WireGuard connections dropped even when the client's dynamic address had not changed… I don't know where the issue comes from, but with the netwatch scripts the tunnel comes “alive” again…

I'm asking here to look for another solution, because I think this is only a “temporary” solution…

anav · March 24, 2022, 6:37pm

It's as if the persistent keepalive is not working???
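
One way to check whether handshakes and keepalives are actually happening is to look at the peer's read-only status fields on the MikroTik. A small sketch, assuming a hypothetical interface name wg1 with a single peer:

```routeros
# Sketch only; wg1 is a placeholder interface name.
# Shows last-handshake, current-endpoint-address/port and the rx/tx counters per peer.
/interface wireguard peers print detail where interface=wg1

# The same values for use in a script or a log entry (assumes exactly one peer on wg1):
:local peerId [/interface wireguard peers find interface=wg1]
:put [/interface wireguard peers get $peerId last-handshake]
:put [/interface wireguard peers get $peerId rx]
:put [/interface wireguard peers get $peerId tx]
```

If last-handshake keeps growing past a few minutes while tx increases and rx does not, packets are leaving the router but nothing is coming back, which points away from the keepalive setting itself.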

donsergio · March 24, 2022, 6:41pm

More or less… I think it isn't working well… I have enabled persistent keepalive on both sides (on the MikroTik and on both VPSs). Keep in mind that I have two VPSs with WireGuard, one in Spain and one in Germany… Connections to both VPSs fail, but not at the same time… it's random… sometimes one first, sometimes the other, sometimes both…

anav · March 24, 2022, 6:43pm

Do you mean two connections from the same Mikrotik Router/Device??

If so please post config
/export file=anynameyouwish

donsergio · March 24, 2022, 6:52pm

Yes, I have established two WireGuard connections from the MikroTik to two VPSs.

holvoetn · March 24, 2022, 7:09pm

How did your device get to the current version ?
Upgrade after upgrade and so on ?

Long shot, if yes:
Did you already try to clean install that device, taking over config from export right before reset ?

donsergio · March 24, 2022, 7:22pm

I did the configuration from scratch: upgraded the MikroTik to 7.x and reset it to factory defaults with no default configuration… I configured it from scratch twice and got the same issue… WireGuard on the VMs is the latest version.

anav · March 24, 2022, 7:34pm

Please provide the MikroTik config.

/export file=anynameyouwish

holvoetn · March 24, 2022, 8:27pm

Clear. Then we can rule out rogue leftover settings as well.

As anav indicated, it might help to have a look at your config.
Make sure to edit out any leftovers of sensitive information.

donsergio · March 25, 2022, 12:44pm

I will leave out routes, IP addresses and so on; because the issue is in WireGuard, I will post the WireGuard configuration and the firewall configuration… I think that's sufficient…

# mar/25/2022 13:37:34 by RouterOS 7.1.3
# software id = xxxxxxxx
#
# model = RB1100Dx4
# serial number = xxxxxxxx
/interface wireguard
add listen-port=13231 mtu=1420 name=vps1
add listen-port=13230 mtu=1420 name=vps2
/interface wireguard peers
add allowed-address=0.0.0.0/0,::/0 endpoint-address=vps1-ip endpoint-port=51820 interface=vps1 persistent-keepalive=25s \
    public-key="publickey-vps1"
add allowed-address=0.0.0.0/0,::/0 endpoint-address=vps2-ip endpoint-port=51820 interface=vps2 persistent-keepalive=25s \
    public-key="publickey-vps2"

/ip firewall filter
add action=accept chain=input disabled=yes protocol=icmp
add action=accept chain=input dst-port=13230 protocol=udp src-address=vps2-ip
add action=accept chain=input dst-port=13231 protocol=udp src-address=vps1-ip

The rest of the configuration is routes that do not affect WireGuard operation…

Znevna · March 25, 2022, 12:49pm

We don’t see the IP addresses set for those two wireguard interfaces.
We don’t see the routes set for those two wireguard interfaces.
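
The missing parts can be exported per menu instead of posting the whole config. A short sketch, using the interface names vps1 and vps2 from the export above; the regex filters are an assumption about how the entries are named and may need adjusting:

```routeros
# Export only the missing sections (printed to the terminal unless file= is given).
/ip address export
/ip route export

# Or list just the entries that involve the two WireGuard interfaces:
/ip address print where interface~"vps"
/ip route print where gateway~"vps"
```

Public IPs should still be edited out before posting.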

holvoetn · March 25, 2022, 1:15pm

Exactly.
Complete config please.

You may think something is not relevant while it very much may be.