Wireguard does not work after reboot

Good afternoon
I configured Wireguard between two Mikrotik routers:
Router-A:
CCR2004-1G-2XS-PCIe v7.15.2
To access the Internet, a pppoe connection and a static IP address are used.

/interface wireguard
add comment=Router-B listen-port=52203 mtu=1420 name=wireguard203
/interface wireguard peers
add allowed-address=10.0.203.1/32,224.0.0.5/32 endpoint-address=\
 Router-B_ip endpoint-port=52203 interface=wireguard203 name=Router-B \
 persistent-keepalive=25s preshared-key=\
 "preshared-key" public-key=\
 "public-key"
/ip address
add address=10.0.203.2/30 comment=Router-B interface=wireguard203 network=\
 10.0.203.0

Router-B:
RB450Gx4 v7.15.2
To access the Internet, an IPoE connection and a static IP address are used.

/interface wireguard
add comment=Router-A listen-port=52203 mtu=1420 name=wireguard203
/interface wireguard peers
add allowed-address=10.0.203.2/32,224.0.0.5/32 interface=wireguard203 \
 is-responder=yes name=Router-A persistent-keepalive=25s \
 preshared-key="preshared-key=" private-key=\
 "private-key" public-key=\
 "public-key="
/ip firewall filter
add action=accept chain=input comment=Router-A dst-port=52203 protocol=\
 udp

With these settings it works, ping passes.
The problems start after rebooting Router-A. The only fix I have found is to change the port from 52203 to any other, for example 52204. After the next reboot it stops working again until the port is changed once more.
Has anyone had this problem?
Why might this happen?

Sorry, don’t know.

However, I would turn off persistent-keepalive on Router-B.
Perhaps trying to connect back to the IP/port it was last connected to is interfering.

Also, you can check the counters on the firewall rule on Router-B, and see if packets are actually getting in, enable logging on the firewall rule.
See where they are coming from.

Perhaps for more detailed logging
You could enable debug logging for wireguard under system/logging
You could create a mangle passthrough firewall rule, that counts (and perhaps briefly log) every packet coming into port 52203.
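
Something along these lines, for example. This is only a sketch for Router-B, not tested on 7.15.2; the log prefix and the rule comment are made-up names, and the last command assumes the accept rule still carries the comment "Router-A" as in the export above.

```routeros
# Log WireGuard debug messages to memory (read them with /log print).
/system logging add topics=wireguard,debug action=memory

# Count and log every UDP packet arriving for port 52203 without altering it.
# "wg-in" is just an example prefix.
/ip firewall mangle add chain=prerouting protocol=udp dst-port=52203 \
    action=passthrough log=yes log-prefix="wg-in" comment="WireGuard debug counter"

# Show the packet/byte counters of the existing accept rule.
/ip firewall filter print stats where comment="Router-A"
```

Remove the mangle rule and the logging entry once you are done, since logging every packet fills the log quickly.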

Another slight possibility:
Router-B may be rejecting the handshake because Router-A is still on the same port, but the WireGuard timestamps from Router-A went backwards after the reboot.
Perhaps the WireGuard interface on Router-A needs a restart after Router-A has obtained the correct time.
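
Background: a WireGuard handshake initiation carries a timestamp, and the responder ignores initiations whose timestamp is not newer than the last one it accepted from that peer. A rough way to test this idea on Router-A is sketched below. It assumes an NTP client is configured and that its status reads "synchronized" once the time is set; I have not verified this on 7.15.2.

```routeros
# Check the clock and whether NTP has synchronised.
/system clock print
/system ntp client print

# Wait up to about 5 minutes for NTP, then restart the tunnel interface.
:local tries 0
:while (([/system ntp client get status] != "synchronized") && ($tries < 60)) do={
  :delay 5s
  :set tries ($tries + 1)
}
/interface wireguard disable wireguard203
:delay 2s
/interface wireguard enable wireguard203
```

If the tunnel comes up after this restart but not before it, the clock is a likely suspect; if nothing changes, this theory can probably be dropped.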

Thanks for your reply!

I disabled persistent-keepalive on Router-B and changed the port on Router-A to 13231; pings pass. After I reboot Router-A, pings no longer pass, although the Firewall and Mangle counters on Router-B show that packets are arriving:

Router-A

wireguard info:
wireguard203: [Router-B] public_key: Handshake for peer did not complete after 5 seconds, retrying (try 2)

wireguard debug:
wireguard203: [Router-B] public_key: Sending handshake initiation to peer (Router-B_ip:52203)

Router-B

wireguard debug:
wireguard203: [Router-A] public_key: Sending handshake response to peer (Router-A_ip:13231)

firewall info:
prerouting: in:ether1 out:(unknown 0), connection-state:established src-mac 08:96:ad:35:ca:d6, proto UDP, Router-A_ip:13231->Router-B_ip:52203, len 176

firewall info:
input: in:ether1 out:(unknown 0), connection-state:established src-mac 08:96:ad:35:ca:d6, proto UDP, Router-A_ip:13231->Router-B_ip:52203, len 176

Added a similar rule to Mangle on Router-A:

mangle rule changed by winbox-3.40/tcp-msg(winbox):user@ip (/ip firewall mangle set *1 action=accept chain=prerouting disabled=no log=yes log-prefix="" protocol=udp src-port=52203)

Counter = 0

Would need to see full configs of both devices…

There is another similar report about recent 7.15 version:
http://forum.mikrotik.com/t/wireguard-link-on-7-15-gets-stuck-after-peer-was-down-a-ping-or-cycling-the-peer-will-unstuck-it/177105/1

Is this a brand new setup, or is it something that used to work and stopped working with 7.15(.2)?

Thanks everyone for the answers!
I post the configurations:
Router-B.rsc (14 KB)
router-A.rsc (10.9 KB)

The problem only occurs between Router-A and Router-B.

I have another router that connects to the Internet via a USB modem and establishes a similar WireGuard tunnel to Router-B; there are no problems with it. This connection works even after a reboot.

There is also a fourth router, which is located behind NAT and also establishes a WireGuard tunnel to Router-B. No problem there either.

Could there be a problem with pppoe on Router-A?

I found out empirically that if you turn off the interfaces and peers on both routers for 10-15 minutes and then turn them on, everything works.

Possibly. Are you receiving a non-routable IPv4 address over PPPoE?

I think I have seen something similar in the past: if you turned off the WireGuard interface and then turned it back on fairly soon after, it didn’t seem to reset to its defaults properly. It seemed to remember at least some of the running state it had before it was turned off.
If you left it off for a while (I don’t know how long), it did seem to reset properly.
Maybe it is supposed to work this way?

I haven’t looked at this for a long time now.
Maybe it would be nice to have a cold-restart button or setting for the whole WG interface and/or for individual peers.

yes

But connections to other routers work stably. The settings are similar.

a) Changing the port helps
b) Idling for some time helps
At this point it looks like a connection tracker with an old connection stuck in it.
c) Non-routable IP address
There’s surely a connection tracker - your ISP does NAT, and in most cases that requires a connection tracker.
d) Is the non-routable IP address new each time you connect PPPoE?
If the address is new every time, and older connections are not purged from the NAT connection tracker when you reconnect PPPoE (they should be purged), the connection tracker keeps using an old NAT mapping. It redirects the packets destined for the WG port on your routable static IP address to your old non-routable IP address, so the packets can’t reach your device (and “Counter = 0” confirms it). Changing the port bypasses the old connection, and idling lets the connection time out.

If that’s the case, it’s unfortunately not under your control. And getting the ISP to fix their NAT could be next to impossible.
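
Before blaming the ISP, you can at least rule out the connection trackers on your own routers. A minimal sketch, to be run on Router-A and Router-B; it assumes connection tracking is enabled there and simply matches on the WG port 52203:

```routeros
# List tracked UDP connections that involve port 52203.
/ip firewall connection print where protocol=udp dst-address~":52203"

# Remove them so the next handshake creates a fresh entry.
/ip firewall connection remove [find protocol=udp dst-address~":52203"]
```

If the tunnel still does not come up after the entries are cleared on both sides, the stale state is most likely somewhere upstream.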

Thanks for your reply!
The IP address is always the same because I activated the Public IPv4 service. In that case there shouldn’t be any NAT.

My ISP fiber connection:

Could this be a problem? Maybe an ISP GPON terminal?
mikrotik-isp.jpg

Perhaps I should use a different VPN connection between these routers? For example, IPSec + L2TP?

The problem was solved using a script on both routers:

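# Ping the far end of the tunnel (10.0.203.1 from Router-A; use 10.0.203.2 on Router-B).
:local pingresultA [/ping 10.0.203.1 count=5];
# /ping returns the number of replies received; 0 means the tunnel is down.
:if ($pingresultA = 0) do={
  /interface/wireguard/disable wireguard203;
  # Select the peers by interface; item numbers such as 0 are not reliable in scripts.
  /interface/wireguard/peers/disable [find interface=wireguard203];
  # Keep the tunnel down for 10 minutes so that the stale state can expire.
  :delay 600s;
  /interface/wireguard/enable wireguard203;
  /interface/wireguard/peers/enable [find interface=wireguard203];
};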

Helps after reboot and PPPoE interruptions
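
One way to run a script like this periodically is the scheduler. A minimal sketch, assuming the script is saved under /system script as "wg-watchdog"; both names below are examples, not taken from the posted configs:

```routeros
# The interval is longer than the 600s delay in the script, so runs do not overlap.
/system scheduler add name=wg-watchdog-run interval=15m start-time=startup \
    on-event="/system script run wg-watchdog"
```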

Well, rather than a solution it is a workaround (an ugly one, if you’ll allow me). If it takes 600s to execute, that is better than no connection, but still …

In an earlier post you mentioned that changing the port to another and then restoring the original port worked to re-establish the connection, maybe doing that in the script would take less than 10 minutes?
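
A rough, untested sketch of that idea, for Router-A only (Router-B is the responder and its firewall rule expects port 52203). Alternating between 52203 and 52204 is my assumption; it may not help if the state for the previous port has not yet expired.

```routeros
:local wg "wireguard203"
# /ping returns the number of replies; 0 means the tunnel is down.
:if ([/ping 10.0.203.1 count=5] = 0) do={
  :local current [/interface wireguard get [find name=$wg] listen-port]
  # Switch to the other of the two ports so the traffic looks like a new connection.
  :if ($current = 52203) do={
    /interface wireguard set [find name=$wg] listen-port=52204
  } else={
    /interface wireguard set [find name=$wg] listen-port=52203
  }
}
```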

In the near future the equipment will move to another location, where PPPoE will no longer be used.
Hopefully the problem really is on the ISP side and the move will resolve it.