Azure to MikroTik IPsec site-to-site VPN painfully slow in one direction

I have an IKEv2 site-to-site VPN set up between an Azure VPN gateway and my CCR1009, using the standard Azure VPN parameters (SHA1, AES-256). The connection went live easily. I made sure to disable FastTrack, which I understand is a common source of IPsec problems. I also explicitly added Forward rules to allow traffic between the two subnets. My local connection is Gigabit fiber with a steady 17 ms ping to my Azure VM over the VPN.

Here is the strange issue: when I transfer files from my on-prem server to my Azure VM over the VPN, it’s smooth and fast; I get the full 100 Mbps rated speed of Azure’s Basic VPN tier.

But when I transfer files the other way, from my Azure VM to my on-prem server, the speed is unbearably slow and unstable: the transfer starts, then immediately stalls and gets stuck. Yet pinging/tracerouting my on-prem server from the Azure VM works normally.

Any advice on how to troubleshoot this would be greatly appreciated!

Maybe something MTU and PMTU related?
Try adding this to see if it fixes it:

/ip firewall mangle
add action=change-mss chain=forward new-mss=clamp-to-pmtu passthrough=yes \
    protocol=tcp tcp-flags=syn

WOW that instantly fixed it! Can you give me a super brief explanation (I’m a networking noob) on how this one change can make such a huge difference?

Also, does this affect my non-VPN traffic in any way? Should I put some restrictions on this rule, such as ipsec-policy etc.?

Finally, is this a RouterOS thing? I have SonicWalls etc. and their VPNs usually just work after entering the credentials; I never had to tinker with any of these low-level networking parameters.

Thank you so much!

The problem is that a VPN tunnel has a slightly smaller MTU (maximum packet size) than a plain ethernet connection, so the router behaves like a funnel that will not let packets that are too large through to the other side.
It should inform the sending system whenever a packet is too large, using an ICMP message; however:

  • sometimes those messages are firewalled by beginner network admins
  • sometimes the messages are processed but “forgotten” too soon (i.e. the next packet sent will be smaller, but then the sending system sends a full-size packet again, reducing the throughput)
  • sometimes, especially with technologies like IPsec, the sending system does not recognize the returned ICMP message as belonging to its connections

The mangle rule that I showed examines all “TCP session setup” packets and reduces the “maximum size” that is included there as a parameter to the max size the VPN allows.
This “hides” the problem because now the endpoint systems know how to behave. But it only works for TCP, not for UDP. So it would still be preferable to find and fix the actual problem.
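To answer your other question: the rule as given touches every forwarded TCP SYN, VPN or not, although clamp-to-pmtu should be harmless for traffic on a normal 1500-byte path. If you want to restrict it to tunnel traffic only, RouterOS firewall rules can match on the IPsec policy; a sketch (I have not tested this exact pair on your setup):

```routeros
/ip firewall mangle
add action=change-mss chain=forward new-mss=clamp-to-pmtu passthrough=yes \
    protocol=tcp tcp-flags=syn ipsec-policy=out,ipsec \
    comment="clamp MSS for traffic about to be IPsec-encrypted"
add action=change-mss chain=forward new-mss=clamp-to-pmtu passthrough=yes \
    protocol=tcp tcp-flags=syn ipsec-policy=in,ipsec \
    comment="clamp MSS for traffic that arrived via IPsec"
```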

In your case, it is apparently asymmetrical. One system knows how to behave and the other doesn’t. Why this exactly happens can usually only be found by tracing on the actual network.
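One quick probe you can do from the RouterOS terminal is a do-not-fragment ping toward the Azure VM (10.1.0.4 here is a placeholder for your VM’s address; note that RouterOS counts the IP header in size, so 1500 is a full-size ethernet packet):

```routeros
# Full-size packet: expect timeouts or "packet too large" if PMTUD is broken
/ping 10.1.0.4 size=1500 do-not-fragment
# Step the size down until replies come back; the largest size that
# still gets a reply is your effective path MTU through the tunnel
/ping 10.1.0.4 size=1400 do-not-fragment
```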

Also, some types of VPN do not exhibit this behavior because when a packet is too large they split it into two smaller packets and glue them back together at the other end.
This can sometimes be configured in the routers, and the default setting differs between routers.

Thank you for the detailed and informative explanation. I looked back at Azure’s documentation and found this paragraph that I ignored earlier:


In addition, you must clamp TCP MSS at 1350. Or if your VPN devices do not support MSS clamping, you can alternatively set the MTU on the tunnel interface to 1400 bytes instead.

I assume the mangle rule you provided does exactly this, and it’s actually required by Azure due to their network design? Also, the documentation only mentions TCP, not UDP. Should I be worried about the UDP side?

Or is it better to “set MTU on the tunnel interface to 1400”, and if so, how do I do that in RouterOS? (Does it mean setting the WAN Ethernet interface MTU to 1400 from 1500, which I assume would reduce my Gigabit LAN/WAN speed?)

Thanks again!

Well, when you configure a site-to-site IPsec VPN there is no explicit tunnel interface, so there is nowhere to set the MTU.
This is one reason why I prefer not to use that configuration, but instead recommend some type of explicit tunnel (like GRE or IP tunnel)
with IPsec underneath. Then you have a tunnel interface with an explicit MTU (it is automatically calculated, but you
can manually lower it even further when there are limitations that RouterOS cannot know about, like a PPPoE connection somewhere along
the path), and the chances that the ICMP “size exceeded” message is generated correctly (so the sender understands it) are much better.
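For reference, a minimal sketch of that style of config between two MikroTiks (all addresses and the key are placeholders, and whether Azure accepts GRE at all is a separate question):

```routeros
# GRE tunnel with IPsec transport-mode encryption set up underneath
/interface gre add name=gre-remote local-address=203.0.113.2 \
    remote-address=198.51.100.1 ipsec-secret=ChangeMe
# Lower the tunnel MTU explicitly if the auto-calculated value is still too big
/interface gre set gre-remote mtu=1400
# Address the tunnel and route the remote subnet through it
/ip address add address=172.16.0.1/30 interface=gre-remote
/ip route add dst-address=10.1.0.0/16 gateway=172.16.0.2
```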

However, I don’t know Azure, so I do not know whether they support that config. I have only configured it with routers (e.g. MikroTiks) at both
sides of the connection. Another advantage is that you decouple the routing from the IPsec, so you can run a dynamic routing protocol to
insert the routes into the table and make the subnets at both sides route to each other properly, instead of having to hardwire that into
the IPsec policies.

When your only application is to make these two systems talk to each other for e.g. file transfers, the MSS clamping solution is likely to
work just fine.
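If you want to follow Azure’s documented value literally instead of relying on clamp-to-pmtu, the same mangle action also takes a fixed MSS. A sketch (your subnets will differ; the tcp-mss matcher just avoids rewriting connections that already negotiated a smaller value):

```routeros
/ip firewall mangle
add action=change-mss chain=forward new-mss=1350 passthrough=yes \
    protocol=tcp tcp-flags=syn tcp-mss=1351-65535 \
    src-address=192.168.88.0/24 dst-address=10.1.0.0/16 \
    comment="clamp MSS to 1350 for Azure S2S traffic"
```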