Problem reaching some websites

Hello, I don't know what to do… For the past 3 weeks I have not been able to reach some pages. For example, activesync.t-online.de via Outlook (Gmail is reachable), or one bank account via HBCI (another one via HBCI is reachable with the same app), or the streams on Twitch (I see the chats but the streams don't start). It feels like I cannot reach around 1% of sites… but they are always the same ones.

On my MikroTik I have 4 “different” networks (two hotspot networks, another network for some other users, and my local network).

Twitch, for example, is reachable from the other network; activesync.t-online.de is reachable over the other network but not over the hotspot networks…

I have quite a few firewall rules, but I don't see why these sites are not reachable in some networks.

Does anyone have an idea how I can test and look for the issue?

Thanks!

Olli

This kind of problem is usually caused by “block all ICMP” in the firewall, e.g. after reading blog postings from wannabe security experts.
Make sure you pass ICMP, even when those guys tell you to block it. They are wrong.
(It should be sufficient to allow established/related when ICMP is not explicitly blocked.)
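
A minimal sketch of what “allow established/related” could look like, assuming an otherwise default-style filter setup. The comments are mine, and the rules have to sit above any drop rule in the same chain to take effect. ICMP errors that refer to an existing connection (including “fragmentation needed”) are tracked as related, which is why this is normally enough:

```routeros
/ip firewall filter
# Accept replies and ICMP errors that belong to known connections.
# Must be placed before any drop rule in the same chain.
add action=accept chain=input connection-state=established,related \
    comment="accept established,related"
add action=accept chain=forward connection-state=established,related \
    comment="accept established,related"
```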

I have a hEX with a main VLAN (1) and a guest VLAN (20).
For some reason guests could not reach a handful of websites, like netflix.com, some Apple sites and more.
This started some weeks ago, and I did not know anything about it before today.
So what happened some weeks ago?
I upgraded from 6.43.4 to 6.43.16 (21 Jun 2019).
Solution: upgrade from 6.43.16 to 6.44.5 (20 Jul 2019).

I am not 100% sure that was the problem, or whether just a reboot would have fixed it, but now that I have seen this post, I see that others have the same problem.

PS: no config was changed, just the RouterOS upgrade.

The problem I mentioned has to do with MTU size discovery. Sometimes it works, sometimes it doesn’t.
The problem can be inside your network/router or it can be elsewhere along the path.
Some people block all ICMP after they have read some clueless advice from people like Gibson, and they cause such issues.
Other times MikroTik breaks the automatic MTU clamping in PPPoE and suddenly it does not work while it worked in a previous version.
You can try adding this to the configuration:

/ip firewall mangle
add action=change-mss chain=forward new-mss=clamp-to-pmtu passthrough=yes \
    protocol=tcp tcp-flags=syn

If it fixes your problem, it means there is something wrong somewhere else, but at least you have worked around that.
In severe cases you could even try something like new-mss=1400 instead of new-mss=clamp-to-pmtu.
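
For reference, a sketch of the fixed-value variant, assuming 1400 as the target. The tcp-mss matcher is my addition (it is not part of the rule above); it keeps the rule from raising the MSS of connections that already announce a smaller value:

```routeros
/ip firewall mangle
# Only touch SYN packets that announce an MSS above 1400.
add action=change-mss chain=forward new-mss=1400 passthrough=yes \
    protocol=tcp tcp-flags=syn tcp-mss=1401-65535
```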

So you are saying this rule in my router is the root cause?

/ip firewall filter
add action=drop chain=input comment="Drop ICMP on outside IF" in-interface=ether1 protocol=icmp

But how come one VLAN is OK and the other is not?
Why did a firmware upgrade solve the problem?

That rule is causing problems when there is a lower MTU further down the path. You should not have such a broad ICMP blocking rule.
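
If the intention of that rule was only to stop the router from answering pings from the internet (an assumption on my side), a narrower sketch would drop just echo requests and leave the other ICMP types alone, so “fragmentation needed” messages still get through:

```routeros
/ip firewall filter
# icmp-options is type:code; 8:0 is echo request.
add action=drop chain=input comment="Drop ping on outside IF" \
    in-interface=ether1 protocol=icmp icmp-options=8:0
```

This would replace the existing drop rule rather than be added next to it.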

That does not explain why things stopped working, or why an upgrade (or reboot) solved the problem.
I added logging to the rule, so I will have a look in Splunk to see who hits this rule, and when.

The reason can be that the above mentioned mangle rule in some versions is implicit part of PPP interfaces, and in some releases this feature is broken.
These problems especially affect internet connections via PPPoE and without RFC4638 support.
Luckily I have PPPoE with RFC4638 so no problems here. But I do not block ICMP either.

At the moment I am testing NordVPN; before that I used other VPN providers. MTU was sometimes a problem, but I could always get by without any changes to the MTU. Using NordVPN in the same configuration as with PureVPN (IKEv2), I could not reach some sites: the connection stalled while fetching the TLS certificates and then timed out.

I tried clamp-to-pmtu with no result, and revisited it because I wanted to test further. I manually changed the MTU, and at 1398 it started to work again. For now I apply it only to TCP/443 as a test, and when I run a speedtest all goes fine. However, I think the whole of TCP (SYN) then also needs to be lowered to 1398. I have an RFC4638 PPPoE, which is on the gateway router. I use two routers in sequence to build the config with source IP for NAT IKEv2.

Sites that did not load with NordVPN include, for example, pi-hole.nl and antary.de.

Is there a way to use NordVPN without having to lower the MTU in mangle?

Every VPN adds header overhead so it decreases the MTU. When it offers full 1500-byte MTU it uses fragmentation, which is even worse.
When you use a VPN, always adjust your MTU to the appropriate value.
When there are clueless operators on the path that block ICMP (or the software at the VPN provider itself is broken), you may also need to reduce the MSS.
That is what the mangle rule is doing.

Thank you! This solved my problem :smiley:

That is good, but please understand that this means there is an error somewhere else, and that you have now fixed it only for TCP.
Other traffic (UDP and more) could still be dropped for being too large. A real solution would be to find the bad MTU setting and/or the bad ICMP drop rule.

Where could the problem lie? I noticed it with NordVPN; PureVPN did not show that problem. One thing I remember from doing the speedtest (xs4all) with PureVPN is that the upload did not always start, and it even gave a timeout.

For now I have limited the MTU only for NordVPN and not for PureVPN, and they run side by side.

Situation:

Now: Mangle Routing Mark → Nat set src-address → Route → Second router → Nat set src-address (IKEv2) → route to PPPoE.
Next step : Mangle Connection Mark → Nat set src-address (IKEv2) → Route → Second router → route to PPPoE.

At each step in the chain you need to make sure that:

  • the (virtual) interfaces in that chain have the correct MTU
  • ICMP is allowed everywhere (both directions)

Even then it can fail, because other (clueless) admins may block ICMP to their servers.
Also, sometimes servers honor the ICMP “packet too big” message and send a smaller packet, but they fail to remember this and the next packet is again sent full-size.
This results in drastically lower throughput. For example, the MikroTik download server had that problem when I last checked it. Firmware downloads were very slow when there was a VPN somewhere along the path.

The “change mss” mangle rule works around that by telling the other side to send smaller packets all the time (for the whole connection, that is).
However, it only affects TCP. So other protocols will still fail unless the root cause is fixed.
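
To check the first point in the list above, one option is to ping through each step with fragmentation disallowed and see at which size replies stop. A sketch from the RouterOS terminal, where 192.0.2.1 is only a placeholder for a host behind the link being tested; as far as I know, size here is the whole IP packet, not just the payload:

```routeros
# Should get replies if the path carries full 1500-byte packets.
/ping 192.0.2.1 size=1500 do-not-fragment count=3
# If not, lower the size step by step until replies come back.
/ping 192.0.2.1 size=1400 do-not-fragment count=3
```

The largest size that still gets replies is the usable MTU for that path, provided nothing on the way drops the ping itself.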

Thanks. When I ping from my PC, the normal (non-VPN) connection has a 1472 MTU and both VPN connections a 1410 MTU. So I set the rule to 1410, and then I have problems. Only when I go down to MTU 1398 for NordVPN can I connect.

I think nothing is left but to use Wireshark again to see what is different in NordVPN traffic compared to PureVPN.

Note that MTU and MSS are not the same thing!
When you set MSS, you cannot set it to MTU but you need to set it at least 40 bytes lower.
The clamp-to-pmtu option already subtracts the correct amount (it could be 44 or 48 as well depending on the TCP options used), but when you set a manual value you need to calculate it.
It never hurts to set it too low (not drastically too low) so when in doubt set it to some “nice” value like 1280 or 1024.
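
As a worked example of that calculation, assuming the 1472 and 1410 figures mentioned above are ping payload sizes measured on a PC (so 28 bytes of IPv4 and ICMP header come on top) and that the TCP header carries no options:

```routeros
{
    # Largest ping payload that passed through the VPN (assumed meaning of "1410 MTU").
    :local pingPayload 1410
    # Add IPv4 header (20) + ICMP header (8) to get the path MTU.
    :local pathMtu ($pingPayload + 28)
    # Subtract IPv4 header (20) + TCP header without options (20) to get the MSS.
    :local mss ($pathMtu - 40)
    :put $pathMtu
    :put $mss
}
```

Under those assumptions this prints 1438 and 1398, which would explain why 1410 as new-mss is too high and 1398 works.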

That rule is on the input chain, so it might only affect your router itself. Do you have a similar rule in the forward chain?

No, only one ICMP rule.
What is strange is that upgrading from 6.43.16 to 6.44.5 resolved the problem.

That is not strange. As I wrote, there are hidden places in RouterOS where the rule that I posted is applied to traffic (not in the firewall mangle but in the protocol/device driver).
There also have been times that some configurations inserted such a rule in the mangle table (dynamically) and you could actually see it.
However this thing sometimes works and sometimes doesn’t (in subsequent versions). You can sometimes see it mentioned in the release notes.

Hello guys,
I used to face the same problem as the topic starter.
I cannot access some websites like Twitch, Outlook, APKNite,… and I tried to upgrade my RouterOS.
So now I have an issue upgrading RouterOS from 6.3x to 6.4x. I’ve followed many guides I found on Google, yet no luck. Upgrade/install, download and manual reboot are also not working; when rebooting, it gets stuck and does not boot up. So I need to unplug and replug the power for the router to boot, yet the old firmware is still there. Any idea what I’ve done wrong?