nmt1900
Frequent Visitor
Topic Author
Posts: 83
Joined: Wed Feb 01, 2017 12:36 am

Path MTU discovery problems with IPv6 on PPPoE

Fri Jul 26, 2024 3:38 pm

My ISP got IPv6 up and running a few weeks ago, and I managed to get it working on my network. I have a PPPoE connection and 3 VLANs on the LAN. All devices run v7.15.3.

Basic setup is here
/ipv6 settings set accept-redirects=no accept-router-advertisements=yes
/ipv6 dhcp-client add interface=pppoe-wan-eth1 pool-name=pd1 prefix-hint=::/56 rapid-commit=no request=prefix use-peer-dns=no

/ipv6 address add address=::1 from-pool=pd1 interface=v80
/ipv6 address add address=::1 from-pool=pd1 interface=v10
/ipv6 address add address=::1 from-pool=pd1 interface=v99

/ipv6 nd set [ find default=yes ] disabled=yes
/ipv6 nd add dns=2620:fe::fe interface=v10 ra-lifetime=2h
/ipv6 nd add interface=v99 ra-lifetime=2h
/ipv6 nd add dns=2620:fe::fe interface=v80 ra-lifetime=2h
/ipv6 nd add advertise-dns=no interface=pppoe-wan-eth1 ra-lifetime=none
The IPv6 firewall is mostly the default MikroTik configuration. No ICMP filtering happens in between, so that cannot be the cause of my problem.

Now to the problem itself... While most of the IPv6-enabled internet works OK, I stumbled on a problem with some Microsoft services, which were inaccessible when IPv6 was enabled. There may be more of them, but my focus was on packages.microsoft.com, appsource.microsoft.com and entra.microsoft.com.

At first I ran tests on ipv6-test.com, test-ipv6.com and on the dedicated ICMP blackhole test page icmpcheckv6.popcount.org. All of these returned OK results, including the "large packet send" test, so it did not seem to be a general ICMP filtering problem. ICMP "packet too big" messages appeared in my firewall log as well.

Then I tried some tracepath testing (the last one is one of the test-ipv6.com nodes)

Initially all looked OK (even if somewhat patchy)
$ tracepath -6 dns.quad9.net
 1?: [LOCALHOST]                                           0.009ms pmtu 1500
 1:  2001:xxxx:2300:b801::1                                0.839ms
 1:  2001:xxxx:2300:b801::1                                0.778ms
 2:  2001:xxxx:2300:b801::1                                0.704ms pmtu 1492
 2:  2001:xxxx:ffff:f232::232                              1.319ms
 3:  2001:xxxx:ffff:1070::205                              1.906ms
 4:  2001:7f8:50:1:0:2a:0:130                              1.814ms
 5:  2620:fe::fe                                           2.007ms !A
     Resume: pmtu 1492

~$ tracepath -6 ipv6-test.com
 1?: [LOCALHOST]                                           0.021ms pmtu 1500
 1:  2001:xxxx:2300:b801::1                                0.781ms
 1:  2001:xxxx:2300:b801::1                                0.721ms
 2:  2001:xxxx:2300:b801::1                                0.688ms pmtu 1492
 2:  2001:xxxx:ffff:f232::232                              1.130ms
 3:  2001:xxxx:ffff:1070::205                              1.876ms
 4:  ae9-720.RT.ELN.TLL.EE.retn.net                        1.456ms asymm  5
 5:  RT.LIM.WAW.PL.retn.net                               15.477ms asymm  6
 6:  waw-atm-pb1-nc5.pl.eu                                15.881ms
 7:  2001:41d0:aaaa:100::5                                27.023ms asymm 10
 8:  2001:41d0:aaaa:100::7                                27.004ms asymm 10
 9:  no reply
10:  no reply
11:  fra1-lim1-g1-8k.de.eu                                31.669ms asymm  9
12:  2001:41d0:0:50::5:f943                               28.012ms asymm 10
13:  2001:41d0:0:50::5:39ad                               32.193ms asymm 10
14:  2001:41d0:0:1:3::4899                                31.252ms asymm 12
15:  2001:41d0:0:1:3::4593                                31.317ms asymm 13
16:  2001:41d0:0:1:3::4688                                34.953ms asymm 13
17:  no reply
18:  2001:41d0:701:1100::29c8                             28.209ms reached
     Resume: pmtu 1492 hops 18 back 16
     
$ tracepath -6 2a01:7e01::f03c:91ff:fe16:a2e9
 1?: [LOCALHOST]                                           0.036ms pmtu 1500
 1:  2001:xxxx:2300:b801::1                                0.835ms
 1:  2001:xxxx:2300:b801::1                                0.726ms
 2:  2001:xxxx:2300:b801::1                                0.679ms pmtu 1492
 2:  2001:xxxx:ffff:f232::232                              1.144ms
 3:  2001:xxxx:ffff:1070::205                              1.940ms
 4:  ae9-720.RT.ELN.TLL.EE.retn.net                        1.281ms asymm  5
 5:  RT.EQX.FKT.DE.retn.net                               33.136ms asymm  9
 6:  ipv6.de-cix.fra.de.as63949.linode.com                33.753ms
 7:  2600:3c0f:10:32::1                                   32.735ms
 8:  2600:3c0f:10:35::14                                  33.450ms
 9:  2600:3c0f:10::416                                    31.986ms asymm 10
10:  2a01:7e01::f03c:91ff:fe16:a2e9                       30.361ms reached
     Resume: pmtu 1492 hops 10 back 11
but then we see problems (I abbreviated the output to save space)
$ tracepath -6 packages.microsoft.com
 1?: [LOCALHOST]                                           0.014ms pmtu 1500
 1:  2001:xxxx:2300:b801::1                                0.766ms
 1:  2001:xxxx:2300:b801::1                                0.735ms
 2:  2001:xxxx:2300:b801::1                                0.704ms pmtu 1492
 2:  2001:xxxx:ffff:f232::232                              1.186ms
 3:  no reply
 4:  netnod-ix-ge-a-sth-1500.microsoft.com                 7.759ms
 5:  no reply
 6:  no reply
 7:  no reply
.
.
.
29:  no reply
30:  no reply
     Too many hops: pmtu 1492
     Resume: pmtu 1492
I was scratching my head for a while and then started to suspect that this might be an MTU/MSS issue.

At first I tried this
/ipv6 firewall mangle add action=change-mss chain=postrouting new-mss=clamp-to-pmtu out-interface-list=internet passthrough=yes protocol=tcp tcp-flags=syn
which did not help, but then I resorted to a fixed value and found that 1340 was the largest MSS value at which the problematic Microsoft pages started to work without any problems
/ipv6 firewall mangle add action=change-mss chain=postrouting new-mss=1340 out-interface-list=internet passthrough=yes protocol=tcp tcp-flags=syn
Then I did some packet capturing (with the mangle rules disabled) and saw that the test pages containing the "large packet send" test caused the MSS to be decreased to 1255, while traffic to the non-working Microsoft sites kept an MSS of 1432, i.e. path MTU discovery was not taking effect.
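
The MSS figures mentioned here follow from simple header arithmetic, sketched below in Python. The sketch assumes a fixed 40-byte IPv6 header and a 20-byte TCP header, with no extension headers or TCP options; the function names are illustrative and not part of any tool used in this thread.

```python
IPV6_HEADER = 40  # fixed IPv6 header size in bytes
TCP_HEADER = 20   # TCP header without options (assumption)

def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one IPv6 packet of size `mtu`."""
    return mtu - IPV6_HEADER - TCP_HEADER

def mtu_for_mss(mss: int) -> int:
    """Smallest path MTU that carries a full segment of size `mss`."""
    return mss + IPV6_HEADER + TCP_HEADER

print(mss_for_mtu(1492))  # PPPoE interface MTU -> 1432
print(mtu_for_mss(1340))  # largest working clamp value -> 1400
print(mtu_for_mss(1255))  # MSS seen on the test pages -> 1315
```

Under these assumptions, the working MSS of 1340 corresponds to a path MTU of 1400, which is lower than the 1492 that tracepath reported.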

As I see it, there is no blanket ICMP filtering by my ISP, since the dedicated tests would then fail and the MSS would not be trimmed to 1255. The sites with problems, however, have their ICMP traffic broken, which seems to be the cause of all these problems.

If I understand the situation properly, the problem might be at the ISP, at these sites or somewhere in between, because the ICMP traffic needed for PMTU discovery gets lost (for these sites/domains) while TCP traffic does not.

Any thoughts on this issue are welcome. To some extent it might even be good to settle on a fixed MSS value, as this eliminates the hiccups and delays caused by PMTU discovery, but for some sites/services it is still somewhat sub-optimal.
What do you think I should try next?

P. S. I have sent a message to my ISP's support. Maybe they will even read this thread eventually...
Last edited by nmt1900 on Sun Jul 28, 2024 3:45 pm, edited 1 time in total.
 
nmt1900
Frequent Visitor
Topic Author
Posts: 83
Joined: Wed Feb 01, 2017 12:36 am

Re: Path MTU discovery problems with IPv6 on PPPoE

Sat Jul 27, 2024 3:03 pm

Update - to avoid blindly setting the MSS for all outgoing traffic, I did this
/ipv6 firewall mangle add action=change-mss chain=postrouting new-mss=1340 out-interface-list=internet passthrough=yes protocol=tcp tcp-flags=syn tcp-mss=1341-65535 
Now the MSS is clamped only if it is initially over 1340, and it is not increased if it is smaller. Path MTU discovery must work per the IPv6 specifications, but in my case it does not work for all sites, so this workaround is the only reliable solution until the issue gets resolved...
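
The intended effect of the tcp-mss=1341-65535 matcher can be modelled in a few lines of Python. This is only a sketch of the behaviour described in this post, not of RouterOS internals; clamp_mss and its parameters are hypothetical names.

```python
def clamp_mss(advertised: int, limit: int = 1340) -> int:
    """Model of the rule: rewrite the MSS only when it exceeds `limit`."""
    if limit + 1 <= advertised <= 65535:  # matcher: tcp-mss=1341-65535
        return limit                      # action: change-mss new-mss=1340
    return advertised                     # rule does not match, MSS untouched

for mss in (1220, 1340, 1341, 1432):
    print(mss, "->", clamp_mss(mss))
```

In this model a SYN advertising 1220 keeps its value, while 1341 and 1432 are both lowered to 1340.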
 
nmt1900
Frequent Visitor
Topic Author
Posts: 83
Joined: Wed Feb 01, 2017 12:36 am

Re: Path MTU discovery problems with IPv6 on PPPoE

Sun Jul 28, 2024 9:56 pm

Additional testing shows that we actually have an ICMP blackhole, though not in the sense that ICMPv6 is being filtered out: a hop simply fails to return ICMPv6 type 2 "Packet too big" messages, and oversized ping packets just time out. This blackhole is inside the ISP's own prefix (probably inside their own infrastructure). This is obviously wrong and violates RFC 4443 (section 3.2).

P. S. That's why the ICMP blackhole tests do not reveal this: "packet too big" messages from the test site are forwarded without problems. This should mean that path MTU discovery is able to function if the hop with the lowest MTU is elsewhere...
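
A toy Python simulation may help illustrate the difference between a hop that answers with Packet Too Big and one that drops silently. Everything here is an assumption made for illustration: the hop MTUs (1400, 1315), the path layouts and the function names are invented, and real path MTU discovery involves timers and caching that are left out.

```python
from typing import List, Optional, Tuple

# Each hop is (MTU of its outgoing link, sends_ptb). A hop with sends_ptb=False
# drops oversized packets silently, i.e. it behaves as a blackhole.
Hop = Tuple[int, bool]

def send(size: int, path: List[Hop]) -> Tuple[bool, Optional[int]]:
    """Return (delivered, MTU reported via Packet Too Big) for one packet."""
    for link_mtu, sends_ptb in path:
        if size > link_mtu:
            return False, (link_mtu if sends_ptb else None)
    return True, None

def discover_pmtu(path: List[Hop], start_mtu: int = 1492,
                  max_tries: int = 10) -> Optional[int]:
    """Shrink the packet on each Packet Too Big; None means the sender stalled."""
    size = start_mtu
    for _ in range(max_tries):
        delivered, reported = send(size, path)
        if delivered:
            return size
        if reported is None:  # dropped without ICMPv6 type 2: nothing to learn
            return None
        size = reported
    return None

paths = {
    "PTB returned":         [(1492, True), (1400, True), (1500, True)],
    "blackhole":            [(1492, True), (1400, False), (1500, True)],
    "small MTU at far end": [(1492, True), (1500, True), (1315, True)],
}
for name, path in paths.items():
    print(name, "->", discover_pmtu(path))
```

In this sketch the sender converges whenever the limiting hop reports its MTU, and stalls only when the limiting hop stays silent, which would be consistent with test sites passing while other destinations time out.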
 
mkx
Forum Guru
Forum Guru
Posts: 12270
Joined: Thu Mar 03, 2016 10:23 pm

Re: Path MTU discovery problems with IPv6 on PPPoE

Sun Jul 28, 2024 10:14 pm

Every L3 device (e.g. IPv6 router) should be able to transmit ICMP messages if necessary. And pass them on of course.

BTW, I'm not sure if a generated ICMPv6 message (such as packet too big) is considered "related" in the case of a stateful firewall ... probably it should be. And it's probably handled by chain=output.
 
nmt1900
Frequent Visitor
Topic Author
Posts: 83
Joined: Wed Feb 01, 2017 12:36 am

Re: Path MTU discovery problems with IPv6 on PPPoE

Mon Jul 29, 2024 2:04 am

Yes, it should be able to, but sending ICMPv6 type 2 packets is an explicit requirement of the specification

Description

A Packet Too Big MUST be sent by a router in response to a packet
that it cannot forward because the packet is larger than the MTU of
the outgoing link. The information in this message is used as part
of the Path MTU Discovery process [PMTU].


https://datatracker.ietf.org/doc/html/r ... ection-3.2

P. S. This is irrelevant only when (1) all links of the router have the same MTU or (2) all links have a greater MTU than any packet that can possibly reach the router over any of these links.
 
nmt1900
Frequent Visitor
Topic Author
Posts: 83
Joined: Wed Feb 01, 2017 12:36 am

Re: Path MTU discovery problems with IPv6 on PPPoE

Tue Jul 30, 2024 4:35 pm

Well... a real solution is still pending (summer vacations or whatever else is to blame, maybe?), but further investigation shows that the "partial" nature of the initial problem follows from the root cause in a fairly straightforward way: packets of up to 1492 bytes can be transmitted to some address spaces but not to others, where they are swallowed by the blackhole once its MTU is exceeded.
It is not possible to scan the whole public IPv6 address space to learn the exact extent of this problem, but from what I see, Google services can be accessed with full-size packets,
[test2@n14-dude] > ping 2001:4860:4860::8888 do-not-fragment size=1400
  SEQ HOST                                     SIZE TTL TIME       STATUS
    0 2001:4860:4860::8888                      116  60 2ms821us   echo reply 
    1 2001:4860:4860::8888                      116  60 2ms776us   echo reply
    2 2001:4860:4860::8888                      116  60 2ms789us   echo reply
    3 2001:4860:4860::8888                      116  60 2ms854us   echo reply
    4 2001:4860:4860::8888                      116  60 2ms887us   echo reply
    sent=5 received=5 packet-loss=0% min-rtt=2ms776us avg-rtt=2ms825us max-rtt=2ms887us

[test2@n14-dude] > ping 2001:4860:4860::8888 do-not-fragment size=1401
  SEQ HOST                                     SIZE TTL TIME       STATUS
    0 2001:4860:4860::8888                      116  60 2ms939us   echo reply
    1 2001:4860:4860::8888                      116  60 2ms751us   echo reply
    2 2001:4860:4860::8888                      116  60 2ms777us   echo reply
    3 2001:4860:4860::8888                      116  60 2ms819us   echo reply
    sent=4 received=4 packet-loss=0% min-rtt=2ms751us avg-rtt=2ms821us max-rtt=2ms939us

[test2@n14-dude] > ping 2001:4860:4860::8888 do-not-fragment size=1492
  SEQ HOST                                     SIZE TTL TIME       STATUS
    0 2001:4860:4860::8888                      116  60 2ms809us   echo reply
    1 2001:4860:4860::8888                      116  60 2ms860us   echo reply
    2 2001:4860:4860::8888                      116  60 2ms902us   echo reply
    sent=3 received=3 packet-loss=0% min-rtt=2ms809us avg-rtt=2ms857us max-rtt=2ms902us
while the address spaces of Cloudflare, Microsoft (and probably Apple) cannot (Apple App Store downloads show timeouts as well if the MSS is not clamped)
[test2@n14-dude] > ping 2606:4700:4700::1111 size=1400
  SEQ HOST                                     SIZE TTL TIME       STATUS
    0 2606:4700:4700::1111                     1400  60 1ms807us   echo reply
    1 2606:4700:4700::1111                     1400  60 1ms731us   echo reply
    2 2606:4700:4700::1111                     1400  60 1ms670us   echo reply
    sent=3 received=3 packet-loss=0% min-rtt=1ms670us avg-rtt=1ms736us max-rtt=1ms807us

[test2@n14-dude] > ping 2606:4700:4700::1111 size=1401
  SEQ HOST                                     SIZE TTL TIME       STATUS
    0 2606:4700:4700::1111                                         timeout
    1 2606:4700:4700::1111                                         timeout
    2 2606:4700:4700::1111                                         timeout
    sent=3 received=0 packet-loss=100%
MSS clamping has to stay in place for now. We'll see whether the provider is able to fix it or not. Who knows, maybe it will be time to switch to another ISP if this does not get fixed...
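
The 1400/1401 boundary in the ping output above was found by hand; the same search can be automated with a binary search, sketched here in Python. The probe callback is hypothetical: in practice it would wrap a real ping with a given packet size, while this example uses a stand-in that simply mimics the observed boundary. The lower bound of 1280 is the IPv6 minimum link MTU.

```python
from typing import Callable

def largest_passing_size(probe: Callable[[int], bool],
                         low: int = 1280, high: int = 1492) -> int:
    """Binary search for the largest size in [low, high] where probe(size) is True.

    Assumes probe(low) succeeds and that success is monotonic in size.
    """
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            low = mid
        else:
            high = mid - 1
    return low

def fake_probe(size: int) -> bool:
    """Stand-in for a real ping: mimics a path that passes at most 1400 bytes."""
    return size <= 1400

print(largest_passing_size(fake_probe))  # 1400
```

With a range of 1280 to 1492 this needs about eight probes instead of a manual walk through the sizes.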
 
nmt1900
Frequent Visitor
Topic Author
Posts: 83
Joined: Wed Feb 01, 2017 12:36 am

Re: Path MTU discovery problems with IPv6 on PPPoE  [SOLVED]

Mon Aug 05, 2024 11:35 am

Kudos to my provider - they actually got this MTU problem fixed.

Now the only question left is what to do with the mangle rule.
Should it be like this
/ipv6 firewall mangle add action=change-mss chain=postrouting new-mss=1432 out-interface-list=internet passthrough=yes protocol=tcp tcp-flags=syn tcp-mss=1433-65535
this
/ipv6 firewall mangle add action=change-mss chain=postrouting new-mss=clamp-to-pmtu out-interface-list=internet passthrough=yes protocol=tcp tcp-flags=syn tcp-mss=1433-65535
or this?
/ipv6 firewall mangle add action=change-mss chain=postrouting new-mss=clamp-to-pmtu out-interface-list=internet passthrough=yes protocol=tcp tcp-flags=syn
As the MTU of the actual interface is 1492, the path MTU for the IPv6 internet connection from the router's standpoint is 1492, and the corresponding MSS value should be 1432...
