PPPoE Compatibility Issues with vBRAS/NFV

Hey guys,

If you encounter the following problems:
a. After the PPPoE client dials up, the MTU automatically drops from 1492 to 1480 about 3 seconds after connecting.
b. If you enable IPv6, you get many warnings in the log similar to:
invalid mtu 1492 on pppoe-out1 from fe80::200:5eff:fe00:101
invalid mtu 1492 on pppoe-out1 from fe80::200:5eff:fe00:102
invalid mtu 1492 on pppoe-out1 from fe80::200:5eff:fe00:103
invalid mtu 1492 on pppoe-out1 from fe80::200:5eff:fe00:104
invalid mtu 1492 on pppoe-out1 from fe80::200:5eff:fe00:105
invalid mtu 1492 on pppoe-out1 from fe80::200:5eff:fe00:106

Then congratulations, this is because your ISP uses Huawei’s Broadband Remote Access Server and you are a victim of Huawei equipment.
Affected peer equipment models: ME60, NE40, NE9000, VNE9000, etc.

Here is an explanation of this issue:
a. Huawei has launched the vBRAS architecture, which splits the traditional BRAS into a User Plane and a Control Plane.
b. The Control Plane is called VNE9000; it is an x86 virtual machine deployed in the ISP’s cloud.
c. The User Plane consists of an x86 virtual machine, an ARM physical machine, or a traditional BRAS (such as ME60, NE40, etc.). These are deployed at sites near subscribers.
d. The User Plane is attached to the Control Plane and managed using OpenFlow.
e. The User Plane and Control Plane are connected using VXLAN. They may be hundreds of kilometers apart, with a delay of about 2 to 8 ms.
f. A subscriber’s PPPoE dial-up request first reaches the User Plane. If it is a 0x8863 or 0x8864 packet (the PPPoE Discovery and Session EtherTypes), the subscriber information is injected and the packet is forwarded to the Control Plane through VXLAN. After the Control Plane completes PPP authentication, it sends the flow table to the User Plane through OpenFlow. A normal PPPoE Session packet is then fast-forwarded on the User Plane.
g. RouterOS and most network equipment implement PPPoE in compliance with RFC4638.
h. Therefore, when the RouterOS PPPoE client dials up with the MTU set above 1488, it sends an Echo Request packet with a length equal to the MTU to the other end during the LCP negotiation phase.
i. Yes, you have found the issue now. The size of your Echo Request packet exceeds the VXLAN MTU between the User Plane and the Control Plane. The Control Plane therefore never receives your Echo Request and never replies to it. When your PPPoE client fails to get an Echo Reply after three attempts, it adjusts its MTU to 1480 in disappointment.
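
For anyone who wants to reason about this, here is a minimal Python sketch of the client-side behaviour described in h. and i. It is only a toy model of what this post reports, not RouterOS or Huawei code. The `tunnel_payload_limit` parameter is a made-up stand-in for the largest Echo Request that survives the VXLAN hop between the User Plane and the Control Plane; the real value is not known here.

```python
"""Toy model of the MTU fallback described in this post (not vendor code)."""

RFC4638_TEST_THRESHOLD = 1488  # per the post: no test packet when MTU <= 1488
FALLBACK_MTU = 1480            # per the post: value the client drops to
ECHO_ATTEMPTS = 3              # per the post: three unanswered Echo Requests


def control_plane_receives(echo_len: int, tunnel_payload_limit: int) -> bool:
    """Return True if an Echo Request of echo_len bytes fits through the tunnel.

    tunnel_payload_limit is a hypothetical number, used only for illustration.
    """
    return echo_len <= tunnel_payload_limit


def negotiated_mtu(configured_mtu: int, tunnel_payload_limit: int) -> int:
    """Return the MTU the client ends up with in this simplified model."""
    if configured_mtu <= RFC4638_TEST_THRESHOLD:
        # No MTU-sized Echo Request is sent, so nothing can get lost.
        return configured_mtu
    for _ in range(ECHO_ATTEMPTS):
        if control_plane_receives(configured_mtu, tunnel_payload_limit):
            return configured_mtu  # an Echo Reply would come back
    return FALLBACK_MTU  # all attempts were silently dropped
```

With a hypothetical limit of 1490, `negotiated_mtu(1492, 1490)` falls back to 1480, while `negotiated_mtu(1488, 1490)` stays at 1488 because the test is skipped.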


What my friends did:
a. Contacted the ISP and they replied that it was a problem with the user’s equipment.
b. Contacted Huawei. They believe they are not at fault and therefore will not fix the problem.
c. Contacted another ISP, and they moved my friend to a ZTE BRAS.

You may ask: are there any suggestions?
a. You can wait for RouterOS to add an option that allows you to disable the RFC4638 test.
b. You can ask your ISP to move you to a BRAS from another vendor, such as ZTE, Nokia, H3C, or Juniper. None of them have this problem.
c. You can complain to the ISP, ask them to investigate this problem, and ask the vendor to fix it.
d. You can lower the maximum MTU to 1488 if you can accept it (RouterOS will not send RFC4638 test packets when MTU <= 1488).

Best wishes.

Yes, exactly. I am facing this issue (PPPoE ISP) ONLY with MikroTik, while VyOS/OpenWrt/OPNsense/IPFire happily work with their default settings and give me an MTU of 1492. MikroTik defaults to 1480, or 1488 if you adjust the MTU manually. I sent many emails to MikroTik support; they say there is some device between the ISP and their device that causes it. I tried many things, but in vain.


Please send an email to MikroTik support asking them to fix it!!!

Unfortunately, RouterOS is implemented according to the RFC.
Ask your ISP to implement its equipment according to the RFC instead of letting its vendor break it.
So contacting your ISP to ask for a fix is the best way.

I spoke to the ISP. They didn’t cooperate much, but they told me this ONT box (splitter) is not Huawei and that it connects directly to their server.
Here in Dubai the ISPs are like this, see the attached pics:
Photoroom-20250310_194510.png
Photoroom-20250310_194427.png

This has nothing to do with the ONT box. Maybe you can use PPPoE Scan to check your BRAS server, which may help you.

If you run

/interface/pppoe-client/monitor 0 once

then you’ll see the MAC address of the BRAS (ac-mac) … which might indicate the vendor of the BRAS. Some ISPs even disclose some technical info in ac-name (my ISP included “ASR9910” and the name of their core location in ac-name).

This is all I could find :open_mouth:
Untitled.png

It seems your ISP is “faking” MAC address of BRAS … 00:00:5E is registered to “ICANN, IANA Department” …

It looks more like the MAC of a VRRP group.

It’s Huawei’s VNE9000 vBRAS; they use a VRRP MAC address.

Yes, Huawei’s VNE9000 vBRAS uses a VRRP MAC range from 00:00:5E:00:01:01 to 00:00:5E:00:01:80.
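
The link-local addresses in the `invalid mtu` warnings from the first post appear to point to the same range. Below is a small Python sketch, assuming those addresses were built with the usual modified EUI-64 scheme, that recovers the MAC and checks it against the 00:00:5E:00:01:xx prefix that VRRP reserves for IPv4 virtual routers. The function names are my own; only the standard `ipaddress` module is used.

```python
import ipaddress
from typing import Optional

# First five octets of the VRRP virtual router MAC for IPv4 (last octet = VRID).
VRRP_IPV4_PREFIX = bytes.fromhex("00005e0001")


def mac_from_link_local(addr: str) -> Optional[str]:
    """Recover the MAC from an EUI-64 based IPv6 address, or None if not EUI-64."""
    iid = ipaddress.IPv6Address(addr).packed[8:]  # lower 64 bits (interface ID)
    if iid[3:5] != b"\xff\xfe":
        return None  # interface ID was not derived from a 48-bit MAC
    # Flip the universal/local bit back and drop the inserted ff:fe bytes.
    mac = bytes([iid[0] ^ 0x02]) + iid[1:3] + iid[5:8]
    return ":".join(f"{b:02x}" for b in mac)


def looks_like_vrrp(mac: str) -> bool:
    """True if the MAC falls in the VRRP IPv4 virtual router range."""
    return bytes.fromhex(mac.replace(":", ""))[:5] == VRRP_IPV4_PREFIX


mac = mac_from_link_local("fe80::200:5eff:fe00:101")
print(mac, looks_like_vrrp(mac))
```

If the same check is run on the ac-mac reported by the PPPoE client monitor, a match is only a hint that the BRAS sits behind VRRP, not proof of the vendor.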

This issue also occurs with Huawei NE20. It’s a chronic problem, and I’ve already given up on trying to get Huawei to solve it.

In June 2024, I opened a ticket with MikroTik (SUP-144663), trying to get them to create an option to disable the test for item 5.2 of RFC4638:

“This capability SHOULD be enabled by default. It SHOULD be configurable and MAY be disabled on networks where there is some prior knowledge indicating that the test is not necessary.”

The option to disable the test MUST be configurable, meaning RouterOS is not following the RFC.

MikroTik’s response was:

“If there will be more, similar requests like yours, we will see how this can be added in future versions.”

So… make some noise if you want some solution from MikroTik. Better than expecting something from Huawei.

This was their reply when I asked them to disable the RFC 4638 test:
"Hello,

Thank you for contacting MikroTik Support.

Such option is not available at the moment.

Best regards,
"
But I agree with what you say: all of us should send them reminders every week.

I’d stick to the compatibility argument… That’s a better one.

You’re wrong on the RFC: SHOULD == “RECOMMENDED”. MUST is a specific term in RFC lingo, and it’s not used here… MUST == “REQUIRED”. See RFC 2119.

P.S.: This splitting of Control-Plane and Data-Plane is not used just by Huawei…
It is a trend among all the big vendors. This is how mobile networks have been built since forever.
Juniper BNG CUPS uses the same concepts, and IIRC the name for that in the Cisco portfolio is cnBNG.

I agree with you!
The compatibility argument is better than the SHOULD vs. MUST one.

Hummm… I need to disagree!
What they did was fake a behavior that SEEMS to meet the requirements of RFC4638 but in fact does not follow the negotiation script to the letter, and it ends up falling back to 1480 in a hardcoded way.

That “L2MTU” thing they created? What is that in practice?
A lazy workaround to presume how things are in the lower layer and avoid really having to deal with what is happening below?
Who else uses that? How did they fit that into silicon SDKs (e.g. Marvell)?

I believe that this issue on Echo-Request/Echo-Reply in LCP is related to some presumption made through that L2MTU thing.

A hypothesis that could be tested to build some arguments with MikroTik?
Try to simulate an equivalent scenario with frames large enough to allow PPPoE with an MTU of 1500 bytes, with an Ethernet interface MTU of 1508. Or even bigger… 1600, 2000?
Then try manipulating the L2MTU and MTU on the PPPoE client interface on the MikroTik to see whether you get equivalent behavior.
Compare that with a CPE running the latest OpenWrt.
And then PCAP is king.

IMHO, this mess with the PPPoE MTU on MikroTik is related to their stubbornness/difficulty with:

  • Splitting Control-Plane vs Data-Plane
  • Splitting Underlay and Overlay

Or maybe it goes deeper! Maybe their difficulty in splitting those is caused by some shortcuts they took in the past to avoid dealing with some layers, and now they are facing the consequences.
If that is the case… We will probably start to hear about v8 on ticket responses.

This is getting interesting now :thinking: and MikroTik is sleeping on this major issue while releasing new versions so frequently. Strange.

Yes, I am wrong. The translator replaced SHOULD with MUST, and I didn’t notice it.

To avoid any doubt: “meaning RouterOS is not following the RFC” means that RouterOS is not doing everything the RFC states and recommends. In fact, the RFC does not REQUIRE the implementation of a button to enable/disable the test, but sometimes we have to be a bit sensationalist.

I have opened SEVERAL tickets with MikroTik, and by far, they are the manufacturer that has best addressed my requests over time. Coincidentally or not, mentioning an RFC or some technical standard (such as TR-101) seems to make things move forward more effectively, including my ticket regarding this issue. Before I insisted and detailed the RFC4638, their response was simply: “PPPoE MRU and MTU values must not be larger than 1492”, and that was it.

I agree with you, the compatibility argument seems better. However, based on my experience with MikroTik support, appealing to a technical standard is likely to be more effective. But I could be wrong.

Totally off-topic, but I really don’t get why some ISPs still insist on using outdated tech like PPPoE with FTTx. I mean, all modern BNGs support IPoE with optional VLAN tagging, which is so much easier to set up and manage. Using PPPoE is just plain dumb and overcomplicated. :wink:

Unfortunately, I have tested the compatibility of various vendor devices with my ISP,
and even Juniper and Cisco’s products with separated control plane and data plane can handle this issue correctly.

This issue only occurs on Huawei devices.
I think this needs to be fixed by Huawei.

A hypothesis came to mind…

Could this case be happening due to a simple implementation failure of the tunnel between Data-Plane and Control-Plane?

Just to give an example: maybe a deployment that previously used IPv4 as the VXLAN underlay started using IPv6 and, as a result, lost some bytes of payload. Maybe it was even a deployment guide that led to this error.

If this is the case, it is not a software failure, but rather just a deployment failure that can be easily dealt with by the network engineer in the environment.
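
To put rough numbers on this hypothesis, here is a small Python sketch of the header arithmetic. The header sizes are the standard ones (IPv4 20 bytes, IPv6 40, UDP 8, VXLAN 8, Ethernet 14, PPPoE 6, PPP protocol field 2). The underlay MTU of 1550 is a value I picked purely for illustration, and VLAN tags on the inner frame are ignored.

```python
# Standard header sizes in bytes.
OUTER_IPV4 = 20
OUTER_IPV6 = 40
OUTER_UDP = 8
VXLAN_HEADER = 8
INNER_ETHERNET = 14
PPPOE_HEADER = 6
PPP_PROTOCOL = 2


def max_ppp_payload(underlay_ip_mtu: int, outer_ip_header: int) -> int:
    """Largest PPP payload (e.g. an LCP Echo Request) that fits in one VXLAN packet."""
    inner_frame = underlay_ip_mtu - outer_ip_header - OUTER_UDP - VXLAN_HEADER
    return inner_frame - INNER_ETHERNET - PPPOE_HEADER - PPP_PROTOCOL


ASSUMED_UNDERLAY_MTU = 1550  # hypothetical; the ISP's real value is unknown

over_ipv4 = max_ppp_payload(ASSUMED_UNDERLAY_MTU, OUTER_IPV4)
over_ipv6 = max_ppp_payload(ASSUMED_UNDERLAY_MTU, OUTER_IPV6)
print(over_ipv4, over_ipv6, over_ipv4 - over_ipv6)
```

With that assumed 1550-byte underlay, IPv4 leaves exactly 1492 bytes of PPP payload, while IPv6 leaves 1472, so a 1492-byte Echo Request would no longer fit after such a migration.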

Based on your report, have you managed to contact the operations and engineering team of the ISP in question? Do you think it is worth suggesting a double-check on this?

OK, there has been a response from support@ mikrotik: they have made some changes for PPPoE. Finally!!!

But I have an x86 box, and there is no CHR/ISO image in this. Can someone test it and give feedback?

https://box.mikrotik.com/d/bc148d4405e94feaa2cc/