PPPoE Client "Random" Authentication Fail through Bridge

daneco · May 9, 2019, 5:44pm

Hello All,

I’ve encountered a problem and I’m not sure whether it is down to a bad configuration on my part or perhaps a bug:

As things presently stand, I have a CRS328-24P-4S+RM Cloud Router Switch (CRS) with a “main” connection going into port 24. On this connection there are 3 VLANs: One is for general broadband access (PPPoE logins), the second is for a private management network and the third is another private customer network. A VLAN interface has been set up for each of these VLANs on port 24. The two private network VLAN interfaces are then assigned to a separate bridge each (bridge 1 and bridge 2 for example) and various ports on the CRS are mapped to either bridge 1 or 2. I also have a PPPoE client running on the CRS which authenticates through the broadband access VLAN interface and provides connectivity to a third bridge (bridge 3) on the CRS, offering internet connectivity to our office and again, various ports on the CRS are mapped onto this bridge.

Everything had been working absolutely fine and exactly as expected until a new client moved into our building. They wanted connectivity from us and so we installed a cable through to their room. Their own router will need to log in via PPPoE, so my plan was to create a fourth bridge (bridge 4) and assign the broadband access VLAN to this bridge, tell our PPPoE client on the CRS to authenticate via this bridge and then map one of the spare ports on the CRS to the new bridge so that our client can also log in via bridge 4. As soon as I did this, the CRS PPPoE client would disconnect with an authentication failure, connect for a split second and then fail again in a repeating cycle. The new customer connection worked flawlessly from the minute I created the new bridge and their router logged in with PPPoE with no problem at all. After various attempts at enabling and disabling bridges and interfaces, the CRS PPPoE client connected fine on bridge 4 and worked all through the night, as did the new customer’s connection. However, come this morning, I found that everything had gone to pot! The CRS kept crashing and needed a reboot, and the only way I could make things stable again was to revert my changes and have the CRS PPPoE client connect directly via the broadband access VLAN interface. This of course means that our new client is without connectivity once again.

Can anybody recommend a configuration that would allow our CRS to connect using PPPoE over the broadband access VLAN and also let me assign the same VLAN to a port on the CRS, so that the new client’s router can log in with PPPoE as well? My sincere apologies if I have not explained things very well.

I really hope someone can come forward with a solution and I shall eagerly await your replies / suggestions.

With best wishes,

Paul.

sindy · May 9, 2019, 7:02pm

As for the configuration part, it would have been simpler and more useful to follow the hint in my automatic signature. How bridge3 is linked to the PPPoE client is not really clear; I suppose it hosts a private subnet and the route to the internet goes via the PPPoE client with src-nat, but it is better to be sure about this.

As for the behaviour, it’s clear: when the PPPoE client is attached directly to an /interface vlan, it works flawlessly; when you make that same /interface vlan a member port of a bridge created for the purpose and attach the /interface pppoe-client to that bridge rather than to the /interface vlan, you get the fireworks you’ve described.

So first of all, it should work this way, and the fact that the CRS sometimes crashes confirms that there is a bug. What I suggest below is a way to make the data flow a bit differently inside the machine, in the hope that it will not trigger the bug. I assume you run the latest long-term or stable release, so it is not some bug from the deep past.

What might help would be to set protocol-mode=none on the /interface bridge (the ISP may be confused by RSTP packets) and to assign a fixed MAC address to the newly added bridge using /interface bridge set bridge4 auto-mac=no admin-mac=mac:address:of:ether24 (in case the MAC address of the bridge changes for some reason, although that is not very likely).
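As a rough sketch, the two settings could be applied like this. The bridge name bridge4 comes from the original post; the MAC address shown is a made-up placeholder to be replaced with the real address of ether24, and the static address is given through the bridge's admin-mac property.

```routeros
# Sketch only: AA:BB:CC:DD:EE:FF is a placeholder, not a real address.
# Stop the bridge from sending (R)STP BPDUs towards the ISP.
/interface bridge set bridge4 protocol-mode=none
# Pin the bridge MAC so it cannot change when member ports come and go.
/interface bridge set bridge4 auto-mac=no admin-mac=AA:BB:CC:DD:EE:FF
```

The MAC address of ether24 can be read from /interface ethernet print before running the second command.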

If that doesn’t help, I would rearrange the order of interfaces from the current ethernet - vlan - bridge - pppoe-client to ethernet - bridge - vlan - pppoe-client. With a CRS3xx and the simple network structure you have it doesn’t make much sense to use the bridge-per-vlan approach. So starting from the working configuration where the new customer is disconnected, the first step would be to add another bridge (e.g. called all-vlan-bridge), make it the carrier interface of all the /interface vlan currently carried by ether24, and make ether24 its only (so far) member port. Then, I would prepare the configuration for vlan filtering on that bridge:

/interface bridge vlan
add bridge=all-vlan-bridge vlan-ids=broadband-vlan-id tagged=all-vlan-bridge,ether24 untagged=ether-for-the-new-customer
add bridge=all-vlan-bridge vlan-ids=bridge1-vlan-id tagged=all-vlan-bridge,ether24
add bridge=all-vlan-bridge vlan-ids=bridge2-vlan-id tagged=all-vlan-bridge,ether24
add bridge=all-vlan-bridge vlan-ids=bridge3-vlan-id tagged=all-vlan-bridge,ether24

/interface bridge port
add bridge=all-vlan-bridge interface=ether24 pvid=1
add bridge=all-vlan-bridge interface=ether-for-the-new-customer pvid=broadband-vlan-id

And finally, /interface bridge set all-vlan-bridge vlan-filtering=yes.

This way, everything would stay as it was before, except the order of bridge and vlan interfaces between ether24 and pppoe-client.
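The first step described above (adding the bridge and moving the existing VLAN interfaces onto it) might look roughly like the following. This is only a sketch: it assumes every existing /interface vlan is currently attached to ether24, and it is best run in Safe Mode or from a session that does not depend on those VLANs, because connectivity will be interrupted briefly.

```routeros
# Sketch only; names follow the ones used in this post.
/interface bridge add name=all-vlan-bridge vlan-filtering=no
/interface bridge port add bridge=all-vlan-bridge interface=ether24
# Re-home every VLAN interface that currently sits on ether24 onto the bridge.
/interface vlan set [find interface=ether24] interface=all-vlan-bridge
```

If ether24 has already been added as a bridge port this way, its row under /interface bridge port would be changed with set rather than created again with add.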

If that fixes the PPPoE client issues, you may want to migrate the member ports of bridges 1 and 2 to all-vlan-bridge by modifying the rows in /interface bridge port and /interface bridge vlan accordingly.

And finally you may activate the “hardware accelerated bridging”, i.e. forwarding of the frames within the VLANs by the switch chip, without loading the CPU.
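A minimal sketch of checking this, assuming the all-vlan-bridge name used above. On CRS3xx the hw property of bridge ports normally defaults to yes, so offloading may already be in effect once everything sits on a single bridge.

```routeros
# Sketch only: request hardware offloading on all ports of the bridge.
/interface bridge port set [find bridge=all-vlan-bridge] hw=yes
# Ports that are actually offloaded to the switch chip show the H flag.
/interface bridge port print
```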

daneco · May 10, 2019, 12:13am

Hello Sindy,

Thank you very much for your reply. Presumably you mean I should dump the router’s entire config and then anonymise before posting here?

I will give your suggestion a go, hopefully tomorrow and see where I end up. Do you know how I should go about bringing this to the attention of Mikrotik so that they can look at fixing the bug? My configuration might be excessive but it is the way my brain’s logic seems to function (or not function as the case may be)!

Best wishes,

Paul.

sindy · May 10, 2019, 6:02am

The way to let Mikrotik know is to generate the supout.rif file, ideally while the issue is occurring, and send it to support@mikrotik.com (so to make the supout.rif useful, you have to recreate the previous setup). You can copy-paste the description from your OP or just put a link to the forum topic into the message.
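For reference, the file can be generated from the command line roughly like this (the file name is just the conventional one; the same can be done from WinBox):

```routeros
# Run this while the PPPoE client is cycling; the file appears under Files.
/system sup-output name=supout.rif
```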

When exporting the configuration, hide-sensitive prevents passwords, pre-shared keys and the like from being exported, but doesn’t change public IP addresses and domain names, so these have to be anonymized manually before posting. However, the anonymization has to be done with care so that the dependencies between addresses, subnets and firewall rules do not get lost in the process. In this particular scenario the issue is an L2 one, so IP addresses are not so important, but there may theoretically be an issue with MAC addresses, so the output of several successive runs of /interface bridge print detail and /interface bridge host print while the issue is present might reveal something interesting (in particular, a changing MAC address of the bridge or its conflict with one of the hosts). No need to post it, just have a careful look through it.
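Put together, the commands mentioned above would be something like the following. The export file name is an arbitrary example.

```routeros
# Export without passwords and keys; public IPs and domain names still
# have to be anonymized by hand before posting.
/export hide-sensitive file=config-for-forum
# Repeat these a few times while the problem is present and compare the
# bridge MAC address and the host table between runs.
/interface bridge print detail
/interface bridge host print
```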

Other than that, the CPU in your machine is an ARM, but only a single-core one, so loading it with the task of bridging is not a good idea given that you have the feature-rich switch chip there. If all hosts connected to each VLAN talk to something else via routing and rarely to each other, it is not worth the effort to activate the hardware switching, but this doesn’t seem to be your case.