My fiber modem is not connecting reliably

Hi everyone,

I'm seeking some advice on a persistent initialization issue with my
RB4011iGS+ (v7.22.1) and an ODI DFP-34X-2C3 SFP stick on a Telekom Germany
FTTH connection.

The Story
Every time the router reboots, I'm met with a dead connection. Looking at the
logs, the PPPoE client (on VLAN 7) is stuck in a frustrating loop: it
initializes, tries to connect, and then immediately terminates/disconnects.
This cycle repeats indefinitely.

The Workaround
To get back online, I have to manually disable and then enable the SFP
interface (sfp-sfpplus1) multiple times. At some point—seemingly at random—the
authentication finally "sticks," and I get a stable connection that stays up
until the next reboot.

Attached Evidence
I have attached the following files for a detailed look:

  1. mikrotik_config_redacted.rsc: My pruned configuration (Interfaces and
    PPPoE).
  2. sfp_info_api.txt: Detailed monitor output of the ODI stick.
  3. mikrotik_logs_ppp_filtered.txt: The log showing the ~12-minute loop of
    failed attempts followed by the successful manual recovery.

The Question
It feels like the RB4011 SFP port and the ODI stick aren't "syncing" their
state correctly during the cold boot sequence. Once I kick the interface
manually, it works.

Has anyone encountered this specific behavior with the RB4011 and ODI modules?
Is there a known way to delay the PPPoE client or stabilize the SFP link
initialization without manual intervention?

Thanks in advance for any insights!

mikrotik_config_redacted.rsc (6.1 KB)

mikrotik_logs_ppp_filtered.txt (14.3 KB)

sfp_info_api.txt (1.3 KB)

One interesting thing would be to sniff the traffic on the sfp-sfpplus1 port into a file while the unsuccessful attempts continue, to see (using Wireshark) whether the communication is bidirectional and if yes, what is actually happening there.
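A minimal sketch of such a capture, assuming the interface name from the original post; the file name `sfp-boot.pcap` is only a placeholder. It writes everything seen on the port to a file that can be downloaded from Files and opened in Wireshark:

```
# Capture all traffic on the SFP port into a file (file name is hypothetical)
/tool/sniffer/set filter-interface=sfp-sfpplus1 file-name=sfp-boot.pcap
/tool/sniffer/start
# ... let a few failed PPPoE attempts go by, then:
/tool/sniffer/stop
```

No protocol filter is set on purpose: the PPPoE frames are VLAN-tagged on this port, so a filter on the PPPoE protocol alone might miss them.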

Also, after a few cycles like that, I would generate the supout.rif file and file a support ticket at Mikrotik.

As a workaround, you can use /system/scheduler (and possibly even /system/script) to delay the establishment of the PPPoE connection. I would try something like this:

/system/scheduler/add start-time=startup on-event={:delay 5s ; /interface/pppoe-client/disable pppoe-out1 ; /interface/ethernet/disable sfp-sfpplus1 ; :delay 30s ; /interface/ethernet/enable sfp-sfpplus1 ; :delay 5s ; /interface/pppoe-client/enable pppoe-out1}

(Not tested, there may be typos.)

If this is not enough, you will need to write a script that will do the same sequence of actions if it finds the pppoe-out1 to be enabled but not running, and schedule it to run every minute.
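A rough, untested sketch of what such a watchdog could look like, reusing the interface names and delays from the one-liner above. The script name `pppoe-watchdog` and the variable names are assumptions, not anything from the poster's configuration:

```
# Hypothetical watchdog: bounce the SFP port if the PPPoE client is enabled but not running
:local pppoeName "pppoe-out1"
:local sfpName "sfp-sfpplus1"

:local isDisabled [/interface/get [find name=$pppoeName] disabled]
:local isRunning [/interface/get [find name=$pppoeName] running]

:if ((!$isDisabled) && (!$isRunning)) do={
    :log warning "$pppoeName is enabled but not running, bouncing $sfpName"
    /interface/pppoe-client/disable [find name=$pppoeName]
    /interface/ethernet/disable [find name=$sfpName]
    :delay 30s
    /interface/ethernet/enable [find name=$sfpName]
    :delay 5s
    /interface/pppoe-client/enable [find name=$pppoeName]
}
```

Saved under /system/script as `pppoe-watchdog`, it could then be scheduled with `/system/scheduler/add name=pppoe-watchdog interval=1m on-event=pppoe-watchdog`. One caveat: the check cannot tell a stuck client from one that is merely in the middle of a normal reconnect, so it may occasionally bounce the port when it did not need to.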

@boxcee

Did you torch the SFP interface to see if there are any VLAN=7 tagged frames coming in/out?

I did now. Nothing going in or out. Only thing I see every now and then is “pppoe discovery” with one packet.

Do you have any firewall rules that could prevent outgoing traffic?

@BartoszP no, I don’t think so. It does work eventually. I just have to disable/enable the interface a few times first.

I ran into another problem: the sniffer stopped working on a disabled interface (or at least didn’t continue to sniff after I enabled it again), so I wrote a script for this:

:local targetInterface "sfp-sfpplus1"
:local searchString "authenticated"
:local timeoutMax 60
:local isFound false

:log info "Starting 1-minute sniff-and-bounce monitor on: $targetInterface"

:while (!$isFound) do={
    :local initialLogCount [:len [/log find message~"$searchString"]]

    /tool sniffer stop
    :delay 1s

    /interface disable [find name=$targetInterface]
    :log info "Disabled interface: $targetInterface"
    :delay 2s

    /interface enable [find name=$targetInterface]
    :log info "Enabled interface: $targetInterface"
    :delay 2s

    /tool sniffer set filter-interface=$targetInterface
    /tool sniffer start
    :log info "Started packet sniffer on interface: $targetInterface"

    :local currentLogCount $initialLogCount
    :local timeoutCounter 0

    :while (($currentLogCount = $initialLogCount) and ($timeoutCounter < $timeoutMax)) do={
        :set currentLogCount [:len [/log find message~"$searchString"]]
        :set timeoutCounter ($timeoutCounter + 1)
        :delay 1s
    }

    :if ($currentLogCount > $initialLogCount) do={
        :set isFound true
        /tool sniffer stop
        :log info "Match found! '$searchString' appeared in logs. Sniffer stopped and script finished."
    } else={
        /tool sniffer stop
        :log warning "1 minute elapsed without seeing '$searchString'. Restarting the bounce cycle..."
    }
}

But I am not any smarter after looking at the packets.

In “broken” state I only see PPPoE discovery packets:

1 0.000000000 Routerboardc_0d:41:b0 Broadcast PPPoED 42 Active Discovery Initiation (PADI)

In “working” state I saw this exchange exactly once (the packets before number 9 were IPv6 packets, which I have also seen in “broken” state at a low frequency):

9 2.120020160 Routerboardc_0d:41:b0 Broadcast PPPoED 42 Active Discovery Initiation (PADI)
10 2.225663362 JuniperNetwo_3b:9c:43 Routerboardc_0d:41:b0 PPPoED 73 Active Discovery Offer (PADO) AC-Name='LEIJ16'

I am not an expert in inspecting packets. I don’t really know what to look for.

Hm, just had a look in here: MikroTik wired interface compatibility - RouterOS - MikroTik Documentation.

Mine is definitely not listed in there.