"Spotty" Starlink optimizations?

I have replaced Starlink’s router with a MikroTik. My question is about situations where Dishy is obstructed and the connection drops every X minutes.
Can I do something to tune the TCP or WireGuard stacks so the WG tunnel is re-established faster? Can I do anything to make TCP more responsive and better at handling the intermittent failures?
Please note that part of the roaming is potentially getting a new IP in the process.

Tricky problem. I guess it depends on how long the outages are… Now, moving the Starlink dish so it’s obstruction-free is obviously the better plan, but I presume you’re asking since that isn’t possible.

One possible step: I suppose you can try lowering “persistent-keepalive”, so you’re not waiting for traffic to find out that Starlink is in an outage window.

Another factor, beyond the outages reported in the Starlink app, is that I suspect speed also gets significantly reduced just before/after satellites come into view. This may have all sorts of effects on the traffic inside the tunnel, especially since the RTT will be changing dramatically as Starlink and WG both reconnect. So using a queue like fq_codel on the inner traffic before it goes over WG may help smooth out the RTT, etc.

The third issue, and perhaps your biggest problem, is if the IP does change after reconnection. If it’s the same, everything in /ip/firewall/connection will be fine. But if it changes… all the cached connections on the MikroTik firewall become invalid. This is more problematic. WG does not have “on-up” / “on-down” scripts, so you can’t just automatically flush/remove the connections on reconnection. So I’m not sure how to get these flushed sooner, other than changing the connection tracking timeouts (which may have other effects on normal traffic). Perhaps someone else has an idea here.

Assuming all your recommendations are for WG, let me pong back some feedback & questions:

  • Yes, I’ve dropped the keepalive to 1 sec and the WG tunnel seems to recover faster now. This doesn’t make much sense to me, however, because I constantly have traffic going via the tunnel, so I’d expect the keepalive to have no effect.
  • Speed is not a problem, as most of the traffic is low-volume but constant, such as DNS requests.
  • As a matter of fact, I haven’t tracked how often the IP changes. What you describe is a valid problem, however. I’d expect the Starlink terminal or the CGNAT infrastructure to take care of this? Or I can use some netwatch rule to force a reset of the connections.

Edit: just had a small outage and WG took a full 10 seconds to reconnect. On the other end is Ubuntu, not sure if this matters.
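
To put numbers on the tunnel state from the Ubuntu end, the age of each peer's last handshake can be checked. This is only a rough sketch: it assumes the output format of `wg show <interface> latest-handshakes` (one line per peer, public key and Unix timestamp separated by a tab, with 0 meaning no handshake yet). The function name `stale_peers` and the 180-second default threshold are choices made for this example, not anything provided by WireGuard tooling.

```python
import time


def stale_peers(wg_output, now=None, max_age_s=180):
    """Return {public_key: age_in_seconds} for peers with an old handshake.

    A peer that has never completed a handshake (timestamp 0) is reported
    with an age of None.
    """
    now = time.time() if now is None else now
    stale = {}
    for line in wg_output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or unexpected lines
        key, timestamp = parts[0], int(parts[1])
        if timestamp == 0:
            stale[key] = None
            continue
        age = int(now - timestamp)
        if age > max_age_s:
            stale[key] = age
    return stale
```

The text could come from something like `subprocess.run(["wg", "show", "wg0", "latest-handshakes"], capture_output=True, text=True).stdout`, where `wg0` is a placeholder interface name and the command typically needs root. Logging the result once a second during an outage would show whether the delay is in the handshake or somewhere else.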

After monitoring for some time, it turns out the IP and the outgoing port on Starlink’s egress gateway are stable. Reading the WG protocol, it sounds like there’s a hardcoded 15 sec rekey timeout that triggers the re-negotiation. My guess would be that WG’s internal state machine drops negotiation attempts that come in before the timeout has kicked in.
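
For what it's worth, the 15 seconds appears to correspond to two constants in the WireGuard whitepaper added together: Keepalive-Timeout (10 s) plus Rekey-Timeout (5 s). As I read it, a peer that has sent data and heard nothing back for that long initiates a new handshake, then retries every Rekey-Timeout. The sketch below is a simplified model of that timing, not WireGuard's actual code: `recovery_time` is a made-up helper, and it ignores retry jitter, the handshake round trip, and however long Starlink itself takes to come back.

```python
import math

# Timer constants from the WireGuard whitepaper, in seconds.
REKEY_TIMEOUT = 5
KEEPALIVE_TIMEOUT = 10


def recovery_time(outage_s):
    """Seconds from the first unanswered packet until traffic can flow again.

    Simplified model: assumes constant traffic, still-valid session keys
    for short outages, no jitter, and an instantaneous handshake.
    """
    new_handshake_after = KEEPALIVE_TIMEOUT + REKEY_TIMEOUT  # 15 s
    if outage_s < new_handshake_after:
        # The link came back before any handshake was triggered,
        # so the existing session simply resumes.
        return outage_s
    # Handshake initiations go out at 15 s, 20 s, 25 s, ...
    # The first one sent once the link is back gets through.
    retries = math.ceil((outage_s - new_handshake_after) / REKEY_TIMEOUT)
    return new_handshake_after + retries * REKEY_TIMEOUT


for outage in (3, 17, 21):
    print(outage, recovery_time(outage))
```

Under these assumptions an outage shorter than 15 seconds should not need a new handshake at all, so a 10-second recovery after a small outage may have more to do with the link or NAT state than with WG's timers.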

Get a Bigleaf cellular package with their service.
Plug the Starlink and the Bigleaf cell into their box.
Feed that to your Tik router.

They dynamically check the connection every single second.
When Starlink goes out… it will use the cellular.
AND YOUR IP ADDRESS WILL NOT CHANGE.