ZeroTier Gateway Tunneling On MikroTik Device

Device: hap-ac3
RouterOS Version: I’ve tried on both v7.6 and v7.12.2. I don’t want to update to latest because changes after v7.12.2 break our MikroTik scripts. Regardless, I looked through the RouterOS changelog and didn’t see any mentions of Zerotier after v7.12.2.
Issue Description:

  • Intro: We have a bunch of routers controlling the networking of our robotic systems. Each router is attached to a ZeroTier VPN network for remote access.
  • Goal: We’d like to tunnel all non-ZeroTier traffic (see https://zerotier.atlassian.net/wiki/spaces/SD/pages/7110693/Overriding+Default+Route+Full+Tunnel+Mode) through a proxy server running on Azure. We have this proxy server set up and working. On a Linux computer, I can route all traffic through this interface.
  • Problem: ZeroTier has a parameter “allow default” that allows it to automatically create the ZeroTier gateway interface and route traffic through this server. It specifically creates a dynamic route to 0.0.0.0/0 with a smaller path cost than the actual gateway. As soon as I turn on this parameter, however, I lose all connection to the internet as well as the VPN. I suspect the problem has to do with routing gateway traffic. Because ZeroTier is a VPN without any real access to the internet, the router reroutes these VPN packets to the default gateway which is just another ZeroTier address. Instead of ZeroTier reverting to the literal gateway, the packets are simply dropped because the router can’t directly reach the Azure proxy server.
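For reference, a minimal sketch of how the setting described above can be toggled and its effect inspected from the RouterOS CLI. The interface name zerotier1 is an assumption (it is the usual default name); substitute your own. Since enabling the setting is what cuts connectivity here, run this from a local console rather than over the VPN.

```
# Assumption: the ZeroTier interface is named zerotier1
/zerotier/interface/set [find name=zerotier1] allow-default=yes

# Show every default route with its routing table, gateway and distance,
# so the ZeroTier-installed route can be compared with the WAN one
/ip/route/print detail where dst-address=0.0.0.0/0

# Revert if connectivity is lost
/zerotier/interface/set [find name=zerotier1] allow-default=no
```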

I suspect this is a bug, but I wouldn’t be completely shocked if I can change some routing rules to support ZeroTier tunneling. Even if it’s not a bug, it’s quite crazy that enabling that parameter causes a loss of internet on the device. Looking for any advice I can get!

Thanks!
Dani

This problem doesn’t actually have anything to do with ROS. You’ll probably get better help on Zerotier Community Support, or if you have a commercial license contact Zerotier support directly.

In short, ZeroTier does exactly what you’re telling it to do with ‘Allow Default’: it overrides the device’s default route, which is what you want for “Full Tunnel” mode. See the ZeroTier article “VPN Exit Node - Full Tunnel Mode or, Overriding Default Route” for more information.

Appreciate the response! Unfortunately, ZeroTier developers said the exact opposite:

Hello,
thanks for writing.
Mikrotik heavily modified ZeroTier to run on their routers, so most of the config is different and we’re not super familiar with it. (And don’t have access to their code)
To clarify, you’re checking “allow global” (and “allow default”) and experiencing the issue on the Mikrotik devices?
It might be best to check with Mikrotik for help on this, if they don’t already have an article for it.
regards

Well, the settings ‘allow global’ and ‘allow default’ still do exactly the same on ROS as on any other client. MT has only added an administrative client interface to ROS; it has not modified the actual client code (i.e. ZeroTier One).

Anyhow, you don’t need any of those settings to tunnel non-ZeroTier traffic. Just treat the ZT interface as any other tunnel interface and use normal routing or routing policies just like in Linux. Although even if ROS is just a layer on top of nftables, you’re never quite sure how it’s actually used in practice.

What are you trying to achieve more precisely without using ZeroTier terminology?

EDIT:
Btw, I didn’t find anything about your use case on the ZeroTier forum..

Basically, I want all non-zerotier traffic to be routed through a specific gateway on the zerotier network. I’m sure this works since I can set my computer allowDefault to true and the traceroute works as expected, routing all traffic to the gateway and eventually to the actual dst. However, the problem is not with the allowDefault not working on MikroTik. It does exactly what it’s supposed to and creates a 0.0.0.0/0 static route to the specific IP address. The issue is I lose all connectivity to the internet including the zerotier network. This means that the zerotier interface on mikrotik is likely trying to reroute all traffic through this default interface. But, because it’s really a virtual gateway, the packets never reach the internet.

That’s pretty basic. Let’s say:

  1. Your local network (LAN) is 192.168.10.0, with a ZeroTier address of 172.16.10.10.
  2. Zerotier network is 172.16.10.0,
  3. The “Exit Node” you want to route your LAN to is on network 192.168.20.0 with a ZeroTier address of 172.16.10.50.

In Zerotier Central, go to Networks > Settings > Advanced > Managed Routes, then

  1. add ‘192.168.20.0/24 via 172.16.10.50’ (assuming a /24 mask).
  2. add ‘192.168.10.0/24 via 172.16.10.10’ (assuming a /24 mask).

As another option, you could just handle this routing in ROS if it’s just for a couple of devices.

That’s it, just like using ordinary routing.
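Handling it in ROS instead of ZeroTier Central might look like the sketch below. It reuses the illustrative addresses from the example above and assumes /24 masks, which were not stated.

```
# On the LAN-side router (ZeroTier address 172.16.10.10):
# reach the exit node's LAN via the exit node's ZeroTier address
/ip/route/add dst-address=192.168.20.0/24 gateway=172.16.10.50

# On the exit-node router (ZeroTier address 172.16.10.50), if it also runs ROS:
# return route back to the first LAN
/ip/route/add dst-address=192.168.10.0/24 gateway=172.16.10.10
```

Static routes like these only need to exist on the routers involved, which is why this suits a small number of devices.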

Understood. I’ve already done these steps and I agree, this should be basic! I’ve added a 0.0.0.0/0 managed route in ZeroTier Central and the route shows up on the router. The problem is that once I enable allowDefault, I lose all connection to the internet because the ZeroTier interface doesn’t know how to properly route the packets to the cloud. Instead, from what I understand, the router keeps trying to route traffic through the default gateway (i.e. the ZeroTier gateway) but isn’t able to find the second gateway (i.e. the facility router).

P.S. and note, when I allowDefault on my linux computer, this all works flawlessly.

As I mentioned, with ROS, you don’t have control over the nftable chain and need to use policy routing to explicitly manage different paths for default routes. For example, you might have one default route for the router itself and another for the ZeroTier network. Then you can use the allowDefault setting. This approach works the same for any other tunnel types as well.

Ok. I’m starting to understand this. Here’s what I’m considering.

  • Create a new routing table called zerotier
  • Routing rule to send all packets destined for 0.0.0.0/0 to go to the 192.168.188.1 (tunnel server IP)
  • Setting up the default gateway of the zerotier table to be the main table so that packets can be routed to the internet

Is this correct? If so, how do I implement this?

Since this seems to be more of a hypothetical discussion, it’s kinda hard to give more specific advice on how to implement it. Are you familiar with using policy routing with ROS in general? Perhaps you could provide a brief description of the network topology that includes the Mikrotik router and the endpoint addresses of both the LAN and ZT networks. This would make it easier to provide more hands-on examples.

Basically, the company I work for buys robots off the shelf, programs them and leases them out to customers along with all the equipment needed to run them. Included with the robots are a bunch of networked equipment, all connected to a MikroTik router. This router is ZeroTier enabled and handles all routing (e.g. remote access to each of the networked components through port forwarding).

No, sadly I’m not very experienced using policy-based routing.

Network Topology:

  • LAN 1: 192.168.250.0/24 handles internal traffic between many of the components
  • LAN 2: 192.168.251.0/24 handles internal traffic between cameras and computer (according to camera documentation, these need to be on a dedicated network)
  • ZeroTier VPN network: 192.168.188.0/22
  • Azure tunnel server IP: 192.168.188.1

Also, I really appreciate your help and your patience, Larsa!

I’m on my way to a meeting and will be back later or tomorrow for more details if needed so this will be a ‘quick and dirty’ answer. There are several ways to solve this: ordinary routing, mangle, or policy routing with something like the example below:

  1. /routing table add name=ZerotierTable fib
  2. /ip route add dst-address=0.0.0.0/0 gateway=192.168.188.1%ZerotierInterface routing-table=ZerotierTable
  3. /routing rule add src-address=192.168.250.0/24 action=lookup-only-in-table table=ZerotierTable
  4. /routing rule add src-address=192.168.251.0/24 action=lookup-only-in-table table=ZerotierTable

Brief explanation:

1: Creates a separate routing table just for ZeroTier named ‘ZerotierTable’.
2: Specifies that the default route for the routing table ‘ZerotierTable’ is ‘192.168.188.1’ found at the network interface ‘ZerotierInterface’ (usually named ‘zerotier1’).
3/4: When traffic originates from ‘192.168.250/251’, it tells the routing engine to use the routing table ‘ZerotierTable’ which has its default gateway pointing to ‘192.168.188.1’. The ‘action=lookup-only-in-table’ means that if there is no working tunnel the default route will fail and not “leak” to the router’s default gateway.

This assumes that the Azure gateway ‘192.168.188.1’ knows its way back to ‘192.168.250/251’ i.e. they already exist in “Zerotier managed routes” or are routed some other way.
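The mangle variant mentioned at the top could be sketched roughly as follows. It assumes the table ‘ZerotierTable’ and its default route from steps 1 and 2 already exist, and it replaces the routing rules in steps 3 and 4 rather than complementing them.

```
# Mark forwarded traffic from both LANs so it is looked up in ZerotierTable.
# dst-address-type=!local keeps traffic addressed to the router itself unmarked.
/ip/firewall/mangle/add chain=prerouting src-address=192.168.250.0/24 dst-address-type=!local action=mark-routing new-routing-mark=ZerotierTable passthrough=no comment="LAN 1 via ZeroTier exit"
/ip/firewall/mangle/add chain=prerouting src-address=192.168.251.0/24 dst-address-type=!local action=mark-routing new-routing-mark=ZerotierTable passthrough=no comment="LAN 2 via ZeroTier exit"

# Confirm the rules are matching by watching the packet counters
/ip/firewall/mangle/print stats
```

Note that this marks LAN-to-LAN traffic as well; add a dst-address exclusion if that matters. Also, connections handled by a fasttrack rule skip mangle, so marking will not apply to them.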

As an alternative you can use NAT in both directions, like ‘/ip firewall nat add action=masquerade chain=srcnat out-interface=ZerotierInterface’, and the other way around on the gateway ‘192.168.188.1’ for traffic back to ‘192.168.250/251’.

EDIT:
Regarding remote access, I really hope you’re not using port forwarding from a public IP address which would be a pretty serious security vulnerability. Why not use ZeroTier and create a separate (OOB) management network for that purpose? This is a perfect fit for using Zerotier.

EDIT 2
Btw, is the company you work for located in LA?

Nah we’re in Boston.

Regarding the security hole, you’re very correct but we don’t do this through a public IP address. We only allow port forwarding through the zerotier interface. Additionally, we have ZeroTier routing rules preventing robot interfaces from contacting others. In fact, none of these routers have any public IP addresses. All are NAT’ed through customer facility routers.

So I think I didn’t communicate the situation all that well. LANs 1 and 2 are only internal to each robot network. These allow us to use consistent IP addresses for each device across robots. For example, all robot computers have the same IP address, but each is remotely accessible through the dedicated router’s ZeroTier interface along with a port forwarding rule. The internal devices can reach the internet through a masquerade (src-nat) rule on the router. The Azure server can reach each device through the ZeroTier IP and the corresponding NAT port. I’m not sure this changes the configuration steps you gave me though. Intuitively, these should still work. Will try this now!

I can’t get this to work unfortunately. I’ve tried using both routing rules and mangle prerouting but traceroutes don’t show the expected results. Here’s my config:

[admin@tutor-ruby-router] /ip/route> print detail  
Flags: D - dynamic; X - disabled, I - inactive, A - active; c - connect, s - static, r - rip, b - bgp, o - ospf, i - is-is, d - dhcp, v - vpn, m - modem, y - bgp-mpls-vpn; H - hw-offloaded; + - ecmp 
   DAd   dst-address=0.0.0.0/0 routing-table=main pref-src="" gateway=192.168.1.1 immediate-gw=192.168.1.1%5ghz_up distance=5 scope=30 target-scope=10 vrf-interface=5ghz_up suppress-hw-offload=no 

   DAc   dst-address=192.168.1.0/24 routing-table=main gateway=5ghz_up immediate-gw=5ghz_up distance=0 scope=10 suppress-hw-offload=no local-address=192.168.1.37%5ghz_up 

   DAc   dst-address=192.168.188.0/22 routing-table=main gateway=zerotier2 immediate-gw=zerotier2 distance=0 scope=10 suppress-hw-offload=no local-address=192.168.190.1%zerotier2 

   DAc   dst-address=192.168.250.0/24 routing-table=main gateway=tutor-lan immediate-gw=tutor-lan distance=0 scope=10 suppress-hw-offload=no local-address=192.168.250.1%tutor-lan 

 0  As   dst-address=192.168.251.0/24 routing-table=main pref-src="" gateway=192.168.250.50 immediate-gw=192.168.250.50%tutor-lan distance=1 scope=30 target-scope=10 suppress-hw-offload=no 

 1  As   dst-address=0.0.0.0/0 routing-table=zerotier pref-src="" gateway=192.168.188.1 immediate-gw=192.168.188.1%zerotier2 distance=3 scope=30 target-scope=10 suppress-hw-offload=no 

 2  As   dst-address=192.168.188.0/22 routing-table=zerotier pref-src="" gateway=zerotier2 immediate-gw=zerotier2 distance=1 scope=30 target-scope=10 suppress-hw-offload=no

Routing rule:

src-address=192.168.250.0/23 dst-address=0.0.0.0/0 action=lookup-only-in-table table=zerotier

OR:

Mangle rule:

chain=prerouting action=mark-routing new-routing-mark=zerotier passthrough=yes dst-address-list=0.0.0.0/0 in-interface=tutor-lan log=no log-prefix=""



➜  ~ traceroute 8.8.8.8
traceroute to 8.8.8.8 (8.8.8.8), 30 hops max, 60 byte packets
 1  router.lan (192.168.250.1)  7.424 ms  7.391 ms  7.377 ms
 2  * 192.168.1.1 (192.168.1.1)  64.630 ms *
 3  .......

I get the general problem, but kinda lost in what’s where. Some simple diagram would help here.

But when things don’t just work, you can look at /ip/firewall/connection (and filter) to see what’s going on with NAT/routing, as NAT is my generalized guess here.
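As a rough sketch of that check, run something like the following while the traceroute to 8.8.8.8 is in progress. The address is simply the one from the traceroute above.

```
# Show tracked connections towards the test address; in the detail output,
# compare src-address with reply-dst-address to see whether src-nat was applied
/ip/firewall/connection/print detail where dst-address~"8.8.8.8"

# Packet counters per NAT rule show which rule the traffic actually hit
/ip/firewall/nat/print stats
```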

WRT ZT providing 0.0.0.0/0 routes… I don’t know if this was your original trouble, but you need to enable “Allow Global” for it to be populated. That is a client-only setting (i.e. you cannot tell a ZT member to “use ZT as default gateway” from the controller/my.zerotier.com).


This all makes perfect sense: access to a robot is via port forwarding on the local MikroTik’s ZeroTier address, and that MikroTik is connected to the robot LAN and some local internet source. The ZT address then forms the “global LAN” address for that robot (where the robot IP itself may not be globally unique).

So, to summarize (feel free to correct me): you want any other traffic originating from the “robot router” (like DNS etc.) to use the 0.0.0.0/0 gateway from ZeroTier? And only the ZT tunnels themselves would directly use the local internet?

And using the 0.0.0.0/0 route from ZT didn’t work to do this. ZT searches beyond the routing table for its paths (limited by the interfaces selected in the “zt1” instance), so it should find the “real WAN” itself even if the default route is installed. I haven’t tested that recently, but I’m pretty sure it worked at some point.

I’m having trouble wrapping my head around this sentence: “These allow us to use consistent IP addresses for each device across robots. For example, all robot computers have the same IP address but each is remotely accessible through the dedicated router ZeroTier interface along with a port forwarding rule.”

Are you using the same subnet at all sites for LAN1 and LAN2, i.e. 192.168.250/251? If 192.168.188.1 isn’t a central exit node to the internet but is used for something else why the need for 0.0.0.0/0? Are there other subnets behind LAN1/2?

To get a better picture of the whole thing, perhaps you could expand the previous Network Topology with an example of two different sites, each with a couple of node addresses on LAN1/LAN2, and how they need to communicate with some service on a network behind 192.168.188.1 (or vice-versa).

That’s what confused me too. I think… there is a ZT member at Azure acting as an internet gateway, and the idea is that the “robot router” sends everything except the ZT tunnels themselves via that. I get the part about using the ZT address as a “global LAN” IP address for the device doing the port forwarding.

But where I get lost is that there are multiple subnets being mentioned… those might require NAT, or making sure all subnets are routed everywhere. If the desire was just to have all traffic go via Azure’s ZT gateway, then I’m left wondering if “Allow Global”, a 0.0.0.0/0 route going to the Azure ZT member (that has an internet route out), and a src-nat from the LANs to ZT may be all that’s needed here. ZT will find paths beyond the routing table, so my thought is that adding the 0.0.0.0/0 route should have worked initially.

Also keep in mind that any routes added by ZT will have distance=1 by default. This may not be what you want, since there may be local routes at the same distance. I’d also increase the distance= on the ZT instance (“zt1” typically). The default distance=1 of ZT-injected /ip/route entries can easily lead to ECMP routes to local subnets being accidentally created, since routes with the same distance= are load balanced.
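A hedged sketch of that change, assuming the instance is named zt1 and that your RouterOS version exposes a route-distance property on the ZeroTier instance (check ‘/zerotier/print detail’ first). The value 10 is only an illustration.

```
# Assumption: instance name zt1; route-distance must be supported by your ROS version
/zerotier/set zt1 route-distance=10

# Verify: dynamic routes injected by ZT should now show the new distance
/ip/route/print detail where dynamic
```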

Anyway, a diagram would be a good start here, since I’m kinda guessing.

We are not using routers for the standard “router” use-cases. Each robot we deploy requires a significant number of networked components for operation. For example, a computer, robot control box, PDU, managed switch, etc. are all put in a server box next to the robot arm. Along with all of this equipment, we install a MikroTik router inside the box to handle the networking between these devices, direct access into the internal network for debugging purposes, etc. ZeroTier mixed with port forwarding gives us remote access to this equipment. The 192.168.250.0/24 and 192.168.251.0/24 LAN’s are the internal router LANs. This allows us to use static IP’s while communicating between devices within the server box’s network and static port forwarding.

Here’s a link to a networking diagram. I can’t figure out how to upload an image here. This may not be what you’re looking for so let me know if I can improve this in any way:
https://www.dropbox.com/scl/fi/xpqow9t2xjw8v5kafn3ar/networking_diagram_jan10_24-1.jpg?rlkey=aiqgqegxunz7tca31o8o6yrn6&st=9ypn51et&dl=0

Here is the diagram without the link (there is 1MB limit on images):
Screenshot 2024-06-05 at 10.58.10 AM.jpg