Multiple Subnets on Single Bridge Issues

Hi, I am trying to create two separate subnets on the same bridge that can communicate with each other directly. The reason for doing this is to manually separate some devices logically from others but there is no need to isolate them security-wise. Since I need to be able to add any random device on the network to this separate subnet I don’t believe I can do something like a VLAN for certain ethernet ports.

The addresses are defined as follows:

[admin@MikroTik] > ip address print  
Flags: D - DYNAMIC
Columns: ADDRESS, NETWORK, INTERFACE
#   ADDRESS           NETWORK       INTERFACE
;;; defconf
0   10.0.0.1/24       10.0.0.0      bridge   
1 D x.x.x.x/22  x.x.x.x  ether1   
2   10.10.0.1/24      10.10.0.0     bridge

The issue I had initially was that packets could reach IPs on separate subnets but would almost immediately be dropped by the router because of the “drop invalid packets” rule in the FORWARD firewall chain.

[admin@MikroTik] > ip firewall filter print 
Flags: X - disabled, I - invalid; D - dynamic 
 0  D ;;; special dummy rule to show fasttrack counters
      chain=forward action=passthrough 

 1    ;;; defconf: accept established,related,untracked
      chain=input action=accept connection-state=established,related,untracked 

 2    ;;; defconf: drop invalid
      chain=input action=drop connection-state=invalid 

 3    ;;; defconf: accept ICMP
      chain=input action=accept protocol=icmp 

 4    ;;; defconf: accept to local loopback (for CAPsMAN)
      chain=input action=accept dst-address=127.0.0.1 

 5    ;;; defconf: drop all not coming from LAN
      chain=input action=drop in-interface-list=!LAN 

 6    ;;; defconf: accept in ipsec policy
      chain=forward action=accept ipsec-policy=in,ipsec 

 7    ;;; defconf: accept out ipsec policy
      chain=forward action=accept ipsec-policy=out,ipsec 

 8    ;;; defconf: fasttrack
      chain=forward action=fasttrack-connection hw-offload=yes connection-state=established,related 

 9    ;;; defconf: accept established,related, untracked
      chain=forward action=accept connection-state=established,related,untracked 

10    ;;; defconf: drop invalid
      chain=forward action=drop connection-state=invalid log=no log-prefix=""

11    ;;; defconf: drop all from WAN not DSTNATed

Disabling rule 10 resolved this issue and seemed to allow the devices to communicate. If anyone has any insight into why the router isn’t properly tracking these connections I would appreciate it, since I’m not sure why disabling the rule was necessary at all.

However, devices on the 10.10.0.0/24 subnet can now access services (e.g. a website) hosted on the 10.0.0.0/24 subnet fine, but devices on the 10.0.0.0/24 subnet cannot access services hosted on the 10.10.0.0/24 subnet. The packets are no longer being dropped as invalid, yet anything more complicated than ICMP does not seem to work at all.

Additionally, ICMP from 10.10.0.0/24 → 10.0.0.0/24 results in no redirects, but ICMP from 10.0.0.0/24 → 10.10.0.0/24 results in continuous ICMP redirects from the router, though I’m not sure if that’s relevant.

ping 10.10.0.10
PING 10.10.0.10 (10.10.0.10): 56 data bytes
64 bytes from 10.10.0.10: icmp_seq=0 ttl=64 time=6.361 ms
92 bytes from 10.0.0.1: Redirect Host(New addr: 10.10.0.10)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
 4  5  00 0054 98b1   0 0000  3f  01 cdff 10.0.0.229  10.10.0.10

64 bytes from 10.10.0.10: icmp_seq=1 ttl=64 time=7.069 ms
92 bytes from 10.0.0.1: Redirect Host(New addr: 10.10.0.10)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
 4  5  00 0054 dcf9   0 0000  3f  01 89b7 10.0.0.229  10.10.0.10

64 bytes from 10.10.0.10: icmp_seq=2 ttl=64 time=6.923 ms
92 bytes from 10.0.0.1: Redirect Host(New addr: 10.10.0.10)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
 4  5  00 0054 e14f   0 0000  3f  01 8561 10.0.0.229  10.10.0.10

64 bytes from 10.10.0.10: icmp_seq=3 ttl=64 time=11.011 ms
92 bytes from 10.0.0.1: Redirect Host(New addr: 10.10.0.10)
Vr HL TOS  Len   ID Flg  off TTL Pro  cks      Src      Dst
 4  5  00 0054 663e   0 0000  3f  01 0073 10.0.0.229  10.10.0.10

Besides the changes made to the IP assignment and the firewall everything else is from the default config. Here is the routing table as well:

[admin@MikroTik] > ip route print detail 
Flags: D - dynamic; X - disabled, I - inactive, A - active; c - connect, s - static, r - rip, b - bgp, o - ospf, i - is-is, d - dhcp, v - vpn, m - modem, y - bgp-mpls-vpn; H - hw-offloaded; + - ecmp 
   DAd   dst-address=0.0.0.0/0 routing-table=main pref-src="" gateway=x.x.x.x immediate-gw=x.x.x.x%ether1 distance=1 scope=30 target-scope=10 vrf-interface=ether1 suppress-hw-offload=no 
 
   DAc   dst-address=10.0.0.0/24 routing-table=main gateway=bridge immediate-gw=bridge distance=0 scope=10 suppress-hw-offload=no local-address=10.0.0.1%bridge 

   DAc   dst-address=10.10.0.0/24 routing-table=main gateway=bridge immediate-gw=bridge distance=0 scope=10 suppress-hw-offload=no local-address=10.10.0.1%bridge 

   DAc   dst-address=x.x.x.x/22 routing-table=main gateway=ether1 immediate-gw=ether1 distance=0 scope=10 suppress-hw-offload=no local-address=x.x.x.x%ether1

Any help is appreciated!

You’ve placed yourself in a pond of mud …

I assume your client devices are configured with a /24 subnet and the proper gateway address, so initially they know nothing about the other subnet being available on the same physical network. And this is what happens:

  1. deviceA (e.g. from 10.0.0.0/24 subnet) wants to communicate with deviceB (from the other subnet, e.g. 10.10.0.0/24)
  2. deviceA notices that deviceB is in another subnet, so deviceA decides to use its default gateway as next communication hop
  3. deviceA sends a packet to the router. The router makes the routing decision and passes the packet to deviceB.
  4. at the same time router notices that egress interface is the same as ingress (bridge in both cases), so it thinks “well, deviceA could communicate with deviceB directly”. So it sends out ICMP redirect message to deviceA telling it to communicate with deviceB directly
  5. at the time of passing initial packet, connection tracking takes a note about new connection
  6. deviceB replies to deviceA … since deviceA is in different subnet than deviceB, deviceB uses its gateway
  7. return packet arrives at router, it passes it to deviceA
  8. step 4 gets repeated for deviceB
  9. connection tracking notes the return packet, so far so good
  10. deviceA receives packet and sends another one (e.g. the third packet of 3-way TCP handshake). This time it decides to communicate directly with deviceB without using gateway … exactly as instructed by router in step 4.
    To do that, deviceA somehow has to know deviceB’s MAC address, seems like ARP works in this case somehow (might be that ARP replies can be routed?)
  11. deviceB replies to a packet, but ignores ICMP redirect, received in step 8, so it sends reply to gateway
  12. router’s connection tracking sees the packet, but it finds it invalid … the packet is a regular packet, but connection tracking never saw the connection get properly established, because the packet in step 10 bypassed the router. At this point, the packet is declared invalid and the firewall rule blocks it.
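
The sequence above can be sketched as a toy simulation. Treat it as an illustration only: the Host and ToyConntrack classes are made up for this post, deviceA is assumed to honor redirects while deviceB ignores them, and the state machine is a drastic simplification, not how RouterOS or netfilter really tracks TCP.

```python
import ipaddress

ROUTER = "router"


class Host:
    """Hypothetical LAN host with a /24 address and a default gateway."""

    def __init__(self, ip, prefix, honors_redirects):
        self.ip = ip
        self.net = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
        self.honors_redirects = honors_redirects
        self.redirected = set()  # destinations learned from ICMP redirects

    def next_hop(self, dst):
        # On-link or redirected destinations are reached directly (via ARP).
        if ipaddress.ip_address(dst) in self.net or dst in self.redirected:
            return "direct"
        return ROUTER

    def receive_redirect(self, dst):
        if self.honors_redirects:
            self.redirected.add(dst)


class ToyConntrack:
    """Rough stand-in for a stateful tracker (assumption, not real logic)."""

    def __init__(self):
        self.state = {}

    def classify(self, src, dst, flags):
        key = frozenset((src, dst))
        st = self.state.get(key)
        if flags == "S" and st is None:
            self.state[key] = "syn_sent"
            return "new"
        if flags == "SA" and st == "syn_sent":
            self.state[key] = "syn_recv"
            return "established"
        if flags == "A" and st == "syn_recv":
            self.state[key] = "established"
            return "established"
        if st == "established":
            return "established"
        return "invalid"


def simulate(send_redirects):
    a = Host("10.0.0.229", 24, honors_redirects=True)
    b = Host("10.10.0.10", 24, honors_redirects=False)
    ct = ToyConntrack()
    verdicts = []
    # TCP handshake followed by the first data packet from deviceB
    for src, dst, flags in [(a, b, "S"), (b, a, "SA"), (a, b, "A"), (b, a, "PA")]:
        if src.next_hop(dst.ip) == ROUTER:
            verdicts.append((flags, ct.classify(src.ip, dst.ip, flags)))
            if send_redirects:
                src.receive_redirect(dst.ip)  # same bridge in and out
        else:
            verdicts.append((flags, "bypassed router"))
    return verdicts


if __name__ == "__main__":
    print("redirects on: ", simulate(True))
    print("redirects off:", simulate(False))
```

With redirects on, the third handshake packet never reaches the toy tracker, so the next packet from deviceB is classified as invalid. With redirects off, every packet crosses the router and the flow stays valid.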

What to do? If you want to continue with two-subnet-on-single-interface topology, then you have to instruct connection tracking machinery not to track packets between these two subnets … and in this case default firewall will pass packets:

/ip/firewall/raw
add action=notrack src-address=10.0.0.0/24 dst-address=10.10.0.0/24
add action=notrack src-address=10.10.0.0/24 dst-address=10.0.0.0/24

It’s important to set both src and dst addresses, or else connections towards other networks (including the internet) might skip connection tracking, and that would mean no stateful firewall.
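
To illustrate why both addresses matter, here is a small sketch of the matching logic. The function names are made up and 8.8.8.8 is just a stand-in for "some internet host"; the code only mimics the address comparison, not RouterOS itself.

```python
import ipaddress

LAN_A = ipaddress.ip_network("10.0.0.0/24")
LAN_B = ipaddress.ip_network("10.10.0.0/24")


def notrack_both(src, dst):
    """Mimics the two raw rules: src in one LAN subnet AND dst in the other."""
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    return (s in LAN_A and d in LAN_B) or (s in LAN_B and d in LAN_A)


def notrack_src_only(src, dst):
    """Hypothetical sloppy variant that matches on the source address only."""
    s = ipaddress.ip_address(src)
    return s in LAN_A or s in LAN_B


if __name__ == "__main__":
    for dst in ("10.10.0.10", "8.8.8.8"):
        print(dst, notrack_both("10.0.0.229", dst), notrack_src_only("10.0.0.229", dst))
```

The src-only variant also matches the internet-bound packet, which is exactly the traffic that should stay tracked so the stateful rules (and NAT) keep working.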

Another, perhaps even more elegant solution, would be to configure router not to send out redirects. This way devices from both LAN subnets will always use gateway to communicate with each other (does increase load on router, but removes all the problems with firewall) as they will remain unaware of the fact they are using same physical network:

/ip/settings
set send-redirects=no

Thank you for the quick response!

I set the send-redirects option to “no”, and at first did not see any difference and still received redirects from the router. I then added the notrack rules in the firewall which seemed to get the network in a similar state as disabling the drop invalid packets rule did.

After restarting the router, the send-redirects change seems to have taken effect and clients no longer receive redirects when pinging, however, connectivity between subnets still exhibits very weird and intermittent behavior. For instance, before restarting the router the 10.10.0.0/24 devices could access web servers hosted in 10.0.0.0/24 and after the restart, they no longer could. Additionally, 10.0.0.0/24 devices could now access webservers on 10.10.0.0/24 but the load speed is extremely slow, to the point where the browser will often time out the request.

It is clear that there is connectivity between the subnets though, as they can consistently ping each other, however, I wonder if some other issue with routing is taking place that could cause these extreme slowdowns.

Here are the new settings after updating:

[admin@MikroTik] > ip firewall raw print 
Flags: X - disabled, I - invalid; D - dynamic 
 0  D ;;; special dummy rule to show fasttrack counters
      chain=prerouting action=passthrough 

 1    chain=prerouting action=notrack src-address=10.0.0.0/24 dst-address=10.10.0.0/24 

 2    chain=prerouting action=notrack src-address=10.10.0.0/24 dst-address=10.0.0.0/24

[admin@MikroTik] /ip/settings> print 
              ip-forward: yes
          send-redirects: no
     accept-source-route: no
        accept-redirects: no
        secure-redirects: yes
               rp-filter: no
          tcp-syncookies: no
    max-neighbor-entries: 8192
             arp-timeout: 30s
         icmp-rate-limit: 10
          icmp-rate-mask: 0x1818
             route-cache: yes
         allow-fast-path: yes
   ipv4-fast-path-active: no
  ipv4-fast-path-packets: 0
    ipv4-fast-path-bytes: 0
   ipv4-fasttrack-active: yes
  ipv4-fasttrack-packets: 721665
    ipv4-fasttrack-bytes: 875881718

After removing the redirects, no devices contain ARP entries from outside of their own subnets.

? (10.0.0.229) at xx:xx:xx:xx:xx:xx [ether] on eth0
? (10.0.0.46) at <incomplete> on eth0
? (10.0.0.1) at xx:xx:xx:xx:xx:xx [ether] on eth0

Here’s a sample tcpdump from a client in the 10.10.0.0/24 subnet attempting to access an SSH server hosted in the 10.0.0.0/24 subnet. The client hangs and a proper connection never seems to get established, but the server clearly attempts to respond. (These are multiple separate attempts)

00:11:43.752815 IP 10.10.0.89.55472 > 10.0.0.5.ssh: Flags [S], seq 4045251319, win 64240, options [mss 1460,sackOK,TS val 977048938 ecr 0,nop,wscale 7], length 0
00:11:43.752965 IP 10.0.0.5.ssh > 10.10.0.89.55472: Flags [S.], seq 3925179122, ack 4045251320, win 65160, options [mss 1460,sackOK,TS val 2019554341 ecr 977048938,nop,wscale 7], length 0
00:11:44.759685 IP 10.0.0.5.ssh > 10.10.0.89.55472: Flags [S.], seq 3925179122, ack 4045251320, win 65160, options [mss 1460,sackOK,TS val 2019555348 ecr 977048938,nop,wscale 7], length 0
00:11:46.775682 IP 10.0.0.5.ssh > 10.10.0.89.55472: Flags [S.], seq 3925179122, ack 4045251320, win 65160, options [mss 1460,sackOK,TS val 2019557364 ecr 977048938,nop,wscale 7], length 0
00:11:49.086513 IP 10.10.0.89.55472 > 10.0.0.5.ssh: Flags [F.], seq 22, ack 1, win 502, options [nop,nop,TS val 977054269 ecr 2019554341], length 0
00:11:49.086738 IP 10.0.0.5.ssh > 10.10.0.89.55472: Flags [.], ack 1, win 510, options [nop,nop,TS val 2019559675 ecr 977048938,nop,nop,sack 1 {22:23}], length 0
00:11:49.096548 IP 10.10.0.89.55472 > 10.0.0.5.ssh: Flags [P.], seq 1:22, ack 1, win 502, options [nop,nop,TS val 977054280 ecr 2019559675], length 21: SSH: SSH-2.0-OpenSSH_9.3
00:11:49.096687 IP 10.0.0.5.ssh > 10.10.0.89.55472: Flags [.], ack 23, win 510, options [nop,nop,TS val 2019559685 ecr 977054280], length 0
00:11:49.159463 IP 10.0.0.5.ssh > 10.10.0.89.55472: Flags [P.], seq 1:33, ack 23, win 510, options [nop,nop,TS val 2019559747 ecr 977054280], length 32: SSH: SSH-2.0-OpenSSH_9.2p1 Debian-2
00:11:49.166518 IP 10.10.0.89.55472 > 10.0.0.5.ssh: Flags [R], seq 4045251342, win 0, length 0
00:11:49.521069 IP 10.10.0.89.47964 > 10.0.0.5.ssh: Flags [S], seq 4023814519, win 64240, options [mss 1460,sackOK,TS val 977054706 ecr 0,nop,wscale 7], length 0
00:11:49.521220 IP 10.0.0.5.ssh > 10.10.0.89.47964: Flags [S.], seq 4286235992, ack 4023814520, win 65160, options [mss 1460,sackOK,TS val 2019560109 ecr 977054706,nop,wscale 7], length 0
00:11:50.551701 IP 10.0.0.5.ssh > 10.10.0.89.47964: Flags [S.], seq 4286235992, ack 4023814520, win 65160, options [mss 1460,sackOK,TS val 2019561140 ecr 977054706,nop,wscale 7], length 0
00:11:52.567702 IP 10.0.0.5.ssh > 10.10.0.89.47964: Flags [S.], seq 4286235992, ack 4023814520, win 65160, options [mss 1460,sackOK,TS val 2019563156 ecr 977054706,nop,wscale 7], length 0
00:11:53.891055 IP 10.10.0.89.47964 > 10.0.0.5.ssh: Flags [F.], seq 22, ack 1, win 502, options [nop,nop,TS val 977059074 ecr 2019560109], length 0
00:11:53.891293 IP 10.0.0.5.ssh > 10.10.0.89.47964: Flags [.], ack 1, win 510, options [nop,nop,TS val 2019564479 ecr 977054706,nop,nop,sack 1 {22:23}], length 0
00:11:53.900654 IP 10.10.0.89.47964 > 10.0.0.5.ssh: Flags [P.], seq 1:22, ack 1, win 502, options [nop,nop,TS val 977059086 ecr 2019564479], length 21: SSH: SSH-2.0-OpenSSH_9.3
00:11:53.900797 IP 10.0.0.5.ssh > 10.10.0.89.47964: Flags [.], ack 23, win 510, options [nop,nop,TS val 2019564489 ecr 977059086], length 0
00:11:53.963986 IP 10.0.0.5.ssh > 10.10.0.89.47964: Flags [P.], seq 1:33, ack 23, win 510, options [nop,nop,TS val 2019564552 ecr 977059086], length 32: SSH: SSH-2.0-OpenSSH_9.2p1 Debian-2
00:11:53.971047 IP 10.10.0.89.47964 > 10.0.0.5.ssh: Flags [R], seq 4023814542, win 0, length 0
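
One way to quantify what the dump shows is to count repeated SYN-ACKs per connection. The sketch below is a rough helper, not a real analysis tool: the regex only handles this text format of tcpdump output, the function name is made up, and the sample lines are the first connection from the trace above with everything after the window field trimmed.

```python
import re
from collections import Counter

# Matches lines such as:
# 00:11:43.752965 IP 10.0.0.5.ssh > 10.10.0.89.55472: Flags [S.], seq 3925179122, ...
LINE_RE = re.compile(
    r"^(?P<ts>\S+) IP (?P<src>\S+) > (?P<dst>\S+): "
    r"Flags \[(?P<flags>[^\]]+)\], seq (?P<seq>\d+)"
)


def count_synack_retransmits(lines):
    """Return {(src, dst, seq): extra_copies} for SYN-ACKs seen more than once."""
    seen = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("flags") == "S.":
            seen[(m.group("src"), m.group("dst"), m.group("seq"))] += 1
    return {key: n - 1 for key, n in seen.items() if n > 1}


SAMPLE = [
    "00:11:43.752815 IP 10.10.0.89.55472 > 10.0.0.5.ssh: Flags [S], seq 4045251319, win 64240",
    "00:11:43.752965 IP 10.0.0.5.ssh > 10.10.0.89.55472: Flags [S.], seq 3925179122, ack 4045251320, win 65160",
    "00:11:44.759685 IP 10.0.0.5.ssh > 10.10.0.89.55472: Flags [S.], seq 3925179122, ack 4045251320, win 65160",
    "00:11:46.775682 IP 10.0.0.5.ssh > 10.10.0.89.55472: Flags [S.], seq 3925179122, ack 4045251320, win 65160",
]

if __name__ == "__main__":
    print(count_synack_retransmits(SAMPLE))
```

The SYN-ACK follows the SYN within about 150 microseconds, so this capture was most likely taken on the server side. A server that keeps resending its SYN-ACK usually has not received the client's final ACK yet, which would fit packets in the client-to-server direction being lost or delayed.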

Even simple pinging saw pretty major packet loss, but only when across the two subnets.

ping 10.0.0.5 -f
PING 10.0.0.5 (10.0.0.5) 56(84) bytes of data.
....................................................................................................................^C
--- 10.0.0.5 ping statistics ---
371 packets transmitted, 255 received, 31.2668% packet loss, time 2601ms
rtt min/avg/max/mdev = 3.175/5.621/35.810/3.802 ms, pipe 4, ipg/ewma 7.030/4.356 ms

I stand by my first line of my previous post.

I’d think again (and again) about the necessity of running two IP subnets over a single ethernet broadcast domain.

I have white clothes on, not diving in LOL

I’d be happy to consider a different way of doing this. Is there any better way to achieve something similar (i.e. two subnets without having to connect devices to specific ports or APs)? Even if I can do VLAN tagging based on specific MAC addresses I would still need to route the traffic from bridge->bridge which I would think would result in the same behavior.

I’ll try to filter L3 broadcasts since I can’t set rules for L2 broadcasts and see if that changes anything. I’ll switch to a different approach if that doesn’t help.

Wish I could help, but I don’t have the networking acumen. All I can see is stuffing cooked spaghetti noodles up a straw.

Multihoming any interface is a bad idea. Totally possible but it does have a lot of side-effects in the firewall.

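One dirty hack you could potentially do is change the subnet mask so that both IPs on the bridge fall inside one larger subnet (a /16 is not wide enough for 10.0.0.x and 10.10.0.x; it would have to be /12 or shorter). Then only ARP would be needed.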
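
A quick way to check which prefix length would put both router addresses from the first post into one subnet, as a minimal sketch (the helper name is made up, and hosts would need the same wider mask for this to help):

```python
import ipaddress


def longest_common_prefix(a, b):
    """Longest prefix length whose network (built from a) also contains b."""
    target = ipaddress.ip_address(b)
    for plen in range(32, -1, -1):
        net = ipaddress.ip_network(f"{a}/{plen}", strict=False)
        if target in net:
            return plen
    return 0  # not reached: a /0 network contains every address


if __name__ == "__main__":
    plen = longest_common_prefix("10.0.0.1", "10.10.0.1")
    print(plen, ipaddress.ip_network(f"10.0.0.1/{plen}", strict=False))
```

For 10.0.0.1 and 10.10.0.1 the two addresses already differ in the second octet, so a /16 keeps them in separate networks and the mask has to be considerably shorter.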

Nope, from IP layer point of view, it would be vlanX ↔ vlanY traffic … in this case, bridge interface has no meaning any more. Many users are confused about bridge and different personalities, so read this tutorial, it might shed some light.