I’m trying to troubleshoot a “sluggish” problem on our network that apparently is only happening on one subnet. The current scenario is as follows:
Mikrotik RB3011 as firewall and router between 2 internal subnets, running RouterOS v6.46.7
Port SFP1 is connected to the Internet via a Cable Provider
Ports 1-5 are bridged together (bridge1) and Port 1 is used to connect to subnet ADMIN - 192.168.0.0/24
Ports 6-10 are bridged together (bridge2) and Port 6 is used to connect to subnet CFTV - 172.16.0.0/20
First switch for ADMIN network is a TP-Link model T1600G-28TS connected to Mikrotik Port 1 and has IP address 192.168.0.2
First switch for CFTV network is an Allied Telesis model x510-28GTX connected to Mikrotik Port 6 and has IP address 172.16.0.2
STP protocol is disabled in all network switches (Mikrotik, TP-Link, Allied Telesis)
The attached diagram represents this layer of the network.
During troubleshooting I’m observing the following behavior:
PING from RB3011 to 192.168.0.2 (TP-Link) via interface bridge1 gets several timeouts, despite the fact the switch is directly connected to RB3011 Port 1. See screen capture attached.
PING from RB3011 to 192.168.0.2 (TP-Link) via interface bridge1 with ARP PING selected gets no timeouts.
PING from RB3011 to 172.16.0.2 (Allied Telesis) via interface bridge2 gets no timeouts, with or without ARP PING selected.
PING from RB3011 to either 192.168.0.2 or 172.16.0.2 via the respective physical ports 1 and 6 gets timeouts, and the router attempts to reach the addresses via its external IP address.
I’m inclined to replace the TP-Link switch for testing purposes, but before I do that, could someone explain to me why PING on RouterOS behaves this way? I would expect the same results when pinging from the bridge or from the physical Ethernet port. Why is the behavior different?
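For reference, the bridge tests above can be reproduced from the RouterOS terminal. This is a minimal sketch, assuming the interface names bridge1 and bridge2 from the description; the count value is arbitrary.

```routeros
# Plain ICMP ping to the TP-Link switch through bridge1
/ping 192.168.0.2 interface=bridge1 count=20
# Same target with ARP ping (layer 2 only, no ICMP involved)
/ping 192.168.0.2 interface=bridge1 arp-ping=yes count=20
# Allied Telesis switch through bridge2
/ping 172.16.0.2 interface=bridge2 count=20
```

If ARP ping stays clean while ICMP ping drops, the layer 2 path is probably fine and the loss is more likely related to how the target (or whatever else answers for that IP) handles ICMP.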
When an interface is a member port of a bridge, it should not have any IP configuration attached to itself; the IP configuration should be attached to the bridge instead. So specifying the member port as the interface for ping doesn’t make much sense, but the ping utility probably has no check for this that would return a corresponding error message.
Worse than that, when the IP configuration is attached to a member port rather than the bridge, it “almost works”, with some weird effects now and then. So if that’s your case, move the IP configurations to the bridges and try again.
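A quick way to check where the addresses are attached is sketched below. The interface names ether1 and bridge1 are assumptions taken from the description above, and the last command only applies if an address really turns out to sit on a member port.

```routeros
# List IP addresses and the interface each one is attached to
/ip address print
# List bridge member ports
/interface bridge port print
# Only if an address is attached to a member port (ether1 assumed here),
# move it to the bridge that port belongs to
/ip address set [find interface=ether1] interface=bridge1
```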
Also, disable the “detect internet” functionality, it also can cause surprises.
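On RouterOS v6 this can be checked and turned off from the terminal roughly as follows; setting the detect list to none is the usual way to disable the feature.

```routeros
# Show the current detect-internet settings
/interface detect-internet print
# Stop detection on all interfaces
/interface detect-internet set detect-interface-list=none
```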
Thanks for the hints. The IP configuration is attached to the bridges 1 and 2, not to the Ethernet interfaces. I’ll check the “detect internet” feature to see if it’s enabled.
Any advantage in modifying the configuration to put RB3011 as “router only” (no bridges) between ports 1 and 6?
If you don’t need the other ports on the 3011 to be in the same bridge as ether1 (ether6) because you have enough ports on the external switches, you may save a couple of CPU cycles per packet, as the bridge processing would not be necessary. So depending on your traffic volume, you may prefer to keep the flexibility of having multiple ports bridged together, or you may prefer to relieve the CPU a bit by removing the bridge.
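As a rough sketch of what the "router only" variant could look like for the ADMIN side, assuming ether1 is the only port that needs to carry that subnet and that the address currently sits on bridge1:

```routeros
# Assumption: ether1 alone carries the ADMIN subnet
# Take ether1 out of the bridge
/interface bridge port remove [find interface=ether1]
# Re-attach the subnet address directly to the physical port
/ip address set [find interface=bridge1] interface=ether1
```

Anything else bound to bridge1 (DHCP server, firewall rules, interface lists) would have to be moved to ether1 as well, so it is safer to do this in Safe Mode or from a port that is not affected.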
Doesn’t having two bridges disable hardware offload for one of the bridges?
I’d suspect an IP conflict too. I don’t know if it can be spotted in IP/ARP, but I’ve seen one when running Tools/IP Scan over the whole subnet: one IP showed up twice with two different MACs, because it had been statically assigned to two devices by mistake.
There’s no such thing as weird; there’s an explanation for every “weird issue”, usually a misconfig or a bug. But I doubt that there’s a bug here, since I have a few RB3011 running without issues.
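The terminal equivalent of that check would be something like the sketch below, assuming the ADMIN subnet and bridge1 from the original post. A duplicate would show up as the same address listed with two different MAC addresses.

```routeros
# Scan the whole ADMIN subnet on bridge1 (stop with Ctrl+C or Q)
/tool ip-scan address-range=192.168.0.0/24 interface=bridge1
# See which MAC the router currently has cached for the switch address
/ip arp print where address=192.168.0.2
```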
RB3011 has 2 switch chips. As far as I understand, the fact I’m using SFP1 to connect to the Internet reduces bridge2 (ports 6-10) bandwidth to the CPU to “only” 1 Gbit (single link to CPU1), whereas bridge1 keeps 2 x 1 Gbit links (CPU0 and CPU1). Both bridges are using hardware offloading.
Regarding configuration, I’ve had an issue with one past version upgrade on RB3011 where the upgrade “changed” the IP address assignment from bridge1 to Ethernet 1 by itself. It took me a couple of days to pinpoint the problem. This is why the RB3011 is currently running 6.46.7 (long term) instead of 6.47.4 (stable).
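Whether offloading is really active on both bridges can be confirmed from the terminal. In the bridge port list, ports that are hardware offloaded carry the H flag.

```routeros
# Ports with the H flag are hardware offloaded
/interface bridge port print
# List the switch chips present on the board
/interface ethernet switch print
```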