2 routers, and VLANs, ICMP works, but TCP does not work well.

Hi,

I’m having a problem with routing through a 2nd router when using VLANs.

Diagram:

  ISP1     ISP2
   |        |
router1  router2
   \        /
    \      /
     switch----workstation

ISP1 is my primary internet with a distance of 1, ISP2 is my backup with a distance of 2. My previous setup had both ISPs in a single router and it works fine.

If ISP2 fails, no worries. However if ISP1 fails I need all traffic being routed to ISP1 to route through ISP2. This works as expected for the default VLAN. This however does not work for vlan20, except for ICMP traffic. Traceroute reports exactly what I expect, but actual traffic can be best described as flaky and very slow with timeouts.

If I set the workstation's default gateway to the router2 vlan20 address, it works just fine, so it is the workstation → router1 → router2 → isp2 path that is unreliable. router1 and router2 can ping each other over the default and vlan20 interfaces.

What am I missing here? Thanks.

I’d have to see your configurations to make a diagnosis.

Post an export from each router and the forum crew will take a look:

/export hide-sensitive

Thanks for taking a look. My configs are below.

Using the diagram above, this is how it is wired:

switch sfp-sfpplus1 → router1 sfp-sfpplus1
switch ether6 → router2 ether2
router1 ether1 → ISP1
router2 ether1 → ISP2
workstation → ether17 (when testing vlan20)
workstation → ether19 (when testing default vlan)

Desired Behavior:

  1. If router1 fails, then use router2: uses VRRP. PASS with both default vlan and vlan20. VRRP .254 def gateway must be used.

  2. Any machine must be able to use the router1 or router2 gateways (.1 or .2) directly. PASS. I need this for my monitoring code to verify the health of the network, as well as for vtrunkd.

  3. If ISP1 fails, then route out router2 AND support PBR using mangle and address lists: uses routing rules and filters. PASS with default vlan IFF default gateway is NOT VRRP. FAIL with vlan20.

If I dump router2, put ISP2 on router1 ether2, and change that to be marked as isp2, then everything in #3 works as expected (this is my current running solution on a different router; the configs below are in a lab getting ready for deployment). OTOH, I do not have a 2nd router or VRRP. Not a huge issue, but I figured why not? since I’ll have a 2nd rack mount router. The HeX playing the role of router2 is just a place holder.

Other things I have tried:

  1. disable fasttrack. I’ve had issues with fasttrack and vlans in the past.
  2. Connect ether3 to ether3 of the routers and use a different subnet to route from 1 to 2.
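
For anyone repeating step 1, a one-line sketch of switching fasttrack off without deleting the rule. It matches the defconf fasttrack rule shown in the exports below and would be run on each router:

```routeros
# disable (not remove) every fasttrack rule in the filter table
/ip firewall filter disable [ find action=fasttrack-connection ]
```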

Switch:

# sep/10/2017 08:19:40 by RouterOS 6.41rc26
# software id = 4HJ8-GWWB
#
# model = CRS326-24G-2S+
/interface bridge
add admin-mac=XX:XX:XX:XX:XX:XX auto-mac=no igmp-snooping=no name=bridge1 vlan-filtering=yes
/interface vlan
add interface=bridge1 name=vlan20 vlan-id=20
/interface bridge port
add bridge=bridge1 interface=ether1
add bridge=bridge1 interface=ether2
add bridge=bridge1 interface=ether3
add bridge=bridge1 interface=ether4
add bridge=bridge1 interface=ether5
add bridge=bridge1 interface=ether6
add bridge=bridge1 interface=ether7
add bridge=bridge1 interface=ether8
add bridge=bridge1 interface=ether9
add bridge=bridge1 interface=ether10
add bridge=bridge1 interface=ether11
add bridge=bridge1 interface=ether12
add bridge=bridge1 interface=ether13
add bridge=bridge1 interface=ether14
add bridge=bridge1 interface=ether15
add bridge=bridge1 interface=ether16
add bridge=bridge1 interface=ether17 pvid=20
add bridge=bridge1 interface=ether18
add bridge=bridge1 interface=ether19
add bridge=bridge1 interface=ether20
add bridge=bridge1 interface=ether21
add bridge=bridge1 interface=ether22
add bridge=bridge1 interface=ether23
add bridge=bridge1 interface=ether24
add bridge=bridge1 interface=sfp-sfpplus1
add bridge=bridge1 interface=sfp-sfpplus2
/interface bridge vlan
add bridge=bridge1 tagged=sfp-sfpplus1,ether6 untagged=ether17 vlan-ids=20-30
/ip address
add address=172.16.1.1/16 interface=bridge1 network=172.16.0.0
add address=172.20.1.1/16 interface=vlan20 network=172.20.0.0
/ip route
add distance=1 gateway=172.16.0.1
/system identity
set name=switch
/system package update
set channel=release-candidate
/system routerboard settings
set boot-os=router-os

router1:

/interface bridge
add fast-forward=no name=bridge1
/interface vrrp
add interface=sfp-sfpplus1 name=vrrp1 priority=254
/interface vlan
add interface=sfp-sfpplus1 name=vlan20 vlan-id=20
/interface vrrp
add interface=vlan20 name=vrrp20 priority=254 vrid=20
/interface list
add name=external
/interface wireless security-profiles
set [ find default=yes ] supplicant-identity=MikroTik
/routing bgp instance
set default as=100 disabled=yes
/interface bridge port
add bridge=bridge1 interface=ether4
add bridge=bridge1 interface=ether5
add bridge=bridge1 interface=ether6
add bridge=bridge1 interface=ether7
add bridge=bridge1 interface=combo1
/interface list member
add interface=ether1 list=external
add disabled=yes interface=ether2 list=external
/ip address
add address=172.16.0.254 interface=vrrp1 network=172.16.0.254
add address=172.16.0.1/16 interface=sfp-sfpplus1 network=172.16.0.0
add address=172.20.0.1/16 interface=vlan20 network=172.20.0.0
add address=172.20.0.254 interface=vrrp20 network=172.20.0.254
/ip dhcp-client
add disabled=no interface=ether1
/ip firewall address-list
add address=172.20.1.10 comment="preferisp2, failover to isp1" disabled=yes list=preferisp2
add address=172.20.1.10 comment="only out isp1" disabled=yes list=onlyisp1
add address=172.20.1.10 comment="preferisp1, failover to isp2" disabled=yes list=preferisp1
add address=172.20.1.10 comment="only out isp2" disabled=yes list=onlyisp2
add address=172.16.1.10 comment="preferisp2, failover to isp1" disabled=yes list=preferisp2
add address=172.16.1.10 comment="only out isp1" disabled=yes list=onlyisp1
add address=172.16.1.10 comment="preferisp1, failover to isp2" disabled=yes list=preferisp1
add address=172.16.1.10 comment="only out isp2" disabled=yes list=onlyisp2
/ip firewall filter
add action=accept chain=input comment="defconf: accept ICMP" protocol=icmp
add action=accept chain=input comment="defconf: accept established,related" connection-state=established,related
add action=drop chain=input comment="drop all from WAN, external interface list" in-interface-list=external
add action=fasttrack-connection chain=forward comment="defconf: fasttrack" connection-state=established,related
add action=accept chain=forward comment="defconf: accept established,related" connection-state=established,related
add action=drop chain=forward comment="defconf: drop invalid" connection-state=invalid
add action=drop chain=forward comment="drop all from WAN not DSTNATed, external interface list" connection-nat-state=\
    !dstnat connection-state=new in-interface-list=external
/ip firewall mangle
add action=mark-routing chain=prerouting comment="prefer isp1, failover to isp2 (default)" new-routing-mark=preferisp1 \
    passthrough=yes
add action=mark-routing chain=prerouting comment="only out isp1 (onlyisp1 address list)" new-routing-mark=onlyisp1 \
    passthrough=yes src-address-list=onlyisp1
add action=mark-routing chain=prerouting comment="prefer isp2, failover to isp1 (preferisp2 address list)" disabled=\
    yes new-routing-mark=preferisp2 passthrough=no src-address-list=preferisp2
add action=mark-routing chain=prerouting comment="only out isp2 (onlyisp2 address list)" new-routing-mark=onlyisp2 \
    passthrough=yes src-address-list=onlyisp2
/ip firewall nat
add action=masquerade chain=srcnat comment="external interface list" out-interface-list=external
/ip route
add check-gateway=ping distance=2 gateway=172.16.0.2 routing-mark=isp2
/ip route rule
add comment="isp1 only / disallow failover" routing-mark=onlyisp1 table=main
add routing-mark=onlyisp1 table=isp1
add action=unreachable routing-mark=onlyisp1
add comment="prefer isp1 but allow failover to isp2" routing-mark=preferisp1 table=main
add routing-mark=preferisp1 table=isp1
add routing-mark=preferisp1 table=isp2
add comment="isp2 only / disallow failover" routing-mark=onlyisp2 table=main
add routing-mark=onlyisp2 table=isp2
add action=unreachable routing-mark=onlyisp2
add comment="prefer isp2 but allow failover" disabled=yes routing-mark=preferisp2 table=main
add disabled=yes routing-mark=preferisp2 table=isp2
add disabled=yes routing-mark=preferisp2 table=isp1
add comment="unmarked traffic" table=main
add table=isp1
add table=isp2
/lcd
set backlight-timeout=never color-scheme=dark default-screen=interfaces
/lcd interface
set sfp-sfpplus1 disabled=yes
set combo1 disabled=yes
set ether2 disabled=yes
set ether3 disabled=yes
set ether4 disabled=yes
set ether5 disabled=yes
set ether6 disabled=yes
set ether7 disabled=yes
/routing bgp peer
add disabled=yes in-filter=from-router2 name=router2 remote-address=172.16.0.2 remote-as=100
/routing filter
add chain=dynamic-in distance=1 prefix=0.0.0.0 prefix-length=0 set-check-gateway=ping set-routing-mark=isp1
add bgp-local-pref=100 chain=from-router2 disabled=yes set-check-gateway=ping set-distance=3 set-routing-mark=isp3
/system identity
set name=router1

router2:

# sep/10/2017 08:47:41 by RouterOS 6.41rc26
# software id = 8CFI-AN74
#
# model = 960PGS
/interface bridge
add admin-mac=XX:XX:XX:XX:XX:XX auto-mac=no comment=defconf fast-forward=no igmp-snooping=no name=bridge1
/interface ethernet
set [ find default-name=ether2 ] poe-out=off
set [ find default-name=ether3 ] poe-out=off
/interface vrrp
add interface=ether2 name=vrrp1
/ip neighbor discovery
set ether1 discover=no
/interface vlan
add interface=ether2 name=vlan20 vlan-id=20
/interface vrrp
add interface=vlan20 name=vrrp20 vrid=20
/interface wireless security-profiles
set [ find default=yes ] supplicant-identity=MikroTik
/ip hotspot profile
set [ find default=yes ] html-directory=flash/hotspot
/port
set 0 baud-rate=115200 name=usb1
/routing bgp instance
set default as=100 client-to-client-reflection=no disabled=yes redistribute-connected=yes
/interface bridge port
add bridge=bridge1 comment=defconf disabled=yes interface=ether2
add bridge=bridge1 comment=defconf hw=no interface=sfp1
add bridge=bridge1 disabled=yes interface=ether3
add bridge=bridge1 interface=ether4
add bridge=bridge1 interface=ether5
/interface bridge vlan
add bridge=bridge1 disabled=yes tagged=ether2 vlan-ids=20
/ip address
add address=172.16.0.2/16 comment=defconf interface=ether2 network=172.16.0.0
add address=172.16.0.254 interface=vrrp1 network=172.16.0.254
add address=172.20.0.2/16 interface=vlan20 network=172.20.0.0
add address=172.20.0.254 interface=vrrp20 network=172.20.0.254
/ip dhcp-client
add comment=defconf dhcp-options=hostname,clientid disabled=no interface=ether1
/ip firewall filter
add action=accept chain=input comment="defconf: accept ICMP" protocol=icmp
add action=accept chain=input comment="defconf: accept established,related" connection-state=established,related
add action=drop chain=input comment="defconf: drop all from WAN" in-interface=ether1
add action=fasttrack-connection chain=forward comment="defconf: fasttrack" connection-state=established,related
add action=accept chain=forward comment="defconf: accept established,related" connection-state=established,related
add action=drop chain=forward comment="defconf: drop invalid" connection-state=invalid
add action=drop chain=forward comment="defconf:  drop all from WAN not DSTNATed" connection-nat-state=!dstnat \
    connection-state=new in-interface=ether1
/ip firewall mangle
add action=mark-routing chain=prerouting comment="only out isp2 (onlyisp2 address list)" disabled=yes \
    new-routing-mark=onlyisp2 passthrough=yes src-address-list=onlyisp2
add action=mark-routing chain=prerouting comment="prefer isp2, failover to isp1 (default)" disabled=yes \
    new-routing-mark=preferisp2 passthrough=yes
add action=mark-routing chain=prerouting comment="prefer isp1, failover to isp2 (preferisp1 address list)" disabled=\
    yes new-routing-mark=preferisp1 passthrough=yes src-address-list=preferisp1
add action=mark-routing chain=prerouting comment="only out isp1 (onlyisp1 address list)" disabled=yes \
    new-routing-mark=onlyisp1 src-address-list=onlyisp1
/ip firewall nat
add action=masquerade chain=srcnat comment="defconf: masquerade" out-interface=ether1
/ip route rule
add comment="isp2 only / disallow failover" routing-mark=onlyisp2 table=main
add routing-mark=onlyisp2 table=isp2
add action=unreachable routing-mark=onlyisp2
add comment="prefer isp2 but allow failover to isp1" disabled=yes routing-mark=preferisp2 table=main
add disabled=yes routing-mark=preferisp2 table=isp2
add disabled=yes routing-mark=preferisp2 table=isp1
add comment="isp1 only / disallow failover" disabled=yes routing-mark=onlyisp1 table=main
add disabled=yes routing-mark=onlyisp1 table=isp1
add action=unreachable disabled=yes routing-mark=onlyisp1
add comment="prefer isp1 but allow failover" disabled=yes routing-mark=preferisp1 table=main
add disabled=yes routing-mark=preferisp1 table=isp1
add disabled=yes routing-mark=preferisp1 table=isp2
add comment="unmarked traffic" table=main
add table=isp2
add disabled=yes table=isp1
/ip smb shares
set [ find default=yes ] directory=/pub
/routing bgp peer
add disabled=yes name=router1 out-filter=to_router1 remote-address=172.16.0.1 remote-as=100
/routing filter
add chain=dynamic-in distance=1 prefix=0.0.0.0 prefix-length=0 set-check-gateway=ping set-routing-mark=isp2
add action=discard chain=to_router1 disabled=yes prefix=10.0.0.0/24
add action=discard chain=to_router1 disabled=yes prefix=172.16.0.0/16
add action=discard chain=to_router1 disabled=yes prefix=172.20.0.0/16
/system identity
set name=router2
/system package update
set channel=release-candidate
/tool graphing interface
add
/tool mac-server
set [ find default=yes ] disabled=yes
add interface=bridge1
/tool mac-server mac-winbox
set [ find default=yes ] disabled=yes
add interface=bridge1

Remember that when your VLAN has the default MTU of 1500, the LAN transporting it must have room for 1504-byte frames.
That is normally not an issue with MikroTik equipment when using ethernet directly, but you are using bridges and 6.41RC software (not a good idea…), and it could be different there.
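
One way to check this, as a rough sketch: from router1, send full-size packets with the don't-fragment bit set to router2 across vlan20, and compare with the untagged LAN. The addresses are the ones from the exports above; on RouterOS the ping size is the IP packet size.

```routeros
# from router1: full-size packet over vlan20 to router2
/ping 172.20.0.2 size=1500 do-not-fragment count=5
# same test over the untagged LAN for comparison
/ping 172.16.0.2 size=1500 do-not-fragment count=5
# list interfaces together with their MTU and L2MTU values
/interface print
```

If the vlan20 ping fails at 1500 but succeeds at a smaller size, frame size on the trunk is the likely suspect.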

In /interface bridge vlan, make sure you are adding the bridge to each vlan (likely as a tagged interface). Also, toggle VLAN filtering to on for any bridge you’ve set up VLANs on.

With router1 and router2, move your VLAN interfaces to the bridge rather than the physical interface, and adjust the bridge ports to tagged or untagged as needed. I have been statically setting the tagged and untagged settings in bridge vlan to match what I want, but that is not required for a typical “access” port: setting the pvid adds the port to untagged dynamically when it is connected.

Lastly, big flat networks per VLAN are generally frowned upon. You may want to consider revising your address scheme to use /24s per VLAN. Maybe 172.16.x.0/24 where x is the VLAN. I personally stay away from the 172.16/12 space because I prefer typing 10. something over 172. the extra character makes me sad face.

Your switch for sure is missing this. Also, pechi, I think this is an ideal place to leverage the 6.41rc code. He has varying hardware types, and this allows him to configure VLANs consistently across hardware. The RC is the only way to do that at the moment. MTU with VLANs and the new bridges has not been an issue in my testing since they were first released in 6.40rc, before it was reverted.

Looks like my L2 frames are 1592 and 1588 (vlans) on the CRS switch, on router1 (CCR) they are 1580 and 1576 (vlans), and on router2 (HeX), they are 1598 and 1592 (vlans).

The actual MTU on all 3 is 1500.

Are the inconsistencies with the L2 frames an issue?

On the CRS switch the vlan (and everything else) is in bridge1. On the routers however nothing in use is in a bridge. I cannot setup VRRP if the interface is in a bridge. The resulting config is flagged as invalid. If I put everything on all routers in bridges, then I can use VRRP on the bridge interface, however in testing we found two problems. The first problem was flapping between them (random master/backup flapping, sometimes fixed with reboots). The second problem was that we could measure significant performance problems (up to 10%). Leaving VRRP interfaces (both physical and virtual) out of bridges has created a very fast and stable interface.

Lastly, big flat networks per VLAN are generally frowned upon. You may want to consider revising your address scheme to use /24s per VLAN. Maybe 172.16.x.0/24 where x is the VLAN. I personally stay away from the 172.16/12 space because I prefer typing 10. something over 172. the extra character makes me sad face.

Currently I’m using 192.168.x for VLANs. To avoid confusion with current setup I switched to 172.x for vlans in this lab. I see no technical reason why this would be an issue. As for 10.x networks, I’ve frequently had too many collisions with others using 10.x, so I just have avoided it. Given that I control 100% of this network, I’ll take your suggestion under advisement.

Your switch for sure is missing this.

Sorry, missing what? On the switch, vlan20 is on bridge1 and vlan filtering is enabled on that bridge.

Thanks again for all your help.

You’d leverage VRRP the same way; its interface would be the VLAN interface attached to the bridge. It will stay invalid until you assign an IP address to it.

/interface bridge add name=test1 vlan-filtering=no
/interface bridge vlan add bridge=test1 vlan-ids=10 tagged=test1
/interface bridge vlan add bridge=test1 vlan-ids=20 tagged=test1

/interface vlan add interface=test1 vlan-id=10 name=test1-vlan10
/interface vlan add interface=test1 vlan-id=20 name=test1-vlan20

/interface vrrp add interface=test1-vlan10 version=3 v3-protocol=ipv4 disabled=no name=test1-vlan10-vrrp1
/interface vrrp add interface=test1-vlan20 version=3 v3-protocol=ipv4 disabled=no name=test1-vlan20-vrrp1

/ip address add interface=test1-vlan10 address=172.16.10.1/24
/ip address add interface=test1-vlan20 address=172.16.20.1/24

/ip address add interface=test1-vlan10-vrrp1 address=172.16.10.254/32
/ip address add interface=test1-vlan20-vrrp1 address=172.16.20.254/32

/interface bridge set test1 vlan-filtering=yes

Relevant chunk of your switch configuration:

/interface vlan
add interface=bridge1 name=vlan20 vlan-id=20
/interface bridge vlan
add bridge=bridge1 tagged=sfp-sfpplus1,ether6 untagged=ether17 vlan-ids=20-30
/ip address
add address=172.16.1.1/16 interface=bridge1 network=172.16.0.0
add address=172.20.1.1/16 interface=vlan20 network=172.20.0.0

When you use /interface bridge vlan and then assign a VLAN interface to the bridge like you did, the bridge needs to be part of that VLAN as well. You do this by tagging or untagging the bridge interface for that VLAN. Because your bridge’s PVID is set to the default value of 1 (untagged), you need to tag VLAN20 to the bridge interface for it to correctly pass traffic.
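
A minimal sketch of that change against the switch export above, assuming the existing entry is still the single vlan-ids=20-30 row on bridge1 (untested; check the row with print before changing it):

```routeros
/interface bridge vlan
# confirm which row currently carries VLAN 20
print
# add bridge1 itself to the tagged list so the vlan20 interface on the bridge receives VLAN 20 frames
set [ find bridge=bridge1 vlan-ids=20-30 ] tagged=bridge1,sfp-sfpplus1,ether6
```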

A more complete solution:

Let’s assume you have r1, r2 and sw1 (you do). r1 and r2 are connected to sw1 on its ether1 and ether2 ports respectively. Each router is uplinked on ether1 to its ISP and linked to sw1 via its ether2 interface as a trunk port. We are going to leave VLAN1 untagged and without IP addressing. We’ll provision two VLANs, 172.16.10.0/24 for VLAN10 and 172.16.20.0/24 for VLAN20. Two PCs will be plugged into sw1 on ports ether3 and ether4. The first PC will be on VLAN10 and the second on VLAN20; both will send untagged traffic to the switch (access ports). We’ll use the new VLAN-aware bridges across the board.

All devices:

/interface bridge add name=br1 vlan-filtering=no
/interface bridge vlan add bridge=br1 vlan-ids=10 tagged=br1
/interface bridge vlan add bridge=br1 vlan-ids=20 tagged=br1
/interface vlan add interface=br1 vlan-id=10 name=br1-vlan10
/interface vlan add interface=br1 vlan-id=20 name=br1-vlan20

On both routers:

/interface bridge port add bridge=br1 interface=ether2 pvid=1
/interface bridge vlan set [ find where vlan-ids=10 or vlan-ids=20 ] tagged=br1,ether2
/interface vrrp add interface=br1-vlan10 version=3 v3-protocol=ipv4 name=br1-vlan10-vrrp1 disabled=yes
/interface vrrp add interface=br1-vlan20 version=3 v3-protocol=ipv4 name=br1-vlan20-vrrp1 disabled=yes
/ip address add interface=br1-vlan10-vrrp1 address=172.16.10.254/32
/ip address add interface=br1-vlan20-vrrp1 address=172.16.20.254/32

On sw1:

/interface bridge port add bridge=br1 interface=ether1 pvid=1
/interface bridge port add bridge=br1 interface=ether2 pvid=1
/interface bridge port add bridge=br1 interface=ether3 pvid=10
/interface bridge port add bridge=br1 interface=ether4 pvid=20
/interface bridge vlan set [ find where vlan-ids=10 or vlan-ids=20 ] tagged=br1,ether1,ether2
/interface bridge vlan set [ find where vlan-ids=10 ] untagged=ether3
/interface bridge vlan set [ find where vlan-ids=20 ] untagged=ether4
/ip address add interface=br1-vlan10 address=172.16.10.11/24
/ip address add interface=br1-vlan20 address=172.16.20.11/24

Only on r1:

/ip address add interface=br1-vlan10 address=172.16.10.1/24
/ip address add interface=br1-vlan20 address=172.16.20.1/24
/interface vrrp set br1-vlan10-vrrp1 disabled=no priority=220
/interface vrrp set br1-vlan20-vrrp1 disabled=no priority=220

Only on r2:

/ip address add interface=br1-vlan10 address=172.16.10.2/24
/ip address add interface=br1-vlan20 address=172.16.20.2/24
/interface vrrp set br1-vlan10-vrrp1 disabled=no priority=210
/interface vrrp set br1-vlan20-vrrp1 disabled=no priority=210

Last step, turn on VLAN filtering on all devices:

/interface bridge set br1 vlan-filtering=yes

Then you just set up a default route on each device to its respective ISP. You can then do additional things on r1 to force traffic from r1 to r2 when the ISP1 route fails. Two static default routes on r1 may be sufficient: one via ISP1 with check-gateway and the lower distance, and an additional one with a higher distance pointing to the .2 address of either the VLAN10 or VLAN20 interface on r2. It’s safe to assume that if r1 is down from a VRRP perspective, the attached ISP connection is unavailable as well, so only the single default route to ISP2 is needed on r2.
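
As a rough sketch of the r1 side of that failover, assuming ISP1 provides a fixed gateway (192.0.2.1 below is a placeholder, not an address from this thread) and that r2 answers on 172.16.10.2 as in the example above. The backup route gets the larger distance value so it is only used when the primary goes inactive:

```routeros
/ip route
# primary default route via ISP1; check-gateway=ping marks it inactive when the gateway stops answering
add dst-address=0.0.0.0/0 gateway=192.0.2.1 distance=1 check-gateway=ping
# backup default route via r2; only used while the route above is inactive
add dst-address=0.0.0.0/0 gateway=172.16.10.2 distance=2
```

Note that check-gateway only detects a dead next hop, not a failure further upstream at the ISP.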

So, this is pretty much what I started with before removing the ports and vlans from the bridge on router1.

The problem is that to put a vlan in a bridge I also have to put the uplink port (sfpplus1) in the same bridge, and then non-vlan traffic suffers a ~5-10% performance hit. This is my first CCR; there is no switch asic, just 9 cores and 9 ports. A different mystery to solve.

I’m unclear how putting the vlans on the routers into bridges solves my problem, where ICMP works but TCP is flaky. I have no problems from the switch to either router; performance and traffic flow are as expected, with or without vlan, with or without vrrp. The problem occurs only when router1 detects that ISP1 is down and tries to route that traffic to router2.

Your configs had irregular settings in them. Getting to a base like I posted will put you in a position to troubleshoot reliably what is happening.

I run 6.41rc with multiple VLANs on the new bridges connected to the rest of my lab gear which is Linux or Cisco without any effect like you describe. Additionally a quick mock up in GNS doesn’t duplicate your problem.

Get to a good solid config. Test and then if needed dive in further. There may be a bug in the code somewhere that needs MikroTik’s attention.

I use the new bridges even on non-accelerated hardware because it keeps my configurations consistent. I don’t have to mix and match between the implementations. Also, if a hardware-accelerated feature is turned on in a future release, my config will take immediate advantage of it.