Strange Mangle Performance problem

Dear Forum,
this is my first topic, as I cannot find any solution for my problem.

I searched for 3 days but did not find anything.

The setup is as follows:

All my traffic, the inter-VLAN routing and the ISP connection are handled by a hEX S router.

The ISP is connected to port 5 via PPPoE on VLAN 10 (DSL Modem of a local ISP).

Port 1 carries ALL internal VLANs and is connected to a VLAN-aware switch.

Most of the VLANs are used to isolate devices from each other. They use the hEX S's IP as gateway to reach the internet via the local ISP, and to reach other networks as permitted by firewall rules.

All of this is working really really fine and fast :slight_smile:

In VLAN 1 there is another special router (serving another internet connection). For example the Mikrotik has IP 192.168.0.254 and the other router has 192.168.0.253 and is Ubuntu 22.04 based.

The devices in the 192.168.0.0/24 network are using the mikrotik as the default gateway as they must reach services in the other device VLANs (10.0.0.x/24) served and firewalled by the mikrotik.

Via packet mangling I can define which devices in 192.168.0.0/24 use the PPPoE connection of the MikroTik and which use the other Linux router, 192.168.0.253, to get to the internet. All devices in 192.168.0.0/24 should use the Linux router, except one IP address, which should use the MikroTik's ISP.

So I defined 2 mangle rules: the first marks src 192.168.0.0/24, dst: everything EXCEPT THE PRIVATE IPs (an address list with local IPs), with routing mark LinuxGW, and the 2nd rule re-marks src 192.168.0.1/32, dst: everything EXCEPT THE PRIVATE IPs, with MikrotikGW.

Additionally I created 2 static routes: 0.0.0.0/0 with mark LinuxGW via 192.168.0.253, and 0.0.0.0/0 with mark MikrotikGW via the PPPoE gateway.
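
For readers following along, the two mangle rules and two marked default routes described above could look roughly like this on RouterOS 7. This is only a sketch using the example addresses from this post; the address-list name `private`, its contents and the interface name `pppoe-out1` are assumptions and are not taken from the actual export.

```
# In RouterOS 7 a routing table must exist before it can be used as a mark
/routing table
add name=LinuxGW fib
add name=MikrotikGW fib

# Hypothetical list of local ranges that must never be policy-routed
/ip firewall address-list
add list=private address=10.0.0.0/8
add list=private address=192.168.0.0/16

/ip firewall mangle
# Rule 1: whole subnet, non-private destinations only -> LinuxGW
add chain=prerouting src-address=192.168.0.0/24 dst-address-list=!private \
    action=mark-routing new-routing-mark=LinuxGW passthrough=yes
# Rule 2: re-mark the single exception so it leaves via the PPPoE link
add chain=prerouting src-address=192.168.0.1 dst-address-list=!private \
    action=mark-routing new-routing-mark=MikrotikGW passthrough=no

/ip route
add dst-address=0.0.0.0/0 gateway=192.168.0.253 routing-table=LinuxGW
add dst-address=0.0.0.0/0 gateway=pppoe-out1 routing-table=MikrotikGW
```

Rule 1 keeps `passthrough=yes` so that rule 2 can still overwrite the mark for the one exception.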

Surprisingly this works like a charm!!! The machines in the 192.168.0.0/24 network use the Linux router for internet and I get 99% of the speed delivered by the Linux router!!!
I can reach the services in the 10.1.x.y VLANs, as the MikroTik is the first gateway and does not mangle packets to private IP addresses.

Now the problem:

I also want this in a VLAN subnet.
For example VLAN 100, IP range 10.1.1.0/24

I created a 3rd mangle rule to also mark all packets coming from src 10.1.1.0/24 with LinuxGW.
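
As a sketch, such a third rule might be written as follows on RouterOS 7. The destination match against an address list named `private` (a hypothetical list of local ranges) is an assumption made so that inter-VLAN traffic stays unmarked; the post does not say whether the real rule has it.

```
/ip firewall mangle
# Rule 3: VLAN 100 clients, non-private destinations only -> LinuxGW
add chain=prerouting src-address=10.1.1.0/24 dst-address-list=!private \
    action=mark-routing new-routing-mark=LinuxGW passthrough=no
```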

The Linux machine also knows how to reach 10.1.1.0/24 via 192.168.0.254 and masquerades all packets with its public IP, both from 192.168.0.0/24 and from 10.1.1.0/24.

I can ping the test machine from the linux GW and vice versa.

The strange thing: it is working, but extremely, extremely slowly.

Test for example:
Windows machine 10.1.1.1, gateway is the MikroTik router's IP 10.1.1.254, and DNS is done by Google for test purposes (8.8.8.8)
Ping to internet: normal 10-20ms
ping to Linux Router 1ms
Ping from Linux Router 1ms

so far so good.
When I try to surf, it struggles and I cannot really open a web page (it takes 10-20 seconds to display Google, and more complex sites do not load at all or take minutes to load).
When I start a download: 5-15 Kilobyte per second. When I look at any networking graphs it seems that packets are coming in every 5-10 seconds.
What is my IP: the public IP of the Linux router (so the rule seems to work).

When I disable the mangle rule:
Google comes in 1 second
Ping to internet: normal 10-20ms
Start Download: 10 Megabyte per second (as my ISP is giving me 98 Mbit /s).
Public IP: the IP of the mikrotik’s PPPoE Interface.

What I tested:
Untagged VLAN 100 on port 4 and connected a laptop there, to rule out that the cause is all traffic going over port 1 or the switch failing
swapped the switch for a completely different model and brand
enabled masquerading on the MikroTik so that the Linux router sees the packets coming from 192.168.20.254 instead of 10.1.1.1
played around with the options of the mangle rules

Thoughts:
It may be that the return routing is not working properly. As the Linux router and the clients in the 192.168.0.0/24 subnet are in the same subnet, the replies from the internet via the Linux router do not pass the MikroTik; they go directly from the Linux router to the client, while the packets to 10.1.1.1 must pass the MikroTik.
But if there were a routing fault or a mangling fault, it should not work at all (not even ping and so on).
I think there is a bug, or I have forgotten a small piece of the puzzle.

Hardware and Versions:
RouterBOARD hEX S with RouterOS 7.8
VLAN-aware HP Aruba switch, 1930 series
VLAN-aware FS switch, 3400 series
Linux router: Ubuntu 22.04 with a flat iptables setup and forwarding enabled.

Hope someone has read and understood this far and can help :slight_smile:

Thx
Dirk

Hello and welcome to the forum!

If you provide a full export (minus sensitive stuff), a simple network diagram and finally clearly state what you want to achieve, it will make it easier for people to grasp your setup and help you out.

Concur, an accurate network diagram would help big time, and then the complete config:
/export file=anynameyouwish ( minus router serial number and any public WANIP info etc. )

Hi,

I exported the config, reduced it to the essentials and replaced all sensitive stuff.

The rules differ a bit from my first post, as I was speaking in examples there.
Here the export and a small network explanation:

vlan 1:
IP range: 192.168.20.0/24 → MikroTik 192.168.20.254, Linux router: 192.168.20.253, special device using the normal internet connection: 192.168.20.249. All others should use .253 as the route for 0.0.0.0/0 (as described in the mangle rules). Working fine!!! "What is my IP" shows the public IP of the Linux server, EXCEPT on the .249, where the public IP of the MikroTik is shown! Speed is at least 98% of the speed of each internet connection!

vlan 10:
used for the PPPoE connection. This is dictated by my ISP. The packets are routed directly to my broadband modem.

VLAN 20:
VLAN for normal use, 10.10.20.0/24 → Mikrotik 10.10.20.254, Clients using the Mikrotik as Gateway to 0.0.0.0/0 via the PPPoE internet connection. Works as expected: Ip is the public of the Mikrotik and Speed is 99% of the ISP.

VLAN 200:
Special VLAN, 10.10.200.0/24 → MikroTik 10.10.200.254. Clients use the MikroTik as gateway, BUT for 0.0.0.0/0 they should be routed to 192.168.20.253 and out to the internet.
It works, as "what is my IP" shows the public IP of the Linux router, but a download runs at a few kilobytes per second. It seems that the data comes in blocks every 5-10 seconds, with an average of 5-10 kilobytes per second. More complex sites do not load at all.

That the construct works in general is shown by the fact that 192.168.20.0 clients can be routed via 192.168.20.254 → 192.168.20.253 → 0.0.0.0 at 100 megabit per second…

As you can see, everything is connected to ethernet1, and the switch breaks out the VLANs to the access ports. You MAY think this is the problem, but I also tested with the PPPoE modem on port 5 while the client in VLAN 200 was connected to port 4 → that changed nothing. I also changed the brand of the switch and that also changed nothing!

hope it helps you in helping me

thx
Dirk
20230818_1133_simple.rsc (4.41 KB)

Sorry, without the network diagram I don't have a clue what the Linux router is doing, where it is located, etc…
As for all interfaces belonging to the bridge: that is not normally the case, so I suggest taking the PPPoE port off it…
No idea what you are doing with VLAN ID 1, but remove it…

It seems you're blocking ICMP on the PPPoE interface. And PPPoE has a 1492 MTU. Normally PMTUD would kick in to fix up the MTU, but it requires incoming ICMP (the "fragmentation needed" messages). Without it, the TCP stack of the clients has to figure out by itself that it must send smaller packets, based only on errors, which is slower and error-prone.
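
A rough sketch of what acting on this could look like on RouterOS, assuming the PPPoE client interface is named `pppoe-out1` (a hypothetical name, not taken from the export). The filter rules let ICMP through so path MTU discovery can work, and the mangle rule is the commonly used MSS clamp as a fallback. With a 1492-byte MTU, the largest TCP segment that fits is 1492 − 20 (IPv4 header) − 20 (TCP header) = 1452 bytes. Nothing in this thread establishes that MTU is the actual cause of the slowdown.

```
/ip firewall filter
# Accept ICMP to and through the router; to have any effect these rules
# must end up above the drop rules in their chains
add chain=input protocol=icmp action=accept
add chain=forward protocol=icmp action=accept

/ip firewall mangle
# Clamp the MSS of TCP SYN packets leaving via PPPoE to the path MTU
add chain=forward out-interface=pppoe-out1 protocol=tcp tcp-flags=syn \
    action=change-mss new-mss=clamp-to-pmtu passthrough=yes
```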

Thx for the quick reply.
I created 2 rules at the top to allow ICMP in the input and forward chains.
When I try the download again, their counters do not rise and the download is still slow.

What speaks against your answer is that when I disable the mangle rule for VLAN 200, the clients reach the internet through exactly this 1492-MTU PPPoE link quite fast.
When I enable the rule, the 1492-MTU PPPoE interface is no longer in the path, BUT that is the moment the problem starts.
Also from VLAN 1 I have no problems, whether I route to the 192.168.20.253 router or to the PPPoE interface…

If you are having no problems, and refuse to provide requested information, then I guess I can move to help others.
The sexy winged mouse and sterile bullet can assist.

Dear anav,
my apologies, I was sure that my description and config were clear enough.
I try to provide as much information as I can. I thought that with my description and the config it was clear what I need and what is not working well.

The Linux router is a fairly standard router installation. It has a 2nd internet connection (company, fiber, static IP). It is nothing complex: a basic installation, forwarding enabled, a small iptables ruleset with an entry to masquerade all packets going out from the private networks (192.168.20.0/24 and 10.10.200.0/24), and a routing table entry to reach 10.10.200.0 via 192.168.20.254.
You could also replace it with a Netgear, D-Link or whatever router. It only does masquerading and packet forwarding, nothing special. When I ping 10.10.200.1 from the Linux router it answers; it routes correctly through the MikroTik into the other VLAN AND vice versa.

I tried to make a simple diagram showing the routes I want. I have no tool for that, so I painted it in MS Paint. Pls. do not laugh :wink:

thx
Dirk

EDIT:
What do I want to achieve???
ALL clients should use the MikroTik as the default gateway.
ALL VLANs (except 10) should reach each other. Everything should be regulated via the MikroTik's firewall in the next step (in the simple config everything is open; in my running config there are roughly 90 rules active).
Clients in the 192.168.20.0/24 subnet should use ISP 2, except one IP (192.168.20.249), which should use ISP 1 → this works like a charm with the first 2 mangle rules!!! The packet goes to .254, then to .253 and out to the internet.
Clients in 10.10.20.0/24 should use ISP 1 via the PPPoE connection → this works like a charm!!!
Clients in 10.10.200.0/24 should use ISP 2 → this works, but very, very slowly, with the 3rd mangle rule. When I deactivate it, they use ISP 1 and run fine, but that is not the goal.

EDIT 2:
Everything uses a /24 subnet mask… my diagram shows a /25 subnet mask → that is a mistake, it should be /24.
Diagram 1.png

Lots of choices for diagram software here… https://forum.mikrotik.com/viewtopic.php?p=908118

I am still very confused about the role of the linux router.
IS IT PROVIDING DHCP for any subnets?
Which users are supposed to use the Linux router for internet?
IS there a VPN connection going out through the Linux router?

What do you mean, all clients should use the MT as gateway? Do you mean as gateway to the internet, or that they get DHCP service from the MT?
What do you mean, all VLANs should reach each other? If so, you don't need VLANs, just one subnet.

Sorry, I had not seen your post. I will read it carefully in the next few days, as time is very limited. I hope my handmade diagram shows a bit how it should work.

The Linux router is a gateway to a 2nd ISP (new, as the fiber was delivered last week; currently for testing purposes only). No more, no less, just a NAT masquerading gateway.
NO DHCP, NO VPN, NOTHING else. As I said, I could also use a simple NAT router like a D-Link etc. and disable DHCP. It has two network cards, one with a public IP connected to the router of the ISP and one with its internal 192.168.20.253.

The users of the VLAN 200 and VLAN1 should use the fiber. VLAN 1 is working perfectly, 200 is the problem.
As told before, no VPN

DHCP was not mentioned with a single word before. I tested exactly this attached simplified config on a 2nd hEX S (lying around as a spare and for testing) and had the same problems. In my tests the IPs were entered manually on every W10 client. DHCP is not a topic. In my real config DHCP is done by a Windows server and relayed by the MikroTik to the VLANs, but as it is not causing the problem I did not mention it and do not think it is relevant to solving it. I meant that the IP address of the MikroTik is entered on the clients, servers and machines as the default gateway… Every machine has the MT's IP address in its own VLAN entered as the default gateway.

It is clearly separated via VLANs for security reasons. BUT there are devices like printers or scanners in the other subnets which should be reachable from another subnet via a TCP port firewall rule. Therefore in my real config there is an accept rule for every scenario and a final "block forward everything" rule to block all unnecessary traffic. In my simplified config I opened everything to rule out a fault in the firewall rules. So they are also not relevant to solving this.

Let's assume:
192.168.20.0 is the servers' VLAN. The servers and the fiber gateway (Linux machine) are in this subnet. All devices should use the fiber connection, except one polling device, which should use the cheap DSL connection with its dynamic IP address to connect to the internet.
10.10.20.0 is the production VLAN. It should not use the fiber, only the cheap DSL line via PPPoE. Still, its machines must reach some servers and, as described, a printer in the 200 subnet.
10.10.200.0 is the office VLAN. Its clients also need the fiber internet connection. There are also some printers in this network; they should be reachable from a print server in VLAN 1 and also from a production machine in VLAN 20, so there is a firewall rule from the 20 subnet to TCP 9100 of the printer.


But all the "why" and other stuff like DHCP, VPN etc. is not essential, as I tested this simplified config without any VPN or DHCP and still got these strange results…
So, please let us focus on the mangle rule and the poor routing performance with it, especially as ALL OTHER services and firewall rules have worked like a charm for a long time.

We just got the new fiber connection and I wanted to "throw" the servers and the clients in the 200 subnet onto this line. When I disable the 3rd mangle rule, everything runs as before we got the fiber: fine and fast, but over the wrong line.

DNS is also not the problem, as an already started download runs fast or slow depending on whether the rule is enabled.

Thx
Dirk

Okay, so treat the Linux router as WAN2, giving the MT a private WAN IP address… Does that make sense?

I know there are much better and more elegant solutions, for example connecting the MT directly to the public router of ISP2.

But I have my own good reasons for this setup, as it is the first step of my whole project and a way to test the stability of the new connection (if it runs stably I want to rebuild it in a more complex form). It is also for building, learning and broadening my own knowledge of routing, masquerading, mangling, Linux, MikroTik etc., but that should not be discussed here.

In my opinion this construct should work, but it does not work well, so did I forget something in the mangle rules or somewhere else? I also tested a masquerading rule on the MikroTik, masquerading all packets going out from 10.10.200.0/24 with the MikroTik's 192.168.20.254 IP address. The masquerading worked; I could see it under Connections in the reply address… With that I meant to rule out a return-routing problem from 192.168.20.253 back to 10.10.200.0/24, but it did not speed things up either.
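
The masquerading test described above might have looked roughly like the rule below. This is a guess at its shape, not the rule from the actual config; matching on the routing mark `LinuxGW` is an assumption chosen so that only traffic policy-routed to the Linux gateway is translated.

```
/ip firewall nat
# Hide VLAN 200 clients behind the MikroTik's own address in the
# 192.168.20.0/24 network for traffic marked for the Linux gateway
add chain=srcnat src-address=10.10.200.0/24 routing-mark=LinuxGW \
    action=src-nat to-addresses=192.168.20.254
```

With such a rule the Linux router only ever sees 192.168.20.254 as the source, so it would not need its own route back to 10.10.200.0/24.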

At this point of my testing and understanding (I thought I did everything right), there must clearly be a firmware problem with the MikroTik.

More likely, in my opinion, it is a case of "a poor craftsman blames his tools".

Read the thread "bit of vlan confusion", which discusses using tagged VLAN 1 on the bridge device when you have not changed the pvid of the bridge from its default of 1.

In my opinion it is least confusing if you avoid using VLAN 1 for user data when you are using vlan-filtering. pvid 1 is the implicit default, although it isn't obvious from looking at the export.

Hi, thank you for the good tip.
This was also on my project agenda, but for later. I wanted to move the servers and devices in VLAN 1 into another VLAN to get rid of VLAN 1. But first I wanted to test whether everything works as expected and then move the devices to another VLAN ID (step by step, because if you do too much at the same time it is harder to find the errors). Okay, I will try to prioritize this step and test again. Perhaps the problem will then be solved automatically.

Thx
Dirk

Hi,

I changed everything and completely removed VLAN 1. It is now replaced by VLAN 2000, all VLANs are tagged, and there is no untagged VLAN at the MikroTik. But there is NO difference. Mangling works, but horribly slowly.

Is there nobody with an idea what I am missing? Maybe a routing, MTU or internal MikroTik rule problem?
thx

Dirk

EDIT:
With the help of this post I solved my problem:
http://forum.mikrotik.com/t/routing-marks-mangle/146455/1

Thx a lot