Hello. I have been having problems with a MikroTik hEX S for the past 2-3 months. It sometimes works perfectly for weeks, but then becomes very unstable for 3-7 days at a time.
Setup: ether4 and ether5 are in bridge #1, ether1 and ether3 are in bridge #2, ether2 is standalone with its own IP, and WAN is on sfp1. Over 1000 networks are blacklisted in an address list because they were attacking our servers.
Bridges 1 and 2 each have a couple of servers that do not generate much traffic, unlike ether2, which is connected to a network of 30 PCs. The speed limit on sfp1 is 1 Gbit/s, and these PCs very often use it at full speed.
Problem: sometimes all ether ports stop passing traffic (they are running and have link, but there is no connectivity to the devices behind them). It looks like only some service upload traffic from those networks gets through, at very low speed (5-10 kbps), with no download at all. sfp1 keeps working fine and I can reach the router externally, but nobody inside has internet or WinBox access to the hEX S. Only a soft reboot makes it operable again for 5-10 minutes, and then the problem returns.
I noticed one strange thing: sometimes (and not at full gigabit speed on sfp1, more often at 300-500 Mbps) just before this behavior starts, cpu0 load goes to 100%, and the profiler shows 80-90% of it used by “networking”. The next thing I noticed, which I had never seen before: on the IRQ tab there is an entry named “raether” that is actively using an IRQ on cpu0, and its count increases very quickly (about 100k interrupts per 10 s). I can’t assign “raether” to another CPU; it tells me the setting is read-only.
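A minimal sketch of terminal commands that should show the same readings as the WinBox tabs mentioned above. The 10-second profiling window is only an assumed value, chosen to match the interval over which the IRQ count was observed.

```
# Per-core CPU load (cpu0 is the core that goes to 100%)
/system resource cpu print
# Profile all cores for 10 seconds; "networking" is the entry that dominates
/tool profile cpu=all duration=10s
# List IRQ assignments; "raether" is the entry bound to cpu0
/system resource irq print
```

Running the IRQ listing twice, about 10 seconds apart, would give a rough interrupt rate to compare against the ~100k per 10 s seen here.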
I manage a large number of MikroTik routers and have never seen “raether” anywhere else (note: this is the only MikroTik with an SFP module; the other routers are Ethernet-only).
What I have tried: turning off HW offloading on all ports, disabling all “drop” rules in the firewall, disabling fasttrack, and limiting ether2 to 200 Mbit/s with a queue in the hope of lowering CPU load, but it did not help. I have 13 NAT rules (excluding masquerade), but only two of them are open to the world; the others are restricted by a whitelist. Two months ago we reset the router, but that did not solve the issue. I thought it could be a SYN flood and added filter rules to stop it, but that did not work. I also tried remapping the other IRQs to other CPUs.
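For reference, the steps above correspond roughly to the following terminal commands. This is a sketch of equivalent commands, not necessarily the exact ones that were used; the queue name `limit-ether2` is hypothetical.

```
# Turn off hardware offloading on every bridge port
/interface bridge port set [find] hw=no
# Disable the fasttrack rule in the forward chain
/ip firewall filter disable [find action=fasttrack-connection]
# Cap ether2 at 200 Mbit/s in both directions (queue name is hypothetical)
/queue simple add name=limit-ether2 target=ether2 max-limit=200M/200M
```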
I also updated to the newest firmware (7.12 now).
Six months ago we had an issue with the SFP module and replaced it.
There is nothing bad in the logs; the ether ports just stop working and that’s all.
Could it be malicious software on the router? I didn’t find anything useful by googling “raether”.
Could it be caused by the large address list?
Here are my firewall and NAT rules:
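To put an exact number on the list size, a query along these lines should work. The list names `blacklist` and `a-max-con` are taken from the rules in the export below; whether other lists exist on the router is not shown here.

```
# Count entries in the blacklist referenced by the dstnat rules
/ip firewall address-list print count-only where list=blacklist
# Count entries added dynamically by the connection-limit rule
/ip firewall address-list print count-only where list=a-max-con
```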
/ip firewall filter
add action=accept chain=input comment=\
"defconf: accept established,related,untracked" connection-state=\
established,related,untracked
add action=accept chain=input dst-port=8291 protocol=tcp
add action=drop chain=forward disabled=yes src-address=192.168.1.124
add action=drop chain=input comment="defconf: drop invalid" connection-state=\
invalid
add action=accept chain=input comment="defconf: accept ICMP" disabled=yes \
protocol=icmp
add action=accept chain=input comment=\
"defconf: accept to local loopback (for CAPsMAN)" dst-address=127.0.0.1
add action=drop chain=input comment="defconf: drop all not coming from LAN" \
in-interface-list=!LAN
add action=accept chain=forward comment="defconf: accept in ipsec policy" \
ipsec-policy=in,ipsec
add action=accept chain=forward comment="defconf: accept out ipsec policy" \
ipsec-policy=out,ipsec
add action=fasttrack-connection chain=forward comment="defconf: fasttrack" \
connection-state=established,related disabled=yes hw-offload=yes
add action=accept chain=forward comment=\
"defconf: accept established,related, untracked" connection-state=\
established,related,untracked
add action=drop chain=forward comment="defconf: drop invalid" connection-state=\
invalid
add action=drop chain=forward comment="defconf: drop all from WAN not DSTNATed" \
connection-nat-state=!dstnat connection-state=new in-interface-list=WAN
add action=jump chain=forward comment="SYN Flood protect" connection-state=new \
in-interface-list=WAN jump-target=SYN-Protect protocol=tcp tcp-flags=syn
add action=accept chain=SYN-Protect connection-state=new limit=400,5 protocol=\
tcp tcp-flags=syn
add action=drop chain=SYN-Protect connection-state=new protocol=tcp tcp-flags=\
syn
add action=add-src-to-address-list address-list=a-max-con address-list-timeout=\
1d chain=input connection-limit=20,32 protocol=tcp
/ip firewall mangle
add action=mark-packet chain=forward disabled=yes new-packet-mark=club \
out-interface=bridge passthrough=yes
/ip firewall nat
add action=netmap chain=dstnat comment=registrator dst-port=xxxxx protocol=tcp \
to-addresses=192.168.1.210 to-ports=xxxxx
add action=netmap chain=dstnat dst-port=xxxxx protocol=tcp to-addresses=\
192.168.1.254 to-ports=xxxxx
add action=netmap chain=dstnat comment="PaW Main SRV API SSL" dst-port=443 \
in-interface-list=WAN protocol=tcp src-address-list=!blacklist \
to-addresses=10.0.210.2 to-ports=443
add action=netmap chain=dstnat comment="PaW Main SRV API 8044" dst-port=8044 \
in-interface-list=WAN protocol=tcp src-address-list=!blacklist \
to-addresses=10.0.210.2 to-ports=8044
add action=netmap chain=dstnat comment="PaW Main SRV API 8045" dst-port=8045 \
in-interface-list=WAN protocol=tcp src-address-list=!blacklist \
to-addresses=10.0.210.2 to-ports=8045
add action=netmap chain=dstnat comment="Paw Main SRV RDP" dst-port=xxxxx \
in-interface-list=WAN protocol=tcp src-address-list=whitelist to-addresses=\
10.0.210.2 to-ports=3389
add action=netmap chain=dstnat comment="Paw Backup SRV RDP" dst-port=xxxxx \
in-interface-list=WAN protocol=tcp src-address-list=whitelist to-addresses=\
10.0.210.3 to-ports=3389
add action=netmap chain=dstnat comment="Moika SRV RDP" dst-port=xxxxx \
in-interface-list=WAN protocol=tcp src-address-list=whitelist to-addresses=\
10.0.211.2 to-ports=3389
add action=netmap chain=dstnat comment="Moika SBackup SRV MySQL" dst-port=xxxxx \
in-interface-list=WAN protocol=tcp src-address-list=whitelist to-addresses=\
10.0.211.3 to-ports=3306
add action=netmap chain=dstnat comment="Moika Backup SRV RDP" dst-port=xxxxx \
in-interface-list=WAN protocol=tcp src-address-list=whitelist to-addresses=\
10.0.211.3 to-ports=3389
add action=netmap chain=dstnat comment="Moika SRV MySQL" dst-port=xxxxx \
in-interface-list=WAN protocol=tcp src-address-list=!blacklist \
to-addresses=10.0.211.2 to-ports=3306
add action=netmap chain=dstnat comment="Moika SRV VNC" dst-port=xxxxx \
in-interface-list=WAN protocol=tcp src-address-list=whitelist to-addresses=\
10.0.211.2 to-ports=5900
add action=netmap chain=dstnat comment="Paw Main SRV PostgreSQL" dst-port=xxxxx \
in-interface-list=WAN protocol=tcp src-address-list=whitelist to-addresses=\
10.0.210.2 to-ports=5432
add action=masquerade chain=srcnat comment="defconf: masquerade" ipsec-policy=\
out,none out-interface-list=WAN
I have masked some external ports for security reasons.
