Slow routing, fixed by reboot - how to troubleshoot?

Hi!

Got a CRS125-24G-1S router in my home, firmware 6.47.3, running a plain NAT router configuration. No queues, nothing special. Just NAT and some firewall rules.

Recently (maybe since upgrading to 6.47.x? Not sure), my internet connection speed drops into the single-digit Mbit/s range (this is on Ethernet, no Wi-Fi). So far, rebooting the router has always fixed it. There is plenty of memory (105 MB free), CPU load is low, and I see no other reason why the connection slows down so dramatically.

What can I do to troubleshoot this issue?
Are there any logs, resource counters or anything else that I can/should be looking at?
Any known issues?

Thanks for your help!

Config if needed:

# sep/06/2020 23:06:22 by RouterOS 6.47.3
# software id = CZF3-PTXT
# model = CRS125-24G-1S

/interface bridge
add name=bridge1
/interface ethernet
set [ find default-name=ether1 ] comment="Office 1 - Andre" speed=100Mbps
set [ find default-name=ether2 ] comment="Office 2 - Mei" speed=100Mbps
set [ find default-name=ether3 ] speed=100Mbps
set [ find default-name=ether4 ] speed=100Mbps
set [ find default-name=ether5 ] comment="Office 5 - Printer" speed=100Mbps
set [ find default-name=ether6 ] comment=Wifi speed=100Mbps
set [ find default-name=ether7 ] comment=Sebastian speed=100Mbps
set [ find default-name=ether8 ] speed=100Mbps
set [ find default-name=ether9 ] speed=100Mbps
set [ find default-name=ether10 ] comment="Livingroom 10 - TV" speed=100Mbps
set [ find default-name=ether11 ] speed=100Mbps
set [ find default-name=ether12 ] speed=100Mbps
set [ find default-name=ether13 ] speed=100Mbps
set [ find default-name=ether14 ] speed=100Mbps
set [ find default-name=ether15 ] comment=ISY994 speed=100Mbps
set [ find default-name=ether16 ] comment=NAS speed=100Mbps
set [ find default-name=ether17 ] speed=100Mbps
set [ find default-name=ether18 ] speed=100Mbps
set [ find default-name=ether19 ] speed=100Mbps
set [ find default-name=ether20 ] speed=100Mbps
set [ find default-name=ether21 ] speed=100Mbps
set [ find default-name=ether22 ] speed=100Mbps
set [ find default-name=ether23 ] speed=100Mbps
set [ find default-name=ether24 ] comment=WAN speed=100Mbps
set [ find default-name=sfp1 ] advertise=10M-half,10M-full,100M-half,100M-full,1000M-half,1000M-full
/ip pool
add name=dhcp ranges=10.0.1.100-10.0.1.199
/ip dhcp-server
add address-pool=dhcp disabled=no interface=bridge1 name=dhcp1
/queue type
add kind=sfq name=default-sfq
/user group
set full policy=local,telnet,ssh,ftp,reboot,read,write,policy,test,winbox,password,web,sniff,sensitive,api,romon,dude,tikapp
/interface bridge port
add bridge=bridge1 interface=ether1
add bridge=bridge1 interface=ether2
add bridge=bridge1 interface=ether3
add bridge=bridge1 interface=ether4
add bridge=bridge1 interface=ether5
add bridge=bridge1 interface=ether6
add bridge=bridge1 interface=ether7
add bridge=bridge1 interface=ether8
add bridge=bridge1 interface=ether9
add bridge=bridge1 interface=ether10
add bridge=bridge1 interface=ether11
add bridge=bridge1 interface=ether12
add bridge=bridge1 interface=ether13
add bridge=bridge1 interface=ether14
add bridge=bridge1 interface=ether15
add bridge=bridge1 interface=ether16
/ip neighbor discovery-settings
set discover-interface-list=none
/interface ethernet switch port
set 23 vlan-type=edge-port
/ip address
add address=10.0.1.1/24 interface=ether16 network=10.0.1.0
/ip dhcp-client
add disabled=no interface=ether24 use-peer-dns=no use-peer-ntp=no
/ip dhcp-server lease
add address=10.0.1.200 client-id=1:30:5a:3a:3:99:b7 comment="Andre's PC" mac-address=30:5A:3A:03:99:B7 server=dhcp1
add address=10.0.1.10 client-id=1:84:ba:3b:1:ae:ed comment="Canon TS9000" mac-address=84:BA:3B:01:AE:ED server=dhcp1
add address=10.0.1.11 client-id=1:d0:50:99:87:3e:8c comment=NAS mac-address=D0:50:99:87:3E:8C server=dhcp1
add address=10.0.1.3 client-id=1:0:21:b9:2:20:7a comment="Smarthome Controller" mac-address=00:21:B9:02:20:7A server=dhcp1
add address=10.0.1.201 client-id=1:2c:56:dc:93:f1:4a comment="Mei's PC" mac-address=2C:56:DC:93:F1:4A server=dhcp1
add address=10.0.1.50 client-id=1:0:24:46:4:25:ea comment="Tesla Gateway" mac-address=00:24:46:04:25:EA server=dhcp1
add address=10.0.1.202 client-id=1:d4:3d:7e:93:85:4d comment="Sebastian's PC" mac-address=D4:3D:7E:93:85:4D server=dhcp1
add address=10.0.1.12 client-id=1:c6:cd:bf:a8:d:76 comment="NAS - Jail" mac-address=C6:CD:BF:A8:0D:76 server=dhcp1
add address=10.0.1.2 client-id=1:28:bd:89:f5:9f:e2 comment="Google Wifi" mac-address=28:BD:89:F5:9F:E2 server=dhcp1
/ip dhcp-server network
add address=10.0.1.0/24 dns-server=10.0.1.1 domain=home.ironcreek.net gateway=10.0.1.1 netmask=24
/ip dns
set allow-remote-requests=yes cache-size=32768KiB servers=208.67.222.222,208.67.220.220
/ip firewall address-list
add address=0.0.0.0/8 list=NotPublic
add address=100.64.0.0/10 list=NotPublic
add address=127.0.0.0/8 list=NotPublic
add address=169.254.0.0/16 list=NotPublic
add address=172.16.0.0/12 list=NotPublic
add address=192.0.0.0/24 list=NotPublic
add address=192.0.2.0/24 list=NotPublic
add address=192.168.0.0/16 list=NotPublic
add address=192.88.99.0/24 list=NotPublic
add address=198.18.0.0/15 list=NotPublic
add address=198.51.100.0/24 list=NotPublic
add address=203.0.113.0/24 list=NotPublic
add address=224.0.0.0/4 list=NotPublic
add address=240.0.0.0/4 list=NotPublic
/ip firewall filter
add action=fasttrack-connection chain=forward comment="Fasttrack established, related" connection-state=established,related
add action=accept chain=forward comment="Accept established, related" connection-state=established,related
add chain=forward comment="Allow outgoing traffic" out-interface=ether24 src-address=10.0.1.0/24
add action=drop chain=forward comment="Drop bogus forwards" in-interface=ether24 log=yes log-prefix=forward-bogus src-address-list=NotPublic
add action=drop chain=forward comment="Drop the rest"
add chain=input comment="Accept Established / Related Input" connection-state=established,related
add chain=input comment="Allow internal access to router" src-address=10.0.1.0/24
add action=drop chain=input comment="Reject external DNS" dst-port=53 in-interface=ether24 log=yes log-prefix=input-dns protocol=udp
add action=drop chain=input comment="Drop everything else" in-interface=ether24
add action=add-src-to-address-list address-list=auto-ban address-list-timeout=6h chain=input comment="Dynamic ban for SSH login attempt" dst-port=22 protocol=tcp
/ip firewall nat
add action=masquerade chain=srcnat out-interface=ether24
/ip service
set telnet disabled=yes
set ftp disabled=yes
set www address=10.0.1.0/24 disabled=yes
set ssh address=10.0.1.0/24 port=222
set api disabled=yes
set winbox address=10.0.1.0/24
set api-ssl disabled=yes
/ip ssh
set forwarding-enabled=remote strong-crypto=yes
/system clock
set time-zone-name=America/Los_Angeles
/system identity
set name=Stargate
/system ntp client
set enabled=yes server-dns-names=0.pool.ntp.org,1.pool.ntp.org,2.pool.ntp.org,3.pool.ntp.org
/tool bandwidth-server
set enabled=no
/tool graphing interface
add interface=ether24
/tool mac-server
set allowed-interface-list=none
/tool mac-server ping
set enabled=no

> Got a CRS125-24G-1S router

It’s a switch, with some low-powered routing capability.

An obvious configuration error, change:
/ip address add address=10.0.1.1/24 interface=ether16 network=10.0.1.0
to:
/ip address add address=10.0.1.1/24 interface=bridge1 network=10.0.1.0

Ok. Glad we clarified that :slight_smile:

I’m sure the basic routing I’m using it for does not stress its capabilities.
Note that it has been working well for 2+ years for the most part.


> An obvious configuration error, change:

“obvious” :wink:

Thanks. Changed.

Happened again today :cry:
Really can’t figure out why. WAN speed drops from ~150 Mbit/s to single digits. Reboot fixes it 100% of the time.
No packet errors, no high resource consumption as far as I can tell, no spikes on the firewall rule counters, etc.
I really don’t know how to debug this.

Any tips would be appreciated!

For now, I downgraded the firmware to the “Long term” channel. Doubt it’ll do anything, but hey…

In the meantime, keep the tips/tricks coming please.

Thanks!

Just guessing … could it be the number of still-open connections? For example, the ports available for NAT masquerade being exhausted (UDP ports, or a TCP SYN flood)?
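
A rough sketch of how this could be checked from the RouterOS terminal the next time the slowdown occurs. It assumes RouterOS 6.x command syntax; the comments describe what to look for, not known thresholds for this device.

```
# Connection tracking summary: compare total-entries against max-entries
/ip firewall connection tracking print

# Number of tracked connections, overall and per protocol
/ip firewall connection print count-only
/ip firewall connection print count-only where protocol=udp
/ip firewall connection print count-only where protocol=tcp
```

If the totals stay in the low hundreds while the router is slow, exhausted NAT ports become an unlikely explanation.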

Any guess is appreciated at this point :slight_smile:

The router isn’t currently in a bad state (though it’s only been a day since the last reboot). If it goes back into slow mode, I’ll check the open connections. Hadn’t looked there yet.
Thanks for the hint. Will update once the router hiccups again.

Keep the suggestions coming please!

Alright, woke up to a slow router again this morning.

Things to note:

  • Downgrading the firmware to the “Long term” channel, version 6.45.9, did not seem to help
  • Checked IP connections; fewer than ~200 connections with no obvious stand-outs, no high churn
  • Disabled all firewall rules for testing; no change
  • CPU usage low, lots of memory available
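
For reference, checks like these can be run from the terminal while the router is in the slow state. This is only a sketch assuming RouterOS 6.x syntax, with ether24 as the WAN port as in the config above.

```
# CPU load, free memory and uptime
/system resource print

# Per-process CPU usage breakdown (runs until interrupted)
/tool profile

# Current throughput on the WAN port
/interface monitor-traffic ether24 once

# Recent log entries
/log print
```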

Reboot again fixed it…

Have you checked the physical port status after the slow down?
There could be some problem on the line that forces renegotiation down to 10 Mbit/s.
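
Something along these lines should show it, assuming RouterOS 6.x syntax and that ether24 is the WAN port, as in the posted config:

```
# Negotiated rate, duplex and auto-negotiation status of the WAN port
/interface ethernet monitor ether24 once

# Per-port error counters (FCS errors, collisions and similar)
/interface ethernet print stats where name=ether24
```

A rate of 10Mbps or a half-duplex link in the first output would point at the cable or the modem side rather than at routing.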

Have not. Will add it to the list of things to check for next time this happens. Thanks!

To answer @xvo’s question:
Slowdown just happened again and WAN interface was still at 1 Gbps. I also disabled it and enabled it again and it came back up in 1 Gbps mode. Also did not resolve the issue.

Potential red herring:
At the last reboot about 2 days ago, I enabled syn-cookies. I had no real reason to do so, since there was nothing suspicious in the connection list, but I just gave it a try. To my surprise, the router made it two full days without the slowdown. Since I was curious, I disabled syn cookies again and a few hours later, the slowdown happened. Re-enabling syn-cookies on the fly did not resolve the issue.
But, as I said, this is a potential red herring, it may have been coincidence that the router stayed up a bit longer this time.
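
For anyone following along, the setting in question can be inspected and toggled roughly like this (a sketch assuming RouterOS 6.x, where the option lives under /ip settings):

```
# Show current IP settings, including tcp-syncookies
/ip settings print

# Enable SYN cookies (use tcp-syncookies=no to disable again)
/ip settings set tcp-syncookies=yes
```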

Nevertheless I have re-enabled syn-cookies and rebooted the router. As always, rebooting fixed the issue immediately.

Still no idea what causes this. There are no TX/RX errors, the firewall isn’t going nuts, CPU and memory usage are low, etc. I still don’t know where else to look.

Any more tips would be greatly appreciated!

I would try another router between the ISP and the CRS (using the CRS as a switch only) to rule out the possibility that the problem is on the ISP side.

Just guessing again …

Does rebooting the ISP router instead also solve the problem?

It seems like it is coming from the internet, but you have no tools to see it. A router compromised by a rootkit will not show the offending traffic or the heavily loaded resources, and the rootkit bypasses any protection on the router. You could be a botnet victim. Rebooting takes you out of that loop temporarily.
Only a device between the ISP router and the MikroTik can reveal that traffic.

I said … guessing!

First off, the red herring is confirmed.
Less than one day with syn-cookies enabled and the router is slow again.
So that was a fluke.

@bpwl, @xvo,
Thanks for those suggestions. I think I have another router somewhere; I’ll try to stick that between the CRS and the ISP.
I have no access to the ISP’s router, so I can’t reboot that.

For a general setup overview: this is simply my home router.
The setup is as follows:
ISP → DOCSIS modem → CRS → LAN