I have a MikroTik hAP ac2 as the router of a small network. In general it has been working fine, except that sometimes (often in specific scenarios) we get connection reset issues.
The issue has been going on for months and I have tried many things without success. I want to try again, this time with your help.
The network has a single router (the MikroTik), which is a PPPoE client of my ISP and is used mainly as DNS, DHCP and VPN (L2TP) server and as firewall.
The usual problems, which I have identified as connection resets:
When connecting to the VPN (from outside that network), the connection sometimes drops immediately. After retrying, the connection succeeds.
From inside the network, when requesting some service from a remote server (e.g. downloading a file) where the server takes about 1 minute or more to prepare the data before sending it, the browser fails with a “connection reset” error. However, requesting the same service from the same server from a computer outside the network finishes without any issue.
When connecting through SSH to a remote server outside the network (or from outside to inside the network), the terminal often freezes or gets disconnected, especially if I run a command that outputs at least 1 or 2 screens of data (for example, listing a directory with many files). However, most of the time, if I keep pressing some key (like space), the output is returned completely and the session doesn’t disconnect, even though it gets stuck for a few seconds. It also happens more often when the session is waiting for a password: after I enter it, the session gets disconnected. If I leave the terminal open (without entering any command), it can last for hours without getting disconnected, so I don’t think it’s related to a “keep alive” or “time out” issue. Likewise, if the commands entered return only a few lines (and without much delay), there are no problems.
In general we don’t have any issues in the local network or when browsing sites; it’s mainly the issues above.
This is my firewall configuration:
0 D ;;; special dummy rule to show fasttrack counters
chain=forward action=passthrough
1 ;;; defconf: accept established,related,untracked
chain=input action=accept connection-state=established,related,untracked
2 ;;; Enable Mikrotik SSH
chain=input action=accept protocol=tcp dst-port=22 log=yes log-prefix=""
3 ;;; allow IPsec NAT
chain=input action=accept protocol=udp dst-port=4500
4 ;;; allow IKE
chain=input action=accept protocol=udp dst-port=500
5 ;;; allow l2tp
chain=input action=accept protocol=udp dst-port=1701
6 ;;; PPTP
chain=input action=accept protocol=tcp dst-port=1723
7 ;;; allow sstp
chain=input action=accept protocol=tcp dst-port=443 log=no log-prefix=""
8 ;;; defconf: drop all not coming from LAN
chain=input action=drop in-interface-list=!LAN log=no log-prefix="NOTLAN"
9 ;;; Block WIN Input LAN
chain=input action=drop src-address-list=win_servers dst-address-list=local_network log=yes log-prefix="WIN-IN-"
10 ;;; Block SSH from outside
chain=input action=drop protocol=tcp in-interface=pppoe-out dst-port=22 log=no log-prefix="NOSSH"
11 ;;; defconf: accept ICMP - PING
chain=input action=accept protocol=icmp log=no log-prefix=""
12 ;;; defconf: drop invalid
chain=input action=drop connection-state=invalid log=no log-prefix="INVALID"
13 ;;; defconf: fasttrack
chain=forward action=fasttrack-connection connection-state=established,related
14 ;;; defconf: accept in ipsec policy
chain=forward action=accept ipsec-policy=in,ipsec
15 ;;; defconf: accept out ipsec policy
chain=forward action=accept ipsec-policy=out,ipsec
16 ;;; Block Windows Server to LAN
chain=forward action=drop src-address-list=win_servers dst-address-list=local_network log=yes log-prefix="WIN-"
17 ;;; defconf: drop all from WAN not DSTNATed
chain=forward action=drop connection-state=new connection-nat-state=!dstnat in-interface-list=WAN log=no log-prefix="FWD-NODST"
18 ;;; Drop incoming packets that are not NATted
chain=forward action=drop connection-state=new connection-nat-state=!dstnat in-interface=ether1-WAN log=no log-prefix="!NAT"
19 ;;; defconf: drop invalid
chain=forward action=drop connection-state=invalid log=no log-prefix="FWD-INVALID"
Although MTU-related issues are responsible for most mysterious behaviours like this one, since you say that you can keep the connections alive by generating extra data, I’d rather expect some firewall along the path between the client and the server to have unusually short timeouts. So if there is nothing but the Mikrotik between the affected client and server, what’s the output of /ip firewall connection tracking print? Bear in mind that there may also be firewalls on the client and/or the server themselves.
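As a rough sketch of how both suspicions could be checked from the RouterOS terminal (assuming RouterOS 6.x syntax; 192.0.2.10 is only a placeholder for the address of one of the affected servers, and 1492 is the usual PPPoE MTU, not a value taken from this router):

```
# Show the connection tracking settings, including the TCP timeouts
/ip firewall connection tracking print

# While an affected session is open, list its tracked entries and their remaining timeout
/ip firewall connection print where dst-address~"192.0.2.10"

# Probe the path MTU with the don't-fragment bit set.
# PPPoE normally leaves 1500 - 8 = 1492 bytes; in RouterOS, size includes the IP header.
/ping 192.0.2.10 size=1492 do-not-fragment count=3
/ping 192.0.2.10 size=1493 do-not-fragment count=3
```

If the 1492-byte ping already fails, the path MTU is smaller than expected and the MTU explanation becomes more likely; if the tracked entry for a stuck session disappears or shows an unexpectedly short timeout, the firewall explanation does.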
This is happening with several servers. I have control of those servers, so I can assure you that they are not blocking anything special. I have also tried several computers inside the local network, with the same results. If some firewall on the server side were causing the issue, it would also happen when I’m connected from outside the local network in question. In summary, these issues only happen in that network, and the MikroTik is the only device that has changed (it didn’t happen before).
Same as the default values on my own hAP ac², except for max-entries, which is 444768 in my case. What RouterOS version do you run?
What does “has changed” mean? Did you have a router from a different vendor before, or a different model of Mikrotik, or a different RouterOS on the hAP ac²?
Besides, I have noticed that you have rules for IPsec but at the same time an unrestricted fasttrack rule in the forward chain. Do you use IPsec policies to directly route transit (i.e. forwarded) packets via IPsec, or do you only use IPsec to encrypt the transport packets of tunneling protocols? Fasttracking used to be incompatible with IPsec policies forwarding packets in older versions of RouterOS, and mysteriously no longer is with 6.44.3.
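One way to test whether fasttracking is involved, as a sketch only (assuming RouterOS 6.x syntax; the connection mark name ipsec is an arbitrary label chosen here for illustration, and the second step follows the commonly documented pattern of excluding IPsec connections from fasttrack):

```
# Step 1 (diagnostic): temporarily disable the fasttrack rule and retest
/ip firewall filter disable [find action=fasttrack-connection]
# ...reconnect and retest SSH and the slow downloads, then re-enable if nothing changed
/ip firewall filter enable [find action=fasttrack-connection]

# Step 2 (alternative): keep fasttrack, but mark IPsec connections and exclude them
/ip firewall mangle
add chain=forward action=mark-connection ipsec-policy=out,ipsec new-connection-mark=ipsec comment="mark IPsec connections (out)"
add chain=forward action=mark-connection ipsec-policy=in,ipsec new-connection-mark=ipsec comment="mark IPsec connections (in)"
/ip firewall filter
set [find action=fasttrack-connection] connection-mark=!ipsec
```

Connections that were already established may stay fasttracked after the rule is disabled, so it is safer to reconnect before drawing conclusions from step 1.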
Same as the default values on my own hAP ac², except for max-entries, which is 444768 in my case. What RouterOS version do you run?
The version is the latest stable one: 6.44.3.
What does “has changed” mean? Did you have a router from a different vendor before, or a different model of Mikrotik, or a different RouterOS on the hAP ac²?
Before, we had another MikroTik router (hEX RB750Gr2), and it was working fine. However, we were not using the VPN then. The settings were exported from that one into the new one. Things didn’t go well, so I reset the configuration on the new device and copied only the configuration that I needed (firewall, PPPoE settings, DHCP and DNS). After that I added the VPN. I’m not sure whether the VPN setup has anything to do with this issue, but it’s the main difference from my previous setup. Removing the VPN at this point would be difficult, as it is used frequently.
Besides, I have noticed that you have rules for IPsec but at the same time an unrestricted fasttrack rule in the forward chain. Do you use IPsec policies to directly route transit (i.e. forwarded) packets via IPsec, or do you only use IPsec to encrypt the transport packets of tunneling protocols? Fasttracking used to be incompatible with IPsec policies forwarding packets in older versions of RouterOS, and mysteriously no longer is with 6.44.3.
I’m using IPsec to encrypt the L2TP protocol. I don’t have much experience with those protocols… I followed some guides to set it up and tried to keep it simple. The main VPN usage is to connect to the local network from outside (no forwarding to other networks or anything more complex).
I couldn’t find the reason. I reset the whole configuration and added everything back manually, except for the VPN settings (I will do that another day). So far the issue seems to be gone.
I had a similar issue of getting TCP RST on HTTP connections. Then I discovered it may have been an issue of insufficient power via PoE. My Ethernet cable may be too long and too thin, so the device may not have been getting enough power. After I plugged the device directly into power, the issue went away.