Packet loss just on 443 port

Posted: Tue Sep 10, 2019 7:43 pm
by fer05
I have a serious problem with my web system. Here goes...

We provide customers with a web management system that serves thousands of simultaneous customers. For customers who use any MikroTik equipment, we see errors and slowdowns caused by packet loss that occurs ONLY on HTTPS port 443.
Drops are more frequent during peak hours. With the paping tool we measure about 3 to 7% loss (for ALL customers using MikroTik), while for the other customers there is no loss or it is negligible (0 to 1% loss).

NOTE: ICMP ping is not lost.

For two of these clients, we installed a different router at the edge, and the problem went away.
This further suggests that it might be something in MikroTik.

Where to start? Has anyone been through this?
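
The paping-style measurement described above can be approximated with a few lines of Python. This is a minimal sketch, not the paping tool itself: it counts failed TCP handshakes to a port, and the function name `tcp_connect_loss` and its parameters are assumptions made for illustration.

```python
import socket
import time


def tcp_connect_loss(host, port=443, attempts=20, timeout=1.0, pause=0.0):
    """Return the fraction of TCP connection attempts that failed.

    A failure is any attempt that is refused or does not complete
    within `timeout` seconds.
    """
    failures = 0
    for _ in range(attempts):
        try:
            # Opening and immediately closing the socket exercises only
            # the TCP handshake, similar to a TCP "ping".
            with socket.create_connection((host, port), timeout=timeout):
                pass
        except OSError:
            failures += 1
        time.sleep(pause)
    return failures / attempts
```

Running it against the web system's address from behind a MikroTik edge and from behind another router, at the same time of day, would give directly comparable loss figures.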

Re: Packet loss just on 443 port

Posted: Tue Sep 10, 2019 10:46 pm
by pe1chl
Did you ping with 1400-1500 byte packet size?
It is quite normal that default-size ping packets get through without problem, but larger packets as used with TCP links (1500 bytes) get dropped on bad links.
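
To try the large-packet test suggested above from a Linux host, the ICMP payload has to be sized so that the whole IP packet reaches the target size. A small sketch, assuming IPv4 without IP options and the iputils `ping` flags `-M do` (set DF), `-s` (payload size) and `-c` (count); the helper names are hypothetical.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
ICMP_HEADER = 8   # bytes, ICMP echo request header


def icmp_payload_for(ip_packet_size):
    """ICMP payload size that yields an IP packet of `ip_packet_size` bytes."""
    payload = ip_packet_size - IPV4_HEADER - ICMP_HEADER
    if payload < 0:
        raise ValueError("packet size is smaller than the headers")
    return payload


def linux_ping_args(host, ip_packet_size, count=5):
    """Argument list for a DF-set ping that sends packets of the given size."""
    return ["ping", "-M", "do", "-c", str(count),
            "-s", str(icmp_payload_for(ip_packet_size)), host]
```

For a 1500-byte packet the payload works out to 1472 bytes; if that ping fails while a smaller one succeeds, the path MTU is below 1500.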

Re: Packet loss just on 443 port

Posted: Tue Sep 10, 2019 10:57 pm
by andriys
Feels like a potential PMTUD issue.

Re: Packet loss just on 443 port

Posted: Wed Sep 11, 2019 8:00 pm
by fer05
andriys wrote: Feels like a potential PMTUD issue.
We were very confident that it was something MTU related, so we reduced the MTU to 1280 on our side and on the client side as well, but it made no difference.

Is there anything specific to MikroTik routers that could be causing this?

Re: Packet loss just on 443 port

Posted: Wed Sep 11, 2019 8:01 pm
by fer05
pe1chl wrote: Did you ping with 1400-1500 byte packet size?
It is quite normal that default-size ping packets get through without problem, but larger packets as used with TCP links (1500 bytes) get dropped on bad links.
Yes. In some cases we found that the path MTU did not reach 1500, so we decreased it to 1280, but the problems continued.

Re: Packet loss just on 443 port

Posted: Wed Sep 11, 2019 9:33 pm
by andriys
The proper way to deal with PMTUD issues is not to change the MTU on either side, but rather to make sure you do not drop (block) ICMP messages that should not be dropped. A rather widespread workaround is to use TCP MSS clamping on the router (which some people consider an ugly hack, and for a reason).
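
As a rough illustration of what MSS clamping computes: for IPv4 without IP or TCP options, the MSS is the path MTU minus 40 bytes of headers, and the router lowers any larger advertised value to that. The sketch below assumes those header sizes; `mss_for_mtu` and `clamp_mss` are illustrative names, not RouterOS functions.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def mss_for_mtu(mtu):
    """Largest TCP payload that fits in one unfragmented packet of `mtu` bytes."""
    return mtu - IPV4_HEADER - TCP_HEADER


def clamp_mss(advertised_mss, path_mtu):
    """Lower the advertised MSS to what the path can carry; never raise it."""
    return min(advertised_mss, mss_for_mtu(path_mtu))
```

With a 1492-byte PPPoE path, a host advertising an MSS of 1460 would be clamped to 1452, while a host already advertising 1300 is left alone.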

Re: Packet loss just on 443 port

Posted: Sat Sep 14, 2019 3:07 pm
by msatter
Change MSS

It is a well known fact that VPN links have smaller packet size due to encapsulation overhead. A large packet with MSS that exceeds the MSS of the VPN link should be fragmented prior to sending it via that kind of connection. However, if the packet has DF flag set, it cannot be fragmented and should be discarded. On links that have broken path MTU discovery (PMTUD) it may lead to a number of problems, including problems with FTP and HTTP data transfer and e-mail services.

In case of link with broken PMTUD, a decrease of the MSS of the packets coming through the VPN link solves the problem. The following example demonstrates how to decrease the MSS value via mangle:
/ip firewall mangle 
add out-interface=pppoe-out protocol=tcp tcp-flags=syn action=change-mss new-mss=1300 chain=forward tcp-mss=1301-65535
From what I have read, the DF flag is used, but is DF always set on SYN packets?
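
What the mangle rule above does to a packet can be sketched in Python: on a SYN segment, find the MSS option and lower it if it exceeds the new value. This is an illustration under assumptions, not RouterOS code; `clamp_syn_mss` is a hypothetical helper, and it leaves the TCP checksum untouched, which a real router would have to update.

```python
import struct

SYN = 0x02  # SYN bit in the TCP flags byte


def clamp_syn_mss(tcp_header, new_mss=1300):
    """Return the TCP header with its MSS option lowered to `new_mss`.

    Only SYN segments are touched, and only when the advertised MSS is
    larger than `new_mss`. The checksum is NOT recalculated here.
    """
    header = bytearray(tcp_header)
    if len(header) < 20 or not header[13] & SYN:
        return bytes(header)
    end = min((header[12] >> 4) * 4, len(header))  # data offset in bytes
    i = 20  # options start after the fixed 20-byte header
    while i < end:
        kind = header[i]
        if kind == 0:      # end of option list
            break
        if kind == 1:      # NOP padding, single byte
            i += 1
            continue
        if i + 1 >= end:
            break
        length = header[i + 1]
        if length < 2 or i + length > end:
            break          # malformed option, stop parsing
        if kind == 2 and length == 4:  # MSS option
            (mss,) = struct.unpack_from("!H", header, i + 2)
            if mss > new_mss:
                struct.pack_into("!H", header, i + 2, new_mss)
            break
        i += length
    return bytes(header)
```

Because the MSS option only appears in SYN and SYN-ACK segments, rewriting it there is enough to cap the segment size for the rest of the connection.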

Re: Packet loss just on 443 port

Posted: Sat Sep 14, 2019 3:27 pm
by pe1chl
andriys wrote: The proper way to deal with the PMTUD issues is not to change MTU on either side, but rather to make sure you do not drop (block) ICMP messages that should not be dropped. A rather widespread workaround is to use TCP MSS clamping on the router (which some people consider an ugly hack, and for a reason).
The problem is that even when you do not drop ICMP messages yourself, there may be other places in the network towards the server you connect that do that...
From the problem description above it is not clear to me if there potentially are other parties involved, but it is quite common for those that have smaller MTU than 1500 to experience random problems when visiting sites (many sites work OK, some do not) due to incompetent system administrators elsewhere in the network.

And even when that is all OK, not all operating systems show reasonable behavior when ICMP frag needed is received. Some do just retransmit the current segment in fragments, but then they continue sending large segments immediately or quickly thereafter. As there usually is some rate throttling on ICMP messages, this results in extreme throughput problems on such connections.

As ugly as it is, the TCP MSS clamping workaround is the most effective workaround in place.
I am guilty of "inventing" it in 1995, but I did not publish it widely and did not patent it... does anyone know of others implementing this before August 1995?
(I noticed that e.g. in Cisco IOS this appeared much later)

Another possible "ugly hack" is to clear DF on large packets entering a VPN or other tunnel:
add action=clear-df chain=postrouting comment=\
    "clear DF on large packet to GRE" out-interface-list=gretun packet-size=\
    1477-1500 passthrough=yes
After all, it is the DF flag (that is being used to implement PMTUD) that is causing all the trouble. Without this, the packets are just fragmented and routed on. PMTUD was implemented to avoid the inefficiency of fragmentation, but unfortunately it often causes the frustration of non-working applications. You select what you prefer.
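
To make the trade-off concrete, the sketch below computes the IPv4 fragments produced once DF is cleared. It assumes a 20-byte IPv4 header without options and a tunnel MTU of 1476 bytes (1500 minus 24 bytes of GRE-over-IPv4 overhead, which matches the 1477-1500 range in the rule above); `fragment_sizes` is an illustrative name.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options


def fragment_sizes(packet_size, mtu):
    """Sizes (in bytes, including IP header) of the fragments of one packet."""
    if packet_size <= mtu:
        return [packet_size]  # fits as-is, no fragmentation
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    per_fragment = (mtu - IPV4_HEADER) // 8 * 8
    if per_fragment <= 0:
        raise ValueError("MTU too small to carry any payload")
    remaining = packet_size - IPV4_HEADER
    sizes = []
    while remaining > 0:
        chunk = min(per_fragment, remaining)
        sizes.append(chunk + IPV4_HEADER)
        remaining -= chunk
    return sizes
```

Under these assumptions a full 1500-byte packet entering the tunnel becomes two fragments of 1476 and 44 bytes, which is the inefficiency PMTUD was meant to avoid.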

Re: Packet loss just on 443 port

Posted: Mon Sep 16, 2019 8:05 pm
by fer05
Very well. I connected to one of the clients and created the MSS mangle rules there, but without success. The losses continued, and only toward the web system address (2% to 3% loss).

Ping to the gateway is always <1 ms, and pings along the route to our datacenter are always stable at <30 ms.

Any new ideas?

Re: Packet loss just on 443 port

Posted: Thu Oct 03, 2019 7:54 pm
by fer05
Any ideas?

Re: Packet loss just on 443 port

Posted: Thu Oct 03, 2019 10:11 pm
by pe1chl
You provide zero information about your network configuration and details of findings, and reply to all suggestions with "no, it's not that".
Then at some point the inputs will cease. You are on your own.

Re: Packet loss just on 443 port

Posted: Thu Oct 03, 2019 11:02 pm
by fer05
pe1chl wrote: You provide zero information about your network configuration and details of findings, and reply to all suggestions with "no, it's not that".
Then at some point the inputs will cease. You are on your own.
Can you give me a contact email? As we deal with public and often "sensitive" information, I am not comfortable explaining everything on the forum. I would appreciate the help, and I can provide you with the configuration export of the equipment for a more careful analysis.

Re: Packet loss just on 443 port

Posted: Thu Oct 03, 2019 11:39 pm
by CZFan
Then maybe it is time to hire a certified consultant.