We launched a new BGP CHR server instance (v6.37.1) in December and obtained a 60-day P-unlimited trial licence.
Everything worked like a dream during the trial period: no packet loss, no hangs, perfect throughput, and BGP session uptime of more than 30 days. All perfect.
Then we decided to buy a P1 licence, as our router interfaces are only 1 Gbit/s for now.
After we installed the licence, some of our customers started to complain about VPN disconnections and hangs.
We found packet loss that was caused by the CHR instance. Actual loss was between 0.5% and 5%. For example:
--- 188.8.131.52 ping statistics ---
1000 packets transmitted, 987 received, 1% packet loss, time 11987ms
489 packets transmitted, 480 received, 1% packet loss, time 5968ms
1000 packets transmitted, 996 received, 0% packet loss, time 11987ms
1000 packets transmitted, 996 received, 0% packet loss, time 11987ms
1000 packets transmitted, 994 received, 0% packet loss, time 12948ms
1000 packets transmitted, 993 received, 0% packet loss, time 11996ms
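The percentages in the ping summaries above appear to be cut to whole numbers (480 of 489 received is shown as 1%), so small but real loss shows up as "0%". Here is a minimal Python sketch that recomputes the exact loss from the transmitted/received counts. The function names are my own and the input is just the summary lines pasted from above.

```python
import re

# Summary lines copied from the ping output above (P1-licensed instance).
SUMMARY_LINES = """\
1000 packets transmitted, 987 received, 1% packet loss, time 11987ms
489 packets transmitted, 480 received, 1% packet loss, time 5968ms
1000 packets transmitted, 996 received, 0% packet loss, time 11987ms
1000 packets transmitted, 996 received, 0% packet loss, time 11987ms
1000 packets transmitted, 994 received, 0% packet loss, time 12948ms
1000 packets transmitted, 993 received, 0% packet loss, time 11996ms
"""

# Matches the "N packets transmitted, M received" part of a summary line.
PATTERN = re.compile(r"(\d+) packets transmitted, (\d+) received")


def loss_percentages(text):
    """Return the exact loss percentage for each ping summary line in text."""
    results = []
    for sent, received in PATTERN.findall(text):
        sent, received = int(sent), int(received)
        results.append(100.0 * (sent - received) / sent)
    return results


def total_loss(text):
    """Return the loss percentage over all summary lines combined."""
    pairs = [(int(s), int(r)) for s, r in PATTERN.findall(text)]
    sent = sum(s for s, _ in pairs)
    received = sum(r for _, r in pairs)
    return 100.0 * (sent - received) / sent


if __name__ == "__main__":
    for pct in loss_percentages(SUMMARY_LINES):
        print(f"{pct:.2f}% loss")
    print(f"overall: {total_loss(SUMMARY_LINES):.2f}% loss")
```

For these six runs the per-run loss works out to roughly 0.4% to 1.8%, even though four of them are reported as 0%.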
The loss rate rose again once traffic exceeded 80 Mbit/s.
Our conclusion was that either some underlying queue on the CHR instance is dropping packets, or there is a hardware problem.
We carefully examined the ESXi stats and the MikroTik interface/VLAN stats: no drops, no errors. Nothing.
Then I created a new CHR instance, installed the same 6.37.1 version, obtained a new 60-day P-unlimited trial, and imported all settings from the P1-licensed server.
Voila, everything is back to normal: no VPN drops, no loss!
--- 184.108.40.206 ping statistics ---
1000 packets transmitted, 1000 received, 0% packet loss, time 13303ms
1000 packets transmitted, 1000 received, 0% packet loss, time 13238ms
1000 packets transmitted, 1000 received, 0% packet loss, time 12970ms
1. There must be some underlying, hidden queue in the P1 (and most likely P10) licence that limits interface speed to the licensed level. It is most likely improperly configured and drops packets even when our interface rates are far below the 1 Gbit/s ceiling.
2. MikroTik engineers should NOT place any underlying queues at the bottom of CHR. In my opinion this is unprofessional and should never happen. The interface limit should be enforced by a software/driver modification, so that the interface does _not negotiate speeds above the licensed ones_. For example, mask the advertised link speeds so that only speeds no higher than 1000 Mbps full duplex can be negotiated, and apply an appropriate patch to the driver or to the interface between Winbox and the driver. This would be the best way, as it does not harm or influence the traffic passing through the router in any way.
3. I have created a support ticket, but it has not been processed (and most likely will not be), because they ask for a supout.rif. Sorry, I can't run the old, buggy P1 instance on a production server.
4. Apart from the packet loss on the P1-licensed instance, we didn't have any other problems with VoIP or games (at least customers didn't report any). However, the UDP VPN tunnel used by Apollo Games machines seems to have suffered a lot (probably also a design fault in these machines: it looks like UDP over UDP, and perhaps they don't use TCP at the bottom VPN layer). So you may be experiencing loss on your instance without even knowing it.
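To illustrate the suspicion in point 1: a rate limiter with too small a burst allowance can drop packets even when the average rate is far below its ceiling. The sketch below is a generic token-bucket policer in Python. It says nothing about how CHR actually enforces the licence limit, and every number in it (bucket size, burst pattern, packet size) is a made-up assumption, with bursts simplified to arrive at a single instant.

```python
def simulate_policer(arrivals, rate_bps, burst_bytes):
    """Generic token-bucket policer (hypothetical, not MikroTik's code).

    arrivals: list of (time_s, size_bytes), sorted by time.
    Returns (passed, dropped) packet counts.
    """
    tokens = float(burst_bytes)  # bucket starts full
    last = 0.0
    passed = dropped = 0
    for t, size in arrivals:
        # Refill at rate_bps (converted to bytes/s), capped at the bucket size.
        tokens = min(burst_bytes, tokens + (t - last) * rate_bps / 8.0)
        last = t
        if size <= tokens:
            tokens -= size
            passed += 1
        else:
            dropped += 1  # not enough tokens: the packet is discarded
    return passed, dropped


def bursty_traffic(num_bursts, interval_s, packets_per_burst, size_bytes):
    """Bursts of back-to-back packets, simplified to arrive at one instant."""
    arrivals = []
    for i in range(num_bursts):
        t = i * interval_s
        arrivals.extend((t, size_bytes) for _ in range(packets_per_burst))
    return arrivals


if __name__ == "__main__":
    # Assumed traffic: 25 x 1500-byte packets every 10 ms for 1 s,
    # i.e. about 30 Mbit/s on average.
    traffic = bursty_traffic(100, 0.01, 25, 1500)
    # Assumed limiter: 1 Gbit/s ceiling, but only 30000 bytes of burst.
    print(simulate_policer(traffic, 1_000_000_000, 30_000))
    # Same ceiling with a much larger burst allowance.
    print(simulate_policer(traffic, 1_000_000_000, 1_000_000))
```

With the small bucket, part of every burst is dropped although the average load is only about 3% of the ceiling. With the larger bucket nothing is dropped. That is the kind of behaviour a misconfigured hidden queue could produce.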