I have an EoIP tunnel between a MikroTik CCR1009-7G-1C-1S+PC (site A) and an RB4011iGS+RM (site B) router.
I have run out of ideas about what to check to find out why I get such low bandwidth: about 28 Mbps instead of 1 Gbps.
Does anyone have an idea what could be wrong? If needed, I can attach the config of both sides.
The config in a nutshell:
- Site A has a 1 Gbps symmetric fiber connection. Site B has 1 Gbps / 25 Mbps coaxial. I want to see close to 1 Gbps when copying a file from site A to site B.
- The EoIP tunnel is set up as a VLAN trunk.
- An IPsec password is set on the EoIP tunnel.
- There is an additional NAT bridge to deal with dynamic IP addresses on both sides.
- Site A has a PPPoE Internet connection with an MTU of 1480 (set automatically).
- The MTU of the EoIP tunnel is set manually to 1300. If I let it be set automatically, the two ends get different automatic MTU values, which is definitely strange.
What I have checked so far:
- Per-core CPU utilization on both sites: while I'm copying a file from a Samba server at site A to a Linux machine at site B, one CPU core reaches 90-100% for 1-2 seconds, then drops back under 10%. I see this repeating continuously. Overall CPU utilization is always under 15%; note that the site A router also runs User Manager (a RADIUS server). I think this is fully okay, right?
- Lower MTU on the hosts I copy from and to (the Samba server and the Samba client Linux hosts): the EoIP tunnel has an MTU of 1300 with an MSS clamp of 1250, so I set the MTU to 1250 on both hosts and restarted smbd on the server. Nothing changed, so I set the MTU back to 1500 on both hosts. To me this means that the MTU is not the problem, or if it is, I can't see why it would cause such a huge bandwidth loss (how can the loss be calculated from the MTU value?).
- If I copy from the same Samba server inside site A, NOT via the EoIP tunnel, the speed is as expected: 640-720 Mbps (80-90 MBps), which is quite close to 1 Gbps. This tells me that the Samba server itself is NOT slow.
- I checked the hardware documentation on mikrotik.com. For the CCR1009-7G-1C-1S+PC, the docs give 133.5 Mbps (about 16.7 MBps) with the smallest packet size (64 bytes) and AES-256-CBC + SHA256 on IPsec. For the other router (RB4011iGS+RM), the same test gives 97.4 Mbps (12.175 MBps). In reality, I get 24-32 Mbps (3-4 MBps), which is much slower (about a 70% drop), especially since I feel I should be looking at the 512-byte column of the docs, which is even faster.
- I have checked the switch stats too. On the CCR1009-7G-1C-1S+PC, "/interface ethernet switch print stats" gives empty output. I guess this is because there is no dedicated switch chip in it. Right? On the other router, the RB4011iGS+RM, I see something like this (not very meaningful to me):
[admin@siteB] > /interface ethernet switch print stats
name: switch1 switch2
driver-rx-byte: 14 521 336 835 23 698 153 721
driver-rx-packet: 19 510 463 22 653 094
driver-tx-byte: 26 127 861 597 13 200 576 097
driver-tx-packet: 23 026 151 19 287 656
rx-bytes: 14 404 274 057 23 562 247 761
rx-packet: 19 510 463 22 653 368
rx-too-short: 0 0
rx-64: 0 274
rx-65-127: 1 033 305 3 608 906
rx-128-255: 8 540 998 1 088 897
rx-256-511: 544 965 225 193
rx-512-1023: 441 236 112 681
rx-1024-1518: 8 234 017 17 423 493
rx-1519-max: 715 942 193 924
rx-too-long: 0 0
rx-broadcast: 139 66 641
rx-pause: 0 274
rx-multicast: 91 742 56 665
rx-fcs-error: 0 0
rx-align-error: 0 0
rx-fragment: 0 0
rx-length-error: 0 0
rx-jabber: 0 0
rx-drop: 0 0
tx-bytes: 25 805 507 167 12 930 548 913
tx-packet: 23 026 405 19 287 656
tx-broadcast: 112 309 172
tx-pause: 254 0
tx-multicast: 1 111 218 176
- I checked the network throughput when copying a file via apache2 over HTTPS with NO EoIP tunnel (so via the public IP). This way I got 160-240 Mbps (20-30 MBps). It is still under the expected 1 Gbps, but I can accept it, as it was sometimes much higher (close to 1 Gbps with 2 threads). I feel this is also okay.
- Finally, I checked the bandwidth with MikroTik's own bandwidth test tool. The bandwidth server ran on the site A router, the client on the site B router. With a UDP packet size of 1250 (remote-udp-tx-size and local-udp-tx-size) I got a reasonable speed (about 400-450 Mbps), but with the default size of 1500, or with TCP, I got practically no throughput:
[admin@SiteB] > /tool bandwidth-test address=192.168.0.254 direction=receive remote-udp-tx-size=1250 local-tx-speed=1250 protocol=udp
status: running
duration: 15s
rx-current: 449.0Mbps
rx-10-second-average: 445.8Mbps
rx-total-average: 399.6Mbps
lost-packets: 4908
random-data: no
direction: receive
rx-size: 1250
connection-count: 20
local-cpu-load: 26%
remote-cpu-load: 48%

[admin@SiteB] > /tool bandwidth-test address=192.168.0.254 direction=receive protocol=udp
status: running
duration: 6s
rx-current: 0bps
rx-10-second-average: 0bps
rx-total-average: 0bps
lost-packets: 0
random-data: no
direction: receive
rx-size: 1500
connection-count: 20
local-cpu-load: 0%
remote-cpu-load: 2%

[admin@SiteB] > /tool bandwidth-test address=192.168.0.254 direction=receive remote-udp-tx-size=1250 local-tx-speed=1250 protocol=tcp
status: running
duration: 13s
rx-current: 0bps
rx-10-second-average: 0bps
rx-total-average: 48bps
random-data: no
direction: receive
connection-count: 20
local-cpu-load: 0%
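
On the question in the checklist of how to calculate the bandwidth loss from the MTU value, here is a rough sketch in Python. It only models per-packet header overhead, and the constants are assumptions rather than measurements from this setup: 40 bytes of IPv4 + TCP headers without options, 38 bytes of Ethernet framing per packet on the wire, and 42 bytes of EoIP encapsulation (8 GRE + 14 Ethernet + 20 IP). IPsec adds further overhead that is not modeled here.

```python
# Rough estimate of how much TCP goodput is lost to headers at a given MTU.
# All constants below are assumptions, not values measured on these routers.

IP_TCP_HEADERS = 40     # IPv4 (20) + TCP (20), no options
ETHERNET_FRAMING = 38   # header 14 + FCS 4 + preamble 8 + inter-frame gap 12
EOIP_OVERHEAD = 42      # assumed: GRE 8 + inner Ethernet 14 + outer IP 20


def tcp_efficiency(mtu: int, encap_overhead: int = 0) -> float:
    """Fraction of bytes on the wire that carry TCP payload (full-size packets)."""
    payload = mtu - IP_TCP_HEADERS
    wire_bytes = mtu + encap_overhead + ETHERNET_FRAMING
    return payload / wire_bytes


def relative_loss(candidate: float, baseline: float) -> float:
    """Goodput lost by the candidate efficiency relative to the baseline."""
    return 1 - candidate / baseline


if __name__ == "__main__":
    plain_1500 = tcp_efficiency(1500)
    for mtu in (1500, 1300, 1250):
        eff = tcp_efficiency(mtu, EOIP_OVERHEAD)
        loss = relative_loss(eff, plain_1500)
        print(f"MTU {mtu} via EoIP: {eff:.1%} payload, "
              f"{loss:.1%} less than plain MTU 1500")
```

Under these assumptions, header overhead costs only a few percent of goodput, even at an MTU of 1250 inside the tunnel. It cannot by itself explain a drop from 1 Gbps to about 28 Mbps, although it says nothing about fragmentation or dropped oversized packets.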
$ ping 192.168.0.6
PING 192.168.0.6 (192.168.0.6) 56(84) bytes of data.
64 bytes from 192.168.0.6: icmp_seq=1 ttl=64 time=17.8 ms
64 bytes from 192.168.0.6: icmp_seq=2 ttl=64 time=20.1 ms
64 bytes from 192.168.0.6: icmp_seq=3 ttl=64 time=25.6 ms
64 bytes from 192.168.0.6: icmp_seq=4 ttl=64 time=14.3 ms
64 bytes from 192.168.0.6: icmp_seq=5 ttl=64 time=17.3 ms
^C
--- 192.168.0.6 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4006ms
rtt min/avg/max/mdev = 14.339/19.033/25.573/3.756 ms
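
The round-trip time can also be related to the throughput of a single TCP stream, which cannot exceed window / RTT. Below is a minimal Python sketch of that arithmetic. It assumes the ping above was taken across the tunnel and uses its 19 ms average; the 65535-byte figure is the classic TCP window limit without window scaling, used here only as a reference point and not as something confirmed on these hosts.

```python
# Bandwidth-delay product: relates TCP window, round-trip time and throughput.
# RTT_S comes from the ping output; everything else is illustrative.

RTT_S = 0.019                 # ~19 ms average RTT from the ping output
UNSCALED_WINDOW = 65535       # max TCP window without window scaling (assumed reference)


def required_window_bytes(rate_bps: float, rtt_s: float) -> float:
    """Bytes that must be in flight to sustain rate_bps at the given RTT."""
    return rate_bps * rtt_s / 8


def max_rate_bps(window_bytes: float, rtt_s: float) -> float:
    """Upper bound on single-stream throughput for a given window and RTT."""
    return window_bytes * 8 / rtt_s


if __name__ == "__main__":
    for label, rate in (("1 Gbps target", 1e9), ("28 Mbps observed", 28e6)):
        window = required_window_bytes(rate, RTT_S)
        print(f"{label}: about {window:,.0f} bytes in flight")
    limit = max_rate_bps(UNSCALED_WINDOW, RTT_S)
    print(f"65535-byte window at 19 ms: at most {limit / 1e6:.1f} Mbps")
```

At 19 ms, sustaining 1 Gbps in one stream needs roughly 2.4 MB in flight, while 28 Mbps corresponds to about 66,500 bytes, which is close to a 64 KiB window. That is only a numerical coincidence worth checking (for example by comparing single-stream and multi-stream copies), not a confirmed cause.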