I have an EoIP tunnel between a MikroTik CCR1009-7G-1C-1S+PC (site A) and an RB4011iGS+RM (site B) router.
I have run out of ideas about what to check: I get low bandwidth through the tunnel, approx. 28 Mbps instead of 1 Gbps.
Does anyone have an idea what could be wrong? If needed, I can attach the config of both sides.
About the config in a nutshell:
- Site A has a 1 Gbps symmetric fiber connection. Site B has 1 Gbps / 25 Mbps coaxial. I want to see close to 1 Gbps when copying a file from site A to site B.
- The EoIP tunnel is set up as a VLAN trunk.
- An IPsec password is set on the EoIP tunnel.
- There is an additional NAT bridge to deal with dynamic IP addresses at both sides.
- Site A has a PPPoE Internet connection with 1480 MTU (set automatically).
- The MTU of the EoIP tunnel is set manually to 1300. If I let it be set automatically, the two ends get different automatic MTU values, which is definitely strange.
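A rough way to reason about these MTU values, as a minimal Python sketch. The 42-byte EoIP overhead (8 GRE + 14 inner Ethernet + 20 outer IP) is the commonly documented figure; the IPsec overhead of 60 bytes is only an assumed ballpark for ESP with AES-CBC + SHA256, and the function names are made up for illustration.

```python
# Rough MTU budget for EoIP + IPsec on top of a PPPoE link.
# All overhead figures are assumptions for illustration, not measured values.

PPPOE_MTU = 1480          # MTU of the PPPoE link at site A
EOIP_OVERHEAD = 42        # 8 (GRE) + 14 (inner Ethernet) + 20 (outer IP)
IPSEC_OVERHEAD = 60       # ASSUMED ballpark for ESP with AES-CBC + SHA256
TCP_IP_HEADERS = 40       # 20 (IP) + 20 (TCP), no options


def max_tunnel_mtu(link_mtu, eoip=EOIP_OVERHEAD, ipsec=IPSEC_OVERHEAD):
    """Largest inner packet that still fits into one outer packet."""
    return link_mtu - eoip - ipsec


def payload_efficiency(tunnel_mtu, link_overhead=EOIP_OVERHEAD + IPSEC_OVERHEAD):
    """Share of the bytes on the wire that is TCP payload (full-size segment)."""
    payload = tunnel_mtu - TCP_IP_HEADERS
    on_wire = tunnel_mtu + link_overhead
    return payload / on_wire


if __name__ == "__main__":
    print("max unfragmented tunnel MTU:", max_tunnel_mtu(PPPOE_MTU))
    for mtu in (1300, max_tunnel_mtu(PPPOE_MTU)):
        print(f"tunnel MTU {mtu}: {payload_efficiency(mtu):.1%} payload efficiency")
```

With these assumed numbers, header overhead costs roughly 10% of the bandwidth at a tunnel MTU of 1300, so the MTU value by itself would not account for a drop from 1 Gbps to about 28 Mbps, as long as packets are not being fragmented or dropped.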
What I have checked so far:
- Per-core CPU utilization on both sites: while I'm copying a file from a Samba server at site A to a Linux machine at site B, one CPU core reaches 90-100% for 1-2 seconds, then drops back under 10%. This repeats continuously. Overall CPU utilization always stays under 15%. Note that the site A router also runs User Manager (RADIUS server). I think this is fully okay, right?
- Lower MTU on the hosts I copy from and to (the Samba server and the Samba client Linux hosts): EoIP has 1300 MTU with a 1250 MSS clamp, so I set the MTU to 1250 and restarted smbd on the server host. Nothing changed, so I set the MTU back to 1500 on both hosts. To me this means the MTU is not the problem, or if it is, I can't see why the MTU alone would cause such a huge bandwidth loss (how can the loss be calculated from the MTU value?).
- If I copy from the same Samba server inside site A, NOT via the EoIP tunnel, the speed is as expected: 640-720 Mbps (80-90 MBps), which is quite close to 1 Gbps. This tells me the Samba server itself is NOT slow.
- I checked the hardware documentation on mikrotik.com. For the CCR1009-7G-1C-1S+PC, the docs give 133.5 Mbps (16.7 MBps) with the smallest, 64-byte packet size (?) and AES-256-CBC + SHA256 on IPsec. For the other router (RB4011iGS+RM), the same test is listed at 97.4 Mbps (12.175 MBps). In reality I get 24-32 Mbps (3-4 MBps), which is much slower (approx. 70% lower), especially since I feel I should be looking at the 512-byte column of the docs, where the figures are even higher.
- I checked the switch stats too. On the CCR1009-7G-1C-1S+PC, "/interface ethernet switch print stats" gives empty output. I guess this is because there is no dedicated switch chip in it, right? On the other router, the RB4011iGS+RM, I see something like this (not very meaningful to me):
[admin@siteB] > /interface ethernet switch print stats
name: switch1 switch2
driver-rx-byte: 14 521 336 835 23 698 153 721
driver-rx-packet: 19 510 463 22 653 094
driver-tx-byte: 26 127 861 597 13 200 576 097
driver-tx-packet: 23 026 151 19 287 656
rx-bytes: 14 404 274 057 23 562 247 761
rx-packet: 19 510 463 22 653 368
rx-too-short: 0 0
rx-64: 0 274
rx-65-127: 1 033 305 3 608 906
rx-128-255: 8 540 998 1 088 897
rx-256-511: 544 965 225 193
rx-512-1023: 441 236 112 681
rx-1024-1518: 8 234 017 17 423 493
rx-1519-max: 715 942 193 924
rx-too-long: 0 0
rx-broadcast: 139 66 641
rx-pause: 0 274
rx-multicast: 91 742 56 665
rx-fcs-error: 0 0
rx-align-error: 0 0
rx-fragment: 0 0
rx-length-error: 0 0
rx-jabber: 0 0
rx-drop: 0 0
tx-bytes: 25 805 507 167 12 930 548 913
tx-packet: 23 026 405 19 287 656
tx-broadcast: 112 309 172
tx-pause: 254 0
tx-multicast: 1 111 218 176
- I checked the network throughput when copying a file via apache2 over HTTPS with NO EoIP tunnel (i.e. via the public IP). This way I got 160-240 Mbps (20-30 MBps). That is still under the expected 1 Gbps, but I can accept it, since it was sometimes much higher (close to 1 Gbps with 2 threads). I feel this is also okay.
- Finally, I checked the bandwidth with MikroTik's own bandwidth test tool. The bandwidth server ran on the site A router, the client on the site B router. If I set remote-udp-tx-size and local-udp-tx-size to 1250, I get a reasonable speed; otherwise (default UDP size, or TCP) the test reports practically 0 bps:
[admin@SiteB] > /tool bandwidth-test address=192.168.0.254 direction=receive remote-udp-tx-size=1250 local-tx-speed=1250 protocol=udp
status: running
duration: 15s
rx-current: 449.0Mbps
rx-10-second-average: 445.8Mbps
rx-total-average: 399.6Mbps
lost-packets: 4908
random-data: no
direction: receive
rx-size: 1250
connection-count: 20
local-cpu-load: 26%
remote-cpu-load: 48%
[admin@SiteB] > /tool bandwidth-test address=192.168.0.254 direction=receive protocol=udp
status: running
duration: 6s
rx-current: 0bps
rx-10-second-average: 0bps
rx-total-average: 0bps
lost-packets: 0
random-data: no
direction: receive
rx-size: 1500
connection-count: 20
local-cpu-load: 0%
remote-cpu-load: 2%
[admin@SiteB] > /tool bandwidth-test address=192.168.0.254 direction=receive remote-udp-tx-size=1250 local-tx-speed=1250 protocol=tcp
status: running
duration: 13s
rx-current: 0bps
rx-10-second-average: 0bps
rx-total-average: 48bps
random-data: no
direction: receive
connection-count: 20
local-cpu-load: 0%
$ ping 192.168.0.6
PING 192.168.0.6 (192.168.0.6) 56(84) bytes of data.
64 bytes from 192.168.0.6: icmp_seq=1 ttl=64 time=17.8 ms
64 bytes from 192.168.0.6: icmp_seq=2 ttl=64 time=20.1 ms
64 bytes from 192.168.0.6: icmp_seq=3 ttl=64 time=25.6 ms
64 bytes from 192.168.0.6: icmp_seq=4 ttl=64 time=14.3 ms
64 bytes from 192.168.0.6: icmp_seq=5 ttl=64 time=17.3 ms
^C
--- 192.168.0.6 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4006ms
rtt min/avg/max/mdev = 14.339/19.033/25.573/3.756 ms
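One more thing that may be worth ruling out, given the round-trip time in the ping output: a single TCP stream cannot move more than one window of data per round trip. Below is a minimal Python sketch of that ceiling. The 64 KiB window is purely an assumption for illustration (the real window depends on the OS and Samba settings), and the function names are made up.

```python
# Throughput ceiling of a single TCP stream: window / RTT.
# The 64 KiB window is an ASSUMPTION for illustration, not a measured value.


def window_limited_mbps(window_bytes, rtt_ms):
    """Max throughput in Mbit/s when one window is in flight per round trip."""
    return window_bytes * 8 / (rtt_ms / 1000) / 1e6


def window_needed_bytes(target_mbps, rtt_ms):
    """Bandwidth-delay product: bytes in flight needed to sustain target_mbps."""
    return target_mbps * 1e6 / 8 * (rtt_ms / 1000)


if __name__ == "__main__":
    rtt_ms = 19.033  # average RTT taken from the ping statistics
    print(f"64 KiB window: {window_limited_mbps(64 * 1024, rtt_ms):.1f} Mbps")
    print(f"window for 1 Gbps: {window_needed_bytes(1000, rtt_ms) / 1e6:.2f} MB")
```

At a 19 ms RTT, an assumed 64 KiB window gives a ceiling of about 27.5 Mbps, which is in the same range as the 24-32 Mbps I measure, while sustaining 1 Gbps would need roughly 2.4 MB in flight. This does not prove the copy is latency-bound, but it suggests checking the effective TCP window and testing with several parallel streams.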