Hex E50UG

An eight-year-old MMIPS device works fine out of the box without any special queue change, yet a not-even-year-old ARM device requires it to work properly?? And if a queue change is “required” for normal operation, why is it not the default? This feels like a red herring / waste of time.

Edit: All other linked threads about this complaint show very low CPU usage along with the low bandwidth results - this is not a queuing problem.

Please try it and hopefully confirm. The default config could then be updated as well…

As Normis stated, if you are experiencing issues, send a supout to MT; the more data they receive, the faster they can identify the issue and the possible resolutions.

As I stated earlier, I don’t have a device at my disposal to test, so I’m only speculating.

Regarding whether it’s a queuing issue: it has all the telltale signs of one. TCP requires that a certain number of packets (some proportion of the window size) can be queued. For ports connected through the switch, reports show that the device functions like its predecessor, i.e. correctly. The switch chip has a hardware buffer - I tried looking up the exact size, but was not successful in short order; these chips usually have around 1 Mb (megabit) of buffer, which at full-size frames is on the order of 100 packets. This is usually necessary to maintain gigabit performance. Standalone MACs usually have some fraction of this and rely on the host for queuing.
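For the “on the order of 100 packets” figure, the back-of-the-envelope arithmetic (assuming full-size 1500-byte frames, i.e. 12000 bits each) can even be done in the RouterOS console:

```
# ~1 Mbit of switch buffer expressed in full-size 1500-byte frames
:put (1000000 / 12000)
# -> 83, i.e. on the order of 100 packets
```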

Regarding CPU load, that’s exactly what is expected: because the TCP stack sees packet loss, the window size is reduced. So even though the device isn’t running out of CPU power, speeds are reduced. That’s the usual outcome of insufficient buffering.
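To put numbers on the window-size effect (hypothetical figures, just to illustrate the mechanism): maximum TCP throughput is roughly window / RTT, so a sender stuck at a 64 KiB window over a 100 ms path is capped regardless of link speed:

```
# max TCP throughput ~ window / RTT
# 65536 bytes * 8 bits, divided by 0.1 s, equals the window in bits times 10
:put (65536 * 8 * 10)
# -> 5242880 bits/s, i.e. roughly 5 Mbps from a 64 KiB window at 100 ms RTT
```

That’s why low throughput with low CPU usage is exactly what a loss-induced window collapse looks like.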

Also, at least one user confirms much higher speeds when configuring cake queues (which are actively managed, and thus are able to absorb these bursts).
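For reference, one way to attach a cake queue to the WAN port in RouterOS - this is just a sketch of what that user may have done, not a confirmed fix, and the queue name `cake-wan` is my own:

```
# create a cake queue type and use it as ether1's transmit queue
/queue/type/add name=cake-wan kind=cake
/queue/interface/set ether1 queue=cake-wan
```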

If the driver works correctly, the interface should report TX drops. In TCP testing only a minimal number of losses would get reported - and again, only if the driver reports them correctly.
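Those counters can be checked from the console on an affected unit (a sketch; which counters the driver actually populates varies):

```
# aggregate per-interface counters, including tx-drop
/interface/print stats where name=ether1
# lower-level driver/PHY counters, if the driver exposes them
/interface/ethernet/print stats where name=ether1
```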

Again, I’m not sure about any of this, but someone who actually has a device they can play around with could easily test this.

EDIT: Of course there’s the other possibility, that it’s actually the switch side that’s dropping the packets… In that case further diagnostics are needed.

I have checked it now and it did not solve my problem. Download speed is no more than 100 Mbit, while upload speed is up to 600 Mbit. The problem is observed only on ether1 and with PPPoE.

Please send supout to MT, to get this fixed asap.
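For anyone following along, the support file can also be generated from the console (a sketch; it can likewise be done from WinBox):

```
# generates supout.rif in Files, which can be attached to the support ticket
/system/sup-output
```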

Anav, it seems like Liiina already did that? Should he create a second ticket for the same issue, you think?

If I have time later tonight I will factory-default my E50UG, reproduce the problem, and generate a supout after the speed test. Then I’ll try a different factory-defaulted MikroTik (maybe an L009 or hAP ax lite), confirm the problem does not happen on that one, generate a second supout, and send both to MikroTik support. Perhaps they will be able to compare/diff the two of them and see something…

Edit: I will test the E50UG against the L009, since its port 1 is set up the same as the E50UG’s - it has its own 1 Gb/s dedicated path to the CPU, very similar. Also, both are ARM with similar clock speeds.

I have executed the factory default testing against an L009 and E50UG. Here is the setup:

300/300 Mbps symmetric internet service, provided by Verizon FiOS, FTTH.
Testing client is 2020 MacBook Air M1 laptop.
L009UiGS-2HaxD-IN was upgraded to 7.19.1 (package/routerboard) and reset to complete factory defaults.
E50UG was upgraded to 7.19.1 (package/routerboard) and reset to complete factory defaults.
Not a SINGLE setting was modified on either router, not even the default password.

The same ethernet cables and even the same power brick source were used for testing L009/E50UG.

Test #1 - MacBook Air M1, hooked directly into provider ONT via dongle/ethernet cable.

   Speedtest by Ookla
      Server: PhoenixNAP Global IT Services
         ISP: Verizon Fios
Idle Latency:    13.32 ms   (jitter: 0.58ms, low: 12.70ms, high: 13.87ms)
    Download:   306.56 Mbps (data used: 325.6 MB)
                114.65 ms   (jitter: 31.78ms, low: 12.78ms, high: 168.55ms)
      Upload:   341.12 Mbps (data used: 397.9 MB)
                 21.55 ms   (jitter: 1.16ms, low: 9.51ms, high: 50.92ms)
 Packet Loss:     0.0%

Test #2 - MacBook Air M1, plugged into L009UiGS port 2, L009UiGS port 1 is plugged directly into provider ONT.

   Speedtest by Ookla
      Server: Verizon
         ISP: Verizon Fios
Idle Latency:     4.36 ms   (jitter: 0.34ms, low: 4.13ms, high: 4.56ms)
    Download:   309.38 Mbps (data used: 312.2 MB)
                112.79 ms   (jitter: 31.42ms, low: 3.69ms, high: 157.20ms)
      Upload:   344.00 Mbps (data used: 401.9 MB)
                  7.66 ms   (jitter: 0.74ms, low: 5.90ms, high: 20.38ms)
 Packet Loss:     0.0%

Test #3 - MacBook Air M1, plugged into E50UG port 2, E50UG port 1 is plugged directly into provider ONT.
NOTE: I ran this test three times, because the results are pretty crazy…

   Speedtest by Ookla
      Server: Verizon
         ISP: Verizon Fios
Idle Latency:     9.03 ms   (jitter: 0.20ms, low: 8.81ms, high: 9.44ms)
    Download:   309.34 Mbps (data used: 292.8 MB)
                110.19 ms   (jitter: 30.95ms, low: 8.31ms, high: 163.19ms)
      Upload:     9.82 Mbps (data used: 6.4 MB)
                 46.99 ms   (jitter: 44.30ms, low: 5.72ms, high: 570.53ms)
 Packet Loss:     0.0%

   Speedtest by Ookla
      Server: Verizon
         ISP: Verizon Fios
Idle Latency:     8.81 ms   (jitter: 0.64ms, low: 8.44ms, high: 9.20ms)
    Download:   309.55 Mbps (data used: 273.9 MB)
                106.45 ms   (jitter: 28.09ms, low: 8.37ms, high: 161.82ms)
      Upload:    12.36 Mbps (data used: 8.1 MB)
                  8.30 ms   (jitter: 4.43ms, low: 6.35ms, high: 180.93ms)
 Packet Loss:     0.0%

   Speedtest by Ookla
      Server: Cox - Nova
         ISP: Verizon Fios
Idle Latency:     8.82 ms   (jitter: 2.18ms, low: 8.68ms, high: 13.05ms)
    Download:   306.94 Mbps (data used: 329.8 MB)
                113.20 ms   (jitter: 35.49ms, low: 8.99ms, high: 496.65ms)
      Upload:    11.31 Mbps (data used: 17.4 MB)
                 11.90 ms   (jitter: 23.08ms, low: 9.58ms, high: 624.44ms)
 Packet Loss:     3.1%

Uhh… yeah… I can only get 12Mbps upload speed on my E50UG… queue setting? Please….

I took supout.rif files from each router directly after testing and will open a ticket with both if @Normis thinks it can help.

EDIT: I went ahead and created a support ticket with both supout.rif attached…SUP-189386
@Normis - can you please have this straightened out, and restore our faith in both your products and your support team? Thanks in advance…

I seem to be aggravating people. This is a frustrating issue, and denying it or saying that everything is fine is genuinely not my intention. But…

I’ve actually attempted to look at which NIC and driver are used in OpenWrt and other general Linuxes, and seem to have found the one. (With MediaTek/Airoha and others, with all the IP sharing, cross-licensing and mergers, one can sometimes be a bit confused as to what exactly is going on.)

As it turns out, the NIC used has a smallish packet buffer of 100 Kb (kilobits), and it is causing problems for other people as well. This means that software egress queuing is strictly necessary. It was also found that unless TX pause is enabled, hardware buffer overflows tend to happen, which can limit ingress TCP to below 40 Mbps (even when a full gigabit is available). With TX pause the ingress bandwidth is still limited: because of the pausing, this NIC can only sustain somewhere upwards of 800 Mbps over a gigabit line instead of the usual 900-950 Mbps - while not ideal, this is probably tolerable. This is a hardware limitation. (And if using the port for a WAN connection, I would also enable RX pause - some terminal equipment really likes its pause frames not to be ignored.)

1. 2x E50UG from different sources and buy dates - nothing changed
2. Tried the script from the forum - nothing changed
3. Ticket to support - waiting
4. Update/downgrade - nothing changed
5. ETH1 must be WAN

@Szczepan: Since you seem to be having problems on the download side, the suggestion for you was to enable flow control:

/interface/ethernet/set [ find name=ether1 ] tx-flow-control=on rx-flow-control=on

Again, just worth a try. The OpenWrt guys had good luck with it.

Personally I can only dream about a 300/300 internet connection, but going with the good ol’ rule of thumb of 512-byte packets with 25 firewall rules, I would look at the hEX (RB750GR3):

Routing 25 ip filter rules 265.2

and would judge it as “too weak” (though we have seen that in real world it can manage this kind of connection).

Then I would check the hEX refresh (newer, faster processor, 8x storage, 2x memory, for the SAME price) and see:

Routing 25 ip filter rules 498.1

and reasonably believe that it is a winner.

The note that the reported speeds are - unlike with the old model - only (maybe) reachable with finely tuned (undocumented) flow control (not in the default config) and queues (not in the default config), on Wednesday nights, if there is a full moon, in months with an R, somehow escaped the product page and brochure :open_mouth: .

If I had bought one of those devices, I would probably demote it to a 4 port switch, but I wouldn’t be happy at all :frowning: .

One could conclude that MT does not conduct UPLOAD tests at all when producing the throughput figures on their charts.
I say this because one might assume they use ether1 for testing, in which case this issue would have been discovered long ago, prior to distribution.

@lurker888 I do appreciate any/all input about this problem, but my counterpoint to you is this:

If the NIC buffer is the problem and it needs queues and flow control or whatever, then how do you explain mirolm’s report of “it works fine for me”? He did not indicate he had to do anything special at all…

Or it could be that they only look at the cumulative number (the Xena tests are supposed to be run on all ports) and don’t notice when a single port performs significantly differently from the rest. If the cumulative routing throughput of the hEX refresh is way less than the sum of all ports’ theoretical max, then a misbehaving port can hide in the bushes.

Support team just asked me “if I use port 2-5 for wan, do speeds improve?”…..

Well, yes, like every other person with this problem (I’ve seen at least a dozen reported, who knows how many not) if I use port 2 as my WAN the speeds definitely improve. But, that’s not really the point here…

When it comes to buffers, timing is pretty important. In certain cases the timing of the link partner might be just right not to expose local problems. E.g. if the link partner is only capable of sending, say, 800 Mbps of traffic without (micro)bursts, then a problematic interface could manage to fetch ingress frames fast enough not to experience Rx drops. Even traffic with a much lower average throughput but with (micro)bursts might cause Rx drops… and drops in TCP communication mean severe degradation of end-to-end throughput.
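Rx-side drops from such bursts should show up in the port counters on an affected unit. A sketch of how to catch them (counter names depend on what the driver exposes):

```
# zero the counters, run a speed test, then look for climbing rx drop/overrun counters
/interface/ethernet/reset-counters ether1
/interface/ethernet/print stats where name=ether1
```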

I wonder what the buffers on the L009 are, seeing as that model gets me full throughput even though it has a “weaker” CPU.

EDIT: I’m starting to suspect the built-in PHY of the EN7562CT is just a cheap piece of crap.

@codelogic: The explanation is quite simple. He has hard-limited the download traffic to 560 Mbps with a codel queue. Apparently the device handles that fine.

More generally, TCP is known to suffer from this sort of meltdown-style problem. It’s quite common to see e.g. a 2.5GbE port switched to a 1GbE port produce TCP speeds (in the 2.5→1 direction) anywhere from 100 to 300 to 700 Mbps. The same measurement repeated with either the 2.5GbE port degraded to 1GbE, or a queue inserted that limits bandwidth to 900 Mbps, results in near-gigabit performance. This is the kind of problem happening here.
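As a concrete sketch of that second workaround (interface name and limit are hypothetical, pick values just below line rate for your setup):

```
# cap traffic toward the slower port slightly below line rate,
# so the software queue absorbs bursts instead of the tiny hardware buffer
/queue/simple/add name=limit-900 target=ether2 max-limit=900M/900M
```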

And different premises devices tickle these problems differently.

I’m not exactly pulling this idea out of thin air; the OpenWrt guys simply added a remark to the effect of “to achieve best performance, enable flow control”. And people on the relevant forum did complain that it should read “if you don’t want atrociously slow performance…” (somewhat understandably).

So basically I don’t really doubt that adequate results can be achieved by carefully pushing an evenly spaced UDP flow through the device - which is how MikroTik’s tests are documented to be done.

If flow control doesn’t help the device achieve normal download speeds, then probably the driver has a broken flow control implementation. (I saw no sign of this, the driver released under the GPL had no obvious such problem.)

The L009 contains a fairly decent switch chip that is used to interface to all of its ports. Of course problems don’t manifest there (as they also don’t on the switch-connected ports of the Refresh.)

I’m not trying to make excuses for the designers of the chip, but obviously the thing is built down to a price and intended for a fairly specific use case. If some things have to be done in software to work correctly, that’s not what the designers of these parts would consider abnormal. Add to that that, like the MT7621, this part also has a fairly powerful (and, in its class, well-designed) packet processor for NAT/PPPoE/PPPoE-over-VLAN/etc. scenarios, and it was probably expected that this would handle the packets directly, instead of the CPU.