PCC (per-connection-classifier) hash does not work well with ports?

I could not find a description of how PCC includes addresses and ports in its hash, but from my testing it appears that the port numbers are included in the data to be hashed, something like:
PCC_hash = Hash(seed? & src_addr & dst_addr & src_port & dst_port), where ‘&’ is string concatenation

The problem with that approach shows up when we want to use PCC for load balancing (its most obvious use). When we load balance over two connections with PCC, the processes that should logically benefit most are those that open multiple connections to download or upload something. Those connections have all other data identical except for src_port, which tends to increment by one (for example, speedtest.net will open connections on 6 consecutive ports).

If we have two similar links, in the above example we would like speedtest.net to use 3 connections on one ISP link and 3 on the other. We would use something like per-connection-classifier=both-addresses-and-ports:2/0 to ensure src_port is included in the hash.

So, what is the problem? Due to how Mikrotik calculates the hash, very often N connections would NOT be split evenly, with N/2 going to ISP1 and N/2 going to ISP2. In our speedtest case, very often we would have 4 connections to one ISP and 2 to the other. That is because even and odd ports have an equal chance of producing an even hash. So when speedtest uses ports 52900, 52901, 52902, 52903, 52904, 52905 (3 even and 3 odd), the hash function can end up with 4 odd and 2 even numbers, or in rare cases even all odd numbers, and then all connections would go over a single ISP. But in many cases the split would simply be uneven.
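To put rough numbers on this, here is a small Python sketch. It assumes the hash behaves like an independent fair coin flip per connection (my assumption; I do not know what Mikrotik's hash actually does), so the split of 6 connections over 2 links follows a binomial distribution. The function name is just for illustration.

```python
from math import comb

def split_probabilities(connections: int) -> dict[int, float]:
    """Probability that exactly k of `connections` land on link 0,
    assuming each connection picks a link independently with chance 1/2."""
    total = 2 ** connections
    return {k: comb(connections, k) / total for k in range(connections + 1)}

probs = split_probabilities(6)
even_split = probs[3]                # 3:3
four_two = probs[4] + probs[2]       # 4:2 in either direction
five_one = probs[5] + probs[1]       # 5:1 in either direction
all_one_link = probs[6] + probs[0]   # 6:0 in either direction

print(f"3:3 {even_split:.3f}  4:2 {four_two:.3f}  "
      f"5:1 {five_one:.3f}  6:0 {all_one_link:.3f}")
```

Under that assumption the ideal 3:3 split happens in only 20 of 64 cases (about 31%), 4:2 in 30 of 64 (about 47%), and the 6:0 case in 2 of 64 (about 3%).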

This problem only gets worse with 3 or more links: the chance that PCC will split 6 consecutive source ports evenly across 3 outgoing ISP links (per-connection-classifier=both-addresses-and-ports:3/0, :3/1, :3/2) is even lower.
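A quick way to compare the 2-link and 3-link cases, again assuming each connection lands on a uniformly random link independently of the others (an assumption about the hash, not documented behaviour). The helper name is hypothetical.

```python
from math import factorial

def even_split_probability(connections: int, links: int) -> float:
    """Chance that uniformly hashed connections land in equal numbers
    on every link; 0.0 when the count is not divisible by the links."""
    if connections % links:
        return 0.0
    per_link = connections // links
    # Multinomial coefficient: ways to assign equal-sized groups to links.
    arrangements = factorial(connections) // factorial(per_link) ** links
    return arrangements / links ** connections

two_links = even_split_probability(6, 2)    # 20 / 64
three_links = even_split_probability(6, 3)  # 90 / 729

print(f"2 links: {two_links:.3f}, 3 links: {three_links:.3f}")
```

So a perfect 2:2:2 split across three links would happen in roughly 12% of cases, compared with roughly 31% for a 3:3 split across two links.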

Suggested solution: change PCC_hash to add the ports after the hash is calculated (or at least add src_port afterwards). For example:
Better_PCC_hash = Hash(seed? & src_addr & dst_addr & dst_port) + src_port, where ‘+’ is integer addition

This way, when source ports go even-odd-even-odd-even-odd (as they almost always do when the same app uses multiple connections), we are guaranteed similarly alternating odd/even hashes (exactly 3 odd and 3 even). The same logic holds for 3 or more links: the hashes would be split evenly across links whenever the source ports are consecutive.
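A minimal Python sketch of the two variants. CRC32 is only a stand-in here, since the real PCC hash function is unknown to me, and the addresses, ports and function names are made up for illustration.

```python
import zlib
from collections import Counter

def base_hash(*fields) -> int:
    """Stand-in hash (CRC32 of the concatenated fields); NOT the real PCC hash."""
    return zlib.crc32("|".join(str(f) for f in fields).encode())

def current_style_link(src, dst, sport, dport, links=2) -> int:
    # Presumed current behaviour: src_port is part of the hashed data.
    return base_hash(src, dst, sport, dport) % links

def proposed_link(src, dst, sport, dport, links=2) -> int:
    # Proposed: hash everything except src_port, then add src_port.
    return (base_hash(src, dst, dport) + sport) % links

ports = range(52900, 52906)               # 6 consecutive source ports
addrs = ("192.168.88.10", "203.0.113.5")  # hypothetical client and server

current = Counter(current_style_link(*addrs, p, 443) for p in ports)
proposed = Counter(proposed_link(*addrs, p, 443) for p in ports)
proposed3 = Counter(proposed_link(*addrs, p, 443, links=3) for p in ports)

print("current-style split:", dict(current))
print("proposed, 2 links:  ", dict(proposed))
print("proposed, 3 links:  ", dict(proposed3))
```

With the proposed variant, 6 consecutive ports always give 3:3 on two links and 2:2:2 on three, whatever the base hash returns. The current-style result depends entirely on the stand-in hash, so it says nothing about real RouterOS behaviour.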

Every source port is a new connection, so why not use NTH with connection-state=new then? That gives you an even spread. And with either spreader you don’t have to put an NTH/PCC filter on the last line.
Let’s say you use PCC with 5 connections: then you have 5/0, 5/1, 5/2, 5/3 and you can omit 5/4.

While I understand that load balancing can be done without PCC (for example, with NTH), my post was about a potential issue that PCC has with load balancing (and an easy way to fix it).

Are you looking for better speedtest results in lab conditions, or real life improvements? Because I don’t think it would help much with the latter.

I don’t know what exactly speedtest does, but it makes sense that it would open several connections, each transferring the same amount of data. So if nothing else is connecting elsewhere, it will use consecutive port numbers and what you’re proposing would help.

But for regular use, one connection will be downloading 1 kB, another 10 MB, often each from different servers at different speeds, etc. So the result will be similar to what you have now: statistically it will work, and with many connections over a long time the distribution will be more or less equal. But short term, with few connections, it will still be highly unbalanced.

I’m looking for real life improvements, in cases where real life applications use multiple connections. Some examples:

  • Chrome with #enable-parallel-downloading, to speed up any download via the browser.
  • Steam, which by default uses dozens of connections when installing games.
  • Speedtest.net, which by default uses 6 connections each for download and upload (this is on the boundary between lab and real life usage).
  • BitTorrent clients (even with a single peer they can use multiple connections).

The effect of PCC splitting that I mentioned in my initial post could especially be felt in the Chrome case, which appears to open only 3 connections when #enable-parallel-downloading is enabled (by default, Chrome uses only one connection per download). With those 3 connections (which are on consecutive ports like 55900, 55901, 55902) it is quite probable that the 3 PCC hash values would all end up even or all end up odd, making Chrome download over the same ISP even when #enable-parallel-downloading is used. If Mikrotik uses an actual hash function, then in around 25% of cases (000 and 111 out of 8 combinations) Chrome would download slowly over a single ISP even though we load balance across two ISPs.
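The 25% figure can be checked by simply listing all combinations. This sketch assumes every assignment of connections to links is equally likely (that is, the hash looks random with respect to source port); the function name is only illustrative.

```python
from itertools import product

def single_link_fraction(connections: int, links: int = 2) -> float:
    """Fraction of equally likely link assignments that put every
    connection on the same link."""
    combos = list(product(range(links), repeat=connections))
    same_link = sum(1 for combo in combos if len(set(combo)) == 1)
    return same_link / len(combos)

chrome = single_link_fraction(3)     # 000 and 111 out of 8 combinations
speedtest = single_link_fraction(6)  # 2 out of 64 combinations

print(f"3 connections: {chrome:.3f}, 6 connections: {speedtest:.3f}")
```

With only 3 connections, a quarter of downloads end up on one ISP, while with 6 connections that drops to about 3%.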

In the Speedtest.net case it is less problematic, since it uses 6 connections. While the ideal would be a 3:3 split across two ISPs, even 4:2 is not so bad (although it can be if one ISP applies per-connection limits), and the only really bad case would be the rare 6:0 split.

I’m not sure about BitTorrent clients. With many peers they can certainly use many connections, and any “unevenness” wouldn’t be a problem. But with a single peer they are probably limited to a small number of connections (there are only 8 listening ports by default), in which case the impact would be the same as for Chrome downloads.

In the Steam case it is practically not an issue, since Steam appears to use dozens of connections (at least for the large games I installed for tests), so even a bad/uneven split due to the PCC hash will still leave enough connections on both ISPs.

I’m sure there are other apps that use parallel downloads, and those that open a relatively small number of parallel connections (Chrome, single-peer torrents…) are especially vulnerable to uneven hashes. My point was that making PCC hash numbers consecutive for consecutive source ports could have real life benefits for anyone who uses PCC to load balance on Mikrotik.