I don't understand how adding more options of fields to feed into the hashing algorithm solves your problem. Connections are specific to a TCP 4-tuple consisting of the two IP addresses and ports. Using the connection identifier would mean each connection would potentially be on a different circuit, potentially breaking secure sites.
That's why we use "src address only" as the classifier. But OK, I will try to explain better.
We use "src-address only" in PCC (the other options risk "breaking" connections and are thus not recommended).
Now a client (always the same IP) will always produce the same hash output, and since that output drives the routing decision, his traffic will always take the same route out of my router.
Like your article explained:
quote: the hash function is fed 1.1.1.1 as the source IP address, 10000 as the source TCP port, 2.2.2.2 as the destination IP address and 80 as the destination TCP port. The output will be 1+1+1+1+10000+2+2+2+2+80 = 10092, the last digit of that is 2, so the hash output is 2. It will produce 2 every time it is fed that combination of IP addresses and ports. unquote.
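To make the quoted arithmetic concrete, here is a minimal Python sketch of that toy hash. It only mirrors the simplified "add everything up, keep the last digit" explanation from the article; it is not the hash RouterOS actually uses, and the name toy_hash is just something I made up for illustration.

```python
def toy_hash(fields):
    """Sum every number in the given fields and keep the last decimal digit."""
    total = 0
    for field in fields:
        # An IP address contributes each of its octets; a port contributes itself.
        total += sum(int(part) for part in str(field).split("."))
    return total % 10


# All four fields, as in the quoted example: 10092 -> last digit 2.
four_tuple = toy_hash(["1.1.1.1", 10000, "2.2.2.2", 80])

# "src-address only": the same client always hashes to the same value,
# whatever ports or destinations it uses.
src_only = toy_hash(["1.1.1.1"])

print(four_tuple)  # 2
print(src_only)    # 4
```

With only the source address as input, nothing the client does can ever change the result, which is exactly the behaviour I described above.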
Now, suppose we add a new digit "1", produced by the connection tracker for new connections coming from the same client. This digit stays the same as long as the IP of this client has connections alive, even in a waiting state.
Then, if the client doesn't produce any connections for some time and the old ones have all timed out in the connection tracker, the next time this source address produces new connections it will get a new digit "2". For this, ROS has to monitor connection activity per source address (so the router needs to maintain a table of source addresses).
So, an example as a variation on the lines from your article:
the hash function is fed 1.1.1.1 as the source IP address, 10000 as the source TCP port, 2.2.2.2 as the destination IP address and 80 as the destination TCP port, plus one (NEW) digit "1". The output will now be 1+1+1+1+10000+2+2+2+2+80+1 = 10093, the last digit of that is 3, so the hash output is 3. It will produce 3 every time it is fed that combination of IP addresses, ports and digit.
Now, after some timeout the extra digit changes to "2". The sum becomes 10094, so the hash output changes to "4" for otherwise identical connections. PCC therefore gives a different output, mangle can then apply a different WAN routing mark, and the client's connections will start to use a different WAN port!
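Here is a rough Python sketch of what I mean, under some loud assumptions: the toy hash is the simplified one from the article (not the real RouterOS hash), EpochTracker and idle_timeout are names I invented, and "all connections timed out" is approximated by "nothing seen from this source address for longer than idle_timeout seconds". Nothing like this exists in ROS today as far as I know; it only illustrates the idea.

```python
def toy_hash(fields):
    """Sum every number in the given fields and keep the last decimal digit."""
    total = 0
    for field in fields:
        # An IP address contributes each of its octets; a port contributes itself.
        total += sum(int(part) for part in str(field).split("."))
    return total % 10


class EpochTracker:
    """Hypothetical per-source-address table that hands out the extra digit."""

    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout
        self.state = {}  # src_ip -> (digit, time the address was last seen)

    def digit_for(self, src_ip, now):
        digit, last_seen = self.state.get(src_ip, (1, now))
        # The address was idle long enough: give it the next digit.
        if now - last_seen > self.idle_timeout:
            digit += 1
        self.state[src_ip] = (digit, now)
        return digit


tracker = EpochTracker(idle_timeout=60)
conn = ["1.1.1.1", 10000, "2.2.2.2", 80]

first = toy_hash(conn + [tracker.digit_for("1.1.1.1", now=0)])    # digit 1
same = toy_hash(conn + [tracker.digit_for("1.1.1.1", now=30)])    # still active
later = toy_hash(conn + [tracker.digit_for("1.1.1.1", now=200)])  # idle > 60 s

print(first, same, later)  # 3 3 4
```

While the client stays active the output stays at 3, so running sessions keep their route; only after the idle period does the same four-tuple hash to 4.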
In this scenario the timeout setting of the connection tracker becomes a factor too.
The shorter the timeouts are, the more often a new digit will be produced for a given source address, and the better the chance of good load balancing.
To summarize:
We are looking to give the hash function an extra digit that is based on a time factor (or any other factor not related to the other 4 connection tracker parameters).
What do you think of this so far?