I’ve been using PCC for load balancing for a few years, and as the ISP speeds for each connection changed over time, I’ve used several different PCC ratios. Recently I’ve upgraded to 1Gbps + 2Gbps connections and for the first time I’m using a 1:2 ratio.
Before (with ratios like 3:10, 3:5 and others), I was always able to test the “combined” bandwidth by running multiple simultaneous download or upload transfers, and I could get really close to the sum of the bandwidth of both connections.
With a 1:2 ratio I can never get even close to the sum of the bandwidth. In fact I can only reach exactly the bandwidth of the faster connection (2Gbps).
So you have 3 times 1 Gbps of speed, divided as 1/3 and 2/3. Then you should only connection-mark one third (0/3) of the total traffic with PCC and not touch the other 2/3.
That way you are only marking the traffic going through the 1 Gbps connection, and all unmarked traffic should then be routed to the 2 Gbps connection.
Also note that traffic can be ignored when the receiving end expects related traffic to come from the same IP address, as AmmO wrote in his earlier post.
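To make the "mark only one third" idea concrete, here is a rough Python sketch of the logic, not a RouterOS config. It assumes a uniform hash (SHA-256 is just a stand-in, RouterOS uses its own hash), and the names `pcc_remainder`, `pick_wan`, `simulate`, `wan_1g`, `wan_2g` and the IP addresses are all made up for illustration.

```python
import hashlib
import random
from collections import Counter


def pcc_remainder(src_ip, src_port, dst_ip, dst_port, denominator=3):
    """Hash the connection tuple and return its remainder bucket.

    ASSUMPTION: SHA-256 is only a stand-in; RouterOS uses its own hash.
    """
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % denominator


def pick_wan(remainder):
    # Only remainder 0 is "marked" for the 1 Gbps link; everything else
    # stays unmarked and follows the default route to the 2 Gbps link.
    return "wan_1g" if remainder == 0 else "wan_2g"


def simulate(n_connections, seed=1):
    """Count how many hypothetical connections land on each WAN."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_connections):
        src_port = rng.randint(1024, 65535)
        dst_ip = f"203.0.113.{rng.randint(1, 254)}"  # documentation range
        remainder = pcc_remainder("192.168.88.10", src_port, dst_ip, 443)
        counts[pick_wan(remainder)] += 1
    return counts


if __name__ == "__main__":
    counts = simulate(30000)
    total = sum(counts.values())
    for wan, n in sorted(counts.items()):
        print(f"{wan}: {n} connections ({n / total:.1%})")
```

With many connections and a well-behaved hash, roughly one third of them should end up on the 1 Gbps link. Note that this splits connections, not bytes, so the bandwidth split only follows when the connections carry similar amounts of traffic.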
I’ve tried two routers that (IMHO) are fully capable of handling in excess of 3Gbps:
An RB5009, with a 10G port connected to my internal network, the 2.5G port connected to the 2G ISP modem and the 1G ISP modem connected to a 1G port.
An x86 based router (R86S box) with 2 SFP+ cages and 3 2.5G ethernet ports. Also connected at 10G to my network, with both modems connected to 2.5G ethernet ports.
So, router performance is not an issue. And the interesting fact is that the result is exactly the same with both routers (I think the R86S is way faster, but it runs very, very hot).
I do use a single hash (for 0/3) and use both-addresses-and-ports.
And I think I’ve tried 2:4, but I will try interleaving, with 0/6, 2/6 and 4/6 going to the same ISP. Maybe.
I see where you are coming from, although we are using different routers. The RB5009 has all the Ethernet ports attached to the Marvell switch-chip, so there is no difference in using any of the Ethernet ports on the RB5009. And as I posted, I’ve tried with an x86 router (an R86S mini system) and the results were the same.
Is the traffic VPNs/tunnels/etc, e.g. are connections going to a small set of destinations on the WANs? Or is there a lot of general internet traffic flowing (e.g. lots of connections with many different destination IPs)? Basically it would be good to know if the issue is hashing not creating a suitably random distribution… Or… if there is some config issue elsewhere.
I usually use ECMP, which round-robins connections (e.g. 2 routes to ISP1 and 1 route to ISP2, all with the same distance=, to get a 2:1 ratio)… PCC should work too, but the math will always calculate the same destination WAN for a particular app/service, which is why I ask about the number of connections and am concerned about hashing. With ECMP: more connections, more randomness.
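To put some numbers on why the connection count matters, here is a small Python sketch. It assumes each connection lands on the 1 Gbps link independently with probability 1/3 (an idealized hash, not a measurement of what RouterOS actually does); the function names are made up for illustration.

```python
from math import comb


def prob_k_on_1g(n, k, p=1 / 3):
    """Probability that exactly k of n connections hash to the 1 Gbps link."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_most(n, k_max, p=1 / 3):
    """Probability that the 1 Gbps link gets k_max connections or fewer."""
    return sum(prob_k_on_1g(n, k, p) for k in range(k_max + 1))


if __name__ == "__main__":
    for n in (4, 8, 16, 64):
        print(
            f"{n:3d} connections: "
            f"P(none on 1G) = {prob_at_most(n, 0):.4f}, "
            f"P(at most {n // 6} on 1G) = {prob_at_most(n, n // 6):.4f}"
        )
```

Under that assumption, a test with only 4 parallel transfers leaves the 1 Gbps link completely idle in (2/3)^4 = 16/81 of the runs, which is about one run in five. So a speed test with a handful of connections can look like "only the 2 Gbps link is used" even when the rules are fine; more connections make that much less likely.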
I do not think there is any such bottleneck, as I can test the 2Gbps connection independently and it works fine at the nominal bandwidth. I observe the same for the 1Gbps ISP.
I get the same result even when there is no other network traffic. So the problem seems to be with the hash distribution. As for any other configuration issue, I’m reviewing it constantly in search of any problem.
One thing I may try next weekend is to use a very simple config and see if the issue persists.
If I understand ECMP correctly, it will use “per-src-dst-address combination load balancing”, so one will never be able to fully use all the bandwidth for a multi-connection download from a single source. Not really easy to accomplish at such speeds anyway, but if downloading from a CDN that is close it may work. With PCC, having the (random) port in the equation may alleviate the issue.
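A quick Python sketch of that difference, under the assumption that the balancer hashes exactly the fields named (SHA-256 again stands in for the real hash, and the addresses, ports and function names are hypothetical):

```python
import hashlib


def bucket(fields, denominator=3):
    """Hash a tuple of fields into a remainder bucket.

    ASSUMPTION: SHA-256 stands in for whatever hash the router really uses.
    """
    key = "|".join(str(f) for f in fields).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % denominator


def buckets_used(n_connections, include_ports):
    """Return the set of buckets hit by n parallel connections to one server."""
    src_ip, dst_ip, dst_port = "192.168.88.10", "203.0.113.7", 443
    used = set()
    for i in range(n_connections):
        src_port = 50000 + i  # one hypothetical ephemeral port per connection
        if include_ports:
            fields = (src_ip, src_port, dst_ip, dst_port)
        else:
            fields = (src_ip, dst_ip)  # identical for every connection
        used.add(bucket(fields))
    return used


if __name__ == "__main__":
    print("addresses only:     ", sorted(buckets_used(200, include_ports=False)))
    print("addresses and ports:", sorted(buckets_used(200, include_ports=True)))
```

With addresses only, every connection between the same two hosts produces the same key, so they all share one bucket and therefore one WAN. Adding the source port changes the key per connection, so parallel connections to the same server can spread over all buckets.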
There is no reason for PCC not to be working. The full 3Gbits/sec, less overhead and some losses, should be available for connections. Suspect a config setup issue???
That’s where it gets odd. I still have another 600Mbps link around (connected to another Mikrotik). If I use that link, along with the 1Gbps and a 3:5 ratio, it gives me 1.6Gbps. The only config changes are in the PCC rules to set up the 3:5 ratio. Nothing else.
Also:
a) Back to the 2Gbps, I can get the full bandwidth when running without PCC (as the only connection).
b) Same with the 1Gbps.
c) The tester system is connected to the Mikrotik (RB5009) on the 10Gbps port.
d) I’ve tested the 10Gbps with an iperf3 container on the Mikrotik and the tester-> Mikrotik gives around 5Gbps (becomes CPU bound on the RB5009).
All that said, it really looks very strange.
The only test I’ve not performed is to get a spare RB5009 and load it with a very simple config (with PCC) and see what happens. But that’s a more complex endeavor. Will see if I can do it over the weekend.