I guess what you guys missed is that I’m already using this to great effect; I just want it included in Mikrotik to remove an extra piece of hardware and simplify the design.
The internet itself is slightly lossy. Pick ANY connection, ping essentially any off-net resource, and wait. There will be a packet lost at some point, and there will be a high-latency packet at some point.
If I have a circuit doing 40ms to an EC2 instance, I can tune my FEC parameters to queue 40ms of packets, and if anything is lost or late beyond the buffer, rebuild it. That random packet that takes 160ms is (most likely) because of packet loss and another link in the path rebuilding it in its EC code, but I don’t care; I’d be just as happy to drop that late packet as receive it late, because FEC lets me rebuild it at 80ms per the example above.
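The latency cap this buys can be sketched in a few lines. The numbers are the illustrative ones from above (40ms path, 40ms FEC buffer, a 160ms straggler), not measurements from any particular link:

```python
# Illustrative figures only, taken from the example above.
BASE_RTT_MS = 40      # typical round trip to the EC2 instance
FEC_BUFFER_MS = 40    # depth of packets queued for FEC reconstruction
LATE_PACKET_MS = 160  # a stray packet delayed by recovery somewhere mid-path

def effective_latency(packet_delay_ms: float) -> float:
    """Delivery time for one packet under this FEC scheme.

    If the packet arrives within the FEC buffer window, we pay its own
    delay. If it is later than that, we stop waiting and rebuild it
    from parity at the edge of the buffer instead.
    """
    rebuild_deadline = BASE_RTT_MS + FEC_BUFFER_MS
    return min(packet_delay_ms, rebuild_deadline)

# An on-time packet costs its own 40ms; the 160ms straggler is capped
# at 80ms because FEC rebuilds it rather than waiting for it.
print(effective_latency(BASE_RTT_MS))    # 40
print(effective_latency(LATE_PACKET_MS)) # 80
```

The point is that worst-case latency becomes a function of the buffer depth you chose, not of whatever the slowest link in the path decides to do.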
LTE is completely irrelevant to the conversation. It’s the equivalent of saying you have a perfect driveway with no cracks, and if there are cracks they are instantly repaired, and claiming that means you’ll have a smooth ride to work every day as a result. The path between A and Z is unknown; you’re focusing on the path from A-B only. The rest of the path could change at any point, and there are points [B-Y].* in between that could be saturated at any moment along the way, causing bufferbloat and latency spikes, even outages that have to be routed around.
If we make some assumptions:
80% of packets will arrive on time on at least ONE of the links on the network.
That 80% will be spread somewhat evenly over the 100%, i.e. packet loss and latency spikes are generally random.
OR
A link fails, causing a number of sequentially dropped packets and a failover event.
So, if I plan on 20% FEC redundancy, I should be able to rebuild missing packets from the random loss/jitter AND handle the lost packets that would otherwise cause a failover event.
There is no ‘double FEC’, because for VoIP needs all EC in the middle is useless if I have end-to-end FEC. I’d rather forget that packet and rebuild it with RS than wait for it. That makes my latency predictable: just a function of the % of packets queued in my FEC buffer. If typical latency is 40ms, FEC will increase that to, say, 55ms.
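A toy single-parity group makes the rebuild-instead-of-wait idea concrete. This is a deliberately simplified stand-in for the Reed-Solomon coding tinyfecvpn actually uses (XOR parity can only recover one loss per group, where RS can recover several), but the shape is the same: 5 data packets plus 1 parity packet is roughly the 20% redundancy figure above.

```python
from functools import reduce

def xor_parity(packets: list[bytes]) -> bytes:
    """Parity packet for a group of equal-length data packets."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), packets)

def rebuild(received: dict[int, bytes], parity: bytes, group_size: int) -> dict[int, bytes]:
    """Recover at most one missing packet per group by XOR-ing
    the surviving packets together with the parity packet."""
    missing = [i for i in range(group_size) if i not in received]
    if len(missing) == 1:
        received[missing[0]] = xor_parity(list(received.values()) + [parity])
    return received

# Encode a group of 5 packets with one parity packet, lose one in
# transit, and rebuild it at the receiver instead of waiting for it.
group = [bytes([i]) * 8 for i in range(5)]
parity = xor_parity(group)
arrived = {i: p for i, p in enumerate(group) if i != 2}  # packet 2 lost
restored = rebuild(arrived, parity, group_size=5)
assert restored[2] == group[2]  # rebuilt, never retransmitted
```

The receiver never asks the sender for anything; the lost packet falls out of the arithmetic as soon as the rest of the group is in the buffer, which is why the delay cost is the buffer depth and nothing more.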
Any argument about specific links in the path is moot; it’s the aggregate of all links, most of them completely unknown. MOST of the data path won’t have any substantial EC.
Right now, over a tinyfecvpn link, I can get 65ms from a site through to an EC2 instance consistently with no more than about 20ms jitter (FEC takes ~20ms to rebuild a packet), and I can fail over from cable to LTE without losing a packet. Latency moves up to about 160ms with that same 20ms jitter until the cable comes back, and it gracefully moves back down (no packets lost in the route update). I can yank the ethernet cable out of the cable modem and I get something like 65ms, 65ms, 84ms, 65ms, 85ms, 180ms, 160ms, 160ms. Looking at a pcap of the SIP and RTP traffic, it matches, with no lost packets in the sequence. Plug back in, wait for my tunnel to come back up, and it drops right back down to 65ms. Customers on the phone don’t lose a call or a bit of audio. They might be able to perceive the latency moving up, but it’s still within comfortable VoIP margins.
The problem is that the data path is more complex than necessary: phone > computer (tinyfecvpn) > Mikrotik w/ dual WAN and dual tunnels configured for failover > internet > CHR in EC2 > instance w/ tinyfecvpn > PBX
Secondly, I’m taking a hit on MTU by putting a tunnel in a tunnel.
This works, and it works very well. It’s just missing from RouterOS.