I'm stuck on an issue that seems to have been occurring for some time. The setup is shown in the attached diagram. STP is used to fail over to the secondary bridged connection in the event Preseem is down (Preseem is the primary path). I do have secondary IPs on the bridge interface of the CCR2004, but I don't see ICMP redirects or anything like that; all the routes point to their respective subnet gateways.
I'm seeing a large number of TCP retransmits across the entire network. When running ICMP traces, the CCR2004 shows 40%+ loss from a client's perspective, while its CPU is only at 3-5%. I attempted a packet capture at sfp-sfpplus12 and saw all kinds of duplicate packets. Now that I think about it, that could have been an artifact of streaming the capture to Wireshark, but the stream should have been sent out a completely different interface to an off-net server, so I don't think that's the case.
Doing the same capture by mirroring port ether2 on switch1 to ether6 (which has a capture box connected), I don't see duplicated packets, but I do see dup ACKs and TCP retransmits. The same goes for traffic from the "customer nets". What's also a little odd is that the dup ACKs in Wireshark arrive microseconds after the original ACK, so it almost looks like there's a loop somewhere. If there is one, I can't find it: all the root/designated ports appeared to be correct when I checked them.
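To put numbers on the "dup ACK microseconds after the ACK" observation, something like the following could be run over per-segment fields exported from the capture. This is only a rough sketch: the `Segment` layout, the function names and the 100 microsecond threshold are my own assumptions, and the dup-ACK check is far simpler than what Wireshark does (it only compares ACK numbers on zero-payload segments).

```python
from collections import namedtuple

# Hypothetical record layout: one row per captured TCP segment.
# src/dst are "ip:port" strings, time is seconds since the start of the capture.
Segment = namedtuple("Segment", "time src dst ack payload_len")


def dup_ack_gaps(segments):
    """Return (flow, ack, gap_seconds) for every pure ACK that repeats
    the ACK number of the previous pure ACK on the same flow."""
    last = {}  # flow -> (ack number, timestamp) of the last pure ACK seen
    gaps = []
    for seg in sorted(segments, key=lambda s: s.time):
        if seg.payload_len != 0:
            continue  # only zero-payload segments are treated as pure ACKs
        flow = (seg.src, seg.dst)
        prev = last.get(flow)
        if prev is not None and prev[0] == seg.ack:
            gaps.append((flow, seg.ack, seg.time - prev[1]))
        last[flow] = (seg.ack, seg.time)
    return gaps


def split_by_gap(gaps, threshold=0.0001):
    """Split gaps into (below threshold, at or above threshold).
    The 100 microsecond default is an arbitrary assumption."""
    fast = [g for g in gaps if g[2] < threshold]
    slow = [g for g in gaps if g[2] >= threshold]
    return fast, slow


# Made-up sample rows, only to show the shape of the input.
flow_src, flow_dst = "10.0.0.2:50000", "203.0.113.5:443"
sample = [
    Segment(1.000000, flow_src, flow_dst, 1000, 0),
    Segment(1.000020, flow_src, flow_dst, 1000, 0),  # repeat 20 microseconds later
    Segment(1.050000, flow_src, flow_dst, 1000, 0),  # repeat roughly 50 ms later
    Segment(1.060000, flow_src, flow_dst, 2460, 0),  # ACK advances, not a repeat
]
fast, slow = split_by_gap(dup_ack_gaps(sample))
```

If nearly all of the repeats land in the `fast` bucket, that would lean toward frames being copied somewhere in the path rather than the receiver reacting to loss, though on fast links genuine dup ACKs can also be closely spaced, so the threshold would need tuning.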
I have verified MTU everywhere possible, and I can send full 1500-byte frames to the internet without fragmentation. The problem seems to affect streaming to some extent, but the bigger issue is that I have a couple of customers using Zscaler for work who can barely function. Their service plan gives them 25Mbps of throughput, but they only get 1-6Mbps at most with Zscaler connected using TLS/DTLS tunneling.
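As a rough sanity check on those Zscaler numbers, the well-known Mathis approximation for a single TCP flow (throughput is about MSS/RTT times 1.22/sqrt(loss)) can be turned around to ask what loss rate would cap a flow at 1-6Mbps. This is a sketch under assumptions I have not measured: a 1460-byte MSS (1500-byte MTU with no tunnel overhead), a 30 ms RTT, and a tunnel that behaves like one ordinary TCP flow.

```python
import math

# Constant from the Mathis et al. approximation, sqrt(3/2), about 1.22.
MATHIS_C = math.sqrt(3.0 / 2.0)


def mathis_throughput_bps(mss_bytes, rtt_s, loss):
    """Approximate steady-state TCP throughput in bits per second."""
    return (mss_bytes * 8 / rtt_s) * (MATHIS_C / math.sqrt(loss))


def loss_for_throughput(mss_bytes, rtt_s, target_bps):
    """Invert the approximation: loss rate that would cap a flow at target_bps."""
    return (mss_bytes * 8 * MATHIS_C / (rtt_s * target_bps)) ** 2


if __name__ == "__main__":
    MSS = 1460   # bytes, assumed
    RTT = 0.030  # seconds, assumed
    for mbps in (1, 6, 25):
        p = loss_for_throughput(MSS, RTT, mbps * 1e6)
        print(f"{mbps} Mbps -> loss of about {p * 100:.3f}%")
```

With these assumed values, a loss rate well under 1% is already enough to hold a single flow to around 6Mbps, so even a small amount of real loss on the path would fit what the Zscaler users are seeing.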
Admittedly, this architecture is overly complex, and I'd like to split some of it out onto separate routers, but for now I'm stuck with what I have due to budget constraints.
Does anyone have ideas on what to look at? Loss seems to be the same whether traffic goes through Preseem or the native path, so Preseem itself doesn't seem to be the issue.