Hi everyone.
I have a RB3011 with gigabit FTTH.
This Gigabit line is recent and I am still optimising the router for this setup.
Anyway, I found that with no mirror configured I consistently reach 920Mbps+.
If I set up a mirror from eth1 to eth2, I can’t go over 750Mbps!
CPU won’t go over 40-50%.
How can I find out if this is a bad implementation, or what is actually happening?
Does this mirror send both ingress and egress traffic to eth2?
How does it handle if I’m sending 1Gbps full-duplex? Does it drop half the packets?
You are hitting 100% CPU usage on one core, that’s why you see 50% total usage (the 2nd core is idle). So you are CPU bound.
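To make the arithmetic explicit, here is a minimal sketch. It assumes the total CPU figure is simply the average of the per-core loads; the function names are made up for illustration and are not anything RouterOS exposes.

```python
def total_cpu_percent(per_core_percent):
    """Average the per-core loads into one 'total CPU' figure (assumed model)."""
    return sum(per_core_percent) / len(per_core_percent)


def busiest_core_percent(per_core_percent):
    """Load of the most heavily used core, which is what matters for one flow."""
    return max(per_core_percent)


# Hypothetical dual-core reading: one core saturated, the other idle.
cores = [100, 0]
print(total_cpu_percent(cores))     # 50.0
print(busiest_core_percent(cores))  # 100
```

So a total of 50% on a dual-core box can mean either two half-loaded cores or one saturated core, which is why the per-core view is the one to check.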
What does /tool > profile show when you get 50% CPU usage?
I haven’t used mirroring in MikroTik, so I don’t know how it behaves when the destination port is congested. In Cisco 3750s for example it just drops the packets on the destination port without interfering with the live traffic.
Sorry, I meant core and not CPU! I was in fact watching through profile.
No core will go higher than 50%.
It really looks like it’s interfering with live traffic. I’ve sent a support email but I’m not sure if they’ll respond.
Have you tried v6.40.x? Does it exhibit the same behavior? (in case you are hitting some weird bug with the new bridges/hw offload implementation on 6.42rc)
Just out of curiosity, have you tried this using the 2nd switch group instead? (eth6-eth10)
I don’t see how it could help based on the block diagram of the 3011, but it may be worth a try just to rule out possible culprits.
Also, if you remove the vlans/bridges and just keep it simple (only mirroring), does it still behave the same?
I remember on a CCR having VLANs under VRRP interfaces would result in huge packet drops and increased cpu usage on high traffic/pps. Speaking of which, do your interfaces show any dropped packets or any other errors?
I haven’t… When I had older versions I still didn’t have the Gigabit.
I also haven’t tried the second switch chip. It would be weird, since it would indicate that a chip that can do line-rate speeds has a problem, but it might be worth a look.
No dropped packets or errors of any type…
I guess I’ll have to test some different scenarios and check its behaviour…
Unfortunately I have to wait for a weekend when there’s nobody home, because this is what I would call a home production router.
It provides Phone, Alarm, Internet, and TV, so I don’t want to keep it down for too long.
Well, you have identified the physical limitation yourself. The mirror destination has a bandwidth of 1 Gbit/s, and the mirroring function of the switch chip itself (so no CPU involved, hence no CPU load) can only mirror both directions of the mirrored port (mirror source). So if the total traffic in both directions on the mirror source exceeds the bandwidth of the mirror destination, “something” has to happen. The possibilities are:
to drop only the mirrored frame, so the source frame gets through to the destination but you cannot see its copy on the device connected to mirror destination
to drop the source frame (affecting traffic heavily, so no switch actually does this)
to delay forwarding of the source frame until its copy can be sent out via the mirror destination. As the holding queue cannot be infinite, you can delay only a limited number of frames, so if they keep coming, you have to start dropping them anyway. But the lucky frames are not dropped, and if most traffic is TCP, which adapts to delays, this strategy may result in a lossless slowdown: delaying one frame of a flow causes its corresponding ACK to arrive later too, and the source stops sending until the ACK arrives. It is actually more complex, as not every packet has to be ACKed, but this is the principle
What exactly the 8337 in particular is doing is hard to say but your observation suggests that it uses the “delay and hope” strategy.
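To illustrate the third option, here is a toy model of a bounded holding queue. It is only a sketch of the principle, not a description of what the switch chip really does internally; the tick-based model, the function name and all the numbers are made up for illustration.

```python
def simulate_mirror_queue(arrivals, drain_per_tick, queue_limit):
    """Toy model of "delay, then drop" in front of the mirror destination.

    arrivals:       frames arriving per tick (both directions summed)
    drain_per_tick: frames the mirror destination can send per tick
    queue_limit:    frames that can be held back (delayed) between ticks
    Returns (forwarded, dropped, max_backlog).
    """
    backlog = forwarded = dropped = max_backlog = 0
    for arriving in arrivals:
        backlog += arriving
        # The mirror port sends as much as it can during this tick.
        sent = min(backlog, drain_per_tick)
        backlog -= sent
        forwarded += sent
        # Whatever does not fit into the holding queue is lost.
        if backlog > queue_limit:
            dropped += backlog - queue_limit
            backlog = queue_limit
        max_backlog = max(max_backlog, backlog)
    return forwarded, dropped, max_backlog


# A short burst is absorbed by delaying frames, nothing is dropped.
print(simulate_mirror_queue([12, 12, 8, 8], drain_per_tick=10, queue_limit=5))
# Sustained overload fills the queue, then drops start.
print(simulate_mirror_queue([12] * 5, drain_per_tick=10, queue_limit=5))
```

In the burst case the frames are only delayed, which is the situation where TCP can back off without loss. In the sustained case the queue saturates and frames are dropped regardless.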
Some more advanced (read: more expensive) switches can mirror each direction separately, avoiding this issue completely.
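A rough way to check whether the mirror destination is oversubscribed is sketched below. It assumes 1 Gbit/s ports and ignores framing overhead; the function names and the example rates are hypothetical, not measurements from this thread.

```python
MIRROR_PORT_MBPS = 1000  # assumed capacity of the mirror destination port


def mirror_excess_mbps(rx_mbps, tx_mbps, capacity_mbps=MIRROR_PORT_MBPS):
    """Traffic that cannot fit when both directions share one mirror port."""
    return max(0, rx_mbps + tx_mbps - capacity_mbps)


def per_direction_excess_mbps(rx_mbps, tx_mbps, capacity_mbps=MIRROR_PORT_MBPS):
    """Same check when each direction is mirrored to its own port."""
    return (max(0, rx_mbps - capacity_mbps), max(0, tx_mbps - capacity_mbps))


# Hypothetical load: 600 Mbit/s in each direction on the mirror source.
print(mirror_excess_mbps(600, 600))         # 200
print(per_direction_excess_mbps(600, 600))  # (0, 0)
# Hypothetical lighter load that fits into a single mirror port.
print(mirror_excess_mbps(400, 300))         # 0
```

With a single mirror destination the two directions add up, so the port can be overrun even though neither direction alone exceeds 1 Gbit/s. With one destination per direction that cannot happen as long as all ports run at the same speed.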
Thanks for the reply.
Yes, in fact that’s what appears to be happening.
I just thought someone else might have run into the same limitation, and it seemed strange because I was still about 150Mbps short of the actual Gigabit line rate that would lead to (as you explained) said slowdown.
So, I guess this will not be a solution I want to keep since full Gigabit is always a nice-to-have.
Sadly I’ll have to change the configuration on some machines that were running without an IP, purely as sniffers.
Also, do you know if the Mangle “sniff-tzsp” option has the same behaviour as the Packet Sniffer sniffer-server, which kills “Fasttrack” and “Fastpath” when running?
I didn’t see any info regarding that in the Wiki but..
Bad news, sure it does. Fasttrack is fastpath combined with connection tracking, and it works by minimizing CPU processing of the packets. Sniffing sits higher in the stack than fastpath, and it doesn’t matter whether the sniffed packets are stored locally or prepended with a TZSP header and sent out (well, it does matter in that locally stored packets do not occupy the bandwidth on the CPU port of the switch twice).
Even if it didn’t break the fast-something, CPU sniffing would limit the throughput even more than port mirroring, because the frames from the switch would need to go to the CPU (and back in the case of tzsp-sniffing) rather than being mirrored locally on the switch chip.
Oh, that’s sad. I was really expecting it not to happen, since there was no mention in the Wiki =(
Well, it would also limit throughput, but I guess it would be bound by the CPU’s processing power, which appears to be able to handle about 60% more than what I’m putting through it (with gigabit).
But given that it breaks the fast-technologies I guess it also has certain problems.
As you can see on the block diagram in one of the earlier posts, the CPU ports of the switch chip are also only 1 Gbit/s to each CPU core, so you’d have the same problem again plus the CPU processing power.
Oh you’re right! Got it.
I guess keeping it like this will still be the best solution overall than
On my second post there’s an image that shows I am doing mirroring via the switch-chip.
Even so, as sindy stated, it appears that the source traffic is delayed so that its mirror copy can be sent, and that is what causes the slowdown.