Impossible to get more than 5.5 Gbit/s on a switch-to-switch link. TX drops.

I'm having an issue with TX drops on a link between a CRS328-24P-4S+RM and a CRS309-1G-8S+IN, using two Opton SFPs and a Cat6 copper cable. I have two PCs, one plugged into each switch.

Both switches have only a couple of clients that generate little traffic. The CPUs sit at 2-3% most of the time. The configuration is almost entirely default; the only changes are enabling flow control on these interfaces and setting the queues to multi-queue-ethernet-default.

Am I right in thinking that if the issue is TX drops, this is essentially queues filling up and overflowing before the channel is ready? As mentioned, I've tried setting the queue type to multi-queue-ethernet-default instead of the default hardware-only queue. Perhaps a slight increase in speed was observed, but I'm not sure.

I read somewhere that these switches have only 2 MB of hardware queue shared between all interfaces. 5 Gbit/s with 8800-byte packets is about 71k packets per second. I know these switches are capable of forwarding this, because if I plug three clients into one switch and generate traffic with iperf from two of them to one, I get close to 10 Gbit/s. So why is it so bad when the traffic has to cross an uplink between two switches? There is about 20 Mbit/s of other traffic on this interface.
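A quick sanity check of those numbers (a sketch; the 2 MB shared-buffer figure is the hearsay value quoted above, not taken from a datasheet):

```python
# Back-of-the-envelope arithmetic for the figures above: packet rate at
# 5 Gbit/s with 8800-byte frames, and how quickly a ~2 MB shared buffer
# would fill if 10 Gbit/s arrives while only 5 Gbit/s can leave.
# The 2 MB buffer size is the quoted hearsay value, not a datasheet figure.

FRAME_BYTES = 8800
BUFFER_BYTES = 2 * 1024**2   # assumed shared packet buffer

pps = 5e9 / (FRAME_BYTES * 8)
print(f"{pps:,.0f} packets/s at 5 Gbit/s")   # roughly 71k pps

excess_bps = 10e9 - 5e9      # ingress minus egress during an overload
fill_ms = BUFFER_BYTES * 8 / excess_bps * 1e3
print(f"buffer fills in {fill_ms:.2f} ms")   # a few milliseconds
```

So if the buffer figure is right, any sustained overload on the uplink exhausts it in a handful of milliseconds, after which drops are unavoidable.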

The problem always stays with the sending switch when I swap the traffic direction. This proves the issue is on the sending side.

Can anyone give me some advice, is it possible to resolve this, or are these switches simply not meant to be used like this and I need to buy something else?

What's the length of the Cat6 cable connecting the two switches?

Though the length is still unknown, the cable was described in the other thread as a "slightly questionable quality cable":
http://forum.mikrotik.com/t/10g-link-works-fine-for-a-day-then-breaks-until-interface-disabled-enabled/183570/1

@ Luk5566
Cat6 cable at 10 Gb is usually considered viable up to 164 feet (55 m), but that is theory, assuming very good quality Cat6 cable. In practice I have seen tests/reports that put the limit for 10 Gb at much shorter distances, like 100 ft (30 m).
And of course there may be other specific issues - patch cables, terminations/connectors, or RF interference - all things that can contribute to a drop in speed on the connection, so a cable issue is a possibility.

Indeed, it is slightly questionable because there are two inline keystone couplers, not because of length. It is below 30 m (somewhere in the region of 25 m total; it is difficult to say down to a meter, as it runs through the building under the plaster and so on), but there are no FCS errors except one or two in a 24 h period. That is very different from the thousands of FCS errors I got on a definitely questionable cable.

There are also almost no RX errors (perhaps 20 over several days). So I highly doubt this has anything to do with the cable.

I've done some experiments. If I set up iperf3 with the bandwidth set to 2.5 Gbit/s, I can run it continuously and get 0% drop. If I increase this to 4 Gbit/s, I get a drop of 10% in every 20th block of 300 MBytes transferred. If I go beyond 4.5 Gbit/s, I get 10% loss on UDP packets.
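For anyone wanting to reproduce that sweep, here is a sketch driven from Python. It assumes iperf3 is installed, a server is already running, and the address is a placeholder; the helper just reads the `end.sum.lost_percent` field of iperf3's JSON (`-J`) report for UDP tests.

```python
# Sketch of the UDP sweep described above: run iperf3 at increasing target
# bitrates and pull the loss percentage out of its JSON (-J) report.
# SERVER is a placeholder; an iperf3 server must already be running there.
import json
import subprocess

SERVER = "192.0.2.10"  # hypothetical iperf3 server address

def loss_percent(report: dict) -> float:
    """UDP loss percentage from an iperf3 -J report."""
    return report["end"]["sum"]["lost_percent"]

def run_udp_test(rate_gbit: float) -> float:
    """Run one UDP test at the given rate with 8800-byte datagrams."""
    out = subprocess.run(
        ["iperf3", "-c", SERVER, "-u", "-b", f"{rate_gbit}G", "-l", "8800", "-J"],
        capture_output=True, text=True, check=True,
    ).stdout
    return loss_percent(json.loads(out))

# The parsing works on any report dict, e.g. a stub:
print(loss_percent({"end": {"sum": {"lost_percent": 10.0}}}))
```

Calling run_udp_test(r) for r in (2.5, 4.0, 4.5) would reproduce the sweep; note that -l 8800 only matches the jumbo-frame tests if the path MTU actually allows it.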

I have another client on the CRS309-1G-8S+IN, but it has only a 2.5 Gbit/s network card. That client can run iperf to the server and get 0% drop (this is on the same switch; the server is connected with a half-meter Cat8 patch cord, the client with a 6 m Cat6 cable). If I run two iperf servers on the client across the uplink on the CRS328, I can again send 2.4-2.45 Gbit/s from the 2.5G client and get 0% drop. I can also send 2 Gbit/s from the server at the same time and still get 0% drop. Only when I increase the server's rate to 2.5 Gbit/s do I start getting packet drops.

So this is a bit different: I can achieve a solid 4.45 Gbit/s with no occasional drops if 2.45 Gbit/s comes from one client and 2 Gbit/s from another on the CRS309. But if I try sending the entire 4.5 Gbit/s from one, I definitely do get drops.

I can see where this drop occurs. The Tx drop counter increases in front of my eyes in Webfig:

I might try getting a long cable - say 20 m - winding it through some windows, and connecting these switches directly tomorrow, just temporarily, but I think all this effort will be wasted. If this were a cable issue I would get FCS errors.

Edit: here is the traffic section of the sending side with two clients at once:

Interestingly, the TX queue drops counter is not increasing as much as the TX drops above. I'm not sure why.

Here is a single client sending at 4.5Gbit following the previous test:

As you can see, the TX queue drops have not increased by much, while after these two tests the Tx/Rx stats above now look like this:

This is after 60GBytes were transferred (2X20GB plus 1X20GB)

You can see 7k extra TX drops. The first Rx/Tx stats screenshot was taken before the first test started.


Edit 2: I thought I'd run it again and add the RX side:

This one had its counters reset a few hours ago as well.

I really don't understand your topology.

My 5 cents on this:

I think when you mix different port speeds the chances of dropping frames increase.

On CRS3xx/5xx switches I never had the need to change the queue type on interfaces.

The newer Switch-QoS menu can be helpful if you know what you are doing.

Don't forget to keep your current firmware (in the System > RouterBOARD menu) on the same version as RouterOS.

What is there not to understand?

Client1 ↔ switch1 (CRS328) ↔ switch2 (CRS309) ↔ Client2
                                      └─→ Server1

There are two switches connected at 10 Gbit/s; switch 1 has one client, switch 2 has two devices: a server and a client.

The problem was happening before the second client running at 2.5 Gbit/s was added, when everything was at 10 Gbit/s.

I also did the following experiment. I used pktgen to push 10 Gbit/s of traffic out of client 1's network card, directed at server 1. I set ingress/egress limits in Webfig under Switch → Ports on the port client 1 is connected to (funnily enough, the egress limit on the uplink port had no effect - no idea why). With the egress limit set, as long as it was below 4 Gbit/s the uplink between the switches did not experience any TX drops. The moment I raised it above 4 Gbit/s, a trickle of TX drops started; above 5 Gbit/s it was essentially a flood. If I removed the limit, the uplink would run at around 4.7 Gbit/s, dropping everything else.
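That pattern is what a fixed-rate bottleneck looks like: below it, no drops; above it, the excess fraction is dropped. A toy model using the ~4.7 Gbit/s observed above (an assumption taken from the measurement, not a spec value):

```python
# Toy model of the egress-limit experiment: treat the uplink as a
# fixed-rate bottleneck at ~4.7 Gbit/s (the observed rate, not a spec).
# Offered load at or below the bottleneck passes; the excess is dropped.

BOTTLENECK_GBPS = 4.7  # assumed from the measurement above

def expected_drop_fraction(offered_gbps: float) -> float:
    """Fraction of offered traffic an ideal fixed-rate bottleneck drops."""
    if offered_gbps <= BOTTLENECK_GBPS:
        return 0.0
    return 1.0 - BOTTLENECK_GBPS / offered_gbps

for offered in (4.0, 5.0, 10.0):
    frac = expected_drop_fraction(offered)
    print(f"{offered} Gbit/s offered -> {frac:.0%} dropped")
```

This matches the "no drops below 4 G, trickle above, flood above 5 G" observation only qualitatively; the trickle region suggests the bottleneck itself is bursty rather than a clean hard limit.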

So I would like to ask: is anyone on this forum using two CRS switches, with at least one 10G client connected to each switch and a 10G uplink between the switches, successfully running over 5 Gbit/s of traffic between them? Because after finding countless "me too" threads about this issue on MikroTik hardware going as far back as 2011, I'm starting to think I bought the wrong brand of switches. I'm starting to think these switches are simply not designed for any sort of switching topology at 10G. You're supposed to use one and that's it (although I haven't confirmed as of now that it can forward traffic at 10 Gbit/s even within one switch).

Edit: While I can imagine some physical-layer issue appearing only above a certain packet rate, like my 5 Gbit/s, the result would, I'd guess, be visible as RX errors, not TX drops. I'm not aware of Ethernet layer 1 having any ACKs for packets. So the sender should fire packets as fast as the interface allows, and if the cable were bad, the other side should see errors.

Also, this is perfectly repeatable in either direction (with iperf3 you decide which way to send traffic).

I also thought maybe iperf was doing something funny, so I set up an NFS mount and tried to write a 40 GB file full of zeros. It maxed out at 3.5 Gbit/s.

Does two switches connected at 10 Gbit/s through another switch count? If yes: I had two CSS326 connected to one CRS328 (the same CRS as yours), using 10 Gbit/s fiber.
Each CSS326 had several 1 Gbit/s clients connected. I stress-tested it, forcing traffic from clients on one CSS326 to the second CSS326, crossing the CRS328's 10 Gbit/s fiber ports.

I didn't have 20 clients to reach 10 Gbit/s, but I had 12 of them, and I did get 6 Gbit/s throughput.
Try another cable. At the very least we will know whether it is the problem or not. It doesn't even have to be a long one: couldn't you just (temporarily) put one switch next to the other? Just to test this.

No, there is a direct connection between two switches, but thank you for describing your case.


Interesting, although it is a different use case: having a dozen 1G clients combine to 6G over an uplink is a bit different. I hope someone shows up who did get 10G, or close to it, between two CRS switches, so I can focus on troubleshooting and not regret having thrown out the boxes these switches came in (meaning I can't really return them and buy something else).

I'll be making a test cable. It's wasteful to cut 20 meters of Cat6 just for a test, but I'll do that later.

Before anyone mentions "micro-bursting": I even wrote a python script to generate packets at nice, even intervals to hit the exact Gbit/s rate with minimal burstiness, and I still get the TX drops, and the status chart shows occasional bursts up to 9 Gbit/s when I'm running, say, 5. I suspect these switches have some issue where even if they receive traffic that is perfectly free of micro-bursts, they somehow "bunch it up" when it has to go over an uplink. As a result the uplink gets nothing, nothing, nothing, then suddenly a big burst and drops, then nothing again, and so on. Why do I think that? Look at my screenshots: where are the spikes on the graph coming from? There is a total of about 30 Mbit/s of other traffic on these switches in addition to my test.
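For reference, the pacing idea can be sketched like this (not the actual script used; the address, port, and rate are placeholders, and pure CPython usually cannot sustain multi-gigabit rates this way - the point is the even-slot structure):

```python
# Minimal evenly paced UDP sender in the spirit of the script described
# above: one datagram per fixed time slot, busy-waiting between slots to
# avoid the burstiness that sleep() granularity would introduce.
# DEST and TARGET_BPS are placeholders, not values from the original test.
import socket
import time

DEST = ("192.0.2.10", 5001)   # hypothetical receiver
PAYLOAD_BYTES = 8772          # ~8800-byte IP packet minus IP/UDP headers
TARGET_BPS = 5e9              # 5 Gbit/s target

def slot_interval(payload_bytes: int, target_bps: float) -> float:
    """Seconds between datagrams needed to hold the target bitrate."""
    return payload_bytes * 8 / target_bps

def paced_send(duration_s: float) -> None:
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    payload = b"\x00" * PAYLOAD_BYTES
    interval = slot_interval(PAYLOAD_BYTES, TARGET_BPS)
    next_slot = time.perf_counter()
    end = next_slot + duration_s
    while time.perf_counter() < end:
        sock.sendto(payload, DEST)
        next_slot += interval
        while time.perf_counter() < next_slot:  # busy-wait to the next slot
            pass
```

At 5 Gbit/s the slot is only about 14 µs, far below sleep() resolution, which is why the busy-wait is there; a real high-rate sender would use pktgen or similar instead.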

If putting in MikroTik SFPs or trying a new cable doesn't fix this, I guess the only way forward is to mirror the ports and see in a packet capture what is being dropped and why. But then the question is: will I be able to capture 10 Gbit/s, and even more importantly, will the switch actually mirror all traffic on 10G interfaces?

Also, regarding cables: I researched 10G Ethernet layer 1. There is no ACK mechanism for TX packets, so a sending switch/SFP has no way of knowing whether the packets it transmits suffer from crosstalk or other problems on reception.

However, the only other possibility that comes to mind is the crappy SFP overheating or running out of TX power. It runs at about 52 °C at baseline, and this temperature doesn't change when I send lots of traffic. Perhaps the SFP experiences momentary overheating or excessive power demand when driving a long (20-28 m?) cable (just as in RF, a long cable may present a different, not ideally matched load; it may not be enough to cause faults but may demand more TX power). If the SFP asserted TX fault while the switch was sending a packet, the switching fabric (if designed by sane people) would register a TX drop. But is it possible for this to happen only at some high packet rate while the temperature isn't rising? I seriously doubt it.

I can get above 5G from my workstation, connected to a CRS305, which is connected to a CRS309, both from the internet and from my NAS, which are both connected to the same CRS309.

So, yes.

How much above 5G? Are we talking 5.1, or more like 8G?

Do you have Linux on them, perhaps? If so, could you run iperf3 -s on one and iperf3 -c X.X.X.X on the other, where X.X.X.X is the IP of the first? I'm very interested in what speed you'd get.

Also, I wrote a python script to try a few things. If I send packets at the maximum line rate (10 Gbit/s), there is no packet loss as long as they go out in bursts of only 200 packets, with a break between bursts that keeps the overall bandwidth below roughly 4.7 Gbit/s. If I increase the burst beyond 200 packets, to 300 for example (these are 8800-byte packets), it drops. Likewise if the delay between bursts goes below 1 ms.
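Those thresholds line up strikingly well with the ~2 MB shared-buffer figure mentioned earlier in the thread (still hearsay, so treat this as a consistency check, not proof):

```python
# Consistency check: do the observed burst thresholds match a ~2 MB
# shared buffer? A burst that fits can be absorbed and drained during the
# inter-burst gap; a burst that doesn't fit must drop.
# The 2 MB buffer size is the quoted hearsay value, not from a datasheet.

FRAME_BYTES = 8800
BUFFER_BYTES = 2 * 1024**2

for burst in (200, 300):
    size = burst * FRAME_BYTES
    verdict = "fits" if size <= BUFFER_BYTES else "overflows"
    print(f"{burst} x {FRAME_BYTES} B = {size / 1e6:.2f} MB -> {verdict}")

# Serializing a 200-packet burst at 10 Gbit/s takes about 1.4 ms, the same
# order of magnitude as the ~1 ms minimum inter-burst delay observed above.
serialize_ms = 200 * FRAME_BYTES * 8 / 10e9 * 1e3
print(f"200-packet burst serializes in {serialize_ms:.2f} ms")
```

200 packets is 1.76 MB (just under 2 MB) while 300 packets is 2.64 MB (over it), which would neatly explain why the loss starts exactly between those two burst sizes.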

I sure hope these are crappy SFPs or even the cable. Replacing these switches when I can’t just return them is the last thing I’d want.

Copper SFPs become VERY hot; 52 °C is like "warming up". There are reports of people running SFPs in the 80-90 °C range. While there may be issues with the SFP, it is not overheating.

Definitely sounds like it is cable or SFP related.

I have a device connected to SFP+ on a CRS310-8G+2S+ which is in turn connected via the second SFP+ to a CRS309-1G-8S+ used as a core switch.

Second device is connected to SFP+ on CRS328-24P-4S+ which is in turn connected to the same CRS309-1G-8S+ as above. The two devices can push 9.8Gbps between each other.

This setup is all fibre though with Cisco SFP+ modules.

But that is fibre, not copper. Still, thank you for describing your setup. At least it proves the switch is capable of it over fiber.

Copper 10G SFPs do place far greater power and thermal demands on the switch, though. I've observed the voltage reported by the SFP drop from 3.222 V to 3.18 V during such transfers…

My latest theory is that the SFP+ blips TX fault when it can't handle the sustained traffic. Those of us who know RF can imagine many situations where a mismatched load causes trouble during TX. But if that were really the case, why does it not result in a single FCS error on the RX side during testing? Only a couple per 24 h period…

Now, the MikroTik modules I ordered are specced at only 25 m (although some sellers claim 30 m). What is really frustrating is that the switch doesn't tell us where this TX drop is coming from. Is it getting a TX fault from the SFP? Is it overflowing its buffers? Who knows.

Edit: It would be a massive pain to replace this run with fiber, but not entirely impossible. Alternatively, I could put one of those dual-SFP+ media converters in the middle of the cable run (plugging copper modules into both ports). That would essentially halve the length.

I also wonder: let's say the SFP+ starts sending a packet and 10% of the way in it experiences a power loss or some other physical issue. Surely the RX side would detect this as an incomplete packet, right? Does anyone know? It seems that if transmission even starts, something should be logged on the RX side.

OK, I made up a 25 m Cat6 patch cord and wound it around the building and through windows in the most janky way. As a result I discovered that a large part of the problem is caused by the cable, but not all of it.

I managed to get a sustained 7-8 Gbit/s with iperf3, and 9.2 Gbit/s when running two connections (one to the server at 10G and the other to the client at 2.5 Gbit/s). However, the TX drops are still happening, just at a rate three orders of magnitude lower. See this for example:

I'm still puzzled by the TX-only errors. I've ordered MikroTik copper SFPs which should hopefully arrive tomorrow, so maybe those will give me better diagnostics.

I've also ordered some repeaters I will need for something else anyway, as well as fiber SFPs and a premade cable, to test. Installing the fiber will be a massive pain: it will have to go inside gutters, and pavement will have to be dismantled to dig under it - a nightmare. A repeater, if it worked, would be a much better short-term fix.

Either way I think the mystery is largely solved.

So most of the issue is the cable, which can then be demoted from "slightly questionable" to "definitely bad".

Just for the record: many, many years ago, Jerry Pournelle had a column in Byte (the magazine) where he tested lots of new hardware that was, at the time, at the edge of speed and performance, very often featuring SCSI connections. He had a saying to the effect of: "When you have a problem with a SCSI device, it is the cable. Check the cable, the connectors, and the terminators carefully, and if they pass all tests, it is the cable anyway; change it." :wink:

I have no idea about the complexity of running fiber in your building, but IMHO it is worth it if you want a reliable 10 Gb connection, and you will be ready for future increases in speed. Fiber takes a fraction of the space in conduits. Replacing the current Cat6 with a beefier cable (Cat7 or Cat8) may be difficult, as they are the same size or larger than Cat6 and usually stiffer, so they tend to be hard to pull around curves, while fiber (not pre-terminated, of course) can usually be pulled easily.
The cost of a couple of proper on-site splices/terminations with pigtails, done by a professional on call, should be (depending on your country, of course) around or below $150 - surely cheaper than extensive masonry work.

That’s a lot of RX Pause too, maybe the client can’t keep up?

Plenty of deployments forward much more than that with CRS3xx/5xx switches.

good point

I'd be very happy if I had conduits… This is a residential building - a house with a two-side sloping roof (walls made of 25 cm ceramic cavity blocks, an air gap, and 20 cm of styrofoam followed by plaster - it took me half a day with an 80 mm diamond hole saw to drill through it once). It is a very strong but brittle material. On the inside there is about half an inch of classic sand-based plaster. Back when the house was built, diamond-tipped notchers for ceramics were very rare around here, so no conduits, unfortunately.

The cable will run along the edge of the roof, where it will be invisible due to its color (I already have some CCTV cables there), then inside a rain gutter to get from roof level to below ground, and then underground until it reaches the other building.

I've decided to put in 12-strand single-mode fiber in a TPU sheath (AirFlow SM 12J G657A2); it will be installed in a conduit where it goes underground, just to future-proof it.

Thank you for the info. It is good to know.

I'm not sure what these RX pause counts are telling me. Did this side receive a pause request from the other side? That would be the most logical explanation, I guess.

In the case of an uplink, the "client" is the switch on the other side. Why would it send a pause request? Probably full switching buffers… They may be full because the client sent a pause on its port.

Yes, it is possible the client can't actually process more than 9 Gbit/s, maybe 9.2 Gbit/s in very good conditions. I guess traffic may very well burst to full line speed from time to time, and that would cause these pauses. I'm not that bothered about the last half gig, but if I were, I would definitely try tweaking some settings on the client to fix it.

I can reach 7.7G (out of 8) to a (100G-connected) external iperf server. Pretty much as good as it's going to get. To my NAS I can reach about 9G. But that's over multimode fiber, not 10GBASE-T (except a 2 m Cat6 cable between my CCR1036 and the ISP's XGS-PON ONT).