Very poor bandwidth with SMB

I am using both a 2216 running ROSE and a 2116 with the ROSE software. On both, SMB throughput is terribly poor: at the beginning it is high for a short time, but then it drops rapidly. Looking at the utilization of the individual CPU cores, it appears that SMB runs on only one core, which sits at over 95% load.
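For anyone who wants to confirm the same single-core pattern on their own box, RouterOS can show per-core load and which process is pinning it (a quick sketch, commands from the standard RouterOS CLI):

```
# Show per-core CPU utilization
/system/resource/cpu/print

# Profile CPU usage per process across all cores
/tool/profile cpu=all
```

If one core sits near 100% under an SMB transfer while the rest idle, that points at the single-threaded service rather than the hardware.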

Since mainly Windows 11 computers access the storage, and Windows has no native NFS 4.x client, this is problematic: MikroTik only supports NFS 4.x and higher. Installing WSL 2 on every computer is not a solution either.

What has been your experience and have you found a simple solution for this?

To be honest, I am dissatisfied with the performance of the ROSE software so far. On paper the feature set sounds very good, but on closer inspection a Raspberry Pi can do much of it better and faster: poor SMB performance, no NFSv3 support, and so on. I can only hope the software improves soon; otherwise it would be better to buy a decent storage system and a smaller MikroTik.

I have had usable results with an SMB container that has access to the filesystem/disk created in RouterOS. I built an ARM64 container on my Mac Studio using Alpine Linux and Samba, copied it to the router (tested on 2116s and the RDS2216), bridged the container's veth to the router's main bridge, and mapped the directory into the container.
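Roughly, the router side of that setup looks like this (interface names, addresses, and mount paths below are examples, not my exact config):

```
# Create a veth for the container and add it to the main bridge
/interface/veth/add name=veth1 address=192.168.88.3/24 gateway=192.168.88.1
/interface/bridge/port/add bridge=bridge interface=veth1

# Mount the ROSE-formatted disk into the container
/container/mounts/add name=smbshare src=/nvme1/share dst=/share

# Import the Alpine+Samba image (built elsewhere and copied over as a tar)
/container/add file=alpine-samba.tar interface=veth1 mounts=smbshare start-on-boot=yes
```

The container then serves SMB on its bridged address, bypassing the built-in SMB service entirely.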

Performance there was far superior to the built-in SMB. @normis has reported zero issues with his Mac to the built-in SMB, but I can’t get it to work well at all (haven’t since ROSE came out).

FWIW, ROSE’s NFS does work well (6-9Gbps on file transfers). I’ve tested on 10Gbps, 25Gbps, and 40Gbps NICs.
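If anyone wants to reproduce those NFS numbers, a minimal Linux client check might look like this (server address, export path, and mount options are assumptions, not my exact setup):

```
# Mount the ROSE NFSv4 export; nconnect spreads traffic over multiple TCP connections
sudo mount -t nfs4 -o rsize=1048576,wsize=1048576,nconnect=8 192.168.88.1:/nvme1 /mnt/rds

# Rough sequential write check (10 GiB, bypassing the page cache)
dd if=/dev/zero of=/mnt/rds/testfile bs=1M count=10240 oflag=direct status=progress
```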

Experimental TrueNAS ARM64 as a QEMU VM in a container on the RDS2216 works too, but throughput is severely hampered by the three layers of networking involved. I’d love to get that working at the 10Gbps level too, but I don’t know enough about MikroTik’s custom container implementation to optimize it.

Presently, I’m experimenting with putting TrueNAS in front of the RDS. I export the drives via NVMe-oF (TCP) to the TrueNAS server, create ZFS pools from the NVMe drives, and share those via SMB or NFS. It’s doing marginally better than NFS directly to the RDS (better writes for sure). In testing, I’m seeing TrueNAS push/pull 16-20Gbps depending on the drive configuration. I’ve also fronted the RDS with three different TrueNAS instances, getting up to 36Gbps total to/from the RDS.
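A rough sketch of that front-end arrangement, for anyone curious (slot names, addresses, and pool layout here are examples, not my exact config):

```
# On the RDS: export a drive over NVMe/TCP
/disk/set nvme1 nvme-tcp-export=yes

# On the Linux/TrueNAS side: discover and attach with nvme-cli
nvme discover -t tcp -a 192.168.88.1 -s 4420
nvme connect -t tcp -a 192.168.88.1 -s 4420 -n <subsystem-nqn-from-discovery>

# The remote drives appear as local /dev/nvmeXnY devices; build a pool from them
zpool create tank mirror /dev/nvme1n1 /dev/nvme2n1
```

The RDS then just moves NVMe blocks over TCP, and the box in front does the filesystem and SMB/NFS work.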

Their FAQ for the RDS2216 says the CPU is the bottleneck and tops out at around 50Gbps.

@sirbryan

Thank you very much for your reply and the alternatives you suggested. Please don't get me wrong, but this is a workaround that shouldn't be necessary on a good product. I've known for years that the CPU is a bottleneck in some areas at MikroTik. But here it's not really the CPU; it's software that runs one task on one core. In my case, one core is fully utilized while the other 15 do almost nothing. If SMB were multithreaded across cores, this problem would not exist. And if you sell storage-specific hardware, you should also make sure the software is optimized for it. Why do I need a storage system that runs BGP across multiple cores, for example, but SMB on only one? Why do I need a storage system that supports only NFSv4 and above, but not NFSv3? A quick search on the internet shows that NFSv3 still dominates deployments worldwide while NFSv4 is the growing standard. So a storage system should support both NFSv3 and NFSv4, letting customers migrate existing storage easily and switch to NFSv4 later.

Or is ROSE another so-called "banana product" that only ripens after it reaches the customer? Don't get me wrong: I'm generally very satisfied with MikroTik routers, but some decisions make it hard to understand why they don't just do it right.

For example, why should I care about the routing and switching performance of ROSE Storage? Those values are right there in the datasheet. But nowhere is it specified what throughput SMB, NFS, NVMe over TCP, etc. achieve over a 25G or 100G interface.

I agree, but if you want it to work here and now, I’ve presented an option. This is a users’ forum first and foremost. Feature requests and support issues are more likely to be addressed via a support ticket.

I don’t know if you truly read my response or not. Running SMB in a container, for me, unlocked far more throughput. I didn’t even think to look at CPU consumption, so I don’t have numbers there. Either way, something is up with their SMB implementation specifically. I’ve posted and complained about it a number of times too.
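For what it’s worth, a containerized Samba can also be tuned in ways the built-in service can’t. An smb.conf sketch along these lines is worth trying (these are real Samba options, but whether they help on this hardware is an open question, not something I’ve benchmarked exhaustively):

```
[global]
   # Allow SMB3 clients to open multiple connections and spread load
   server multi channel support = yes
   # Hand reads/writes to async handlers instead of blocking the smbd process
   aio read size = 1
   aio write size = 1
   # Let the kernel copy file data directly to the socket where possible
   use sendfile = yes
```

Samba also forks a separate smbd per client, so multiple clients naturally land on different cores, unlike what we’re seeing from the built-in SMB.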

Good questions, perfect for the support team. And, the beauty of containers is you can add that functionality now, if you want.

Also good question. It’s got to start somewhere. Many people alive today never had to deal with older Windows, Mac, and Linux releases with half-baked features… oh, wait…

These additional specs certainly would be helpful. I’ve spent weeks running a variety of tests to squeeze the most out of my RDS2216. The most I’m able to do is 10Gbps to/from a single disk or RAID array over NFS or NVMe-oF. As I mentioned earlier, NVMe over TCP seems to be the most efficient use of this thing, with a sporty box in front of it doing the rest of the heavy lifting.

The only specs I’ve found regarding storage come from the FAQ https://mikrotik.com/product/rds2216:

Is PCIe Gen3 with 1-2 lanes per drive fast enough?

Each U.2 drive in the RDS is connected with 2× PCIe 3.0 lanes (16 Gbps per drive), and the entire disk plane has a 16× PCIe 3.0 connection to the CPU (128 Gbps total).

In practical use, CPU performance will be the limiting factor before PCIe bandwidth becomes a bottleneck, especially with multiple drives handling parallel workloads. When writing large files over NVMe-TCP, the system can sustain up to 50 Gbps continuous write speeds.

Additionally, most SSDs can’t fully saturate their theoretical interface speeds due to NAND flash limitations, meaning even high-end Gen4 drives won’t always see a real-world advantage over Gen3 in practical workloads.

As for drives, PCIe Gen3 U.2 SSDs remain widely available and are ideal for enterprise workloads where endurance, capacity, and cost-efficiency matter more than peak sequential speeds.


Can the CPU handle 200G networking and 20 NVMe drives with PCIe 3.0?

The RDS isn’t just storage—it’s also a high-speed router. The network interfaces are designed for routing, virtualization, and compute workloads, not just disk access. While based on the bestselling CCR2216 router, the system has been fine-tuned for performance across networking, storage, and compute tasks.

Additionally, the latest RouterOS includes optimizations like ROSE storage enhancements, improved multi-threading, and better RAID/disk management, ensuring efficient workload distribution and seamless operation of NVMe storage and high-speed networking.

For US$1600-$1800, we’re getting a bunch of NVMe drives slapped onto half of a CCR2216 (which, if you don’t need all the ports, is actually a pretty good deal for just the router).