Advise me: hardware for high-performance routing

I am considering putting a MikroTik into my cloud as the core router. Now, the CCR series is nice - but most models have throughput so low it is not funny. My cloud backbone runs at 200G per server (2x100G), backboned into a Mellanox switch. Before anyone starts complaining about internet speeds: that is not the point here, we are talking purely about in-data-center backbone transfer speeds. I can live with a lot less on the internet side, but the internet does not copy VMs between domains in the same data center.

What would you advise as a router between networks in this scenario?

My requirements are:

  • Routing IPv4 and IPv6
  • Limited firewall capability (not all subnets accessing all other subnets - something that CAN be done with routing tables, not a full firewall). Mostly I have a “hub and spoke” network: one central IP range plus client ranges, and every client must be able to reach the central range, but not go further.
  • VPN termination a BIG plus.
  • Reverse proxy for HTTP a small plus.
  • As much routing performance as possible. Given that the individual networks CAN run at 200G, I can live with 10G routing, but I cannot live with the central file server sitting behind a 1G connection. Not when that includes install images that may see concurrent use.
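For the “hub and spoke” restriction, here is a minimal RouterOS sketch using an address list and a few forward-chain rules. All subnet addresses are made-up placeholders, and note that stateful filtering (connection tracking) has a real throughput cost at these speeds - which is exactly why one might prefer solving this with routing tables instead:

```routeros
# Hypothetical addressing: hub = 10.0.0.0/24, client ranges = 10.0.10.0/24, 10.0.20.0/24
/ip firewall address-list
add list=clients address=10.0.10.0/24
add list=clients address=10.0.20.0/24

/ip firewall filter
# let replies of established connections through first
add chain=forward connection-state=established,related action=accept
# clients may reach the central (hub) range
add chain=forward src-address-list=clients dst-address=10.0.0.0/24 action=accept
# the hub may reach clients
add chain=forward src-address=10.0.0.0/24 dst-address-list=clients action=accept
# but client-to-client traffic is dropped
add chain=forward src-address-list=clients dst-address-list=clients action=drop
```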

My alternatives so far are - but I am VERY open to something else:

  • Just use Microsoft Network Controller, plus a MikroTik as VPN termination for clients.
  • Use CHR, but I can find no information on how much throughput has been recorded there using proper Hyper-V virtualization and high-end network cards. It seems no one has even tried it with more than 10G - and that is not exactly high throughput in a cloud environment these days.
  • Use a dedicated box with an x86-64 license, but I seriously think I would have to do that with 10G - or does MikroTik offer drivers for Mellanox 100G cards?
  • Use ???. The https://mikrotik.com/product/CCR1072-1G-8Splus - the CCR1072 - looks nice and has 8x SFP+, i.e. 8x 10G, which could theoretically be used for a lot of dedicated 10G links via a 100G/10G breakout (i.e. dedicate 10G to the central network, another 10G for distribution to end-user points, and a third for VPN). A LOT more ports than I think I need now, but if that is what it takes, it is not bad. The only alternative routers sadly offer at most 2x SFP+ (CCR1036-8G-2S+). THAT one would actually work perfectly, allowing me to use 8x 1G as a “breakout” for stuff like KVM, control networks, etc., while having 2x 10G. Any known issues with this one?

Hyper-V is slow. For 10G+ speeds you should run on bare metal - at least. Also, at those speeds I think you either have to give your money to the big players (e.g. Cisco, Juniper, Arista, with their custom ASICs) or use well-chosen and well-tuned x86-64 hardware.

The CCR1072-1G-8S+ can handle some impressive traffic loads, but not for single TCP sessions.

How often do your transfers peak above 20G?

10G+ should be quite rare - basically it happens when someone copies VM images, install ISO images, etc. We also run a 10G connection up to our office’s main distribution point, but continue with 1G from there, so that is not one TCP session. And then we have the in-VM backup for some critical virtual machines. General VM backup goes through the high-performance switch and, again, does not leave that switch, but some VMs do backups from within (mostly database servers, where we want the ability to restore individual databases). I would be OK with 1G for that one :wink:

So, generally, I am actually QUITE OK with 10G as the maximum between the subnetworks, as well as for running network controllers and VPN termination. I would not be OK with it breaking down at 5G or so.

The “real” 10G+ traffic happens all the time, but it is not router-relevant (i.e. it never leaves the Mellanox switch that handles the hyperconverged cluster). All SSD-buffered reads and writes are distributed (think RAID 6, except that instead of hard discs, every “disc” is a machine with a LOT of hard discs and SSDs, plus some buffer SSDs). And yes, the RAID 6 calculations are actually done on the Ethernet controller :wink:

Funnily enough, RIGHT NOW we are using an old 1G MikroTik 1100AH - that is AH, not AHx4. But so far the servers are connected to the offices with 1G only. That now goes to 2x10G: one for the offices, one for the internet (which also hits the main distribution due to cabling issues). So on that side the “real” traffic for the router is limited to 10G office and 10G internet (the latter actually limited to 400 Mbit for the moment, but we run a separate optical line to the endpoint). The rest is rarer.

Picture a software developer running a separate subnet per project, plus one central subnet for archive, control, DNS, gateways, etc. - there is not a lot of traffic crossing subnets at super high speed. Simply no need. I just hate getting hit by a “stupid” limit that I overlooked. And the CCR1072 is not a 100 USD device where I just sit down and say “damn, OK, let’s just have a look at it”.

I have good experience with MikroTik for anything not too extreme (i.e. not Mellanox level) and for running “various stuff” (the VPN endpoint and AP controller come to mind - makes things really nice at low cost), as well as the initial firewall.

Our setup is a chain:

  • We have a CRS328-24P-4S+RM as the main office distribution in the basement, from which 2x10G go to our in-house data center.
  • We plan to put this switch into the data center as the “center” of the networks.
  • We have a Mellanox high-performance switch as the “cloud core switch”, handling all the extreme-performance switching needs.
    (OK, and we have some other small parallel stuff there too - like every server having a 1G KVM port, etc.)

This middle machine must be able to handle as much as I can justify. The Mellanox CAN do routing, but then I would need to limit the subnets :wink:

Did you try CHR on VMware, for example?
It’s worth a try, I think - you can attach as many CPU cycles/network cards as you want.

It actually is an IDEA - albeit on Hyper-V (we are a VMware-free zone, with no intent to change that).

The amazing thing is that I can try it with virtual adapters connected to the Mellanox 100G cards and see how it goes.

  • It gives MikroTik time to come up with new products.
  • It will let me determine exactly whether this is good enough or not, and
  • It is a 95 USD investment for 10G, with a 250 USD upgrade to unlimited should that be needed. That is small money compared to a CCR - less, actually, as I have a license (1G) waiting that can be upgraded.

Yeah, I will go virtual for now. Smartest decision, particularly for the graphs. Then I can see how much of a real problem we have - and ask again if it does not work.
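To actually get those graphs and numbers out of a CHR trial, RouterOS has built-in interface graphing and a throughput tester; a quick sketch (the peer address is a placeholder):

```routeros
# record per-interface traffic history, viewable in the web UI under Tools > Graphing
/tool graphing interface add interface=all

# run a TCP bandwidth test against another RouterOS box (e.g. the old 1100AH)
/tool bandwidth-test address=10.0.0.2 protocol=tcp direction=both duration=30s
```

One caveat: bandwidth-test is CPU-bound on both ends, so at 10G+ it is more honest to push real traffic (e.g. iperf between two VMs routed through the CHR) and read the interface graphs instead.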

According to this: https://wiki.mikrotik.com/wiki/Manual:CHR#Free_licenses
you can test with an unlimited license for 60 days :slight_smile:

And I have CHR, but on VMware, for IPsec VPN - it works very well.
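For anyone replicating that: a minimal site-to-site IPsec sketch in RouterOS 6.43+ syntax (older versions put the secret directly on the peer). The addresses, subnets, and secret are placeholders:

```routeros
# IKEv2 phase 1 settings
/ip ipsec profile add name=ike2 enc-algorithm=aes-256 hash-algorithm=sha256 dh-group=modp2048
# phase 2 proposal
/ip ipsec proposal add name=ike2 enc-algorithms=aes-256-cbc auth-algorithms=sha256 pfs-group=modp2048

# the remote gateway and its pre-shared key
/ip ipsec peer add name=site-b address=203.0.113.2/32 exchange-mode=ike2 profile=ike2
/ip ipsec identity add peer=site-b auth-method=pre-shared-key secret="CHANGE-ME"

# which traffic goes into the tunnel
/ip ipsec policy add peer=site-b tunnel=yes src-address=10.0.0.0/24 dst-address=10.1.0.0/24 proposal=ike2
```

On CHR, IPsec throughput is pure CPU (no hardware crypto offload on most hypervisor setups), so it scales with the cores you give the VM - worth keeping in mind when sizing the trial.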