Oh dear.
Okay, to get the easy shot out of the way - SwitchOS cannot do any routing, in hardware or in software.
But if it’d be more comfortable for you - and you decide you don’t need any of the other ROS features that SwOS is missing - sure, why not.
I personally run ROS on everything to have the same look-and-feel across devices, even where I don’t really need anything more, but SwOS is basically “as simple as it can be” and that can be quite convenient.
Next thing - yes, the RB4011 is a 100% software router. With a strong, fast quad-core CPU, it can do a lot in software.
It has hardware acceleration for IPsec encryption (only for some combinations of the myriad possible IPsec settings), but everything else is just CPU work and config optimizations (FastTrack, etc).
The RB4011 has two simple-minded hardware switches, each responsible for 5 ports; there are a few small, simple things you can order them to do in hardware, but in “typical” operation the only CPU-free operations will be forwarding packets between two ports on the same chip in the same VLAN, and VLAN tagging/untagging.
A huge number of MT devices, large and small, are like this (although usually with just one layer2 switch chip).
It’s basically only some of the new gear with Marvell chips that has the L3HW functionality, which is… tricky to use, but if it fits one’s needs it gives quite the massive performance boost.
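For reference, on the devices that do support it, switching L3HW on is a switch-level setting. This is a sketch from memory of the CRS3xx menus in RouterOS v7, so treat it as an assumption and check it against the L3 Hardware Offloading documentation for your exact model; it does not apply to the RB4011:

```
# Enable L3 hardware offloading on the switch chip
# (CRS3xx-class devices with L3HW support, RouterOS v7 syntax).
/interface/ethernet/switch set 0 l3-hw-offloading=yes
```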
Yes, it is quite common on big networks to have a “core switch” which does the basic L3 but very fast for “internal” networks, with some simple filters/ACLs and so on, at hardware “wire speed” or nearly so, then a separate router or these days “next generation firewall” doing the advanced layer 3 stuff with connection tracking, inspection, NATs, dynamic routing protocols, etc.
You didn’t say much about the load on “your network”, but if the RB4011 handles it without bottlenecking on a single core (watch /tool/profile during various high-traffic times?), it’s… fine to just keep doing that?
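If it helps, this is roughly how I’d check that on the RB4011. A minimal sketch in RouterOS v7 path syntax (on v6 the same commands are written with spaces, e.g. `/tool profile`); run it while the traffic is actually heavy:

```
# Per-core load snapshot: one core pinned near 100% while the others idle
# is the single-core bottleneck to look for.
/system/resource/cpu/print

# Break CPU time down by what is consuming it, shown per core.
/tool/profile cpu=all
```

If one core sits near 100% while the total looks modest, that’s the case worth worrying about; if the load is spread out and well below the limit, there’s nothing to fix.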
To give you an example, at one of my customers I have a CRS317 (which has quite a bit more L3HW capacity than the CRS328) doing L3HW routing for the LAN with a lot of internal traffic (servers and NAS connected with 10Gb ports). Next to it, an RB1100AHx4, which has the same 4-core Annapurna CPU as the RB4011, does WAN, NAT, and a ton of VPN terminations, with complex firewall rules.
But for a few years prior the 1100 was doing the job alone paired just with “dumb” CRS326 switches, until its 1Gb ports - not the CPU - became the bottleneck for the LAN part.
Regarding the “what vlan to use for what” - it really depends on whether you’re asking from the point of view of convenience or of security, for example.
“Untagged management”* makes it easy to reach something if it falls to some kind of default mode; but because it makes it easy for you, it makes it easy for everyone else just as well.
(* I kinda guess you actually mean “vlan 1 for management, and leaving that as default/untagged on trunks”…)
In any case, again, something a lot of “security” forgets is that the #1 step is to decide what you want to defend against.
There’s a nice term for it, “define your threat ” but with another noun which I can’t recall.
Lemme just tell you that at my main job I still have some sites doing the same because of how much risk and man-hours it would be to re-organize the management layer just-because.
New sites and full-teardown refreshes are set up in a more modern way, with isolated management zones of various trust levels and stuff, but what’s old-and-running stays until top management decides it’s a good use of time and money.
Maybe you really should do the change because you’re running a powerplant.
Maybe “better is the enemy of good-enough” and it can stay until you replace all the hardware to lay down your future 25Gb network all over the place and configure that fresh.