On my hEX PoE, I have enabled Fasttrack. Got some increase, but not a huge amount.
I would be interested in seeing the config. I have personally tested Fasttrack vs. non-Fasttrack on an RB951G, which has a slower and older CPU in it than the hEX PoE does, and can confirm I get results very close to what MikroTik boasts on their Fasttrack page (which uses an RB2011, which has the exact same CPU in it as the 951G does):
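In case it's useful for comparison, the Fasttrack setup I'd expect to see is just the usual rule pair near the top of the forward chain, something along these lines (a minimal sketch; rule placement relative to the rest of your filter matters, and any drop rules come after):

    /ip firewall filter
    add chain=forward action=fasttrack-connection connection-state=established,related comment="fasttrack established/related"
    add chain=forward action=accept connection-state=established,related comment="accept established/related"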
Disclaimer: last time I benchmarked Fasttrack, I did it on ROS 6.x. I have not actually re-run these tests on 7.x on the same hardware. Perhaps I should do that...
In any case, this is the result I got when I tested this on the 951G last time.
Without Fasttrack:
With Fasttrack:
Since, even without Fasttrack, I got over twice as much throughput on my lab 951G as you did on your lab hEX PoE, it makes me think that something else is going on here... (or maybe you are testing with 7.x, and firewall/NAT performance as a whole has gotten worse on 7.x? ...not impossible!)
I still suspect there has got to be a method to configure a customer-located Mikrotik router to do CGN-NAT (NAT-444) with firewall jump tables (using the WAN IP address), and get much faster throughput when compared to Mikrotik's normal NAT.
By definition, you can't have "NAT444" without two layers of NAT (4-to-4, then 4-to-4 a second time). Putting that aside, I'm still not exactly sure what you are proposing or how it's going to improve performance. If you are thinking that going from a single rule to multiple rules chained together with jumps is going to improve performance, I am very skeptical. The particular NAT "action" that you are using with multiple rules would have to be worlds more efficient than the single rule that I presume you are already using, and I just don't see that as being likely.
edit - note, when I configured NAT-444, I used a ton of jump tables to optimize the /21 CGN-NAT
(1/2, 1/4, 1/8, 1/16, 1/32), which resulted in fewer sequential NAT lookup steps when traffic was inbound to the customer.
Yeah, that part makes sense: each rule executed uses CPU and adds [marginal] latency.
For sure "less" rule look-ups is better than "more". But it just begs the question, what exactly is being compared to what? When I read what he wrote, it sounded like he was saying that he implemented CG-NAT one way, it was slow, so he improved performance with the jump tables. Okay, fair enough. But assuming I'm reading him correctly, I think he thinks that if he simply applies a similar strategy to the NAT config of a simple residential gateway, he can reap similar performance improvements. That just doesn't follow, since you are comparing apples to oranges when pitting a CGNAT appliance to a dinky consumer router. More specifically, if you are already down to ONE NAT rule...how is multiplying them but then adding jump tables supposed to make up for going from 1 rule to >1 rule, exactly? The only way to make up the difference is if "action=masquerade" for example (assuming that's even what he is using at the moment in his lab tests) is WAY more inefficient than whatever other action he would use in this hypothetical new-way-to-do-NAT. I don't buy it...but would be happy to be proven wrong.
I've never tested it, but my hunch is "masquerade" is good to avoid. If the WAN IP changes, it has to traverse/parse/remove connection tracking entries & I'm not sure of its internal logic for checking whether the IP changed, etc., etc. Basically masquerade "does more stuff"; whether it's significant IDK, but it's "heavier" than a src-nat.
I have never noticed "action=masquerade" to result in measurably lower performance than an equivalent "action=src-nat to-addresses=<WAN IP>", though like you I haven't sat down to rigorously test it (I've never had reason to). My understanding is that "action=masquerade" is basically identical to "action=src-nat", but where the source address is picked up for you from the address assigned to the egressing interface. In fact, if you have multiple interfaces that you need to NAT traffic out of, you can likely find a way to address all of them with a *single* "masquerade" rule (just don't match on out-interface, or add all interfaces to a list and match against the interface list!), versus multiple separate src-nat rules, so if anything, "masquerade" not only results in simpler config, but potentially better performance due to fewer rules! If your theory is accurate about the extra work masquerade has to do to prune the connection tracking table if the IP ever changes, I would think that should only cause a slowdown whenever such a change happens, not perpetually for all traffic flows at all times.
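To make that concrete, the two configurations I'd consider roughly equivalent are along these lines (the interface names and WAN addresses are just placeholders):

    /ip firewall nat
    add chain=srcnat out-interface-list=WAN action=masquerade comment="one rule covers every interface in the WAN list"
    # versus one src-nat rule per WAN interface, each pinned to that interface's address:
    add chain=srcnat out-interface=ether1 action=src-nat to-addresses=203.0.113.10
    add chain=srcnat out-interface=ether2 action=src-nat to-addresses=198.51.100.20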
Again, happy to be proven wrong on any/all of this... and I'll try to dedicate some time to labbing up a few scenarios to see if I can get some data to either back or contradict these assumptions.
EDIT -- EDIT -- EDIT:
I just saw one of your edits to an earlier post, where you said this:
edit note - on my Mikrotik CHR routers, doing 8 live IP addresses to a full /21 NATted interface required tens of thousands (around 40k) of configuration lines of jump tables.
40,000+ NAT rules??? Yikes! I understand it's skipping and jumping a lot, but I still have to think that is going to take its toll.
How many lines/rules did you have in your previous RouterOS CGNAT config without your jumps?
I just found your older post describing your config over here on this thread.
First, all that NAT444 means is just that there are two layers of NAT44 back-to-back. This means there are two NAT gateways. This is referring to the customer CPE doing NAT (so for example 192.168.1.0/24 to a single 100.64.x.x IP on the customer WAN), and then to your CGNAT appliance doing NAT a second time (from the customer 100.64.x.x to your public IPs).
So, technically, you are doing "NAT444" just by having a CGNAT box, regardless of how it is configured.
Often NAT444 is a term used in the context of CGNAT for obvious reasons, but it has picked up additional meaning over time, because if you can limit the port#s that the customer router uses, then you can do precise, stateless 1-to-1 mapping of ports on your CGNAT gateway. Then you don't technically need to do connection tracking on the CGNAT box. So for example, if you configure customer A's router to limit its NAT of 192.168.1.0/24 to WAN 100.64.1.2 ports 5000-5249, and customer B's router to limit its NAT of 192.168.1.0/24 (separate LAN) to WAN 100.64.1.3 ports 5250-5499, then your CGNAT box just knows that 5000-5249 on a specific public IP always belongs to customer A (100.64.1.2), and 5250-5499 on the same public IP always belongs to customer B (100.64.1.3). Then traffic flowing in both directions through the CGNAT box can have both IPs and port#s re-written without consulting a connection tracking table at all. That will allow you to reduce your CPU usage and increase the scalability of your CGNAT *considerably*, because all of the little residential/customer routers are doing all of the connection tracking for you, and the CGNAT box can then do zero.
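The customer-router half of that is at least expressible in RouterOS today; a rough sketch using the customer A numbers from above ("wan" is a placeholder interface name, and you need separate TCP and UDP rules because to-ports only applies to those protocols):

    /ip firewall nat
    add chain=srcnat src-address=192.168.1.0/24 out-interface=wan protocol=tcp action=src-nat to-addresses=100.64.1.2 to-ports=5000-5249
    add chain=srcnat src-address=192.168.1.0/24 out-interface=wan protocol=udp action=src-nat to-addresses=100.64.1.2 to-ports=5000-5249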
Unfortunately, RouterOS doesn't support this kind of static, connection-tracking-less IP+port mapping (as far as I can tell). Even if they did, to make use of it requires that you have absolute control over *both* NATs that are happening (not just your CGNAT box, but also 100% of customer routers), such that you can precisely control the port numbers that are used end-to-end through your network, all the way down to the customer router...that way, the core CGNAT box and all of your customer routers can be working in tandem to scale up the efficiency and performance of your CGNAT.
Second, I don't think you need all of that jump table craziness in order to achieve NAT that is equally as performant as what you are getting right now. You can achieve this by switching to "action=netmap" instead.
I understand one of the requirements is that when you get a subpoena or DMCA notice, you need to be able to trace back to a specific customer from the public IP and port#. Both this and short NAT rule lists are possible using "action=netmap". Your use of jumps is just a workaround for the way that "src-nat" usually works. But "action=src-nat" is not the correct tool for the job in this case.
Mapping a /21 to a /29 is a 256-to-1 ratio of private-to-public IPs. That's quite a high ratio. The lower the ratio, the fewer NAT rules you can get away with, of course.
Anyway, with "netmap", you only need as many rules as you need private IPs mapped to one public IP. This is true regardless of how many total private and public IPs you have...it's only the ratio we care about. With a 256-to-1 ratio, you only need 513 rules: 256 rules for TCP traffic, 256 rules for UDP traffic, and one rule at the end for "catch-all". I have been testing this myself over the last few weeks, and performance seems quite good so far (right now playing with about 300,000 tracked connections and ~5-6Gbit/s of traffic running bare-metal on an old i5-9500, but with a NAT private-to-public ratio of 64:1...the PC hasn't reached its limit and the CPU is not peaking out; however, I have also tested a CCR1036-2S+ and it seems to be able to handle considerably more).
I have attached an example that maps 100.64.0.0/21 to the fictitious 200.200.200.200/29. Just search-and-replace 200.200.200.200/29 with your block of 8 (and of course add whatever other matchers are relevant to you, like out-interface etc.). In this example, traffic seen coming from 200.200.200.200:1024-1275 would have come from 100.64.0.0, seen from 200.200.200.201:1024-1275 comes from 100.64.0.1, etc.
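In case the attachment isn't viewable, the pattern is roughly the following: each rule netmaps one aligned /29 chunk of the private /21 onto the public /29 and pins it to its own 252-port slice (first two TCP rules shown; the UDP rules mirror them, and the catch-all goes last):

    /ip firewall nat
    add chain=srcnat src-address=100.64.0.0/29 protocol=tcp action=netmap to-addresses=200.200.200.200/29 to-ports=1024-1275
    add chain=srcnat src-address=100.64.0.8/29 protocol=tcp action=netmap to-addresses=200.200.200.200/29 to-ports=1276-1527
    # ...254 more TCP rules, the matching 256 UDP rules, then a catch-all for everything else (ICMP etc.):
    add chain=srcnat src-address=100.64.0.0/21 action=src-nat to-addresses=200.200.200.200-200.200.200.207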