"Several million PPS" is quite a lot. For example, the CCR2216 (more or less MT's flagship device) hits its performance ceiling at around 2.9 Mpps when routing and firewalling on the CPU (not when L3HW offload is in effect).
See the official test results, row "Routing -> 25 ip filter rules".
So I'd say you're hitting the performance limit of your x86 router.
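
To put a packet rate into perspective, here is a rough sketch of the arithmetic that converts pps to throughput and back. The function names and the chosen frame sizes are just my illustration, and the 20-byte wire overhead assumes plain Ethernet (preamble, SFD and inter-frame gap), so treat the output as a ballpark rather than a benchmark.

```python
# Ethernet per-frame overhead on the wire (assumed plain Ethernet):
# 7 B preamble + 1 B start-of-frame delimiter + 12 B inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def throughput_gbps(pps: float, frame_bytes: int) -> float:
    """Layer-2 throughput in Gbit/s for a given packet rate and frame size."""
    return pps * frame_bytes * 8 / 1e9


def line_rate_pps(link_gbps: float, frame_bytes: int) -> float:
    """Maximum packets per second a link can carry at a given frame size."""
    return link_gbps * 1e9 / ((frame_bytes + WIRE_OVERHEAD_BYTES) * 8)


if __name__ == "__main__":
    ceiling_pps = 2.9e6  # the CPU-routing figure quoted above
    for size in (64, 512, 1518):
        gbps = throughput_gbps(ceiling_pps, size)
        print(f"{size:>5} B frames at 2.9 Mpps: {gbps:6.2f} Gbit/s")
    # How many small packets it takes to fill a 10G link
    print(f"10G line rate at 64 B: {line_rate_pps(10, 64) / 1e6:.2f} Mpps")
```

The point is that the same pps figure means very different bandwidth depending on packet size, so a router can run out of pps long before the link itself is full when traffic consists mostly of small packets.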