It seems likely AES-NI instruction support will be available when Mikrotik do a 64-bit x86 build. AES-NI aside, we'd see a 15% performance increase (due to correspondingly higher IPC), which is important on low-end Atom boxes.
Performance-wise, a properly implemented AES-NI path gives a much bigger boost than 15%: anywhere from 120% to 400%, averaging up to roughly 180%.
On Atom the gain is roughly 1/3 to 1/2 lower (depending on the generation) than on "mainstream" CPUs, both because of the AES-NI implementation and because the core is tiny and weak, but it is still hard not to notice.
Sadly, both Intel (earlier, during beta) and AMD (in the latest releases of its SDK/tools) withdrew support for anything that is not Rijndael/AES. Originally there were things like Twofish and Serpent (partially implemented, but still faster than without AES-NI), plus incomplete CAST and Blowfish and other things of that kind (including DES and 3DES, but no GOST or Stribog/Grasshopper).
Since AMD FX and Kabini, AMD has had better AES-NI support, but with Broadwell and Skylake the gap between vendors narrowed somewhat.
So far most AES-NI implementations rely on asm fine-tuning and thorough profiling, which makes them very platform-specific on each arch, and that in turn adds another potential avenue for exploitation, and for speculation about it.
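Because the tuned paths are that platform-specific, software normally checks at runtime whether the CPU advertises the instructions before choosing one. A minimal sketch of such a check, assuming a Linux box where /proc/cpuinfo lists CPU flags (`aes` on the x86 "flags" line, or on the "Features" line on ARM); the function names are made up for illustration and are not from any particular library:

```python
from pathlib import Path


def has_aes_flag(cpuinfo_text: str) -> bool:
    """Return True if a 'flags' (x86) or 'Features' (ARM) line lists 'aes'."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only look at the flag lines, not e.g. the model name.
        if sep and key.strip().lower() in ("flags", "features"):
            if "aes" in value.split():
                return True
    return False


def host_has_aes(path: str = "/proc/cpuinfo") -> bool:
    """Check the running host; returns False if the file cannot be read."""
    try:
        return has_aes_flag(Path(path).read_text())
    except OSError:
        return False


if __name__ == "__main__":
    print("AES instructions advertised:", host_has_aes())
```

This only tells you the flag is present; whether a given crypto library actually uses the accelerated path is a separate question.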
But so far there is no "free beer" in ciphers, so AES-NI "works" only when something essential is sacrificed to optimistic engineering and coding. I cannot deny the real improvements there, or the potential of offloading/boosting it, but it remains very computation-intensive even with the help of ASIC offloaders. The alternatives (FPGA-like, GPU-like, FPU-like and SIMD-merged) are much slower despite their greater flexibility, and bigger silicon-wise (but that depends on strategy).
For the particular details, it may be reasonable to study the element library and SDK of the particular fab/company to estimate the numbers when deciding on a SoC/chip.
On MIPS, ARM and PPC the benefits are smaller than on x86, not because of stronger silicon limitations, but because on the "generally weaker" chips paired with AES-NI-style acceleration the non-offloaded portion (the part that cannot be offloaded efficiently) still remains the bottleneck in crypto. So old, tiny 32-bit chips suffer more than 64-bit ones with fat L1/L2 caches, wide FPU/SIMD and other offloaders/accelerators.
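The bottleneck argument reads like Amdahl's law: if only a fraction p of the crypto work can be offloaded and the accelerator speeds that part up by a factor s, the overall gain is 1 / ((1 - p) + p / s). A quick sketch to play with; the fractions and factors below are illustrative assumptions, not measurements from any chip:

```python
def overall_speedup(offloadable_fraction: float, accel_factor: float) -> float:
    """Amdahl's law: total speedup when only part of the work is accelerated."""
    if not 0.0 <= offloadable_fraction <= 1.0:
        raise ValueError("offloadable_fraction must be between 0 and 1")
    if accel_factor <= 0:
        raise ValueError("accel_factor must be positive")
    remaining = 1.0 - offloadable_fraction          # work still done on the CPU
    accelerated = offloadable_fraction / accel_factor  # offloaded work, sped up
    return 1.0 / (remaining + accelerated)


if __name__ == "__main__":
    # Hypothetical cases: same 4x accelerator, different offloadable shares.
    for p in (0.5, 0.8, 0.95):
        print(f"offloadable={p:.2f} accel=4x -> {overall_speedup(p, 4.0):.2f}x overall")
```

With half the work stuck on a weak core, even an infinitely fast offloader cannot get past 2x overall, which is why the small 32-bit parts gain less.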