CHR - Hyper-V locked to one core per interface (Mellanox)

We are tuning Hyper-V to host a CHR with the capacity to forward 40 Gbps of traffic.

In our configuration (dual Xeon, 32 cores @ 2.9 GHz) we are able to pass 100 Gbps to the loopback interface using all cores.
When I target a remote IP, the bandwidth test cannot get past 7-8 Gbps, with very low CPU usage on MT but one core at 100% on the system side.
How can we spread the load of the physical interface (Mellanox ConnectX-5) across multiple cores on the system side? I am a little bit lost about it.

Regards,
Ros