CCR: single root queue problem

I have about 10,000 simple queues and about 600 Mbps of traffic on my CCR-1072.
If I set a single parent for all the queues, CPU usage goes to 100%, but without a parent it is about 10%.
I currently use one queue tree entry to limit the total, but that is not a good solution because we lose the benefits of HTB.
How can I use HTB with simple queues?

This is important for me, so please help.

Reply from MikroTik support:
One queue structure can be handled by only 1 CPU core, so an HTB structure is not advisable on CCR. Many single-level simple queues are the best option on CCR.

But that does not answer my question. How can I limit the total bandwidth? I want the benefit of max-limit and limit-at, so that users get more speed when there is spare bandwidth. This is very important for me. Please help.
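
To make clear what behaviour I am after, here is a rough sketch in Python. It is only a simplified model of the limit-at / max-limit idea, not RouterOS's actual HTB scheduler. The function name `share_bandwidth` and all numbers are made up for illustration, and it assumes the sum of all limit-at values fits inside the total.

```python
def share_bandwidth(total, queues):
    """Simplified model of limit-at / max-limit sharing (not RouterOS code).

    total  -- total bandwidth of the parent, in Mbps
    queues -- list of (limit_at, max_limit, demand) tuples, in Mbps
    Returns the rate granted to each queue, in the same order.
    Assumes sum(limit_at) <= total.
    """
    # Step 1: every queue first gets its guaranteed rate (limit-at),
    # but never more than it actually wants to send.
    rates = [min(limit_at, demand) for limit_at, _max_limit, demand in queues]
    spare = total - sum(rates)

    # Step 2: spare bandwidth is split equally among queues that still
    # want more, and nobody may exceed min(max-limit, demand).
    active = [i for i, (_la, ml, d) in enumerate(queues) if rates[i] < min(ml, d)]
    while spare > 1e-9 and active:
        share = spare / len(active)
        still_hungry = []
        for i in active:
            _la, ml, d = queues[i]
            ceiling = min(ml, d)
            grant = min(share, ceiling - rates[i])
            rates[i] += grant
            spare -= grant
            if rates[i] < ceiling - 1e-9:
                still_hungry.append(i)
        active = still_hungry
    return rates
```

For example, with a 10 Mbps total and three users that each have limit-at 1 and max-limit 5, where the second user only wants 0.5 Mbps, `share_bandwidth(10, [(1, 5, 10), (1, 5, 0.5), (1, 5, 10)])` gives the two busy users 4.75 Mbps each and the idle user 0.5 Mbps. That is the kind of result I would like from a parent queue, without one CPU core going to 100%.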