Simple queues and Memory usage

I work for a WISP which uses about 1000 simple queues to limit bandwidth for our wireless customers. I know it’s better to use PCQ / queue trees than 1000+ simple queues, but we use simple queues for usage monitoring (byte counters) and troubleshooting (we can quickly look up whether a customer is using all their bandwidth). We hit about 90Mb/s of traffic at peak times with ~30% CPU load, so the hardware has no trouble with this config.

I came up with some mangle rules to mark high priority (web+email+voip) and ‘other’ (bittorrent, P2P) traffic. I thought I could prioritize this traffic using simple queues with priority, but I quickly ran into memory problems. I set up a test router and added 10000 simple queues (each with no children), and memory usage was very low. As soon as I started adding child queues, free memory started to drop by about 4MB(!) for each queue that got a child. Obviously I don’t plan to use 10000 queues in production, but I wanted to see what the memory usage would be like.

Example:
Queues with no children:
-Q1
-Q2
-Q3
-Q4
=(memory use negligible)

One child per queue:
-Q1
\C1
-Q2
\C2
-Q3
\C3
-Q4
\C4
=(free memory drops by ~16MB)

I can’t figure out why I can add 10000+ simple queues (just for testing) to the router with no problem, but if I set up ~300 queues, each with one child (600 total), I run out of memory (2GB total). Is there something special about parent-child queues that I am missing? Adding more children to each queue doesn’t seem to increase memory usage. In other words, the number of children or parents doesn’t matter, but the total number of parents that have children does, if that makes sense.
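
To put the pattern in numbers, here is a rough Python sketch of what I seem to be seeing. It is only a model of my observations, not anything taken from RouterOS itself: the ~4MB figure is what I measured, the linear scaling is an assumption, and the function name is made up for illustration.

```python
# Rough model of the observation: free memory drops by a fixed amount for
# each top-level queue that has at least one child, no matter how many
# children it has. 4 MB is the measured figure; linear scaling is assumed.
MB_PER_PARENT_WITH_CHILDREN = 4


def estimated_memory_drop_mb(children_per_queue):
    """Estimate the drop in free memory (MB).

    children_per_queue: list holding the number of children of each
    top-level simple queue.
    """
    parents_with_children = sum(1 for n in children_per_queue if n > 0)
    return parents_with_children * MB_PER_PARENT_WITH_CHILDREN


print(estimated_memory_drop_mb([0, 0, 0, 0]))  # 0    (four queues, no children)
print(estimated_memory_drop_mb([1, 1, 1, 1]))  # 16   (four queues, one child each)
print(estimated_memory_drop_mb([3, 3, 3, 3]))  # 16   (more children, same drop)
print(estimated_memory_drop_mb([1] * 300))     # 1200 (~300 parents, one child each)
```

The estimate only counts the queues themselves; whatever else the router keeps in memory is not included.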

Anyone have any idea about what is going on? Thanks!