dottxt
just joined
Topic Author
Posts: 15
Joined: Sun Feb 02, 2014 5:53 pm

Poor routing performance on CCR

Sun Jul 06, 2014 1:17 am

Hello,

We have two CCRs in production right now, running ROS v6.15. When we receive DDoS traffic to hosts on our network in the 600k PPS range, we see 50-75% CPU usage. At 1G, the systems lock up and restart. If these systems are supposed to route many millions of packets per second, then I think we have them configured wrong. Perhaps fastpath is off? There are no firewall rules or queues, and the interface queues are set to hw-only.

Any hints on getting these to work as advertised?

Thanks
 
Kreacher
Member
Posts: 359
Joined: Wed Sep 25, 2013 3:58 pm
Location: Hogwarts

Re: Poor routing performance on CCR

Sun Jul 06, 2014 4:09 am

dottxt wrote: "Hello, we have 2 CCRs in production right now, running ROS v6.15, and when we receive DDoS traffic to hosts on our network in the 600k PPS range, we are seeing 50-75% CPU usage."

OK, and what is wrong with that? If 600k PPS is about 50% CPU usage, then roughly 1 million PPS is 100%, and that is where it stops.
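
As a rough check on that estimate, here is a minimal Python sketch that extrapolates the packet rate at which the CPU would saturate. It assumes CPU load grows linearly with packet rate, which is a simplification and not something measured on a CCR. The function name `saturation_pps` is made up for illustration; the inputs are the figures reported in this thread (600k PPS at 50-75% CPU).

```python
def saturation_pps(observed_pps: float, cpu_fraction: float) -> float:
    """Packet rate at which the CPU would reach 100%.

    ASSUMPTION: CPU load scales linearly with packet rate.
    """
    if not 0 < cpu_fraction <= 1:
        raise ValueError("cpu_fraction must be in (0, 1]")
    return observed_pps / cpu_fraction


# Figures from the original post: 600k PPS at 50-75% CPU usage.
low = saturation_pps(600_000, 0.75)   # pessimistic end of the range
high = saturation_pps(600_000, 0.50)  # optimistic end of the range
print(f"estimated saturation: {low:,.0f} to {high:,.0f} PPS")
```

Under that assumption the router would top out somewhere between 800,000 and 1,200,000 PPS, which brackets the rough "1 million" figure.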

dottxt wrote: "At 1G, the systems lock up and restart. If the systems are supposed to route many millions of packets per second, then I'm thinking we have it configured wrong."

More than 1G per port is probably not possible if this is a gigabit LAN or SFP port.
And yes, in theory the CPU can process many millions of packets per second, for sure! The same goes for a 10 Gbit/s SFP+ slot: in real life you will never actually see 10 Gbit/s of throughput, because these are only theoretical packet counts measured under laboratory conditions in so-called test labs.
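
To put the theoretical numbers in context, the following Python sketch computes the maximum packet rate an Ethernet link can carry at a given frame size. It is only wire arithmetic, not a statement about what a CCR can forward. The 20 bytes of per-frame overhead are the standard Ethernet preamble, start-of-frame delimiter, and inter-frame gap; the function name `line_rate_pps` is hypothetical.

```python
# Per-frame overhead on the wire that is not part of the Ethernet frame:
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
WIRE_OVERHEAD_BYTES = 20
MIN_FRAME_BYTES = 64  # minimum Ethernet frame size, including FCS


def line_rate_pps(link_bps: float, frame_bytes: int) -> float:
    """Theoretical maximum frames per second on an Ethernet link."""
    if frame_bytes < MIN_FRAME_BYTES:
        raise ValueError("Ethernet frames are at least 64 bytes")
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_per_frame


for name, bps in (("1 Gbit/s", 1e9), ("10 Gbit/s", 10e9)):
    for size in (64, 1518):
        pps = line_rate_pps(bps, size)
        print(f"{name}, {size}-byte frames: {pps:,.0f} PPS")
```

At 1 Gbit/s this works out to about 1.49 million PPS for 64-byte frames and about 81 thousand PPS for 1518-byte frames, so how many packets "1G" means depends heavily on packet size.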

In real life you will never see throughput like this, I assume. It depends on many other circumstances: the OS itself has to run on the CPU cores, and the RouterOS functions and options, as well as the WAN ports or interfaces, also have to be handled by the CPU cores. All of this narrows down more and more what the CPU is able to do.

As an example: the TileGx8072 platform from Tilera, with 72 cores, can process 65000 different VLANs, but for this task alone it uses 70 of the 72 CPU cores! That leaves only 2 CPU cores for the rest of the OS and everything else that is configured. Turned around, in real life this means that if many things are configured in RouterOS, the system stops performing or slows down massively at only 1000 VLANs. The same applies to the smaller CPU types and platforms, and to PPS packet processing as well.

dottxt wrote: "Perhaps fastpath might be off? There are no firewall rules or queues, and the interface queues are set to hw-only."

This probably does not come down to whether a function is done in hardware, hard-coded, or in software, but rather to how many CPU cores are assigned to which task. On the CCR1036 it might look something like this: 16 cores for the WAN interface and 20 for the rest of the system. If the advertised millions of packets are reached only with all 36 CPU cores, that figure cannot be reached when 20 cores are reserved for the OS. I hope you understand what I mean.

dottxt wrote: "Any hints on getting these to work as advertised?"

Not really.

Kind regards
Kreacher ♬

--------------------------------------
Karma points must not be paid by you
 
mgob1985
newbie
Posts: 28
Joined: Wed Jun 18, 2014 3:02 pm

Re: Poor routing performance on CCR

Fri Jul 11, 2014 6:57 pm

Part of this is probably also that the router isn't forwarding these packets but dropping them. That can be costly: there is no fastpath entry for them, so you're essentially processing 600 kpps in software.
 
dottxt
just joined
Topic Author
Posts: 15
Joined: Sun Feb 02, 2014 5:53 pm

Re: Poor routing performance on CCR

Fri Jul 11, 2014 7:09 pm

The lack of outbound packets is due to a null route (blackhole), not a firewall rule, so I would think fastpath should still work. Before the null route was added, CPU usage was double what the screenshot shows, so even if a blackhole bypasses fastpath, the router still wasn't forwarding with the expected performance prior to the null route being put in place.
