IPv6 DoS on CCR-1036-8G-2S+EM

Hi,

I’d like to share our conclusions on an issue we were dealing with for a few weeks.

On our network we use a CCR-1036-8G-2S+EM at the edge. The network isn’t big, so this router fits our needs perfectly. Some time ago we noticed odd, unexpected behaviour: the router was making our network totally unavailable for a short period of time, about 5 minutes.
The log showed all connected Ethernet interfaces going down. We have 3 different switches and a server directly connected to the router, and all those links were dropping as if the cable had been pulled out. This symptom is often described as port flapping.

Because we run BGP with our peer ISP, the port flapping caused downtime on our network through BGP session loss. We also noticed that the router was very unresponsive during the event.

Since we didn’t have much of a clue about what was going on, we took the shortest route and updated RouterOS from 6.19 to 6.34 and the firmware to 3.27.

This changed the situation a bit. The port flapping stopped and the BGP session wasn’t interrupted any more. However, the outages still happened: our network was going down once or twice a day for 3-5 minutes.

After a lot of investigation and packet sniffing we finally found an external (Internet) host sending a large number of IPv6 packets, which caused the router to generate IPv6 Neighbour Discovery (ND) packets at a high rate. How ‘high’ that rate really is remains debatable: we saw this specific host sending packets in the range of 80-150 kpps. For regular IPv4 traffic, 100 kpps isn’t really a high rate, and this router handles it with no issue.
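
To illustrate why each probe costs the router an ND packet: when the router has no neighbour entry for a destination, it sends a Neighbour Solicitation to the target's solicited-node multicast group, which is built from the low 24 bits of the address (RFC 4291). Below is a minimal Python sketch of that mapping. The 2001:db8:0:1::/64 prefix is the documentation prefix used as a stand-in for our real network, and the function name is made up for this post.

```python
import ipaddress

def solicited_node_multicast(target: str) -> ipaddress.IPv6Address:
    """Return the solicited-node multicast group (ff02::1:ffXX:XXXX) for a unicast address."""
    low24 = int(ipaddress.IPv6Address(target)) & 0xFFFFFF   # keep the low 24 bits
    base = int(ipaddress.IPv6Address("ff02::1:ff00:0"))     # fixed 104-bit prefix
    return ipaddress.IPv6Address(base | low24)

# Stand-in probe targets inside one /64 (assumption: not our real addresses)
for probe in ("2001:db8:0:1::1", "2001:db8:0:1::2", "2001:db8:0:1:dead:beef:cafe:1234"):
    print(probe, "->", solicited_node_multicast(probe))
```

Each new destination address in the scan maps to a group like this, so a scan that never repeats an address keeps the router soliciting for hosts that do not exist.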

What is specific about this DoS is that the host simply iterates over one or another of our IPv6 /64 networks. By iteration I mean the host sends a single UDP or TCP SYN packet to each IPv6 address within that /64. It looks like a scan for alive hosts on the network. Such a scan can be generated with the alive6 tool from the THC-IPv6 toolkit.
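
To give a feel for the numbers, here is a rough Python calculation of how long a full sweep of one /64 would take at the rates we observed, and how many probes fit into one 5-minute outage. The prefix is again the documentation prefix as a stand-in, and the helper names are made up for this post.

```python
import ipaddress

SECONDS_PER_YEAR = 365 * 24 * 3600

def sweep_years(prefix: str, pps: int) -> float:
    """Years needed to send one probe to every address in `prefix` at `pps` packets/s."""
    return ipaddress.IPv6Network(prefix).num_addresses / pps / SECONDS_PER_YEAR

def probes_sent(pps: int, minutes: float) -> int:
    """Number of probes sent in `minutes` at a constant `pps`."""
    return int(pps * minutes * 60)

# Upper end of the observed rate, stand-in /64 prefix
print(f"full /64 sweep at 150 kpps: {sweep_years('2001:db8:0:1::/64', 150_000):,.0f} years")
print(f"probes in a 5-minute event: {probes_sent(150_000, 5):,}")
```

A /64 holds 2^64 addresses, so the sweep never gets anywhere near finishing. In practice almost every probe in an event hits an address that has no host behind it.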

Because of the rate of the scan, the CCR-1036 was getting totally congested, cutting off all IPv4 and IPv6 traffic. It was hardly possible to operate the router via Winbox, CLI or serial console during the event.

It’s worth mentioning that we had absolutely no IPv6 firewall rules enabled. There were about 9 rules in the list, but all of them were disabled.

Surprisingly, it was the IPv6 firewall that helped us mitigate the problem. First we tried to limit the rate at which the router sends packets to multicast addresses, in order to limit the excess of ND packets thrown onto the network. This didn’t give good results. Then we added a blocking rule for the /64 network where the offending host is located, placed after the multicast rate limit. This seemed to make the problem less severe, but the issue wasn’t completely solved. Finally we moved this block to the top of the chain, and apparently the problem is gone. We still see this host sending a large number of packets in a short time, but they are all dropped now.

The sad fact is that blocking a single /64 network doesn’t solve the underlying problem; it only stops this one source.

Here come some questions for MikroTik users:
Has anyone faced a similar problem with a CCR-1036/1016/1072?
If our performance level is too low, could it be that we have a damaged unit?
Could it be related only to the fact that the router is forced to generate an ND packet for each probe, and the rate is just too high for this task?

If anyone is interested in replicating this problem, we can assist or arrange a test lab.

Sorry for the long post.

Cheers,
P