High CPU on networking process with GRE tunnels

Hi,

I’m migrating from a working Cisco setup to a Mikrotik CCR1036-12G-4S. We got this router because it was the most expensive, so we thought it would work fine. My problem is that I added 6 GRE tunnels from remote datacenters and I am routing IP ranges through these tunnels. I’m not talking about 1 or 2 /27s; some tunnels have 15 /24s through them. All in all, there are about 9,000 IP addresses being tunneled. The IPs are bound to internal servers via their public IP, and traffic goes out through the Mikrotik as the default gateway. I can run with 2 or 3 GRE tunnels with 5-6 /24s just fine. But when I enable all the GRE tunnels, I get lost packets, ping times shoot up to 4-7 seconds, and the router boots my connection.
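
To give a rough sense of scale, here is a quick Python sketch of the address arithmetic. The helper name `addresses_in` and the 10.0.x.0/24 prefixes are placeholders I made up for illustration; they are not the real ranges on the tunnels.

```python
import ipaddress

def addresses_in(prefixes):
    """Return the total number of addresses covered by a list of CIDR prefixes."""
    return sum(ipaddress.ip_network(p).num_addresses for p in prefixes)

# Placeholder prefixes (assumption): 15 /24s, like the largest tunnels carry.
tunnel_prefixes = [f"10.0.{i}.0/24" for i in range(15)]

per_tunnel = addresses_in(tunnel_prefixes)   # 15 * 256 addresses
slash24_equivalents = 9000 / 256             # ~9,000 addresses expressed in /24s

print(f"one 15x/24 tunnel covers {per_tunnel} addresses")
print(f"9,000 addresses is about {slash24_equivalents:.1f} /24s")
```

So the whole setup is on the order of a few dozen /24 routes, which is why a routing table size limit did not seem like the obvious culprit to me.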

I ran /tool profile and saw that whenever this happened, the “networking” process was consuming 40%-60% CPU (across all CPUs). No other process was consuming anywhere near that much CPU.

I thought that maybe the problem was that I had exceeded some internal routing table limit (4096 entries seems like a common size). Then I thought it was bandwidth, but we were only sending out about 5-10 Mbps. Could it be something with routing the packets out? I enabled proxy-arp on the WAN interface (ether8) to handle the outgoing traffic from the servers.

Can this router board even support what I am doing? Is there a hardware switch I can add to the router board, or should I buy a different one?

Before I spend hours trying to debug this ghost issue, I’d like to know if I am just attempting to do too much on this router.
Here are my versions and such. I can send the supout.rif if anyone needs it.

[admin@MikroTik] > system package print
Flags: X - disabled 
 #   NAME                    VERSION                    SCHEDULED              
 0   routeros-tile           6.26                                              
 1   system                  6.26                                              
 2 X wireless-fp             6.26                                              
 3 X ipv6                    6.26                                              
 4 X wireless                6.26                                              
 5   hotspot                 6.26                                              
 6   dhcp                    6.26                                              
 7   mpls                    6.26                                              
 8   routing                 6.26                                              
 9   ppp                     6.26                                              
10   security                6.26                                              
11   advanced-tools          6.26



[admin@MikroTik] > /tool profile
NAME                    CPU        USAGE
www                     all           0%
snmp                    all           0%
console                 all           0%
graphing                all           0%
ssh                     all           0%
dns                     all           0%
firewall                all         0.2%
networking              all        34.5%
tftp                    all           0%
gre                     all           0%
logging                 all           0%
management              all           0%
routing                 all           0%
idle                    all        64.7%
profiling               all         0.2%
bridging                all           0%
unclassified            all           0%



[admin@MikroTik] > /system routerboard print
       routerboard: yes
             model: CCR1036-8G-2S+
     serial-number: 529E04D0F21E
  current-firmware: 3.22
  upgrade-firmware: 3.22



[admin@MikroTik] > interface gre print
Flags: X - disabled, R - running 
 0   name="Tunnel249" mtu=1476 
      actual-mtu=1476 local-address=173.232.117.249 
      remote-address=108.178.8.34 dscp=inherit clamp-tcp-mss=no 
      dont-fragment=no 

 1   name="Tunnel250" mtu=1476 actual-mtu=1476 
      local-address=173.232.117.250 remote-address=173.44.243.244 
      dscp=inherit clamp-tcp-mss=no dont-fragment=no 

 2   name="Tunnel251" mtu=1476 actual-mtu=1476 
      local-address=173.232.117.251 remote-address=37.228.130.25 dscp=inherit 
      clamp-tcp-mss=no dont-fragment=no 

 3   name="Tunnel252" mtu=1476 
      actual-mtu=1476 local-address=173.232.117.252 remote-address=5.83.140.3 
      dscp=inherit clamp-tcp-mss=no dont-fragment=no 

 4   name="Tunnel253" mtu=1476 actual-mtu=1476 
      local-address=173.232.117.253 remote-address=213.21.225.14 dscp=inherit 
      clamp-tcp-mss=no dont-fragment=no 

 5 X  name="Tunnel254" mtu=1476 
      actual-mtu=1476 local-address=173.232.117.254 
      remote-address=146.185.242.2 dscp=inherit clamp-tcp-mss=no 
      dont-fragment=no

Here is a sample of the pings as the CPU creeps up then backs back down (normal ping times are 46 ms):

64 bytes from 23.90.60.146: icmp_seq=3947 ttl=52 time=46.724 ms
64 bytes from 23.90.60.146: icmp_seq=3948 ttl=52 time=47.792 ms
64 bytes from 23.90.60.146: icmp_seq=3949 ttl=52 time=46.921 ms
64 bytes from 23.90.60.146: icmp_seq=3950 ttl=52 time=45.765 ms
64 bytes from 23.90.60.146: icmp_seq=3951 ttl=52 time=47.210 ms
64 bytes from 23.90.60.146: icmp_seq=3952 ttl=52 time=49.981 ms
64 bytes from 23.90.60.146: icmp_seq=3954 ttl=52 time=690.835 ms
64 bytes from 23.90.60.146: icmp_seq=3953 ttl=52 time=2562.783 ms
64 bytes from 23.90.60.146: icmp_seq=3955 ttl=52 time=1549.045 ms
64 bytes from 23.90.60.146: icmp_seq=3957 ttl=52 time=510.710 ms
64 bytes from 23.90.60.146: icmp_seq=3956 ttl=52 time=1995.245 ms
64 bytes from 23.90.60.146: icmp_seq=3958 ttl=52 time=706.535 ms
64 bytes from 23.90.60.146: icmp_seq=3959 ttl=52 time=748.073 ms
64 bytes from 23.90.60.146: icmp_seq=3960 ttl=52 time=419.606 ms
64 bytes from 23.90.60.146: icmp_seq=3961 ttl=52 time=466.671 ms
64 bytes from 23.90.60.146: icmp_seq=3962 ttl=52 time=1102.575 ms
Request timeout for icmp_seq 69500
Request timeout for icmp_seq 69501
64 bytes from 23.90.60.146: icmp_seq=3966 ttl=52 time=671.800 ms
Request timeout for icmp_seq 69503
64 bytes from 23.90.60.146: icmp_seq=3967 ttl=52 time=1711.755 ms
Request timeout for icmp_seq 69505
64 bytes from 23.90.60.146: icmp_seq=3969 ttl=52 time=1067.613 ms
64 bytes from 23.90.60.146: icmp_seq=3968 ttl=52 time=2260.470 ms
Request timeout for icmp_seq 69508
64 bytes from 23.90.60.146: icmp_seq=3972 ttl=52 time=1160.803 ms
64 bytes from 23.90.60.146: icmp_seq=3971 ttl=52 time=2760.430 ms
64 bytes from 23.90.60.146: icmp_seq=3974 ttl=52 time=801.476 ms
64 bytes from 23.90.60.146: icmp_seq=3975 ttl=52 time=1709.493 ms
64 bytes from 23.90.60.146: icmp_seq=3977 ttl=52 time=960.410 ms
64 bytes from 23.90.60.146: icmp_seq=3976 ttl=52 time=2642.457 ms
Request timeout for icmp_seq 69515
64 bytes from 23.90.60.146: icmp_seq=3979 ttl=52 time=1164.111 ms
64 bytes from 23.90.60.146: icmp_seq=3980 ttl=52 time=722.044 ms
64 bytes from 23.90.60.146: icmp_seq=3981 ttl=52 time=239.143 ms
64 bytes from 23.90.60.146: icmp_seq=3978 ttl=52 time=3446.835 ms
64 bytes from 23.90.60.146: icmp_seq=3982 ttl=52 time=802.285 ms
64 bytes from 23.90.60.146: icmp_seq=3983 ttl=52 time=335.949 ms
64 bytes from 23.90.60.146: icmp_seq=3984 ttl=52 time=359.406 ms
64 bytes from 23.90.60.146: icmp_seq=3985 ttl=52 time=44.808 ms
64 bytes from 23.90.60.146: icmp_seq=3986 ttl=52 time=303.564 ms
64 bytes from 23.90.60.146: icmp_seq=3987 ttl=52 time=221.595 ms
64 bytes from 23.90.60.146: icmp_seq=3988 ttl=52 time=185.180 ms
64 bytes from 23.90.60.146: icmp_seq=3989 ttl=52 time=1464.084 ms
64 bytes from 23.90.60.146: icmp_seq=3990 ttl=52 time=464.706 ms
64 bytes from 23.90.60.146: icmp_seq=3991 ttl=52 time=48.219 ms
64 bytes from 23.90.60.146: icmp_seq=3992 ttl=52 time=973.814 ms
64 bytes from 23.90.60.146: icmp_seq=3993 ttl=52 time=92.268 ms
64 bytes from 23.90.60.146: icmp_seq=3994 ttl=52 time=44.831 ms
64 bytes from 23.90.60.146: icmp_seq=3995 ttl=52 time=44.633 ms
64 bytes from 23.90.60.146: icmp_seq=3996 ttl=52 time=425.471 ms
64 bytes from 23.90.60.146: icmp_seq=3997 ttl=52 time=257.915 ms
64 bytes from 23.90.60.146: icmp_seq=3998 ttl=52 time=48.591 ms
64 bytes from 23.90.60.146: icmp_seq=3999 ttl=52 time=47.437 ms
64 bytes from 23.90.60.146: icmp_seq=4000 ttl=52 time=44.998 ms
64 bytes from 23.90.60.146: icmp_seq=4001 ttl=52 time=45.254 ms
64 bytes from 23.90.60.146: icmp_seq=4002 ttl=52 time=46.293 ms
64 bytes from 23.90.60.146: icmp_seq=4003 ttl=52 time=45.440 ms
64 bytes from 23.90.60.146: icmp_seq=4004 ttl=52 time=47.513 ms
64 bytes from 23.90.60.146: icmp_seq=4005 ttl=52 time=45.615 ms
64 bytes from 23.90.60.146: icmp_seq=4006 ttl=52 time=44.836 ms
64 bytes from 23.90.60.146: icmp_seq=4007 ttl=52 time=45.666 ms
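
In case it helps anyone compare runs, here is a small Python sketch that summarizes a ping trace like the one above: it counts replies, timeouts, and replies that arrived out of order, and reports the worst round-trip time. The function name `summarize_ping` and the regex are my own assumptions based on the output format shown here, not part of any RouterOS tool. Timeout lines are only counted, since their sequence numbers are printed on a different scale than the reply lines.

```python
import re

# Matches reply lines such as:
# 64 bytes from 23.90.60.146: icmp_seq=3954 ttl=52 time=690.835 ms
REPLY = re.compile(r"icmp_seq=(\d+) ttl=\d+ time=([\d.]+) ms")

def summarize_ping(lines):
    """Summarize ping output: replies, timeouts, reordered replies, worst RTT."""
    rtts = []
    timeouts = 0
    reordered = 0
    highest_seq = -1
    for line in lines:
        if line.startswith("Request timeout"):
            timeouts += 1
            continue
        match = REPLY.search(line)
        if not match:
            continue  # ignore anything that is not a reply line
        seq, rtt = int(match.group(1)), float(match.group(2))
        if seq < highest_seq:
            reordered += 1  # arrived after a later sequence number
        highest_seq = max(highest_seq, seq)
        rtts.append(rtt)
    return {
        "replies": len(rtts),
        "timeouts": timeouts,
        "reordered": reordered,
        "max_rtt_ms": max(rtts) if rtts else None,
    }

# A few lines copied from the trace above.
sample = """\
64 bytes from 23.90.60.146: icmp_seq=3952 ttl=52 time=49.981 ms
64 bytes from 23.90.60.146: icmp_seq=3954 ttl=52 time=690.835 ms
64 bytes from 23.90.60.146: icmp_seq=3953 ttl=52 time=2562.783 ms
Request timeout for icmp_seq 69500
""".splitlines()

print(summarize_ping(sample))
```

The reordering count is worth watching because replies coming back out of sequence (3954 before 3953, for example) suggest packets are being queued somewhere rather than simply dropped.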