5.x routing cache bug (?) - dropped packets, lost network

Posted: Fri Apr 22, 2011 9:51 am
by glucz
I reported a problem to MikroTik earlier in which RouterOS 5.x loses network connectivity every day or so, depending on load.

Support told me that the reason for this was that my routing cache filled up. They suggested that my users were running p2p or that the router was being DDoS-ed. However, my suspicion is that this is an actual bug, and I would like to see if anyone else is experiencing it, so that maybe a fix could be seriously considered.

The visible effects are that memory usage on the router constantly increases and the following command:
[admin@MikroTik] > ip route cache print
cache-size: 29621
max-cache-size: 32768

shows a gradually increasing routing cache usage.

Once the max size is reached, the router starts dropping packets and eventually goes completely silent. The router never recovers by itself, as I would expect after a p2p or DoS event once the cache entries expire, but instead requires a reboot. Additionally, I don't see the many thousands of connections that I would expect from DoS or p2p. This is mostly 30-40 users generating 2000-5000 connections. When I downgrade to 4.x, the problem goes away.
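
For anyone who wants to track this over time rather than eyeballing it, here is a minimal Python sketch (Python, not RouterOS script) that parses the two fields printed by the command above and reports how full the cache is. The function names are made up for illustration, and it assumes the command output has already been captured as text in the "key: value" form shown in this post.

```python
import re

def parse_route_cache(output):
    """Extract 'cache-size' and 'max-cache-size' as integers from printed output."""
    values = {}
    pattern = r"^\s*(max-cache-size|cache-size):\s*(\d+)\s*$"
    for key, number in re.findall(pattern, output, re.MULTILINE):
        values[key] = int(number)
    return values

def cache_fill_ratio(output):
    """Return cache-size divided by max-cache-size (0.0 = empty, 1.0 = full)."""
    values = parse_route_cache(output)
    return values["cache-size"] / values["max-cache-size"]

# Sample text copied from the output quoted in this post.
sample = """cache-size: 29621
max-cache-size: 32768
"""

ratio = cache_fill_ratio(sample)
print(f"route cache is {ratio:.1%} full")
```

Logging this ratio at intervals would show whether the cache ever shrinks again, which is the behaviour I would expect if the entries were expiring normally.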

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Wed Apr 27, 2011 11:49 pm
by seany
Hello,

This sounds exactly like what is happening to a friend. We'll try to reproduce on 5.2!

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Thu Apr 28, 2011 10:24 am
by glucz
There is a route fix in 5.2. I hope that it was in response to this bug report. I upgraded 4 routers yesterday.

So far I'm up to here:

[admin@MikroTik] > ip route cache print
cache-size: 2596
max-cache-size: 65536


I'll just wait and see what happens.

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Thu Apr 28, 2011 12:39 pm
by glucz
[admin@MikroTik] > ip route cache print
cache-size: 15239
max-cache-size: 65536

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Thu May 05, 2011 4:52 pm
by krakenant
I am starting to have issues with routes now. I have at least two routes that multiple x86 units, two on 5.2 and two on 5.0, won't find. The route is there, and it shows up in an export or print, but if I do a print or a find that specifies a dst-address, the query returns nothing.

It broke two of my scripts. They were working until yesterday when I added some mangle rules and a couple of routes based on the routing marks of those mangle rules.

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Thu May 05, 2011 5:33 pm
by Chupaka
so it was working, and then became broken without upgrade/reboot/etc?

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Thu May 05, 2011 6:10 pm
by krakenant
They were all working with 5.0. Then I added some mangle rules and a couple of routes that used those mangle rules. Since then, my scripts that were working no longer work, because they cannot find an active route with a dst-address of 0.0.0.0/0, despite there being one, nothing about that route having changed, and no new routes having been added.

These are x86 boxes, I decided to update two of them to 5.2 and the issue still remains.
Two of them have not been upgraded/rebooted etc.

Here is one of the queries that is failing and the default route as well as the two routes that I added.
/ip route find dst-address=0.0.0.0/0 active=yes
[admin@00:60:E0:4C:A5:28] > ip route export
# may/05/2011 10:08:25 by RouterOS 5.0
# software id = C2PW-3RZV
#
/ip route
add disabled=no distance=1 dst-address=172.27.0.0/16 gateway=USER_VRRP2_SECONDARY pref-src=172.27.0.2 routing-mark=VRRP2 scope=10 target-scope=10
add disabled=no distance=1 dst-address=172.27.0.0/16 gateway=USER_VRRP1_PRIMARY pref-src=172.27.0.1 routing-mark=VRRP1 scope=10 target-scope=10
add comment="Default Route" disabled=no distance=10 dst-address=0.0.0.0/0 gateway=x.x.x.x scope=30 target-scope=10

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Sat May 07, 2011 10:04 am
by babbage
Glucz,
I can confirm that I have the same issue. I am afraid to upgrade to 5.2 because of possible new bugs! Let me know if your issue is fixed. I have been in touch with MikroTik for a month and they are still unable to update me on the result.

I have several different x86 servers and 3 of them have this issue. All different configs!

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Sat May 07, 2011 9:13 pm
by glucz
The problem is still present in 5.2.

I have been sending supouts to support to help their work. Maybe things will improve in 5.3? Unfortunately I need SSTP, so I must keep 5.2 on a few servers.

This also gave me the opportunity to test a few scenarios and found the following:
I have a demo PPTP/L2TP profile that is time and bandwidth limited, so RouterOS cuts the user off after some minutes. These dynamic server interfaces also create FIFO interface queues to manage the individual bandwidth limits. I believe that one of these two things causes the stale entries in the cache.

I have disabled this demo profile on 2 servers, so all disconnections are "clean" and the regular user accounts don't create interface queues. On these routers the cache numbers stay within reasonable values.

Here are the actual numbers:
SERVER 1 with ROS 5.2 and active demo profile (no p2p):
uptime: 2d10h33m55s
cache-size: 16499
max-cache-size: 32768

SERVER 2 with ROS 5.2 active demo profile (no p2p):
uptime: 5h41m1s
cache-size: 5059
max-cache-size: 16384

SERVER 3 with ROS 5.2 active demo profile (no p2p):
uptime: 19h5m5s
cache-size: 7834
max-cache-size: 65536

SERVER 4 with ROS 5.2 DEACTIVATED demo profile (with p2p!!!):
uptime: 2d22h3m9s
cache-size: 1215
max-cache-size: 32768

SERVER 5 with ROS 5.2 DEACTIVATED demo profile (no p2p):
uptime: 2d10h32m52s
cache-size: 200
max-cache-size: 65536


As you can see, my uptimes are rather low, so even where the cache problem is not present, other lockup problems and lingering OpenVPN problems require reboots.
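
Because the uptimes differ so much, the raw cache sizes above are hard to compare directly. A minimal Python sketch (not RouterOS script) of one way to normalise them is to divide each cache-size by the uptime in hours. The function names are hypothetical, and the calculation assumes the cache was close to empty at boot and grew roughly linearly, which I have not verified.

```python
import re

# Seconds per RouterOS uptime unit (w, d, h, m, s), as in "2d10h33m55s".
UNIT_SECONDS = {"w": 604800, "d": 86400, "h": 3600, "m": 60, "s": 1}

def uptime_seconds(text):
    """Convert an uptime string such as '2d10h33m55s' into seconds."""
    parts = re.findall(r"(\d+)([wdhms])", text)
    return sum(int(number) * UNIT_SECONDS[unit] for number, unit in parts)

def entries_per_hour(uptime, cache_size):
    """Average cache entries accumulated per hour (assumes an empty cache at boot)."""
    return cache_size / (uptime_seconds(uptime) / 3600)

# (label, uptime, cache-size) copied from the five servers listed above.
servers = [
    ("SERVER 1, demo active", "2d10h33m55s", 16499),
    ("SERVER 2, demo active", "5h41m1s", 5059),
    ("SERVER 3, demo active", "19h5m5s", 7834),
    ("SERVER 4, demo deactivated", "2d22h3m9s", 1215),
    ("SERVER 5, demo deactivated", "2d10h32m52s", 200),
]

rates = {label: entries_per_hour(uptime, size) for label, uptime, size in servers}
for label, rate in rates.items():
    print(f"{label}: {rate:.1f} entries/hour")
```

On these figures, every server with the demo profile active accumulates entries at a clearly higher hourly rate than either server with it deactivated, which fits the suspicion above.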

As a comparison, here is another server running ROS 3.x:
[admin@MikroTik] > system resource print
uptime: 31w13h50m28s
version: "3.27"

I have no routing cache information, but uptime is much better.

Same with 4.x. I did have to reboot this one recently, but usually it is very stable:
[admin@MikroTik] > system resource print
uptime: 1w2d15m14s
version: "4.17"
cache-size: 196
max-cache-size: 16384


I don't know if it's important or not, but I also run a script that removes invalid server interfaces and the associated IP addresses and queues. I don't know where they come from, but from time to time I get a whole bunch of them, then nothing for weeks. This is a RouterOS problem that has been present since the late 2.x and early 3.x versions, ever since I started working with RouterOS. Maybe these removals are not clean on 5.x?



GL

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Sun May 08, 2011 7:30 am
by babbage
I am doing the same, sending different supout files to help them resolve the issue. I have had 2 tickets open for a long time.
I think it's not related to interface or queue issues, because I have a core router with only Ethernet interfaces and the problem happens there too.
I am impatiently waiting for the fix... I love the new 5.x's PCQ burst feature...

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Fri May 13, 2011 5:19 pm
by OndrejSkipala
Hi, I have the same problem on RouterOS 4.x (most recently on 4.16). My RB433 stops responding to IP traffic; only MAC access works (and sometimes not very well). No ping, no routing. When I reboot it, it works fine again. This happens after 112 or 113 days on all routers. One of my RB600s also collapsed in this way after just 72 days. When I connect to the device via MAC and try to ping, it says "timeout No buffer space available".

I don't think this is a DoS attack, because it happens on all routers after 112 days, but not at the same moment. Also, no attack is visible in the monitoring. I wrote to MikroTik support several times, and after "upgrade to the new version", "increase queue sizes" and "it is a DoS attack", Maris from support found out that the /ip/route/cache seems to be full, although there was very little load. Memory is not full (plenty of space), the same goes for disk, and CPU is at 10%.

So I looked into a router that had been running for 112 days and that I was quite certain was going to fail that day like the others. The cache was almost full (only a few entries left). After 30 minutes, it happened: full cache, lost connection, no response to IP traffic.

So it must be something to do with this cache. The only things that run regularly on that router are 5 ICMP echo messages every 15 seconds from our server, and Traffic Flow, which sends data to the same server. Not much traffic, but it is something that is common to all our routers.

Do you also use something that regularly creates some kind of traffic?

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Mon May 23, 2011 1:47 am
by reddrinker
Hi,

We are experiencing a similar issue with RB1000s running RouterOS 5.2; however, when I look at the route cache it looks OK.

[admin@MikroTik] > ip route cache print
cache-size: 2423
max-cache-size: 65536

When I reboot the router it all works OK, but only for a few hours in our case. Then we see the symptoms below creep back in, and another reboot is required.
HOST SIZE TTL TIME STATUS
192.168.1.8 timeout
192.168.1.8 timeout
192.168.1.8 timeout
192.168.1.8 timeout
192.168.1.8 timeout
192.168.1.8 timeout
192.168.1.8 timeout
192.168.1.8 56 128 0ms
192.168.1.8 timeout
192.168.1.8 56 128 0ms
192.168.1.8 timeout
192.168.1.8 timeout
192.168.1.8 timeout
192.168.1.8 timeout
192.168.1.8 timeout
192.168.1.8 timeout
192.168.1.8 timeout
192.168.1.8 timeout
192.168.1.8 56 128 0ms
192.168.1.8 timeout
sent=20 received=3 packet-loss=85% min-rtt=0ms avg-rtt=0ms max-rtt=0ms

I have downgraded another RB1000 to 4.17 and also got a RB110 which came with 4.17, to try and see if the problem follows.

I am very new to RouterOS, so any guidance in trying to narrow down this issue would be appreciated.

Thanks in advance.

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Mon May 23, 2011 8:54 am
by mm690
SHAME ON MikroTik for removing my post about this issue!!!!

I was going to paste the link to a previous thread I created about this problem, but MT has deleted it, I would imagine due to its title.

It was titled "Petition to remove 5.xx as stable release!"

I had some good info in there on how to recreate the problem etc.

I'm frigging mad!

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Mon May 23, 2011 9:56 am
by normis
if you want a problem to be solved, follow this advice:
http://forum.mikrotik.com/viewtopic.php?f=2&t=45259

and also this:
http://forum.mikrotik.com/faq.php#f0r0

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Mon May 23, 2011 5:10 pm
by glucz
In case the original thread was removed and others experience the route-cache problems, lost pings etc.: change your tarpit actions to drops.

The problem is possibly due to a different route cache / tarpit implementation in the new Linux kernel used by ROS 5, so this may still be present in 5.3 or other future versions.

GL

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Wed May 25, 2011 9:50 pm
by goshawk
Hi,

I have a similar problem with my freshly installed 5.2 on a PC.

I have multiple VLANs on the same Ethernet interface, and sometimes when I delete an IP address on a VLAN, the route policy generated automatically for that IP address is not deleted. I can't delete it myself either, because it's not a static route.

After that, not only that network becomes unreachable but others too.
Reboot always solves the problem.

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Wed May 25, 2011 9:56 pm
by brainy
Hi,

5.3 was released today.

Maybe someone who is affected can try if the problem is fixed in 5.3?

Would be nice to know.

Regards,
Joerg

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Mon Sep 26, 2011 8:36 am
by OndrejSkipala
Hi guys, I tried to solve this problem with MikroTik support a hundred times and finally a fix was made. Version 5.6 of RouterOS really seems to handle the route cache properly. It no longer gets unreasonably big (it used to keep growing and growing), so I don't need to restart my routers every 110 days like I had to before (otherwise they would stop responding to IP traffic). Good job, MikroTik.

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Mon Sep 26, 2011 7:36 pm
by jrecabeitia
Hello, I have an RB1100 that I upgraded to v5.7.
The result was very bad.
The RB1100 stopped working. Memory is consumed until the router stops and does not respond (not even to a single ping).
This occurred within a few hours.
I went back to version 5.6.
Note that changing the PCQ values stabilized it, but not enough for it to function normally.
Apparently there are records in memory that should be flushed as obsolete. Honestly, I cannot give more information about this bug.

I have other RB433 and RB493 units with identical OSPF configurations; they differ only in that they do not use queues of any kind. These work well with v5.7.

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Mon Nov 28, 2011 7:10 am
by patmcq
I have a similar issue with a 1200 (Release 5.8). I am trying to replace an HP router (dl7012?!) with RouterOS. I have an 80mb connection with 100mb on the way. I service about 60 customers/connections, all 100mb Ethernet, and many of them NAT their connection.

I replaced the HP with the 1200 and everything was fine... for about 2 minutes. Router would not respond to pings, to and through the device. CPU was about 2-7%.

I am not doing anything fancy at the moment. No Masquerade, no Mangle, No VPN, nothing. Just routing between two Ethernet ports. I tried this late on a Sunday night with literally no load and no traffic.

Any ideas?

Patrick

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Tue Dec 11, 2012 4:37 pm
by farzin
Hi, I know this is an old topic, but 2 days ago I upgraded my x86 router from version 5.22 to 6rc5 and ran into exactly the same problem.
With 200 online PPTP users, my IP route cache of 65K fills up and the router stops responding to IP packets. I can log in via MAC and send commands in the terminal, but Winbox and ping do not work. The workaround I have found so far is that max-cache-size is tied to the device's memory: I installed more memory and now have a 1M cache size. But that is going to be used up soon too. The router used to hang every 4 hours (the 65K cache was fully used in that time); with the cache at 1M, 150K has been used in 1 day so far, so I think I will have to reboot this router every 4-5 days.

This happens while memory, CPU and HDD are free and only the IP route cache is full. I also do not have any tarpit actions. I have NAT and SYN firewall rules, but they behaved normally in versions 5.2x and 5.1x, and I have had this problem since the day the router was upgraded to version 6.
I also removed every package I don't need, even the IPv6 package, leaving only 7 packages, with no improvement. There is also no User Manager or internal RADIUS logging on this router; it is just a PPTP server doing gateway routing. When the number of online PPTP clients rises above 100, the firewall connection list stops showing anything, and the IP route cache grows very fast until it is full and the router hangs.
Before that point, the router shows connections under IP --> Firewall --> Connections, and the route cache increases and decreases as users connect and disconnect.
But after some time the IP route cache only increases and the system hangs.
I can downgrade to version 5.22 again, but isn't there any real solution yet for this kernel bug?
This is when the router was working normally:

/ip route cache print
      cache-size: 2075
  max-cache-size: 65536

decreased > /ip route cache print
      cache-size: 1935
  max-cache-size: 65536
 increased> /ip route cache print
      cache-size: 2831
  max-cache-size: 65536
decrease> /ip route cache print
      cache-size: 2395
  max-cache-size: 65536

> /ip route cache print
      cache-size: 3644
  max-cache-size: 65536

>Once more than 100 users were online, it started increasing without ever decreasing again!
 /ip route cache print
      cache-size: 32804
  max-cache-size: 65536

>This is my current status after 1 day of uptime, with the router's installed RAM increased to 1 GB (I think it will fill up in the next 2-3 days):
 /ip route cache print
      cache-size: 120995
  max-cache-size: 1048576

 /ip firewall connection print
Flags: S - seen reply, A - assured 
 #    PR.. SRC-ADDRESS           DST-ADDRESS           TCP-STATE   TIMEOUT

Nothing is listed here anymore, even though connection tracking is active.
In this menu, the Winbox status bar at the bottom shows: 0 items out of 8430, max entries 524288

Any help would be appreciated.
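
As a rough cross-check of how long the enlarged cache might last, here is a minimal Python sketch (not RouterOS script) that linearly extrapolates from the last reading above, 120995 of 1048576 entries after about 1 day of uptime. The function name is hypothetical, and steady linear growth from an empty cache at boot is purely an assumption; growth that speeds up as more users come online would fill the cache sooner than this suggests.

```python
def days_until_full(cache_size, max_cache_size, days_elapsed):
    """Linearly extrapolate how many more days until the cache reaches its maximum."""
    if cache_size <= 0 or days_elapsed <= 0:
        raise ValueError("need a positive cache size and elapsed time")
    growth_per_day = cache_size / days_elapsed   # assumed constant growth rate
    remaining = max(max_cache_size - cache_size, 0)
    return remaining / growth_per_day

# Figures taken from the last 'ip route cache print' output in this post.
remaining_days = days_until_full(120995, 1048576, 1.0)
print(f"about {remaining_days:.1f} more days until the cache is full")
```

Re-running this with a few readings taken some hours apart would show whether the growth rate is actually constant or accelerating.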

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Wed Jan 02, 2013 2:25 pm
by mrz
Please contact support at mikrotik.com with supout file generated at the time when route cache is full.

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Thu Nov 28, 2013 11:13 pm
by ujin
This issue can be reproduced on RouterOS 6.3, 6.4, 6.5 and 6.6 as well!!!

I have already contacted support and sent a supout file, but every time the support team replies asking me to update/upgrade RouterOS again and again. They always ask me to send a supout file from the newest RouterOS and have not provided any real fix yet!!!!

I use different versions of RouterOS for different customer network projects and observe this bug on all of them (6.1 - 6.6) on the following platforms: mipsbe, tile, x86.

Dear MikroTik support, please do not send me and other engineers the standard messages about upgrading the system, but ADDRESS THIS ISSUE instead. This issue BLOCKS our customers' network operability, and we would need to stop using MikroTik in our projects (for your information, that is currently more than a thousand devices).

Thank you for your understanding.

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Fri Nov 29, 2013 12:44 am
by farzin
Hi, I don't know the exact cause of these issues, but I have seen this issue on some versions, it gets fixed in later versions, and then it appears again in another update! I have somehow gotten used to it!

For example, in my current MikroTik version 6.4 I don't have this issue, and on some version 5 releases like 5.24 it works fine. But on some other versions, like the RC, the bug exists.

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Fri Nov 29, 2013 1:52 pm
by mitja2847
Hi, we have the same problem on 386 ROS running v6.1. The last incident was just yesterday. :(

I have noticed that you can accelerate this problem if you add a log rule to the firewall and a lot of packets are caught by that rule. MikroTik, I hope this helps you solve the problem soon...

Can anybody confirm a bug-free 6.x version?

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Mon Dec 02, 2013 12:17 pm
by ujin
This bug is not fixed in any version, including 6.6.
It also does not depend on the firewall, but on the configuration of dynamic routing (OSPF).

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Tue Jan 21, 2014 6:23 pm
by rmmccann
I just learned today that I am suffering from this bug as well.

I'm on x86, ROS v6.7. I am not using any Dynamic Routing Protocols. I have some EoIP tunnels, SSTP, PPTP and static routes but that is about it. I have an IP6to4 tunnel as well to tunnelbroker.

I can watch my route cache fill up by the minute. The router at the opposite end of my network is identical and does not have this problem, but it also does not have SSTP, PPTP or a 6to4 tunnel running.

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Tue Jan 21, 2014 8:52 pm
by rmmccann
I have narrowed my problem down to my 6to4 tunnel with Tunnelbroker.net. If I disable the tunnel interface the route cache is instantly purged and back to more reasonable levels. I believe it has to do with me having unblocked IRC access on my tunnels, as once I requested they reblock IRC I am no longer watching the route cache continue to increase. I am going to watch it over the next few days and see if this trend continues or if it is just coincidence.

Re: 5.x routing cache bug (?) - dropped packets, lost networ

Posted: Mon Mar 10, 2014 2:53 am
by timk
I am hitting this bug too, running RouterOS 6.10.

I have three RB2011s with similar configs, however the problematic one is running an L2TP/IPSec VPN and the other two are not.

More details here:
http://forum.mikrotik.com/viewtopic.php ... 40#p413940

Cheers