
 
Fraction
Frequent Visitor
Topic Author
Posts: 82
Joined: Wed Jan 16, 2013 9:42 pm
Location: Helsinki, Finland

RB2011UAS-2HnD stops responding spontaneously

Mon Mar 17, 2014 2:56 pm

Hi,

I have been suffering from a quite annoying problem with an RB2011UAS-2HnD (ROS 6.10) for the last few weeks. It first happened on the same day that I upgraded from ROS 6.9 to 6.10, so I'm not sure whether it is related to the ROS version, the RB itself, or something else. I didn't change my configuration at that time (it has been working almost untouched for over a year now).

Anyway, my problem is that my RB occasionally stops responding to any network traffic (including a ping to the localhost address from the device itself). This happens maybe once a week, and a reboot resolves the issue until the next time.

When the device is jammed I can connect to it via serial and it seems to be working as expected, but a ping to even its own localhost address only returns: "132 (No buffer space...".

Because the device is in an unmanned location, I set up the Watchdog timer to watch address 127.0.0.1. It works as a kind of workaround: it reboots the device and soon everything is working again.

Am I the only one suffering from this? I found some quite old threads with the same symptoms but no actual resolutions.
 
makuk66
just joined
Posts: 2
Joined: Fri Dec 20, 2013 6:13 pm

Re: RB2011UAS-2HnD stops responding spontaneously

Tue Mar 18, 2014 12:20 pm

I have an RB2011UiAS-RM that seems to lock up regularly, with 6.10 and an older version before that. Without the watchdog, it ends up not routing or allowing login and the touch-screen becomes unresponsive, but the ethernet ports still flash. With the watchdog it duly resets. This seems to happen at least once a day, sometimes more often. Monitoring health shows OK temps and voltage, monitoring the OS shows free memory and a partly idle CPU, and the logs show nothing particularly dubious. I'm currently experimenting with excluding functionality (I was using Traffic Flow and Web Proxy) to see if I can make it more stable. I'm not set up for serial right this minute, but that's a good idea.

Any other suggestions appreciated.
 
JanezFord
Member Candidate
Posts: 263
Joined: Wed May 23, 2012 10:58 am

Re: RB2011UAS-2HnD stops responding spontaneously

Wed Mar 19, 2014 7:59 pm

Hello,

I have experienced the same issue as Fraction described. The unit was at a remote location and was rebooted by a helper using the LCD display and PIN number. Another time, I believe the same thing happened on one of our RB450G units. All systems run the latest firmware and v6.10.

Edit: The first unit was RB2011UAS-RM.

JF.
 
Majklik
newbie
Posts: 35
Joined: Fri Dec 23, 2011 10:20 pm

Re: RB2011UAS-2HnD stops responding spontaneously

Thu Mar 20, 2014 11:55 am

This problem with "No buffer space available" and stalled IPv4 communication is not specific to ROS 6.10. I see it on some of my routers (RB800, RB450G) across the whole ROS 6 line. If you look at "/ip route cache print" you will see that the cache is full, so IPv4 communication stops (IPv6 keeps working). In my configuration the problem is related to SSTP server operation; if the SSTP server is disabled, I do not have this problem. The cache used to fill up after a few days, but since the update to ROS 6.10 it is full after 12~24 hours. A firewall is configured that limits connections to the SSTP server to allowed addresses only. Only three SSTP clients are connected to the affected routers, and over the routers' whole uptime there are around 20~50 connection attempts made to the SSTP server (port TCP/443).
I have other systems with a similar configuration where I do not see this problem (RB1100AHx2, ROS 6.5); on these systems the SSTP server is not protected by a firewall and there are about 100 clients connected full time.
On the affected systems I use this scheduled script to recover from this state with a reboot:
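# Read the current and maximum size of the IPv4 route cache
:local act [/ip route cache get cache-size]
:local max [/ip route cache get max-cache-size]

# Reboot once 2048 or fewer free cache entries remain
:if (($max-$act)<=2048) do={
  /system reboot
}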
 
JanezFord
Member Candidate
Posts: 263
Joined: Wed May 23, 2012 10:58 am

Re: RB2011UAS-2HnD stops responding spontaneously

Fri Mar 21, 2014 3:23 am

Thanks for sharing the info and your script with us, Majklik. I don't use SSTP on any of the devices that suffered from this issue.

JF.
 
Majklik
newbie
Posts: 35
Joined: Fri Dec 23, 2011 10:20 pm

Re: RB2011UAS-2HnD stops responding spontaneously

Fri Mar 21, 2014 12:06 pm

This route cache problem has been around for a long time, with different ROS versions and different configurations.
It is a pity that ROS does not allow showing the contents of this cache or flushing it. On Linux this can be done with "ip route show cache" and "ip route flush cache".
This problem will probably be solved for good when ROS switches to Linux kernel 3.6, because the route cache was removed from the kernel in that version with this argument:
"The ipv4 routing cache is non-deterministic, performance wise, and is subject to reasonably easy to launch denial of service attacks."
 
uldis
MikroTik Support
Posts: 3425
Joined: Mon May 31, 2004 2:55 pm

Re: RB2011UAS-2HnD stops responding spontaneously

Sun Mar 23, 2014 7:10 pm

Make sure that you all upgrade RouterOS to v6.11 and also upgrade the RouterBoard firmware to the latest available (v3.11 or newer) - that will guarantee that the gigabit ethernet ports don't have temp hangs.
 
JanezFord
Member Candidate
Posts: 263
Joined: Wed May 23, 2012 10:58 am

Re: RB2011UAS-2HnD stops responding spontaneously

Sun Mar 23, 2014 7:59 pm

uldis wrote:
Make sure that you all upgrade RouterOS to v6.11 and also upgrade the RouterBoard firmware to the latest available (v3.11 or newer) - that will guarantee that the gigabit ethernet ports don't have temp hangs.

Hello Uldis

I have already upgraded my devices to the latest software/firmware. This "temp" hang lasted for about 12 hours before I could get someone to reboot my device (remote location)... Until this bug is confirmed fixed, I prefer using the watchdog (pinging the gateway or localhost) or Majklik's script, just to be on the safe side.

JF.
 
timk
just joined
Posts: 14
Joined: Wed Sep 05, 2012 3:33 am

Re: RB2011UAS-2HnD stops responding spontaneously

Tue Mar 25, 2014 6:46 am

I have also hit this problem on an RB2011 running L2TP/IPSec VPN server as described here:
http://forum.mikrotik.com/viewtopic.php?f=14&t=78107

I obtained this info via the serial console, all other networking is unavailable:
uptime: 2d20h16m
version: 6.10

cache-size: 16384
max-cache-size: 16384
Rolling back to 5.25 fixed the issue. I haven't had a chance to test 6.11, but MikroTik support said they could re-create my issue and would try to have the fix included in it.

Cheers
 
Majklik
newbie
Posts: 35
Joined: Fri Dec 23, 2011 10:20 pm

Re: RB2011UAS-2HnD stops responding spontaneously

Tue Mar 25, 2014 9:07 am

Yes, I still have this situation with ROS 6.11 on the RB800 and RB450G: no more than one day of uptime. Disabling SSTP eliminates the problem. But many other services are in use, so this bug is probably a combination with something else (VRRP on all interfaces, bonding, bridges, VLANs, GRE/IPsec, SIT tunnels, OSPFv2/v3, BGP).
 
NathanA
Forum Veteran
Posts: 801
Joined: Tue Aug 03, 2004 9:01 am

Re: RB2011UAS-2HnD stops responding spontaneously

Tue Mar 25, 2014 3:48 pm

So, I have run into this bug recently, too. This post should probably be re-titled and moved to a different sub-forum, because this is not an RB2011 problem or even a RouterBoard problem. We are experiencing this bug with x86 RouterOS, too.

Ever since upgrading one of our x86 boxes to 6.11 from 5.26, the router has stopped responding once a day on account of this bug, and has needed to be rebooted. We are not running SSTP on this box, although we do run an L2TP server on it. I am not 100% convinced that L2TP is what is causing the Linux route cache to balloon in our case, though: usually after a reboot, the route cache size stays pretty low and doesn't change much, even if I rapidly make several L2TP connections/disconnections to it in the span of a few seconds.

I once was able to catch it in the act of failing, though, before it had completely done so: I logged into the box, peeked at the route cache, and it was increasing at the rate of about 20-30 per second, and sometimes faster (100+). In the span of a few minutes, it had reached 24K entries out of a maximum (on this box) of 32K. Nothing that I attempted to do managed to stop the rate of growth, and this is on a box that has a pretty simple configuration, has a very small routing table, and doesn't participate in any dynamic route exchange/forwarding protocols. So it was very strange to see this behavior. After a reboot, the route cache got up to a little > 100, and then stayed there. Part of me wonders if it is a bot mounting some kind of DoS/intentional route cache poisoning attack on vulnerable Linux boxes.
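To put those numbers in perspective, here is a rough back-of-the-envelope sketch of how long the cache would take to fill. It assumes that "24K" and "32K" mean 24576 and 32768 entries and that growth stays steady at about 25 entries per second; none of that is measured precisely in this post, and the function name seconds_until_full is made up for illustration.

```shell
#!/bin/sh
# seconds_until_full CUR MAX RATE
# Prints the whole number of seconds until a cache that currently holds
# CUR entries reaches MAX entries, assuming it grows by a steady RATE
# entries per second. All three values are plain decimal integers.
seconds_until_full() {
    cur=$1
    max=$2
    rate=$3
    if [ "$rate" -le 0 ]; then
        echo "rate must be a positive integer" >&2
        return 1
    fi
    if [ "$cur" -ge "$max" ]; then
        # Already full: nothing left to wait for.
        echo 0
        return 0
    fi
    # Integer division, so the result is rounded down.
    echo $(( ($max - $cur) / $rate ))
}
```

With the assumed figures, `seconds_until_full 24576 32768 25` prints 327, i.e. roughly five and a half minutes of headroom, and at 100 entries per second it drops to 81 seconds.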

I don't know yet what causes the route cache to go into a tailspin, but since this router's utility is so small and isn't configured to do much, I'm hopeful that if I try to replicate in a lab environment, I will eventually be able to find the trigger. I'll file a report with MikroTik if I am able to do so. Hopefully they'll be able to do something about it, even if the problem ends up being in Linux itself rather than anything RouterOS-specific.

Very much looking forward to RouterOS 7, which will surely use a version of the Linux kernel >= 3.6...

-- Nathan

P.S. -- Not sure whether it is wise to mention this or not, but I did at least run across a workaround. I'm not sure what the possible negative effects and implications of this workaround might be, but if you can gain access to the 'devel' account on your specific router (...that's as much as I will say about that...), you can both manually flush the Linux IPv4 route cache as well as tweak the Linux route cache settings to auto-flush old entries at a much faster rate, which should prevent the cache from reaching max-cache-size once it starts going crazy.

This shell command will flush the cache:

echo 1 > /proc/sys/net/ipv4/route/flush

These two commands will dramatically increase the rate at which the route cache garbage collector expires entries, which should help it keep up when the growth rate decides to spontaneously explode:

echo 5 > /proc/sys/net/ipv4/route/gc_interval
echo 5 > /proc/sys/net/ipv4/route/gc_timeout

These changes are not permanent, and will revert to default settings (60 for gc_interval, 300 for gc_timeout) when you reboot the router.
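The commands above could be wrapped in a few small shell functions, for example like this. This is only a sketch: the function names and the ROUTE_DIR variable are made up for illustration, and it assumes a shell that supports functions and the ${VAR:-default} expansion, which has not been checked against the busybox build that ships with RouterOS. Pointing ROUTE_DIR at a scratch directory lets you try it without touching the real settings.

```shell
#!/bin/sh
# Directory holding the IPv4 route cache tunables. It can be overridden
# (e.g. ROUTE_DIR=/tmp/test) to try the functions on ordinary files.
ROUTE_DIR=${ROUTE_DIR:-/proc/sys/net/ipv4/route}

# tune_route_gc [SECONDS]
# Writes SECONDS (default 5) to both gc_interval and gc_timeout.
tune_route_gc() {
    secs=${1:-5}
    echo "$secs" > "$ROUTE_DIR/gc_interval" &&
    echo "$secs" > "$ROUTE_DIR/gc_timeout"
}

# flush_route_cache
# Asks the kernel for an immediate flush of the route cache.
flush_route_cache() {
    echo 1 > "$ROUTE_DIR/flush"
}

# restore_route_gc
# Puts back the default values mentioned in this post (60 and 300),
# which is also what a reboot would do.
restore_route_gc() {
    echo 60 > "$ROUTE_DIR/gc_interval" &&
    echo 300 > "$ROUTE_DIR/gc_timeout"
}
```

Running `tune_route_gc 5 && flush_route_cache` is then equivalent to the three echo commands above, and `restore_route_gc` undoes the tuning without a reboot.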

EDIT: This workaround turns out not to always be effective; see my next post in this thread for details.
Last edited by NathanA on Wed Mar 26, 2014 6:42 am, edited 2 times in total.
 
timk
just joined
Posts: 14
Joined: Wed Sep 05, 2012 3:33 am

Re: RB2011UAS-2HnD stops responding spontaneously

Tue Mar 25, 2014 10:51 pm

Nice discovery Nathan!

Have you tried the '-C' option to the Linux route command within the devel login? It would be interesting to see what all the entries are!

Cheers
 
NathanA
Forum Veteran
Posts: 801
Joined: Tue Aug 03, 2004 9:01 am

Re: RB2011UAS-2HnD stops responding spontaneously

Tue Mar 25, 2014 11:47 pm

timk wrote:
Have you tried the '-C' option to the Linux route command within the devel login? It would be interesting to see what all the entries are!

Neither the 'ip' nor 'route' commands appear to be part of the busybox binary that MikroTik ships with RouterOS, which is why I am interacting with the 'proc' virtual filesystem directly. I have plans to build and try a more complete, statically-linked busybox binary that includes the 'ip' command, though I have not checked yet to see how complete busybox's version of 'ip' is.

-- Nathan

EDIT: Update:

I now have a statically-linked busybox binary that includes both 'ip' and 'route'. The busybox version of 'route' doesn't support the -C parameter, but 'ip route show cache' does work, and if I run that on the box in question while the route cache size is going berserk, it doesn't show anything abnormal: route cache size says 10000+ entries, but 'ip route show cache' only shows the 5 or so routes that I would expect to see on this particular router. So that's interesting.
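For anyone wanting to cross-check the counter against what 'ip route show cache' lists: on stock Linux kernels older than 3.6, /proc/net/stat/rt_cache reports the number of cache entries as the first hexadecimal field of each per-CPU row. Whether that file is exposed on a RouterOS box, and whether sed and awk are available there, are assumptions that nothing in this thread confirms. A minimal sketch (the function name route_cache_entries is made up for illustration):

```shell
#!/bin/sh
# route_cache_entries [FILE]
# Prints the route cache entry count in decimal. FILE defaults to
# /proc/net/stat/rt_cache; line 1 of that file is a header and every
# following line is one CPU's row, starting with the hex "entries" field.
route_cache_entries() {
    stat_file=${1:-/proc/net/stat/rt_cache}
    # Take the first field of the first data row (line 2).
    hex=$(sed -n '2p' "$stat_file" | awk '{ print $1 }')
    if [ -z "$hex" ]; then
        echo "no data row found in $stat_file" >&2
        return 1
    fi
    # Convert the hexadecimal counter to decimal.
    echo $(( 0x$hex ))
}
```

Sampling this once a second and comparing it with the number of lines from 'ip route show cache' would show how far the two views drift apart.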

The problem (well, at least, my problem) is definitely related to MikroTik's new PPP code, though. I'll be attempting to put together a step-by-step method of reproducing the bug once I have it 100% nailed down, but right now it appears to be triggered if you are running a PPP-based server (PPTP, L2TP, SSTP, maybe even PPPoE, etc.) and you have several connections to it go up and down. Eventually, after one of the PPPs disconnects, it seems like there might be some kind of race condition that occurs when it tries to tear the PPP interface down. 'dmesg' output shows a huge number of messages like this on my router shortly after the route cache starts spinning out of control:
unregister_netdevice: waiting for ppp1 to become free. Usage count = 1
unregister_netdevice: waiting for ppp1 to become free. Usage count = 1
unregister_netdevice: waiting for ppp1 to become free. Usage count = 1
unregister_netdevice: waiting for ppp1 to become free. Usage count = 1
[...]
...and something just keeps repeating that message over and over again. This is despite the fact that 'ppp1' as an interface no longer exists:
# busybox ifconfig ppp1
ifconfig: ppp1: error fetching interface information: Device not found
...so it's trying to unregister a device that doesn't exist?
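To see at a glance which interfaces are stuck and how often the message repeats, the kernel log can be summarised per interface. The sketch below assumes awk and sort are available (for instance from a fuller busybox build like the one mentioned above) and that the messages have the same wording as the lines quoted here; the function name count_stuck_ifaces is made up for illustration.

```shell
#!/bin/sh
# count_stuck_ifaces
# Reads kernel log text on stdin and prints "<interface> <count>" for
# every interface named in an "unregister_netdevice: waiting for ..."
# line, sorted by interface name.
count_stuck_ifaces() {
    awk '
        /unregister_netdevice: waiting for/ {
            # The interface name is the word right after "for"; scanning
            # the fields keeps this working with a timestamp prefix.
            for (i = 1; i < NF; i++) {
                if ($i == "for") { seen[$(i + 1)]++; break }
            }
        }
        END { for (name in seen) print name, seen[name] }
    ' | sort
}
```

Usage would be something like `dmesg | count_stuck_ifaces`; an interface that keeps climbing in the count while 'ifconfig' says it no longer exists is the one whose teardown is stuck.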

More bad news: once it gets to this state, it appears that it is impossible to flush the cache of the entries being added to it (I assume by PPP?). It would seem that something in the PPP subsystem has a lock on those entries and they can't be freed. Tweaking the route cache garbage collector values doesn't make a difference, either...the number doesn't go down. I know that those proc/sysctl values actually work because I tested them on a MikroTik with a fairly large route cache, but one that wasn't spiraling out of control, and was able to successfully flush the cache and visibly see the garbage collector behavior change. Once this particular bug is triggered, however, the only thing that can cure it is a reboot. I even tried killing the ppp and ppp-worker processes, but although they cleanly exited after being sent SIGTERM, the route cache remained bloated and the "unregister_netdevice" console errors continued apace.

Finally, it may have something to do with MPPE. I notice that after the problem starts, the 'ppp_mppe' kernel module shows that it is in use by something and cannot be unloaded, even after I have terminated all PPP tunnels and shut down PPP services:
# lsmod | busybox grep mppe
ppp_mppe 5585 6 - Live 0x90d5d000
This might just be another symptom, though, rather than a cause, especially if people are also running into this problem with SSTP, which should have no use for MPPE.

EDIT 2: Well, now this is interesting. I think I have managed to find a way to reproduce a version of this bug, but now that I've done so, I tried manually flushing the route cache again, and this time doing so has an effect. It will continue to grow and grow on its own even after a flush, but executing the flush actually works this time and clears the cache (temporarily). Weird.
 
Majklik
newbie
Posts: 35
Joined: Fri Dec 23, 2011 10:20 pm

Re: RB2011UAS-2HnD stops responding spontaneously

Wed Mar 26, 2014 10:03 am

I also think the problem is related to the new PPP package. The route cache problem has been worse since ROS 6.10.
There is another test, which I reported yesterday ([Ticket#2014032566001708]). I have two MetaRouters. One runs an SSTP server with one dead connection (in some configurations the keepalive timeout does not work on the SSTP server side and the server does not close the dead connection; this problem was the primary reason for this test). The second MetaRouter is a client that keeps trying to connect to the server, but the connection fails because the SSTP server allows only one connection. I see the route cache slowly fill up until the server stops responding completely after some hours (if the SSTP server is disabled, or there is only one live connection, the MetaRouter stays up for days). If I leave it in this state, the MetaRouter reboots after a few hours. If the SSTP server is disabled and the connection is closed before the MetaRouter hangs, the cache is flushed after some time.
 
NathanA
Forum Veteran
Posts: 801
Joined: Tue Aug 03, 2004 9:01 am

Re: RB2011UAS-2HnD stops responding spontaneously

Wed Mar 26, 2014 10:18 am

I came up with a similar test, but one that I run with L2TP instead, which I am preparing a description of for MikroTik at this moment. Rather than limiting the connection to 1, however, I purposefully mismatched the encryption requirement between the server and the client: the server requires encryption but the client refuses it. The client tries to rapidly connect to the server over and over again and this quickly causes the scenario that I described in my last post, where something gets "stuck" trying to tear down one of the old pppX interfaces. It also generates several holds on the ppp_mppe kernel module as well. Interestingly, every time the L2TP client tries to connect again, it actually causes the route cache to be flushed. But if you let the L2TP client repeatedly try and fail to connect for 2-3 minutes, and then disable it, the route cache on the server will have a mind of its own and just grow and grow and grow after this.

-- Nathan

EDIT: Actually, I'm beginning to think there are 2 issues: 1) the explosive growth of the route cache when something in the PPP subsystem gets stuck, and 2) the route cache getting into a state where it cannot be flushed any longer. I know how to make #1 happen...that's easy. However, when #1 is happening, the route cache garbage collector seems to be able to keep up with it, so if #1 is happening but #2 is not, you probably still won't see a crash. The real problem happens when #1 is combined with #2, and that's what I experienced when I tried to flush the cache after it started growing and found that I couldn't...the cache size would not go down when I tried a manual flush, and the garbage collector was not doing anything. I don't yet know how to reproduce that state of things, but I have observed it once.
 
iprob
Frequent Visitor
Posts: 66
Joined: Wed Mar 07, 2012 12:44 am

Re: RB2011UAS-2HnD stops responding spontaneously

Thu Apr 03, 2014 7:52 pm

We are seeing this issue with RouterOS x86 machines that have a lot of inbound VPN connections. These are a combination of on-demand L2TP and site-to-site IPSec tunnels. Failure occurs in less than 24 hours.

We've implemented the check for the route cache to automatically reboot.

This issue has been around a long time but clearly was made MUCH worse with the 6.11 release. The 6.11 release is not really usable at this time.
 
iprob
Frequent Visitor
Posts: 66
Joined: Wed Mar 07, 2012 12:44 am

Re: RB2011UAS-2HnD stops responding spontaneously

Thu Apr 03, 2014 9:44 pm

I know it isn't nearly as useful as Nathan's detailed information...but here is an odd scenario that happened.

- The MikroTik running RouterOS x86 crashed about 16 hours after the upgrade
- Rebooted and one L2TP connection was made
- Crashed again within 45 minutes
- Two L2TP connections made (saw the route cache flush happen when the second connection was made). One connection was the same user as the previous session. The second connection was from a different IP/user.
- Route cache memory looks stable at this point (I see it increase and decrease).

I don't have all the detailed tools you are using to capture the data, Nathan. Thanks for your work!
 
iprob
Frequent Visitor
Posts: 66
Joined: Wed Mar 07, 2012 12:44 am

Re: RB2011UAS-2HnD stops responding spontaneously

Fri Apr 04, 2014 1:08 am

Up time was only 5 hours last go around. What version is the suggested downgrade? I'd prefer not to have to go all the way back to 5.24 since the queues are redone for version 6 and all the ipsec configuration scripts had to be updated with "aes-256-cbc" instead of aes-256. I don't know why MikroTik changes simple things like that which break backwards compatibility with configuration scripts.
 
lotnybartek
Frequent Visitor
Posts: 95
Joined: Wed Apr 16, 2014 3:22 pm

Re: RB2011UAS-2HnD stops responding spontaneously

Mon Apr 28, 2014 3:36 pm

Same problem here using an RB2011UAS-2HnD and the latest firmware / software.

It has happened a few times already (I have had this router for 3 weeks), always while L2TP/IPSec clients were connected (at the last crash, 5 clients were connected). I can't ping it and I can't log into it (ssh, telnet, winbox, web). Only a reboot fixes this. Mikrotik - please solve this bug.

For the time being I'll just use Majklik's script.
 
makuk66
just joined
Posts: 2
Joined: Fri Dec 20, 2013 6:13 pm

Re: RB2011UAS-2HnD stops responding spontaneously

Mon Apr 28, 2014 3:46 pm

I upgraded to 6.12 13 days ago and have not seen a reboot since.
 
iprob
Frequent Visitor
Posts: 66
Joined: Wed Mar 07, 2012 12:44 am

Re: RB2011UAS-2HnD stops responding spontaneously

Mon Apr 28, 2014 3:52 pm

Have you tried version 6.12? I haven't seen the issue yet with 6.12 although I still leave the automatic reboot scripts in place.
 
lotnybartek
Frequent Visitor
Posts: 95
Joined: Wed Apr 16, 2014 3:22 pm

Re: RB2011UAS-2HnD stops responding spontaneously

Mon Apr 28, 2014 4:30 pm

Yes, I have 6.12 / 3.14.
 
iprob
Frequent Visitor
Posts: 66
Joined: Wed Mar 07, 2012 12:44 am

Re: RB2011UAS-2HnD stops responding spontaneously

Mon Apr 28, 2014 4:36 pm

If you are seeing the same "No buffer space available" then I would recommend contacting support. They reported this bug as fixed in 6.12 and I haven't seen it so far on any of the 26 routers we upgraded to 6.12. Unfortunately, you'll need to be on the router to verify that the problem is the buffer space. I was able to do this pretty easily with my x86 VM's since I could still connect to the console via the VM manager. I couldn't do that with the RB951 models we have because they were remote. You could also try monitoring the route cache available with a script and writing out the values to a persistent file on the routerboard so you can at least get those statistics and read them after the reboot.

Sorry I can't be of more help.
 
Majklik
newbie
Posts: 35
Joined: Fri Dec 23, 2011 10:20 pm

Re: RB2011UAS-2HnD stops responding spontaneously

Mon Apr 28, 2014 4:57 pm

I still see this problem with ROS 6.12 on my routers. But I use SSTP; the changelog for 6.12 mentions only L2TP.
This problem is not PPP-specific. A "full route cache" can come from other places too, because I have this problem on RB1100AH/AHx2 routers (with ROS 6.7) where PPP is not used, after 100~150 days of uptime - maybe related to the GRE/IPsec tunnels.
 
iprob
Frequent Visitor
Posts: 66
Joined: Wed Mar 07, 2012 12:44 am

Re: RB2011UAS-2HnD stops responding spontaneously

Mon Apr 28, 2014 5:02 pm

Good point, I only saw the issue when using L2TP. I never did see the issue with site-to-site IPSec tunnels, I'm not running any ppp on any "core" BGP routers and I don't use SSTP.
 
kazi33
just joined
Posts: 5
Joined: Tue Apr 22, 2014 9:08 pm

Re: RB2011UAS-2HnD stops responding spontaneously

Mon Apr 28, 2014 6:42 pm

JanezFord wrote:
uldis wrote:
Make sure that you all upgrade RouterOS to v6.11 and also upgrade the RouterBoard firmware to the latest available (v3.11 or newer) - that will guarantee that the gigabit ethernet ports don't have temp hangs.

Hello Uldis

I have already upgraded my devices to the latest software/firmware. This "temp" hang lasted for about 12 hours before I could get someone to reboot my device (remote location)... Until this bug is confirmed fixed, I prefer using the watchdog (pinging the gateway or localhost) or Majklik's script, just to be on the safe side.

JF.

Hi JF,
I see exactly the same issue on my router. It's running 6.7, though. It hangs for 12 hours or so every 4-5 days and then comes back up. Did you have to change something in your configuration to avoid this, or are you still suffering from this ROS issue?

Kz
 
lotnybartek
Frequent Visitor
Posts: 95
Joined: Wed Apr 16, 2014 3:22 pm

Re: RB2011UAS-2HnD stops responding spontaneously

Mon Apr 28, 2014 11:21 pm

:local act [/ip route cache get cache-size]
:local max [/ip route cache get max-cache-size]

# print some debug info
:log info ("Actual route cache size: $act")
:log info ("Max. route cache size: $max")
:log info ("If active route cache size: $act>=14336 reboot required")

:if (($max-$act)<=2048) do={
  /system reboot
}
Original script written by Majklik.

I just added some log output so you'd know a little earlier that a reboot is coming soon.

This version is for RB2011UAS-2HnD with "Max. route cache size: 16384".

I'm a newbie so correct it if there is something wrong.

Mikrotik - please fix this VERY annoying issue.
 
thayward
just joined
Posts: 1
Joined: Thu Apr 24, 2014 8:55 pm

Re: RB2011UAS-2HnD stops responding spontaneously

Thu May 01, 2014 12:56 am

We're still seeing this ballooning route cache issue on 6.12 on routers utilizing ipip tunnels.
 
mcooper06
just joined
Posts: 21
Joined: Sat Mar 23, 2013 7:39 pm

Re: RB2011UAS-2HnD stops responding spontaneously

Fri May 02, 2014 6:36 pm

We are on an RB2011UAS using 6.12 and firmware 3.14 - it is still occurring here.

We have this machine set up to run the following:

L2TP over IPSec for Road Warriors
Site to Site IPSec

I am planning on downgrading to 6.9 today if possible.
 
mcooper06
just joined
Posts: 21
Joined: Sat Mar 23, 2013 7:39 pm

Re: RB2011UAS-2HnD stops responding spontaneously

Mon May 05, 2014 8:29 pm

I downgraded to 6.9 and the issue persisted - I checked and my firmware was still 3.14 showing an upgrade available to 3.10. I applied the firmware (I assume 3.10 is the latest for use with 6.9) and rebooted. More info to follow.

Michael
 
lotnybartek
Frequent Visitor
Posts: 95
Joined: Wed Apr 16, 2014 3:22 pm

Re: RB2011UAS-2HnD stops responding spontaneously

Mon May 05, 2014 11:00 pm

Apparently this issue has been fixed in 6.13. Since yesterday, all clients (6 clients using L2TP/IPSec) have been connected. Today the cache size is 56. Normally it would be somewhere between 2k and 4k.
 
Fraction
Frequent Visitor
Topic Author
Posts: 82
Joined: Wed Jan 16, 2013 9:42 pm
Location: Helsinki, Finland

Re: RB2011UAS-2HnD stops responding spontaneously

Tue May 20, 2014 6:28 pm

lotnybartek wrote:
Apparently this issue has been fixed in 6.13. Since yesterday, all clients (6 clients using L2TP/IPSec) have been connected. Today the cache size is 56. Normally it would be somewhere between 2k and 4k.

It has not happened for me either with 6.13, so this looks promising!
 
iprob
Frequent Visitor
Posts: 66
Joined: Wed Mar 07, 2012 12:44 am

Re: RB2011UAS-2HnD stops responding spontaneously

Tue May 20, 2014 7:13 pm

We opened up a support ticket about this issue and they indicated it is fixed in 6.13. We're only now beginning to roll it out so I don't have any definitive results yet.
 
lotnybartek
Frequent Visitor
Posts: 95
Joined: Wed Apr 16, 2014 3:22 pm

Re: RB2011UAS-2HnD stops responding spontaneously

Tue May 20, 2014 8:31 pm

So I can only confirm: this one is fixed in 6.13. No problems for a couple of days.

Thank you Mikrotik ;-)
 
kazi33
just joined
Posts: 5
Joined: Tue Apr 22, 2014 9:08 pm

Re: RB2011UAS-2HnD stops responding spontaneously

Sun Jun 01, 2014 12:16 pm

I just filed a support ticket, Ticket#2014060166000098. My RB2011UiAS-IN falls into a reboot loop after a power cycle. It recovers after 2-3 hours. One router has been rebooting for a few days. The LCD shows:
-Loading kernel from nand
-Starting services
After around 30 seconds it falls into the same loop.
I saw this issue with 6.12 and was hoping it would be fixed in 6.13, but in my case the upgrade to 6.13 did not help.
 
jarda
Forum Guru
Posts: 7604
Joined: Mon Oct 22, 2012 4:46 pm

Re: RB2011UAS-2HnD stops responding spontaneously

Sun Jun 01, 2014 4:15 pm

Kazi33, have you tried netinstall? If not, try it.
 
iprob
Frequent Visitor
Posts: 66
Joined: Wed Mar 07, 2012 12:44 am

Re: RB2011UAS-2HnD stops responding spontaneously

Sun Jun 01, 2014 4:22 pm

I've upgraded several routers to 6.13 and haven't had an issue and the failures related to route cache have stopped. We only have one open bug. That is in a scenario with dual ISP setups and marking PPTP packets. L2TP/IPSec was fixed in 6.13 and PPTP is expected to be fixed in 6.14.

I agree with jarda, if you're having that reboot loop then try a netinstall and restore the config. I haven't seen that issue on any of the hardware we've upgraded (RB2011, RB751, RB951 and x86).
 
kazi33
just joined
Posts: 5
Joined: Tue Apr 22, 2014 9:08 pm

Re: RB2011UAS-2HnD stops responding spontaneously

Thu Jun 05, 2014 7:24 pm

Sorry, I did not get email notifications for either of your postings above for the past few days. The problem was having the "Dude" package installed on the MikroTik RB2011UiAS-IN.
After I removed it using "system package uninstall dude" and rebooted the router, the problem disappeared. This happened with their pre-release build 6.14c25 as well. I think our IT team installed Dude on the MikroTik for monitoring purposes. It's like a poison pill on the MikroTik: after the 2nd or 3rd reboot, it falls into a reboot loop when "dude" is running as a service.
I was in Aruba test engineering for about 8 years and we used to fix this kind of issue ASAP in the next release. I'm not sure how seriously MikroTik takes this issue. If a package acts like a poison pill, the software should block its installation on the router.

To deal with this:

Check dude:
>store print
Flags: X - disabled, A - active
# NAME TYPE DISK S
0 A dude1 dude system a

To uninstall dude:
[admin@ccc] > system package uninstall dude
[admin@ccc] > system package print
Flags: X - disabled
# NAME VERSION SCHEDULED
0 advanced-tools 6.13
1 dude 4.0beta3 scheduled for uninstall
2 X wireless 6.13
 
jarda
Forum Guru
Posts: 7604
Joined: Mon Oct 22, 2012 4:46 pm

Re: RB2011UAS-2HnD stops responding spontaneously

Fri Jun 06, 2014 5:11 am

Welcome to the club!
 
Pengu1n
just joined
Posts: 5
Joined: Tue Jan 31, 2012 12:24 am

Re: RB2011UAS-2HnD stops responding spontaneously

Mon Jul 07, 2014 10:36 am

Hello
Same problem with route-cache on RB1100AHx2.
ROS v.6.15, fw 6.10
I am attaching a memory usage graph. When usage reaches about 400MB, the router stops responding to IPv4 and can be accessed only by MAC telnet.
The router is actively used for site-to-site and L2TP VPNs.
graph1.JPG
 
mvalsasna
just joined
Posts: 17
Joined: Tue Sep 30, 2014 2:10 pm

Re: RB2011UAS-2HnD stops responding spontaneously

Tue Jan 13, 2015 3:47 pm

just happened again with 6.21.1 on RB433AH

/ip route cache print
cache-size: 16384
max-cache-size: 16384

no ppp, one IPSEC tunnel and one VRRP instance
 
Kraken2k
Frequent Visitor
Posts: 53
Joined: Wed Oct 01, 2014 1:50 pm
Location: Prague

Re: RB2011UAS-2HnD stops responding spontaneously

Thu Jan 07, 2016 1:12 pm

Solved this issue finally! (tested on version 6.32.1)

I have had these problems since upgrading from 6.25 to the newest version (at the last incident it was 6.30.2) on an RB1100AHx2. After a few days the router stopped responding: it was still running and reacted to cable connect/disconnect, but gave no response on the ethernet ports.

I was able to connect using the serial port and found everything running as expected, just with no response to network traffic. When I attempted to ping local addresses or even 127.0.0.1, I got the error message "No buffer space available". Later I found that an issue with the same symptoms had existed in the past and was claimed to be already fixed (route cache overflow).

The interesting thing is that we have two RB1100AHx2 routers with the same configuration (just a few different IP addresses), but the second one is just a backup with little traffic and deactivated IPsec tunnels - that one works without any issue.

(Months with daily reboots passed)

Yesterday, I finally managed to resolve this issue on the router with ~20 IPsec tunnels.

tl;dr version: Guess what... it was solved by turning the ip cache feature back on.

This setting had no effect in version 6.30.2. When I opened the ticket back then, MT support advised me to "turn the IP cache feature off", but that had no effect, and the setting stayed that way.

But turning on the route cache (/ip settings set route-cache=yes) in 6.32.1 (I have not tested later versions yet) actually forces the cache to work as it should. However, if you change it on a running system, the change affects only cache records created from that point on; cache entries created before you turned the route cache feature on stay there forever, until the router is restarted.

It almost looks like IPsec tunnels use the route cache regardless of the cache on/off setting, but if the cache is turned off in IP settings, nothing manages the records in the cache any more, so it eventually overflows, causing all IPv4 traffic to stop. Turning the cache feature on forces all records to be managed by the regular cache algorithm, so it works as it should.
 
whitbread
Member Candidate
Posts: 108
Joined: Fri Nov 08, 2013 9:55 pm

Re: RB2011UAS-2HnD stops responding spontaneously

Sat Jan 09, 2016 4:12 pm

Kraken2k wrote:
Solved this issue finally! (tested on version 6.32.1)
...
it was solved by turning the ip cache feature back on.
...

I can confirm that enabling /ip settings route-cache works on the RB750GL and RB951G-2HnD, both on Rel. 6.34rc34 :D
