marlowbg
newbie
Topic Author
Posts: 33
Joined: Wed Oct 06, 2010 4:23 pm

Total PPPoE crashing with CCR1072 and x86 - With all RouterOS 6.3x.xx versions available

Wed Dec 23, 2015 4:38 pm

We have been using a CCR1072 for a few days, and as soon as we connected around 100-200 users over PPPoE we started experiencing strange problems:

1. After a few hours of normal operation, most of the users start disconnecting and reconnecting every few minutes or seconds.
2. Some of the users show up as logged in twice, e.g. username "test" and also "test-1" are both logged in (maybe as a result of 1).
3. Almost all dynamic simple queues are red.

The issue occurs after a few hours of normal running, and so far we can't find what triggers it.

We tried both 6.33.3 and 6.32.3, but the issue remains the same.

A reboot of the device solves the issue immediately, so we can't blame the network for the connectivity flapping.

Does anyone have ideas about what the issue could be, or what could be checked further?
Last edited by marlowbg on Sun Jan 17, 2016 5:28 pm, edited 2 times in total.
 
pukkita
Trainer
Posts: 2997
Joined: Wed Dec 04, 2013 11:09 am
Location: Spain

Re: Strange issue with CCR1072 and PPPoE with both 6.33.3 and 6.32.3

Wed Dec 23, 2015 9:18 pm

Are you using local users (secrets) or RADIUS for AAA? What is the "Only one" parameter set to under PPP > Profile > Limits?

Is the CCR firmware up to date?
Simplicity is the Ultimate Sophistication - Da Vinci
Getting the most out of this forum
 
marlowbg
newbie
Topic Author
Posts: 33
Joined: Wed Oct 06, 2010 4:23 pm

Re: Strange issue with CCR1072 and PPPoE with both 6.33.3 and 6.32.3

Wed Dec 23, 2015 10:11 pm

Hi,

Authentication happens through Radius.

In PPPoE Profile, "Only one" option is set to "Default".

Under normal circumstances, when the problem is not triggered, everything runs smoothly, but once the problem starts, nothing but a reboot helps.

I tested versions 6.33.3 and 6.32.3, if this is what you mean by firmware...
 
pukkita
Trainer
Posts: 2997
Joined: Wed Dec 04, 2013 11:09 am
Location: Spain

Re: Strange issue with CCR1072 and PPPoE with both 6.33.3 and 6.32.3

Thu Dec 24, 2015 12:15 pm

I mean System > Routerboard firmware.

It looks like the first PPPoE client connection ends up in a "stale" state, and the client then establishes a second connection.

It could be due to some sort of Layer 2 problem, not related to the CCR, or to the CCR itself.

Check firmware is up to date.
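
For reference, the same check can be done from the console instead of System > Routerboard. This is a minimal sketch, assuming a RouterOS 6.x RouterBOARD device; new firmware only becomes active after a reboot, so plan it for a maintenance window.

```routeros
# Show the current firmware and the version available for upgrade
/system routerboard print
# If upgrade-firmware is newer than current-firmware, flash it (asks for confirmation)
/system routerboard upgrade
# The new firmware is only loaded on the next boot
/system reboot
```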
 
marlowbg
newbie
Topic Author
Posts: 33
Joined: Wed Oct 06, 2010 4:23 pm

Re: Strange issue with CCR1072 and PPPoE with both 6.33.3 and 6.32.3

Thu Dec 24, 2015 12:39 pm

Hi,

firmware is 3.27

The 200 customers that are connecting are split across 5 different VLANs, and when the issue starts, all users on all VLANs experience it. When I reboot the CCR1072, it is resolved immediately.

This is the reason I suspect the CCR1072, and not something else.
 
pukkita
Trainer
Posts: 2997
Joined: Wed Dec 04, 2013 11:09 am
Location: Spain

Re: Strange issue with CCR1072 and PPPoE with both 6.33.3 and 6.32.3

Thu Dec 24, 2015 12:43 pm

If that's the case, then contact Mikrotik Support.

You should first export the configuration, reset to no defaults, then reload the configuration. You may also be asked to do a Netinstall, so if you can, do it and test whether the problem persists afterwards.

If it does, try to generate a supout file when no issues are happening, and another when issues are present, then submit both with a detailed explanation of the setup (switches connected to it, etc) to support.
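
The steps above could look roughly like this on the console. This is only a sketch for RouterOS 6.x: the file names (config-before-reset, supout-healthy, supout-during-issue) are made up for illustration, and the exported file should be copied off the router before the reset.

```routeros
# 1. Export the running configuration to config-before-reset.rsc (hypothetical name)
/export file=config-before-reset
# 2. After downloading the file, reset without loading the default configuration
/system reset-configuration no-defaults=yes skip-backup=yes
# 3. Once the router is back, upload the file if needed and re-import it
/import file-name=config-before-reset.rsc
# 4. Create one supout while everything is healthy...
/system sup-output name=supout-healthy.rif
# ...and another one while the issue is actually happening
/system sup-output name=supout-during-issue.rif
```

Both .rif files can then be downloaded from Files and attached to the support ticket together with the description of the setup.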
 
marlowbg
newbie
Topic Author
Posts: 33
Joined: Wed Oct 06, 2010 4:23 pm

Re: Strange issue with CCR1072 and PPPoE with both 6.33.3 and 6.32.3

Sat Dec 26, 2015 12:49 pm

I have even installed one more NAS, based on the x86 platform, but the same problem occurred again after around 30 hours of running.

I'm now running the two NASes (one CCR1072 and one x86) in parallel and have created a supout on both while everything is fine; when the issue occurs I will create a second one.

No other options for now....
 
pukkita
Trainer
Posts: 2997
Joined: Wed Dec 04, 2013 11:09 am
Location: Spain

Re: Strange issue with CCR1072 and PPPoE with both 6.33.3 and 6.32.3

Sat Dec 26, 2015 2:53 pm

marlowbg wrote: "I have even installed 1 more NAS that is based on x86 platform, but again same problem occurred after around 30 hours of running."

Which ROS version? Where is it connected?

I'm afraid that could point to L2 problems further down your network, not related to the CCR/x86...

What's behind the CCR/x86? Network topology?
 
marlowbg
newbie
Topic Author
Posts: 33
Joined: Wed Oct 06, 2010 4:23 pm

Re: Strange issue with CCR1072 and PPPoE with both 6.33.3 and 6.32.3

Sat Dec 26, 2015 3:41 pm

Problem happened on 6.33.3 on the x86 NAS, now I have downgraded to 6.32.3

Network is simple.

There is one data center, where the 2 NASes are connected to a Huawei S6700 switch.

Some of the users are directly connected to the same Huawei switch, on a VLAN.

The other users are on VLANs that run over point-to-point links provided by telcos to distant areas.
 
pukkita
Trainer
Posts: 2997
Joined: Wed Dec 04, 2013 11:09 am
Location: Spain

Re: Strange issue with CCR1072 and PPPoE with both 6.33.3 and 6.32.3

Sat Dec 26, 2015 3:50 pm

And are you sure this isn't a "network glitch" on your provider's network or the Huawei switch? Try to come up with a test to prove or rule that out...

Do you have any sort of HA setup?

Double check physical connections to the core network.

So ALL Vlans are failing?
 
marlowbg
newbie
Topic Author
Posts: 33
Joined: Wed Oct 06, 2010 4:23 pm

Re: Strange issue with CCR1072 and PPPoE with both 6.33.3 and 6.32.3

Sat Dec 26, 2015 3:59 pm

I don't think it's a network-related issue, because:

1. Some of the users (50% of them, connected directly to the Huawei switch through fiber) were previously running on another NAS (Linux-based), also over PPPoE, and had no issues of this kind.

2. On reboot of the Mikrotik, the issue is immediately resolved. Without a reboot, it can continue for hours.

3. When the issue occurs, all VLANs are affected (the one running directly plus those through VPLS).

So it might be something on the network that triggers the issue, but Mikrotik is definitely also to blame given the facts above.

Otherwise I don't have HA (high availability) running.
 
marlowbg
newbie
Topic Author
Posts: 33
Joined: Wed Oct 06, 2010 4:23 pm

Re: Strange issue with CCR1072 and PPPoE with both 6.33.3 and 6.32.3

Mon Jan 04, 2016 9:37 pm

I have opened a case with the Mikrotik support team.

I hope to get a resolution quickly, as it seems to be a software-related issue/bug.
 
marlowbg
newbie
Topic Author
Posts: 33
Joined: Wed Oct 06, 2010 4:23 pm

Re: Strange issue with CCR1072 and PPPoE with both 6.33.3 and 6.32.3

Wed Jan 06, 2016 6:49 pm

Unfortunately there is still no solution for this issue, and I find it strange that nobody else seems to be hitting it.

We suffer from 10-15 occurrences

Support says that the R&D team is working on the issue, but there is no visibility into the progress...
 
marlowbg
newbie
Topic Author
Posts: 33
Joined: Wed Oct 06, 2010 4:23 pm

Re: Strange issue with CCR1072 and PPPoE with both 6.33.3 and 6.32.3

Sat Jan 09, 2016 5:38 pm

As per the last communication from Mikrotik support 2 days ago:

1. The issue is a well-known old bug, but it is unsolved for now.

2. It happens when someone accesses PPP -> Active Connections through Winbox, possibly in combination with some other unknown condition, because not every access to PPP -> Active Connections triggers the issue.

Workaround suggested by them:

1. Do not use Winbox to access the PPP -> Active Connections menu.
2. Use Webfig or the console if we need to see the PPP -> Active Connections menu.

I disabled Winbox access completely from the IP -> Services menu 2 days ago, but unfortunately the same issue occurred again today, so I'm expecting further ideas from them...
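
For anyone who wants to try the same thing from the console, disabling the Winbox service might look like this. It is a sketch for RouterOS 6.x and assumes you still have another way in (SSH, Webfig or serial) before you run it.

```routeros
# Disable the Winbox service (same effect as unticking it under IP -> Services)
/ip service disable winbox
# Check the result: a disabled service is flagged with X
/ip service print
# Re-enable it later if needed
/ip service enable winbox
```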
 
marlowbg
newbie
Topic Author
Posts: 33
Joined: Wed Oct 06, 2010 4:23 pm

Re: Strange issue with CCR1072 and PPPoE with both 6.33.3 and 6.32.3

Sat Jan 16, 2016 3:02 pm

Issue keeps on happening on both CCR1072 and the x86.

Still no positive results and even no recent replies from Mikrotik Support.

What do Mikrotik users do in such situations?
 
vladimirslk
Member Candidate
Posts: 110
Joined: Wed Feb 10, 2010 2:03 am
Location: Estonia, Tallinn

Re: Total PPPoE crashing with CCR1072 and x86 - With all RouterOS 6.xx versions available

Sun Jan 17, 2016 4:57 pm

Downgrade to 6.29.1 :) I have not noticed any issues.
Uptime 200d.
 
chechito
Forum Guru
Posts: 1749
Joined: Sun Aug 24, 2014 3:14 am
Location: Bogota Colombia

Re: Total PPPoE crashing with CCR1072 and x86 - With all RouterOS 6.xx versions available

Sun Jan 17, 2016 8:50 pm

vladimirslk wrote: "downgrade 6.29.1 :) did not noticed any issues. uptime 200d"

6.29.1 is from June 2015. Are you sure it supports the CCR1072?
 
marlowbg
newbie
Topic Author
Posts: 33
Joined: Wed Oct 06, 2010 4:23 pm

Re: Total PPPoE crashing with CCR1072 and x86 - With all RouterOS 6.3x.xx versions available

Mon Jan 18, 2016 4:54 pm

HI,

I'm ready to use only x86 based solution if I have a way to stabilise it.

Why do you think the older version and especially 6.29.1 will resolve my issue?
 
acidsas
newbie
Posts: 35
Joined: Tue May 21, 2013 1:48 pm

Re: Total PPPoE crashing with CCR1072 and x86 - With all RouterOS 6.3x.xx versions available

Mon Jan 18, 2016 6:19 pm

We have had the same experience since the first versions of ROS v6, on x86 and on different CCRs. Accessing the Active tab of the PPP menu in Winbox sometimes crashes PPP connections (PPTP, L2TP, PPPoE, etc.) that use RADIUS authentication. I made numerous supouts and emailed support, but never got a reply stating that it is a long-known bug. Support suggested tuning our RADIUS, making more supouts and so on.
As a workaround I added the missing columns (IP and uptime) to PPP -> Interfaces in Winbox, and the crashing is gone.
To fix connections after you get red simple queues and red dynamic IPs, you can use the following two scripts. This way you don't have to reboot the box after a PPP crash. Also take into account that if you don't use the "Active Connections" tab, there are almost no issues with red simple queues (maybe once in 6 months).
/system scheduler
# Every minute, remove enabled simple queues that have become invalid (shown red)
add interval=1m name=red_simple_remove on-event="/queue simple remove [find invalid=yes disabled=no]" policy=ftp,reboot,read,write,policy,test,password,sniff,sensitive start-time=startup
# Every minute, remove dynamic IP addresses that have become invalid
add interval=1m name=remove_dynamic_invalid_ips on-event="/ip address remove [find dynamic=yes invalid=yes]" policy=ftp,reboot,read,write,policy,test,password,sniff,sensitive start-time=startup
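
Before leaving the scheduler entries to run unattended, it may help to see how many entries they would touch. This read-only sketch reuses the same find filters as the two scripts above and removes nothing; it assumes RouterOS 6.x.

```routeros
# Number of enabled simple queues currently flagged invalid (red)
:put [:len [/queue simple find invalid=yes disabled=no]]
# Number of dynamic IP addresses currently flagged invalid
:put [:len [/ip address find dynamic=yes invalid=yes]]
# Confirm both scheduler entries exist and check their run-count
/system scheduler print detail
```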
 
marlowbg
newbie
Topic Author
Posts: 33
Joined: Wed Oct 06, 2010 4:23 pm

Re: Total PPPoE crashing with CCR1072 and x86 - With all RouterOS 6.3x.xx versions available

Tue Jan 19, 2016 1:55 pm

Hi Dude,

It's good to see that someone else is also aware of this issue.

It's strange to me that the Mikrotik R&D team has not found a way to either fix it or disable the view that triggers it.

In almost all cases, when you use this view on a NAS that has been running for 2-3 days, the issue is triggered, at least in our deployment :)

I also have to mention that their support convinced me the issue happens only through Winbox, but that's not true: when using PPP -> Active Connections through the WebFig interface, the issue is also triggered. It has already happened 2-3 times.

Otherwise, thanks for the scripts; they will most probably help to avoid rebooting the device. I have implemented them and will monitor over the next days.
 
acidsas
newbie
Posts: 35
Joined: Tue May 21, 2013 1:48 pm

Re: Total PPPoE crashing with CCR1072 and x86 - With all RouterOS 6.3x.xx versions available

Tue Jan 19, 2016 1:59 pm

I can tell you even more... /ppp active print in console also triggers the issue. :(
As I've mentioned before, you can use ppp interfaces instead of active.
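
On the console, the equivalent of looking at PPP -> Interfaces rather than Active Connections would be something like the sketch below. It assumes the clients terminate on the built-in PPPoE server of RouterOS 6.x; whether it avoids the trigger in every case is only what is reported in this thread, not something guaranteed.

```routeros
# List the PPPoE server interfaces (one per connected client) with user, remote address and uptime
/interface pppoe-server print
# Only count the connected PPPoE clients
/interface pppoe-server print count-only
```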
 
marlowbg
newbie
Topic Author
Posts: 33
Joined: Wed Oct 06, 2010 4:23 pm

Re: Total PPPoE crashing with CCR1072 and x86 - With all RouterOS 6.3x.xx versions available

Tue Jan 19, 2016 2:17 pm

I suspected that the issue would not be limited to Winbox, so I explicitly asked MT support, and they confirmed it's only Winbox :) Anyway, I hope this topic helps more people learn that it is not limited to Winbox, but also affects the console and Webfig.

Btw,

what ROS version are you using right now, and is it on x86 or CCR?
 
acidsas
newbie
Posts: 35
Joined: Tue May 21, 2013 1:48 pm

Re: Total PPPoE crashing with CCR1072 and x86 - With all RouterOS 6.3x.xx versions available

Tue Jan 19, 2016 2:19 pm

Currently 6.32.3 on all production CCRs and x86s.
 
ethernet
just joined
Posts: 20
Joined: Tue Aug 25, 2009 1:44 am

Re: Total PPPoE crashing with CCR1072 and x86 - With all RouterOS 6.3x.xx versions available

Sat Mar 05, 2016 12:41 pm

We also have the same problem on an x86 box. That x86 box worked perfectly on 6.33.3 until its hard drive crashed. We put in a new HDD with 6.34.2, exported and imported the configuration, and the problem appeared.

We will wait for our maintenance window tonight and downgrade to 6.33.3 again to see if that resolves the issue.

What worries me is that when the first drive crashed, we booted up our ESXi and had the same problem. We downgraded the VM to 6.29.1, which works perfectly on another server with exactly the same hardware, and the problem got worse: when you click on PPP -> Active and double-click a client, the VM crashes and reboots. After upgrading back to 6.34.2 the machine doesn't crash on PPP -> Active, but it still has the original double-connection problem that the OP described.

If anyone has any suggestions, I am up for testing.
 
marlowbg
newbie
Topic Author
Posts: 33
Joined: Wed Oct 06, 2010 4:23 pm

Re: Total PPPoE crashing with CCR1072 and x86 - With all RouterOS 6.3x.xx versions available

Sat Mar 05, 2016 5:25 pm

For me the scripts provided above in the thread are the only solution for the Double Queue/Logins.

Have you tried to implement them?
 
ethernet
just joined
Posts: 20
Joined: Tue Aug 25, 2009 1:44 am

Re: Total PPPoE crashing with CCR1072 and x86 - With all RouterOS 6.3x.xx versions available

Sun Mar 06, 2016 10:16 pm

Yes, I am using them, but they don't clear the double PPPoE interfaces. I am going to try disabling the mpls package tonight. That is the only difference between the old server and the new one.
 
ethernet
just joined
Posts: 20
Joined: Tue Aug 25, 2009 1:44 am

Re: Total PPPoE crashing with CCR1072 and x86 - With all RouterOS 6.3x.xx versions available

Mon Mar 07, 2016 3:58 pm

One more thing I noticed that makes this 100% Mikrotik's fault. I can reproduce the same problem every time:

1. Open PPP -> Active in Winbox (any version of Winbox, old or new); you can see duplicate names. They have the same name and MAC address but a different IP and uptime.
2. Select the non-working entry and click the red minus. The name WILL disappear.
3. Click PPP -> Profiles.
4. Click PPP -> Active again... the double connection is still there!?!

Mikrotik is not removing dead connections, either automatically or manually.
 
ethernet
just joined
Posts: 20
Joined: Tue Aug 25, 2009 1:44 am

Re: Total PPPoE crashing with CCR1072 and x86 - With all RouterOS 6.3x.xx versions available

Tue Mar 08, 2016 9:56 pm

So far so good. I disabled the mpls, ups and ntp packages and everything is working fine as before. I was NOT using any of these packages anyway.
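
The package change described above can be made from the console roughly as follows. This is a sketch for RouterOS 6.x with separately installed packages; a package is only scheduled for disabling and actually stops being loaded after the reboot.

```routeros
# List installed packages and their current state
/system package print
# Schedule the unused packages to be disabled
/system package disable mpls
/system package disable ups
/system package disable ntp
# The change takes effect on the next boot
/system reboot
```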

I hope this helps you out!
 
marlowbg
newbie
Topic Author
Posts: 33
Joined: Wed Oct 06, 2010 4:23 pm

Re: Total PPPoE crashing with CCR1072 and x86 - With all RouterOS 6.3x.xx versions available

Wed Mar 09, 2016 1:23 pm

I don't have the MPLS package installed anyway... but I do have the "ntp" package, and I really don't believe it is related to the issue.

What is your current RouterOS version?
 
ethernet
just joined
Posts: 20
Joined: Tue Aug 25, 2009 1:44 am

Re: Total PPPoE crashing with CCR1072 and x86 - With all RouterOS 6.3x.xx versions available

Thu Mar 10, 2016 2:47 am

RouterOS version doesn't make a difference. I tried 6.29.1, 6.33.3 and 6.34.2, and with every version I had the same problem.
Only disabling those packages resolved the issue. I am 100% sure that one of those packages was the problem.

Here are all the things I tried before I finally resolved my issue:

1. Physical machine
2. Virtual machine
3. Restore backup
4. Reset and import settings
5. Reset and manual settings
6. Different RouterOS version (not later than 6.29.1)

NOTHING worked.

What was driving me crazy was that the only thing i changed was that dead hard drive (and those 3 packages). Machine was working beautifully for more than a year.

Don't waste time troubleshooting and do what i did.
 
Zar1n
just joined
Posts: 1
Joined: Wed Feb 19, 2014 6:24 am

Re: Total PPPoE crashing with CCR1072 and x86 - With all RouterOS 6.3x.xx versions available

Thu Mar 10, 2016 5:29 am

Hello, everybody.

I have the same problem with a CCR1036 on 6.34.2. Several times a day, L2TP and PPPoE sessions freeze and are then duplicated.
 
cipito
just joined
Posts: 6
Joined: Thu Nov 22, 2012 11:36 pm

Re: Total PPPoE crashing with CCR1072 and x86 - With all RouterOS 6.3x.xx versions available

Sat Mar 12, 2016 8:10 am

I'm having the same problem on a CCR1016 (edit: not 1036).
It never happened to me before; the router had been running for about half a year or more with no problems.

I updated to the latest version a few nights ago (6.34.3, I think) and that is when it all started.
I must say it happened at a time when I was not logged in at all (no Webfig, no Winbox). So for me there is no connection between Winbox and this problem.
It happened at 2 PM and then again 3 hours later. Again, nobody was logged on.

I have reverted to 6.33.5 and hope it won't do it again.

CCR 1016 + radius + mysql

I had a problem with the RADIUS server because it was out of HDD space, but restarting RADIUS / MySQL or even the server did not solve my problem.
I also thought there might be a broadcast storm or even a device in my network acting as a PPPoE relay, but I have about 50 VLANs and this happens on every VLAN, so it can't be that.
 
bdinet
just joined
Posts: 10
Joined: Thu Apr 17, 2008 7:52 am

Re: Total PPPoE crashing with CCR1072 and x86 - With all RouterOS 6.3x.xx versions available

Mon Mar 14, 2016 12:14 pm

Same problem - RB1100AHx2 - double PPPoE connections. ROS 6.34.2
 
lcm
Trainer
Posts: 57
Joined: Wed Apr 28, 2010 11:56 pm
Location: Brazil

Re: Total PPPoE crashing with CCR1072 and x86 - With all RouterOS 6.3x.xx versions available

Thu Feb 02, 2017 11:49 pm

Same here!
Does anyone know how I can remove IP pool used addresses that have not been released by frozen PPPoE interfaces?
Greets from Brazil.

Luiz Claudio Martins Maia
