Joined: Fri Jan 27, 2012 11:40 am Posts: 17
Karma: 1
ROS 5.10 - 5.12 is very unstable as a PPTP BRAS on x86. Every 5-12 hours I see a soft lockup on a random CPU core, and after that the whole system hangs. ROS 5.11 gives much better performance on my BRAS (about 150+ Mbps, then it hangs). supout.rif is not created when it hangs. ROS 5.6 seems to be very stable; uptime is generally over 2 weeks. But in PPTP mode performance is very low, below 130 Mbps.

Configuration: x86 Core 2 Quad Q6600 + dual-port Intel 82576 NIC + RTL8169 NIC for communication with the billing server (RADIUS protocol). Billing: Abills 0.52b. A simple queue for each user connected via PPTP.

With 120+ users online I see that LA on one CPU core reaches 100% (98-100% from interrupts). This 100% LA moves from one core to another, and the total throughput of the BRAS is limited to about 130 Mbps. The other cores are then loaded at 5-20% (system-resources-cpu). In tools-profile I see that this core is loaded mostly by queuing and firewall.

Today I bound the tx-rx lanes of the NICs to different CPU cores and increased the ethernet-default queue to 400 and the default-small queue to 50. I am now gathering LA statistics. On this forum I read that I should disable RPS on my Ethernet interfaces. How can I do this? I can't find it. Any ideas for increasing PPTP performance? (Linux accel-ppp + Quagga?)

P.S. Cracked ROS 3.30 was very stable, but it doesn't support the 82576. I bought an L5 license for ROS 5.6 and I don't know why I should pay for such instability. P.P.S. Sorry for my English, I'm from Ukraine.
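For reference, the queue-size change described above might look roughly like this on the RouterOS console. It is only a sketch: it assumes the stock queue types ethernet-default and default-small are of kind pfifo (so that pfifo-limit applies), which is worth checking with print before changing anything.

```routeros
# Show the current queue types and their limits first.
/queue type print

# Raise the packet limit of the two pfifo queue types named in the post.
/queue type set ethernet-default pfifo-limit=400
/queue type set default-small pfifo-limit=50
```

Larger pfifo limits trade fewer drops for more buffering delay, so the values are best treated as something to tune while watching the LA statistics.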
Joined: Tue Feb 27, 2007 1:52 am Posts: 710
Karma: 23
Location: Guernsey and UK
I will support NetworkPro's suggestion. I have tried various ROS versions, upgrade/downgrade but always a problem. Then, changed it for a NIC with an Intel based chipset, no more problems. Might be a problem with the hardware chipset, might be a ROS driver problem. Who cares, I now know to stay away from the Realtek chipset. Haven't got the time to investigate every reason why something doesn't work!
_________________ Ron Touw - Mikrotik Certified Trainer LinITX.com - MultiThread Consultants Get your MikroTik RBs and Training: http://linitx.com/category/166 Official UK MikroTik Distributor IRC channel: #routerboard on irc.z.je (IPv4) or 6.irc.z.je (IPv6)
Joined: Fri Jan 27, 2012 11:40 am Posts: 17
Karma: 1
Here is their answer!
Hello,
1) In the /queue interface menu set all queues to multi-queue-ethernet-default >> I changed the Ethernet interface queues to multi-queue-ethernet-default. It did not affect router behavior. I don't know how to change the queue type from "default-small" to multi-queue-ethernet-default on my dynamic PPTP user interfaces.
2) In the /system resources rps menu disable all entries >> I did this 3 days before sending the supouts to support. It did not affect router behavior.
3) In /system resources irq try to allocate cores manually (by default it is at least one core per interface) >> I did this 3 days before sending the supouts to support. It did not affect router behavior. Examine this presentation: http://www.tiktube.com/index.php?video= ... wolLonKGo= >> Nothing new for me. I have done all the optimisations.
Regards, Janis Megis
It seems to me there is no need to attach supouts: MTK support doesn't use them when analysing user complaints.
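For reference, the three suggestions from support could be entered on the console roughly as below. This is a sketch, not something taken from the support reply: the interface name ether1, the item number 0 and the core number 1 are placeholders, and the menu paths /system resource irq and /system resource irq rps are an assumption about where RouterOS 5.x places the menus the reply calls "/system resources irq" and "/system resources rps".

```routeros
# 1) Use the multi-queue type on a physical Ethernet interface
#    (ether1 is a placeholder name).
/queue interface set ether1 queue=multi-queue-ethernet-default
/queue interface print

# 2) Disable every RPS entry.
/system resource irq rps set [find] disabled=yes

# 3) List the IRQ entries, then pin one of them (item 0 here,
#    a placeholder) to a chosen core (core 1 here, also a placeholder).
/system resource irq print
/system resource irq set 0 cpu=1
```

Item numbers come from the preceding print, so they will differ from box to box.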
Joined: Fri Jan 27, 2012 11:40 am Posts: 17
Karma: 1
NetworkPro wrote:
I like these optimisation tips. Good info.
When you allocated IRQs to cores, can you tell if the core that goes to 100% is the one with the RTL8169 IRQ allocated to it?
>> No, I can't say that. The core at 100% LA moves randomly, independently of the core serving the Realtek.
Does multi-queue-ethernet-default work with the RTL8169?
I don't know. No errors are displayed when I change the RTL queue to multi-queue-ethernet-default. But for multi-queue-ethernet-default the kind of buffer is shown as "unknown" instead of "pfifo" as for ethernet-default.
Joined: Fri Jan 27, 2012 11:40 am Posts: 17
Karma: 1
Yes, it still happens. The problem occurs when any PPTP user (it may be just 1 user, or a few users) exceeds a speed of 20-25 Mbps: 100% LA on one CPU core, moving from one core to another. When all users are running at 6 Mbps everything is fine, and the load is balanced normally across the CPU cores. But in that case the overall load does not exceed 100-110 Mbps, and I don't know what the result will be when total traffic exceeds, for example, 150 Mbps. It seems to me this is a limitation of the userspace PPTP server software in MTK ROS.
Joined: Fri Jan 27, 2012 11:40 am Posts: 17
Karma: 1
I moved from Slackware + 2.6.23 kernel + accel-pptp 0.8.5 + self-made billing (all settings, DB and statistics in MySQL, lots of Perl scripts) with NAT on this Q6600 + 2 100 Mbit Intel desktop NICs and 80-95 Mbps peak traffic load, to ROS 3.30 + Abills billing + 2 D-Link DGE-530T NICs (Marvell?) + BGP default route (to handle a real IP for each user). Then I bought an 82576 NIC & ROS 5.6 to handle the 82576. The troubles began. On the attached interface utilisation diagram: Apr. 26, 2011 - Linux replaced by ROS 3.30; Sept. 17, 2011 - ROS 3.30 upgraded to ROS 5.6, NICs upgraded to dual-port 82576 PCI-E.
I moved to ROS because my self-made billing can't handle users with real IPs (it was built back in 2002-2004). Today I'm preparing my network to move to PPPoE, then to IPoE (I have 4 routed /24 subnets).
Can you swap the motherboard with a different model to test? It is what I would do if I get stuck like this due to hunch about hardware+drivers issues.
Joined: Fri Jan 27, 2012 11:40 am Posts: 17
Karma: 1
Similar results on Intel G33 (Gigabyte) & Intel G41 (Asus).
It seems to me it's the user-level code of the PPTP server software on MikroTik.
I've tested this box with PPTP (all users) plus 2-3 PPPoE tunnels (test users: 1 MS-CHAPv2 + MPPE, 2 MD5 CHAP without MPPE). Over PPPoE throughput was near 100M and this load did not overload the CPU. I achieved 65-70 Mbps download / 25-60 Mbps upload per user (netbook with an Intel Atom N450 CPU) in torrents, with no soft lockups on the BRAS CPU.
Cool that you tested with different motherboards. At the time the problem happens - grab a supout.rif and send to support. If it is truly a bug they should simply fix it in the next RouterOS release or sooner by providing you a custom version.
Joined: Mon Feb 13, 2012 3:18 am Posts: 7
Karma: 1
I have a similar problem... BRAS on MT 4.x was fine... upgraded to 5.8 - 5.12... trouble -> the Intel D510MO boards (2 units, NIC -> Realtek) froze at ~80 and ~35 PPPoE sessions
Joined: Fri Jan 27, 2012 11:40 am Posts: 17
Karma: 1
NetworkPro wrote:
Cool that you tested with different motherboards. At the time the problem happens - grab a supout.rif and send to support. If it is truly a bug they should simply fix it in the next RouterOS release or sooner by providing you a custom version.
Cheers.
There is no sense in sending supout.rif. I've sent 3 supouts and they didn't analyze them: they proposed the optimizations that I had already made by the time I sent the supouts. Or does a supout not include info about my configuration?
I am using it as a backup off-site PPPoE concentrator/main router serving a few clients while waiting for the main one to fail and a bunch of PPPoE connections to come through (never happened so far, the main one is a good one). I use the onboard gig eth for the main Internet connection and the PCI slot with a MT RB44GV eth card (yes, PCI is a bottleneck, but traffic is low there).
Only problems so far were with the Silicon Motion USB controller inside the USB flash: incompatibility with the Intel chipset, I believe. If there were other issues, they looked like USB Store problems.