ROS 5.10 - 5.12 are very unstable as a PPTP BRAS on x86. Every 5-12 hours I see a soft lockup on a random CPU core, after which the whole system hangs. ROS 5.11 gives much better performance on my BRAS (about 150+ Mbps, until it hangs). No supout.rif is created when it hangs.
ROS 5.6 seems to be very stable; uptime is generally over 2 weeks.
But in PPTP mode performance is very low, below 130 Mbps.
Configuration: x86 Core 2 Quad Q6600 + dual-port Intel 82576 NIC + RTL8169 NIC for communication with the billing server (RADIUS protocol). Billing: Abills 0.52b. A simple queue for each user, connected via PPTP.
With 120+ users online I see the load on one CPU core reach 100% (98-100% from interrupts). This 100% load moves from one core to another, and the total throughput of the BRAS is limited to around 130 Mbps. Meanwhile the other cores are loaded at 5-20% (System > Resources > CPU). In Tools > Profile I see that this core is loaded mostly by queuing and firewall.
Today I bound the tx/rx lanes of the NICs to different CPU cores and increased the ethernet-default queue to 400 and the default-small queue to 50. Now I'm gathering load statistics. On this forum I read that I should disable RPS on my ethernet interfaces. How can I do this? I can't find it.
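For reference, the queue-limit part of that change was roughly this (a sketch of what I ran from the console; the built-in queue types are just referenced by name with [find]):

/queue type set [find name="ethernet-default"] pfifo-limit=400
/queue type set [find name="default-small"] pfifo-limit=50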
Any ideas on increasing PPTP performance? (Linux accel-ppp + Quagga?)
P.S. A cracked ROS 3.30 was very stable, but it doesn't support the 82576.
I bought an L5 license for ROS 5.6, and I don't see why I should pay for this kind of instability.
P.P.S. Sorry for my English, I'm from Ukraine.
I will second NetworkPro's suggestion. I have tried various ROS versions, upgrading and downgrading, but there was always a problem. Then I changed to a NIC with an Intel chipset, and no more problems. It might be a problem with the hardware chipset, it might be a ROS driver problem. Who cares, I now know to stay away from the Realtek chipset. I haven't got the time to investigate every reason why something doesn't work!
1) in the /queue interface menu, set all queues to multi-queue-ethernet-default
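From the console that would be something like this (one way to do it, setting every entry in that menu at once):

/queue interface set [find] queue=multi-queue-ethernet-default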
I changed the ethernet interface queues to multi-queue-ethernet-default; it had no effect on the router's behavior.
I don't know how to change the queue type from "default-small" to multi-queue-ethernet-default on my dynamic PPTP user interfaces.
2) in the /system resources rps menu, disable all entries
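Something along these lines should do it (the exact menu path and syntax vary between ROS versions; on some builds the RPS entries sit under /system resources irq rps instead):

/system resources rps print
/system resources rps disable [find]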
I did that 3 days before sending the supouts to support; it had no effect on the router's behavior.
3) in the /system resources irq menu, try to allocate cores manually (by default it assigns at least one core per interface)
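For example (assuming your build exposes a settable cpu property on the IRQ entries; the IRQ numbers below are placeholders, check the print output for the NIC's actual ones):

/system resources irq print
/system resources irq set [find irq=16] cpu=1
/system resources irq set [find irq=17] cpu=2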
I don't know. No errors are displayed when I change the RTL queue to multi-queue-ethernet-default.
But with multi-queue-ethernet-default the kind of buffer shows as "unknown" instead of "pfifo" as with ethernet-default.
Yes, it still happens.
The problem appears when any PPTP user (it may be only 1 user, or a few users) exceeds a speed of 20-25 Mbps: 100% load on one CPU core, moving from one core to another.
When all users are running at 6 Mbps everything is fine and the load is balanced normally across the CPU cores. But in that case the overall load doesn't exceed 100-110 Mbps, and I don't know what will happen when total traffic exceeds, for example, 150 Mbps.
It seems to me this is a limitation of the userspace PPTP server software in MikroTik ROS.
I moved from Slackware + 2.6.23 kernel + accel-pptp 0.8.5,
a self-made billing system (all settings, DB and statistics in MySQL, lots of Perl scripts) with NAT on this Q6600,
and 2 x 100 Mbit Intel desktop NICs with 80-95 Mbps peak traffic load,
to ROS 3.30 + Abills billing + 2 x D-Link DGE-530T NICs (Marvell?) + a BGP default route (to handle a real IP for each user).
Then I bought the 82576 NIC and ROS 5.6 to drive the 82576.
That's when the troubles began.
On the attached interface utilisation diagram:
Apr 26, 2011 - Linux replaced by ROS 3.30
Sept 17, 2011 - ROS 3.30 upgraded to ROS 5.6, NICs upgraded to the 82576 dual-port PCI-e.
I moved to ROS because my self-made billing can't handle users with real IPs (it was built back in 2002-2004).
Today I'm preparing my network to move to PPPoE, and then to IPoE (I have 4 routed /24 subnets).
Can you swap the motherboard for a different model to test? That is what I would do if I got stuck like this with a hunch about hardware/driver issues.
Similar results on Intel G33 (Gigabyte) and Intel G41 (Asus) boards.
It seems to me it's the user-level code of the PPTP server software in MikroTik.
I've tested this box with both PPTP (all users) and 2-3 PPPoE tunnels (test users: 1 with MS-CHAPv2 + MPPE, 2 with MD5 CHAP without MPPE). With PPPoE, throughput was near 100M and that load didn't overload the CPU; I achieved 65-70 Mbps download / 25-60 Mbps upload per user (netbook with an Intel Atom N450 CPU) in torrents, and no soft lockups on the BRAS CPU.