Max subscriber count with PPPoE

Hey guys, I am getting frequent disconnects of PPPoE sessions on wireless links that typically have -79 dBm or better both ways. We have forced connecting in 802.11b mode only. We are a WISP.

At the moment we have several 30-45 client towers running PPPoE via RB450Gs/RB433AHs and Ubiquiti PowerStation 2 APs without issue, on one wireless network per site. We use RTS/CTS on all access points.

We are, however, having MAJOR issues on a site with 70 subscribers after we converted it to MikroTik PPPoE/Ubiquiti.

A large percentage of users on PPPoE have really poor pings. Many cannot ping with larger packet sizes at all. It varies in severity throughout the day.

I was thinking about trying to load an RB433AH with a Ubiquiti XR2 card to see if the PowerStation 2s were just underpowered. I’m not worried about bandwidth, as we are throttled quite low and have a max connection load of about 500 connections at once. But after reading another thread about max clients on RTS/CTS, I’m having second thoughts.

Now here is the thing: before, we were on a neteq/Ubiquiti radio setup with public IPs to the clients. On the odd occasion I saw similar symptoms, and usually an AP reboot fixed the issue for a week or two. I’m wondering if I should just settle on 40 as the max number of users connected to a tower, or at the very least move to MAC authentication instead of PPPoE on these really big sites, as the working theory around here is that using PPPoE exacerbated the oversubscription problem into a monster.

What do you guys think? Will going to MAC auth likely reduce the poor service to my subs?

We installed the SR2/RB433 unit and all was looking really good until this afternoon, when people started dropping a lot of packets.

Oh, by the way, we have RTS/CTS set to 256 on most of our CPEs. I tried upping a few to 512, with little effect.

I tried enabling access restrictions based on MAC to experiment with the oversubscription theory. What I found was that when I kicked all but about 30 wireless clients, the problem persisted, which is friggin odd.

So I dug a little further: the switch feeding the MT is forcing 10 Mbps half duplex, and I’m dropping about 5% of packets to the MikroTik WAN interface.

Would a 5% loss rate at the MT get worse down the line to my clients, say 10-15% loss?
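
For what it’s worth, a rough way to reason about that question: if each hop drops packets independently, delivery rates multiply, so loss compounds along the path, and a ping sees the loss in both directions. Here is a minimal Python sketch under that independence assumption (it ignores 802.11 retransmissions, and the function names are just for illustration):

```python
def end_to_end_loss(hop_loss_rates):
    """Combined loss across hops, assuming each hop drops packets independently."""
    delivered = 1.0
    for p in hop_loss_rates:
        if not 0.0 <= p <= 1.0:
            raise ValueError("loss rates must be between 0 and 1")
        delivered *= (1.0 - p)  # fraction of packets surviving this hop
    return 1.0 - delivered


def round_trip_loss(one_way_loss):
    """Loss seen by ping when the same one-way loss applies to request and reply."""
    return end_to_end_loss([one_way_loss, one_way_loss])


if __name__ == "__main__":
    print(f"ping loss with 5% each way: {round_trip_loss(0.05):.2%}")
```

So a 5% one-way loss at a single hop would already show up as roughly 9.75% ping loss if it hits both directions, before any wireless loss is added on top.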


We are in the middle of the summer seasonal subscriber surge and cannot segment this network just yet, as we don’t have the tower real estate for another antenna. In the spring we are breaking this big tower into a 5.8 GHz and a 2.4 GHz.

We fixed the duplex issue that was causing packet loss to the MT. Pings are reasonably good now.

The only issue we are seeing now is that once or twice every day ALL the PPPoE users, and the users I have set up with basic routing, just drop right out.

They still show connected in the wireless clients list.

A disable and then enable of the wireless card brings everyone back in.

There are some FM transmitters on the same tower as our antennas. Could RF interference from the FM transmitters be causing this erratic behavior? This symptom was occurring on a PowerStation 2 EXT AP, an EnGenius 2611P, and now it’s doing the same thing on our MikroTik/Ubiquiti SR2 card.

At least we are down to one symptom…

Another thing to note: all my clients with EOC-2610 EXT radios don’t redial PPPoE worth a shit after an AP reboot. They all have the 1.43.05 firmware, which is supposed to have resolved that issue. All my 2611Ps redial immediately after gaining back the wireless link. The 2610s require a power cycle to get back on again. I have been taking the 2610s off PPPoE and running them over the MikroTik like a regular router, and they come back no problem after an AP reboot.

I got to reading about the different best practices for the MTU/MRU of PPPoE connections. I have mine set to 1492 at all my towers and that’s working well. Yesterday, just to see what happens, I changed it on the problem site to 1480. The only effect I saw is that a small handful of the 2610 EXT radios, like 4 or 5 of them, logged in after an AP reboot.
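
For reference, the 1492 figure falls straight out of the framing: PPPoE adds a 6-byte header plus a 2-byte PPP protocol ID inside the standard 1500-byte Ethernet payload. A small Python sketch of that arithmetic (helper names are made up for illustration; the TCP and ping figures assume IPv4 with no IP or TCP options):

```python
ETHERNET_PAYLOAD = 1500  # standard Ethernet payload size in bytes
PPPOE_HEADER = 6         # PPPoE header
PPP_PROTOCOL_ID = 2      # PPP protocol field carried after the PPPoE header
IPV4_HEADER = 20         # IPv4 header without options
TCP_HEADER = 20          # TCP header without options
ICMP_HEADER = 8          # ICMP echo header


def pppoe_mtu(ethernet_payload=ETHERNET_PAYLOAD):
    """Largest IP packet that fits in one PPPoE frame."""
    return ethernet_payload - PPPOE_HEADER - PPP_PROTOCOL_ID


def tcp_mss(mtu):
    """Largest TCP segment payload for a given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def max_ping_payload(mtu):
    """Largest ICMP echo payload that passes without fragmentation."""
    return mtu - IPV4_HEADER - ICMP_HEADER


if __name__ == "__main__":
    for mtu in (pppoe_mtu(), 1480):
        print(mtu, tcp_mss(mtu), max_ping_payload(mtu))
```

With an MTU of 1492 the largest unfragmented ping payload is 1464 bytes; at 1480 it is 1452, and the TCP MSS drops from 1452 to 1440.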


Has anyone got some recommendations for me? I’m starting to run out of ideas here.

So today I took a little while to watch what was happening while it was down, before I rebooted the wireless card.

The logs showed a lot of attempted PPPoE dial-ins that disconnected halfway or immediately after authentication. For my DHCP users they showed discovers and offers, but no ACKs. I didn’t see any messages about wireless re-associations, so the wireless links to the customers stayed up.

So it’s like there is next to no data going over the wireless links. It’s like the AP locked up but maintained connections to the CPEs.

My wireless config:

Flags: X - disabled, R - running
 0  R name="wlan1" mtu=1500 mac-address=00:15:6D:53:38:AB arp=enabled
      interface-type=Atheros AR5213 mode=ap-bridge ssid="8888" frequency=2437
      band=2.4ghz-b scan-list=default antenna-mode=ant-b wds-mode=disabled
      wds-default-bridge=none wds-ignore-ssid=no default-authentication=yes
      default-forwarding=no default-ap-tx-limit=0 default-client-tx-limit=0
      hide-ssid=no security-profile=default compression=no

I am running 75 PPPoE connections on a single RB433AH and a single omni with no problems on the PPPoE side of things. I do have issues with wireless throughput because I have overloaded the unit. I can still get 5 Mbps throughput in off hours, but it can drop to between 512 kbps and 1 Mbps during peak usage. I typically keep 50 users max per tower, and this one tower is slated for sectorizing very soon. I run all MikroTik equipment.

Research thus far:

-PPPoE does not appear to play a role in the stability of the AP or in overall performance. It does narrow the available bandwidth a bit due to the PPPoE overhead, but that’s about it.

-The older EOC-2610s do not like redialing PPPoE to the MikroTik/SR2 APs after the connection to the AP has been restored. Will see about a firmware update from EnGenius to fix that.

-The number of subs connected to the AP should not be causing the AP lockups. In theory we could add more, but performance is what would suffer, not stability.

-The number of subs we have should not cause the disconnect issues we have been seeing at our problem site, but it would certainly explain the high latency and slowness that crop up during peak hours.

-Max throughput at 80 subs is approx. 4.5 Mbps, so we have to adjust throttling based on that number. We have set our shaping to a max 4 Mbps pipe, and bandwidth management is much better now.

-An AP can only communicate with one CPE at a time. Bad things happen when CPEs try to talk over one another; RTS/CTS is meant to prevent this by making CPEs wait their turn with the AP. Done.

-If 10 people are browsing at the same time, latency should be around 50 ms. Basically, with anything more than 40 subs we just have to hope that everyone doesn’t log on at the same time, because anyone using VoIP on that AP will get really poor service.
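
To put rough numbers on the points above, here is a back-of-the-envelope Python sketch. It assumes the shaped pipe is split evenly between whoever is active at the moment (real traffic is burstier than that) and that PPPoE costs 8 bytes per 1500-byte Ethernet payload; the function names are just illustrative.

```python
def fair_share_kbps(pipe_kbps, active_users):
    """Even split of the shaped pipe among users who are active right now."""
    if active_users <= 0:
        raise ValueError("active_users must be positive")
    return pipe_kbps / active_users


def pppoe_overhead_fraction(ethernet_payload=1500, pppoe_bytes=8):
    """Share of each full-size Ethernet payload consumed by PPPoE/PPP headers."""
    return pppoe_bytes / ethernet_payload


if __name__ == "__main__":
    pipe = 4000  # the 4 Mbps shaping cap, in kbps
    for users in (10, 40, 80):
        print(f"{users} active: {fair_share_kbps(pipe, users):.0f} kbps each")
    print(f"PPPoE overhead: {pppoe_overhead_fraction():.2%}")
```

With the 4 Mbps cap, 80 simultaneously active subs would get about 50 kbps each and 10 would get about 400 kbps, while the PPPoE overhead itself is only about 0.5%.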


What this means to us:

-Well, basically, our best practices really should be adjusted to 50ish users per AP, tops. This would keep performance at an acceptable level.
-We should contact EnGenius about the PPPoE redial issue.
-MikroTiks make great access points. We should deploy MikroTik units as APs everywhere: they save cabinet space and can run multiple APs at the same time.
-We still don’t have any concrete idea about the cause of the AP lockups at the site we constantly have to monitor.

I don’t believe PPPoE would have any bearing on “connection issues” because it’s at a different network level. PPPoE just runs on top of your connection layer, be that wireless, Ethernet, or even dial-up (as if you would want to).

In the past, when I had an AP that would lock up, it didn’t matter what programming changes I made; the end result was that I had to replace the AP. I had one AP that would lock up only when the temperature dropped below 34°F. I tried all sorts of things and finally just replaced it.

PPPoE really doesn’t take any significant bandwidth, so I wouldn’t worry about that much.

If your PPPoE clients have issues reconnecting on their own, I’m sure there is a time-out setting that can be adjusted. I know that if my AP is reset, the clients immediately start sending PPPoE authentication requests and just don’t let up until they are authenticated. In fact, I’ve seen units retry authentication so aggressively that it degraded the network (for example, when someone at the office suspends an account in the billing system and doesn’t tell me, so I don’t log into the unit and turn off PPPoE).

-Michael

No, I don’t believe PPPoE has any bearing on the AP lockups or the performance issues at all anymore.

I have 2 issues at this point that I need sorted, though.

  1. AP lockups. They have occurred on 3 different AP types in the same location. We also seem to be killing APs off at this site at an accelerated pace. There are FM transmitters and I doubt we are shielded properly.

  2. Fix the PPPoE redial issue, because it’s annoying me. There doesn’t seem to be a setting in the EOC-2610s for a timeout value. Even if there is one, it might as well be set to infinity: they never redial unless the unit is power cycled.

I’m not really an expert regarding wireless networks, but

“there are some FM transmitters on the same wireless tower as our antennas. could RF interference from FM transmitters be causing this erratic behavior”

High-powered FM transmitters can really fuck Ethernet connections up, big time… even if you are using high-quality cables and did proper grounding/shielding.

Yeah, when we put up the MT the first time we were running PoE to the top, and it really messed things up. Now we have the MT in a cabinet at the bottom and run LMR-400 to the top. It’s much better shielded. It seems to work well most of the time, until the AP locks up, that is.

My primary relay tower is on a very large FM tower running 100,000 watts, and the only issue I had was with the RB411AHs: I could only run 10 Mbps half duplex on the Ethernet cable, but when I use a 433 or 433AH I can run 100 Mbps full duplex on the very same cable. I keep all external coax to 3 ft lengths, or better yet use integrated antennas when possible.

OK, so I’m still fighting my MikroTik/XR2 deployment issues. Here is an update.

  1. When using a static ack-timeout value, the APs still “lock up” from time to time: all meaningful traffic over the wlan interface ceases. All I see in my syslog are attempted PPPoE connections that time out and retry over and over. Same for DHCP users. Restarting the wlan interface fixes this.

When MikroTik support suggested I use dynamic ack-timeout and allow the 1 Mbps data rate instead of forcing 11 Mbps, I got the same behavior a couple of times per night, but it rights itself without my having to restart the wlan interface. It usually takes a couple of hours before normal traffic starts flowing again. I still end up restarting the wlan interface because about a third of my users don’t come back into a PPPoE session until I do.

I have reverted back to static ack-timeout for now. It seems less cranky set that way; it only drops once every day or two.


  2. EnGenius-brand, Realtek-based EOC 3320 EXT radios have a real hard time redialing PPPoE after wireless re-registration during one of the previously mentioned events. They have to be power cycled to redial PPPoE. I have set these radios to dynamic IPs and set speeds manually.

  3. FM RF issues seem to be under control at the one site; I’m able to run 100 Mbps FD to the MikroTiks there now without loss. I had to use LMR-400 to the antenna, and inside a metal cabinet I have an armoured case for the MT and a few ferrite loops between the switch and the MT.

  4. I have deployed an RB411/SR2 AP for a small 15-client site. It runs like the bloody Energizer Bunny, no problems there at all.

Anyone have any more suggestions for my AP lockups?

OK, MikroTik support has pretty much reverted to telling me there is a collision issue (hidden node issue) on the APs.

I could go on a hunt for a CPE that doesn’t have RTS/CTS set… but I have my doubts that collisions are the issue. The wlan interface isn’t acting like a flooded/colliding network interface. Total throughput on wlan1 when it’s in its “not working” state is around 20 kbps.

Suggestions?
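
One way to at least catch the “not working” state automatically would be to flag it whenever plenty of clients are still registered but aggregate throughput stays near zero. A minimal Python sketch of that check follows. It is only the decision logic: the `Sample` type, the thresholds, and the function name are all assumptions for illustration, and collecting the samples from the router (SNMP, API, or otherwise) is left out.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    registered_clients: int  # entries in the wireless registration table
    throughput_kbps: float   # aggregate traffic on the wlan interface


def looks_stalled(samples, min_clients=30, max_kbps=50.0, consecutive=3):
    """True if the last `consecutive` samples all show many clients but almost no traffic.

    All thresholds are hypothetical defaults, not tuned values.
    """
    if len(samples) < consecutive:
        return False  # not enough history to decide
    recent = samples[-consecutive:]
    return all(
        s.registered_clients >= min_clients and s.throughput_kbps <= max_kbps
        for s in recent
    )


if __name__ == "__main__":
    history = [Sample(70, 3200.0), Sample(70, 20.0), Sample(70, 18.0), Sample(70, 21.0)]
    print(looks_stalled(history))
```

Something like this could trigger an alert, or the wlan disable/enable, instead of waiting for customers to call.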