roadracer96
Forum Veteran
Topic Author
Posts: 714
Joined: Tue Aug 25, 2009 12:01 am

RB1000 SSTP, major disconnect issues

Sat Aug 04, 2012 1:52 am

I have gone back and forth with support a few times and sent multiple supouts and debug logs. The issue still persists.

About 200 SSTP tunnels terminate on an RB1000 running 5.18 and now 5.19; all clients run 5.18 (RB450Gs, RB433AHs, RB2011s...). I run Amanda backup over the VPN links, so at peak load it's running about 40 Mbit of download at about 30-40% CPU load.

The problem I keep having is that, randomly throughout the day, the RB1000 SSTP server just stops working. There will be 200 active connections, then, poof, they all get dropped. All 200 clients attempt to reconnect and "pending" interfaces are created, but they never get connected. All authentication is handled through FreeRADIUS and works fine; there are redundant RADIUS servers on a network local to the router. All I have to do to get it working again is disable the SSTP server and re-enable it. All clients reconnect almost immediately with no issues.

This may happen 1-2 times a week, or 2-3 times a day. It seems to happen most often when there is heavy load on the router, but it also happens quite a bit when there is almost no load (~1 Mbit).

All I can say is that I have had 3 years' worth of VPN-related problems with MT. OpenVPN was a problem, so I was advised to use SSTP. SSTP has gone through many ups and downs, getting better, then getting worse. Problems arise as the number of connections increases with my customer base, and new releases fix some problems and introduce new ones. I don't think I have had a completely STABLE version yet.

EDIT: To add, I've stripped the router config down to the bare essentials. I disabled the IPv6, hotspot, and wireless packages and manually disabled all dynamic routing protocols except BGP (which I use). The firewall is bare essentials too: about 6 rules, one NAT rule, and a mangle rule acting on the VPN IPs to clamp MSS. The SSTP server has a certificate with the IP of the router in the name, and it verifies client certs. All clients have a cert and the CA installed. There are 2 RADIUS servers with a MySQL backend. Some routes are set from RADIUS, nothing fancy, maybe one /30 or /29 per client. There is 1 interface on the internet and 1 interface with 4 VLANs attached to a managed switch. All other firewalling is done on the other side of the switch. It really is the barest possible config. I am running the NTP server package. I ended up doing this because I had more problems when I ran the SSTP server on the same router that had IPsec connections, queues, and hundreds of firewall rules, so I segmented it off. It helped, but didn't solve the problem. I have gone as long as 10-12 days without an issue, but more recently it is happening at least every day, sometimes more often.

Is anyone else running upwards of 200 SSTP tunnels on a PowerPC routerboard? Successfully?

At this point, I'm not even getting responses from support as I send them new supouts and logs. I'm really getting irritated. It has been 10 days since their last response, which said other problems have been fixed and told me pretty much nothing.
 
AnRkey
Member Candidate
Posts: 119
Joined: Tue Sep 15, 2009 6:01 pm

Re: RB1000 SSTP, major disconnect issues

Sun Aug 05, 2012 10:54 am

Wow, that sounds rough.

I would put down an x86 box with ROS 5 and start moving clients over one at a time until I hit problems. (PPC on multiple RB800s gave me big problems; in short, PPC is shit for big jobs.)

Try and isolate the problem.

One thing to remember about SSTP is that it looks like SSL traffic to your providing ISP. Is there any traffic shaping on your ISP supplied line? If there is, this will almost certainly cause problems.

If MT is not giving you more info, then you are most likely the only one with this problem and they have no more info. This is good, because it means that something in your setup can be replaced to fix the issue, since it seems to be affecting only you.

More info about how the setup is behaving might help me help you.

R
MTCNA
 
AnRkey
Member Candidate
Posts: 119
Joined: Tue Sep 15, 2009 6:01 pm

Re: RB1000 SSTP, major disconnect issues

Sun Aug 05, 2012 11:14 am

Please do a continuous ping from a PC behind the RB1000 to the client router's public IP. Do another ping from the same PC to something very stable at your ISP, like their DNS server.

When your next lot of SSTP disconnections occur, check to see if you could still get pings through to the client and the "stable" host.

Are you losing tunnel connections because of your setup or is this an ISP issue? That's what I want to figure out.

If you can still ping the remote client's public IP during an SSTP disconnection, then it's your setup and not the ISP.
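
The two-ping test above could be scripted roughly like this. It is a minimal Python sketch, assuming a Unix-like `ping` that accepts `-c`. The function names and the three result labels are made up for illustration. Only the "client still reachable means it's your setup" case comes from the reasoning above; the other two labels are one possible interpretation.

```python
import subprocess


def host_replies(address: str, count: int = 3) -> bool:
    """Return True if the system ping command exits successfully for `address`.

    Assumes a Unix-like ping that accepts -c <count> (Windows uses -n instead).
    """
    try:
        result = subprocess.run(
            ["ping", "-c", str(count), address],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=10 * count,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def classify(client_reachable: bool, stable_host_reachable: bool) -> str:
    """Turn the two ping results into a rough hint about where to look.

    The labels are hypothetical names for this sketch:
      "setup"       - client's public IP still answers, so the path is up
      "upstream"    - neither host answers, so the uplink/ISP is suspect
      "client-path" - stable host answers but the client does not
    """
    if client_reachable:
        return "setup"
    if not stable_host_reachable:
        return "upstream"
    return "client-path"


def run_check(client_ip: str, stable_ip: str) -> str:
    """Ping both hosts once and classify the result (call this during an outage)."""
    return classify(host_replies(client_ip), host_replies(stable_ip))
```

Calling `run_check` with the client router's public IP and the ISP's DNS server while the tunnels are down gives a first hint; it does not replace looking at the logs.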

R
MTCNA
 
roadracer96
Forum Veteran
Topic Author
Posts: 714
Joined: Tue Aug 25, 2009 12:01 am

Re: RB1000 SSTP, major disconnect issues

Sun Aug 05, 2012 1:57 pm

It's not the ISP. The router is in one of the top-rated datacenters in the country, on a gig uplink, and it'll pull that much if I transmit to multiple hosts simultaneously. I have 6 SSTP connections on an RB1100 on the same switch in the same rack in the same datacenter, and they never drop. But there are only 6 connections. Communication is never lost to the RB1000. The problem is that the SSTP service itself is bombing out in some way. In the interface list, I end up seeing SSTP-1 through SSTP-100 or 200 or so as clients attempt to reconnect unsuccessfully. If I simply disable and re-enable the SSTP server in ROS, all 200 clients reconnect in a matter of 5-10 seconds. No datacenter QoS should even be able to crash a mission-critical service permanently. Sure, it may cause a disconnection, but not one that persists until a process is killed and restarted. That can only be a problem in the software.
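
The symptom described here (almost no active sessions while pending interfaces pile up and stay that way) can be told apart from a normal mass reconnect by checking whether it persists over several samples. The following is a minimal Python sketch of that check only. The `SstpSnapshot` type, the thresholds, and the sampling idea are assumptions for illustration; how the two counts are collected from the router is not shown.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SstpSnapshot:
    active_sessions: int     # tunnels that are authenticated and connected
    pending_interfaces: int  # dynamic SSTP entries that never finish connecting


def looks_stalled(
    history: Sequence[SstpSnapshot],
    expected_clients: int = 200,
    min_samples: int = 3,
) -> bool:
    """Return True if the last `min_samples` snapshots all look like the stall.

    Hypothetical thresholds: fewer than 10% of the expected clients active
    and at least 50% of them stuck as pending. A healthy mass reconnect
    (5-10 seconds per the post) should clear before several samples pass,
    provided samples are taken further apart than that.
    """
    if len(history) < min_samples:
        return False
    recent = history[-min_samples:]
    return all(
        s.active_sessions < 0.10 * expected_clients
        and s.pending_interfaces >= 0.50 * expected_clients
        for s in recent
    )


# Three consecutive bad samples: treated as a stall.
stalled = [SstpSnapshot(0, 150), SstpSnapshot(2, 180), SstpSnapshot(0, 200)]
# Clients came back by the last sample: not a stall.
recovered = [SstpSnapshot(0, 150), SstpSnapshot(40, 120), SstpSnapshot(198, 0)]
print(looks_stalled(stalled), looks_stalled(recovered))  # True False
```

When the check returns True, the only known remedy from this thread is the manual one: disable and re-enable the SSTP server.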
 
roadracer96
Forum Veteran
Topic Author
Posts: 714
Joined: Tue Aug 25, 2009 12:01 am

Re: RB1000 SSTP, major disconnect issues

Mon Aug 06, 2012 5:12 am

And again today, in the middle of the day, with only about 1 Mbit of traffic on 200 connections. The tunnels I have on my 1100AH all stayed up. Definitely not an ISP problem. I had to disable and re-enable the SSTP server, then every connection popped back up.

Commented config. You can see it really is stripped down to the bare minimum right now, just the smallest amount of config to make it work:
/interface vlan
add interface=ether2 l2mtu=1596 name=vlan100 vlan-id=100
add interface=ether2 l2mtu=1596 name=vlan30 vlan-id=30
add interface=ether2 l2mtu=1596 name=vlan20 vlan-id=20
add interface=ether2 l2mtu=1596 name=vlan40 vlan-id=40
/ppp profile
add change-tcp-mss=no local-address=w.x.y.z name=Customer only-one=no use-compression=yes use-encryption=no use-ipv6=no use-mpls=no use-vj-compression=yes
/queue type
set 0 kind=none
set 1 kind=none
set 2 kind=none
set 3 kind=none
set 4 kind=none
set 6 kind=none
set 7 kind=none
/routing bgp instance
set default as=65012 client-to-client-reflection=no out-filter=bgp-out redistribute-connected=yes redistribute-other-bgp=yes redistribute-static=yes router-id=w.x.y.z
/routing ospf area
set [ find default=yes ] disabled=yes
/routing ospf instance
set [ find default=yes ] disabled=yes
/routing ospf-v3 area
set [ find default=yes ] disabled=yes
/routing ospf-v3 instance
set [ find default=yes ] disabled=yes
/snmp community
add addresses=w.x.y.z/24 name=radius
/system logging action
set 1 disk-lines-per-file=1000
set 3 remote=w.x.y.z
/user group
add name=PSG policy=telnet,ssh,reboot,read,write,winbox,!local,!ftp,!policy,!test,!password,!web,!sniff,!sensitive,!api
/interface sstp-server server
set authentication=mschap2 certificate=fw2-new default-profile=Customer enabled=yes keepalive-timeout=120 max-mru=1400 max-mtu=1400 mrru=1400 verify-client-certificate=yes
/ip address
*snip*
/ip dns
*snip*
/ip firewall address-list
*snip*
/ip firewall connection tracking
set tcp-established-timeout=1h tcp-syncookie=yes
/ip firewall filter
add chain=forward connection-state=established
add chain=forward connection-state=related
add chain=input connection-state=established
add chain=input connection-state=related
add chain=input dst-port=80 protocol=tcp src-address-list=PSG_NoCust
add chain=input connection-state=new dst-port=123 protocol=udp
add chain=input connection-state=new dst-port=443 in-interface=ether1 protocol=tcp
add chain=input connection-state=new dst-port=22,8291 in-interface=ether1 protocol=tcp src-address-list=AllowSSH
add chain=input connection-state=new dst-port=443 in-interface=vlan100 protocol=tcp
add chain=input connection-state=new dst-port=22,8291 in-interface=vlan100 protocol=tcp src-address-list=AllowSSH
add chain=input connection-state=new icmp-options=8 protocol=icmp
add action=jump chain=input dst-address-type=broadcast jump-target=silentdrop
add action=jump chain=input jump-target=drop
add action=drop chain=silentdrop
add action=log chain=drop
add action=drop chain=drop
/ip firewall mangle
add action=change-mss chain=forward new-mss=1360 protocol=tcp src-address=SSTPCLIENTIPS/16 tcp-flags=syn tcp-mss=1361-65535
add action=change-mss chain=forward dst-address=SSTPCLIENTIPS/16 new-mss=1360 protocol=tcp tcp-flags=syn tcp-mss=1361-65535
/ip firewall nat
add action=src-nat chain=srcnat dst-address-list=PSG_CustNew src-address=!w.x.y.z src-address-list=PSG_NoCust to-addresses=w.x.y.z
add action=src-nat chain=srcnat dst-address-list=PSG_CustNew src-address=w.x.y.z to-addresses=w.x.y.z
add action=dst-nat chain=dstnat dst-address=w.x.y.z dst-port=514 protocol=udp to-addresses=w.x.y.z to-ports=514
/ip firewall service-port
set ftp disabled=yes
set tftp disabled=yes
set irc disabled=yes
set h323 disabled=yes
set sip disabled=yes
set pptp disabled=yes
/ip neighbor discovery
set ether1 disabled=yes
set ether2 disabled=yes
set ether3 disabled=yes
set ether4 disabled=yes
set vlan100 disabled=yes
set vlan30 disabled=yes
set vlan20 disabled=yes
set vlan40 disabled=yes
/ip proxy access
add
/ip route
add distance=200 gateway=w.x.y.z
/ip smb
set allow-guests=no
/ppp aaa
set use-radius=yes
/radius
add address=blah secret=blah service=ppp timeout=1s
add address=blah secret=blah service=ppp timeout=1s
add address=blah secret=blah realm=blah service=login src-address=w.x.y.z
add address=blah secret=blah realm=blah service=login src-address=w.x.y.z
/routing bgp network
add network=w.x.y.z
/routing bgp peer
add hold-time=30s keepalive-time=5s name=fw1 remote-address=w.x.y.z remote-as=65100 tcp-md5-key=blah ttl=default
/routing filter
add action=discard chain=bgp-out prefix=w.x.y.z
add action=discard chain=bgp-out prefix=w.x.y.z
add action=discard chain=bgp-out prefix=w.x.y.z
/snmp
set contact=blah enabled=yes location=FW-2 trap-community=public trap-target=0.0.0.0
/system clock
set time-zone-name=America/Detroit
/system identity
set name=fw-2
/system logging
add action=remote topics=sstp
add disabled=yes topics=radius
add action=remote topics=ppp
add action=remote topics=radius
add action=remote topics=critical
add action=remote topics=debug
add action=remote topics=error
add action=remote topics=info
add action=remote topics=interface
/system ntp client
set enabled=yes primary-ntp=w.x.y.z secondary-ntp=w.x.y.z
/system ntp server
set enabled=yes manycast=no
/system routerboard settings
set cpu-frequency=1333MHz
/system watchdog
set automatic-supout=no watchdog-timer=no
/tool bandwidth-server
set enabled=no
/tool mac-server mac-winbox
set [ find default=yes ] disabled=yes
/user aaa
set default-group=PSG use-radius=yes
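
For reference, the `new-mss=1360` in the two change-mss mangle rules lines up with `max-mtu=1400` on the SSTP server: 1400 bytes minus a 20-byte IPv4 header and a 20-byte TCP header (both without options) leaves 1360. A small Python sketch of that arithmetic follows; the function names are made up for illustration.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
TCP_HEADER_BYTES = 20   # TCP header without options


def clamp_mss(tunnel_mtu: int) -> int:
    """Largest TCP payload that fits in one packet of `tunnel_mtu` bytes."""
    return tunnel_mtu - IPV4_HEADER_BYTES - TCP_HEADER_BYTES


def rule_applies(syn_mss: int, new_mss: int) -> bool:
    """Mirror the tcp-mss=1361-65535 match: only SYNs advertising more
    than new_mss get rewritten; smaller values are left alone."""
    return new_mss < syn_mss <= 65535


print(clamp_mss(1400))           # 1360
print(rule_applies(1460, 1360))  # True: a typical Ethernet MSS gets clamped
print(rule_applies(1360, 1360))  # False: already small enough
```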
 
roadracer96
Forum Veteran
Topic Author
Posts: 714
Joined: Tue Aug 25, 2009 12:01 am

Re: RB1000 SSTP, major disconnect issues

Mon Aug 13, 2012 8:06 pm

Support logged in to the router and installed some debugging package a week ago. That's the last I heard. SSTP has crashed several times since they installed the package, but they haven't gotten back to me yet. No response to emails.
 
roadracer96
Forum Veteran
Topic Author
Posts: 714
Joined: Tue Aug 25, 2009 12:01 am

Re: RB1000 SSTP, major disconnect issues

Sat Aug 25, 2012 5:22 am

Still regular disconnects. Actually had to reboot the router today to get everything going again.
 
roadracer96
Forum Veteran
Topic Author
Posts: 714
Joined: Tue Aug 25, 2009 12:01 am

Re: RB1000 SSTP, major disconnect issues

Sun Aug 26, 2012 2:39 pm

2 days of uptime, happened again... Still nothing from support.
 
roadracer96
Forum Veteran
Topic Author
Posts: 714
Joined: Tue Aug 25, 2009 12:01 am

Re: RB1000 SSTP, major disconnect issues

Mon Sep 17, 2012 4:13 pm

Haven't heard from support in 30 days.

Have sent 5 emails and none have been responded to.

I upgraded to 5.20 and it is still happening. I put a queue in place to limit traffic to 25 Mbit, and it still happens.

What do I need to do to get resolution to this issue?
 
dominicbatty
Frequent Visitor
Posts: 91
Joined: Wed Jul 07, 2010 12:26 pm

Re: RB1000 SSTP, major disconnect issues

Mon Sep 17, 2012 4:19 pm

I also have some SSTP issues reported to support. They appear to be the SSTP process running at 100% CPU for no reason, which is resolved by stopping and starting the SSTP server on the router. Disconnecting users has little effect, and I've also seen SSTP connections coming back in without users. MikroTik support said they were going to log into my router a few weeks back, but I never heard anything.

I know you have a lot of users and a large amount of throughput, but my user base is relatively small and mostly on PPTP connections, with just a single user that I moved over to trial SSTP. I just wanted you to be aware that there appear to be issues at the lower end of the scale as well as with a large user base, in case this is relevant to you.

I'm running RouterOS (v5.20) on VMware ESXi 5 on a Dell PowerEdge R710, just in case this is relevant.

Regards, Dominic.
 
dominicbatty
Frequent Visitor
Posts: 91
Joined: Wed Jul 07, 2010 12:26 pm

Re: RB1000 SSTP, major disconnect issues

Tue Sep 18, 2012 4:36 pm

I've just received word from Mikrotik support that my 100% CPU issues with SSTP have been found and fixed in 5.21 RC1 for which they have sent me a download link to try. Might be worth you trying as well, email me dominic at abaca.co.uk if you want a copy of the link as I don't really know if I should post it in here. Thanks, Dominic.
 
gnuttisch
Member
Posts: 309
Joined: Fri Sep 10, 2010 3:49 pm

Re: RB1000 SSTP, major disconnect issues

Tue Oct 23, 2012 9:11 am

Any news? Does it work?
