Secondary RADIUS Server

All,

I have a question I’ve been meaning to ask for some time.

I have two RADIUS servers set up, a primary and a secondary, in case one fails. Why won’t RouterOS fail over to the secondary RADIUS server when the primary has excessive timeouts?

Here’s my RADIUS config:

# jul/27/2009 10:26:51 by RouterOS 3.25
# software id = ****-***
#
/radius
add accounting-backup=no accounting-port=1813 address=10.100.1.12 \
    authentication-port=1812 called-id="" comment="Primary" disabled=no domain="" \
    realm="" secret=***** service=wireless timeout=300ms
add accounting-backup=no accounting-port=1813 address=10.100.1.13 \
    authentication-port=1812 called-id="" comment="Secondary" disabled=no domain="" \
    realm="" secret=***** service=wireless timeout=300ms

How does RouterOS handle multiple entries for the same service? It doesn’t appear to use a round robin approach, and even with excessive timeouts to the primary server it won’t fall back and use the secondary.

To be clear, it does use the secondary from time to time, although I don’t see the logic behind its choice. I want authentication to be as seamless to the user as possible. RouterOS’s approach is not seamless: it keeps trying to authenticate against a downed primary RADIUS server even though the secondary is online.

Can anyone clarify how RouterOS chooses which RADIUS server to use when multiple are listed?

You can list multiple RADIUS servers for different services: one for PPP, one for Hotspot, and so on.
To make them work as failover, you should use a netwatch-style script that enables/disables them based on ping:

/system script
add name=RadiusCHECK  source=":local i 0; {:do {:set i ($i + 1)} while (($i < 5) && ([/ping xxx.xxx.xxx.xxx interval=3 count=1]=0))}; :if ($i=5 ) do={/radius disable [find address=xxx.xxx.xxx.xxx]; /radius enable [find address=yyy.yyy.yyy.yyy] }; :if ($i<5 && [/radius get [find address=xxx.xxx.xxx.xxx] disabled]=yes) do={/radius enable [find address=xxx.xxx.xxx.xxx]; /radius disable [find address=yyy.yyy.yyy.yyy]}"

Here xxx.xxx.xxx.xxx is the primary RADIUS server and yyy.yyy.yyy.yyy is the backup.
Then run it from the scheduler:

/system scheduler
add comment="" disabled=no interval=30s name=RadiusCHECK on-event=RadiusCHECK start-time=startup

That’s a simple workaround, not a smart one though, since the machine may still answer pings while its RADIUS service is down.
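For comparison, the same ping-based switch can be sketched with the built-in netwatch tool instead of a script plus scheduler. This is only a rough sketch: the two addresses are assumed from the config in the first post, the comment text is made up, and it has the same weakness (it only detects a host that stops answering pings, not a stopped RADIUS service).

```
# Assumed addresses: 10.100.1.12 = primary, 10.100.1.13 = secondary.
# Netwatch pings the primary every 30s and runs down-script / up-script
# when its state changes.
/tool netwatch
add host=10.100.1.12 interval=30s timeout=1s comment="RADIUS primary check" \
    down-script="/radius disable [find address=10.100.1.12]; /radius enable [find address=10.100.1.13]" \
    up-script="/radius enable [find address=10.100.1.12]; /radius disable [find address=10.100.1.13]"
```

Since the scripts here contain no variables, they can sit inside double quotes without any escaping.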

AFAIR, ROS uses the first RADIUS server; in case of a timeout it uses the second one, and so on.

Well, it seems like ROS does that for every request =)

Hi,

What is wrong with this script? When I added it to my MT (5.24), I got an error:

/system script
add name=RadiusCHECK  source=":local i 0; {:do {:set i ($i + 1)} while (($i < 5) && ([/ping primary_radius interval=3 count=1]=0))}; :if ($i=5 ) do={/radius disable [find address=primary_radius]; /radius enable [find address=second_radius] }; :if ($i<5 && [/radius get [find address=primary_radius] disabled]=yes) do={/radius enable [find address=primary_radius]; /radius disable [find address=second_radius]}"
syntax error (line 1 column 58)

The problem is with: {:do {:set i ($i + 1)}

Thank you

Hi,

Does no one know why the script does not work?
Maybe someone has a different script for RADIUS failover?

Why do you need a script? I just tested the /radius failover function with v5.25, and it works fine. The radius servers will be contacted in the order in the “/radius” list IF they qualify for that user. Only if the server does not respond does it use the next server. If the first server accepts the connection (edit: returns a packet actually), the second server will not be used.

Add: If you are getting a lot of timeouts on the primary radius server, maybe you should work on that first. If it is a busy server or on a busy connection, you might want to try extending the timeout value.

/radius
print detail
set 0 timeout=2s

But it will now take 2 seconds before the primary RADIUS server times out and the second server is contacted.

Hi,

Both servers are working properly, and I have no problem with timeouts.
The problem appears when the first server goes down: the database is replicated only every 24 hours.
With two servers configured, users sometimes authenticate against the second server, and that is the problem.
I enabled Accounting Backup on the second server, but then it does not work when the first server goes down.
Someone has already described a similar problem on the forum.
Therefore, a good solution would be to use a script to switch servers.

With two servers configured, users sometimes authenticate against the second server, and that is the problem.

That is the problem. Why are your clients authenticating on the second server if the first is working? That sounds like a network problem.

And when you switch servers with a script, the problem will reverse itself. It won’t go away.

With accounting backup on the second server, that may cause a problem. If you are using the second server, that means the first is down. It won’t get any accounting backups. You are wasting your time.

Sometimes the second server responds faster than the first.

P.S.
The servers are reached over the WAN.

And what does that have to do with it? The router does not try to contact both RADIUS servers at the same time. It tries the first server that qualifies, and IF it does not respond by the timeout, THEN it tries to contact the second server. IF it cannot contact the second server, THEN you get a “RADIUS server not responding” message on the login page.
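One way to check this order on your own router is to log the RADIUS traffic and look at the per-entry counters. A rough sketch, assuming entry number 0 is whichever entry `/radius print` lists first; the exact log lines and counter names vary between RouterOS versions:

```
# Log RADIUS traffic in detail (the "radius" logging topic), kept in memory.
/system logging add topics=radius,debug action=memory

# After a few logins, check the log to see which server each request went to.
/log print

# Show the counters (requests, accepts, timeouts, ...) for entry 0.
/radius monitor 0 once
```

If the timeout counter on the first entry keeps climbing while the second entry gets the accepts, the router really is failing over per request.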

What I am attempting to tell you is you should have the first server set to backup accounting to the second server, so when the first fails, the second will have current data on your users.

What I am attempting to tell you is you should have the first server set to backup accounting to the second server, so when the first fails, the second will have current data on your users.

Yes, but when the first server fails and I set the second server to backup accounting, new users cannot connect.

Backup accounting to where? A failed first server? Add a third RADIUS server entry, and try accounting backup to it from the second server.

I tried, but that causes another problem. When I order the entries like this:

  1. second server with Accounting Backup checked
  2. first server
  3. second server

then when the first server is down, new users can authenticate, but on the second server I see each user twice (double authorization).
The first entry comes from Accounting Backup; the second is the correct authorization via the second server.

I found the problem with the script. The fix was to remove the double quotes ( " " ) around the source; after that I could add the script to the MT:

/system script
add name=RadiusCHECK source={
# Ping the primary repeatedly; i reaches 5 only if every ping fails.
:local i 0;
{:do {:set i ($i + 1)} while (($i < 5) && ([/ping primary_radius interval=3 count=1]=0))};
# Primary unreachable: switch to the secondary.
:if ($i=5) do={
    /radius disable [find address=primary_radius];
    /radius enable [find address=second_radius]
};
# Primary answers again but is still disabled: switch back.
:if ($i<5 && [/radius get [find address=primary_radius] disabled]=yes) do={
    /radius enable [find address=primary_radius];
    /radius disable [find address=second_radius]
}}

Thank you for the help.