Load Balancing, Mangle and FailOver Oh My... Little help?

This is only my second job ever using dual WANs.
WAN 1 is a Verizon 4G LTE modem from Cradlepoint.
WAN 2 is a T1.

The T1 is not reliable at all and bogs down with just a few clients.
So we were going to switch them to 4G LTE. My only worry there is the data caps on 4G. Depending on the plan, I can get them 30 GB per month for $120, but each GB over that costs $15.

“What about Sonos and Netflix? We really want to be able to stream a lot of stuff.”

So my plan became to split one /24 LAN across the 2 WANs. All connections from the computers would go out WAN 1, and all the media devices would go out WAN 2. That way web surfing and general usage run at full pace, while media devices that buffer and use tons of data go out the SLOWER but uncapped T1.

192.168.1.0/24
So I set up my network to have media devices between 200-245. I set the customer's computers to be between 80-99. Then there is a DHCP pool from 100-150.

First Time with Mangle.
Set a rule to mark packets from 192.168.1.80-192.168.1.99 as “HIGHSPEED”

Then I went to Routes and added a route that tells packets marked HIGHSPEED to use WAN 1.
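
As a rough sketch, the mangle rule and route described above might look like this from the terminal. The gateway 10.0.0.1 is only a placeholder (the real WAN 1 gateway is handed out by DHCP), the dst-address exclusion is an assumption added so that LAN-to-LAN traffic is not marked, and the syntax is the v5/v6 style with routing-mark on the route.

```
/ip firewall mangle
# Mark routing for the "highspeed" computers (.80-.99).
# Assumption: traffic that stays inside the LAN is left unmarked.
add chain=prerouting src-address=192.168.1.80-192.168.1.99 \
    dst-address=!192.168.1.0/24 action=mark-routing \
    new-routing-mark=HIGHSPEED passthrough=no
/ip route
# 10.0.0.1 is a placeholder for the WAN 1 (4G) gateway.
add dst-address=0.0.0.0/0 gateway=10.0.0.1 routing-mark=HIGHSPEED
```

Anything that does not carry the HIGHSPEED mark keeps using the main routing table, which is why the media devices end up on whichever default route wins there.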

Now the 2 WANs are both DHCP from the providers. The T1 does not provide any peer DNS info… so that had to be set manually to get it to work.

Added the provider's DNS info to DNS Servers. The DNS from WAN 1 (Verizon) came up as Dynamic.

THIS WORKED GREAT!!!

The T1 had a distance of ZERO by default. The 4G gave itself a distance of ONE.

Everything from the HighSpeed network went to WAN 1; everything else went to WAN 2.

THROW IN A POWER OUTAGE…

There have been major outages across the DC Metro Area. Yesterday I got a call that the system was not working…
First thought… maybe the 4G tower is down.

I arrived on site to find the closest 4G tower was down, but the system was using another 4G tower.
The real issue is that the T1 is up but not passing traffic. It pulls an IP and gateway… but nothing works. So I assume the provider is down further upstream while the T1 still makes it back to the central station.

I checked Routes and it shows that WAN 2 is reachable. So the router is sending traffic out but not getting replies.

So I started to think that my DNS info is trying to use WAN1's DNS that I entered manually.

After a ton of messing with it I got it going… but I think I probably made it a lot harder than it needed to be.

So starting from the top…
I want the members of Highspeed to use WAN 1 if it is up. If it is not working I would like it to use WAN 2.
I want everything else to use WAN 2 ONLY.
I thought I did this with Mangle:
If source is 80-99, mark as High Speed
If source is NOT 80-99, mark as Media

I found that the system worked without the MEDIA route rule because the router assigned WAN 2 a distance of 0.

I guess what I need is a way to make sure that WAN 1 actually works. If it does not, switch to WAN 2.

So I thought of Netwatch:
Up: ip route enable “Highspeed”
Down: ip route disable “Highspeed”

But I would need to make sure that Netwatch is checking that Highspeed (WAN 1) is actually passing traffic.
I thought that I did this in Routes… by adding a route that sends 8.8.4.4 out ether1. But that did not work.

I need to set it so that one IP address is reachable only through WAN 1. Then I monitor that address, and when it goes down I disable the rule that forces traffic to WAN 1.
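
A rough sketch of that idea: pin one test address to WAN 1 with a host route, then let Netwatch toggle the HIGHSPEED route depending on whether that address answers. The gateway 10.0.0.1, the route comments, and the interval/timeout values are placeholders chosen for illustration, not values from this setup.

```
/ip route
# Pin the probe address to WAN 1 (10.0.0.1 is a placeholder gateway).
add dst-address=8.8.4.4/32 gateway=10.0.0.1 comment="WAN1-probe"
# Assumes the policy route carries the comment "Highspeed"
# so the Netwatch scripts can find it.
add dst-address=0.0.0.0/0 gateway=10.0.0.1 routing-mark=HIGHSPEED \
    comment="Highspeed"
/tool netwatch
add host=8.8.4.4 interval=10s timeout=2s \
    up-script="/ip route enable [find comment=\"Highspeed\"]" \
    down-script="/ip route disable [find comment=\"Highspeed\"]"
```

While the Highspeed route is disabled, marked traffic finds nothing in its own table and falls through to the main table, which here means WAN 2. One caveat with this sketch: if the host route itself ever goes inactive (for example the WAN 1 port loses link), the probe could leak out WAN 2 and report a false "up".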

Sorry for the long post.

For the failover I found this solution very good:
http://wiki.mikrotik.com/wiki/Advanced_Routing_Failover_without_Scripting

For the DNS issues, you had better use public DNS servers, since with more than one ISP
you may not be able to reach one ISP's DNS server when the traffic goes out through the other ISP's gateway.
Also make sure to redirect all DNS requests to the router itself.
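
A minimal sketch of that DNS advice, assuming ether2 is the LAN port (as stated elsewhere in this thread) and using Google's public resolvers as an example choice:

```
/ip dns
# Public resolvers reachable through either ISP.
set servers=8.8.8.8,8.8.4.4 allow-remote-requests=yes
/ip dhcp-client
# Stop ISP-supplied resolvers from being added as dynamic servers.
set [find] use-peer-dns=no
/ip firewall nat
# Redirect any DNS query arriving from the LAN to the router's own resolver.
add chain=dstnat in-interface=ether2 protocol=udp dst-port=53 \
    action=redirect to-ports=53
add chain=dstnat in-interface=ether2 protocol=tcp dst-port=53 \
    action=redirect to-ports=53
```

With allow-remote-requests enabled, port 53 should also be blocked on the WAN interfaces in the input chain so the router does not become an open resolver.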

I have looked at that page before.

I wonder if it’s outdated. RouterOS 5.18 will not let me put a word in dst-address.

I didn’t understand you. What word do you want to put in dst-address? Can you post an example?

Sorry about that.

After reading it over SLOWLY and repeatedly… it became clear that words like HOST are placeholders.

I need to put actual pingable addresses in all those places.

Does anyone have an export of /ip route that is using this? I could really use a demonstration of what actually works.

You have to put the public address of a server that is known not to go down, like a Google DNS server.
For example, if your gateway is 1.1.1.1, it would look like this:

/ip route 
add dst-address=8.8.8.8 gateway=1.1.1.1 scope=10 target-scope=10
add dst-address=0.0.0.0/0 gateway=8.8.8.8 check-gateway=ping
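
Extended to two WANs with a HIGHSPEED routing mark, the same pattern might look roughly like the sketch below. The next hops 10.0.0.1 (WAN 1) and 10.0.1.1 (WAN 2) are placeholders, and the choice of 8.8.8.8 and 8.8.4.4 as probe hosts is just an example.

```
/ip route
# Probe hosts, each pinned to one WAN's next hop (placeholder gateways).
add dst-address=8.8.8.8/32 gateway=10.0.0.1 scope=10
add dst-address=8.8.4.4/32 gateway=10.0.1.1 scope=10
# HIGHSPEED table: prefer WAN 1, fall back to WAN 2.
add dst-address=0.0.0.0/0 gateway=8.8.8.8 routing-mark=HIGHSPEED \
    distance=1 check-gateway=ping
add dst-address=0.0.0.0/0 gateway=8.8.4.4 routing-mark=HIGHSPEED \
    distance=2 check-gateway=ping
# Main table: WAN 2 only.
add dst-address=0.0.0.0/0 gateway=8.8.4.4 distance=1 check-gateway=ping
```

This assumes the DHCP clients are not also installing their own default routes at a lower distance; if they are, those dynamic routes would win over the ones above.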

Can my gateway be a physical interface? I have 2 dynamic IP WANs.

WAN1 is Cellular on Ether1
WAN2 is a T1 on Ether5
LAN is on Ether2

That page you pointed me to has Host1A Host1B…

Thanks for the help.

Unfortunately not, your gateways cannot be interfaces. That solution relies on the next hop in the routing table,
and setting an interface as the gateway does not define a next hop. Host1A and Host1B are the
public servers you want to ping, meaning you can put there the IP of any server you choose.
I like that solution because it is straightforward and does not need much configuration, but if you have
to specify an interface as the gateway rather than an IP, then you have to look at failover with scripts.
There are a few examples on the wiki (failover with script).
Try the one that suits you best and tell us if anything is not clear.