WirelessRudy
Forum Guru
Topic Author
Posts: 3089
Joined: Tue Aug 08, 2006 5:54 pm
Location: Spain

ROS dns cache has its limits?

Tue May 10, 2016 11:32 am

/ip dns
set allow-remote-requests=yes cache-max-ttl=1d cache-size=2048KiB max-udp-packet-size=4096 query-server-timeout=2s query-total-timeout=10s servers=8.8.8.8,8.8.4.4,208.67.222.222,208.67.220.220

/ip firewall nat
add action=redirect chain=dstnat comment="re-route dns requests to Google DNS" dst-port=53 protocol=udp to-ports=53
add action=redirect chain=dstnat comment="re-route dns requests to Google DNS" dst-port=53 protocol=tcp to-ports=53

With about 700 connected users, this system stops resolving after an hour or so for several clients (not for all!).
The DNS cache sits in our gateway router, a CCR1016-12G with 512 MB memory.

Someone told me the DNS cache can only handle up to 100 parallel requests...

Ideas?
Show your appreciation of this post by giving me Karma! Thanks.

Rudy R. Puister

WISP operator based on MT routerboard & ROS.
 
ZeroByte
Forum Guru
Posts: 4051
Joined: Wed May 11, 2011 6:08 pm

Re: ROS dns cache has its limits?

Tue May 10, 2016 6:03 pm

You should set up your own caching DNS servers. A good strategy for a larger deployment would be to set up a pair of anycast IP addresses and deploy lightweight forwarding-only caches around your network, all pointed at your real caching resolver hosts.

Configure all clients to use your anycast address (there was a suggestion to use 100.100.100.100 in the previous thread) as their DNS server address.
Configure your anycast local caches to forward unknown requests to your central caches.
Configure your central caches to do standard resolution (root -> TLD -> domain traversal).

The anycast solution is good because any given user will get a reply from the nearest instance of the anycast server, which reduces network traffic. Since this host performs caching, subsequent queries for the same data will not leave the local region of your network. Meanwhile, having your own centralized cache reduces your total outbound traffic to the global DNS servers: if a second instance of the anycast server gets a query that was recently requested by the first instance, the answer will already be cached in your central server, so the second anycast resolver simply receives the centrally cached information.
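To make the two-tier idea concrete, here is a minimal Python sketch of local caches forwarding to one central cache. It is only a toy model: the class name `Cache`, the node names, and the placeholder answer are assumptions for illustration, and TTL expiry is ignored entirely.

```python
class Cache:
    """A toy DNS cache that forwards misses to an upstream resolver."""

    def __init__(self, name, upstream):
        self.name = name
        self.upstream = upstream  # callable: qname -> answer
        self.store = {}           # qname -> cached answer (no TTL handling)
        self.misses = 0

    def resolve(self, qname):
        if qname not in self.store:
            self.misses += 1
            self.store[qname] = self.upstream(qname)
        return self.store[qname]


internet_queries = []  # every query that leaves the network ends up here


def internet(qname):
    """Stand-in for full resolution against the global DNS (hypothetical answer)."""
    internet_queries.append(qname)
    return "192.0.2.1"


central = Cache("central", internet)
tower_a = Cache("tower-a", central.resolve)  # anycast node at one site
tower_b = Cache("tower-b", central.resolve)  # anycast node at another site

tower_a.resolve("example.com")  # misses locally and centrally, goes outside
tower_a.resolve("example.com")  # answered by tower-a's own cache
tower_b.resolve("example.com")  # misses locally, answered by the central cache

print(len(internet_queries))  # 1
```

Three client queries at two sites cost a single query to the outside world, which is the traffic saving described above.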

Finally, you'll want to make sure that your anycast and central cache resolvers only respond to queries from your own IP ranges. Then, if a user gets infected with a botnet and attempts to participate in a DDoS attack, their packets will go no farther than the nearest anycast node. Either the DNS query appears to come from the victim's IP rather than your customer's IP, in which case the node discards it, or the query appears to come from the customer's router (srcnat at the router), in which case the replies just go from the local anycast node back to the affected customer. You could also place reasonable rate limits on the anycast nodes to further reduce the impact of DDoS clients in your network.
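The two checks can be sketched in Python as follows. This is not how any particular resolver implements them; the prefixes in `ALLOWED`, the `RateLimiter` class, and the limit values are all assumptions for illustration, and a real deployment would normally use the resolver's own ACL and rate-limit settings or a firewall.

```python
import ipaddress

# Hypothetical customer ranges; replace with the prefixes you actually use.
ALLOWED = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("100.64.0.0/10"),
]


def is_allowed(src):
    """True if the query's source address falls inside one of our own ranges."""
    addr = ipaddress.ip_address(src)
    return any(addr in net for net in ALLOWED)


class RateLimiter:
    """Fixed-window limiter: at most `limit` queries per source per `window` seconds."""

    def __init__(self, limit, window=1.0):
        self.limit = limit
        self.window = window
        self.counts = {}  # src -> (window_start, queries_seen_in_window)

    def allow(self, src, now):
        start, count = self.counts.get(src, (now, 0))
        if now - start >= self.window:
            start, count = now, 0  # the old window has expired, start a new one
        if count >= self.limit:
            self.counts[src] = (start, count)
            return False
        self.counts[src] = (start, count + 1)
        return True


def should_answer(src, limiter, now):
    """Answer only queries from our own ranges that are within the rate limit."""
    return is_allowed(src) and limiter.allow(src, now)
```

The caller passes the current time as `now` (for example from `time.monotonic()`), which keeps the limiter easy to test. A spoofed source such as 8.8.8.8 fails `is_allowed` and is dropped before the rate limiter is even consulted.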
When given a spoon,
you should not cling to your fork.
The soup will get cold.
 
WirelessRudy

Re: ROS dns cache has its limits?

Tue May 10, 2016 6:59 pm

OK, I'm going to talk 'simple' here, so that both I and the non-expert reader understand... :?

An 'anycast' address? I think we already have something like that... We have a bridge without ports in the main gateway, called 'loopback', with a /32 address. (My assistant set it up because it was needed for OSPF.)
I use this IP as the address my CPEs point to for their time signal. I'd presume this is what you call an 'anycast' address?

Also, the PPPoE tunnels are made from this same CCR towards the APs (or a small router in front of them), so I'd presume putting a cache on any router halfway along the tunnel path makes no sense.
So we might as well set up the DNS cache server on the CCR?
And if that is not working (as it currently is not), we'd better arrange a DNS cache server, as suggested, that just 'hangs' off this gateway CCR?
 
ZeroByte

Re: ROS dns cache has its limits?

Tue May 10, 2016 8:09 pm

The address you're referring to is called a loopback address.

Anycast means that multiple hosts on the network in arbitrary locations may all use and respond to the exact same IP address.

8.8.8.8 is an anycast host - meaning that Google has many hosts all over the world with this same IP address, and whichever instance of it is nearest to you is the one where your request will be routed.

Suppose your network has 20 tower sites. If you were to create an anycast address 100.100.100.100 and assign it to 20 DNS servers, you could put an instance of that server in every tower site on your network, and DNS requests from each tower site would be routed to the local instance. If a node fails, it will stop participating in the anycast, so some other instance of 100.100.100.100 will take up the slack while you repair the failed node.

Here's a basic overview of how you accomplish this:

You advertise the route for 100.100.100.100 into OSPF from every box that is going to participate in the anycast. The host will also need its own unique (non-anycast) address so that replies to its own requests can be routed back to it properly.

So the host would run OSPF and have, say, a /30 between it and the site's router - e.g. 100.100.100.0/30 with .2 = the unique address of the anycast node. The node will advertise 100.100.100.100/32 into OSPF, and the directly-connected router will obviously want to use that node as the shortest path to the anycasted address. There will be other 100.100.100.100/32 advertisements in OSPF, so some router elsewhere in the network can choose whichever one is closest.
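The "closest advertisement wins" behaviour can be illustrated with a small shortest-path simulation in Python. This is not OSPF itself (the routers do this calculation for you); the topology, link costs, and router names below are made up for the example.

```python
import heapq


def nearest_instance(graph, start, advertisers):
    """Return (cost, router) for the lowest-cost advertiser reachable from `start`.

    `graph` maps router -> {neighbour: link_cost}. Dijkstra visits routers in
    order of increasing cost, so the first advertiser visited is the nearest.
    """
    dist = {start: 0}
    heap = [(0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > dist.get(node, float("inf")):
            continue  # stale heap entry
        if node in advertisers:
            return cost, node
        for neighbour, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < dist.get(neighbour, float("inf")):
                dist[neighbour] = new_cost
                heapq.heappush(heap, (new_cost, neighbour))
    return None  # nobody is advertising the anycast address


# Hypothetical topology: towerC hangs off towerA, both towers connect to core.
graph = {
    "core": {"towerA": 10, "towerB": 10},
    "towerA": {"core": 10, "towerC": 10},
    "towerB": {"core": 10},
    "towerC": {"towerA": 10},
}

# Both towerA and towerB advertise 100.100.100.100/32.
print(nearest_instance(graph, "towerC", {"towerA", "towerB"}))  # (10, 'towerA')

# towerA's DNS node withdraws its advertisement; traffic shifts to towerB.
print(nearest_instance(graph, "towerC", {"towerB"}))  # (30, 'towerB')
```

Customers behind towerC keep using the same DNS address in both cases; only the path their queries take changes.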

You would monitor the health of the node using its unique address (100.100.100.2), and whenever that node sends a request to the central caching resolvers, it would send the request from the .2 address. If the node stops responding to DNS queries, you would take it "out of service" by dropping the advertisement for 100.100.100.100. If the node actually dies completely or loses connectivity (a tech pulls the wrong network cable, etc.), then naturally the device will drop out of the pool because it can't announce itself anymore.
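A health check along these lines could look like the following Python sketch. It sends a plain DNS query to the node's unique address and tracks consecutive failures. The names `probe` and `Advertiser`, the test name `example.com`, and the threshold of three failures are assumptions, and the actual withdrawal of the OSPF advertisement is left out because it depends on the routing daemon in use.

```python
import socket
import struct


def build_query(qname, query_id=0x1234):
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (RD=1), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in qname.rstrip(".").split(".")
    )
    question = labels + b"\x00" + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def probe(server, qname="example.com", timeout=1.0, port=53):
    """Return True if `server` sends back a DNS response matching our query ID."""
    query = build_query(qname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, port))
            reply, _ = sock.recvfrom(4096)
        except OSError:  # timeout, port unreachable, no route, ...
            return False
    # Same ID as the query, and the QR bit set (it is a response).
    return len(reply) >= 12 and reply[:2] == query[:2] and bool(reply[2] & 0x80)


class Advertiser:
    """Decides whether the node should keep announcing the anycast address."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = 0
        self.advertising = True

    def record(self, healthy):
        if healthy:
            self.failures = 0
            self.advertising = True
        else:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.advertising = False
        return self.advertising
```

A monitoring loop would call something like `state.record(probe("100.100.100.2"))` every few seconds and drop or restore the 100.100.100.100/32 advertisement whenever the returned value changes. Requiring several consecutive failures avoids flapping the route on a single lost packet.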

If the caching anycast node allows you to define static entries, you could do something clever like configure each node to give its unique address as the answer to the query mynode.example.com (or whatever). Then, when working with a customer, if you need to know which server is actually talking to them, have them ping "mynode.example.com" and see what IP comes back, what the ping time is, etc.

Another good thing to resolve locally in the anycast nodes would be all of the RFC1918 private IP reverse DNS -> NXDOMAIN, so that queries for local IP addresses don't go bouncing up the chain to the central servers or to the Internet. (The root servers get tons of wasted traffic asking for reverse DNS on private IP space.)
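For reference, the reverse zones that cover RFC1918 space can be generated with a few lines of Python. This is only a helper sketch (the function name `reverse_zones` is made up); how you then configure each zone to be answered locally depends on the resolver software you pick. RFC 6303 lists the same zones among those a resolver should serve locally.

```python
import ipaddress

RFC1918 = ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"]


def reverse_zones(prefix):
    """Return the in-addr.arpa zones that together cover an IPv4 prefix."""
    net = ipaddress.ip_network(prefix)
    # Reverse zones are delegated on octet boundaries, so round the prefix
    # length up to the next multiple of 8 and enumerate the pieces.
    octet_len = -(-net.prefixlen // 8) * 8
    zones = []
    for sub in net.subnets(new_prefix=octet_len):
        octets = str(sub.network_address).split(".")[: octet_len // 8]
        zones.append(".".join(reversed(octets)) + ".in-addr.arpa")
    return zones


all_zones = [zone for prefix in RFC1918 for zone in reverse_zones(prefix)]
print(len(all_zones))  # 18
print(all_zones[0])    # 10.in-addr.arpa
print(all_zones[1])    # 16.172.in-addr.arpa
print(all_zones[-1])   # 168.192.in-addr.arpa
```

The 172.16.0.0/12 block does not sit on an octet boundary, which is why it expands into sixteen zones (16.172.in-addr.arpa through 31.172.in-addr.arpa).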
 
WirelessRudy

Re: ROS dns cache has its limits?

Tue May 10, 2016 9:04 pm

Wow @ZeroByte! Good stuff and reasonably good explanations! :D

The only thing that is abracadabra to me is the last paragraph...

Anyway, on my network with some 700 clients and some 20 AP locations (some of them just an OmniTik on a roof with a backhaul antenna connecting it to an uplink node and upwards to my central), I think it's a bit over the top.
On the other hand, we might set up two or three strategically located servers that would then at least give us failover in the network the way you describe...

But to start, we are going to order a quad-core server for our network. We are thinking of hosting our own local speedtest.net server and we run a virtual machine, so there is use for a server.
Any recommendation for setting up a DNS 'cache' server?

And what about a proxy cache? On a 300/300 microwave link, would that still bring any benefit to a network? We tried one once but didn't notice much improvement. But that was years ago, and the second-hand device crashed one day and was binned. Later we tried it on our CCR, again without much improvement. Problems, yes...
After all, locally stored web content can speed things up, but then the extra routing and page renewal in the cache itself hold things back?
 
ZeroByte

Re: ROS dns cache has its limits?

Tue May 10, 2016 10:14 pm

Well, it's just DNS caching, not web caching. But the discussion that spawned this thread made the point that "snappy, responsive DNS is an important key to a snappy, responsive user experience" and the closer a cache is to a user, the faster it can respond.

As for the central DNS resolver hosts, you can get away without tons of horsepower: pretty much any decent box with a decent amount of RAM these days will do just fine for a proxy resolver (at least at your scale).

You could even try a local proxy resolver in a ROS device near some customers by transparently redirecting their queries to the local box, which is configured to use your own server as the forwarding resolver... But I think the ROS caching proxy resolver might not be a fantastic choice for production use in a WISP... You could look into using an OpenWRT image with BIND, or something more streamlined like dnsprox, etc.
