I've figured out an easy way to mark traffic related to YouTube, Facebook, etc.
Introduction:
So far, I have read on the forum about people manually maintaining huge lists of IP addresses for this purpose. Starting from v6.36, RouterOS allows adding domain names to address lists. This means we can simply add youtube.com to an address list, and RouterOS will automatically resolve the hostname to its IP address(es) and place them into the list. This helps a lot, and one might hope that adding youtube.com alone would be enough to identify the desired traffic. It is not, because YouTube, Facebook and others deliver their content through a CDN (Content Delivery Network) and only use the main domain for the web UI. Furthermore, these CDNs are usually geographically dispersed around the world, in order to serve consumers from the closest servers.
These CDN hostnames/IP addresses are what we need in order to mark/filter/limit the traffic to/from them.
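For reference, adding a domain name to an address list looks roughly like this. The list name "social" is the one used later in this post; the comment text is just an illustration.

```
/ip firewall address-list
# RouterOS (v6.36+) resolves the hostname and adds the resulting IPs as dynamic entries
add list=social address=youtube.com comment="main site only, not the CDN"
```

As explained above, an entry like this only covers the web UI, not the CDN hosts that deliver the actual content.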
The problem:
We need a way to automatically collect all the CDN IP addresses into an address list, so that we can mark/filter/limit the traffic to/from these addresses. If we take a look at a single YouTube video request, we'll see that the content is usually delivered from the "googlevideo" CDN. Hostnames like these typically show up in your connection list (the exact names depend on the country you are connecting from):
r5.sn-ncc-cxbe.googlevideo.com
r15.sn-c0q7lnek.googlevideo.com
r5.sn-4g5ednsl.googlevideo.com
It's apparent that the subdomains are quite difficult to predict (which I guess is intentional, since they don't want their service to be easily blocked). But one thing is predictable, and that is the domain googlevideo.com itself. The same is true of Facebook and the other big players.
One of the possible solutions:
Now, we'd like to automate the discovery of all the hosts on such CDNs that our users are visiting. We could sniff the content of the HTTP requests to youtube.com and see which CDN hosts are being offered for content delivery, but this is CPU-intensive and increasingly impossible, since most websites are switching to HTTPS. Another way is to sniff the DNS requests for hostnames on these CDN domains and extract the hostnames/IP addresses we need. The issue with this approach is that our users might be using public DNS servers (like Google's 8.8.8.8), so we would need to sniff all DNS traffic in order to discover new CDN hostnames, and parsing the DNS requests/responses also costs CPU. A better option is to redirect all DNS requests to the DNS service running on our router. This way, our router serves as the DNS server for our users, no matter which public DNS server they have configured, and its DNS cache stays populated with the CDN hostnames our users are currently requesting. The downside of this approach is increased CPU usage, because the router now has to handle all of the DNS traffic.
So, in short, we could solve this problem in the following way. First, we redirect all the DNS requests to our DNS server, running on the router:
Code:
/ip firewall nat
add disabled=no chain=dstnat protocol=udp dst-port=53 action=redirect to-ports=53
add disabled=no chain=dstnat protocol=tcp dst-port=53 action=redirect to-ports=53
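One assumption worth stating: the redirect only helps if the router's DNS service accepts queries from clients. Below is a minimal sketch of that prerequisite, together with input rules that keep the resolver from being reachable from the Internet. The interface name ether1 is hypothetical; replace it with your own WAN interface.

```
# Let the router's DNS service answer queries from clients
/ip dns set allow-remote-requests=yes

# Assumption: ether1 is the WAN interface. Drop DNS queries arriving from the
# Internet so the router does not become an open resolver.
/ip firewall filter
add chain=input in-interface=ether1 protocol=udp dst-port=53 action=drop
add chain=input in-interface=ether1 protocol=tcp dst-port=53 action=drop
```

Where these filter rules belong depends on your existing input chain, so treat this as a starting point rather than a finished ruleset.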
Next, we create a simple script that searches the DNS cache for the entries we need (such as "googlevideo.com") and adds their hostnames to an address list named "social":
Code:
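# Add every cached googlevideo.com hostname to the "social" address list.
# on-error={} skips hostnames that are already in the list.
:foreach i in=[/ip dns cache find name~"googlevideo.com"] do={
    :do {
        /ip firewall address-list add list=social address=[/ip dns cache get $i name]
    } on-error={}
}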
The script relies on this find expression, which returns the matching DNS cache entries:
Code:
/ip dns cache find name~"googlevideo.com"
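To keep the address list up to date, the script has to run periodically. A minimal sketch using the scheduler, assuming the script has been saved under /system script with the hypothetical name "update-social", and with a one-minute interval chosen only as an example:

```
# Assumption: the script above is stored in /system script as "update-social"
/system scheduler
add name=update-social-schedule interval=1m on-event=update-social
```

A shorter interval picks up new CDN hostnames sooner but costs more CPU, so the value is worth tuning for your router.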
At this point, we can mark, filter or limit all packets going to or coming from these IP addresses.
If you have any ideas on how to improve this solution, please leave a comment.
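As an illustration of the marking and limiting step, here is a rough sketch. The mark names, the queue name, the target subnet 192.168.88.0/24 and the rate limits are all hypothetical and need to be adapted to your network.

```
/ip firewall mangle
# Mark connections whose destination is in the "social" address list
add chain=prerouting dst-address-list=social action=mark-connection new-connection-mark=social-conn passthrough=yes
# Mark all packets (both directions) that belong to those connections
add chain=prerouting connection-mark=social-conn action=mark-packet new-packet-mark=social-pkt passthrough=no

# Assumption: 192.168.88.0/24 is the LAN; max-limit is upload/download
/queue simple
add name=social-limit target=192.168.88.0/24 packet-marks=social-pkt max-limit=2M/10M
```

The connection is marked first so that reply packets coming back from the CDN are covered as well, not only the packets our users send.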