One of my Mikrotiks is behind an ISP that hides all of its routers from traceroutes run by devices on its network (stupid, but whatever). There are six hops between that device and the ISP’s upstream (the first place that responds to ICMP pings), and the Mikrotik /tool traceroute tool automatically stops probing after five consecutive timeouts, so it’s impossible for me to trace a route from this router to anywhere on the WAN.
Sample from the device in question outbound:
[admin@ANC] > /tool traceroute use-dns=yes count=10 8.8.8.8
# ADDRESS LOSS SENT LAST AVG BEST WORST STD-DEV STATUS
1 100% 10 timeout
2 100% 10 timeout
3 100% 10 timeout
4 100% 10 timeout
5 100% 10 timeout
[admin@ANC] >
To illustrate what’s going on, here’s a traceroute from me at my current location to the device in question:
[admin@SGF] > /tool traceroute use-dns=yes [redacted] count=10
# ADDRESS LOSS SENT LAST AVG BEST WORST STD-DEV STATUS
1 100.64.103.1 0% 10 0.4ms 0.8 0.4 3.8 1
2 38.131.218.241 0% 10 0.2ms 0.2 0.2 0.3 0
3 38.65.114.217 0% 10 2.7ms 2.7 2.7 2.7 0
4 v320.core1.mci3.he.net 0% 10 9.1ms 13 9.1 19.7 4.6
5 100ge12-1.core1.den1.he.net 0% 10 30.8ms 31.7 21.3 58.3 12.2
6 100ge10-2.core1.slc1.he.net 0% 10 74.2ms 43.1 32.7 74.2 12.5
7 100ge11-2.core1.pdx1.he.net 0% 10 51.7ms 50.6 49.1 60.8 3.5
8 gci2.nwax.net 0% 10 49.4ms 49.5 49.3 49.9 0.2
9 10.128.4.157 40% 10 timeout 49.7 49.4 50.3 0.3
10 10.128.4.166 40% 10 timeout 73.8 73.3 75.5 0.8
11 10.128.240.17 40% 10 timeout 73.8 73.3 75.2 0.6
12 10.128.4.130 30% 10 timeout 88.5 88.4 88.7 0.1
13 10.128.8.57 30% 10 timeout 91.2 90.8 92.2 0.4
14 100% 10 timeout
15 x-x-x-x.gci.net 0% 10 98.8ms 99.9 98.7 101.1 0.8
[admin@SGF] >
As you can see, hops 9-14 are either private IPs or nonresponsive. It’s those six hops that don’t appear at all when tracing outbound from inside GCI’s network. (I’ve also tried with protocol=udp and the result is the same, so that’s not a solution.)
I don’t have access to a computer behind that router at the moment, but tracing from Windows or Unix works fine–there are six nonresponsive hops and then the 7th hop and beyond work fine.
So my question is: is there a way to force the Mikrotik to continue past five nonresponsive traceroute hops? Maybe playing with the TTL or something?
Thanks.
You can make a firewall rule that catches those packets from your IP to the ISP LAN and redirects them to your own router. I didn’t know the traceroute stops after 5 failures.
Edit: actually, my Mikrotik traceroute doesn’t stop after 5 timeouts.
Thanks. OK, so going with the nugget I gleaned from your advice, I set up a Mangle rule on the Output chain that sets the TTL of any packet with a TTL below 5 to 5. I tested it on my device here and it had the intended effect of making my first four hops all the same as my fifth hop.
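A rule along those lines would look roughly like this. It is a sketch based on the description above rather than an export from the router: the comment text is made up for illustration, and the value 5 is only the number of silent hops assumed at this point, so it should be raised to however many hops actually stay hidden.

```
# Sketch only: raise the TTL of router-originated packets whose TTL is below 5,
# so the first traceroute probes land beyond the hops that never answer.
/ip firewall mangle
add chain=output action=change-ttl ttl=less-than:5 new-ttl=set:5 comment="traceroute: skip silent ISP hops"
```

Because the rule sits on the Output chain, it applies to every low-TTL packet the router itself sends, not just traceroute probes, so it is probably best disabled again once the trace is done.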
I then deployed the same rule on the device in Anchorage and it didn’t work. I found I had to increase the TTL all the way to 9 to get any results. Apparently GCI, the ISP, hides even more of its network from the inside interface than I thought–the first hop I was able to get to respond was all the way in Seattle (1,500 miles away, at the internet exchange there). They started doing that a couple years ago for completely unknown reasons but since I usually trace routes from a Windows or Linux machine and not from the Mikrotik itself, it never affected me (beyond it being a minor annoyance to wait while those hops failed to respond):
[admin@ANC] > /tool traceroute use-dns=yes count=10 8.8.8.8
# ADDRESS LOSS SENT LAST AVG BEST WORST STD-DEV STATUS
1 google.nwax.net 0% 10 55.1ms 56.5 54.3 69.5 4.4
2 google.nwax.net 0% 10 55ms 55.7 53.7 61 2
3 google.nwax.net 0% 10 57ms 56.5 54 66 3.9
4 google.nwax.net 0% 10 55ms 54.9 54 56 0.7
5 google.nwax.net 0% 10 55ms 56.3 54 62 2.3
6 google.nwax.net 0% 10 54.4ms 55.3 54.1 60.4 1.8
7 google.nwax.net 0% 10 54.5ms 55 54.4 55.9 0.5
8 google.nwax.net 0% 10 54.5ms 55.7 54.4 58.6 1.2
9 google.nwax.net 0% 10 54.9ms 54.9 54.3 55.7 0.4
10 108.170.245.113 0% 10 54.9ms 55 53.9 56.2 0.6
11 108.170.231.27 0% 10 55.3ms 56.5 54.4 68.4 4
12 google-public-dns-a.google.com 0% 10 54.7ms 56.1 54 62.6 2.4
[admin@ANC] > /tool traceroute use-dns=yes count=10 206.81.93.33
# ADDRESS LOSS SENT LAST AVG BEST WORST STD-DEV STATUS
1 12.122.158.182 0% 10 89.2ms 82.2 79.1 89.2 2.6
2 12.122.158.182 0% 10 83.2ms 81.9 79.1 87.5 2.6
3 12.122.158.182 0% 10 80.1ms 80.8 79.3 83.9 1.4
4 12.122.158.182 0% 10 79.7ms 80.4 79.6 83.3 1
5 12.122.158.182 0% 10 79.9ms 80.5 79.2 87.6 2.4
6 12.122.158.182 0% 10 80.4ms 80.8 79.7 83.9 1.5
7 12.122.158.182 0% 10 79.3ms 80.9 79.3 84.2 1.8
8 12.122.158.182 0% 10 88.2ms 81.8 79.7 88.2 2.6
9 12.122.158.182 0% 10 80.3ms 81.9 79.4 95.7 4.8
10 12.122.158.157 0% 10 79.1ms 79.3 78.9 80.4 0.4
11 12.117.205.66 0% 10 85.6ms 80.8 79.4 85.6 1.7
12 ip-206-81-93-33.astac.net 0% 10 109.3ms 102.3 99.9 109.3 2.9
[admin@ANC] >
Anyway, it works, so I was able to diagnose the path I was looking at. (I needed to see if that second test above, to the ASTAC address, was staying local to Alaska or routing from Anchorage to Seattle and back–it’s the latter.) Thanks.
I thought maybe it’s just the fact that the first five were timeouts but I was able to sort of recreate it here:
[admin@SGF] > /tool traceroute use-dns=yes protocol=udp count=10 [redacted]
# ADDRESS LOSS SENT LAST AVG BEST WORST STD-DEV STATUS
1 100.64.103.1 0% 10 0.4ms 1.9 0.4 5.2 1.9
2 38.131.218.241 0% 10 0.2ms 0.2 0.2 0.3 0
3 38.65.114.217 0% 10 2.7ms 2.7 2.7 2.8 0
4 v320.core1.mci3.he.net 0% 10 9ms 11.4 9 21.1 4.6
5 100ge12-1.core1.den1.he.net 0% 10 21.5ms 21.4 21.3 21.5 0.1
6 100ge10-2.core1.slc1.he.net 0% 10 40ms 38 32.7 42.7 4.4
7 100ge11-2.core1.pdx1.he.net 0% 10 49.9ms 57.7 49.1 88 12.1
8 gci1.nwax.net 0% 10 49.5ms 49.5 49.3 49.9 0.2
9 10.128.4.157 10% 10 timeout 49.5 49.3 50.1 0.2
10 10.128.4.166 30% 10 timeout 73.7 73.4 74.7 0.4
11 10.128.240.17 0% 10 73.4ms 73.4 73.3 73.6 0.1
12 10.128.4.130 0% 10 88.4ms 88.5 88.4 88.6 0.1
13 10.128.8.57 0% 10 90.9ms 91 90.8 91.1 0.1
14 100% 10 timeout
15 100% 10 timeout
16 100% 10 timeout
17 100% 10 timeout
18 100% 10 timeout
[admin@MikroTik] >
I don’t have the far-end device set to respond to UDP probes, so it effectively serves as a nonresponsive device, and it causes the same behavior: you can see the trace stops after five nonresponsive hops, even if they’re at the end of the traceroute. I’m running 6.41.1.