We have a strange multi-ISP issue. Another set of eyes would be helpful.
https://drive.google.com/file/d/0B53FAY ... sp=sharing
The screenshot shows a traceroute to Google's servers over ISP2. Only the first hop is shown; all others time out. The packet sniffer, running at the same time, shows that every request received an answer, but the router did not process those answers. Only the answer from the ISP2 GW is processed.
Two ISPs are connected over the same optical fiber using an OPTOKON S125-W31-LP-LX-D module.
ISP1 is already working on VLAN 100 with its own public IPv4 address.
We are now trying to configure a second VLAN, 200, for ISP2.
Two VLAN interfaces are attached to SFP port as follows:
/interface ethernet
set [ find default-name=sfp1 ] name=AbsolutOK rx-flow-control=auto tx-flow-control=auto
/interface vlan
add interface=AbsolutOK name=Amres vlan-id=100
add interface=AbsolutOK name=Gama vlan-id=200
Public IP space is configured for each VLAN:
/ip address
add address=126.96.36.199/26 interface=Gama network=188.8.131.52
add address=184.108.40.206/27 interface=vlan-DMZ network=220.127.116.11
add address=18.104.22.168/27 interface=vlan-DMZ network=22.214.171.124
add address=126.96.36.199/30 interface=br-Amres network=188.8.131.52
br-Amres is a bridge that joins the Amres VLAN interface and physical port 6 (P6), used for debugging.
/interface bridge port
add bridge=br-Amres interface=Amres
add bridge=br-Amres interface=P6
Routes are set up so that ISP1 is the primary default route and ISP2 uses a secondary routing table for policy routing:
/ip route
add distance=1 gateway=184.108.40.206 pref-src=220.127.116.11 routing-mark=route_amres
add distance=1 dst-address=18.104.22.168/27 gateway=bridge-DMZ pref-src=22.214.171.124 routing-mark=route_amres scope=10
add check-gateway=arp distance=1 gateway=126.96.36.199
add distance=2 gateway=188.8.131.52 pref-src=184.108.40.206
add distance=1 dst-address=192.168.0.0/16 gateway=192.168.1.10
/ip route rule
add dst-address=192.168.0.0/16 table=main
add dst-address=220.127.116.11/27 table=main
add routing-mark=route_amres table=route_amres
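For anyone comparing against their own setup, the resulting policy table and rules can be listed as follows. This is only a sketch and assumes RouterOS v6 syntax, which the routing-mark parameters above suggest:
# show only the routes that belong to the route_amres table
/ip route print detail where routing-mark=route_amres
# show the lookup rules in the order they are evaluated
/ip route rule print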
Connection marking and routing marking are done in mangle with:
/ip firewall mangle
add action=mark-connection chain=prerouting in-interface=br-Amres new-connection-mark=amres_con passthrough=no
add action=mark-connection chain=input in-interface=br-Amres new-connection-mark=amres_con passthrough=yes
add action=mark-connection chain=prerouting new-connection-mark=amres_con passthrough=yes src-address=18.104.22.168/27
add action=mark-connection chain=prerouting new-connection-mark=amres_con passthrough=yes src-address-list=route_amres
add action=mark-routing chain=prerouting connection-mark=amres_con new-routing-mark=route_amres passthrough=no
add action=mark-routing chain=output connection-mark=amres_con new-routing-mark=route_amres passthrough=no
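The src-address-list=route_amres rule above relies on an address list whose entries are not shown in this post. A minimal sketch of how such an entry is added, where 192.168.1.50 is a purely hypothetical host and not one of our real entries:
/ip firewall address-list
# hypothetical LAN host whose traffic should use the route_amres table
add address=192.168.1.50 list=route_amres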
While everything works just fine with ISP1, there is a really strange issue with ISP2.
We can ping the ISP2 GW 22.214.171.124 with and without the route_amres mark, and ARP resolution works without any problem. But a ping or trace to any other address (with a public Internet address as the destination) fails. Traceroute shows the response from the ISP2 GW and then timeouts on all further hops. The strangest thing is that the packet sniffer shows the returning packets arriving on the Amres interface, yet they are not processed at all. Sniffing that interface, we can also see some requests from the Internet to our public IP, but no response from the router.
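The tests described above can be reproduced roughly as follows. This is a sketch assuming RouterOS v6 syntax; 8.8.8.8 is an assumption standing in for any public destination and is not taken from our configuration:
# ping and trace through the policy table instead of the main table
/ping 8.8.8.8 routing-table=route_amres count=4
/tool traceroute 8.8.8.8 routing-table=route_amres
# watch ICMP on the VLAN interface while the test is running
/tool sniffer quick interface=Amres ip-protocol=icmp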
It is important to say that at the time of testing all IP firewall filters were disabled, and one explicit Accept rule for incoming packets was added at the top. That explicit Accept rule did not count the packets seen with the packet sniffer, as if they had never been sent to this router. Using the packet sniffer, we confirmed that packets originating from the Internet are sent to the router's MAC address (the one advertised via ARP). The incoming mangle rules do count packets coming from the Internet.
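A debugging rule of that kind can be sketched as follows. The in-interface value (taken from the mangle rules above) and the comment are assumptions, not copied from our export, and the rule is assumed to sit first in the list:
/ip firewall filter
# assumption: accept and count everything arriving on the bridge used in the mangle rules
add action=accept chain=input in-interface=br-Amres comment="debug: count input packets"
# compare the rule's packet counter with what the sniffer shows
/ip firewall filter print stats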
To work around this problem, we created a bridge (br-Amres) and bridged the Amres VLAN with a physical port. Then we attached another router to that physical port and tested the ISP2 connection. It worked perfectly. After that, since we had two connections from that router to the main router (the one with the SFP port), we configured policy routing so that only specific traffic, selected by source, was passed over ISP2. That test passed without a glitch, but the second router has poor CPU performance, so it is not a solution.