I know the config is a bit messy, so I'll post only the snippets relevant to this topic.
First things first: I took your advice and changed the distance of BSNL from 2 to 10, and of the LTE connection to 20, to give a clearer idea of the preference of each route.
This is the current config for routes:
/ip route
add check-gateway=ping disabled=no distance=1 dst-address=192.168.10.0/24 gateway=192.168.6.2 pref-src="" routing-table=main scope=30 suppress-hw-offload=no target-scope=10
add check-gateway=ping disabled=no distance=1 dst-address=0.0.0.0/0 gateway=192.168.9.1 pref-src="" routing-table=main scope=30 suppress-hw-offload=no target-scope=10
add comment="For Cloud Update" disabled=no distance=1 dst-address=0.0.0.0/0 gateway=BSNL pref-src="" routing-table=only_via_BSNL scope=30 suppress-hw-offload=no target-scope=10
You were interested in the extra route I have marked "For Cloud Update". It relates to this topic:
viewtopic.php?p=973860#p973860
Instead of adding a new routing table for the mangle rule this time, I reused the routing table I already have for the DDNS update.
Currently the routing looks like this:
Flags: D - DYNAMIC; A - ACTIVE; c, s, d, v, y - COPY
Columns: DST-ADDRESS, GATEWAY, DISTANCE
# DST-ADDRESS GATEWAY DISTANCE
D v 0.0.0.0/0 BSNL 10
D d 0.0.0.0/0 192.168.2.1 20
0 As 0.0.0.0/0 192.168.9.1 1
DAc 117.210.144.1/32 BSNL 0
DAc 192.168.1.0/24 LANbridgeLocal 0
DAc 192.168.2.0/24 Airtel 0
DAc 192.168.6.0/24 mgmtBridge 0
DAc 192.168.9.0/24 ExcitelFTTH 0
1 As 192.168.10.0/24 192.168.6.2 1
;;; For Cloud Update
2 As 0.0.0.0/0 BSNL 1   <-- routing table: only_via_BSNL (I'm using this same routing table for the newly created mangle rule)
All the routes above are in the 'main' routing table except the last one.
Coming to the mangle rules for the inbound traffic from BSNL:
/ip firewall mangle
add action=mark-routing chain=output comment="Routing for Mikrotik Cloud (DDNS Update)" dst-address-list=mikrotik-cloud log=yes new-routing-mark=only_via_BSNL passthrough=no
add action=mark-routing chain=prerouting in-interface=BSNL log=yes log-prefix=RouteOutBSNL new-routing-mark=main passthrough=yes
I did try the exact command you mentioned in your last post, but it gave a syntax error:
add action=mark-connection chain=prerouting comment=\
"Connection from WAN (BSNL)" in-interface=BSNL mark=no=mark new-connection-mark=\
conn_from_outside_BSNL
The point where the CLI put the syntax error marker is mark=no=mark. I tried changing it to mark=no-mark, but that didn't help either. It looks like 'mark' is not a valid matcher in mangle.
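For reference, as far as I can tell the matcher RouterOS offers for this check is connection-mark rather than mark. A minimal sketch of the same rule written that way, assuming the intent was to mark only connections that do not carry a mark yet (passthrough=yes is my own assumption, so that later mangle rules still see the packet):
/ip firewall mangle
# Sketch only: connection-mark=no-mark replaces the rejected mark=no=mark matcher.
# passthrough=yes is an assumption, not part of the original command.
add action=mark-connection chain=prerouting comment="Connection from WAN (BSNL)" connection-mark=no-mark in-interface=BSNL new-connection-mark=conn_from_outside_BSNL passthrough=yes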
Anyway, with all of this in place (i.e. the mangle rule configured to mark everything that comes in via BSNL with action mark-routing, and the routing mark pointing to the only_via_BSNL routing table), I tried reaching an inside host from an outside network. I also had a PCAP running to get a clear picture of the traffic flow.
It looks like the exact same thing is happening again. I'll try to explain how the TCP handshake went.
----Excitel is WAN1 and BSNL is WAN2----
-SYN is received at the BSNL interface from the outside host on the Internet.
-RouterOS runs NAT and forwards it to the internal host (192.168.1.2).
-SYN-ACK is received from 192.168.1.2 on the internal bridge interface.
-NAT changes the source IP to the BSNL public IP, with the destination being the remote (outside host) IP.
-Routing forwards the NATed traffic to the MAC address of the Excitel ONU via the Ethernet interface that connects to it. ------------This is the step where things go wrong.
-The remote host would eventually receive the SYN-ACK, but from an IP (the public IP of the Excitel network) which it never sent a SYN to.
-Hence an ACK never comes back and the TCP handshake fails.
I still think the reason behind this is the direction of traffic. The traffic we mark for routing is the traffic coming in from BSNL (direction: BSNL > internal host), but the traffic we really want the policy to apply to is the reply traffic (outbound from the internal host back to the router).
We need to somehow identify/mark the return traffic (maybe using the NAT table, if that's possible, because everything works as expected up to the NAT step). The layer 3 forwarding is working as expected; it's the actual frame being forwarded at layer 2 where the problem really is.
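One way this might be done is to route the replies by connection mark as they arrive from the LAN, rather than by their inbound interface. A minimal sketch, untested on this setup, assuming connections arriving via BSNL already carry the connection mark conn_from_outside_BSNL (the name from the command quoted earlier), that the internal host sits behind LANbridgeLocal, and that only_via_BSNL is the existing table holding the BSNL default route:
/ip firewall mangle
# Sketch only: replies from the LAN that belong to a connection opened via BSNL
# get the only_via_BSNL routing mark, so the route lookup uses that table.
# Assumes the connection mark conn_from_outside_BSNL is set on the inbound SYN.
add action=mark-routing chain=prerouting comment="Replies to BSNL-originated connections" connection-mark=conn_from_outside_BSNL in-interface=LANbridgeLocal new-routing-mark=only_via_BSNL passthrough=no
The idea is that connection tracking remembers which WAN the connection came in on, so the SYN-ACK from 192.168.1.2 can be matched in prerouting, before the routing decision, and sent out via BSNL instead of the distance=1 default route in main.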
Hoping that my ideas here help with figuring out something.