After switching to bonded DSL lines, I have a problem with TLS handshakes to some sites. Some work (e.g. google), some do not.
Setup: 4 DSL modems + LTE modem (backup) → ISP’s RB3011 doing bonding → our RB3011 → NATed LAN
When we used a single DSL line from another ISP, it worked with the same configuration.
Now it’s not possible to complete a TLS handshake with some sites:
curl -ILv https://p3.zdusercontent.com
* Rebuilt URL to: https://p3.zdusercontent.com/
* Trying 185.12.82.12...
* TCP_NODELAY set
* Connected to p3.zdusercontent.com (185.12.82.12) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/certs/ca-certificates.crt
CApath: /etc/ssl/certs
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to p3.zdusercontent.com:443
* stopped the pause stream!
* Closing connection 0
curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to p3.zdusercontent.com:443
With the client connected directly to the ISP’s RB3011, the handshake completes:
...
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES256-GCM-SHA384
* ALPN, server accepted to use http/1.1
...
I tried adding rules on our RB3011 to accept and log all traffic to and from that IP:
chain=forward action=accept dst-address=185.12.82.12 log=yes log-prefix="DEBUG <"
chain=forward action=accept src-address=185.12.82.12 log=yes log-prefix="DEBUG >"
The log for one failed attempt is:
jul/22 01:01:58 firewall,info DEBUG < forward: in:bridge1 out:wan1, src-mac xxx, proto TCP (SYN), lan_client_ip:42540->185.12.82.12:443, len 60
jul/22 01:01:58 firewall,info DEBUG > forward: in:wan1 out:bridge1, src-mac yyy, proto TCP (SYN,ACK), 185.12.82.12:443->lan_client_ip:42540, NAT 185.12.82.12:443->(our_rb_ext_ip:42540->lan_client_ip:42540), len 60
jul/22 01:01:58 firewall,info DEBUG < forward: in:bridge1 out:wan1, src-mac xxx, proto TCP (ACK), lan_client_ip:42540->185.12.82.12:443, NAT (lan_client_ip:42540->our_rb_ext_ip:42540)->185.12.82.12:443, len 52
jul/22 01:01:58 firewall,info DEBUG < forward: in:bridge1 out:wan1, src-mac xxx, proto TCP (ACK,PSH), lan_client_ip:42540->185.12.82.12:443, NAT (lan_client_ip:42540->our_rb_ext_ip:42540)->185.12.82.12:443, len 275
jul/22 01:01:58 firewall,info DEBUG > forward: in:wan1 out:bridge1, src-mac yyy, proto TCP (ACK), 185.12.82.12:443->lan_client_ip:42540, NAT 185.12.82.12:443->(our_rb_ext_ip:42540->lan_client_ip:42540), len 52
jul/22 01:01:58 firewall,info DEBUG > forward: in:wan1 out:bridge1, src-mac yyy, proto TCP (ACK,PSH), 185.12.82.12:443->lan_client_ip:42540, NAT 185.12.82.12:443->(our_rb_ext_ip:42540->lan_client_ip:42540), len 632
jul/22 01:01:58 firewall,info DEBUG < forward: in:bridge1 out:wan1, src-mac xxx, proto TCP (ACK), lan_client_ip:42540->185.12.82.12:443, NAT (lan_client_ip:42540->our_rb_ext_ip:42540)->185.12.82.12:443, len 64
jul/22 01:02:59 firewall,info DEBUG < forward: in:bridge1 out:wan1, src-mac xxx, proto TCP (ACK), lan_client_ip:42540->185.12.82.12:443, NAT (lan_client_ip:42540->our_rb_ext_ip:42540)->185.12.82.12:443, len 64
jul/22 01:03:10 firewall,info DEBUG > forward: in:wan1 out:bridge1, src-mac yyy, proto TCP (ACK,RST), 185.12.82.12:443->lan_client_ip:42540, NAT 185.12.82.12:443->(our_rb_ext_ip:42540->lan_client_ip:42540), len 40
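To make a trace like this easier to read, here is a rough Python sketch that pulls the time, direction, TCP flags and packet length out of each line and reports the longest pause between consecutive packets. The regex is only my guess at the format, based on the lines above with my "DEBUG <" / "DEBUG >" prefixes; the function names (`parse_line`, `summarise`, `largest_gap`) are made up for this example and are not part of any RouterOS tooling.

```python
import re
from datetime import datetime

# Assumed line shape (taken from the log above, not from a format spec):
#   <date> <HH:MM:SS> firewall,info DEBUG <|> forward: ... proto TCP (FLAGS), ... len N
LINE_RE = re.compile(
    r"^\S+ (?P<time>\d{2}:\d{2}:\d{2}) \S+ DEBUG (?P<dir>[<>]) forward: "
    r".*?proto TCP \((?P<flags>[A-Z,]+)\), .*len (?P<length>\d+)$"
)


def parse_line(line):
    """Return a dict for one log line, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        # Only the time of day is used; a trace crossing midnight is not handled.
        "time": datetime.strptime(m["time"], "%H:%M:%S"),
        # "<" is the prefix of my LAN -> WAN rule, ">" of the WAN -> LAN rule.
        "direction": "out" if m["dir"] == "<" else "in",
        "flags": m["flags"].split(","),
        "length": int(m["length"]),
    }


def summarise(lines):
    """Parse all lines, silently skipping any that do not match."""
    parsed = (parse_line(line) for line in lines)
    return [entry for entry in parsed if entry is not None]


def largest_gap(entries):
    """Return (seconds, index) for the longest pause before entries[index].

    Returns (0.0, None) if there is no pause longer than zero seconds.
    """
    best = (0.0, None)
    for i in range(1, len(entries)):
        delta = (entries[i]["time"] - entries[i - 1]["time"]).total_seconds()
        if delta > best[0]:
            best = (delta, i)
    return best
```

Fed the nine lines above, `largest_gap(summarise(lines))` should point at the pause between 01:01:58 and 01:02:59, i.e. after the client's last ACK and before it repeats that ACK.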
The problem seems to be caused by the combination of bonding on the ISP’s router and NAT on ours: a single line with our router works, and the bonded lines with a directly connected client work as well.
NAT is done by:
chain=srcnat action=masquerade out-interface-list=WAN_all
WAN_all contains wan1 and wan2, although wan2 is no longer connected.
Any ideas what exactly could be causing the problem and how to fix it?