Just a terminological remark: the abbreviation NIC is typically used for a single Ethernet interface, even if it is physically located on a multi-port plug-in card.
Sorry, pardon the language 
What I am saying is what I am saying. I can’t agree or disagree with your comparison to Hyper-V because I have never run Hyper-V on a server, so I have never dived into its multi-link capabilities.
I’ve just read more of the docs on VMware networking and yes, there is NIC teaming with a switch-independent configuration, just like SET (Switch Embedded Teaming) on Windows Server. So what you said is perfectly correct.
Yes in your case, but not necessarily a direct one in other cases. If the servers do not need to talk to each other on L2, an L2 path between the server NIC and the router NIC may be sufficient. So if both routers have links to both switches, and the servers do not need to talk to each other, there is no need for a direct link between the switches.
The second diagram shows only one patch cable from the router to one switch, not links to both. The servers may talk to each other, but since all of them will have connections to all the switches, that traffic would stay on the switches, so we are good even if they do want to talk to each other.
Are ISP1 and ISP2 just physical interfaces accessing the same L2 segment (i.e. switch redundancy on the data center side), or do ISP1 and ISP2 represent different interconnection subnets that are both capable of accepting route advertisements from you, telling which address from your /29 range is reachable through which of these two L3 subnets?
They are just different paths to whatever “switch(es)” they have behind the scenes. All they tell me is that I have a /29 range to assign to my router(s) and a gateway IP.
OK, so for redundancy and maybe load distribution to work (I don’t know how Cloudflare handles the traffic), the cloudflared on each server has to establish VPN tunnels to multiple Cloudflare PoPs. Responding via the correct tunnel is the job of cloudflared, but there is still the task of spreading the tunnels from a given server to the individual Cloudflare PoPs across the available WAN paths. I gather that the addresses of the VPN servers are given as FQDNs, so the IP addresses may drift over time. On Mikrotik, I would use one address list per such FQDN to keep the DNS resolution up to date and control the choice of WAN, but that implies that all the traffic would run through the same router. So for your use case, it seems much more useful to me to move this task to the server as well, and give the server two gateway IP addresses to choose from. These gateway addresses would be VRRP addresses, and while both routers are up, each gateway address would be active on a different router. If one router stopped working, the gateway address that prefers it would migrate to the remaining one.
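To make the two-gateway idea concrete, here is a minimal Python sketch of the election logic, not a RouterOS configuration. It assumes two routers (hypothetically named `r1` and `r2`), two VRRP groups (IDs 1 and 2), and mirrored priorities of 200 and 100; all of these names and values are illustrative assumptions. Real VRRP breaks priority ties by IP address, which this sketch avoids by never using equal priorities.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VrrpInstance:
    router: str       # hypothetical router name
    vrid: int         # VRRP group ID
    priority: int     # higher priority wins the election
    alive: bool = True


def elect_master(instances, vrid):
    """Return the router that holds the virtual address of `vrid`, or None."""
    candidates = [i for i in instances if i.vrid == vrid and i.alive]
    if not candidates:
        return None
    return max(candidates, key=lambda i: i.priority).router


def build_instances(down=()):
    # Mirrored priorities: r1 prefers group 1, r2 prefers group 2.
    plan = {("r1", 1): 200, ("r2", 1): 100, ("r1", 2): 100, ("r2", 2): 200}
    return [
        VrrpInstance(router, vrid, prio, alive=router not in down)
        for (router, vrid), prio in plan.items()
    ]


def gateway_owners(down=()):
    """Map each VRRP group ID to the router currently acting as master."""
    instances = build_instances(down)
    return {vrid: elect_master(instances, vrid) for vrid in (1, 2)}


if __name__ == "__main__":
    print(gateway_owners())             # both routers up
    print(gateway_owners(down={"r1"}))  # r1 failed
```

With both routers up, each gateway address sits on a different router, so a server can spread its tunnels across both. When `r1` is marked as down, both addresses end up on `r2`.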
The way Cloudflare works is that I set them as my DNS server and create a public-facing DNS entry like “app.contoso.com”. Whenever someone makes an HTTP request, Cloudflare will resolve that name to the IP address of the PoP closest to the user’s location. Once that request enters the local PoP, it is routed through Cloudflare’s global infrastructure until it reaches the PoP closest to our datacenter.
We would have multiple tunnels, probably one per physical VMware server, connected to that PoP (or any other we decide on for HA reasons). Cloudflare will then load balance the incoming requests among those tunnels, each of which is pre-configured with the service on my internal network that the request should be sent to. My service just processes the HTTP request and replies to it. So there is no need for the application to figure out which tunnel the request came from, since it is replying to the same HTTP request. If this is a persistent connection, like WebSockets for example, it will stick with that server until it is closed, and we handle the question of “which web socket should I send this to?” at the application level.
That way, we don’t need to expose our service to inbound traffic at all, as Cloudflare is dealing with WAF, DDoS, etc. for us. All we receive is clean traffic over a connection that was originated from our system to Cloudflare. So in that sense, all the router has to do (naively speaking) is allow outbound connectivity to Cloudflare.
If the network uplink gateway IP is behind VRRP, then whichever router holds that IP at the moment will be the one forwarding the packets out through the tunnel. If that router becomes unavailable, that is fine: cloudflared will reconnect and we should be good.
If these are ports connecting the routers to the very same L2 network, you can use VRRP on the WAN side as well (just make sure you don’t use the same VRRP group IDs as the data center, as that would cause fireworks). In such a case, you would again use two VRRP interfaces with different IP addresses from the /29, and while both routers are running, each of the two addresses would be active on a different one. The VRRP interfaces have on-master and on-backup scripts that are triggered when the role of the VRRP interface changes to master or backup respectively. These scripts may then adjust the priority of the VRRP interfaces attached to the LAN interfaces.
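The coupling between the WAN-side role and the LAN-side priority can be sketched as follows. This is a simplified Python model, not RouterOS script: it assumes a single WAN VRRP group and a single LAN VRRP group per router, and the priority values `HIGH` and `LOW` as well as the router names are illustrative assumptions.

```python
from dataclasses import dataclass

# Illustrative LAN priorities applied by the hooks (assumed values).
HIGH, LOW = 200, 50


@dataclass
class Router:
    name: str
    lan_priority: int = 100

    def on_master(self):
        # WAN VRRP became master here: attract the LAN gateway address too.
        self.lan_priority = HIGH

    def on_backup(self):
        # WAN VRRP became backup here: let the other router take the LAN gateway.
        self.lan_priority = LOW


def lan_master(routers):
    """The router with the highest LAN priority holds the LAN gateway address."""
    return max(routers, key=lambda r: r.lan_priority).name


def wan_role_change(new_master, new_backup):
    """Fire the hooks the way a WAN-side role change would."""
    new_master.on_master()
    new_backup.on_backup()


if __name__ == "__main__":
    r1, r2 = Router("r1"), Router("r2")
    wan_role_change(new_master=r1, new_backup=r2)
    print(lan_master([r1, r2]))  # r1
    wan_role_change(new_master=r2, new_backup=r1)
    print(lan_master([r1, r2]))  # r2
```

The point of the design is that the LAN gateway address follows whichever router currently has a working WAN role, so servers never send traffic to a router that cannot forward it upstream.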
Ahhh I see. So we set up VRRP like this (hypothetical IPs, assuming the /29):
- Gateway: x.x.x.5
- Router 1: x.x.x.1
- Router 2: x.x.x.2
- VRRP: x.x.x.3
- Default route on both would use x.x.x.3 with gateway x.x.x.5
Then set the on-master/on-backup scripts to configure the LAN VRRP, which will update the floating gateway IP for the internal gateway network. I wasn’t aware it was a good idea to do VRRP using the public IPs (as long as the group IDs don’t collide, as you said). Fantastic!
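A quick sanity check of such an address plan can be done with Python’s standard `ipaddress` module. The sketch below substitutes the documentation range 203.0.113.0/29 for the real x.x.x.0/29, which is an assumption made only for illustration; the role names in `plan` are hypothetical labels as well.

```python
import ipaddress


def check_plan(network, plan):
    """Return (roles outside the subnet, duplicates found?, unused host addresses)."""
    net = ipaddress.ip_network(network)
    usable = set(net.hosts())  # a /29 has 6 usable host addresses
    addrs = {role: ipaddress.ip_address(a) for role, a in plan.items()}
    outside = sorted(role for role, a in addrs.items() if a not in usable)
    duplicates = len(set(addrs.values())) != len(addrs)
    free = sorted(usable - set(addrs.values()))
    return outside, duplicates, [str(a) for a in free]


if __name__ == "__main__":
    # Stand-in for the plan above, with 203.0.113 replacing x.x.x (assumption).
    plan = {
        "gateway": "203.0.113.5",
        "router1": "203.0.113.1",
        "router2": "203.0.113.2",
        "vrrp": "203.0.113.3",
    }
    print(check_plan("203.0.113.0/29", plan))
```

In this stand-in plan two host addresses remain unused, which would leave room for a second WAN-side VRRP address if the two-address variant is wanted.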
Thank you for all the input! The servers are already with me, so I will run some tests and see what I can get out of them.