I'm going to give the extra polish you requested - but first, let me take a moment to just say this one thing:
I HATE - I mean, really HATE pinging things as a routing decision point.
It is necessary in situations (like yours) where dynamic routing is not an option, but it's chatty, it wastes tons of bandwidth in the big picture, and it places all kinds of burden on "well-known" addresses. You could be nice and ping something a little less obvious, such as k.root-servers.net or some other IP in their anycasted /24, but I guarantee that hundreds of thousands of other administrators are doing exactly the same thing....
If I had to ping something, it would be things which I know to be anycasted (for reliability), and I would only ping one target per link because hey, if it goes down, you're just going to fail over to another interface, right? The only well-known anycast address that I've ever seen "go dark" was 4.2.2.2, and that has happened on a few occasions. I've never seen Google's addresses go dark.
Anyway - time for me to put my helpful, un-grouchy face back on.
The general approach to load balancing on a dual-homed router is this:
1. Set up WAN1, WAN2 and LAN, both interfaces and IPs. In my case the WANs are PPPoE links (which means using interface names instead of the IPs of the local and remote link ends).
2. Set up NAT masquerading so traffic from the LAN can go outside, but without setting an output interface in the NAT rule.
3. Set up mangle rules like we discussed above to connection-mark traffic matching the LAN-WAN1 and LAN-WAN2 ACLs, then route-mark those packets (we do route-mark packets, right?) based on the connection marks (say, with route_WAN1 and route_WAN2 marks).
4. Set up routing tables so packets marked route_WAN1 route to WAN1 at distance=1 and to WAN2 at distance=2, plus a vice versa table for route_WAN2 packets.
5. Set up another ('default' or 'no mark') routing table, arranged the way we prefer un-route-marked traffic to go outside.
6. Set up uplink-alive checks so we can be sure that WAN1 and/or WAN2 can actually reach 0.0.0.0/0. Without BGP (and we don't have it), I'd prefer to use netwatch with scripts on check success and failure.
I agree with this list 99%. #3 especially shows you've got the idea right.
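For reference, steps 3 through 5 could be sketched roughly like this. Treat it as an illustration, not a drop-in config: it assumes RouterOS v6-style syntax (v7 handles routing tables differently), the interface names lan, wan1 and wan2 are placeholders for your real (PPPoE) interface names, and the address lists via_WAN1 / via_WAN2 and the connection marks conn_WAN1 / conn_WAN2 are names I made up to stand in for your ACLs.

```
# Sketch only: RouterOS v6-style syntax, all names below are placeholders.
/ip firewall mangle
# Step 3a: connection-mark new LAN connections according to policy.
# via_WAN1 / via_WAN2 are hypothetical address lists standing in for the ACLs.
add chain=prerouting in-interface=lan connection-mark=no-mark \
    src-address-list=via_WAN1 action=mark-connection \
    new-connection-mark=conn_WAN1 passthrough=yes
add chain=prerouting in-interface=lan connection-mark=no-mark \
    src-address-list=via_WAN2 action=mark-connection \
    new-connection-mark=conn_WAN2 passthrough=yes
# Step 3b: route-mark packets from the LAN based on the connection mark.
# Traffic addressed to the router itself is excluded from policy routing.
add chain=prerouting in-interface=lan connection-mark=conn_WAN1 \
    dst-address-type=!local action=mark-routing \
    new-routing-mark=route_WAN1 passthrough=no
add chain=prerouting in-interface=lan connection-mark=conn_WAN2 \
    dst-address-type=!local action=mark-routing \
    new-routing-mark=route_WAN2 passthrough=no

/ip route
# Step 4: per-mark tables, preferred uplink at distance 1, the other at 2.
add dst-address=0.0.0.0/0 gateway=wan1 routing-mark=route_WAN1 distance=1
add dst-address=0.0.0.0/0 gateway=wan2 routing-mark=route_WAN1 distance=2
add dst-address=0.0.0.0/0 gateway=wan2 routing-mark=route_WAN2 distance=1
add dst-address=0.0.0.0/0 gateway=wan1 routing-mark=route_WAN2 distance=2
# Step 5: main table for traffic that carries no routing mark.
add dst-address=0.0.0.0/0 gateway=wan1 distance=1
add dst-address=0.0.0.0/0 gateway=wan2 distance=2
```

The connection-mark=no-mark match is what makes the path sticky: a connection gets marked once, and every later packet just inherits the routing mark from it.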
Regarding #2, Pukkita pointed out that you should specify the interface in order for masquerade to work. That's not entirely true either. I definitely prefer to use interfaces as the criteria for the masquerade rules, but I would use whatever scheme requires the fewest rules....
If I had only one LAN interface, I would just say out-interface=!lan action=masquerade.
This one rule would work for any arbitrary number of WAN interfaces without ever having to be touched again, even if I were adding and removing WAN interfaces constantly in real time. (the MASQUERADE table would be an entirely different story though - heh)
If I had multiple WAN and LAN interfaces, I would just put out-interface=wan1 action=masquerade / out-interface=wan2 action=masquerade.
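Written out in full, the two variants might look like the following. This is a sketch in v6-style syntax, and lan, wan1 and wan2 are placeholder interface names; you would use one variant or the other, not both.

```
/ip firewall nat
# Variant A (single LAN interface): one rule covers every WAN,
# present or future, because it matches anything that is not the LAN.
add chain=srcnat out-interface=!lan action=masquerade

# Variant B (several LAN and WAN interfaces): one rule per WAN.
add chain=srcnat out-interface=wan1 action=masquerade
add chain=srcnat out-interface=wan2 action=masquerade
```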
Now we need to make sure replies can get back out for traffic that came from outside to our internal resources (an SMTP server, for example) which were made available via NAT. So in the input (or forward?) chain I mangle the connections that arrive with in-interface=WAN1 (or WAN2) and finally route-mark them as route_WAN1 (route_WAN2). This part is a bit mysterious to me, as I'd expect it to be done 'automatically' thanks to the NAT connection tracking table.
If you're connection-marking in prerouting, then these cases are already taken care of. Forget forward and input chains here - prerouting happens before both of these, no matter which one is coming next. It took me a while to break the habit too.
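To make that concrete, a minimal sketch of the prerouting rules for inbound connections might look like this. Same caveats as always: v6-style syntax, and wan1, wan2, lan, conn_WAN1 and conn_WAN2 are placeholder names, not anything from your config.

```
/ip firewall mangle
# A new connection arriving on a WAN gets that WAN's connection mark.
add chain=prerouting in-interface=wan1 connection-mark=no-mark \
    action=mark-connection new-connection-mark=conn_WAN1 passthrough=yes
add chain=prerouting in-interface=wan2 connection-mark=no-mark \
    action=mark-connection new-connection-mark=conn_WAN2 passthrough=yes
# Replies from the internal server come back in through the LAN interface,
# already carry the connection mark, and are route-marked to match.
add chain=prerouting in-interface=lan connection-mark=conn_WAN1 \
    action=mark-routing new-routing-mark=route_WAN1 passthrough=no
add chain=prerouting in-interface=lan connection-mark=conn_WAN2 \
    action=mark-routing new-routing-mark=route_WAN2 passthrough=no
```

The last two rules are the same kind of route-marking you would already have for LAN-originated traffic, which is why nothing is needed in forward or input: the server's reply hits prerouting like any other packet from the LAN.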
About the only thing you lack for completeness in the sticky-path solution is connections which the Mikrotik originates itself. This is done with the output chain. Of course packet marking in output is the same as packet marking in prerouting (mark based on connection mark) but there are a few philosophies you could use for your connection marks. The simplest and cleanest (in my way of thinking, anyway) is to use the source IP address as your switch, because the router has actually already chosen the out-interface and thus the appropriate source-IP. The connection mark just makes it set in stone, so if wan1 dies, the connection riding it should die too - not get switched to wan2 and the srcnat screwing things up there.
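A rough sketch of the source-address approach in the output chain, under some loud assumptions: v6-style syntax, made-up mark names, and 198.51.100.10 / 203.0.113.10 as documentation-range stand-ins for the router's own addresses on wan1 and wan2. With PPPoE those addresses may be assigned dynamically, in which case fixed src-address matches like these would need to be kept up to date some other way.

```
/ip firewall mangle
# Router-originated connections: the router has already picked a source
# address, so use it to decide which uplink the connection belongs to.
# 198.51.100.10 and 203.0.113.10 are placeholder WAN addresses.
add chain=output connection-mark=no-mark src-address=198.51.100.10 \
    action=mark-connection new-connection-mark=conn_WAN1 passthrough=yes
add chain=output connection-mark=no-mark src-address=203.0.113.10 \
    action=mark-connection new-connection-mark=conn_WAN2 passthrough=yes
# Route-mark the router's own packets from the connection mark,
# exactly as prerouting does for forwarded traffic.
add chain=output connection-mark=conn_WAN1 action=mark-routing \
    new-routing-mark=route_WAN1 passthrough=no
add chain=output connection-mark=conn_WAN2 action=mark-routing \
    new-routing-mark=route_WAN2 passthrough=no
```

Whether a marked connection actually dies with its uplink or gets carried over depends on the routing table behind the mark: if the route_WAN1 table has a backup route via wan2, the packets will follow it.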
Here's a link to a complete solution with load sharing, redundancy, etc - the whole works. I was actually in attendance at this MUM presentation and enjoyed it quite a bit. (It's a shame the talk itself wasn't preserved; there were great explanations and discussions about the various moving parts, what they did, and why they are there.)
http://mum.mikrotik.com/presentations/US12/tomas.pdf
This wiki article covers the "ping past default GW" case you (correctly) mentioned in the discussion:
http://wiki.mikrotik.com/wiki/Advanced_ ... _Scripting
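If you do go the netwatch route from your step 6, one probe per link could be sketched like this. It is an illustration under assumptions, not a tested config: v6-style syntax, wan1 and wan2 as placeholder interface names, Google's two public resolver addresses as the anycast targets, and route comments ("default via wan1", "default via wan2") that I invented and that your real default routes would need to carry for the scripts to find them.

```
/ip route
# Pin each probe target to exactly one uplink, so each ping tests one link.
add dst-address=8.8.8.8/32 gateway=wan1 comment="probe via wan1"
add dst-address=8.8.4.4/32 gateway=wan2 comment="probe via wan2"
# Higher-distance blackholes: if an uplink is down, its probe must fail
# instead of quietly leaving through the other uplink's default route.
add dst-address=8.8.8.8/32 type=blackhole distance=20
add dst-address=8.8.4.4/32 type=blackhole distance=20

/tool netwatch
# One target per link; the scripts toggle the default routes whose
# (hypothetical) comments name that link.
add host=8.8.8.8 interval=10s timeout=2s \
    down-script="/ip route disable [find comment=\"default via wan1\"]" \
    up-script="/ip route enable [find comment=\"default via wan1\"]"
add host=8.8.4.4 interval=10s timeout=2s \
    down-script="/ip route disable [find comment=\"default via wan2\"]" \
    up-script="/ip route enable [find comment=\"default via wan2\"]"
```

The probe routes themselves are never disabled, so netwatch can notice when a link comes back and re-enable the matching default routes.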