If you assign a routing-mark to WAN->LAN packets, they may end up being sent back to the internet if only a default route exists with that routing-mark, which is typically the case. You haven’t shown your /ip route section, so it is hard to say.
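To illustrate what I mean, here is a rough sketch of a typical /ip route section, not your actual one. It assumes RouterOS v6 syntax, and the gateway addresses are made up (taken from the documentation ranges):

```
/ip route
# hypothetical gateways, RouterOS v6 syntax
add dst-address=0.0.0.0/0 gateway=192.0.2.1 routing-mark=WAN1
add dst-address=0.0.0.0/0 gateway=198.51.100.1 routing-mark=WAN2
# the connected route to the LAN subnet exists only in the main table,
# so a packet towards a LAN host that carries routing-mark=WAN1
# matches only the default route above and leaves via WAN1
```

If your routes look like this, any WAN->LAN packet that gets one of these routing-mark values would be routed according to the marked default route rather than towards the LAN.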
PCC can normally be used to assign a routing-mark directly, rather than assigning a connection-mark and then using another rule to translate the connection-mark into a routing-mark, because the result of the per-connection-classifier is the same for all packets of the same direction of the same connection. I’ve got no idea whether PCC->routing-mark translation is faster or slower than connection-mark->routing-mark translation, but doing both for every single packet is definitely a waste of CPU.
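A minimal sketch of the direct variant, assuming four WANs, RouterOS v6 syntax, and a LAN-facing interface named LAN (that name is my assumption, adjust it to your setup):

```
/ip firewall mangle
# assumption: the LAN-facing interface is named "LAN"
# dst-address-type=!local keeps traffic addressed to the router itself out of PCC
add chain=prerouting in-interface=LAN dst-address-type=!local per-connection-classifier=both-addresses-and-ports:4/0 action=mark-routing new-routing-mark=WAN1 passthrough=no
add chain=prerouting in-interface=LAN dst-address-type=!local per-connection-classifier=both-addresses-and-ports:4/1 action=mark-routing new-routing-mark=WAN2 passthrough=no
add chain=prerouting in-interface=LAN dst-address-type=!local per-connection-classifier=both-addresses-and-ports:4/2 action=mark-routing new-routing-mark=WAN3 passthrough=no
add chain=prerouting in-interface=LAN dst-address-type=!local per-connection-classifier=both-addresses-and-ports:4/3 action=mark-routing new-routing-mark=WAN4 passthrough=no
```

Note that this only covers connections initiated from the LAN side. Connections initiated from the internet still need a connection-mark to be answered via the WAN they came in through.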
What do you suggest I do for load balancing?
Another question: should I use input and forward for incoming connections from the internet interfaces, or use prerouting?
Ah, sorry, I wasn’t careful when looking at the first four action=mark-routing rules in prerouting; they are actually used to assign the routing-mark to LAN->WAN packets, not to WAN->LAN ones. The reason for my mistake is that it is quite unusual to use three groups of connection-mark values (WAN->LAN, LAN->WAN, WAN->ROS) if their only purpose is to be translated to routing-mark values, but maybe there is some context I cannot see.
If there is none, you can save several rules per packet by using the same connection-mark for connections initiated from the LAN side (distributed using PCC) and connections initiated from the WAN side. There will be only one connection-mark → routing-mark translation rule per WAN (which saves CPU), and you can also use a single rule per WAN to assign a connection-mark depending on the WAN in-interface if you replace the separate rules in the input and forward chains by a single one in the prerouting chain. Doing so doesn’t save any CPU, it just centralizes the configuration and thus reduces the space for typos.
Another optimization I normally use is to jump to a dedicated connection marking chain as the first rule in prerouting for connection-state=new packets, so that all mid-connection packets go straight to the connection-mark->routing-mark translation rules. This might not make much sense if you only used PCC to assign routing-mark values, but since you use the connection-mark->routing-mark translation anyway (to make sure connections initiated from the internet will be responded to properly), it will save CPU here as well.
So my set of mangle rules would look as follows:
chain=prerouting connection-state=new action=jump jump-target=pr-cm
chain=prerouting in-interface-list=WAN action=accept
chain=prerouting connection-mark=WAN1 action=mark-routing new-routing-mark=WAN1 passthrough=no
…
chain=prerouting connection-mark=WAN4 action=mark-routing new-routing-mark=WAN4 passthrough=no
chain=pr-cm in-interface=WAN1 action=mark-connection new-connection-mark=WAN1 passthrough=yes
…
chain=pr-cm in-interface=WAN4 action=mark-connection new-connection-mark=WAN4 passthrough=yes
chain=pr-cm connection-mark=no-mark per-connection-classifier=both-addresses-and-ports:4/0 action=mark-connection new-connection-mark=WAN1 passthrough=yes
…
chain=pr-cm connection-mark=no-mark per-connection-classifier=both-addresses-and-ports:4/3 action=mark-connection new-connection-mark=WAN4 passthrough=yes
chain=output connection-mark=WAN1 action=mark-routing new-routing-mark=WAN1
…
chain=output connection-mark=WAN4 action=mark-routing new-routing-mark=WAN4
Don’t forget that if you assign routing-mark values in mangle chain output, you must use src-nat or masquerade rules on the WANs. The packets sent by ROS are routed, and therefore their source address is chosen, before the routing-mark is eventually assigned in chain output, and whilst the routing is repeated if a routing-mark is assigned, the source address doesn’t change automatically.
EDIT: the paragraph above is valid, but only for connections initiated by ROS itself, which is not your case, so you can ignore it.
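For completeness, in case the router does initiate connections itself at some point, the NAT part could look roughly like this. This is only a sketch, assuming RouterOS v6 syntax and that all four WAN interfaces are members of the same interface list WAN that the prerouting rule above refers to:

```
/ip firewall nat
# assumption: every WAN interface is a member of the interface list "WAN"
# masquerade rewrites the source address to the one of the actual out-interface
add chain=srcnat out-interface-list=WAN action=masquerade
```

With static public addresses, one action=src-nat rule per WAN with to-addresses set to that WAN's address would do the same job.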