I want to develop a robust backup and restore strategy for a critical MikroTik router in my network that has a relatively complex configuration. Recently, I had to replace the router (fortunately without an actual disaster) with a different unit of the same model, and I realized that my planned recovery approach was inadequate, especially if carried out under pressure or by a colleague with limited knowledge of the network’s specific details.
I would appreciate any advice from those who have faced similar challenges in the past.
The Scenario
Imagine I have a MikroTik CCR2004-16G-2S+ router that handles inter-VLAN traffic for numerous VLANs, manages queuing and load balancing across three ISPs for outbound internet traffic, runs containers on an external USB SSD, and carries relatively complex firewall rules, bonding/bridging, and additional configuration.
Each night, I create a backup using a method that I have yet to define.
The Challenge: Proper Backup and Restoration on a Different Unit
Now, a real disaster strikes: the hardware fails, and someone with little or no knowledge of the intricacies of my network needs to restore it as quickly as possible on another CCR2004-16G-2S+, which we managed to find. Since our entire network relies on this router, recovery is urgent.
Issues Encountered During Restoration
In my testing, I encountered the following issues:
- Restoring a backup on a different unit does not seem to work properly. Inter-VLAN traffic is blocked, firewall rules cause numerous dropped packets, containers are restored but do not start, and Ethernet interfaces have different names, to mention just a few of the issues I observed. Strangely, access to the router over the internet via ZeroTier still works.
- Restoring from an exported configuration also fails to fully recover the system. Users and groups are missing, containers are not restored, and inter-VLAN traffic does not work. Further issues may arise, such as problems with ZeroTier identities or container settings.
On the day of the simulation, after trying both approaches, my access to the router was mostly limited to the console port, which restricted my ability to diagnose the situation thoroughly.
Seeking a Reliable Recovery Method
While I can invest more time in debugging the restoration process now, I am aware of how much pressure a real disaster recovery would involve. I am hoping a more straightforward, click-and-restore backup method is available.
