I’m working on a config-as-code setup for my MikroTik devices, and I’ve hit a recurring problem: when a new config breaks things (e.g. wrong firewall rule, bad IP, or a syntax error in the .rsc), I often lose access to the router entirely. Recovery means physically going there and factory resetting. Not fun.
I’ve been using an SSH-based provisioning script that:
- Exports the current ROS config and saves it locally
- Picks up my “committed” `.rsc` file (templates it with Jinja2 + YAML secrets, so I don’t keep secrets in it)
- Imports it to the device
- Runs `/system reset-configuration run-after-reset=...`
- Waits for the router to reboot and checks it’s alive
- Exports the complete current config again
- Prints a `diff` to see what has changed.
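The export, wait, and diff steps above can be sketched roughly as follows. This is a minimal sketch, not the actual script: the helper names (`snapshot_config`, `wait_for_router`, `diff_configs`) are hypothetical, and it assumes key-based SSH access and that `/export` output starts with `#` comment lines (timestamp, device info) that are not worth diffing.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical, not part of any tool.

# Save a plain-text export of the router's configuration to a local file.
# $1 = SSH target, $2 = local output file
snapshot_config() {
    ssh "$1" "/export" > "$2"
}

# Poll until the router answers over SSH again, or give up.
# $1 = SSH target, $2 = max attempts (default 30), $3 = seconds between attempts (default 10)
wait_for_router() {
    attempts=${2:-30}
    delay=${3:-10}
    i=1
    while [ "$i" -le "$attempts" ]; do
        if ssh -o ConnectTimeout=5 -o BatchMode=yes "$1" ':put "alive"' >/dev/null 2>&1; then
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}

# Print a unified diff of two exports, skipping "#" comment lines
# (assumed here to hold the export timestamp and device info).
# Returns 0 if the configs match, 1 if they differ.
diff_configs() {
    a=$(mktemp) || return 2
    b=$(mktemp) || { rm -f "$a"; return 2; }
    grep -v '^#' "$1" > "$a"
    grep -v '^#' "$2" > "$b"
    diff -u "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return "$status"
}
```

Typical use would be `snapshot_config "$ROUTER" before.rsc`, then the reset, then `wait_for_router "$ROUTER" && snapshot_config "$ROUTER" after.rsc && diff_configs before.rsc after.rsc`. Note that `wait_for_router` only detects that the router never came back; it cannot bring it back.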
This mostly works… until it doesn’t. If there’s a problem in the run-after-reset script, the router may never come back up.
So I’m thinking — can I make this safer?
Instead of wiping the device and hoping for the best, maybe I can do something like a “big-ass try-catch” around the whole config. The idea is:
- Before applying anything, I save a binary backup (`latest-working`).
- I upload the new config script (`new-config.rsc`).
- Inside that script:
  - Set a global variable like `PROVISIONED=true` at the end if everything works.
  - Wrap the whole config in a `:do { ... } on-error={ ... }` block.
  - The `on-error` would run `/system/backup/load name=latest-working` & reboot.
  - And finally, as a safeguard - if `PROVISIONED` isn’t set, I restore the backup and reboot as well - just in case some errors are not properly thrown.
Rough outline of the new-config.rsc:
:global PROVISIONED false
:do {
  # all my config here (IPs, firewall, bridge, DHCP, etc.)
  :global PROVISIONED true
} on-error={
  /log error "Provisioning failed, rolling back"
  /system backup load name=latest-working
  :delay 10
  /system reboot
}
:if ($PROVISIONED = false) do={
  /log error "Provisioning incomplete, rolling back"
  /system backup load name=latest-working
  :delay 10
  /system reboot
}
–
The deployment script (simplified):
ssh "$ROUTER" "/system backup save name=latest-working"
scp new-config.rsc "$ROUTER:new-config.rsc"
ssh "$ROUTER" "/system reset-configuration no-defaults=yes keep-users=yes run-after-reset=new-config.rsc"
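One cheap guard that could sit in front of the reset is a pre-flight check that both files actually exist on the router before anything is wiped. The following is only a sketch under assumptions: `router_has_file` and `safe_reset` are hypothetical helper names, the backup is assumed to be stored as `latest-working.backup` in the root of the router's file list (on devices with a `flash/` directory the path may differ), and the check relies on `:put [:len [/file find name=...]]` printing a file count over SSH.

```shell
#!/bin/sh
# Sketch only: router_has_file and safe_reset are hypothetical helper names.

# Succeeds if the router lists at least one file with the given name.
# $1 = SSH target, $2 = file name as shown by /file print
router_has_file() {
    # RouterOS terminates lines with CR LF over SSH, so strip all whitespace.
    count=$(ssh "$1" ":put [:len [/file find name=\"$2\"]]" | tr -d '[:space:]')
    # Non-numeric output (e.g. an error message) makes the test fail as well.
    [ "${count:-0}" -ge 1 ] 2>/dev/null
}

# Run the reset only if the backup and the new config are both on the router.
# $1 = SSH target
safe_reset() {
    router=$1
    if ! router_has_file "$router" "latest-working.backup"; then
        echo "backup not found on router, aborting" >&2
        return 1
    fi
    if ! router_has_file "$router" "new-config.rsc"; then
        echo "new-config.rsc not found on router, aborting" >&2
        return 1
    fi
    ssh "$router" "/system reset-configuration no-defaults=yes keep-users=yes run-after-reset=new-config.rsc"
}
```

With this in place the last line of the deployment script would become `safe_reset "$ROUTER"`. It only catches a failed backup or upload; it does nothing about errors inside `new-config.rsc` itself.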
–
What I’m Trying to Figure Out
- Does this approach make sense at all?
- Are there edge cases where this might fail silently or leave the router in a half-configured state?
- Any gotchas around using `/system backup load` in a script? Does it really revert everything cleanly?
- Is it dangerous to reboot right after a restore like this?
- Is there a better way to mark success than a global variable?
- Any risk of import continuing past an error in the `:do` block?
- Finally: what if some error in configuration halts the importing process, e.g. because it’s waiting for user input?
Appreciate any feedback - I’m trying to avoid ever having to crawl into a basement with a paperclip again.