Built a tool that generates RouterOS configs from plain language, looking for feedback

But the answer, IMHO, is not the exact one; it's the "best one" among the answers it has already given to others without getting complaints about the quality.
It used to be called the alpha-beta pruning algorithm in solving puzzles: find the best move you can make under some constraints. It does not mean it would be the right move. It would be just "the best" one at the moment. Linear programming also has its own methods to find the best match (called the solution) for a set of equations/inequalities. Not the exact one, just the best fit that obeys them.
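
As a rough illustration of "best under the constraints, not necessarily right", here is a minimal alpha-beta sketch in Python. The game tree and its leaf scores are made up for the example, and `alphabeta` and `TREE` are hypothetical names; it only shows that the search returns the best value it can guarantee and skips branches that cannot change that result.

```python
from math import inf

def alphabeta(node, maximizing, alpha=-inf, beta=inf):
    """Return the best guaranteed score for a tree of nested lists with numeric leaves."""
    if isinstance(node, (int, float)):
        return node  # leaf: a made-up evaluation score
    if maximizing:
        best = -inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # the opponent would never allow this branch; prune it
        return best
    best = inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:
            break  # the maximizer already has a better option elsewhere; prune
    return best

# Toy tree: three possible moves, each followed by two opponent replies.
TREE = [[3, 5], [6, 9], [1, 2]]
print(alphabeta(TREE, True))
```

The result is only "best" relative to the scores at the leaves; if those scores are wrong, the chosen move is wrong too, which is the same caveat as above.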

I have some questions:

  1. Can you share how the AI validates the configuration it generates, so that it is free from security issues and actually works? E.g., is the configuration tested on a CHR, using something like tikoci/quickchr?

  2. Do you publish the source code for it, or have you had any third-party audits of your code?

  3. Your privacy policy does not meet EU or California standards, which has me worried since I value my privacy. Do you plan on fixing that?

  4. How does the site make money? Is there a paid plan for bulk use, or are there limits on how many configurations you can generate?

  5. Is there an API that provides programmatic access to generate scripts for a fleet of routers?

  6. What type of validation did you do on the quality of the generated configurations?

  7. Are there version limits on the generated configuration? For example, can it generate v6 code, or code for a particular version, say 7.20.8 or 7.23, since in some cases the syntax does vary?

Good questions, honest answers:

  1. No CHR testing pipeline yet. Configs are generated but not automatically validated against a live RouterOS instance. This is a real gap; adding CHR-based validation before a generated config is returned is on my list.

  2. Not open source right now, and no third party audit has been done. Noted as a trust gap based on this thread.

  3. You are right, and I am actively rewriting the privacy policy to properly cover GDPR and CCPA requirements including explicit consent, data subject rights, and deletion requests. Will update here once live.

  4. Free right now, no paid tier. No hard generation limits currently, may add rate limiting to control cost.

  5. No API yet. Fleet scale programmatic access is a fair use case, will consider for a future version.

  6. Manual spot checking against known good configs so far, no formal test suite. This ties into point 1, CHR based automated validation is the right fix.

  7. Targets RouterOS v7 syntax generally, not pinned to a specific minor version like 7.20.8 or 7.23, and does not generate v6. Version specific syntax differences are a valid concern I have not fully accounted for yet.

Appreciate the direct list, this is useful for prioritizing what to fix next.
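
Until CHR-based validation exists, a cheap text-level pre-check could at least flag obviously v6-era syntax before a config is returned. The following is a minimal sketch in Python, not what the tool does today. `V6_PATTERNS` and `lint_for_v6_syntax` are hypothetical names, and the three rules are illustrative examples of v6-to-v7 changes that should be verified against the RouterOS documentation for the target version. It only understands one-line commands, not the `/menu` header plus `add ...` export layout.

```python
import re

# Hypothetical rule table: (pattern, hint). Illustrative only; verify each rule
# against the RouterOS docs before relying on it.
V6_PATTERNS = [
    (re.compile(r"^/routing bgp peer\b"), "v7 uses /routing bgp connection"),
    (re.compile(r"^/routing ospf network\b"), "v7 uses /routing ospf interface-template"),
    (re.compile(r"^/ip route\b.*\brouting-mark="), "v7 uses routing-table= on routes"),
]

def lint_for_v6_syntax(config: str) -> list[tuple[int, str]]:
    """Return (line number, hint) for each line that looks like v6-only syntax."""
    findings = []
    for lineno, line in enumerate(config.splitlines(), start=1):
        stripped = line.strip()
        for pattern, hint in V6_PATTERNS:
            if pattern.search(stripped):
                findings.append((lineno, hint))
    return findings

SAMPLE = """\
/ip address add address=192.168.88.1/24 interface=bridge
/routing bgp peer add name=upstream remote-address=192.0.2.1 remote-as=64500
/ip route add dst-address=0.0.0.0/0 gateway=192.0.2.1 routing-mark=via-isp2
"""

for lineno, hint in lint_for_v6_syntax(SAMPLE):
    print(f"line {lineno}: {hint}")
```

A lint like this catches only known patterns; it does not replace applying the config to a real RouterOS instance.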

  1. The issue is that there is no paid offering, and it's not open source either. Why not? That's what's creating friction in the feedback.

  2. On the website, there is hand-waving about generation possibilities, yet things like version and hardware/architecture directly affect the potential config. For example, something like L3HW is only available on some devices. If I provide the specific model and describe a setup that should be possible with L3HW, what are the chances the generated code will know how to do something like that right?

  3. Model training data is dominated by v6 configs, so what are you doing to overcome this? If you're not releasing the code, perhaps consider publishing some of the skills or instructions so those can be reviewed?

  4. If I want the generator to add /container or custom /app, does it have support for doing that correctly? Or is that going to be based on training-data alone, since containers are tricky to get right and require some knowledge of "Docker" as applied to RouterOS. Is this covered?

I waited a bit before writing.

  1. Why email, if it's "free"?

  2. Free doesn't exist; someone pays for the electricity. Reselling emails is profitable.
    Or it's an investment, letting others be beta-testers for free, then reselling the product in defiance of those who helped test it.
    Nobody does things just "For the Glory"...

  3. There's always a "3" for Artificial Deficiency.


Creating a "wizard" rather than an AI configuration is certainly more complicated,
but expecting the AI to create "Production Ready" scripts is going too far.
Too presumptuous, really.

Sure, for basic things that "just reset to default" the configuration is simple...
I had some test scripts made, and not one of them worked correctly with "copy and paste" as described...
They always need to be manually corrected here and there, but at that point you should already know where to start; otherwise, to the reader, they're just random lines of code thrown together.

Off-topic... or not? Read at your own risk.

We'll get to a point where everything will be done automatically.
Everything will be done by AI, and everyone will no longer need to think.
Humans will have exhausted their function for technological evolution.
As someone said, "Who said the next evolutionary step concerns humans?".
Humans must become like mitochondria in cells...
Just do that, the cell takes care of the rest,
which in turn simply does that because the organism takes care of it, and so on...


It's already clearly visible that the human race is becoming more and more stupid with each generation...
Herds of goats led by the infernal goat of the moment in wars, genocides,
the construction of new "temples" (that are never as big as one's ego), religious schisms, and so on...

Good points:

  1. No paid tier or open-source release yet because the tool is still in its initial stage, not because of a business decision to withhold it. Once it stabilizes, I will decide between open-sourcing the core or offering a paid tier for scale; both are on the table.

  2. Fair concern. Right now the tool does not do hardware capability checks like L3HW support per model. If you specify a model and ask for something the hardware cannot do, there is a real chance it generates the config anyway without flagging the incompatibility. This needs a hardware capability matrix behind the generation logic, not something it has today.

  3. Correct that v6 dominates training data. I am not overcoming this yet in any systematic way beyond prompt instructions targeting v7 syntax. Publishing the underlying instructions for review is a reasonable middle ground if I am not ready to open source the full app; I will consider it.

  4. Container and custom /app support is not something I have specifically built or tested for. Given how version- and platform-sensitive RouterOS containers are, I would not trust the current generator on that without dedicated testing, and right now it has not had any.

MikroTik Config Generator is in an initial stage; feedback like this is genuinely valuable for building something that actually holds up, not just something that looks finished on the surface.
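
To make point 2 concrete, here is a minimal sketch in Python of what a capability check in front of generation could look like. It is not something the tool has today. The model names and flags in `CAPABILITY_MATRIX` are placeholders, not real device data, and `Capabilities` and `check_request` are hypothetical names; a real matrix would have to be built from validated per-model information.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Capabilities:
    l3hw: bool
    containers: bool

# Placeholder data only: these model names and flags are hypothetical.
CAPABILITY_MATRIX = {
    "EXAMPLE-SWITCH-A": Capabilities(l3hw=True, containers=False),
    "EXAMPLE-ROUTER-B": Capabilities(l3hw=False, containers=True),
}

def check_request(model: str, wanted: set[str]) -> list[str]:
    """Return warnings for requested features the model is not known to support."""
    caps = CAPABILITY_MATRIX.get(model)
    if caps is None:
        return [f"unknown model {model!r}: cannot verify hardware support"]
    return [
        f"{model} is not listed as supporting {feature}"
        for feature in sorted(wanted)
        if not getattr(caps, feature, False)
    ]

print(check_request("EXAMPLE-ROUTER-B", {"l3hw"}))
```

The point of returning warnings instead of silently generating is that an unknown model or unsupported feature gets flagged to the user before any config is produced.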

  1. Email is used to send the result and for occasional updates about the tool, not for account creation or gating access to features. Reasonable to question this though, given the tool is marketed as free.
  2. Understood, and it is a fair default assumption. To be clear on incentive here, this is not backed by ad revenue or a resale plan right now, it is a portfolio project meant to build credibility and a user base for a future paid tier or open source release, not to harvest and resell emails. I know that is exactly what someone with bad intent would also say, so I don't expect that to settle it on its own, only actual behavior over time will.
  3. Agreed, and this is the most useful point in your reply. Calling it "production ready" sets the wrong expectation. It should be positioned as a starting draft that still needs manual review and correction, not a copy paste final script. Wizard style structured input with AI assisted expansion would produce more reliable output than free form AI generation alone, and is a better long term direction than what exists now.
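
A minimal sketch of the wizard idea from point 3, in Python: structured fields are validated first, and only then rendered into a command, so free-form text never reaches the output. `LanAddressInput` and `render_lan_address` are hypothetical names, and a single `/ip address add` line stands in for a full config.

```python
import ipaddress
import re
from dataclasses import dataclass

@dataclass
class LanAddressInput:
    interface: str  # e.g. "bridge"
    address: str    # IPv4 address in CIDR form, e.g. "192.168.88.1/24"

def render_lan_address(form: LanAddressInput) -> str:
    """Validate the form fields, then render one RouterOS command line."""
    if not re.fullmatch(r"[A-Za-z0-9_.-]+", form.interface):
        raise ValueError(f"unsupported interface name: {form.interface!r}")
    # IPv4Interface raises a ValueError subclass on malformed input.
    iface = ipaddress.IPv4Interface(form.address)
    return f"/ip address add address={iface.with_prefixlen} interface={form.interface}"

print(render_lan_address(LanAddressInput(interface="bridge", address="192.168.88.1/24")))
```

The AI-assisted part would then sit on top of forms like this, filling in or explaining fields, while the rendering stays deterministic.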

So what's the point of testing it for free?
People should start getting paid for the work they do,
and also for the sites they visit, especially if I see ads.
The site's creator should pay me because they, in turn, earned money from it...

@zilleali
Well,
you're good with people, or the AI you use/are is very diplomatic (see the "3"rd point trap used...).

I'm not against these tools (aside from the off-topic note above),
but I like things to be clear from the start...

To be clear, free testing now does not mean current users get charged later for what they already used for free. If a paid tier happens down the line, it would apply to new usage tiers or higher volume, not retroactively monetize the feedback and testing already given.

There are no ads on the site and no plan to add them. Point taken though on the general principle, people testing something for free are doing real work, and that should not be forgotten if the product ever generates revenue.

Appreciate that, honest disagreement is more useful to me than polite silence.

Agree on clarity from the start being the actual fix here, not diplomatic replies after the fact. That is going into the landing page rewrite: what it costs now, what happens if a paid tier launches later, what data goes where, stated plainly instead of needing a forum thread to surface it.

(obviously, also because it certainly wouldn't be possible to do...)

Fair, agreed, that part was obvious in hindsight. Point stands regardless.

Well, given the use of a temporary, fake, e-mail address, you could ask for a debit/credit card number, and I will be happy to provide a fake one, to which you can debit very large bills retroactively.

No card or payment info is collected anywhere in this tool now or planned for the free tier, so that scenario stays hypothetical either way.

I know. :slightly_smiling_face:
... but I like hypothetical scenarios ...

Oh okay

I do not want to dismiss your work. But the hard part is not a website and plumbing to an LLM; I'd imagine Claude Fable could maybe one-shot a similar-scheme site with a good prompt. The hard part is providing an LLM with relevant context, and tools to help make sense of it. And that's what's lacking, IMO. I have started a small effort to benchmark this. Only a few cases are covered, and it's a bit disorganized, since I don't want to burn credits without a more solid framework. But see bench-routeros-tools/README.md at main · tikoci/bench-routeros-tools · GitHub (Note: the skill tests do not include a skill for the benchmarked "traps", so they are expectedly bad.)

When talking about AI and RouterOS, a lot of focus is on "build me a config for X", while I suspect there are more use cases like "this is broken. fix it." or "get me a chart+report on Y from Z combining A,B,C". In any case... if we want to see improvement with clunkers working with RouterOS... this is going to take some community effort to curate LLM-friendly "SKILLs"/instructions/docs that correct "training traps" and fill in some blanks for AI about RouterOS.

More anecdotally, but from my own experience developing AI tools for RouterOS: even frontier models struggle with a surprising number of "nuances" in RouterOS. There is actually a long list. The worst and most common offender is once/duration= things. And these "your training may be wrong" things should be cataloged and organized (and benchmarked). I have shared a small set of RouterOS skills here: GitHub - tikoci/routeros-skills: Custom instruction SKILL.md for MikroTik RouterOS v7 · GitHub - but they are less focused on configuration and more on some basics. So they all lack help on other hard ones like BGP/MPLS/OSPF, L3HW, "rose" (not benchmarked yet, but AI gets confused quickly), and new features like BTH/containers/mDNS. The list could go on.

And this is where instructions/skills/tools could help... however, the underlying problem is that you need grounded/validated source material. This is even harder for RouterOS, since you have version/cpu-arch/installed-packages/switch-chip/device-mode/etc. as additional dimensions, so "grounded/validated" material is highly contextual (yet the clunker will take you seriously/literally if you say "this is the right way to do X").
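
One way to picture "highly contextual" material is to attach the context dimensions to each note and filter before anything reaches the model. This is a minimal sketch in Python under that assumption. `RouterContext`, `Note`, and `select_notes` are hypothetical names, only three of the dimensions are modeled, and the notes themselves are placeholders, not real RouterOS guidance.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class RouterContext:
    version: tuple[int, int]   # (major, minor), e.g. (7, 8)
    arch: str                  # e.g. "arm64"
    packages: frozenset[str]   # installed packages

@dataclass(frozen=True)
class Note:
    text: str
    min_version: Optional[tuple[int, int]] = None
    archs: Optional[frozenset[str]] = None
    requires_package: Optional[str] = None

    def applies_to(self, ctx: RouterContext) -> bool:
        """A note applies only if every constraint it declares is satisfied."""
        if self.min_version is not None and ctx.version < self.min_version:
            return False
        if self.archs is not None and ctx.arch not in self.archs:
            return False
        if self.requires_package is not None and self.requires_package not in ctx.packages:
            return False
        return True

# Placeholder notes; the constraints are invented for the example.
NOTES = [
    Note("placeholder: only from 7.10 onward", min_version=(7, 10)),
    Note("placeholder: needs the container package", requires_package="container"),
    Note("placeholder: applies everywhere"),
]

def select_notes(ctx: RouterContext) -> list[str]:
    return [note.text for note in NOTES if note.applies_to(ctx)]

ctx = RouterContext(version=(7, 8), arch="arm64", packages=frozenset({"container"}))
print(select_notes(ctx))
```

The same shape would extend to switch chip, device-mode, and the rest; the hard part remains getting validated notes to put in the list.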

This is why I'm pitching that the real value would not be yet-another-ai-generated-website, but rather sharing whatever learning you put into the instructions/training/prompts embodied in your website/backend. These could be reviewed/extended elsewhere as part of a richer set of AI docs.

Just my two cents.

Appreciate the depth here, and the benchmark repo is a genuinely useful reference, will go through it properly.

Agree with the core point. The website and LLM plumbing was never the hard part, and I don't claim otherwise. The actual work has been in prompt constraints and correcting exactly the kind of training traps you're describing, once/duration= being a good example of the category. Right now that knowledge lives only inside my prompt logic, not published anywhere useful to the wider community.

Publishing that as a reviewable SKILL/instruction set, rather than keeping it locked inside one tool, is a fair ask and a better use of the effort than polishing another config generator. Will look at structuring what I have against the format in your routeros-skills repo so it's actually usable by others, not just my own backend.

The broader use cases you mentioned, "this is broken, fix it" and cross referencing multiple sources into a report, are also more realistic reflections of real troubleshooting than one-shot config generation. Noted for direction going forward.