Using AI to help configure RouterOS and write scripts

Well, its service name was mixed into a comparison of specific AI models, which is not the same thing.

Over the last few months I have used Claude, GitHub Copilot and ChatGPT. IMO ChatGPT 5.5 is the best deal of them. Claude Code runs into its rate limit way too early.

Again, you are comparing the Copilot service with models :slight_smile: You can choose which model is used in GitHub. On a free account, as far as I can see, the only available models are GPT-5 mini and Claude Haiku 4.5 (default), which are not so good.

On the company account where I have a subscription, many more are available, like Claude Opus 4.8, which is far superior to the free-account models mentioned above.

Bottom line: Copilot output can vary significantly depending on which model the service uses.

Not doing.

OK, but then it is better to mention which model Copilot is using, to get a stricter comparison.

As I said: GPT 5.5 offers the best value. I have yet to evaluate Opus 4.8.

For free? Probably yes, since Copilot only offers GPT-5 mini and Claude Haiku 4.5 on a free account, and both are inferior to GPT 5.5.

Claude Opus 4.8 is not free AFAIK.

@optio may be right about OpenAI subscription pricing. I'm not a direct subscriber since I would rather give Microsoft or Anthropic my money... but it may actually be the better deal after Copilot's pricing changes. I have both Claude Code and Copilot paid subscriptions.

What I used to do was use Claude Code to make complex/long-running/single prompts and/or plans, which does not take a lot of tokens... then run them in Copilot, picking a model (including "premium" ones) under their now-old per-request pricing, where a request could run for minutes/hours. In Copilot I often used GPT 5.5, which is quite good (at least when accessed via a paid Copilot subscription). GPT 5.5 is "somewhat better" than Sonnet 4.6, but not as good as Opus 4.7/4.8... but Opus is 3x the cost of GPT 5.5 while seemingly 0.5-1x better... Sonnet is actually pretty good, but GPT 5.5 seems a bit better, with both being really close in cost. Again, just my opinion. But I might have to consider an OpenAI subscription since Copilot pricing for the model is not as subsidized. There is a lot of funky arbitrage in model pricing, so it's hard to know.

But the initial benchmark does show that having some TBD "good" tools helps all models. And even Opus with training alone could "lose" to a free-ish model with good tools in some cases... Basically a spin on "garbage-in, garbage-out" (e.g. a minimal prompt to a frontier model may do worse).

One thing that likely helps any LLM with agents is to point some user-level instructions (~/.../CLAUDE.md and/or ~/.../AGENTS.md etc.; most LLM harnesses have some file that is read for all prompts) at MikroTik's new https://manual.mikrotik.com/llm.txt. Mentioning that it should be checked would likely offer an immediate improvement for most models. Or at least, one sentence that "reminds" the agent there are docs to read when processing a prompt is unlikely to hurt. The new docs are in markdown format, so they should be more token-efficient. See:

...and if you prompt your agent to read the above post and have it add manual.mikrotik.com to the "user-level instructions", it will figure out how to write the content into the user-level AGENTS.md/CLAUDE.md/copilot-instructions.md/etc. file... then a future agent can "reason" about whether it's useful to "read the manual" based on whatever you prompted, since it already knows how to access it. And once MikroTik actually publishes MD, I'll add "only manual.mikrotik.com" as another test case in bench-routeros-tools (tikoci.github.io), so we can test the theory more exactly.
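If you would rather add the reminder by hand than have the agent do it, something like this minimal Python sketch could work. To be clear about what is assumed: the `add_reminder` helper and the reminder wording are made up for illustration, and the file path is left for you to pass in, since where the user-level instructions file lives depends on the harness you use. Only the llm.txt URL comes from the post above.

```python
from pathlib import Path

# URL quoted in the post above.
DOCS_URL = "https://manual.mikrotik.com/llm.txt"

# Hypothetical wording; any one-sentence reminder should do.
REMINDER = (
    "For RouterOS configuration or scripting questions, check the MikroTik "
    f"manual index at {DOCS_URL} before answering."
)


def add_reminder(instructions_file: Path, reminder: str = REMINDER) -> bool:
    """Append the reminder to an instructions file unless it is already there.

    Returns True if the file was changed, False if the reminder was present.
    """
    if instructions_file.exists():
        existing = instructions_file.read_text(encoding="utf-8")
    else:
        existing = ""

    # Idempotent: running this twice must not duplicate the line.
    if reminder in existing:
        return False

    # Make sure the reminder starts on its own line.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"

    instructions_file.parent.mkdir(parents=True, exist_ok=True)
    instructions_file.write_text(existing + separator + reminder + "\n", encoding="utf-8")
    return True
```

You would call it as `add_reminder(Path("/path/to/your/AGENTS.md"))`, with the real location of your harness's instructions file. It only appends, so anything already in the file stays untouched.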

I am not talking about free. I have subscriptions for the services I named. And when I say GPT 5.5 I don't say "mini". I won't go into a detailed comparison, it is just my experience for my particular use.

@Amm0 Yes, Copilot's new pricing model takes effect today, so choosing a premium model can drastically affect the price. Here is a comparison: https://cosmin-vladutu.medium.com/end-of-an-era-github-copilot-models-and-prices-change-2f1129035aa6

From my experience I agree with the above: GPT 5.5 is very good for various kinds of analysis, while IMO nothing matches Opus 4.7-4.8 (using them via Copilot) when it comes to generating code.

I think it is a case of YMMV. If you had asked me before GPT 5.5 was released: nothing could match Opus 4.x for coding. But Codex GPT 5.5 in the high/xhigh variant is very good, universally. Even for writing code. It comes down to skills (AI skills.md I mean), memory (building a project wiki) and working with MCP servers.

OK, you may be right. I did not try the GPT-5.5-Codex model since it is not yet available in Copilot (see "Models and pricing for GitHub Copilot" in GitHub Docs). Probably OpenAI is not offering it outside their own service so that it can compete with Copilot.