Imagine a MikroTik Telegram parser that can execute user functions from a chat. That is, we post something like myFunc par1 par2 … parN in the chat, the parser passes it to the MikroTik, and the router executes it.
To do this, I build a string for :parse, in which I pass the name of the function to execute in $funcName and its parameters in $parameters. Of course, the parameters of the function can be of different kinds, positional and named, but only string parameters (:type “str”) are passed correctly.
All of this works well for me, but only if the parameters are written in Latin characters. As soon as I try to pass parameters in a national language (for example, Russian), the construct stops working. I tried recoding the parameters to UTF8, but it didn’t help.
How do I pass parameters in a national language to $parameters? Or is it impossible?
RouterOS accepts only 7-bit characters:
NUL@ SOH STX ETX EOT ENQ ACK BEL BS HT LF VT FF CR SO SI
DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC[ FS\ GS] RS^ US_
SP ! " # $ % & ' ( ) * + , - . /
0 1 2 3 4 5 6 7 8 9 : ; < = > ?
@ A B C D E F G H I J K L M N O
P Q R S T U V W X Y Z [ \ ] ^ _
` a b c d e f g h i j k l m n o
p q r s t u v w x y z { | } ~ DEL
Note: not all control codes from NUL to US do something.
Any other character is unsupported, for example:
Ç ü é â ä à å ç ê ë è ï î ì Ä Å
É æ Æ ô ö ò û ù ÿ Ö Ü ¢ £ ¥ ₧ ƒ
á í ó ú ñ Ñ ª º ¿ ⌐ ¬ ½ ¼ ¡ « »
░ ▒ ▓ │ ┤ ╡ ╢ ╖ ╕ ╣ ║ ╗ ╝ ╜ ╛ ┐
└ ┴ ┬ ├ ─ ┼ ╞ ╟ ╚ ╔ ╩ ╦ ╠ ═ ╬ ╧
╨ ╤ ╥ ╙ ╘ ╒ ╓ ╫ ╪ ┘ ┌ █ ▄ ▌ ▐ ▀
α ß Γ π Σ σ µ τ Φ Θ Ω δ ∞ φ ε ∩
≡ ± ≥ ≤ ⌠ ⌡ ÷ ≈ ° ∙ · √ ⁿ ² ■ NBSP
The unsupported 8-bit characters can be represented as HEX values, from \80 to \FF.
UTF-8 represents every character outside the 7-bit range as a sequence of two to four such 8-bit bytes, which is how it covers all possible languages.
How do you send an 8-bit character so it can be used with RouterOS?
You must transform the character into its HEX equivalent.
For example ±, common to both Latin CP1252 and Cyrillic CP1251, must be converted from UTF-8,
and in UTF-8 that character uses two bytes: ± = "\C2\B1"
So the program that sends the string to RouterOS must convert all non-7-bit characters from UTF-8 to escaped HEX sequences.
To convert Привіт to a string usable by MikroTik:
П = \D0\9F
р = \D1\80
и = \D0\B8
в = \D0\B2
і = \D1\96
т = \D1\82
Привіт = "\D0\9F\D1\80\D0\B8\D0\B2\D1\96\D1\82"
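As a rough sketch of such a converter, something like the following could run on the machine that talks to Telegram, before the string ever reaches RouterOS. Python is only my choice for illustration, and the name to_routeros_escaped is made up, not part of RouterOS or Telegram. It assumes the text arrives as a normal Unicode string and only deals with non-7-bit bytes; quotes, backslashes and $ inside the text would still need their own handling.

```python
def to_routeros_escaped(text: str) -> str:
    """Replace every non-7-bit byte of the UTF-8 encoding with a \\XX escape."""
    out = []
    for byte in text.encode("utf-8"):
        if byte < 0x80:
            # 7-bit characters are passed through unchanged
            out.append(chr(byte))
        else:
            # 8-bit bytes become backslash plus two uppercase hex digits
            out.append("\\%02X" % byte)
    return "".join(out)


print(to_routeros_escaped("Привіт"))  # \D0\9F\D1\80\D0\B8\D0\B2\D1\96\D1\82
print(to_routeros_escaped("±"))       # \C2\B1
```

The hex digits are written in capital letters to match the examples in this thread.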
But obviously the conversion must happen before RouterOS is involved in any way.
Or, more “simply” (for all languages), use only 7-bit characters directly:
Привіт => Pryvit
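A minimal sketch of that second approach, again in Python and again outside RouterOS. The table below is hypothetical and deliberately tiny: it covers only the letters of the example word, and a real one would need a full mapping for the language in question. The names TRANSLIT and to_ascii are made up for this example.

```python
# Hypothetical, incomplete table: only the letters of the example word.
TRANSLIT = {"П": "P", "р": "r", "и": "y", "в": "v", "і": "i", "т": "t"}


def to_ascii(text: str, table: dict = TRANSLIT) -> str:
    """Replace non-7-bit characters using the table; unknown ones become '?'."""
    out = []
    for ch in text:
        if ord(ch) < 0x80:
            out.append(ch)                  # already 7-bit, keep as-is
        else:
            out.append(table.get(ch, "?"))  # transliterate or mark as unknown
    return "".join(out)


print(to_ascii("Привіт"))  # Pryvit
```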
The same goes for emoticons; in the end they are simply characters with a specific design.
"@anav 🍁" = "@anav \F0\9F\8D\81"
Rex, thank you for your detailed answer. I understand everything you wrote.
To convert the parameters to UTF8, I used this function. It does a great job when you need to send CP1251 to Telegram, but it doesn’t work when I try to use it for :parse
Telegram uses UTF-8 natively, so you need a converter from UTF-8 to ASCII that RouterOS can use.
The function FuncCP1251toUTF8 (really badly made) and my version for CP1252 (not the best, but clearer) http://forum.mikrotik.com/t/rextended-fragments-of-snippets/151033/1
are clearly named to_UTF8 because they convert the characters RouterOS holds, treated as CP125x, into UTF-8 for use with Telegram & Co.
In the other direction, if the text coming from Telegram contains non-7-bit characters (Cyrillic, emoticons & Co.), you receive UTF-8, which RouterOS cannot understand directly.
Telegram sends “Привіт @Sertik, Привіт @anav 🍁” in UTF-8.
RouterOS receives, if nothing in between converts the string:
"ÐŸÑ€Ð¸Ð²Ñ–Ñ‚ @Sertik, ÐŸÑ€Ð¸Ð²Ñ–Ñ‚ @anav ðŸ"
RouterOS stops parsing at “Ð” because it does not understand it.
Just as you use a function inside RouterOS to convert from the RouterOS language to the Telegram language,
you must write a function on the Telegram side to convert from the Telegram language to the RouterOS language.
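To see where the garbled text comes from, here is a small Python sketch that reads the UTF-8 bytes one at a time as if each were a single-byte character. It assumes a CP1252 view of the bytes purely for display; RouterOS itself does not "decode" anything, it just sees 8-bit bytes it does not accept. The name misread_as_cp1252 is made up.

```python
def misread_as_cp1252(text: str) -> str:
    """Show UTF-8 bytes as they look when each byte is taken as one CP1252 character."""
    raw = text.encode("utf-8")
    # Bytes with no CP1252 character assigned are shown as U+FFFD
    return raw.decode("cp1252", errors="replace")


print(misread_as_cp1252("Привіт"))  # ÐŸÑ€Ð¸Ð²Ñ–Ñ‚
```

Each Cyrillic letter turns into two unrelated characters, and the very first byte (\D0, shown as Ð) is already outside the 7-bit range.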