PDU transcoding

Hi all. I decided to start a separate topic. I know a lot has already been written about transcoding SMS text, including with the functions from rextended. I mean the functions in the following posts:
HexGSM7toCP1252
http://forum.mikrotik.com/t/rextended-fragments-of-snippets/151033/1
CP1252toHexGSM7
http://forum.mikrotik.com/t/rextended-fragments-of-snippets/151033/1
UCS2toUTF8
http://forum.mikrotik.com/t/convert-any-text-to-unicode/164329/18
UTF8toUCS2
http://forum.mikrotik.com/t/convert-any-text-to-unicode/164329/26
pdutogsm7
http://forum.mikrotik.com/t/rextended-fragments-of-snippets/151033/1
gsm7topdu
http://forum.mikrotik.com/t/rextended-fragments-of-snippets/151033/1
I have some SMS examples. To begin with, let’s take those whose DCS field is “00”, meaning the text is not UCS2. In theory, one of these functions should then convert it into normal text directly. Here are some examples of such text: Example 1 = “C8329BFD065DDF72363904”, Example 2 = “D4F29C0E6A97E7F3F0B90CA2BF41412A68F86EB7C36E32885A9ED3CB72”, Example 3 = “DCE532B94C06CDDF6F37”.
But every direct way of calling the functions either causes an error or produces an unexpected result. For example:

> :put [$HexGSM7toCP1252 "C8329BFD065DDF72363904"] 
Invalid 7-bit value (200)

Possibly the value of the DCS field alone is not enough to determine the encoding method. Perhaps there is something else to pay attention to, or some other sequence of actions is needed. I would like to understand two things, including for the case where the DCS field is “08”. First, how to use the above functions. Second, what the general recoding algorithm is.
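For reference, the "Invalid 7-bit value (200)" error is consistent with the string still being packed: 0xC8 is 200, and a septet cannot exceed 127. With DCS = 00 the user data holds 7-bit characters packed back to back, so it has to be unpacked into septets before any GSM-7 table lookup. Below is a minimal sketch of that general step, written in Python rather than RouterOS script so it can be run anywhere. The names `GSM7_BASIC`, `unpack_septets` and `decode_gsm7` are illustrative only, the extension table (escape 0x1B) is not handled, and no UDH is assumed.

```python
# GSM 03.38 default alphabet, index = septet value (extension table not handled).
GSM7_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)

def unpack_septets(data: bytes, count=None):
    """Split packed user data into 7-bit values, least significant bits first."""
    bits = int.from_bytes(data, "little")
    if count is None:
        # Assumption: the UDL field is not available, so take every full septet.
        count = len(data) * 8 // 7
    return [(bits >> (7 * i)) & 0x7F for i in range(count)]

def decode_gsm7(hex_text: str, count=None) -> str:
    """Hex string of packed septets -> text in the basic GSM-7 alphabet."""
    septets = unpack_septets(bytes.fromhex(hex_text), count)
    return "".join(GSM7_BASIC[s] for s in septets)

print(decode_gsm7("C8329BFD065DDF72363904"))
```

When the real septet count (the UDL field of the PDU) is known, passing it as `count` avoids a spurious trailing "@" on messages whose length is a multiple of eight characters minus one.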

> :put [$HexGSM7toCP1252 [$pdutogsm7 ("\C8\32\9B\FD\06\5D\DF\72\36\39\04")]]
Hello World!

> :put [$HexGSM7toCP1252 [$pdutogsm7 ("\D4\F2\9C\0E\6A\97\E7\F3\F0\B9\0C\A2\BF\41\41\2A\68\F8\6E\B7\C3\6E\32\88\5A\9E\D3\CB\72")]]
Test message to AT Command Tester

> :put [$HexGSM7toCP1252 [$pdutogsm7 ("\DC\E5\32\B9\4C\06\CD\DF\6F\37")]]
Invalid PDU data, expected value not provided.

Something is wrong with that string...

but after adding the part YOU consider useless...

> :put [$HexGSM7toCP1252 [$pdutogsm7 ("\05\00\03\59\03\03\DC\E5\32\B9\4C\06\CD\DF\6F\37")]]
@ H5 @needed soon

The real output is: é@øH5¿@needed soon, because 0xE9 (é in CP1252), 0xF8 (ø in CP1252) and 0xBF (¿ in CP1252) can't be printed on the RouterOS terminal...

The "é@øH5¿@" multipart header IS PART OF THE MESSAGE, you must not strip that part before decoding the message.
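To make the header point concrete: when a message carries a User Data Header, the user data starts with a length byte (UDHL) followed by the header, and for GSM-7 the text begins on the next septet boundary after it. That is why the header bytes have to be present while unpacking, and why the first seven septets come out as "é@øH5¿@". Here is a hedged Python sketch (not RouterOS script). The function names are hypothetical, only the basic alphabet is covered, and the input is assumed to be the user data field with the header included.

```python
# GSM 03.38 default alphabet, index = septet value (extension table not handled).
GSM7_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)

def split_udh(ud: bytes):
    """Return ([(IEI, data), ...], number of septets occupied by the header)."""
    udhl = ud[0]                          # header length, not counting this byte
    header = ud[1:1 + udhl]
    elements, pos = [], 0
    while pos + 2 <= len(header):
        iei, ielen = header[pos], header[pos + 1]
        elements.append((iei, header[pos + 2:pos + 2 + ielen]))
        pos += 2 + ielen
    skip = ((udhl + 1) * 8 + 6) // 7      # header bits rounded up to whole septets
    return elements, skip

def decode_gsm7_with_udh(hex_ud: str):
    """Unpack the whole field, then drop the septets that overlap the header."""
    ud = bytes.fromhex(hex_ud)
    elements, skip = split_udh(ud)
    bits = int.from_bytes(ud, "little")
    count = len(ud) * 8 // 7              # assumption: UDL unknown, use all full septets
    septets = [(bits >> (7 * i)) & 0x7F for i in range(count)]
    return elements, "".join(GSM7_BASIC[s] for s in septets[skip:])

elements, text = decode_gsm7_with_udh("050003590303DCE532B94C06CDDF6F37")
print(elements, text)
```

In this example the single element has IEI 0x00 (concatenated short message) with data 59 03 03: reference 0x59, 3 parts in total, this is part 3. A phone uses the header to reassemble the parts and does not display it, which would explain why it is never visible in a "real" SMS.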

I didn't say I thought it was useless. On the contrary, I assumed there was something I had not taken into account. But I would like to understand why, in the case of a multipart SMS, the headers are needed as well.

What are these characters? In a real SMS they are not there, only the text.

Why the backslashes? Are they specific to your function, or a prerequisite for decoding in general?

And then the question: if the DCS field = "08", what should be done, and in what order? Including where to put the backslashes?
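On the backslashes: in a RouterOS string, "\C8" is the escape for the single byte 0xC8, so "\C8\32..." is the raw binary form of the hex text "C832...". In the examples above pdutogsm7 is called with raw bytes, not hex digits. On DCS = "08": there is no septet packing at all. The user data is UCS-2 in big-endian byte order, two bytes per character, and a UDH, if present, is skipped byte-wise. Here is a hedged Python sketch of that branch (illustrative names). Only the general data coding group is considered and the compression bit is ignored.

```python
def alphabet_from_dcs(dcs: int) -> str:
    """Classify DCS for the general data coding group (top two bits 00)."""
    if dcs & 0xC0:
        raise ValueError("coding group not covered by this sketch")
    # Bits 3..2 select the alphabet in this group.
    return {0x00: "gsm7", 0x04: "8bit", 0x08: "ucs2"}.get(dcs & 0x0C, "reserved")

def decode_ucs2(hex_ud: str, has_udh: bool = False) -> str:
    """Decode a UCS-2 user data field given as a hex string."""
    ud = bytes.fromhex(hex_ud)
    if has_udh:
        ud = ud[1 + ud[0]:]               # drop the UDHL byte plus the header it announces
    return ud.decode("utf-16-be")         # UCS-2 is read here as big-endian UTF-16

print(alphabet_from_dcs(0x08), decode_ucs2("00480069"))
```

So the order for DCS = 08 would be: hex to bytes, drop the header if the PDU says there is one, then read two bytes per character and convert to whatever the terminal can show (UTF-8, for instance).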

It’s time to explore the encoding functions. Besides the work done by Rextended, I would like to mention the option from Optio. Here’s the link to their encoder:
http://forum.mikrotik.com/t/using-two-arrays-process-the-text-and-create-a-third-array/166923/34

I would also like to mention a solution from someone who is not present on this forum (at least I haven’t seen them here), but whose function also works. I have made some modifications to it, adding optional arguments that might later allow UCS2/UTF8/url/translit encoding functions to be added.
As far as I know, though, the author already has a solution for encoding conversion in all directions.

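#Author of the idea: Fly.
# $1 = user data as a hex string, $2 = header flag (see the last step), $3 = text type (not used yet).
:local decodeText do={
	# Turn a character code into a one-byte string by parsing a "\XX" escape.
	:local num2char do={
		:local charNum [ :tonum $1 ];
		:return [ [ :parse "( \"\\$[ :pick "0123456789ABCDEF" ( ( $charNum >> 4 ) & 0xF ) ]$[ :pick "0123456789ABCDEF" ( $charNum & 0xF ) ]\" )" ] ];
	};
	:local Text $1;
	:local lenText $2;
	:local typeText $3;
	:local textDecoded "";
	# cursorBit = number of bits carried over from the previous byte, nextpart = those bits.
	:local cursorBit 0;
	:local nextpart 0;
	:for i from=0 to=( [ :len $Text ] - 1 ) step=2 do={
		# Read two hex digits as one byte.
		:local tmp [ :pick $Text $i ];
		:local charcode ( [ :find "0123456789ABCDEF" $tmp ] * 16 );
		:set tmp [ :pick $Text ( $i + 1 ) ];
		:set charcode ( $charcode + [ :find "0123456789ABCDEF" $tmp ] );
		# The low bits of this byte complete the current septet, the high bits start the next one.
		:if ( $cursorBit < 7 ) do={
			:set tmp ( $charcode & ( 127 >> $cursorBit ) );
			:set tmp ( $tmp << $cursorBit );
			:set tmp ( $tmp + $nextpart );
			:set nextpart ( $charcode >> ( 7 - $cursorBit ) );
			:set cursorBit ( $cursorBit + 1 );
		};
		:set textDecoded ( $textDecoded . [ $num2char $tmp ] );
		# After seven bytes the carry is itself a full septet (the eighth character).
		:if ( $cursorBit = 7 ) do={
			:set tmp $nextpart;
			:set cursorBit 0;
			:set nextpart 0;
			:set textDecoded ( $textDecoded . [ $num2char $tmp ] );
		};
	};
	# When $2 > 0, drop the first 7 septets; this matches a 6-byte UDH (48 bits plus 1 fill bit), other header sizes are not handled.
	:if ( $lenText > 0 ) do={ :set textDecoded [ :pick $textDecoded 7 [ :len $textDecoded ] ]; };
	:return $textDecoded;
};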

It would be good to combine the approach from the respected Rextended with the one from Fly.
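As a reference point for combining the two approaches, the encoding direction is the mirror image of the unpacking loop above: map each character to its septet, then pack the septets back to back, least significant bits first. Here is a minimal Python sketch (not RouterOS script). The names are illustrative, and only the basic alphabet is covered, with no extension table and no UDH fill bits.

```python
# GSM 03.38 default alphabet, index = septet value (extension table not handled).
GSM7_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)

def pack_gsm7(text: str) -> str:
    """Encode text in the basic GSM-7 alphabet and return the packed hex string."""
    bits = 0
    for i, ch in enumerate(text):
        septet = GSM7_BASIC.find(ch)
        if septet < 0:
            raise ValueError(f"{ch!r} is not in the basic GSM-7 alphabet")
        bits |= septet << (7 * i)         # character i occupies bits 7*i .. 7*i+6
    size = (7 * len(text) + 7) // 8       # bytes needed, rounded up
    return bits.to_bytes(size, "little").hex().upper()

print(pack_gsm7("Hello World!"))
```

The result for "Hello World!" is the same hex string as Example 1 from the first post, which makes the pair of functions easy to check against each other.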

Is there any way to determine which language is used in an SMS?

That only applies to GSM-7 (or ASCII-8), since UCS-2 is already multilingual.
Good luck…
https://en.wikipedia.org/wiki/GSM_03.38#National_language_shift_tables
https://en.wikipedia.org/wiki/User_Data_Header
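For the message itself (not the SIM), the two links boil down to two UDH information elements: IEI 0x24 (national language single shift) and IEI 0x25 (national language locking shift), each carrying a one-byte language identifier. If neither is present, a GSM-7 message uses the default alphabet and the language cannot be read from the PDU. Here is a Python sketch that scans a header for them. The names are illustrative, the language table is deliberately partial, and the example headers are made up.

```python
LANGUAGES = {1: "Turkish", 2: "Spanish", 3: "Portuguese"}   # partial table

def national_language(udh_hex: str) -> dict:
    """Scan a UDH (hex, starting at the UDHL byte) for national language shift elements."""
    udh = bytes.fromhex(udh_hex)
    header = udh[1:1 + udh[0]]
    found, pos = {}, 0
    while pos + 2 <= len(header):
        iei, ielen = header[pos], header[pos + 1]
        value = header[pos + 2:pos + 2 + ielen]
        if iei in (0x24, 0x25) and ielen == 1:
            kind = "single_shift" if iei == 0x24 else "locking_shift"
            found[kind] = LANGUAGES.get(value[0], f"id {value[0]}")
        pos += 2 + ielen
    return found

print(national_language("03250101"))       # hypothetical header: locking shift, id 1
print(national_language("050003590303"))   # concatenation header only, no language info
```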

Which UDH field can indicate the language of the SIM card? Or how can you read the language of the SIM card in general?