Is 8 MB in a variable from a txt file possible?

Hi everyone, I would like to make a script that runs once a week and updates an address list on MikroTik.
Coming from this topic:

http://forum.mikrotik.com/t/max-size-of-variables-still-at-4096-anwser-is-no/168007/6

I found a very useful script for creating variables.
I’m trying to import an 8 MB .txt file, but it cuts the list off at roughly 3 MB.
Am I doing something wrong, or is it not possible?

Thank you.

My script is:

:global thefile ""
{
    :local url        "http://public-dns.info/nameservers-all.txt"
    :local filesize   ([/tool fetch url=$url as-value output=none]->"downloaded")
    :local maxsize    64512 ; # the maximum supported readable size of a block from a file
    :local start      0
    :local end        ($maxsize - 1)
    :local partnumber ($filesize / ($maxsize / 1024)) ; # "downloaded" is reported in KiB
    :local remainder  ($filesize % ($maxsize / 1024))
    :if ($remainder > 0) do={ :set partnumber ($partnumber + 1) }
    :for x from=1 to=$partnumber step=1 do={
         :set thefile ($thefile . ([/tool fetch url=$url http-header-field="Range: bytes=$start-$end" as-value output=user]->"data"))
         :set start   ($start + $maxsize)
         :set end     ($end   + $maxsize)
    }
}

#:log info "debug_thefile=$thefile"
#/file remove [find where name="check.txt"];
#:execute ":put \$thefile" file=check.txt;

:global content value=$thefile;
:local contentLen value=[:len $content];
:local lineEnd value=0;
:local line value="";
:local lastEnd value=0;
:local addressListName "DNS-DOH";

:if ($thefile != "") do={
  :log info "There are some new DNS entries"
  /ip firewall address-list remove [/ip firewall address-list find list=$addressListName]
  :do {
      :set lineEnd [:find $content "\n" $lastEnd];
      # handle a possible last line without a trailing newline
      :if ([:len $lineEnd] = 0) do={ :set lineEnd $contentLen }
      :set line [:pick $content $lastEnd $lineEnd];
      :set lastEnd ($lineEnd + 1);
      :local entry $line
      :if ($entry~"^[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}") do={
            :if ([:len $entry] > 0) do={
#:log info "debug_entry=$entry"
                /ip firewall address-list add list=$addressListName address=$entry;
            }
      }
  } while=($lineEnd < $contentLen);
} else={
  :log info "There are no DNS entries in the list"
}

From ROS version 7.13 there is a /file/read command (currently undocumented) which you can use to read a file in chunks, using the chunk-size and offset arguments. The fetch command will first need to download the whole file to the filesystem, rather than return the response data as a value into a variable.
This can be useful for parsing text by some delimiter: read the file in chunks and hold a character buffer variable until the delimiter is reached (assuming the text between delimiters will not hit the variable size limit). In your case it can be done, since the newline is the delimiter and the lines are not long, but this read feature is still not enough for parsing large structured files like JSON (it depends on the structure; some can be parsed, e.g. large arrays of smaller items).
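
For example, something like this (the file name is a placeholder; note the caveat in the next post, on current builds it only prints the content to the terminal):

# read one 32 KiB block of the file starting at byte 0
/file/read file="nameservers-all.txt" chunk-size=32768 offset=0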

On 7.13.5 it does not work: it only displays the content in the terminal, it does not create any variable, so it is of no use…

But if it gets fixed in the future, e.g. by adding as-value or similar…

:global readfile do={
    :local fname      $1
    :local thefile    ""
    :local filesize   [/file get $fname size] ; # size in bytes
    :local maxsize    32768 ; # the actual maximum supported readable size of a block from a file
    :local start      0
    :local partnumber ($filesize / $maxsize)
    :local remainder  ($filesize % $maxsize)
    :if ($remainder > 0) do={ :set partnumber ($partnumber + 1) }
    :for x from=1 to=$partnumber step=1 do={
        :set thefile ($thefile . [/file read file=$fname chunk-size=$maxsize offset=$start as-value])
        :set start ($start + $maxsize)
    }
    :return $thefile
}

:local test [$readfile "test.txt"]

I haven’t tested it; it’s strange that it is not implemented to return a value by default, there is not much practical use without one. Btw, regarding your function: are you sure this will work on large files and that $thefile will not hit the variable size limit, or maybe the device memory limit, since it appends the bytes from all chunks to it? I think for parsing purposes the byte buffer variable needs to hold only the bytes between delimiters (newlines in this case), not all the file bytes, to avoid memory limits, and adding each parsed IP into the address list should be performed inside the chunk iteration, when the delimiter is reached.
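
A minimal sketch of that buffering idea (untested; it assumes /file/read one day returns the chunk as a value, which it currently cannot, and the file name, chunk size and list name are placeholders):

:local fname "nameservers-all.txt"
:local chunksize 32768
:local offset 0
:local buffer ""
:local filesize [/file get $fname size]
:while ($offset < $filesize) do={
    # append one chunk; assumes /file/read can return data as a value
    :set buffer ($buffer . [/file read file=$fname chunk-size=$chunksize offset=$offset as-value])
    :set offset ($offset + $chunksize)
    # consume every complete line currently held in the buffer
    :local nl [:find $buffer "\n"]
    :while ([:typeof $nl] != "nil") do={
        :local line [:pick $buffer 0 $nl]
        :if ($line~"^[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}") do={
            /ip firewall address-list add list="DNS-DOH" address=$line
        }
        # drop the parsed line and its newline; the buffer keeps only the tail
        :set buffer [:pick $buffer ($nl + 1) [:len $buffer]]
        :set nl [:find $buffer "\n"]
    }
}

This way the buffer only ever holds the unfinished tail of the last chunk, never the whole file.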

Edit:
@Kataius idk why this didn’t come to my mind initially, since the web server supports chunked download, but this is for ROS 7.14 only (read below why):
You can append the fetched chunk bytes to a byte buffer, parse the buffer by the newline delimiter, also match against an IP address as validation (you already have that in the script) and add that IP into the address list, then remove the parsed bytes from the buffer; this avoids the buffer variable size limit when more bytes are appended in the next iteration (lines are not that long). All of this needs to be performed inside the fetch chunk loop. First you will need to perform an HTTP HEAD request on the URL and read the “content-length” header to get the full size in bytes, which you need to calculate the number of chunks; unfortunately the HTTP HEAD method for fetch is only available from ROS 7.14, so you will need to run that version if you want to achieve this.
Tip: to properly handle possible fetch connection/server errors while populating the address list in the chunk loop, it is better to use a temporary list: when fetch has completed successfully for all chunks, remove all items from the actually used address list and move the items from the temporary list into it. If something fails during the chunked fetch, just remove the temporary address list for cleanup and finish the operation.
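
A rough sketch of that temporary-list pattern (list names are placeholders; the chunk loop itself is elided):

:local tmpList "DNS-DOH-tmp"
:local liveList "DNS-DOH"
/ip firewall address-list remove [find list=$tmpList]
:do {
    # ... chunked fetch + parse loop goes here, adding each entry with:
    # /ip firewall address-list add list=$tmpList address=$entry
    # all chunks fetched successfully: swap the temporary list in
    /ip firewall address-list remove [find list=$liveList]
    /ip firewall address-list set [find list=$tmpList] list=$liveList
} on-error={
    # some fetch failed: drop the temporary list and keep the old one
    /ip firewall address-list remove [find list=$tmpList]
    :log warning "chunked download failed, address list left unchanged"
}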

The var size limit is the memory on the device.
I take it for granted that one imports 8 megs of address list into appropriate devices, not into little toys like the hAP lite… :laughing:

http://forum.mikrotik.com/t/max-size-of-variables-still-at-4096-anwser-is-no/168007/4

Ah ok, so just the current free memory is the limit… But the solution for parsing chunks proposed in my previous post would work on any device without worrying about free memory. For this case it doesn’t matter, but for parsing a 100 MB file maybe it will, and it can be implemented on ROS 7.14 without waiting for as-value for /file/read in a future version :slight_smile:

I have a RB5009UG+S+IN, not a little hAP lite :smiley:

Maybe there is some limit even if you have free memory for many more bytes…
Try the approach mentioned in my previous post, if you are skilled enough with scripting to implement it. Actually, you don’t need the HEAD method first to get the file size in bytes; it is more optimal, but it works only on ROS 7.14+. As I see, you are already downloading the full file before the chunked fetch to read the file size, so it is possible to do it with that.
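
I.e. something along these lines (the “downloaded” field is reported in KiB, so the byte count is approximate):

:local url "https://public-dns.info/nameservers-all.txt"
:local kib ([/tool fetch url=$url as-value output=none]->"downloaded")
:local filesize ($kib * 1024) ; # approximate total size in bytes
:put $filesize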

I doubt anyone in their right mind imports a 100 MB list of IPs to block… at that point it’s faster to block everything and implement a whitelist…
If you have to work on files, a container is better, or better yet, an external server…

Hi there,

I worked on two scripts, where I copied rextended’s script (please don’t hate me for this).
The logic is:

1. fetch the file in X file splits,

2. import them one by one and then delete them all.

The problem is that the first script blocks the entire execution of the second script, because it gives me a 1364.1 KiB split and the second script can’t read such a big file.
Is it possible to split into smaller files, and if so, how?


Thank you

FETCH and SPLIT

:local orname "DOH"
:local url "http://public-dns.info/nameservers-all.txt"
:local filesize ([/tool fetch url=$url as-value output=none]->"downloaded")
# 64512 is the max readable block size for RouterOS text variables.
# To insert the incomplete end of the previous file at the beginning of the next file, reduce the size of each piece accordingly.
:local maxsize 64512
:local start 0
:local end ($maxsize - 1)
:local partnumber ($filesize / ($maxsize / 1024))
:local remainder ($filesize % ($maxsize / 1024))
:if ($remainder > 0) do={ :set partnumber ($partnumber + 1) }
:for x from=1 to=$partnumber step=1 do={
    /tool fetch url=$url http-header-field="Range: bytes=$start-$end" keep-result=yes dst-path="/$orname$x.txt"
    :set start ($start + $maxsize)
    :set end ($end + $maxsize)
}

IMPORT AND DELETE

:local part 1
:local input ""
:local done false
:local lineEnd value=0;
:local line value="";
:local lastEnd value=0;
:local addressListName "TRY"
:local orname "DOH"

/ip firewall address-list remove [/ip firewall address-list find list=$addressListName]
/file
:while (!$done) do={
  :local nameFile ("$orname" . $part . ".txt")
  :if ([find where name=$nameFile]) do={
    :set input ($input . [get $nameFile contents])
    :local contentLen [:len $input]
    :do {
      :set lineEnd [:find $input "\n" $lastEnd]
      # handle a possible last line without a trailing newline
      :if ([:len $lineEnd] = 0) do={
        :set lineEnd $contentLen
      }
      :set line [:pick $input $lastEnd $lineEnd]
      :set lastEnd ($lineEnd + 1)
      :if ($line != "") do={
        :if ($line~"^[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}") do={
          /ip firewall address-list add list=$addressListName address=$line
          :log info "IMPORT"
          :delay 2
        }
      }
    } while=($lineEnd < $contentLen)
    /file remove ("$orname" . $part . ".txt")
    :set part ($part + 1)
  } else={
    :set done true
  }
}

Trying with a bump? :smiley:

Another bump?

I suspect there is a bug in the http-header-field argument of the fetch command; maybe it is not sent correctly (I didn’t check this against a local web server to analyse it). The downloaded data is always 64512 bytes, regardless of which range is set in the header, when output=user as-value:

:put [:len ([/tool fetch url="https://public-dns.info/nameservers-all.txt" http-header-field="Range: bytes=0-1" output=user as-value]->"data")]
64512

When output=file, it downloads the whole file:

/tool fetch url="https://public-dns.info/nameservers-all.txt" http-header-field="Range: bytes=0-1" output=file
:delay 3
:put [/file get "nameservers-all.txt" size]
1396869

This indicates that the Range header is not sent correctly.

Using curl it is OK:

curl -s -H 'Range: bytes=0-1' https://public-dns.info/nameservers-all.txt | wc -c
2

@optio nothing wrong there; data is a variable, and so 64512 bytes is the max.

I did not succeed in reading chunks from files, so I am doing other things in the meantime till it gets fixed… some day.

It shouldn’t be, since the Range header is set. The 64512 is there because the full data is fetched and truncated to that, and that’s wrong (it should be 2, as with curl).

Read it like this:

put [:len ...... ->"data")]

Curl’s wc is a word count, not bytes.

Update: I ran the line in the terminal and RouterOS always reads 64512 bytes; only the first number of 0-1 is used: “downloaded=63;duration=00:00:01;status=finished”, and 63 * 1024 = 64512 bytes.

RouterOS is taking a shortcut here, thinking the data is going to be put in a variable whose max size is 64512 bytes, so why not stuff it all in there and let the user use all of it, or a part.

Maybe you missed the -c argument…

$ wc --help
Usage: wc [OPTION]... [FILE]...
  or:  wc [OPTION]... --files0-from=F
Print newline, word, and byte counts for each FILE, and a total line if
more than one FILE is specified.  A word is a non-zero-length sequence of
printable characters delimited by white space.
...
  -c, --bytes            print the byte counts
...



It’s simple: download using fetch and curl with the same Range header:

:put ([/tool fetch url="https://public-dns.info/nameservers-all.txt" http-header-field="Range: bytes=0-1" output=user as-value]->"data")
2607:5300:203:1797::53
...

64512 is just the correct count of the wrong data; data should contain only the first 2 bytes (the characters 2 and 6 in this case) because of the Range header, and the HTTP server will return only these if the header is sent correctly, like when using curl:

curl -s -H 'Range: bytes=0-1' https://public-dns.info/nameservers-all.txt
26

@optio
Excuse me, but from the results of the experiments you posted, it could be that:
“Range: bytes=0-1”
is read/interpreted as (in pseudocode):
“Range: from_offset=0 length=64512”
(because any length < 64512 is “rounded up” to 64512),
or
it is simply ignored (for whatever reason) and it always returns the first 64512 bytes, no matter the values you put in the range.

Cannot the first 2 bytes be extracted from the retrieved 64512 in a second operation?
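
For reference, extracting them afterwards would just be a :pick over the returned data, e.g.:

# keep only the first 2 bytes of whatever fetch actually returned
:local data ([/tool fetch url="https://public-dns.info/nameservers-all.txt" http-header-field="Range: bytes=0-1" output=user as-value]->"data")
:put [:pick $data 0 2]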

It seems to me that what needs to be checked/tested is whether the “from_offset=0” part is respected/used.
I.e., in your command/setup, does:
“Range: bytes=1-1”
return 64512 bytes of which the first one is “6” (i.e. the data starting from offset 1)?
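
A quick (untested) way to check that from the terminal; if the start offset is honoured, this should print “607:” instead of “2607”:

:put [:pick ([/tool fetch url="https://public-dns.info/nameservers-all.txt" http-header-field="Range: bytes=1-1" output=user as-value]->"data") 0 4]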

This is a server-side operation for the Range header (the server returns bytes depending on the Range header value), see https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Range
That is my whole point: yes, you can extract some bytes from the returned bytes, but you cannot achieve a chunked download of all the bytes.
If you download all bytes to a file with fetch, then you have the issue mentioned here, where you cannot use /file/read to read all the bytes in chunks into a variable; if the Range header worked for fetch, it would be possible.

@optio I indeed missed the -c.

So in the end, I now think that the web server did not send the requested number of bytes because it might not support chunking, and so it sent all the data in one go. Because the data variable can only fit 64512 bytes, that is what RouterOS returns. The rest of the file is ignored, and my earlier conclusion that it was a kind of sliding window was also wrong.

If chunking is supported on the other end, then RouterOS should be working correctly.

By the way, this gives a possible way to test in RouterOS scripting whether a web server supports chunking: if you get a length of 64512, or greater than the requested number of bytes + 1, then there is no chunking support available.

CHECK:

:put ([/tool fetch url="https://view.sentinel.turris.cz/greylist-data/greylist-latest.csv" http-header-field="Range: bytes=0-80" output=user as-value]->"data")
# For the terms of use see https://view.sentinel.turris.cz/greylist-data/LICENSE.

:put [:len ([/tool fetch url="https://view.sentinel.turris.cz/greylist-data/greylist-latest.csv" http-header-field="Range: bytes=0-80" output=user as-value]->"data")]
81