Scheduler stops executing script

Hi!

I have a quite strange problem.

I am using a simple scheduler script on a CCR1009:

/system scheduler
add interval=27m name=pull_blacklists on-event=\
    "/system script run get_blocklists" policy=\
    ftp,reboot,read,write,policy,test,password,sniff \
    start-date=jun/20/2020 start-time=00:05:15

The script runs fine for some weeks and then suddenly stops being executed. “Next run” no longer updates in the GUI.

Do you have any idea, why this happens?
If I change the “Start-Time” by one second in that situation, the script runs again (for some weeks).

Thank you for your help
Stril

…am I the only one, seeing that problem?

This is not diagnosable from this alone; the scheduler itself is ok.
You must also post the script you try to run…

Hi!

There are multiple scripts on the router. All of them run fine for some weeks, and then they stop simultaneously.

In the scheduler, put lines like:

/log warning "ScriptX Starts"
/system script run get_blocklists
/log warning "ScriptX End"

Add similar lines, and your logs will show you whether the scripts are really working or not, and at which step they break.

Try running your script manually and check whether it succeeds.

Hi!

Scripts are running fine! 100%.
The only problem is that the scheduler no longer tries to run them after a few thousand successful runs.

This is also visible in the scheduler’s “next-run” value, which is in the past in that case.

@Stril I asked you to post the script here for further analysis,
but your assumption that the executed script is perfect and does not block the scheduler makes me take this decision:

End of help from my side.

Since the problem seems to be connected to the internal state of your router, and normal users have no way of analyzing it, it’s useless to pursue this matter on the (user) forum. I guess your next steps should be: 1) create a supout.rif file when the scheduler stops executing scripts and 2) open a ticket with support@mikrotik.com.

My last suggestion.
Add this log warning to the script with the highest run counter - it is probably the first one to have a problem.

Next, use the totally free https://deadmanssnitch.com/plans plan with one snitch. This gives you a URL… and if this URL is not requested within x time, the portal notifies you that it has not seen a new request.
I use https://deadmanssnitch.com and similar services to get notified when some process does not finish its job. Generally they are made to check Linux CRON jobs, but you can use them in many ways.
When you receive that e-mail notification, go to the MikroTik, create a supout.rif and send it directly to MikroTik; here on the forum we can only confirm that it should work and that the config is ok.

@SiB, on each run the scheduler must check whether the previous run has finished, and if not, warn the user in some way.

pseudocode scheduler
set global variable randomnameJhdsfg to “endscript” if the variable does not already exist
check global variable randomnameJhdsfg: if it is not set to “endscript”, warn the user in some way
set global variable randomnameJhdsfg to “startscript”
run the needed script
set global variable randomnameJhdsfg to “endscript”
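A minimal RouterOS sketch of that pseudocode (the variable name lastRunState and the script name get_blocklists are placeholders for whatever you actually use):

# run this wrapper from the scheduler instead of calling the script directly
:global lastRunState
# first run ever: initialize the flag
:if ([:typeof $lastRunState] = "nothing") do={ :set lastRunState "endscript" }
# if the previous run never reached its end, warn the user
:if ($lastRunState != "endscript") do={
    :log warning "previous scheduled run did not finish"
}
:set lastRunState "startscript"
/system script run get_blocklists
:set lastRunState "endscript"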

rextended wrote:

I know that you do everything by scripting :).
But sometimes, to avoid reinventing the wheel, I just add one line of code at the end of my schedulers/scripts, like:
/tool fetch url=https://nosnch.in/00abcdef3f
/tool e-mail send to=00abcdef3f@nosnch.in
and that's all. If the scripts stop updating this remote service via the URL/e-mail, the online service sends me an e-mail (on the free account, at the end of the hour/day/week/month).

Of course the OP of this thread could just add a log warning and watch the logs to see whether the scripts run or not... this can be diagnosed in many ways.
That's why we like MikroTik... scripting :slight_smile:

Hi!

I just added the log-warning.

Here the main script:

:log info message=("Start Sending Report");
:local ipList value="";
:foreach tmpAddress in=[/ip firewall address-list find where list=HONEYPOT] do={
:set $attackip value=([/ip firewall address-list get $tmpAddress value-name=address]);
:log info message=("$attackip"."Report to AbuseIPDB");
:do {/tool fetch keep-result=no http-method=post url="https://api.abuseipdb.com/api/v2/report" http-data="key=xxxx&categories=14&comment=Portscan&ip=$attackip"} on-error={:log info message="Error for Report of IP $attackip"}
:delay 6000ms;
};
:log info message=("Sending Report End");

One additional question:
Does the scheduler stop executing ALL jobs if one job does not finish?

You are right, this was not correct - Sorry for that!
I was thinking that if all the scheduler jobs stop working simultaneously, the script itself cannot be the reason.

Have you ever seen the scheduler stop executing new jobs because one script is "blocked", and then restart after the start time is re-adjusted?

Sometimes “fetch” freezes waiting for an answer from the remote site and locks the script (and the scheduler).
on-error cannot catch indefinite waiting…

also

“:set $attackip value=”

where is “attackip” defined?
and :set must be used without the $

only 6 seconds between fetch calls?
some fetches can overlap…

How many IPs do you report? If the list is huge, it can take some hours…
Do you delete the IPs, or do you continuously resend notices for already-notified IPs?

Rewritten script, without changing the logic

:log info "Start Sending Report"
/ip firewall address-list
:foreach tmpAddress in=[find where list="HONEYPOT"] do={
    :local attackip [get $tmpAddress address]
    :log info "BEGIN $attackip Report to AbuseIPDB"
    :do { /tool fetch keep-result=no http-method=post  \
                      http-data="key=xxxx&categories=14&comment=Portscan&ip=$attackip" \
                      url="https://api.abuseipdb.com/api/v2/report"
    } on-error={:log error "Error for Report of IP $attackip"}
# added for debug
    :log info "END $attackip Report to AbuseIPDB"
    :delay 10s
}
:log info "Sending Report End"

Hi!

THANK YOU! That’s a great help.
I changed the script with your input. I have no idea why it did what it was supposed to do despite the wrong parts…

Is there any possibility to catch a freeze?

Currently, I am sending the list every hour, and since I set timeouts on the “HONEYPOT” list of less than one hour, there should be very few “double sends”.
How does the scheduler behave if the script does not complete before the next run?

Is there any way to exit a “foreach” after x entries? Is there something like an exit condition?

Thank you very much!

fetch can cause an infinite DELAY, not an infinite loop;
everything is frozen waiting for fetch to finish. It is not a cycle that can be checked for running too long and exited automatically…
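As for exiting a “foreach” after x entries: RouterOS script has no break statement, but you can count iterations and simply skip the body once the limit is reached. A sketch (maxReport is a placeholder):

    :local maxReport 50
    :local cnt 0
    /ip firewall address-list
    :foreach tmpAddress in=[find where list="HONEYPOT"] do={
        :if ($cnt < $maxReport) do={
            :local attackip [get $tmpAddress address]
            # ... report $attackip here ...
            :set cnt ($cnt + 1)
        }
    }

The loop still walks the whole list, but the body does nothing after the first maxReport entries.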

Try my script; if it fails, we add asynchronous fetch execution.
Something like this - you can see my snippets, the link is in my signature.
It auto-closes after 20s of waiting…

    :local jobid [:execute script="/tool fetch ............"]
    :local sec 0
    :while (([:len [/sys script job find where .id=$jobid]] = 1) && ($sec < 20)) do={
        :set sec ($sec + 1)
        :delay 1s
    }
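If the fetch job is still there after the 20 seconds, it can presumably be removed explicitly; a sketch, appended after the while loop above:

    # after the wait loop: if the fetch job is still alive, kill it
    :if ([:len [/system script job find where .id=$jobid]] = 1) do={
        /system script job remove $jobid
        :log warning "fetch job killed after 20s timeout"
    }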

Thank you! I will try both over the next weeks and report back if there are any problems.

Thank you for your advice and sorry for my first reaction…

No problem

Quick question for the gurus
@rextended, @SiB, @mkx

Q1:
When using the parameters start-date and start-time,
does RouterOS have to recalculate “Next run” from these parameters every single time, or only on startup?

Q2:
Would it make a performance difference to change the schedule start-time to startup?