More info from inside the container
I updated the entrypoint and cmd so that strace launches as the first process and execs the program, capturing a trace of its activity to ascertain whether the application is misbehaving.
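For reference, the wrapper looks roughly like this in Dockerfile terms (a sketch reconstructed from the ps output below; the same split works via the RouterOS container entrypoint/cmd properties, and the promscrape.config path is assumed from the directory listing further down). The flags: -ff follows forks and writes one output file per PID, -y decodes file descriptors into paths, and -o sets the output file prefix.

ENTRYPOINT ["strace", "-ffyo", "/etc/vmagent/strace.out", "/vmagent-prod"]
CMD ["-remoteWrite.url=http://xxxxxx:9009/api/v1/push", "-promscrape.config=/etc/vmagent/prometheus.conf"]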
You can see that strace started successfully and launched the program:
/ # ps -ef
PID USER TIME COMMAND
1 root 0:00 strace -ffyo /etc/vmagent/strace.out /vmagent-prod -remoteWrite.url=http://xxxxxx:9009/api/v1/push -promscra
9 root 0:00 strace -ffyo /etc/vmagent/strace.out /vmagent-prod -remoteWrite.url=http://xxxxxx:9009/api/v1/push -promscra
10 root 0:00 ps -ef
You can see it has already written a trace file out:
/etc/vmagent # ls -la /etc/vmagent/
total 20
drwxr-xr-x 2 root root 4096 May 28 00:04 .
drwxr-xr-x 20 root root 4096 May 27 23:36 ..
-rw-r--r-- 1 root root 10 May 27 23:36 .type
-rw-r--r-- 1 root root 476 May 27 23:43 prometheus.conf
-rw-r--r-- 1 root root 199 May 28 00:05 strace.out.11
A full minute after the container starts up, the per-PID trace file (strace.out.11, the file -ff created for PID 11) shows the program has barely gotten anywhere:
/etc/vmagent # cat /etc/vmagent/strace.out.11
execve("/vmagent-prod", ["/vmagent-prod", "-remoteWrite.url=http://xxxxxx"..., "-promscrape.config=/etc/vmagent/"...], 0xfffffffffd50 /* 3 vars */) = 0
set_tid_address(0xeef210) = 11
brk(NULL) = 0xf3e000
brk(0xf40000) = 0xf40000
mmap(0xf3e000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xf3e000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffff7ffd000
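To put a number on the rate, the per-PID trace can be polled from a second shell (assuming the image's busybox provides wc and sleep):

/etc/vmagent # while true; do wc -l strace.out.11; sleep 60; done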
10 minutes later, it has managed to make only 10 more calls, when there should be hundreds by now:
execve("/vmagent-prod", ["/vmagent-prod", "-remoteWrite.url=http://xxxxxx"..., "-promscrape.config=/etc/vmagent/"...], 0xfffffffffd50 /* 3 vars */) = 0
set_tid_address(0xeef210) = 11
brk(NULL) = 0xf3e000
brk(0xf40000) = 0xf40000
mmap(0xf3e000, 4096, PROT_NONE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xf3e000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffff7ffd000
munmap(0xfffff7ffd000, 4096) = 0
sched_getaffinity(0, 8192, [0 1 2 3]) = 8
openat(AT_FDCWD</>, "/sys/kernel/mm/transparent_hugepage/hpage_pmd_size", O_RDONLY) = -1 ENOENT (No such file or directory)
mmap(NULL, 262144, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffff7fbe000
mmap(NULL, 131072, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffff7f9e000
mmap(NULL, 1048576, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffff7e9e000
mmap(NULL, 8388608, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffff769e000
mmap(NULL, 67108864, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffff369e000
mmap(NULL, 536870912, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffffd369e000
mmap(NULL, 536870912, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xffffb369e000
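(The big PROT_NONE mmaps are expected: vmagent is a Go binary, and these are the Go runtime reserving address space for its heap arenas, so the process is still inside what should be the first fraction of a second of runtime initialization.) Re-running the wrapper with timestamps would show exactly where the time goes; -tt prefixes each call with wall-clock time and -T appends the time spent inside each call (same command line as before, config path assumed as above):

strace -ffyo /etc/vmagent/strace.out -tt -T /vmagent-prod \
    -remoteWrite.url=http://xxxxxx:9009/api/v1/push \
    -promscrape.config=/etc/vmagent/prometheus.conf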
It continues making calls at this unbelievably slow pace for hours, and never completes enough system calls for the application to finish starting up.
It is as if RouterOS were throttling the container, except CPU and memory on the system show there is no resource problem:
[admin@MikroTik] /system/resource> print
uptime: 38m56s
version: 7.14.3 (stable)
build-time: 2024-04-17 12:47:58
factory-software: 7.5
free-memory: 657.4MiB
total-memory: 928.0MiB
cpu: ARM64
cpu-count: 4
cpu-frequency: 1320MHz
cpu-load: 1%
free-hdd-space: 90.2MiB
total-hdd-space: 128.0MiB
write-sect-since-reboot: 2387
write-sect-total: 388926
bad-blocks: 0%
architecture-name: arm64
board-name: hAP ax^3
platform: MikroTik
[admin@MikroTik] /system/resource> cpu/print
Columns: CPU, LOAD, IRQ, DISK
# CPU LOAD IRQ DISK
0 cpu0 1% 0% 1%
1 cpu1 1% 0% 1%
2 cpu2 2% 1% 0%
3 cpu3 1% 0% 0%
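The host-side numbers above rule out obvious CPU or memory pressure. If RouterOS exposes a cgroup v2 mount inside the container (an assumption; it may not), the throttling theory could be checked directly from a shell in the container:

/ # cat /sys/fs/cgroup/cpu.max
/ # grep throttled /sys/fs/cgroup/cpu.stat

A cpu.max other than "max", or growing nr_throttled / throttled_usec counters, would confirm CPU throttling; otherwise the slowdown has to come from somewhere else.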