Trouble: Can't connect to ATL after update

Hello,

I am in big trouble.

My setup is ATL → hAP3 → LAN.

What I did:

  1. I have updated the hAP3:

/system/package/update/download
/system/routerboard/upgrade

Everything went fine and 7.16 works after the reboot.

  2. I have updated the ATL:

/system/package/update/download
/system/routerboard/upgrade

Update was successful.
Then I rebooted.

RESULT: I can no longer connect to the ATL (I always use only SSH and that is the only enabled method):

ssh: connect to host 192.168.188.1 port 22022: Network is unreachable

I cannot even ping the ATL from the hAP3 (which powers it) or from the computer. I get packet-loss=100%. No configuration was touched before or after the update.

At the same time, I see that the LED on the hAP3 port to which the ATL is connected is lit, i.e. there is a connection on L1. From time to time that LED goes dark for 2-3 seconds, but then it comes on again.

I tried connecting the ATL cable directly to the computer (using the PoE injector supplied with the ATL) but again - no connection is possible.

The problem:

My ATL is on a very high pole and it is impossible to reset it via the button. A reset would need unmounting of the whole construction.

I don’t know why the update caused this but now I have no Internet connection because the ATL cannot be used. I am sending this using my phone Internet.

Please kindly advise how to proceed.

There are reports of PoE issues in the 7.16 release topic.

Your ATL is probably configured with DHCP client and not with a static IP address? 192.168.188.1?
Instead of a direct connection between your computer and the ATL, try the following:

As described, you usually power your ATL from the hAP’s PoE-out port. Disable PoE-out on the Ethernet port to which your ATL is connected (https://help.mikrotik.com/docs/display/ROS/PoE-Out#PoEOut-offmode), or use a non-PoE port. Then put the power injector between the hAP and the ATL instead.

You updated your hAP first and the ATL afterwards. If there really is something fishy with PoE in the latest release, your ATL may have faced a power interruption while updating. That would be the worst case, and it could also explain why it is not reachable.
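
For reference, a minimal sketch of how PoE-out could be switched off and checked from the hAP’s terminal over SSH. The port name `ether5` is an assumption (it is the usual PoE-out port on a hAP ac3); substitute the port your ATL is actually plugged into.

```routeros
# Show the current PoE-out settings of all PoE-capable ports
/interface/ethernet/poe/print detail

# Turn PoE-out off on the port feeding the ATL (ether5 is an assumption)
/interface/ethernet/poe/set ether5 poe-out=off

# One-shot status check of that port
/interface/ethernet/poe/monitor ether5 once

# To return to the default behaviour later
/interface/ethernet/poe/set ether5 poe-out=auto-on
```

With PoE-out off, the ATL has to be powered by the injector before the link can come up at all.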

Thank you for your quick reply.
OK, I found the thread you mention:

http://forum.mikrotik.com/t/v7-16-2-stable-is-released/178911/1

Unfortunately, I don’t understand from it what exactly I should do. Does it suggest that 7.16 is buggy? If yes, is downgrade to a good version possible?

Your ATL is probably configured with DHCP client and not with a static IP address? 192.168.188.1?

I have not touched it for a year (since purchased), so I don’t have it all in my head. The IP address of the ATL is 192.168.188.1. I believe the rest is pretty much factory default.

I am not quite sure what is the difference between:

A) PoE power coming from the injector and Ethernet connected directly to the computer (not through hAP3)
B) PoE power coming from the injector and Ethernet connected to hAP3

Anyway, I followed your advice and disabled the PoE out on the hAP3 port and connected the power injector in between hAP3 and ATL.

[admin@MikroTik] > /interface/ethernet/poe/print detail   
 0 name="ether5" poe-out=off poe-priority=10 power-cycle-ping-enabled=no power-cycle-interval=none

Sadly, the result is still the same as in the OP - can’t connect to the ATL. I also still see the LED of the ether5 port going dark for about 3 sec from time to time (not sure if this is normal, haven’t paid attention before the updates).

your ATL may have faced a power interruption while updating.

All my equipment is UPS powered. There have been no surges or power outages during the update either.

So, what can I do?

It is not about your general electrical power supply or your UPS. The PoE controller on your hAP may have unintentionally power-cycled, or something like that.

See changelog of 7.16

*) poe-out - upgraded firmware for SAMD20 PSE (AF/AT) controlled boards (the update will cause brief power interruption to PoE-out interfaces);

The difference between connecting your ATL to the computer and to the hAP is pretty easy to understand: your hAP may act as a DHCP server, whereas your computer most probably does not. So if your ATL is configured as a DHCP client, you are out of luck when connecting it to your computer: the ATL won’t receive any IP address and will not be reachable in any way. There would be a last resort, WinBox in MAC mode, but you told us that you disabled all IP services except SSH. :confused:

But I assume you have rather changed the default static IP address from 192.168.88.1 to 192.168.188.1. So you could be in luck if you configure a static IP address on your computer’s Ethernet adapter, 192.168.188.2/24 for example. See if a connection is possible.

Check the leases on your DHCP server and see if you can spot the ATL, make sure you have “Active Host Names” selected.

After I upgraded to 7.16 I lost all my static addresses, but noticed the devices had been assigned other IP addresses.
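
A lease is the record a DHCP server keeps for every address it has handed out. Assuming the hAP runs the DHCP server for the LAN, the commands below, typed into the hAP’s SSH terminal, would list them. This is only a sketch and the output depends on your configuration. If the hAP instead gets its own address from the ATL, the ATL will not appear among the hAP’s leases, and the DHCP client status is the more telling check.

```routeros
# Leases handed out by the hAP's own DHCP server (host names included)
/ip/dhcp-server/lease/print

# Status of the hAP's DHCP client, if it gets its address from the ATL;
# "bound" means an address was received, "searching..." means no server answered
/ip/dhcp-client/print

# Devices whose MAC address the hAP has resolved recently
/ip/arp/print
```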

The PoE controller on your hAP may have unintentionally power-cycled, or something like that.

Is that Mikrotik’s fault? (e.g. bug)

See changelog of 7.16

Yeah, I have seen that. However, I have no idea how I am supposed to use the information provided. OK - “will cause brief power interruption” - Why? If such interruption is bad - how come this is a “stable” version resulting in such troubles? This is quite confusing.

The difference between connecting your ATL to the computer and to the hAP is pretty easy to understand: your hAP may act as a DHCP server, whereas your computer most probably does not.

Alright. But why should I need the computer to be a DHCP server to connect to the ATL? I have always been able to connect to the HAP using either DHCP or a static address (e.g. 192.168.88.*) and then access the ATL (which is 192.168.188.1) both from the HAP and from the LAN computers.

Not being a network expert, my understanding is that the ATL itself acts as a DHCP server, the HAP receives an IP address from it, then the HAP itself (running a DHCP server too) gives an IP address to the computers on the LAN. In that sense, I have never had any problem connecting to the ATL (network 192.168.188.1) while being assigned either a static IP address on the HAP network (192.168.88.) or DHCP (also 192.168.88.). In that sense, I really don’t understand what may have changed, assuming that updates don’t touch configuration.

But I assume you rather have changed the default static IP address from 192.168.88.1 to 192.168.188.1.

The default IP address of the ATL is 192.168.188.1. Also mentioned in the manual. I have not changed it.

@marsbeetle

Check the leases on your DHCP server and see if you can spot the ATL, make sure you have “Active Host Names” selected.

What are “leases”, which DHCP server do you mean and how do I do what you suggest, please?

So you could be in luck if you configure a static IP address on your computer’s Ethernet adapter, 192.168.188.2/24 for example. See if a connection is possible.

Yes, I am able to create such connection in NetworkManager:

ipv4.addresses: 192.168.188.22/24
ipv4.gateway: 192.168.188.1

and I can connect to that.
The problem is that it results in nothing useful - I still can’t SSH to the ATL, and trying to ping it gives:

$ ping 192.168.188.1
PING 192.168.188.1 (192.168.188.1) 56(84) bytes of data.
From 10.138.15.200 icmp_seq=1 Packet filtered
From 10.138.15.200 icmp_seq=2 Packet filtered
...

What host is 10.x? Can you please unplug any other network devices? Just connect your PC and the ATL.

What host is 10.x?

According to https://ipinfo.io/10.138.15.200 it is a bogon IP address. I have absolutely no idea why it shows.

Can you please unplug any other network devices? Just connect your PC and the ATL.

Previously you said that I should connect PC → HAP --(PoE injector)--> ATL.
Are you suggesting now that I should connect PC --(PoE injector)--> ATL? Or something else?
To avoid further confusion, please clarify. Thanks.

I have absolutely no idea why it shows.

It might be due to the inter-VM network which is 10.0.0.0. Testing outside that network, i.e. directly from the physical Ethernet interface of the PC:

$ ping 192.168.188.1
PING 192.168.188.1 (192.168.188.1) 56(84) bytes of data.
From 192.168.188.22 icmp_seq=1 Destination Host Unreachable
From 192.168.188.22 icmp_seq=2 Destination Host Unreachable
From 192.168.188.22 icmp_seq=3 Destination Host Unreachable
...

Your ATL may need to be “Netinstalled”. Tick “Keep configuration”. This should bring it back to life. https://help.mikrotik.com/docs/display/ROS/Netinstall. I know, it’s hard to swallow, as you need to press the reset button for 10+ seconds to get into Etherboot mode…

Isn’t there really anything else to try?

This is a disaster. If Mikrotik can cause such troubles to a customer through “stable” updates - how can one possibly trust any update going forward?

Or is anyone using remote/difficult-to-reach-physically devices doomed to such issues?

I wonder if there is any safe way to update in such cases.

Are you using winbox? Assuming you have the defaults, you should be able to get in via its MAC, not IP, address in the WinBox app from the LAN side of the router. If you can get in, look at the Logs & do an :export at Terminal and paste those here if you’d like. If Winbox with MAC address does NOT work, then you’d be down to re-flashing it with netinstall tool as suggested above (but that is a more involved process…so do try using the MAC+winbox first)

Are you using winbox?

No, only SSH from Linux.

As for Netinstall - this seems quite complicated. I wonder if I will be able to do it right.

I guess I’m confused. Are you not able to get in after upgrade? Or does it just not work for LTE after upgrade?

To clarify my earlier answer:

As noted, the “winbox” client app can use Ethernet (layer 2), so even if the config is FUBAR, if RouterOS boots you should be able to get in using the MAC address. The defaults should have it enabled, but it is controlled via:
/tool mac-server set allowed-interface-list=all
/tool mac-server mac-winbox set allowed-interface-list=all
/tool mac-server ping set enabled=yes

If those were enabled, and you cannot get into the router…

There is an intermediate step before netinstall. You can reset it to the default configuration by de-powering it, THEN plugging it back in WHILE holding the reset button for ~7 seconds (>5 sec, <10 sec). The reset button needs to be pressed when power is applied.

If you need to save the config and cannot get in… then netinstall is needed. But I cannot imagine your config is very far from default, so the “reset to defaults” may be easier than netinstall.

But if you CAN get into the router and something is not right, just post the config here by using “:export” in the Terminal/ssh.

Also, since I think you have a hAP… this won’t help now… but if you had enabled RoMON and the hAP was on the same network as the ATL, then RoMON would be able to get into the ATL via the hAP. It does require using winbox, where you connect to RoMON on the hAP and, assuming RoMON was enabled on the ATL, winbox then shows the ATL as an option to connect to (via the hAP “proxying” the winbox protocol).
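
For future setups, enabling RoMON ahead of time could look roughly like the sketch below. It has to be run on both devices while they are still reachable, and the secret is a placeholder. The RoMON documentation also describes a CLI `ssh` subcommand, which may suit an SSH-only setup; treat that last command as an assumption to verify on your version.

```routeros
# On BOTH the hAP and the ATL, while you still have access
# (CHANGE-ME is a placeholder secret)
/tool/romon/set enabled=yes secrets=CHANGE-ME

# Later, on the hAP: list the RoMON peers that can be seen
/tool/romon/discover

# Assumption: open an SSH session to a peer by its RoMON ID
# (AA:BB:CC:DD:EE:FF is a placeholder taken from the discover output)
/tool/romon/ssh AA:BB:CC:DD:EE:FF
```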

And in the pantheon of ways to set it up BEFOREHAND for remote access, there is also “back-to-home”… so if you had enabled that directly on the ATL, and LTE was up but the LAN had issues… you could use the VPN via the BTH app/WG to get in too… Same with ZeroTier…

Anyway, RouterOS has lots of options to avoid a netinstall. Some do require setup beforehand :wink:
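
Since the hAP is still reachable over SSH, the layer 2 checks can also be tried from its terminal without WinBox. A sketch, assuming neighbor discovery and the MAC server were left enabled on the ATL (they may not be if everything except SSH was turned off). The MAC address is a placeholder to be replaced with the one shown in the neighbor list.

```routeros
# Devices announcing themselves on directly connected links;
# an entry for the ATL would suggest that RouterOS did boot
/ip/neighbor/print

# MAC-level terminal session to the ATL
# (only works if the ATL's MAC server allows it; placeholder address)
/tool/mac-telnet AA:BB:CC:DD:EE:FF
```

If the ATL does not show up as a neighbor at all, that does not prove it failed to boot, since discovery may simply be disabled on it.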

Are you not able to get in after upgrade? Or does it just not work for LTE after upgrade?

What I can do:

Set up a network connection in NetworkManager manually with a static gateway 192.168.188.1 (the ATL) and client IP address 192.168.188./24. Then I can connect to that connection (i.e. have the link up) but that’s all - there is nothing I can actually use it for: I can’t ping anything (even the gateway), I can’t SSH to the gateway. So, this “possible connection” is very low layer. I am not proficient enough to explain it better, sorry.

As noted, the “winbox” client app can use Ethernet (layer 2), so even if the config is FUBAR, if RouterOS boots you should be able to get in using the MAC address. The defaults should have it enabled, but it is controlled via:
/tool mac-server set allowed-interface-list=all
/tool mac-server mac-winbox set allowed-interface-list=all
/tool mac-server ping set enabled=yes

If those were enabled, and you cannot get into the router…

Considering I have no access to the RouterOS of the ATL, I have no way to check. One thing is sure: after buying and configuring the ATL, I explicitly disabled all possibilities for connection to it and left SSH only. For security reasons.

There is an intermediate step before netinstall. You can reset it to the default configuration by de-powering it, THEN plugging it back in WHILE holding the reset button for ~7 seconds - the reset button needs to be pressed when power is applied.

That’s my initial assumption. If I could get my hands on the ATL, I could try this “factory reset” that you explain, then work my way up. The big question remains though - what about updating? (now, with the obviously buggy version, and in future) I surely don’t want to engage in construction/deconstruction work just because the new software version obviously does not work.

My biggest concern with Netinstall is the security of the process. I don’t quite understand how one should have security by downloading some proprietary piece of software, running it as root, without any firewall protection whatsoever and allowing external device to communicate directly with the Ethernet port of the PC. IOW, to restore the functionality of a device, one should expose even working systems to who knows what. (Yes, I distrust network infrastructure by default).

But if you CAN get into the router and something is not right, just post the config here by using “:export” in the Terminal/ssh.

I can get only to the hAP ac3. If you think we can see something from there - please let me know.

Just to complete the thread…

If you do get to the point of needing a netinstall… you can run it as a container on the hAP. See https://hub.docker.com/r/ammo74/netinstall - this avoids all the setup required on Windows for netinstall. Netinstall on Windows is just error-prone, since the Windows security scheme really does not like low-level networking things, and with one thing wrong, netinstall will not work. If you have a Linux box somewhere, that’s better than Windows for running netinstall too.

Now, given that it’s already up on the mast and the reset button is far away, neither the “reset-to-default” nor the “netinstall” option is going to help with the getting-it-down part. :frowning: So I get your problem here… and RouterOS is complex, so none of this is easy the first time - but they do have a lot of tools to deal with remote devices.

:frowning:

Sorry I was finishing my thread since I like to keep the options together :wink:.

Reading your response. That seems like good news — If you can get into the ATL via ssh and 192.168.188.1 - there is no need for going to mast.

The next question is whether the LTE connection is working, since something there surely went wrong during the upgrade…

And “/interface/lte monitor [find] once” will show that via ssh. If that’s connected… then do an “:export” and cut-and-paste the configuration here. If you also run “/ip address print”, “/ip route print” and “/log print”, that would help too.

You may just consider a downgrade to 7.15.3, which involves using scp to copy the packages downloaded from MikroTik to the router. But let’s see what’s going on with LTE first.
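
Once the 7.15.3 package files have been copied to the router, the downgrade itself is a short sequence. A sketch, assuming the .npk files match the device’s architecture and sit in the root of its file list:

```routeros
# Confirm that the uploaded .npk files are visible
/file/print where name~"npk"

# Show the currently installed packages and their versions
/system/package/print

# Ask RouterOS to reboot into the older packages found among the uploaded files
/system/package/downgrade
```

The last command asks for confirmation and reboots the device, so on a unit that is hard to reach physically it is worth doing only once the LTE and LAN sides have been checked.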