Persistent 60GHz Link Lock-up on 600m PtP - Worked for 4 Years, Now Failing Even with Brand New Hardware

Hello everyone,

I’m facing a recurring issue with a MikroTik 60GHz Point-to-Point link between two sites. Despite exhaustive troubleshooting and hardware replacement, the problem persists.

Context:

  • Distance: 600 meters with a clear Line of Sight (LoS).

  • Reliability History: This specific link worked perfectly for 4 years.

  • Hardware: I recently replaced the remote unit with a brand-new, identical model out of the box, but the issue remains.

  • Cabling: All Cat6 cables and RJ45 connectors have been renewed on both ends.

  • Firmware: Running RouterOS v7.22.1.

The Problem: The link drops randomly at unpredictable times (could be day or night). When it fails, the wlan60-1 interface on the AP side stays in a "not running" state (no 'R' flag), and the "Connected To" MAC address field disappears.

Critical Detail: Software-level resets (Disable/Enable) of the interface on the AP side do not restore the link. The only way to get it back is a physical power cycle (hard reboot) at the remote site. Once power-cycled, it reconnects instantly and works fine for a random period (from a few hours to a couple of days) before locking up again.

Hi,

Try to use the new device on main site and put the old remote back.
Isn't the main device malfunctioning?

Hi, actually we replaced both units with new ones. Both the main site and the remote site are now using brand new devices.

Which RouterOS version were you running before?
Could it be something in 7.22.1?

From title: Worked for 4 Years

Then some clever person thought installing RouterOS 7.22.1 (which certainly didn't exist 4 years ago) was a brilliant idea.

Reset things the way they were, and everything will work as it did before...

NOTICE: Any misunderstanding here also comes down to how poorly the original post was written,
for example it does not say whether 7.22.1 was installed before or after the malfunction started,
so there's no point in complaining when things are specified poorly.


Have you checked the power source? Replaced the power supplies and PoE injectors?

image
To summarize the current situation:

I have updated the new devices to the latest firmware version, 7.22.1, and reviewed all the settings. As you can see in the attached image, it has been running for about 12 hours now. I am monitoring it to see if the connection drops again. If it does, I will let you know.

I even set up my own computer to constantly log everything from the problematic device; however, the worst part was that when the connection dropped before, it wasn't recording any logs at all.
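For anyone who wants to try the same kind of external logging, here is a minimal sketch in Python of a UDP syslog receiver. It stamps every message with the receiving computer's own clock, so the last line before a silence shows roughly when the device stopped talking, even if the device itself logged nothing about the failure. It assumes the device has been set up to send its log to this computer; the port 5514, the file name `remote-unit.log`, and all function names are placeholders for illustration, not details of the actual setup.

```python
import socket
from datetime import datetime, timedelta


def format_record(received_at, sender, payload):
    """Build one log line: local receive time, sender IP, decoded message."""
    text = payload.decode("utf-8", errors="replace").strip()
    return f"{received_at.isoformat(timespec='seconds')} {sender} {text}"


def find_silences(timestamps, threshold):
    """Return (last_seen, next_seen) pairs that are further apart than threshold."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > threshold]


def listen(host="0.0.0.0", port=5514, logfile="remote-unit.log"):
    """Receive syslog datagrams forever and append them to logfile.

    host, port and logfile are assumed values; adjust them to your network.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock, \
            open(logfile, "a", encoding="utf-8") as out:
        sock.bind((host, port))
        while True:
            payload, (sender, _sender_port) = sock.recvfrom(4096)
            out.write(format_record(datetime.now(), sender, payload) + "\n")
            out.flush()  # keep the file current in case the PC is interrupted
```

Calling `listen()` starts the receiver. Afterwards, `find_silences` can be run over the receive times with something like `timedelta(minutes=5)` to list the periods in which the device sent nothing.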

Anyway, we will see. If the issue recurs, I will update you here.

Yes, even with the new setup, everything has been replaced. At both locations, I am using brand-new devices straight out of the box, including new PoE injectors and cables. I have renewed every single component you can think of.

For example, yesterday, when I went to the second location, I checked the antenna and the lights were on. I connected my computer to the switch at that site, but I still couldn't see the disconnected antenna in MikroTik. I had no choice but to unplug it and plug it back in to get it working.

To clarify: The old devices had been running version 6.45.9 for 4 years without any issues. I have now decommissioned the old units and replaced them with brand-new devices out of the box at both locations. These new units also came with 6.45.9 from the factory. It had been working this way, but for exactly one week now, the device at the second location freezes every day at a specific time and doesn't recover until it's unplugged and plugged back in.

I actually just updated to version 7.22.1 yesterday. The older devices were running 6.45.9, and even the brand-new ones I just installed came with 6.45.9. I updated all of them to the latest version yesterday.

So, aside from checking the scheduler on both devices to see if something is there, even if it seems insignificant (to you),
it comes down to whether some external factor is specific enough to cause the same thing to happen at the same time every day...
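One way to check whether the lock-ups really cluster at the same time of day is to note the time of each drop and measure how tightly those times sit on the 24-hour clock. The sketch below is only an illustration in Python; the function names are made up, and the drop times have to come from your own notes or logs.

```python
from datetime import datetime

MINUTES_PER_DAY = 24 * 60


def minutes_since_midnight(ts):
    """Reduce a timestamp to its time of day, in minutes."""
    return ts.hour * 60 + ts.minute


def time_of_day_spread(drops):
    """Return the smallest window (in minutes) on the 24-hour clock
    that contains every drop, treating midnight as wrapping around."""
    if len(drops) < 2:
        return 0
    marks = sorted(minutes_since_midnight(t) for t in drops)
    # Gaps between neighbouring times of day, plus the gap across midnight.
    gaps = [b - a for a, b in zip(marks, marks[1:])]
    gaps.append(marks[0] + MINUTES_PER_DAY - marks[-1])
    # Everything outside the largest empty gap is the occupied window.
    return MINUTES_PER_DAY - max(gaps)
```

A spread of a few minutes over a week of drops points to something scheduled or external; a spread of many hours means the failures are effectively random.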

Obviously, it's particularly difficult to understand what's happening via a forum.

Are you there to monitor when it happens?

Let's see what happens.

In a normal PtP "wireless wire" link there is nothing in 7.22.1 that works "better" than the old version 6, AFAIK. If the issue continues, I would downgrade both devices to the latest v6, 6.49.19, and continue troubleshooting on that version.


Thanks for the suggestion, jaclaz. Actually, the issues started while I was running v6.49.19. I spent a week troubleshooting the link, including replacing all the cabling and checking the physical alignment, but nothing seemed to help. I upgraded to the latest v7 version as a last resort to see if the newer firmware and drivers would resolve the instability.

That only works if the device wasn't shipped with v7 as its factory firmware.

...

...

If the problem started before you replaced all the components and it's still there, you must have left something behind, or it wasn't related to the replaced components in the first place.

Actually, I had already replaced all the components and the disconnects were still happening. However, I updated both devices to the latest firmware last night around 11 PM. It's been about 15 hours now and no drops so far. Fingers crossed! :slight_smile: I'm monitoring them right now: Signal Quality is steady at 80 and MCS is at 8. Looking good for now.

image

In addition, I’ve done the following:

I also set up a Watchdog on both ends. If one of the devices fails to ping the other side (meaning the link is frozen), the system will detect this and automatically reboot. This allows the connection to recover on its own without requiring any physical intervention.
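The idea behind this kind of ping watchdog can be sketched in a few lines of Python. This is not how RouterOS implements its Watchdog; it is only an illustration of the logic described above, and the class name and the failure threshold are assumptions chosen for the example.

```python
class PingWatchdog:
    """Counts consecutive failed pings and signals when a reboot is due."""

    def __init__(self, max_failures=5):
        # max_failures is an assumed threshold, not a RouterOS default.
        if max_failures < 1:
            raise ValueError("max_failures must be at least 1")
        self.max_failures = max_failures
        self.failures = 0

    def record(self, ping_ok):
        """Feed one ping result; return True if a reboot should be triggered now."""
        if ping_ok:
            # Any successful reply means the link is alive, so start over.
            self.failures = 0
            return False
        self.failures += 1
        if self.failures >= self.max_failures:
            self.failures = 0  # count afresh after the reboot
            return True
        return False
```

Requiring several failures in a row, rather than rebooting on the first lost ping, keeps a single dropped packet from restarting a link that is otherwise healthy.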

I assume the uptime will reset to zero when it reboots, but so far, I don't think that has happened yet.

You could try temporarily setting the log to storage (disk), so that messages will survive a reboot.

So, some more details left out at the beginning, but it all comes full circle to my first post...

Why update something that's been working for 4 years?
And why, when something stopped working, was the first logical step NOT taken: putting everything back the way it was?