We recently rolled out a network consisting of the following:
Devices:
CCR1009: Router and internet gateway.
CRS328-4C-20S-4S+: Fiber distribution switch.
CRS328-24P-4S+: Edge switch in RouterOS mode, tested with ROS 6.42.9 as well as 6.43.2.
CRS328-24P-4S+: Edge switch in SwOS 2.8 mode. 4 VoIP phones @ 1.6W, 1 Cambium E410 AP @ 5W.
10x CRS112-8P-4S: Edge switches with ROS 6.42.9 as well as 6.43.2, 48V PSU, each with 2 POE devices: 1 VoIP phone @ 1.6W, 1 Cambium E410 AP @ 5W.
Software versions:
MikroTik tested with SwOS 2.8, ROS 6.42.9 as well as 6.43.2
Cambium APs tested with versions 3.5 R4, 3.7.1-r4 and 3.8-r3
VoIP phones are of various makes and models.
Setup:
All ports bridged.
Tested with Allow fast path enabled and Fast Forward enabled
Tested with Allow fast path enabled and Fast Forward disabled
Tested with Allow fast path disabled and Fast Forward disabled
Tested with RSTP and with no STP configured on bridge.
Tested with and without loop protect on affected ports.
There are no loops on the network.
Problem:
We randomly lose connectivity to POE powered devices on the edge switches.
We have removed all VLANs from the network in order to troubleshoot.
APs and phones have a management IP on VLAN1 untagged.
Port status reports the link as running while the POE device is no longer responding.
Devices are powered-on according to POE status, and power usage stays the same (~5W AP and ~1.6W phone) regardless of whether we have contact with the affected device or not.
In some cases the MAC address of the affected device is still visible: "/interface bridge host print" -> "11 D 58:C1:7A:0B:XX:XX ether1 bridge 51s"
In other cases, the MAC is missing from the hosts on the affected interface.
The MAC address of the affected device is still visible under /ip arp, but naturally disappears after a while.
Device cannot be pinged and needs to be power cycled to restore operation.
Of all the outages occurring, the higher powered devices seem to be affected more: 80% of outages occur on the 5W APs and the rest on lower power devices.
Outages occur on the CRS328-24P-4S+ as well as the CRS112-8P-4S.
[admin@unit4] > /interface ethernet poe monitor 0
;;; AP
name: ether1
poe-out: auto-on
poe-voltage: auto
poe-out-status: powered-on
poe-out-voltage: 47.8V
poe-out-current: 94mA
poe-out-power: 4.4W
power-cycle-host-alive: no
power-cycle-after: 2m27s
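To correlate outages with the POE readings over time, the monitor output above could be collected and parsed off-box. Below is a minimal Python sketch, not something we currently run. The function names `parse_poe_monitor` and `looks_hung` are made up for illustration, the field names are taken from the output above, and treating `power-cycle-host-alive: no` as "the ping watchdog cannot reach the host" is an assumption on my side.

```python
def parse_poe_monitor(text):
    """Parse 'key: value' lines of the pasted monitor output into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, the CLI prompt line and ';;;' comment lines.
        if not line or line.startswith(";;;") or line.startswith("["):
            continue
        # Split on the first colon only, so values keep their full text.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def looks_hung(fields):
    """True when the port reports power delivered but the host is flagged not alive.

    Assumption: 'power-cycle-host-alive' reflects the ping watchdog result.
    """
    return (
        fields.get("poe-out-status") == "powered-on"
        and fields.get("power-cycle-host-alive") == "no"
    )


# The sample is the output pasted above, unchanged.
sample = """\
[admin@unit4] > /interface ethernet poe monitor 0
;;; AP
name: ether1
poe-out: auto-on
poe-voltage: auto
poe-out-status: powered-on
poe-out-voltage: 47.8V
poe-out-current: 94mA
poe-out-power: 4.4W
power-cycle-host-alive: no
power-cycle-after: 2m27s
"""

fields = parse_poe_monitor(sample)
print(fields["name"], fields["poe-out-power"], looks_hung(fields))
```

Logging these fields with a timestamp for every POE port would show whether voltage, current or power changes just before a device stops responding, or whether the readings stay flat as they appear to.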
We have deployed automatic power cycling with a ping timeout of 10 minutes to try and achieve SOME uptime.
There seem to be fundamental issues with the hardware / software, as the same model phones and APs are functioning well on other deployments with HP, Cisco and Netgear switches.
Devices not requiring POE seem to stay up.
Does anybody have any ideas on how to troubleshoot this further, or do I need to change switches?