Yes, I’m a bit lazy and use SwOS on that nice central home switch. The earlier firmware, 2.13, worked great for about a year without any issue, but since I upgraded to the latest it hangs somehow after 10-20 h.
Each time, I notice that the Wi-Fi APs connected to this switch are unavailable, yet when I log in to the switch from my desktop, which is directly connected to it, everything looks good. But when I log in to my router (RB4011, with a quite basic config), I notice that all DHCP leases are gone; only hosts with static IPs are working.
Once I reboot the CRS312, everything starts working again and all clients get new leases. Tomorrow I need to unplug the second power line and put a timer switch on the remaining power cable to mitigate the issue. But this sounds …
At the same time I upgraded RouterOS from 7.12.x to 7.14 on the RB4011, so it may have some bug in combination with this switch. (Yes, my mistake: a) upgrading something that works, b) upgrading both on the same night.)
The CRS312 is connected to the RB4011 with an AOC fiber DAC (10G).
It seems that I now need to reboot the CRS312 every 24 h so that clients keep getting IP addresses via DHCP. I just rebooted the switch and all clients are now back in the DHCP server lease table.
The question is not silly … Everything worked nicely on the previous version, 2.13, and broke after updating to the latest, 2.16.
I also tried re-uploading the latest version a second time. I’m trying to figure out whether an older version is available anywhere (not found on the MikroTik download or archive site).
Yes, I have now done some debugging. Hosts that have a static IP work OK.
Lease time is the “MikroTik standard 10 min” … the issue always occurs shortly after 24 h of uptime. But this might be good to test.
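To put numbers on the 10-minute lease: a rough sketch of the default DHCP client timers, assuming the clients follow the RFC 2131 defaults (renew at 50% of the lease, rebind at 87.5%) and the server does not override them. The function name `lease_timers` is just for illustration. If that assumption holds, leases should disappear from the router within about 10 minutes of the switch starting to drop DHCP, which would fit a fault that begins around the 24 h mark rather than earlier.

```python
from datetime import timedelta


def lease_timers(lease: timedelta) -> dict:
    """Default renewal (T1), rebinding (T2) and expiry times per RFC 2131."""
    return {
        "T1_renew": lease * 0.5,     # client starts asking its server to renew
        "T2_rebind": lease * 0.875,  # client falls back to broadcasting
        "expiry": lease,             # lease is gone if nothing was answered
    }


# Assumption: the 10 min lease time mentioned above.
timers = lease_timers(timedelta(minutes=10))
for name, value in timers.items():
    print(f"{name}: {value}")
```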
I also have a CSS326, and on that box the latest firmware works OK, but of course it has a different switch chip. Also, I think most CRS312 users may run RouterOS instead of SwOS.
Some config:
DHCP snooping is off
no ACL
Mikrotik discovery protocol is on
snmp enabled
5 vlans with port isolation and learning enabled
some ports have only tagged VLANs, some only untagged
no LAG ports
all other settings are basically “default” state
if there is a switch behind a CRS312 port, hosts connected to that switch have the same issue. Most of the connected switches are MikroTik, one is Zyxel
my router (RB4011) was connected to the CRS312, first with the AOC fiber DAC and then, as a test, with RJ45 Cat6 .. same issue. Now the router is connected to another switch instead of this CRS312.
The switch now has 6 days of uptime. I set my desktop PC from static to DHCP and it lost network access because it did not get an IP from the DHCP server. After changing back to static, it works as normal.
I added DHCP logging on the RB4011 and I don’t see any DHCP requests from my desktop PC.
It looks like 24 h after a CRS312 reboot, all DHCP traffic is dropped/filtered out until the next reboot. With static IPs everything works OK.
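For anyone who wants to check this without switching a host from static to DHCP, here is a minimal sketch of a probe that broadcasts one DHCPDISCOVER and waits for any reply. It is only an illustration under some assumptions: the names `build_discover` and `probe_dhcp` are made up, the MAC in the usage comment is a placeholder, binding to UDP port 68 usually needs root and fails if a local DHCP client already holds the port, and the broadcast leaves via whatever interface the OS picks.

```python
import socket
import struct
from typing import Optional

DHCP_MAGIC_COOKIE = bytes([99, 130, 83, 99])


def build_discover(mac: str, xid: int) -> bytes:
    """Build a minimal DHCPDISCOVER payload (BOOTP header + options)."""
    chaddr = bytes.fromhex(mac.replace(":", ""))
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,            # op: BOOTREQUEST
        1,            # htype: Ethernet
        6,            # hlen: MAC address length
        0,            # hops
        xid,          # transaction ID
        0,            # secs
        0x8000,       # flags: ask the server to reply by broadcast
        bytes(4), bytes(4), bytes(4), bytes(4),  # ciaddr, yiaddr, siaddr, giaddr
        chaddr,       # client hardware address (struct pads it to 16 bytes)
        bytes(64),    # sname
        bytes(128),   # file
    )
    options = bytes([53, 1, 1]) + bytes([255])  # message type DISCOVER, then END
    return header + DHCP_MAGIC_COOKIE + options


def probe_dhcp(mac: str, xid: int = 0x12345678, timeout: float = 5.0) -> Optional[str]:
    """Broadcast one DISCOVER; return the replying address, or None on timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.bind(("", 68))  # DHCP client port
        sock.settimeout(timeout)
        sock.sendto(build_discover(mac, xid), ("255.255.255.255", 67))
        try:
            while True:
                data, (reply_ip, _port) = sock.recvfrom(2048)
                # op == 2 is BOOTREPLY; bytes 4..8 carry the transaction ID.
                if len(data) >= 8 and data[0] == 2 and data[4:8] == struct.pack("!I", xid):
                    return reply_ip
        except socket.timeout:
            return None


# Usage (placeholder MAC, run as root on a host behind the switch):
#   print(probe_dhcp("aa:bb:cc:dd:ee:ff"))
```

A `None` result from a host behind the CRS312, together with no matching entry in the RB4011 DHCP log, would suggest the DISCOVER never reached the router.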
On the System tab there is DHCP Snooping. The port that provides DHCP service to this segment (in my case the RB4011 is connected to a CSS326, which is connected by DAC to the CRS312) needs to be selected, AND the next parameter under the list of snooping ports, “Add Information Option”, must be unchecked.
I have now tested this with 2 hosts connected to the CRS312 and the current situation looks promising.
Let’s test it with this configuration for a few days.
Nearly all settings were identical to the beginning of the case; the only change was the “Add additional..” checkbox.
This morning almost nothing works. Only hosts connected directly to the CRS312 work OK (they got an IP address from DHCP, etc.) … BUT clients that sit behind any intermediate box between the CRS312 and the host, such as a Wi-Fi box, have lost their IPs since yesterday.
→ checked that a directly connected host gets an IP after its network cable was briefly disconnected
→ moved a Wi-Fi AP behind the CSS326; it does not get an IP from the RB4011, which is connected directly to the CRS312
→ after the above finding, I moved the RB4011 to the CSS326, and then all Wi-Fi clients got IPs
→ after the router moved to the CSS326, the CRS312 clients no longer get IPs
I thought Mikrotik would fix this quickly, because it’s a show-stopping bug. But it looks like we’re going to have to shout about it to get it patched.
Has everyone experiencing this problem created a support ticket?
Otherwise chances are high they don’t even know about it … some MikroTik staff use this forum, but not all. And those that do, do not read all posts.
Thank you for sharing your troubleshooting steps so far. However, I couldn’t replicate the same behavior in our labs.
Could you provide your network diagram and specify the ports/VLANs where the DHCP client and server are located (mark the clients that work and that do not)? Additionally, share screenshots from the SwOS VLAN, VLANs, and System pages. If you have LAG configured, include a screenshot of the SwOS LAG page as well.
If your setup includes multiple SwOS devices, try disabling the System “Add Information Option” setting on all switches and let us know the results.
If your network includes other devices that might add Option 82, ensure the SwOS ports connected to those devices are marked as “Trusted,” as described on the SwOS System page.
Ports that receive DHCP client packets with already added Option-82 must also be trusted, otherwise these packets are dropped.
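To see whether a captured DHCP packet already carries Option 82 (Relay Agent Information, RFC 3046), a short sketch like the following can help. It assumes you have extracted the raw UDP payload of the DHCP message from a capture yourself; the helper names `parse_dhcp_options` and `has_option_82` are illustrative and have nothing to do with SwOS itself.

```python
from typing import Dict

BOOTP_HEADER_LEN = 236                      # fixed BOOTP header before options
DHCP_MAGIC_COOKIE = bytes([99, 130, 83, 99])


def parse_dhcp_options(payload: bytes) -> Dict[int, bytes]:
    """Return {option code: value} from a raw DHCP (UDP) payload."""
    start = BOOTP_HEADER_LEN + len(DHCP_MAGIC_COOKIE)
    if len(payload) < start or payload[BOOTP_HEADER_LEN:start] != DHCP_MAGIC_COOKIE:
        raise ValueError("not a DHCP payload")
    options: Dict[int, bytes] = {}
    i = start
    while i < len(payload):
        code = payload[i]
        if code == 255:          # END option
            break
        if code == 0:            # PAD option has no length byte
            i += 1
            continue
        if i + 1 >= len(payload):
            break                # truncated option, stop parsing
        length = payload[i + 1]
        options[code] = payload[i + 2:i + 2 + length]
        i += 2 + length
    return options


def has_option_82(payload: bytes) -> bool:
    """True if the packet carries Relay Agent Information (Option 82)."""
    return 82 in parse_dhcp_options(payload)


# Synthetic example: a DISCOVER with and without an Option 82 blob appended.
base = bytes(BOOTP_HEADER_LEN) + DHCP_MAGIC_COOKIE + bytes([53, 1, 1])
plain = base + bytes([255])
tagged = base + bytes([82, 4, 1, 2, 0, 5]) + bytes([255])
print(has_option_82(plain), has_option_82(tagged))
```

If client packets arriving on a port already return `True` here, that port would be one of those that has to be marked as trusted, as described above.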
For whatever it’s worth, my RB4011 is directly connected to two different CSS326 switches and then there is one more CSS326 in the house. All the switches have been running 2.16 since within a day or two of 2.16 coming out. No problems at all.
I’ve attached screenshots of the GUI’s Link and System tabs on 2.13. The Link tab shows all the currently connected devices and the negotiated speeds. The DHCP port will be COMBO4, which is connected via 10GbE copper to a Netgear XS724EM switch (latest firmware, 1.0.2.8). The XS724EM switch is connected via SFP to an Asus AX89X router.
Not sure if this matters, but there is a Wi-Fi node connected to the CRS312 (labeled AX88U). I first became aware of the problem in 2.16 because all my wireless clients were dropping off that node. That’s when I saw that every device connected to the CRS312 was no longer reachable.
No VLANs are in use on my network (all “disabled” in the CRS312 GUI), and no LAG is configured on the CRS312.
This switch is “mission critical” in the sense that it links key devices in our home network, which is why I rolled back to 2.13 at the first sign of trouble. I’m happy to upgrade to 2.16 and then capture and submit logs, but I don’t know how to do it in the SwOS GUI.
TantalizingEmu, I was still unable to replicate the same behavior. Try updating to version 2.16, take screenshots of all the SwOS pages, and report this to our support system or through e-mail: support@mikrotik.com.
If possible, capture DHCP packets on both client and server devices and share the pcap files as well.
While you are at v2.16, check if any of the ports on the SwOS RSTP page show State “discarding”, while they should be “forwarding” (ports that are up and running and not creating a L2 loop). Also, try disabling “Add Information Option” on the System page, and see if that makes any difference.
Okay, I upgraded to 2.16 via the GUI this morning. I have a busy day and don’t have time to dig into the steps required for DHCP packet capture. I am seeing “discarding” on a number of ports on the RSTP page, but each of these links is currently inactive/unused.
“Add information option” is unchecked on the ‘System’ page.
I’ve grabbed screenshots of each GUI page and will be sending that to the support e-mail.
Thanks for the speedy response and the willingness to help. I would love to know this is fixed for the upcoming 2.17 release.
Well, we’re now at two full days of uptime, which is more than I had previously on 2.16. It looks like un-checking the “Add information option” was important to the stability I’m now seeing.
I don’t want to get my hopes up, as sakke42 briefly saw an improvement before the issue reappeared. But I did want to offer a 48-hour status update.