What can it be? A ghost issue?

For the last 12 days, weird things have been happening in the network.
From 10am to 8pm, a random 100-200 customers keep disconnecting and reconnecting.

But the same customers become stable again right after 7-8pm.

Here is a screenshot from a customer I picked at random.
He was disconnected 81 times between 10am and 7pm.
From 7pm until 8am the next day, he did not disconnect even once.

What could be the issue?
Is it some kind of attack, an L2 loop, or flooding from a customer who is normally online during working hours, so that everything gets back to normal once his CPE goes offline?

Is there an easier way to debug this situation?

How do I find the culprit?
[Two screenshots were attached here.]

What is 192.168.0.100? The problem is between that hop and the next router. It was specially designed that way, so you can start finding problems. Also use traceroute.

It could be a client with a problematic Ethernet port. I have seen one defective Ethernet port flood a network until the switch locked up. You said this affects a restricted group of clients. They should be the ones served by the same tower. Could You look at the router on this particular tower?

There You should find the source of the problem, which I believe is excessive traffic or something that puts a huge load on the router/AP.

Mac, that is his own wireless router's IP, I guess.
Home routers normally use 192.168.0.xx addresses.

Oh man, I think exactly the same. So any customer with a bad, flapping Ethernet port can flood the whole network?
Is there an easy way to find the culprit? Can Wireshark or something similar help?
Actually, it affects random users, but yes, only at specific hours. It happens during business hours and stops after business hours. That suggests some business customer has a bad port, I guess.
But the million dollar question is how to find out who.

Is there a way to block a specific MAC address from reaching the MikroTik?

I suspect 2-3 MAC addresses, which I think are creating the loop or the broadcast storm.
I need to block them.
How do I do that?

Wireshark would allow You to see the traffic. The hard part is knowing what to look for. I have zero experience with Mikrotik/Routerboard. I just bought one and am still waiting to get it. But I have experience with networks.

I would try something like this:

  1. Try to isolate the router/AP hit by the problem. An AP would be better, since it will probably serve a smaller number of clients. Look at the CPU load, network traffic and/or rate of discarded packets. With 3k ms ping, someone, somewhere, is seriously overloaded.
  2. Now You can start looking into the clients connected to the problematic router/AP. You could just run a sniffer and parse through that mess. With luck the problem will stand out.
  3. But things are never easy. If step 2) didn’t solve your problem because of the sheer amount of traffic to parse, try to make it smaller. There are two clearly separated times: when it works, and when it doesn’t. You could create a list of the clients connected to this very router/AP. Do it when it works, and again when it doesn’t. Something like a script that writes to a txt file.

Pseudocode, in a UNIX world:
To get connected users when it works:
echo "$MAC_addresses" >> raw_list_good
(MAC addresses? IPs? Client IDs? Whatever floats your boat and is easiest to search for. Just remember: we will pass it through a script later, so keep one entry per line and make sure it does NOT contain dates, times, process numbers…)

To get connected users when it doesn’t work:
echo "$MAC_addresses" >> raw_list_bad

Now, keep only one match for each client, and put them in order:
sort -u raw_list_good > final_list_good
sort -u raw_list_bad > final_list_bad

And find out who is on the bad list but not on the good one:
comm -2 -3 final_list_bad final_list_good

The short output (we hope it is short) should point to a handful of MACs to look into.
Now, this is all assuming we ARE talking about a problematic client.
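
The list comparison above can be wrapped into one small function. This is only a sketch: the function name `suspects` and the two file arguments are made up for illustration, and it assumes each raw list holds one MAC (or IP, or client ID) per line. It also lowercases everything first, so the same MAC logged in different cases is not counted twice.

```shell
#!/bin/sh
# suspects GOOD_FILE BAD_FILE   (hypothetical helper, not a standard tool)
# Prints the entries that were seen only while the network was misbehaving.
suspects() {
    good_sorted=$(mktemp) || return 1
    bad_sorted=$(mktemp) || { rm -f "$good_sorted"; return 1; }
    # Lowercase, drop blank lines, then sort and de-duplicate each list.
    tr 'A-Z' 'a-z' < "$1" | grep -v '^[[:space:]]*$' | LC_ALL=C sort -u > "$good_sorted"
    tr 'A-Z' 'a-z' < "$2" | grep -v '^[[:space:]]*$' | LC_ALL=C sort -u > "$bad_sorted"
    # -2 hides lines found only in the good list, -3 hides lines found in both,
    # so what remains is "bad but never good".
    LC_ALL=C comm -2 -3 "$bad_sorted" "$good_sorted"
    rm -f "$good_sorted" "$bad_sorted"
}
```

Calling `suspects raw_list_good raw_list_bad` prints the candidates, one per line. A client that is simply online only during business hours will also show up here, so treat the output as a shortlist, not a verdict.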

Now You can run a sniffer, but using these IPs/MACs as a filter. That should keep the volume low.
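
One way to build that filter, sketched for tcpdump (Wireshark capture filters use the same pcap syntax). The function name `mac_filter`, the file `suspects.txt` and the interface `eth0` are assumptions for the example. It expects one MAC per line.

```shell
#!/bin/sh
# mac_filter FILE   (hypothetical helper)
# Turns a list of MAC addresses, one per line, into a pcap filter expression
# of the form "ether host A or ether host B ...".
mac_filter() {
    awk 'NF { printf "%sether host %s", sep, $1; sep = " or " }
         END { if (sep != "") print "" }' "$1"
}

# Possible use (run as root, interface name is an assumption):
#   tcpdump -n -e -i eth0 "$(mac_filter suspects.txt)"
```

With only a handful of suspects the expression stays short. With hundreds of entries it is probably easier to capture everything for a minute and filter afterwards.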

Can I buy some of your time to help me debug this? It's beyond my knowledge and I just need to find the culprit.

I can try to help You. As I said, my knowledge of Mikrotik and Routerboard is zero. But there’s still the network part.

Did You find the problematic router/AP?

How are you distributing to your customers? Do you have one central router, then a managed switch or wireless APs? If you have a managed switch, look for a port with errors.

Please message me your email or SKYPE or whatsapp ID.

Yes, one NAS router and then switch and then customers.

Is the switch managed or unmanaged? If it is managed, log in and check the interfaces to try to locate which customer is causing the issue. Also, if it is managed, you may be able to use spanning tree or advanced detection to prevent loops and port errors from bringing down your network.
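
If the managed switches answer SNMP, the interface check can be done from a shell instead of clicking through every switch. A rough sketch, assuming SNMP v2c is enabled and net-snmp's `snmpwalk` is installed. The function name `error_ports`, the community string `public` and the address `192.0.2.10` are placeholders.

```shell
#!/bin/sh
# error_ports   (hypothetical helper)
# Reads snmpwalk output for the ifInErrors column on stdin, e.g. lines like
#   IF-MIB::ifInErrors.3 = Counter32: 1523
# and prints "ifIndex errors" for every port with a non-zero counter,
# worst port first.
error_ports() {
    awk '/Counter32:/ {
             n = split($1, part, ".")      # the ifIndex is the last OID component
             if ($NF + 0 > 0) print part[n], $NF
         }' | LC_ALL=C sort -k2,2nr
}

# Possible use (community string and address are placeholders):
#   snmpwalk -v2c -c public 192.0.2.10 IF-MIB::ifInErrors | error_ports
```

These counters are cumulative, so it helps to take one reading in the morning and one in the afternoon and compare. The ifIndex can be mapped back to a port name with IF-MIB::ifDescr.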

Already enabled Broadcast storm and Loop detection in each of the managed switches.

Did You find the switch port with errors?

Has loop detection and broadcast storm protection made any difference? Did you locate a port with an issue? You can also look at which customers are offline during the night and online during the day. See which ports they connect to and isolate them, then add them back one by one until the issue reappears.

A. Are the devices on the client side configured with bridges?
B. Do these bridges have unique MAC addresses assigned?
C. Do these bridges have RSTP configured?
D. Do they have priorities configured? Which one is the root?
E. Are there more clients connected during business hours than from 7pm till 8am?
F. How often do the bridges reconfigure your LAN?
F. Are you sure there is no switch that works only during business hours and is switched off after 7pm? Such a switch could be badly wired and create a loop that brings your LAN to its knees.
G. Maybe someone has a PC connected to the LAN with two interfaces, which creates a loop?

Point G actually seems to be the problem. My team visited 3 clients today, and at all 3 the cable was plugged into the router's LAN port instead of the WAN port.
Now I'm afraid many of the 1800 clients might have done the same. How would I find out?

Well, You can be (almost) sure they will have DHCP servers running. Do a DHCP broadcast and see who responds. Remember to firewall the MAC of the machine used to do the test - otherwise your own DHCP server will answer.
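
A rough way to see who responds, as a sketch only. Instead of firewalling the test machine, this variant lets your own DHCP server answer and simply ignores its MAC. The function name `rogue_dhcp`, the interface `eth0` and the MAC in the usage line are placeholders, and it assumes tcpdump's usual `-e` output, where the second field of each line is the source MAC.

```shell
#!/bin/sh
# rogue_dhcp KNOWN_MAC   (hypothetical helper)
# Reads `tcpdump -e` lines on stdin and prints, once each, every source MAC
# that sent a DHCP reply, except KNOWN_MAC (your own DHCP server).
rogue_dhcp() {
    awk -v known="$(printf '%s' "$1" | tr 'A-Z' 'a-z')" '
        /Reply/ {
            mac = tolower($2)              # field 2 is the Ethernet source address
            if (mac != known && !(mac in seen)) { seen[mac] = 1; print mac }
        }'
}

# Possible use (run as root; interface and MAC are placeholders):
#   tcpdump -l -n -e -i eth0 'udp src port 67' | rogue_dhcp 00:11:22:33:44:55
# Then trigger a DHCP discover from a test machine on the same L2 segment,
# for example by renewing its lease.
```

Every MAC it prints is a device answering DHCP on your network that is not your server, which is most likely a customer router plugged in through a LAN port.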

Is the router at the client's site yours? If so, insert firewall rules to block unwanted traffic on the LAN ports and WAN ports. This way a wrongly connected router will not route, and the client will contact support for help.

  1. Yes, 99.99% of wireless routers have DHCP enabled, so anyone who plugs the cable into the wrong port will be a culprit.
    How do I do a DHCP broadcast and see who responds?
  2. The cheap routers don't have firewall rule features.