VPLS stops working 1-way

Well, color me stupid. I forgot (again) to get a supout before “fixing” it.

We have 13 VPLS tunnels, some of which are misbehaving in that they will stop transmitting in one direction. Bouncing the offending interface will resolve the issue. No reboots, reconfigurations, or other interventions are required.

The hardware involved includes TILE, PPC, and MIPSBE. Software includes a range from 5.26 to 6.19. Yeah, we need to upgrade.

Some of these sites have been running great for 2+ years. The problem with VPLS is recent, and we’ve been unable to figure out what’s causing it. There are no obvious issues with OSPF or interface MTUs; again, simply bouncing the VPLS interface restores communication.

Any ideas?

What’s the smallest MTU in the path?

One or two of those tunnels might be passing through an RB435G, so 1520. Most of the PTP links are UBNT Rockets with the MTU set to 1600. It’s not an MTU issue; full 1500-byte packets pass without issue.

The issue here is that the traffic just up and stops flowing in one direction (Tx from a CCR-1036-12T-4S). The ROS is at version 6.5, as this unit was installed in fall 2013. Yes, it needs upgrading, but that doesn’t explain why MPLS/VPLS ran flawlessly for 17 months and has only been acting up in the last few weeks.

Checking the logs on the three sites that were affected, I didn’t see any OSPF/routing issues or any other issues logged.

I’ve never managed to get any useful logging with regard to MPLS/VPLS, which makes troubleshooting VERY difficult, to say the least.

You might want to do a few packet captures on your network if you have an MTU as small as 1520 in an MPLS network handing off 1500-byte frames in VPLS. You are almost certainly fragmenting the frames to be able to pass 1500. While this can work for a while, it will eventually cause issues due to load or some kind of MTU mismatch.

Take a look at this thread for info on MTU sizing:

http://forum.mikrotik.com/t/transparently-bridge-networks-using-eoip-or-mpls-vpls/84503/1
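
To put rough numbers on the MTU question, here is a minimal Python sketch of the per-frame overhead VPLS adds. It assumes a 14-byte Ethernet header on the customer frame, an optional 4-byte VLAN tag, a 4-byte control word, and two 4-byte MPLS labels (transport plus VPLS). The function and constant names are made up for illustration, and your label stack may differ (for example, with penultimate-hop popping).

```python
# Assumed overhead values, in bytes. These are illustration constants,
# not values read from any router.
ETH_HEADER = 14     # customer Ethernet header carried inside the tunnel
VLAN_TAG = 4        # one 802.1Q tag on the customer frame
CONTROL_WORD = 4    # pseudowire control word, if enabled
MPLS_LABEL = 4      # one MPLS label


def vpls_frame_size(payload=1500, vlan_tags=0, labels=2, control_word=True):
    """Bytes a hop must carry above its own outer Ethernet header."""
    size = payload + ETH_HEADER + VLAN_TAG * vlan_tags + MPLS_LABEL * labels
    if control_word:
        size += CONTROL_WORD
    return size


def fits(hop_mtu, **frame_options):
    """True if the encapsulated frame fits the hop without fragmentation."""
    return vpls_frame_size(**frame_options) <= hop_mtu


if __name__ == "__main__":
    for mtu in (1520, 1600):
        print(mtu, fits(mtu), fits(mtu, vlan_tags=1))
```

Under those assumptions a full 1500-byte payload needs 1526 bytes, or 1530 with one VLAN tag, so a hop limited to 1520 would not carry it unfragmented, while a 1600 link would.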

Here’s a quick diagram of the part of the network that is affected. There is a VPLS tunnel between the Gateway and each site carrying VLAN traffic. The smallest link, the RB433 between YORK and PTLO, is only used if other links break.

Packet sniffing is a good idea; however, where this is breaking, there are no packets to sniff.

The problem isn’t that packets aren’t arriving; it’s that they are not being sent.

For example, the VPLS tunnel between the Gateway and TDLE was in a running state. On the gateway, traffic was coming in from TDLE, but it was not sending anything out to TDLE. Bouncing the VPLS interface instantly restored bi-directional traffic.
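
One way to catch this state early would be to poll the byte counters on each VPLS interface and flag a tunnel whose receive counter keeps climbing while its transmit counter stays flat. Here is a minimal Python sketch, assuming the counters are already being collected somehow (SNMP, API, or similar). The dict keys and function name are hypothetical, not RouterOS names.

```python
def one_way_stalled(prev, curr, min_rx_delta=1):
    """Return True if traffic arrived between two samples but none was sent.

    prev and curr are counter samples for one tunnel, shaped like
    {"rx": <bytes received>, "tx": <bytes sent>} (hypothetical layout).
    """
    rx_delta = curr["rx"] - prev["rx"]
    tx_delta = curr["tx"] - prev["tx"]
    # Receiving at least min_rx_delta bytes while sending nothing at all
    # matches the symptom described above.
    return rx_delta >= min_rx_delta and tx_delta == 0


if __name__ == "__main__":
    earlier = {"rx": 10_000, "tx": 8_000}
    later = {"rx": 25_000, "tx": 8_000}
    print(one_way_stalled(earlier, later))
```

A quiet tunnel with no traffic in either direction is deliberately not flagged, since that is indistinguishable from an idle site.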

Thanks!

Hi,

Have you solved this one?

I am having similar problems, but this is the first time I am setting up VPLS, so I could be doing something wrong. I practically followed the wiki example.
Network: R1 <-- Nanobridge WDS link--> R2 <-- Airfiber link--> R3 (OSPF routable network)

I am able to establish VPLS between R1-R2 and R2-R3, but not between R1-R3.
On the R1 side I get running status, but not on R3.

Traceroute from R1 to R3:

 # ADDRESS        LOSS SENT LAST   AVG  BEST WORST STD-DEV STATUS
 1 172.31.4.137   0%   11   1.7ms  2.3  1.4  3.5   0.8     MPLS:L=1270,E=0
 2 10.0.0.1       0%   11   3.1ms  2.7  1.3  3.7   0.9

Any ideas?

I’ve not found a solution to this yet, but I’ve also not had this particular issue come up.

It does sound, however, like you’re having a much different problem. You might want to start a new thread and post the output from the following commands from R2 (the one in the middle):

/ip export
/mpls export
/routing export