2 BUGS with GRE keepalives

Hi,

I’ve been looking at how the GRE keepalives work and I believe I’ve discovered two bugs. Actually one bug and one design problem.

  1. The design problem – as far as I can tell from the Cisco documentation, the keepalive process is designed to be self-supporting, even when the far end doesn’t understand keepalives. Linux is broken in this regard (more later), but the principle is that you send the other end an encapsulated packet that is already addressed back to you … the other end unwraps it and, through ordinary forwarding, sends the inner packet back. On receipt of that return packet, the sender can reset its timer and consider the tunnel up.
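To make the mechanism concrete, here is a rough Python sketch of how I understand the packet is put together and what the far end does with it. It assumes the layout described in the Cisco doc (an inner IP/GRE packet with protocol type 0, wrapped in a normal GRE/IPv4 encapsulation). The function names and the 192.0.2.x addresses are made up for illustration, and it only builds bytes, it doesn't send anything.

```python
import socket
import struct

GRE_PROTO = 47  # IP protocol number for GRE


def ip_checksum(header: bytes) -> int:
    """16-bit one's-complement checksum over an IPv4 header."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header(src: str, dst: str, payload_len: int, ttl: int = 255) -> bytes:
    """Build a 20-byte IPv4 header (no options) whose payload is GRE."""
    fields = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + payload_len, 0, 0, ttl, GRE_PROTO, 0,
        socket.inet_aton(src), socket.inet_aton(dst),
    )
    # Fill in the checksum field (bytes 10-11) once the rest is known.
    return fields[:10] + struct.pack("!H", ip_checksum(fields)) + fields[12:]


def gre_header(protocol_type: int) -> bytes:
    """Basic 4-byte GRE header: no flags, version 0."""
    return struct.pack("!HH", 0, protocol_type)


def build_keepalive(local: str, remote: str) -> bytes:
    """Hypothetical helper: keepalive as sent by the 'local' tunnel end."""
    # Inner packet is addressed from the remote end back to us, GRE type 0.
    inner = ipv4_header(remote, local, 4) + gre_header(0x0000)
    # Outer packet is an ordinary GRE encapsulation of an IPv4 payload.
    outer = ipv4_header(local, remote, 4 + len(inner)) + gre_header(0x0800)
    return outer + inner


def decapsulate(packet: bytes) -> bytes:
    """What the far end does: strip outer IP + GRE, forward what is left."""
    ihl = (packet[0] & 0x0F) * 4
    return packet[ihl + 4:]


keepalive = build_keepalive("192.0.2.1", "192.0.2.2")
returned = decapsulate(keepalive)
print(len(keepalive), len(returned), returned.hex())
```

The point is that the far end needs no keepalive logic at all: once it strips the outer headers, the inner packet is just an IP packet destined for the sender, so normal forwarding returns it. The sender should only need to see that 24-byte packet come back.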

The Mikrotik implementation seems to be broken … it doesn’t care about the “return” packet at all. The link is only considered up if the other side actually sends keepalive packets to it. This means it will only ever work with other systems that implement keepalive.

You can reproduce easily with two routers and a GRE tunnel between them … if you enable keepalive on one side, but not the other. (According to Cisco this should be fine, but only one end will be able to correctly identify tunnel state.) [You might need to disable neighbor discovery on the tunnel as this can also keep the tunnel up.]

In this case you will see the GRE keepalive being sent from the router with keepalives enabled, you will also see the unwrapped packet coming back, but the link will never be marked as running. Unless I’m missing something, this is the wrong behavior.

I have written a small daemon for Linux that works around the Linux incompatibility: it correctly unwraps the keepalives and sends the result back to the Mikrotik router. You see the same thing … the tunnel is never marked as running.

As soon as I modify the Linux daemon to actually send a keepalive packet (rather than just respond) the tunnel is immediately marked as running.

This will break interoperability with any system that doesn’t support keepalives, or where keepalives are not enabled … which is completely against the Cisco design.

To quote the Cisco doc…

•The tunnel keepalive mechanism functions even if the far tunnel endpoint does not support keepalives.

  2. The bug – it’s fairly minor and doesn’t impact any functionality, but the keepalive “return” packet sent on the wire is much longer than needed. The IP header gives a total length of 0x18 (24 bytes), but 0x2e (46) bytes appear on the wire, with the extra 22 bytes being zero padding:
 4 time=2.343 interface=main-bridge direction=rx data=
     0000: 45 00 00 18 00 00 00 00  fe 2f 7c f5 xx xx xx xx  E....... ./|...XX
     0010: xx xx xx xx 00 00 00 00  00 00 00 00 00 00 00 00  ..X..... ........
     0020: 00 00 00 00 00 00 00 00  00 00 00 00 00 00        ........ ......
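The numbers can be checked against the capture with a few lines of Python. This is only a sketch: the redacted "xx" address bytes are replaced with made-up 192.0.2.x addresses, so the header checksum is deliberately not verified.

```python
import struct

# Bytes from the capture above. The redacted "xx" address bytes are replaced
# with hypothetical documentation addresses (192.0.2.1 and 192.0.2.2).
capture = bytes.fromhex(
    "45000018 00000000 fe2f7cf5 c0000201"
    "c0000202 00000000 00000000 00000000"
    "00000000 00000000 00000000 0000"
)

header_len = (capture[0] & 0x0F) * 4                   # IHL in bytes
total_length = struct.unpack("!H", capture[2:4])[0]    # length per IP header
protocol = capture[9]                                  # 47 means GRE
gre = capture[header_len:header_len + 4]               # 4-byte GRE header
padding = capture[total_length:]                       # bytes past the IP packet

print("IP header length :", header_len)
print("IP total length  :", hex(total_length))
print("bytes captured   :", hex(len(capture)))
print("protocol         :", protocol)
print("GRE header       :", gre.hex())
print("padding bytes    :", len(padding), "all zero:", not any(padding))
```

So the real packet is a 20-byte IP header plus a 4-byte GRE header, and everything after that is zeros. Note that 0x2e (46) is also the minimum Ethernet payload size, so the trailing zeros may simply be link-layer padding rather than something the GRE code adds; that is a guess, not something the capture proves.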

I will raise these with support … but I’m interested if anyone else has a view?

Lee.

This bug still exists! I cannot get proper keepalive responses from tunnelbroker.ch, and the tunnel goes down.

Could you please share your correspondence with MikroTik support here?