I'm an idiot, but so is the TCP/IP stack

I had to temporarily create an overlapping subnet (yes, yes, I know…) until the DHCP server cycles through. I have the following addresses on a single interface:

12.34.56.1/24
12.34.56.241/28

When sending an ARP request for 12.34.56.242, the router uses 12.34.56.1 as the sender address, so it’s asking for a reply to 12.34.56.1, which some devices will ignore because that IP is outside of their subnet.

I’m guessing this is a bug in the Linux kernel, where it’s going on the first matching IP address rather than the best match. Disabling and re-enabling the 12.34.56.1/24 address pushes it to “the bottom of the list,” allowing the ARP request to ask for replies to be sent to the right IP. I do not know how this will behave after a reboot, but I’m guessing the results will be unpredictable.
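To make the difference concrete, here is a rough sketch of "first match" versus "best match" source selection. It only models the behavior described above using Python's ipaddress module; it is not the kernel's actual code, and the function names first_match and best_match are made up for illustration.

```python
import ipaddress

def first_match(addresses, target):
    """Return the first configured address whose subnet contains the target."""
    target = ipaddress.ip_address(target)
    for iface in addresses:
        if target in iface.network:
            return iface.ip
    return None

def best_match(addresses, target):
    """Return the address whose subnet contains the target with the longest prefix."""
    target = ipaddress.ip_address(target)
    candidates = [iface for iface in addresses if target in iface.network]
    if not candidates:
        return None
    return max(candidates, key=lambda iface: iface.network.prefixlen).ip

# The two addresses from the post, in the order they were added.
addrs = [
    ipaddress.ip_interface("12.34.56.1/24"),
    ipaddress.ip_interface("12.34.56.241/28"),
]

first = first_match(addrs, "12.34.56.242")
best = best_match(addrs, "12.34.56.242")

# Re-adding the /24 moves it to the end of the list.
reordered = first_match(list(reversed(addrs)), "12.34.56.242")
```

With the original ordering, first_match picks the /24 address and best_match picks the /28 address. Reversing the list makes first_match agree with best_match, which is consistent with the disable/re-enable workaround.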

Yes, I’m working on fixing these overlapping subnets. I wasn’t happy with doing them to begin with, but I did expect the kernel to deal with them gracefully. Then again, I’m pretty sure nobody has ever accused Linux of having a robust or sensible TCP/IP stack.

No, it’s working as expected. The /24 contains the /28.
You cannot expect 12.34.56.240 to be both a network and an IP. Not to mention that you are not using the right address for the /28. So no, the Linux kernel isn’t failing; you are asking it to do something wrong, and it’s trying to work out how to do it. The fact that you have the /28 is almost irrelevant, because you have the /24. It’s the same thing as filling a cup of water, and saying that you have two bottoms and one top. No - you have one top, one bottom. Drawing a line on the cup doesn’t add a second bottom.
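The containment point can be checked quickly. This is just a sketch with Python's ipaddress module, using the addresses from the first post; the variable names are arbitrary.

```python
import ipaddress

wide = ipaddress.ip_network("12.34.56.0/24")
# Derive the /28 network from the address that was actually configured.
narrow = ipaddress.ip_interface("12.34.56.241/28").network

# True when every address of the /28 also lies inside the /24.
contained = narrow.subnet_of(wide)

# The /28's network address is an ordinary usable host from the /24's point of view.
network_is_host_in_wide = narrow.network_address in set(wide.hosts())
```

Both values come out True: the /28 is 12.34.56.240/28, it sits entirely inside the /24, and .240 is a normal host address as far as the /24 is concerned.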

You are right, it is working as expected: it’s selecting the first matching address on the interface. The unpredictable part is that we have no way of knowing or controlling which address is/was bound first at boot time. Presumably it’s in whatever order the config was written, but we have no way to control that.

Routing, on the other hand, is 100% predictable. Everything else being equal, the most specific route wins. Period. That means that a /28 is preferred over a /24 and both are preferred over a /16. If I come back later and add a more specific route, then that route will take precedence. Period.
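For comparison, a minimal sketch of the longest-prefix rule described above. It ignores distance/metric (the "everything else being equal" part), and the gateway labels are placeholders, not anything from a real config.

```python
import ipaddress

def lookup(routes, destination):
    """Pick the matching route with the longest prefix, or None if nothing matches."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, gw) for net, gw in routes if dest in net]
    if not matches:
        return None
    return max(matches, key=lambda route: route[0].prefixlen)

# Hypothetical table; "gw-16" and "gw-24" are just labels.
routes = [
    (ipaddress.ip_network("10.240.0.0/16"), "gw-16"),
    (ipaddress.ip_network("10.240.240.0/24"), "gw-24"),
]

before = lookup(routes, "10.240.240.242")

# Adding a more specific route later changes the result, regardless of insertion order.
routes.append((ipaddress.ip_network("10.240.240.240/28"), "gw-28"))
after = lookup(routes, "10.240.240.242")
```

Before the /28 is added the /24 route wins over the /16; afterwards the /28 wins, no matter where it sits in the list.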

With regard to overlapping subnets on a given interface, I would have expected that all communication, including ARP, would use the best matching address rather than the first matching address.

Now that this has been said, it’s quite obvious that this is outside the scope of what MT can or should fix. Unless, of course, one of their developers should care enough to submit a patch to Linus for inclusion in the kernel. For me, I now know that this behavior is unpredictable and can take precautions against it.

BTW, how is that not the right address for a /28?

[admin@router] > /ip address print where interface=ether1.240
Flags: X - disabled, I - invalid, D - dynamic 
 #   ADDRESS            NETWORK         INTERFACE                                
 0   10.240.240.1/24    10.240.240.0    ether1.240                               
 1   10.240.240.241/28  10.240.240.240  ether1.240                               
[admin@router] > ip route print where dst-address in 10.240.240.0/24
Flags: X - disabled, A - active, D - dynamic, 
C - connect, S - static, r - rip, b - bgp, o - ospf, m - mme, 
B - blackhole, U - unreachable, P - prohibit 
 #      DST-ADDRESS        PREF-SRC        GATEWAY            DISTANCE
 0 ADC  10.240.240.0/24    10.240.240.1    ether1.240                0
 1 ADC  10.240.240.240/28  10.240.240.241  ether1.240                0

I suppose if you’re in the camp of using the last address as a gateway, then it is wrong. I always use the first. To each their own.
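For anyone who wants to double-check the /28 boundaries, here is a small sketch using Python's ipaddress module with the address from the printout above. The summary dictionary is only there for illustration.

```python
import ipaddress

iface = ipaddress.ip_interface("10.240.240.241/28")
net = iface.network
hosts = list(net.hosts())  # usable addresses, excluding network and broadcast

summary = {
    "network": str(net.network_address),
    "broadcast": str(net.broadcast_address),
    "first_host": str(hosts[0]),
    "last_host": str(hosts[-1]),
    "host_count": len(hosts),
}
```

So .241 is the first usable address in 10.240.240.240/28 and .254 is the last, with .240 as the network and .255 as the broadcast.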

The rule you are referring to (most specific interface) applies only to IPs assigned to different interfaces and routes.
The solution in this case would be virtual interfaces, which generate the proper routes and have been supported by the kernel for ages.
But this does not seem to be possible on ROS, and could be a feature request.
(If I remember correctly, there were virtual ethernet interfaces in old 4.x ROS, weren’t there?)

Example on Debian Wheezy (kernel 3.2.x):

eth1 Link encap:Ethernet HWaddr a0:48:1c:b8:fc:8d
inet addr:192.168.74.1 Bcast:192.168.74.255 Mask:255.255.255.0
inet6 addr: fe80::a248:1cff:feb8:fc8d/64 Scope:Link
inet6 addr: x:x:x:x:1::1/64 Scope:Global
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:19274845 errors:0 dropped:0 overruns:0 frame:0
TX packets:16607617 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:3563576115 (3.3 GiB) TX bytes:3050583519 (2.8 GiB)
Interrupt:17

eth1:0 Link encap:Ethernet HWaddr a0:48:1c:b8:fc:8d
inet addr:192.168.74.100 Bcast:192.168.74.111 Mask:255.255.255.240
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
Interrupt:17

My error on the wrong address for the /28. I was thinking network, not IP. That said...

The address is predictable. It will default to the IP on the interface that contains the largest subnet. So it will pick .1 because the /24 covers both the .0 and .240 subnets. The problem that you are seeing is that the /28 network is .240, which is already considered to be an IP on the local .0 subnet.

I honestly can’t think of a solution that is going to work 100%. I suppose you could assign .241 to a different interface (if you have a spare) and plug it into the same switch.