DHCP server renew challenge

I am having a challenge with the Mikrotik dhcp server and some devices. Occasionally I get this when a few devices renew the dhcp lease. I found this while testing a new device, so to shorten the test, I set the dhcp server lease time to 10 minutes.

Here is the request from my client for a renewal:

07:39:55 dhcp,debug,packet dhcp3 received request with id 1831 from 192.168.2.250
07:39:55 dhcp,debug,packet flags = broadcast
07:39:55 dhcp,debug,packet ciaddr = 192.168.2.250
07:39:55 dhcp,debug,packet yiaddr = 192.168.2.250
07:39:55 dhcp,debug,packet siaddr = 192.168.2.1
07:39:55 dhcp,debug,packet chaddr = 00:AA:BB:CC:DE:02
07:39:55 dhcp,debug,packet Msg-Type = request
07:39:55 dhcp,debug,packet Client-Id = 01-00-AA-BB-CC-DE-02
07:39:55 dhcp,debug,packet Host-Name = “WIZnetCCDE02”
07:39:55 dhcp,debug,packet Address-Request = 192.168.2.250
07:39:55 dhcp,debug,packet Server-Id = 192.168.2.1
07:39:55 dhcp,debug,packet Parameter-List = Router,Subnet-Mask,Domain-Server,Domain-Name,Renewal-Time

Here is the response from the Mikrotik dhcp server issuing another 10 minutes on this lease:

07:39:55 dhcp,debug,packet dhcp3 sending ack with id 1831 to 255.255.255.255
07:39:55 dhcp,debug,packet flags = broadcast
07:39:55 dhcp,debug,packet ciaddr = 192.168.2.250
07:39:55 dhcp,debug,packet yiaddr = 192.168.2.250
07:39:55 dhcp,debug,packet siaddr = 192.168.2.1
07:39:55 dhcp,debug,packet chaddr = 00:AA:BB:CC:DE:02
07:39:55 dhcp,debug,packet Msg-Type = ack
07:39:55 dhcp,debug,packet Server-Id = 192.168.2.1
07:39:55 dhcp,debug,packet Address-Time = 600
07:39:55 dhcp,debug,packet Router = 192.168.2.1
07:39:55 dhcp,debug,packet Subnet-Mask = 255.255.255.0
07:39:55 dhcp,debug,packet Domain-Server = 68.105.28.16

Here is the entry in “/ip dhcp-server lease” immediately after this renewal:

1 D address=192.168.2.250 mac-address=00:AA:BB:CC:DE:02
client-id="1:0:aa:bb:cc:de:2" server=dhcp3 status=bound
expires-after=4m35s last-seen=25s active-address=192.168.2.250
active-mac-address=00:AA:BB:CC:DE:02
active-client-id="1:0:aa:bb:cc:de:2" active-server=dhcp3
host-name="WIZnetCCDE02"

It shows 5 minutes, not the 10 minutes that was sent in the renewal (Address-Time = 600). This causes the lease to expire just before the next renewal, and causes log entries like this every 10 minutes:
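The arithmetic behind "5 minutes, not 10" can be checked with a small sketch. It assumes the durations use the w/d/h/m/s suffix form shown in the lease entry above; `parse_duration` and `lease_length_seen_by_server` are illustrative helper names, not part of any RouterOS tool.

```python
import re

# Seconds per unit for durations written like "4m35s" (assumed format).
UNITS = {"w": 604800, "d": 86400, "h": 3600, "m": 60, "s": 1}

def parse_duration(text):
    """Convert a duration such as '4m35s' or '25s' to seconds."""
    parts = re.findall(r"(\d+)([wdhms])", text)
    if not parts:
        raise ValueError(f"unrecognised duration: {text!r}")
    return sum(int(number) * UNITS[unit] for number, unit in parts)

def lease_length_seen_by_server(expires_after, last_seen):
    """Time left plus time since the last packet: the lease length
    the server is actually counting from the last renewal."""
    return parse_duration(expires_after) + parse_duration(last_seen)

address_time = 600  # Address-Time sent in the ACK
implied = lease_length_seen_by_server("4m35s", "25s")
print(implied, address_time - implied)
```

With the values from the lease entry, the server is counting a 300-second lease from the renewal, 300 seconds short of the Address-Time it sent.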

07:24:54 dhcp,info dhcp3 assigned 192.168.2.250 to 00:AA:BB:CC:DE:02
07:34:54 dhcp,info dhcp3 deassigned 192.168.2.250 from 00:AA:BB:CC:DE:02
07:34:55 dhcp,info dhcp3 assigned 192.168.2.250 to 00:AA:BB:CC:DE:02
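To find other devices showing the same pattern, a saved log can be scanned for leases that are deassigned and then assigned again within a few seconds. This is a minimal sketch that assumes the `dhcp,info` line format shown above and a log that does not cross midnight (the lines carry no date); `find_flaps` and the 5-second window are illustrative choices.

```python
import re
from datetime import datetime, timedelta

LINE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}) dhcp,info (\S+) (assigned|deassigned) (\S+) (?:to|from) (\S+)$"
)

def find_flaps(lines, max_gap=timedelta(seconds=5)):
    """Return (mac, deassigned_at, assigned_at) for every lease that was
    dropped and handed out again to the same MAC within max_gap."""
    last_drop = {}
    flaps = []
    for line in lines:
        match = LINE.match(line.strip())
        if not match:
            continue
        stamp, _server, event, _address, mac = match.groups()
        when = datetime.strptime(stamp, "%H:%M:%S")
        if event == "deassigned":
            last_drop[mac] = when
        elif mac in last_drop:
            dropped = last_drop.pop(mac)
            if timedelta(0) <= when - dropped <= max_gap:
                flaps.append((mac, dropped.time(), when.time()))
    return flaps

log = [
    "07:24:54 dhcp,info dhcp3 assigned 192.168.2.250 to 00:AA:BB:CC:DE:02",
    "07:34:54 dhcp,info dhcp3 deassigned 192.168.2.250 from 00:AA:BB:CC:DE:02",
    "07:34:55 dhcp,info dhcp3 assigned 192.168.2.250 to 00:AA:BB:CC:DE:02",
]
for mac, dropped, assigned in find_flaps(log):
    print(mac, dropped, assigned)
```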

The client device successfully renews a second or two after the lease expires, but will not renew successfully (edit: on the dhcp server end only, the client thinks all is ok) at half the lease time.

Anyone here see that as ok, and not a fail? I have access to the firmware and driver for this ethernet device, so if someone sees a problem with my device request, that can be modified.
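One way to look for a problem in the request is to compare it with what RFC 2131 asks of a DHCPREQUEST sent in the RENEWING state: section 4.3.2 says ciaddr MUST be filled in, while the server identifier and requested IP address options MUST NOT be, and Table 5 in section 4.4.1 lists yiaddr and siaddr as 0 in client messages. The sketch below applies only those rules to the debug lines above. Whether any of them is what trips the server is an assumption, not something established in this thread; `parse_packet` and `check_renewing_request` are illustrative names.

```python
import re

FIELD = re.compile(r"dhcp,debug,packet\s+(\S+) =\s*(.*)$")

def parse_packet(lines):
    """Collect 'name = value' pairs from RouterOS dhcp debug lines."""
    fields = {}
    for line in lines:
        match = FIELD.search(line)
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields

def check_renewing_request(fields):
    """List deviations from RFC 2131 for a RENEWING-state DHCPREQUEST."""
    findings = []
    if fields.get("ciaddr", "0.0.0.0") == "0.0.0.0":
        findings.append("ciaddr MUST be the client's address")
    if "Server-Id" in fields:
        findings.append("Server-Id MUST NOT be filled in")
    if "Address-Request" in fields:
        findings.append("Address-Request MUST NOT be filled in")
    for name in ("yiaddr", "siaddr"):
        if fields.get(name, "0.0.0.0") != "0.0.0.0":
            findings.append(f"{name} should be 0 in a client message")
    return findings

REQUEST = [
    "07:39:55 dhcp,debug,packet dhcp3 received request with id 1831 from 192.168.2.250",
    "07:39:55 dhcp,debug,packet flags = broadcast",
    "07:39:55 dhcp,debug,packet ciaddr = 192.168.2.250",
    "07:39:55 dhcp,debug,packet yiaddr = 192.168.2.250",
    "07:39:55 dhcp,debug,packet siaddr = 192.168.2.1",
    "07:39:55 dhcp,debug,packet Msg-Type = request",
    "07:39:55 dhcp,debug,packet Address-Request = 192.168.2.250",
    "07:39:55 dhcp,debug,packet Server-Id = 192.168.2.1",
]
for finding in check_renewing_request(parse_packet(REQUEST)):
    print(finding)
```

A request carrying both Server-Id and Address-Request resembles what the RFC describes for the SELECTING state rather than RENEWING, so those two options may be worth removing from the renewal packet as a first experiment.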

You can run this command:
/ip dhcp-server lease print detail follow
to see the actual expires-after timer for the lease. In my test setup, it is set to the correct value on every renew request from my Linux box.

edit: what happens if you use different dhcp client?

here are some details from RFC2131:

The client maintains two times, T1 and T2, that specify the times at
which the client tries to extend its lease on its network address.
T1 is the time at which the client enters the RENEWING state and
attempts to contact the server that originally issued the client’s
network address. T2 is the time at which the client enters the
REBINDING state and attempts to contact any server. T1 MUST be
earlier than T2, which, in turn, MUST be earlier than the time at
which the client’s lease will expire.
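For reference, RFC 2131 section 4.4.5 gives default values of 0.5 and 0.875 of the lease duration for T1 and T2 when the server does not supply them. A minimal sketch of that calculation follows; `renewal_times` is an illustrative name, and the small random "fuzz" the RFC recommends is left out.

```python
def renewal_times(lease_seconds, t1_factor=0.5, t2_factor=0.875):
    """Return (T1, T2) in seconds using the RFC 2131 default factors."""
    t1 = int(lease_seconds * t1_factor)
    t2 = int(lease_seconds * t2_factor)
    if not t1 < t2 < lease_seconds:
        raise ValueError("T1 must be earlier than T2, and T2 earlier than expiry")
    return t1, t2

# 10-minute, 1-hour and 2-hour leases
for lease in (600, 3600, 7200):
    t1, t2 = renewal_times(lease)
    print(f"lease={lease}s  T1={t1}s  T2={t2}s")
```

For the 10-minute test lease this gives T1 at 300 seconds and T2 at 525 seconds.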

Most of my clients renew successfully at the T1 time. Just a few don’t, and those are the ones that concern me. The dhcp server is telling my client that it renewed ok, and it has another 10 minutes, when in fact, your dhcp server is going to expire and remove the lease in 5 minutes. That is my complaint.

If it just didn’t renew at the T1 time, ok. If it renewed at T1 and issued a new Address-Time of 5 minutes, ok. But it doesn’t. It issues 10 minutes, and expires in 5. ??

I see when the lease expires. The lease expires when the expires-after time reaches zero, just like it should, and the lease disappears from the “/ip dhcp-server lease” list. A couple seconds later, the new lease appears when the same client requests a renewal.

It isn’t the renewal after the lease has expired that is the trouble. It is the renewal at T1, which is half the lease time. In my test case, the lease is 10 minutes, and the renewal attempt that fails is at 5 minutes. Every other renewal fails. It is always the attempt at T1 that fails, never the one after the lease expires; that one always renews ok. The failure happens only when there is “time on the clock” (expires-after > 0).

edit: And what I mean by “fails” is the server issues and sends a successful renewal, but fails to store the new expires-after time in the dhcp server database. It does manage to update the last-seen value in the database on the failed renew, just not expires-after. :frowning:
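The every-other-renewal pattern can be reproduced with a toy timeline. This is a model of the symptom as described here, not of RouterOS internals: the server ACKs a renewal during an active lease without storing a new expiry, and the client, trusting the Address-Time it was given, renews at half the lease time after each ACK. The name `simulate` and its parameters are illustrative.

```python
def simulate(lease=600, duration=1800, server_updates_expiry=False):
    """Return (second, event) pairs for one client and one server."""
    events = [(0, "assigned")]
    server_expiry = lease        # lease handed out at t = 0
    next_renew = lease // 2      # client renews at T1
    for now in range(1, duration + 1):
        # Server side: drop the lease when its own timer runs out.
        if server_expiry is not None and now >= server_expiry:
            events.append((now, "deassigned"))
            server_expiry = None
        # Client side: send a renewal when its T1 timer fires.
        if now == next_renew:
            if server_expiry is None:
                events.append((now, "assigned"))
                server_expiry = now + lease
            elif server_updates_expiry:
                events.append((now, "renewed, expiry stored"))
                server_expiry = now + lease
            else:
                events.append((now, "acked, expiry not stored"))
            next_renew = now + lease // 2  # client believes it has a full lease
    return events

for second, event in simulate():
    print(f"{second:5d}s  {event}")
```

In this model the client's second renewal lands in the same second the server drops the lease; the real device arrives a second or two later, which matches the deassigned/assigned pairs in the log. Calling `simulate(server_updates_expiry=True)` shows the expected behaviour, with no deassignments at all.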

What dhcp-clients are these? Do they have anything in common? What happens if you set the lease time to 20 minutes (just to check whether the dhcp-client has a problem where T1 is set incorrectly and the lease renewal process starts too late, so that the lease times out)?

All my routers have NTP client running. The times in all the stuff I have posted and sent, including the emails to support, document the times and the response.

I don’t know about the few other devices that fail. I have access to one of those devices, and the source code for the firmware and library for that device. How lucky is that? :wink:

I started at 2 hours lease time, and the fails (not updating expires-after but updating last-seen) happen at 1 hour. I changed to 1 hour lease, and the fails happen at 30 minutes. I changed to 10 minutes lease, and the fails happen at 5 minutes.

Remember, this is only every other renewal time. When the lease does expire, the same device using basically the same renewal packet works fine.

@Janis: BTW, despite your email message to the contrary, I don’t believe that working half the time is ok. I like my stuff not broken.

I tried Linux (Kubuntu 13.04) and RouterOS dhcp-clients and have not seen any issues with these clients renewing and getting the correct lease time. An RB2011 is serving as the DHCP server.

expires-after is always set correctly to value of the lease time and counted down, and around half of the lease time it is renewed by client. (That is with latest build of RouterOS from new branch)

Using this command:
/ip dhcp-server lease print detail follow
you should see exactly what value is set in the expires-after field, and whether it is set correctly to the lease time. If it is, then the problem should be with your client. If, on the other hand, the value is incorrect, I will want to get my hands on the DHCP client you are using so I can test with it locally and see why the DHCP server would set up the lease wrong, causing it to expire before an active, connected client renews it.

Of course, set the lease time to something reasonably small, like the 10 minutes you started the topic with.

That is what I have been doing. What I want to know is: if my client device is malfunctioning, why is your dhcp server saying “OK! You are good to go!”, and then doesn’t store the Address-Time in its database?

07:39:55 dhcp,debug,packet dhcp3 sending ack with id 1831 to 255.255.255.255

If the request was screwed up, why did your dhcp server not return NACK? It sends ACK and Address-Time=600. How would my client detect that as a fail?

Actually, I had noticed this happening to other devices before I got the current failing device, but had no way to test it correctly. Now that I have a device that I can thoroughly tweak, I can troubleshoot it. That is why I am such a pain in the bottom sometimes. You have my apology in advance! :smiley:

Thanks for taking the time with this. It is good to know someone up there cares. :smiley:

Add: The device is a Wiznet W5100 using the Arduino ethernet library. I had to debug it a bit, but all is working well now, except this renew thing.

To resolve this, it would be very useful if you could create a supout.rif file right after the moment when the wrong expiry time is set for that client’s lease, as it is not clear why the client would get that.

07:39:55 dhcp,debug,packet dhcp3 sending ack with id 1831 to 255.255.255.255

this is normal behavior when DHCP-server gets indication that unicast renew request has failed (now or previously) and new renew is sent via broadcast. This can happen and is nothing new/alarming.

I sent the supout with one of the emails as requested.

this is normal behavior when DHCP-server gets indication that unicast renew request has failed (now or previously) and new renew is sent via broadcast. This can happen and is nothing new/alarming.

I never get an indication that the renew fails. So it fails and tells the client via broadcast it worked? But just as a kicker, it lies about the lease time? Nothing new or alarming?

edit: I wanted to mention one more thing. I have tried the renewal request both unicast and broadcast with the same result.

I am about to let the MikroTik development team rummage around in my router. I feel so naked! :laughing:

MikroTik developers wandered around in my router today. It wasn’t too bad, except I did not hear what they found, nor did I get a “we’re finished”, but that is ok. The doctor says I will be able to sit without too much discomfort in a couple days. :laughing:

Add: I’m just being funny here. I think this was a really good thing they did.

The developers found a problem with the client renewal request. I want to thank them for their help and patience. I’ll contact the suppliers of the client equipment and see if I can get them to straighten it up.

Here is the result of the evaluation:

Hello,

FYI - next version v6.1 will have workaround. but if we will have some new
unexpected complains from other customers we will revert back to standard.

Regards,
Janis Megis

If you start having trouble with the dhcp server in v6.1, it is my fault. Be sure to file a report if you have problems.

Hi Surfer or MT guys, can someone confirm whether the latest 6.1rc1 has this workaround for testing? I have the same problem with a lot of TP-Link routers in my network, and sometimes with other vendors too.
Thanks!

yes, it is available in 6.1rc1 build

Hello Janis, I think the DHCP server problem was solved in version 6.1rc1, but many times a day the network goes down and we need to reboot RouterOS. This occurs in a large bridged hotspot network with 2 routers, one x86 PC and one RB1100AH.
We have now downgraded to 6.0 again, and the DHCP server problem continues. Can you help me with this problem?
Thanks and regards!

Do you have a support output file from the RB1100AH in the state where it has to be rebooted? And what problems exactly did you have that forced you to reboot the router?

Hello Janis, I don’t have a supout from the RB. The problem is this: the RB generally carries around 50Mb of traffic, and when clients start calling because they have no service, the total traffic on the RB is around 10Mb or less. We need to reboot the RB, and then everything runs fine again for some hours.