MimiFleX
newbie
Topic Author
Posts: 49
Joined: Tue Jun 13, 2006 2:36 pm
Location: France

Issues with MTU>1504 bytes on Intel NICs like 82575EB

Thu Jan 12, 2012 12:44 pm

Hi all,

For some time now we have been having serious problems with Intel NICs such as the 82575EB when using an MTU > 1504:
  • ROS 5.11 on an x86 machine (quad-core Xeon).
  • L2TP server no longer works
  • EoIP is very unstable
  • Router crashes and reboots randomly when receiving L2TP or EoIP packets
First of all, two tickets were opened about this issue: one several months ago, which has not led anywhere, and a new one this morning with fresher information. Please find this information below. I hope the community has some clues about it, or some experience using high MTUs with Intel NICs.

What we are trying to do is support an L2MTU of more than 1508 bytes on
these chipsets, especially the 82575EB. This chipset supports an MTU of more
than 9000 bytes, which has been confirmed using ping without fragmentation.

What we are doing:
  • As the L2MTU is determined by the IP MTU of the physical Ethernet device,
    we use the following rule: effective L2MTU = IP MTU + 4, which is correct.
  • To enable QinQ support at 1500 bytes, we simply set the IP MTU of ether1 to
    1504 to get an effective L2MTU of 1508 bytes. That works without
    crashes; EoIP and the L2TP server behave correctly.
  • To enable QinQinQ or to use VPLS, we need to increase this L2MTU a bit
    more. That's where our problem begins. Setting the IP MTU of ether1 to more
    than 1504 bytes (1505 bytes or more) causes L2TP to stop working, and
    receiving EoIP or some L2TP packets causes the router to crash.
We also see lines like this in the logs about L2TP:
10:12:04 l2tp,debug received invalid packet, dropping
Using the packet sniffer and Wireshark shows us that the L2TP packet IS VALID.
But we are unable to get anything working, only crashes.
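
The MTU arithmetic described above can be sketched in a few lines of Python. This is only an illustration, assuming the rule stated in this post (effective L2MTU = IP MTU + 4 on this hardware) and 4 bytes per stacked VLAN tag; the function and constant names are made up for the example and are not RouterOS settings.

```python
# Sketch of the MTU arithmetic from the post. Names are hypothetical.
VLAN_TAG_BYTES = 4   # size of one 802.1Q / 802.1ad VLAN tag
L2MTU_OFFSET = 4     # rule from the post: effective L2MTU = IP MTU + 4


def required_l2mtu(payload_mtu: int, vlan_tags: int) -> int:
    """L2MTU needed to carry payload_mtu bytes behind vlan_tags stacked tags."""
    return payload_mtu + vlan_tags * VLAN_TAG_BYTES


def ip_mtu_for_l2mtu(l2mtu: int) -> int:
    """IP MTU to configure on the physical port to obtain the given L2MTU."""
    return l2mtu - L2MTU_OFFSET


for name, tags in (("QinQ", 2), ("QinQinQ", 3)):
    l2mtu = required_l2mtu(1500, tags)
    print(f"{name}: L2MTU {l2mtu}, IP MTU on ether1 {ip_mtu_for_l2mtu(l2mtu)}")
```

Under these assumptions QinQ needs an IP MTU of 1504 on ether1 (the setting that works), while QinQinQ needs 1508, which is already in the range where the crashes start.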

See below for a PCI resource list :
 0 09:03.0  Promise Technology, Inc.  20275 (rev: 1)                                       3
 1 09:01.0  ATI Technologies Inc      ES1000 (rev: 2)                                     15
 2 08:00.0  Intel Corporation         82574L Gigabit Network Connection (rev: 0)           5
 3 07:00.0  Intel Corporation         82574L Gigabit Network Connection (rev: 0)          10
 4 06:00.0  Intel Corporation         82574L Gigabit Network Connection (rev: 0)          15
 5 05:00.0  Intel Corporation         82574L Gigabit Network Connection (rev: 0)          11
 6 04:00.0  Intel Corporation         82574L Gigabit Network Connection (rev: 0)           5
 7 03:00.0  Intel Corporation         82574L Gigabit Network Connection (rev: 0)          10
 8 02:00.1  Intel Corporation         82575EB Gigabit Network Connection (rev: 2)          5
 9 02:00.0  Intel Corporation         82575EB Gigabit Network Connection (rev: 2)         10
10 00:1f.5  Intel Corporation         82801I (ICH9 Family) 2 port SATA IDE Controlle...  15
11 00:1f.3  Intel Corporation         82801I (ICH9 Family) SMBus Controller (rev: 2)      11
12 00:1f.2  Intel Corporation         82801IR/IO/IH (ICH9R/DO/DH) 4 port SATA IDE Co...   15
13 00:1f.0  Intel Corporation         82801IR (ICH9R) LPC Interface Controller (rev: 2)    0
14 00:1e.0  Intel Corporation         82801 PCI Bridge (rev: 146)                          0
15 00:1d.7  Intel Corporation         82801I (ICH9 Family) USB2 EHCI Controller #1 (...   14
16 00:1d.1  Intel Corporation         82801I (ICH9 Family) USB UHCI Controller #2 (r...   15
17 00:1d.0  Intel Corporation         82801I (ICH9 Family) USB UHCI Controller #1 (r...   14
18 00:1c.5  Intel Corporation         82801I (ICH9 Family) PCI Express Port 6 (rev: 2)    10
19 00:1c.4  Intel Corporation         82801I (ICH9 Family) PCI Express Port 5 (rev: 2)     5
20 00:1c.3  Intel Corporation         82801I (ICH9 Family) PCI Express Port 4 (rev: 2)    15
21 00:1c.2  Intel Corporation         82801I (ICH9 Family) PCI Express Port 3 (rev: 2)    11
22 00:1c.1  Intel Corporation         82801I (ICH9 Family) PCI Express Port 2 (rev: 2)    10
23 00:1c.0  Intel Corporation         82801I (ICH9 Family) PCI Express Port 1 (rev: 2)     5
24 00:06.0  Intel Corporation         3210 Chipset Host-Secondary PCI Express Bridge...   10
25 00:01.0  Intel Corporation         3200/3210 Chipset Host-Primary PCI Express Bri...   10
26 00:00.0  Intel Corporation         3200/3210 Chipset DRAM Controller (rev: 1)           0
And a picture of the screen during the crash (on 5.1, but it's the same on 5.11), showing a kernel panic when a packet is received:
crash1.png
Does anyone else encounter the same issue? How do people manage to use this chipset without crashes with EoIP or L2TP?
 
macgaiver
Forum Guru
Posts: 1721
Joined: Wed May 18, 2005 5:57 pm
Location: Sol III, Sol system, Sector 001, Alpha Quadrant

Re: Issues with MTU>1504 bytes on Intel NICs like 82575EB

Thu Jan 12, 2012 1:08 pm

/system console screen set line-count=50

make it crash again, capture the whole panic, and send it to support along with the supout.rif file
With great knowledge comes great responsibility, because of ability to recognize id... incompetent people much faster.
 
MimiFleX
newbie
Topic Author
Posts: 49
Joined: Tue Jun 13, 2006 2:36 pm
Location: France

Re: Issues with MTU>1504 bytes on Intel NICs like 82575EB

Thu Jan 12, 2012 1:32 pm

> /system console screen set line-count=50
>
> make it crash again, capture the whole panic, and send it to support along with the supout.rif file
OK, I'll try this immediately.
supout has already been sent to support.
Last edited by MimiFleX on Thu Jan 12, 2012 1:38 pm, edited 1 time in total.
 
normis
MikroTik Support
Posts: 24188
Joined: Fri May 28, 2004 11:04 am
Location: Riga, Latvia

Re: Issues with MTU>1504 bytes on Intel NICs like 82575EB

Thu Jan 12, 2012 1:36 pm

> /system console screen set line-count=50
>
> make it crash again, capture the whole panic, and send it to support along with the supout.rif file

> Thanks. Funny, I didn't know that it was possible to dump the screen directly. Also funny that support never told me to do such a dump instead of sending pictures...

It doesn't dump anything. This command just decreases the font size, so that more lines fit on the screen and you can take a better photo.
No answer to your question? How to write posts
 
MimiFleX
newbie
Topic Author
Posts: 49
Joined: Tue Jun 13, 2006 2:36 pm
Location: France

Re: Issues with MTU>1504 bytes on Intel NICs like 82575EB

Thu Jan 12, 2012 1:45 pm

Is there a way to prevent the kernel from rebooting?
In my case the screen is immediately blanked by the BIOS. I have to record a video to catch the crash... I hope I'll be able to extract a readable still image from it.
 
normis
MikroTik Support
Posts: 24188
Joined: Fri May 28, 2004 11:04 am
Location: Riga, Latvia

Re: Issues with MTU>1504 bytes on Intel NICs like 82575EB

Thu Jan 12, 2012 1:52 pm

Disable the watchdog in RouterOS. The watchdog is what reboots the router after a crash.
 
MimiFleX
newbie
Topic Author
Posts: 49
Joined: Tue Jun 13, 2006 2:36 pm
Location: France

Re: Issues with MTU>1504 bytes on Intel NICs like 82575EB

Thu Jan 12, 2012 3:17 pm

Maris from support has replied that the bug comes from the Intel kernel module, and that I have to wait for a fix to land in the vanilla Linux kernel before it can be corrected in ROS... which can take a long time without any more clues.

> Disable the watchdog in RouterOS. The watchdog is what reboots the router after a crash.

OK normis, I'll try that. Thanks.

EDIT: disabling the watchdog timer does not help at all... the machine blanks and reboots.
 
MimiFleX
newbie
Topic Author
Posts: 49
Joined: Tue Jun 13, 2006 2:36 pm
Location: France

Re: Issues with MTU>1504 bytes on Intel NICs like 82575EB

Thu Jan 12, 2012 3:50 pm

Good to know: Intel released a new version of the igb kernel module a few days ago.
See : http://downloadcenter.intel.com/detail_ ... ldID=13663

Do you, MikroTik guys, plan to integrate this new version of the module quickly?

I'm still unable to catch the whole crash debug message :(

EDIT: It would be nice to be able to run:
echo 0 > /proc/sys/kernel/panic
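
For readers less familiar with the /proc/sys/kernel/panic setting mentioned above, here is a small Python sketch of how its value is interpreted on a generic Linux system. It assumes shell access and a current kernel, which RouterOS does not provide, so it only illustrates what the requested setting would do; the helper names are hypothetical.

```python
from pathlib import Path

# Generic Linux location of the panic timeout; not reachable from the RouterOS CLI.
PANIC_SYSCTL = Path("/proc/sys/kernel/panic")


def describe_panic_timeout(value: int) -> str:
    """Explain what the kernel does after a panic for a given kernel.panic value."""
    if value == 0:
        return "stay halted, panic message remains on screen"
    if value < 0:
        return "reboot immediately"
    return f"reboot after {value} seconds"


def read_panic_timeout(path: Path = PANIC_SYSCTL) -> int:
    """Read the current panic timeout from the given sysctl file."""
    return int(path.read_text().strip())
```

On a machine with shell access, `describe_panic_timeout(read_panic_timeout())` shows the current behaviour. A value of 0 is the one that would leave the full panic text on screen long enough to photograph.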
 
MimiFleX
newbie
Topic Author
Posts: 49
Joined: Tue Jun 13, 2006 2:36 pm
Location: France

Re: Issues with MTU>1504 bytes on Intel NICs like 82575EB

Thu Jan 12, 2012 4:47 pm

After several tries, I can't get anything better than the picture below, extracted from a 25 fps video:
crash3.png
 
normis
MikroTik Support
Posts: 24188
Joined: Fri May 28, 2004 11:04 am
Location: Riga, Latvia

Re: Issues with MTU>1504 bytes on Intel NICs like 82575EB

Fri Jan 13, 2012 12:54 pm

Send a full size image, or the original video - to support. Also include supout.rif file and full description of the issue.
 
MimiFleX
newbie
Topic Author
Posts: 49
Joined: Tue Jun 13, 2006 2:36 pm
Location: France

Re: Issues with MTU>1504 bytes on Intel NICs like 82575EB

Fri Jan 13, 2012 1:31 pm

> Send a full size image, or the original video - to support. Also include supout.rif file and full description of the issue.

Normis, as I said previously, two tickets are already open for this issue, with supout.rif files and screenshots. I'll send you the original video, but I'm sorry, I don't have a 100 fps video camera, so a clear frame is very difficult to extract. Could you send me a custom npk file with reboot-on-crash disabled (see my previous post)?

Ticket numbers are :
  • Ticket#2012011266000222
  • Ticket#2011072766000267
They have been handled by Maris on your side, and he came to the conclusion that MikroTik will never integrate third-party drivers, such as the 'official' Intel ones, that are not distributed with the vanilla Linux kernel. I can understand this position, but as you can see, this issue has lasted for six months now. So we have to do something, such as contacting the igb kernel module maintainer and giving him some indications about the problem. I can't do this alone, as I don't have your source code to run a debugger on.

Maris also told me that you have bought such an Intel NIC for your lab, so you should be able to get more details than I can.
 
2400baud
newbie
Posts: 28
Joined: Tue Nov 15, 2011 1:04 am

Re: Issues with MTU>1504 bytes on Intel NICs like 82575EB

Sun Jan 15, 2012 11:48 pm

Intel doesn't make much of an effort to sync their drivers to upstream Linux kernel source.
What's in the Linux kernel is basically "enough to get you up and running".
Note that many distros use the drivers from Intel directly rather than from the Linux kernel.
I don't have nearly enough data to say, but just from the Oops, might it be:

http://www.mail-archive.com/e1000-devel ... 04584.html
 
MimiFleX
newbie
Topic Author
Posts: 49
Joined: Tue Jun 13, 2006 2:36 pm
Location: France

Re: Issues with MTU>1504 bytes on Intel NICs like 82575EB

Mon Jan 16, 2012 12:17 pm

As I'm able to reproduce the L2TP crash in our lab on demand, the only thing I need is a kernel with the auto-reboot-on-crash feature disabled, in order to:
  • See whether the crash always occurs in the same function
  • See whether the crash is the same regardless of the MTU value
  • Get a clean log, so that I can dig into the kernel source code and try to understand what could happen with a >1504-byte frame.
Moreover, I insist: the crash does not occur on every packet, but one thing is sure, L2TP doesn't work AT ALL when MTU > 1504. The logs show that the L2TP packet is dropped as invalid, while capturing it with the packet sniffer and opening the capture in Wireshark shows that the packet is valid. Thus:
  • The bug doesn't affect the packet sniffer at all, so I think the driver is handling the raw data correctly.
  • The bug affects L2TP and EoIP, so I think the bug could be somewhere in a common part of the code used by both... perhaps caused by some weird meta-information stored in the sk_buff by the driver.
The bug report found by 2400baud provides a patch at the PPTP level, not at the driver level... but that might not be the same bug.
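
To cross-check a captured packet against the "received invalid packet" log line without relying on Wireshark alone, the L2TP header can be decoded by hand. The sketch below is a minimal L2TPv2 header parser written from RFC 2661; it is not how RouterOS validates packets (that code is not public), and the class and function names are hypothetical. It expects the UDP payload of a packet to or from port 1701.

```python
import struct
from dataclasses import dataclass
from typing import Optional

# Flag bits in the first 16-bit word of an L2TPv2 header (RFC 2661, section 3.1).
FLAG_T = 0x8000        # 1 = control message, 0 = data message
FLAG_L = 0x4000        # Length field present
FLAG_S = 0x0800        # Ns and Nr fields present
FLAG_O = 0x0200        # Offset Size field present
VERSION_MASK = 0x000F  # protocol version, must be 2


@dataclass
class L2tpHeader:
    is_control: bool
    length: Optional[int]
    tunnel_id: int
    session_id: int
    ns: Optional[int]
    nr: Optional[int]
    header_len: int  # bytes consumed by the header, including offset padding


def _u16(buf: bytes, pos: int) -> int:
    """Read a big-endian 16-bit field, failing cleanly on a short buffer."""
    if pos + 2 > len(buf):
        raise ValueError(f"truncated header at byte {pos}")
    return struct.unpack_from("!H", buf, pos)[0]


def parse_l2tp_header(payload: bytes) -> L2tpHeader:
    """Parse the L2TPv2 header at the start of a UDP payload."""
    flags = _u16(payload, 0)
    pos = 2
    if (flags & VERSION_MASK) != 2:
        raise ValueError("not an L2TPv2 packet")
    is_control = bool(flags & FLAG_T)
    # Control messages must carry both the Length and the Ns/Nr fields.
    if is_control and not ((flags & FLAG_L) and (flags & FLAG_S)):
        raise ValueError("control message without Length or Ns/Nr")
    length = ns = nr = None
    if flags & FLAG_L:
        length = _u16(payload, pos)
        pos += 2
        if length > len(payload):
            raise ValueError("Length field exceeds captured payload")
    tunnel_id = _u16(payload, pos)
    session_id = _u16(payload, pos + 2)
    pos += 4
    if flags & FLAG_S:
        ns = _u16(payload, pos)
        nr = _u16(payload, pos + 2)
        pos += 4
    if flags & FLAG_O:
        pos += 2 + _u16(payload, pos)  # Offset Size field plus the padding it announces
    return L2tpHeader(is_control, length, tunnel_id, session_id, ns, nr, pos)
```

If a payload exported from the sniffer parses cleanly here while the router still logs it as invalid, that is consistent with the idea that the raw bytes are fine and the problem lies in how the packet is handed to the L2TP code.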
 
omidkosari
Trainer
Posts: 616
Joined: Fri Sep 01, 2006 4:18 pm
Location: Iran , Karaj

Re: Issues with MTU>1504 bytes on Intel NICs like 82575EB

Mon Jan 16, 2012 5:08 pm

You are a good researcher, like myself.
Keep going and don't give up. I wish you success.
MTCNA , MTCRE, MTCWE, Mikrotik Certified Trainer
 
martini
Member Candidate
Posts: 296
Joined: Tue Dec 21, 2004 12:13 am

Re: Issues with MTU>1504 bytes on Intel NICs like 82575EB

Mon Jan 16, 2012 5:28 pm

Just change the Ethernet card from Intel to Broadcom or Marvell.
MikroTik needs a long time to compile a new igb driver into their kernel )))
 
MimiFleX
newbie
Topic Author
Posts: 49
Joined: Tue Jun 13, 2006 2:36 pm
Location: France

Re: Issues with MTU>1504 bytes on Intel NICs like 82575EB

Mon Jan 16, 2012 7:48 pm

> Just change the Ethernet card from Intel to Broadcom or Marvell.
> MikroTik needs a long time to compile a new igb driver into their kernel )))

Impossible. The NIC chipsets are part of an industrial motherboard, and were selected for several reasons:
  • MTU up to 9160 bytes; in many cases we tested, Broadcom is limited to 1522 bytes of L2MTU => no VPLS over QinQ, for example.
  • 4 TX IRQs and 4 RX IRQs per port, to allow good performance on a quad-core CPU like the Xeon we are using
  • support for SFP ports
  • ...
 
MimiFleX
newbie
Topic Author
Posts: 49
Joined: Tue Jun 13, 2006 2:36 pm
Location: France

Re: Issues with MTU>1504 bytes on Intel NICs like 82575EB

Mon Jan 16, 2012 7:53 pm

>   • The bug doesn't affect the packet sniffer at all, so I think the driver is handling the raw data correctly.
>   • The bug affects L2TP and EoIP, so I think the bug could be somewhere in a common part of the code used by both... perhaps caused by some weird meta-information stored in the sk_buff by the driver.

I omitted to add that, when the L2TP server is set to 'disabled', RouterOS stops crashing, even though the L2TP client is still there and sending requests. So I think the crash is not caused by the driver directly, but by some part of the kernel L2TP code misbehaving in some cases.
 
MimiFleX
newbie
Topic Author
Posts: 49
Joined: Tue Jun 13, 2006 2:36 pm
Location: France

Re: Issues with MTU>1504 bytes on Intel NICs like 82575EB

Thu Jan 19, 2012 2:40 pm

Janis from support told me that I have to wait for RouterOS 6 for a new kernel and new intel drivers.
I hope that will come soon.
 
MimiFleX
newbie
Topic Author
Posts: 49
Joined: Tue Jun 13, 2006 2:36 pm
Location: France

Re: Issues with MTU>1504 bytes on Intel NICs like 82575EB

Fri May 11, 2012 12:10 pm

Can someone from MikroTik confirm that the new Intel drivers are available in the ROS 6 beta?
If yes, I'll be happy to try to reproduce the bugs seen on 5.x to see if that helps.
 
sergejs
MikroTik Support
Posts: 6615
Joined: Thu Mar 31, 2005 3:33 pm
Location: Riga, Latvia

Re: Issues with MTU>1504 bytes on Intel NICs like 82575EB

Mon May 14, 2012 2:23 pm

MimiFlex, you have to wait for the next 6.0beta3 release.
When it is released, there will be this line in the changelog:
*) upgraded drivers and kernel (to linux-3.3.5);
 
pospanko
Member Candidate
Posts: 272
Joined: Sun Dec 18, 2005 4:23 pm

Re: Issues with MTU>1504 bytes on Intel NICs like 82575EB

Tue May 15, 2012 9:32 am

> MimiFlex, you have to wait for the next 6.0beta3 release.
> When it is released, there will be this line in the changelog:
> *) upgraded drivers and kernel (to linux-3.3.5);

Small digression...
Are there any plans to move to kernel 3.4 when it comes out? It would bring synthetic drivers for Hyper-V...
Internet, Mikrotik & Network solutions
http://www.pro-ping.hr
 
MimiFleX
newbie
Topic Author
Posts: 49
Joined: Tue Jun 13, 2006 2:36 pm
Location: France

Re: Issues with MTU>1504 bytes on Intel NICs like 82575EB

Thu Aug 30, 2012 12:08 pm

Tried 6.0beta3 yesterday: no luck, many bugs, oopses and kernel panics, VLANs not working anymore... :)

To be continued....
 
MimiFleX
newbie
Topic Author
Posts: 49
Joined: Tue Jun 13, 2006 2:36 pm
Location: France

Re: Issues with MTU>1504 bytes on Intel NICs like 82575EB

Thu Sep 13, 2012 5:09 pm

Good news, my L2TP issue is solved on the latest 6.0rc1 build!
However, I can't yet tell whether the random EoIP crashes are still present without putting the router into a production environment.
