mhugo
Member Candidate
Topic Author
Posts: 179
Joined: Mon Sep 19, 2005 11:48 am

General packetloss in ROS7 and multiple full BGP

Thu Jun 02, 2022 10:00 pm

Hi,

We are experiencing a small amount of packet loss per node in our network, which runs pure OSPF/BGP over dark fiber.

I believe this has to do with BGP updates affecting the routed traffic. The boxes (2216s, 1072s and 2004s) are each only carrying a couple of gigabits of traffic.

We have 2 million routes in the tables from various sources (transit and peering), with 20 BGP speakers and 80 OSPF speakers, and close to 1000 LSAs internally in OSPF.

Ping loss is approx. 0.2-0.3% per hop, and since we have a lot of hops it adds up.
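
To put a number on "it adds up": a minimal sketch, assuming loss on each hop is independent, so that end-to-end loss over n hops is 1 - (1 - p)^n. The hop counts below are illustrative assumptions, not figures from this network.

```python
def cumulative_loss(per_hop_loss: float, hops: int) -> float:
    """Probability that a packet is dropped somewhere along `hops` hops.

    Assumes every hop drops packets independently with the same probability.
    """
    if not 0.0 <= per_hop_loss <= 1.0:
        raise ValueError("per_hop_loss must be between 0 and 1")
    if hops < 0:
        raise ValueError("hops must not be negative")
    # A packet survives only if it survives every single hop.
    return 1.0 - (1.0 - per_hop_loss) ** hops


if __name__ == "__main__":
    # Hop counts are made up for illustration; 0.2% and 0.3% come from the post.
    for hops in (1, 5, 10):
        for p in (0.002, 0.003):
            total = cumulative_loss(p, hops)
            print(f"{hops:2d} hops at {p:.1%} per hop: {total:.2%} end to end")
```

For small per-hop values the result is close to simply multiplying the per-hop loss by the number of hops.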

Previously I thought this was related to the 2004s, but we now see it as well after replacing some of them with 2216s to get rid of the issue.

Is anyone else seeing something similar?

/Mikael
 
mhugo
Member Candidate
Topic Author
Posts: 179
Joined: Mon Sep 19, 2005 11:48 am

Re: General packetloss in ROS7 and multiple full BGP

Fri Jun 10, 2022 12:37 pm

I have gotten the packet loss to go away by removing some of the BGP feeds; the routing table is now at 1.3 million routes.

MT has a ticket and cannot reproduce the problem, but I have offered access to non-production routers and BGP feeds.

Clearly BGP in ROS7 needs some fine-tuning in how it handles bigger BGP tables.
 
StubArea51
Trainer
Posts: 1739
Joined: Fri Aug 10, 2012 6:46 am
Location: stubarea51.net

Re: General packetloss in ROS7 and multiple full BGP

Fri Jun 10, 2022 5:29 pm

Are you using hardware offload on the CCR2216?
 
mhugo
Member Candidate
Topic Author
Posts: 179
Joined: Mon Sep 19, 2005 11:48 am

Re: General packetloss in ROS7 and multiple full BGP

Fri Jun 10, 2022 8:21 pm

[quote]
Are you using hardware offload on the CCR2216?
[/quote]
No, I lose connectivity for Loopback0 with L3HW, so I am waiting for MikroTik to fix it.

The 2216 is in our peering AS, where we only take in routes from peers and cover the rest with a 0.0.0.0/0 route from our other AS, which acts as upstream.

/ip/route/print count-only
676192
 
netzwerghh
Frequent Visitor
Posts: 74
Joined: Sun Aug 07, 2011 4:23 pm
Location: Hamburg, DE

Re: General packetloss in ROS7 and multiple full BGP

Mon Jun 13, 2022 1:32 pm

[quote]
Hi,

We are experiencing a small amount of packet loss per node in our network, which runs pure OSPF/BGP over dark fiber.

I believe this has to do with BGP updates affecting the routed traffic. The boxes (2216s, 1072s and 2004s) are each only carrying a couple of gigabits of traffic.

We have 2 million routes in the tables from various sources (transit and peering), with 20 BGP speakers and 80 OSPF speakers, and close to 1000 LSAs internally in OSPF.

Ping loss is approx. 0.2-0.3% per hop, and since we have a lot of hops it adds up.

Previously I thought this was related to the 2004s, but we now see it as well after replacing some of them with 2216s to get rid of the issue.

Is anyone else seeing something similar?

/Mikael
[/quote]
Hi Mikael,

We had also planned to replace our CCR2004-1G-12S+2XS units with CCR2216s because of the packet loss we are getting. We hoped it would go away with the more powerful CCR2216 and blamed the packet loss on the massive number of BGP sessions we have. Good (or bad) to know that the CCR2216 might not be the cure for this right now.
We operate a ring of five CCR2004s, all connected to two internal BIRD route reflectors. Two are connected to one upstream provider (full IPv4 and IPv6 tables), one is connected to another upstream provider and to AMS-IX (110 BGP connections), and another is connected to DE-CIX (180 BGP connections). The last one just serves our office and is only connected to the route reflectors. Our backbone is held together by OSPF.
What is really strange: on some days everything works fine. But then the DE-CIX router in particular sometimes starts spiking in CPU load, which leads to massive packet loss. VoIP simply does not work on those days, and I can't figure out the reason for it.
We already tried replacing one of the CCR2004s with a CCR2216, but that led to unreachable adjacent routers. So for now we are planning further experiments with the CCR2216, full tables and adjacent routers.
Another strange thing: when our DE-CIX router reaches a certain number of peers (the exact number varies slightly), the routing process seems to start behaving oddly. It constantly spits out route changes to our route reflectors, which then distribute them throughout our network. Our BGP traffic then goes from the normal 150 kbit/s up to a constant 2 Mbit/s just for announcements. At first we thought one of our peers was doing something malicious. But after enabling one peer after another, it turned out to be totally random which peer triggers the behaviour. It seems to be a function of the number of active peers and received routes, but I haven't yet figured out how to trigger the problem reliably. It is quite difficult to rebuild this in the lab because I do not have that number of real peers there. I could feed my lab with multiple full tables, but that does not simulate the massive number of full-table subsets with duplicate routes that you get at an IXP.
How did you configure affinity on your BGP sessions? I haven't really figured out exactly how it behaves. I tried using the same number for sessions that should run in the same process. Some sessions do run in the same process, but others run in a separate process, so it seems kind of random. The only setting that works as expected is "alone". Even "remote as" groups different remote ASNs together in one process.

Dennis
 
mhugo
Member Candidate
Topic Author
Posts: 179
Joined: Mon Sep 19, 2005 11:48 am

Re: General packetloss in ROS7 and multiple full BGP

Mon Jun 13, 2022 1:44 pm

[quote]
How did you configure affinity on your BGP sessions? I haven't really figured out exactly how it behaves. I tried using the same number for sessions that should run in the same process. Some sessions do run in the same process, but others run in a separate process, so it seems kind of random. The only setting that works as expected is "alone". Even "remote as" groups different remote ASNs together in one process.
[/quote]

Hi Dennis!

We are running all processes in "alone". The other settings gave worse results.
 
mrz
MikroTik Support
Posts: 7044
Joined: Wed Feb 07, 2007 12:45 pm
Location: Latvia

Re: General packetloss in ROS7 and multiple full BGP

Mon Jun 13, 2022 1:56 pm

BGP sessions themselves, or how affinity is set, cannot influence whether interfaces are dropping packets. These processes just receive the routing information, pick the best route and pass it to the FIB. So changing BGP affinities will not affect packet loss on the router.

Since there is no mention of which destinations the packet loss occurs to, I assume it might be to destinations for which BGP keeps receiving updates or withdrawals; in that case there might be brief packet loss to those specific destinations.
 
mhugo
Member Candidate
Topic Author
Posts: 179
Joined: Mon Sep 19, 2005 11:48 am

Re: General packetloss in ROS7 and multiple full BGP

Mon Jun 13, 2022 2:00 pm

It occurs to common destinations like 8.8.8.8.

Since the loss decreases when we deprioritize or remove some BGP peers, I have a feeling this has to do with destinations that have multiple paths of the same weight.

As stated in my ticket, this is probably hard to simulate in a lab, but I have extra 2004s and 2216s that I can set up with the same peers and give MT access to, if that helps you in any way.
 
netzwerghh
Frequent Visitor
Posts: 74
Joined: Sun Aug 07, 2011 4:23 pm
Location: Hamburg, DE

Re: General packetloss in ROS7 and multiple full BGP

Mon Jun 13, 2022 2:03 pm

[quote]
BGP sessions themselves, or how affinity is set, cannot influence whether interfaces are dropping packets. These processes just receive the routing information, pick the best route and pass it to the FIB. So changing BGP affinities will not affect packet loss on the router.

Since there is no mention of which destinations the packet loss occurs to, I assume it might be to destinations for which BGP keeps receiving updates or withdrawals; in that case there might be brief packet loss to those specific destinations.
[/quote]
Hi mrz,

That might explain why we get massive packet loss when the constant route spitting is triggered, because random routes are then being updated all the time. The interesting question now is what triggers this route spitting. I have already made a pcap of the BGP traffic, but I haven't found a pattern in the updates. Do you have any hint as to how I can help you find the reason for this? A supout? A pcap of the BGP stream? Special logging?
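
One possible way to look for a pattern in such a capture is to count how often each prefix and each peer shows up in the updates. This is only a rough sketch: it assumes the BGP UPDATE messages have already been decoded from the pcap into simple records by some other tool, and the `UpdateEvent` type, its fields and the sample peers and prefixes are all hypothetical.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class UpdateEvent(NamedTuple):
    """One announced or withdrawn prefix (hypothetical record layout)."""
    timestamp: float  # seconds since the start of the capture
    peer: str         # label for the sending peer
    prefix: str       # affected prefix in CIDR notation
    withdrawn: bool   # True for a withdrawal, False for an announcement


def top_churn(events: Iterable[UpdateEvent], field: str, top: int = 10) -> list[tuple[str, int]]:
    """Return the `top` most frequent values of `field` (e.g. "prefix" or "peer")."""
    counts = Counter(getattr(event, field) for event in events)
    return counts.most_common(top)


def flapping_prefixes(events: Iterable[UpdateEvent], min_events: int = 3) -> dict[str, int]:
    """Return the prefixes that changed at least `min_events` times in the capture."""
    counts = Counter(event.prefix for event in events)
    return {prefix: n for prefix, n in counts.items() if n >= min_events}


# Made-up sample using documentation prefixes and private-use ASNs.
SAMPLE = [
    UpdateEvent(0.0, "AS64500", "192.0.2.0/24", False),
    UpdateEvent(1.5, "AS64501", "192.0.2.0/24", True),
    UpdateEvent(2.0, "AS64500", "192.0.2.0/24", False),
    UpdateEvent(2.5, "AS64500", "198.51.100.0/24", False),
]

if __name__ == "__main__":
    print("busiest prefixes:", top_churn(SAMPLE, "prefix"))
    print("busiest peers:", top_churn(SAMPLE, "peer"))
    print("flapping:", flapping_prefixes(SAMPLE))
```

If a handful of prefixes dominate the counts regardless of which peer was enabled last, that would point at specific routes oscillating rather than at a particular peer.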

Dennis
 
netzwerghh
Frequent Visitor
Posts: 74
Joined: Sun Aug 07, 2011 4:23 pm
Location: Hamburg, DE

Re: General packetloss in ROS7 and multiple full BGP

Mon Jun 13, 2022 2:09 pm


[quote]
Hi Dennis!

We are running all processes in "alone". The other settings gave worse results.
[/quote]
That would make me uncomfortable: 100+ sessions could start eating up my 4 cores, which might lead to hold-timer timeouts. If that happens to a full-table stream, that would be bad. It might lead to even more packet loss, killing OSPF.
 
nocnetlatin
just joined
Posts: 7
Joined: Tue Aug 22, 2023 2:27 am
Location: Argentina

Re: General packetloss in ROS7 and multiple full BGP

Tue Aug 29, 2023 4:32 pm

[quote]
Are you using hardware offload on the CCR2216?
No, I lose connectivity for Loopback0 with L3HW, so I am waiting for MikroTik to fix it.

The 2216 is in our peering AS, where we only take in routes from peers and cover the rest with a 0.0.0.0/0 route from our other AS, which acts as upstream.

/ip/route/print count-only
676192
[/quote]
I am waiting for the same fix for the same problem: my 2216s (I have 4) fail when L3HW is active; they crash if I activate L3HW.
 
StubArea51
Trainer
Posts: 1739
Joined: Fri Aug 10, 2012 6:46 am
Location: stubarea51.net

Re: General packetloss in ROS7 and multiple full BGP

Fri Sep 01, 2023 12:40 pm

How many routes do you have?
 
nocnetlatin
just joined
Posts: 7
Joined: Tue Aug 22, 2023 2:27 am
Location: Argentina

Re: General packetloss in ROS7 and multiple full BGP

Fri Sep 01, 2023 7:24 pm

[quote]
How many routes do you have?
[/quote]
We handle 12k. We have now replaced the CCR2216 with a CRS504 and it works great: it handles the 12k routes with L3HW active and 110 Gbps of throughput.
