Cha0s
Forum Veteran
Topic Author
Posts: 999
Joined: Tue Oct 11, 2005 4:53 pm

Traffic Flow Octets Counter wrap

Thu Nov 26, 2015 7:25 pm

Hello,

I am using Traffic Flow with pmacct (nfacct) to do IP Accounting.

I've noticed that if a flow exceeds ~4 GBytes in less than a minute (which is my 'active flow timeout'), the exported flow's 'Octets' counter wraps around, losing a significant amount of the total data measured.

I believe the issue is that the Octets counter is a 32-bit unsigned integer: if the traffic exceeds that threshold (4294967296 octets), the exporter wraps the counter around without first flushing the flow to the collector (I am not sure how other vendors handle this).

This is quite serious since it results in very wrong traffic totals!
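The wrap can be illustrated in a couple of lines (a minimal sketch assuming a plain uint32 counter, as described above; this is not RouterOS source code):

```python
# Minimal sketch of a 32-bit unsigned octet counter wrapping (an
# assumption based on the observed behaviour, not RouterOS source).
UINT32_MOD = 2**32  # counter wraps modulo 4294967296

def reported_octets(actual_octets: int) -> int:
    """Octets value the collector sees if the exporter's counter is uint32."""
    return actual_octets % UINT32_MOD

# ~60 s at ~1 Gbit: the real total exceeds 2**32, so ~4.3 GB vanish.
print(reported_octets(7_864_320_000))  # 3569352704
```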

Here is my traffic flow configuration:
/ip traffic-flow
set active-flow-timeout=1m cache-entries=1k enabled=yes interfaces=sfp1
/ip traffic-flow target
add dst-address=X.X.X.X v9-template-refresh=60 v9-template-timeout=1m
And here are a couple of flow captures from wireshark.
Flow 3
    [Duration: 59.590000000 seconds (switched)]
    Packets: 5700194
    Octets: 4255323704
    InputInt: 16
    OutputInt: 0
    SrcAddr: 31.X.X.254
    DstAddr: 185.X.X.254
    Protocol: UDP (17)
    IP ToS: 0x00
    SrcPort: 2043 (2043)
    DstPort: 2299 (2299)
    NextHop: 185.X.X.X
    DstMask: 0
    SrcMask: 0
    TCP Flags: 0x00
    Destination Mac Address: Routerbo_XX:XX:XX (d4:ca:6d:XX:XX:XX)
    Post Source Mac Address: 00:00:00_00:00:00 (00:00:00:00:00:00)
    Post NAT Source IPv4 Address: 31.X.X.254
    Post NAT Destination IPv4 Address: 185.X.X.254
    Post NAPT Source Transport Port: 0
    Post NAPT Destination Transport Port: 0
Flow 3
    [Duration: 59.590000000 seconds (switched)]
    Packets: 5532208
    Octets: 4003344704
    InputInt: 16
    OutputInt: 0
    SrcAddr: 31.X.X.254
    DstAddr: 185.X.X.254
    Protocol: UDP (17)
    IP ToS: 0x00
    SrcPort: 2043 (2043)
    DstPort: 2299 (2299)
    NextHop: 185.X.X.X
    DstMask: 0
    SrcMask: 0
    TCP Flags: 0x00
    Destination Mac Address: Routerbo_XX:XX:XX (d4:ca:6d:XX:XX:XX)
    Post Source Mac Address: 00:00:00_00:00:00 (00:00:00:00:00:00)
    Post NAT Source IPv4 Address: 31.X.X.254
    Post NAT Destination IPv4 Address: 185.X.X.254
    Post NAPT Source Transport Port: 0
    Post NAPT Destination Transport Port: 0
At the time of those captures, a bandwidth test (UDP, 1500 bytes, 1 Gbit, receive) had been running for quite some time.
So, running at 1 Gbit for 60 seconds (the active flow timeout), it should have measured at least ~7864320000 octets (~7.3 GB).

If I reduce the bandwidth test to 460 Mbit, the exported flows seem to report the traffic properly, since the Octets counter does not exceed the 32-bit unsigned maximum.
Though I see quite a lot of overhead and I wonder why that is.
At 460 Mbit sustained traffic, in 60 seconds it should measure ~3617587200 octets (=3.36 GB).
But instead it measured 4269160500 (=3.9 GB).
I am not sure where the extra ~600 MB came from.
Flow 6
    [Duration: 59.590000000 seconds (switched)]
    Packets: 2846107
    Octets: 4269160500
    InputInt: 16
    OutputInt: 0
    SrcAddr: 31.X.X.254
    DstAddr: 185.X.X.254
    Protocol: UDP (17)
    IP ToS: 0x00
    SrcPort: 2058 (2058)
    DstPort: 2314 (2314)
    NextHop: 185.X.X.X
    DstMask: 0
    SrcMask: 0
    TCP Flags: 0x00
    Destination Mac Address: Routerbo_0d:95:72 (d4:ca:6d:XX:XX:XX)
    Post Source Mac Address: 00:00:00_00:00:00 (00:00:00:00:00:00)
    Post NAT Source IPv4 Address: 31.X.X.254
    Post NAT Destination IPv4 Address: 185.X.X.254
    Post NAPT Source Transport Port: 0
    Post NAPT Destination Transport Port: 0
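For reference, the expected-octets arithmetic in this post can be reproduced as follows (it assumes the bandwidth-test figures count 1 Mbit as 2^20 bits, which is what matches the numbers quoted above):

```python
# Expected octets for a sustained bandwidth test (assumes 1 Mbit = 2**20
# bits, which reproduces the figures quoted in this post).
def expected_octets(mbit: float, seconds: float) -> int:
    return int(mbit * 2**20 / 8 * seconds)

print(expected_octets(460, 60))   # 3617587200 (~3.36 GB)
print(expected_octets(1000, 60))  # 7864320000 - already past the uint32 max
```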
But if I increase the bandwidth test to 480 Mbit, for example, the exported flow's counter wraps around, losing a significant amount of data (i.e. ~4 GBytes):
Flow 3
    [Duration: 59.590000000 seconds (switched)]
    Packets: 2865308
    Octets: 2994704 <-- Only 2.8MB?! Even with 64byte packets, based on the measured packets above, it should have measured > 174MBytes of data!
    InputInt: 16
    OutputInt: 0
    SrcAddr: 31.X.X.254
    DstAddr: 185.X.X.254
    Protocol: UDP (17)
    IP ToS: 0x00
    SrcPort: 2055 (2055)
    DstPort: 2311 (2311)
    NextHop: 185.X.X.X
    DstMask: 0
    SrcMask: 0
    TCP Flags: 0x00
    Destination Mac Address: Routerbo_0d:95:72 (d4:ca:6d:XX:XX:XX)
    Post Source Mac Address: 00:00:00_00:00:00 (00:00:00:00:00:00)
    Post NAT Source IPv4 Address: 31.X.X.254
    Post NAT Destination IPv4 Address: 185.X.X.254
    Post NAPT Source Transport Port: 0
    Post NAPT Destination Transport Port: 0

The above tests were made on a CCR1036-8G-2S+ running version 6.32.1 (I cannot upgrade since this is a production system).

Doing the same tests on a x86 installation (running 6.29 - also cannot upgrade because it's in production), the results are even worse!
There it appears that the Octets counter wraps around at 2147483647, which suggests that either in versions < 6.32.1 or in non-Tilera builds the Octets counter is a 32-bit signed integer.

The whole situation is much the same as when you monitor a Gbit interface with SNMP v1 (32-bit counters).
The solution there is simple: use SNMP v2, which supports 64-bit counters.
But I cannot find any solution for NetFlow.
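One collector-side workaround I can imagine (my own heuristic sketch, not an existing pmacct option): the flow record also carries a packet count, so an impossibly small Octets value can be detected and multiples of 2^32 added back, as long as a sane lower bound on bytes per packet holds:

```python
# Heuristic wrap correction on the collector side (my own sketch, not a
# pmacct feature): if Octets is impossibly small for the reported packet
# count, assume the uint32 counter wrapped and add 2**32 back until the
# value is plausible again.
UINT32_MOD = 2**32
MIN_PACKET_BYTES = 46  # conservative lower bound per packet

def unwrap_octets(octets: int, packets: int) -> int:
    while octets < packets * MIN_PACKET_BYTES:
        octets += UINT32_MOD
    return octets

# The broken flow from the 480 Mbit test: 2994704 + 2**32 = 4297962000,
# which is exactly 2865308 packets x 1500 bytes.
print(unwrap_octets(2_994_704, 2_865_308))  # 4297962000
```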

Can anyone else confirm this issue?
Does anyone know a workaround for it?
Is this a limitation of the NetFlow protocol or a bug in RouterOS?
How do other vendors handle this (I don't have any other equipment at the moment to test this out)?

Thanks.
 
Cha0s
Forum Veteran
Topic Author
Posts: 999
Joined: Tue Oct 11, 2005 4:53 pm

Re: Traffic Flow Octets Counter wrap

Thu Nov 26, 2015 11:40 pm

Looking at Cisco's documentation on NetFlow v9, it mentions that the bytes counter is 32-bit by default, but the size is configurable, and it suggests using 64-bit on core routers etc.

http://www.cisco.com/en/US/technologies ... a3db9.html
In some cases the size of a field type is fixed by definition, for example PROTOCOL, or IPV4_SRC_ADDR. However in other cases they are defined as a variant type. This improves the memory efficiency in the collector and reduces the network bandwidth requirement between the Exporter and the Collector. As an example, in the case IN_BYTES, on an access router it might be sufficient to use a 32 bit counter (N = 4), on a core router a 64 bit counter (N = 8) would be required.
All counters and counter-like objects are unsigned integers of size N * 8 bits.
So the protocol itself can support 64-bit counters. It just seems that MikroTik's v9 template uses 32-bit counters.

I just confirmed that by capturing the data template in wireshark.
FlowSet 1 [id=0] (Data Template): 256,257
    FlowSet Id: Data Template (V9) (0)
    FlowSet Length: 184
    Template (Id = 256, Count = 22)
        Template Id: 256
        Field Count: 22
        Field (1/22): LAST_SWITCHED
        Field (2/22): FIRST_SWITCHED
        Field (3/22): PKTS
        Field (4/22): BYTES
            Type: BYTES (1)
            Length: 4
        Field (5/22): INPUT_SNMP
        Field (6/22): OUTPUT_SNMP
        Field (7/22): IP_SRC_ADDR
        Field (8/22): IP_DST_ADDR
        Field (9/22): PROTOCOL
        Field (10/22): IP_TOS
        Field (11/22): L4_SRC_PORT
        Field (12/22): L4_DST_PORT
        Field (13/22): IP_NEXT_HOP
        Field (14/22): DST_MASK
        Field (15/22): SRC_MASK
        Field (16/22): TCP_FLAGS
        Field (17/22): DESTINATION_MAC
        Field (18/22): SOURCE_MAC
        Field (19/22): postNATSourceIPv4Address
        Field (20/22): postNATDestinationIPv4Address
        Field (21/22): postNAPTSourceTransportPort
        Field (22/22): postNAPTDestinationTransportPort
    Template (Id = 257, Count = 21)
        Template Id: 257
        Field Count: 21
        Field (1/21): IP_PROTOCOL_VERSION
        Field (2/21): IPV6_SRC_ADDR
        Field (3/21): IPV6_SRC_MASK
        Field (4/21): INPUT_SNMP
        Field (5/21): IPV6_DST_ADDR
        Field (6/21): IPV6_DST_MASK
        Field (7/21): OUTPUT_SNMP
        Field (8/21): IPV6_NEXT_HOP
        Field (9/21): PROTOCOL
        Field (10/21): TCP_FLAGS
        Field (11/21): IP_TOS
        Field (12/21): L4_SRC_PORT
        Field (13/21): L4_DST_PORT
        Field (14/21): FLOW_LABEL
        Field (15/21): IPV6_OPTION_HEADERS
        Field (16/21): LAST_SWITCHED
        Field (17/21): FIRST_SWITCHED
        Field (18/21): BYTES
            Type: BYTES (1)
            Length: 4
        Field (19/21): PKTS
        Field (20/21): DESTINATION_MAC
        Field (21/21): SOURCE_MAC
The BYTES fields have length 4 in both templates.

So I guess the fix is rather easy by changing the template.
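On the wire a v9 template advertises each field as a 16-bit type plus a 16-bit length (per RFC 3954, where field type 1 is IN_BYTES), so the change really is just one length value in the template:

```python
# How a v9 template advertises a field: 16-bit type + 16-bit length,
# big-endian, per RFC 3954. Field type 1 is IN_BYTES.
import struct

IN_BYTES = 1

field_32bit = struct.pack("!HH", IN_BYTES, 4)  # what RouterOS currently exports
field_64bit = struct.pack("!HH", IN_BYTES, 8)  # what a 64-bit counter would need

print(field_32bit.hex())  # 00010004
print(field_64bit.hex())  # 00010008
```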

MikroTik, can you confirm the issue and release a fix please?

Thanks.
 
Cha0s
Forum Veteran
Topic Author
Posts: 999
Joined: Tue Oct 11, 2005 4:53 pm

Re: Traffic Flow Octets Counter wrap

Sun Nov 29, 2015 4:33 pm

Does anyone else experience this issue or is it just me?
 
janisk
MikroTik Support
Posts: 6283
Joined: Tue Feb 14, 2006 9:46 am
Location: Riga, Latvia

Re: Traffic Flow Octets Counter wrap

Thu May 19, 2016 1:51 pm

noted, will look into this.
 
Cha0s
Forum Veteran
Topic Author
Posts: 999
Joined: Tue Oct 11, 2005 4:53 pm

Re: Traffic Flow Octets Counter wrap

Fri May 20, 2016 1:47 am

Great! Thanks :)
 
GeberNehmer
just joined
Posts: 3
Joined: Sat Nov 26, 2016 9:23 pm

Re: Traffic Flow Octets Counter wrap

Sat Feb 01, 2020 9:49 pm

Hi,

Just noticed this while testing NetFlow on my RB4011, so there is still a 32-bit limit.

Are there any plans to increase the size of the counter?
 
pe1chl
Forum Guru
Posts: 6787
Joined: Mon Jun 08, 2015 12:09 pm

Re: Traffic Flow Octets Counter wrap

Tue Mar 10, 2020 1:07 pm

Unfortunately it still hasn't been fixed in version 6.46.x
I encountered this issue when, according to network traffic statistics, someone must have downloaded a very large file, but the corresponding record could not be found in the traffic flow export.
After searching, it appears there is a record with a large count, but it has wrapped the 32-bit limit several times, even though the value in the IPFIX template is now 8 bytes (64 bits).

I reduced the active flow timeout to a low value so that this cannot occur (fortunately the inside LAN interfaces are 100 Mbit/s...) and now indeed I do see the correct results.

I think the internal counter should be increased to a 64-bit variable (as the export already has that field size), or, when that is not practical, the flow should be exported when the counter nears the maximum value (independently of the active flow timeout).
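The timeout workaround can be quantified with a back-of-the-envelope sketch (assumes a single flow at full line rate, 1 Mbit = 10^6 bits):

```python
# Longest active-flow-timeout that keeps a single full-rate flow under
# the 2**32-octet wrap point (back-of-the-envelope sketch).
def max_safe_timeout(link_mbps: float) -> float:
    """Seconds until a flow at line rate accumulates 2**32 octets."""
    return 2**32 / (link_mbps * 1_000_000 / 8)

print(round(max_safe_timeout(100)))   # 344 - a 1m timeout is safe on 100 Mbit
print(round(max_safe_timeout(1000)))  # 34  - gigabit needs well under 1m
```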
