on 09-26-2012 12:06 PM
This document describes the ASR9000 netflow architecture.
It provides a basic netflow configuration, lists the scale parameters, and explains how netflow is implemented on the ASR9000 in IOS XR.
The basic configuration for netflow consists of:
- a flow monitor map
- a flow exporter map
- a sampler map
The flow monitor map pulls in the exporter map.
On the interface where you want to enable netflow, you pull in the monitor map and the sampler map.
flow monitor-map FM
 record ipv4
 exporter FE
 cache permanent
 cache entries 10000
 ! cache timeouts define how frequently records are exported; max of 1M entries per LC
 cache timeout active 2
 cache timeout inactive 2
!
flow exporter-map FE
 version v9
  ! these two options export the interface table and the sampler table to the
  ! flow collector so it can sync indexes to names etc.
  options interface-table timeout 120
  options sampler-table timeout 120
 !
 transport udp 1963
 destination 12.24.39.1
 source <interfacename>
!
sampler-map FS
 random 1 out-of 1
!
interface GigabitEthernet0/0/0/20
 description Test PW to Adtech G4
 ipv4 address 16.1.2.1 255.255.255.0
 flow ipv4 monitor FM sampler FS ingress
!
- Trident: 100 kpps per LC (total, that is in+out combined); Typhoon: 200 kpps per LC
- 1M records per LC (default cache size is 64k)
- 50K flows per second export per LC
- Sample intervals from 1:1 to 1:64k
- Up to 8 exporters per map, VRF aware
Netflow is not hardware accelerated in the ASR9000 or XR for that matter, but it is distributed.
What that means is that each linecard individually runs netflow by itself.
Resources are shared between the interfaces and NPUs on the linecard.
When netflow is enabled on one interface, on one NPU, on one linecard, the full rate is available to that interface: 100 kpps for Trident and 200 kpps for Typhoon.
When you enable two interfaces on the same NPU on the same LC, both interfaces share the 100 kpps (Trident) or 200 kpps (Typhoon).
When you enable two interfaces on two different NPUs, the NPUs share the total rate of 100 kpps/200 kpps between them, giving each NPU 50 kpps or 100 kpps depending on the LC type.
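The sharing rule above can be sketched in a few lines of Python. This is only an illustration: it assumes the per-LC budget is split evenly across the NPUs that have netflow enabled, as the text describes, and the names `LC_BUDGET_PPS` and `per_np_rate_pps` are made up for this example.

```python
# Per-LC netflow punt budget in packets per second, as quoted in the text.
LC_BUDGET_PPS = {"trident": 100_000, "typhoon": 200_000}


def per_np_rate_pps(lc_type: str, nps_with_netflow: int) -> int:
    """Rate available to each NPU, assuming an even split of the LC budget."""
    if nps_with_netflow < 1:
        raise ValueError("at least one NPU must have netflow enabled")
    return LC_BUDGET_PPS[lc_type] // nps_with_netflow


if __name__ == "__main__":
    for nps in (1, 2, 3, 4):
        print(nps, per_np_rate_pps("trident", nps), per_np_rate_pps("typhoon", nps))
```

All interfaces on the same NPU then share whatever rate that NPU is given.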
In IOS-XR platforms, it is the LC processor memory that holds the netflow cache.
The netflow cache is a section of memory that stores flow entries before they are exported to an external collector.
The ‘nfsvr’ process running on the linecard manages the netflow cache.
Memory usage
The memory used can be monitored via this command:
show flow monitor FM cache internal location 0/0/CPU0
...
Memory used: 8127060
Total memory used can be verified by checking the process memory util of "NFSVR"
show processes memory location 0/0/CPU0 | inc nfsvr
257 139264 65536 73728 12812288 nfsvr
With the default cache size of 64k (65535) entries, the memory used is about 8 MB for ipv4 and MPLS and about 11 MB for ipv6.
With the maximum cache size of 1M entries, the memory used is about 116 MB for ipv4 and MPLS and about 150 MB for ipv6.
These figures are per flow monitor: if ‘n’ ipv4 flow monitors are all configured with the maximum 1M entries, the memory used is n x 116 MB.
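As a rough planning aid, the approximate figures above can be added up per flow monitor. The sketch below is not a Cisco tool: the table only contains the numbers quoted in this section, and the names `APPROX_MB` and `estimate_cache_memory_mb` are hypothetical.

```python
# Approximate cache memory per flow monitor in MB, from the figures in the text.
# Keys are (record type, cache size), where "default" is 64k entries and
# "max" is 1M entries.
APPROX_MB = {
    ("ipv4", "default"): 8,
    ("mpls", "default"): 8,
    ("ipv6", "default"): 11,
    ("ipv4", "max"): 116,
    ("mpls", "max"): 116,
    ("ipv6", "max"): 150,
}


def estimate_cache_memory_mb(monitors):
    """Sum the approximate memory for a list of (record type, cache size) pairs."""
    return sum(APPROX_MB[(record, size)] for record, size in monitors)


if __name__ == "__main__":
    # Three ipv4 monitors at 1M entries each: n x 116 MB with n = 3.
    print(estimate_cache_memory_mb([("ipv4", "max")] * 3))
```

Compare the result with the value reported for nfsvr by show processes memory on the linecard.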
Configuration to set the cache entries to ten thousand looks as follows:
flow monitor-map FM
 cache entries 10000
95% of the configured cache size is the high watermark threshold. Once this threshold is reached, certain flows (longest idle ones etc) are aggressively timed out. XR 4.1.1 attempts to expire 15% of the flows.
The show flow monitor FM cache internal location 0/0/cpu0 command will give you the data on that:
Cache summary for Flow Monitor :
Cache size: 65535
Current entries: 17
High Watermark: 62258
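The High Watermark value in this output follows from the 95% rule. A minimal sketch of the arithmetic, assuming the result is rounded down (which matches the values shown in this document); the function name is made up for the example:

```python
def high_watermark(cache_entries: int) -> int:
    """95% of the configured cache size, rounded down (integer arithmetic)."""
    return cache_entries * 95 // 100


if __name__ == "__main__":
    print(high_watermark(65535))  # default cache size
    print(high_watermark(10000))  # cache entries 10000
```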
This syslog message means that we wanted to add more entries to the cache than it could hold. There are a few different causes and remediations:
- the cache size is too small; enlarging it lets the cache hold more entries
- the inactive timeout is too long, so entries are held in the cache too long and are not aged out fast enough
- the cache is the right size and records are exported adequately, but they are not getting out fast enough due to volume; in that case you can tune the rate limit for expiring cache entries via:
flow monitor <name> cache timeout rate-limit <time>
The permanent cache is very different from a normal cache and is useful for accounting or security monitoring. The permanent cache has a fixed size chosen by the user. Once the permanent cache is full, all new flows are dropped, but all flows already in the cache are continuously updated over time (i.e. similar to interface counters).
Note that the permanent cache uses a different template for the byte and packet counts.
With the permanent cache, fields 1 and 2 are not reported; fields 85 and 86 are used instead.
Fields 1 and 2 are “deltas”; 85 and 86 are "running counters".
In your collector you need to "teach" it that 1 and 85, and 2 and 86, are equivalent.
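If the collector only understands deltas, one way to "teach" it is to convert the running counters back into deltas on receipt. The Python sketch below illustrates the idea and is not taken from any real collector: the class name, the flow key layout, and the restart handling (a counter that goes backwards is treated as a fresh flow) are all assumptions.

```python
from typing import Dict, Tuple

# Hypothetical flow key: (src addr, dst addr, src port, dst port, protocol).
FlowKey = Tuple[str, str, int, int, str]


class RunningCounterTracker:
    """Turn running byte/packet counters (fields 85/86) into deltas (fields 1/2)."""

    def __init__(self) -> None:
        self._last: Dict[FlowKey, Tuple[int, int]] = {}

    def to_delta(self, key: FlowKey, run_bytes: int, run_pkts: int) -> Tuple[int, int]:
        prev_bytes, prev_pkts = self._last.get(key, (0, 0))
        self._last[key] = (run_bytes, run_pkts)
        if run_bytes < prev_bytes or run_pkts < prev_pkts:
            # Counters went backwards: assume the cache entry was recreated.
            return run_bytes, run_pkts
        return run_bytes - prev_bytes, run_pkts - prev_pkts


if __name__ == "__main__":
    tracker = RunningCounterTracker()
    key = ("17.1.1.2", "99.99.99.99", 3357, 3357, "udp")
    print(tracker.to_delta(key, 1000, 10))
    print(tracker.to_delta(key, 1600, 16))
```

The first export for a flow yields the running value itself; each later export yields only what was added since the previous one.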
All packets subject to sampling, regardless of whether they are forwarded or not, are subject to netflow.
This includes packets dropped by ACL or QoS policing, for instance!
A drop reason is reported to netflow:
- ACL deny
- unroutable
- policer drop
- WRED drop
- Bad IP header checksum
- TTL exceeded
- Bad total length
- uRPF drop
IPV4SrcAddr  IPV4DstAddr  L4SrcPort  L4DestPort  IPV4Prot  IPV4TOS  InputInterface  ForwardStatus  ByteCount  PacketCount  Dir
17.1.1.2     99.99.99.99  3357       3357        udp       0        Gi0/1/0/39      DropACLDeny    415396224  8654088      Ing
As described in the architecture section, the total sampling capability depends on the number of interfaces, and hence NPs, that have netflow enabled.
The per-NP rates work out as in this table:
# of NPs enabled for netflow | Policing rate per Trident NP (unidirectional) | Policing rate per Typhoon NP (unidirectional)
1                            | 100 kpps                                      | 200 kpps
2                            | 50 kpps                                       | 100 kpps
3                            | 33 kpps                                       | 66 kpps
4                            | 25 kpps                                       | 50 kpps
All packets that exceed this rate are dropped by the punt policer.
You can verify that with the show controllers np counters command.
show controllers np counters all
Node: 0/0/CPU0:
----------------------------------------------------------------
Show global stats counters for NP0, revision v2
Read 67 non-zero NP counters:
Offset Counter FrameValue Rate (pps)
-------------------------------------------------------------------------------
....
934 PUNT_NETFLOW 18089731973 6247
935 PUNT_NETFLOW_EXCD 6245 0
...
The _EXCD counter indicates that the policer rate has been exceeded.
This means that you likely have to increase your sampling interval.
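To get a feel for how far the sampling interval needs to go up, divide the packet rate seen on the NP by the policer rate that NP is given. The Python sketch below shows that arithmetic only; the function name is hypothetical, and it assumes that a 1:N sampler punts roughly 1/N of the packets.

```python
def min_sampling_interval(np_traffic_pps: int, np_policer_pps: int) -> int:
    """Smallest N for a 1:N sampler that keeps sampled packets within the policer rate."""
    if np_traffic_pps < 0 or np_policer_pps < 1:
        raise ValueError("rates must be positive")
    # Integer ceiling division; never go below 1:1.
    return max(1, (np_traffic_pps + np_policer_pps - 1) // np_policer_pps)


if __name__ == "__main__":
    # 1 Mpps through a single Typhoon NP that has the full 200 kpps available.
    print(min_sampling_interval(1_000_000, 200_000))
```

Use the combined rate of all netflow-enabled interfaces on the NP as the input, since they share the policer. The result is a lower bound, and the supported range is 1:1 to 1:64k.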
sh flow monitor FM cache format table include layer4 tcp-flags ipv4 sour dest prot tos count pack byte location 0/0/CPU0
Mon Apr 19 09:31:19.589 EDT
Cache summary for Flow Monitor FM:
Cache size: 10000
Current entries: 1
High Watermark: 9500
Flows added: 1
Flows not added: 0
Ager Polls: 580
- Active timeout 0
- Inactive timeout 0
- TCP FIN flag 0
- Watermark aged 0
- Emergency aged 0
- Counter wrap aged 0
- Total 0
Periodic export:
- Counter wrap 0
- TCP FIN flag 0
Flows exported 0
IPV4SrcAddr  IPV4DstAddr  IPV4Prot  IPV4TOS  L4TCPFlags  ByteCount  PacketCount
16.1.2.2     16.1.1.2     tcp       0        S|          4282560    71376
Matching entries: 1
Export occurs when data in the cache is removed which can occur in one of three ways.
The netflow exporter can be in a VRF, but it cannot export out of the management interface.
Here’s why. Netflow runs on the line card (LC interfaces and NP) and there is, by default, no forwarding between the LCs and the management Ethernet. This is because the management Ethernet is designated out of band by LPTS (Local Packet Transport Services). More detail is in the ASR9000 Local Packet Transport Services document here on the support forums.
Netflow records can be exported to any destination, whether or not it is local to the LC where netflow is running. For example, the LCs in slots 1 and 2 are running netflow and the export destination may be reachable via an interface on the LC in slot 3.
A total of 8 exporters per map is allowed.
.....
- DBNA
- The Cisco netflow MIB is not supported.
show flow exporter-map ..
show flow monitor-map ..
show sampler-map ..
show processes memory location <0/0/CPU0> | inc nfsvr
show flow monitor .. cache internal location <0/0/CPU0>
show flow exporter .. location <0/1/CPU0>
show flow platform producer statistics location <0/0/CPU0>
show flow platform nfea policer np <np_num> loc <node-id>
show controller np ports all location <0/Y/CPU0>
show controller np count np<number> loc <0/Y/CPU0>
Hi Xander,
I am wondering whether the Netflow configuration requirements have changed. I am observing and analyzing the netflow stats of an ASR9006 running Cisco IOS XR Software, Version 4.3.4. The Netflow source is the IP address of Loopback1, the sampler is “1 of 10000”, and the traffic to the monitoring system is routed via the mgmt interface. Currently only 1/10000 of the traffic (stacked protocol) is seen at the monitoring system. Is this caused by Netflow traffic being routed via the mgmt interface?
hi there,
routing out the mgmt interface is not supported, because XR blocks fabric forwarding through the mgmt interface. You can override that, but I won't recommend it.
If you set 1/10000 then obviously you won't see every (short lived) flow, but if the flow is in the cache it should get exported properly. You may want to check the flow exporter stats to see if records are getting dropped.
cheers!
xander
Hi Xander,
At this point I don’t find records being dropped, but the monitoring system doesn’t support sampling rates. What performance increase should I expect when setting the sampler to “1 out of 1”?
thanks,
elviragal
I would highly recommend a flow collector that can handle sampling, as 1:1 at 10G rates will result in a lot of flow records that will overwhelm your collector at some point.
Running 1:1 on the a9k is not an issue as long as your rates are within the set limits, that is 200k pps for typhoon when only one npu is running netflow.
If the rate exceeds that value, the NPU will replicate the packet for that 1:1 sampling, but it can't punt it to the LC CPU because LPTS will limit it. That will also cause inaccuracies in your flow data.
Same deal with the LC export: if the flow records are sent and not received because your collector can't handle it, your info will also be inaccurate.
So it can be done, but there are some dependencies.
cheers
xander
Hi Xander
Two questions:
We see enormous figures in the "Flows not added" counter in "show flow monitor cache..." In one scenario it is on a 1:1 sampling on a 10G interface (our own fault ;-) but the count is much, much bigger than the count in the punt-policer (also "not added" increases even at low packet rates, where the punt policer does not kick in)
The other scenario (and different customer) where we see high "not added" numbers is in a 1:00 sampling on a 10G interface, with a pps at around 2500.
What can be causing this? - "Emergency Exports" is zero in both scenarios.
The other Q:
Is there any reason why we don't have custom flow records like in FNF on IOS - or any other form of flow aggregation on the A9K?
If it's in the works, how far down the road?
/Mikkel
hi mikkel, the flows not added counter increments when the cache nears its maximum size and flows cannot be added because entries were not freed in time by aggressive export.
I recommend tuning the cache size to mitigate this.
FNF is coming in, I want to say, 54 (need to reconfirm with the marketing folks); it is definitely on the sw roadmap.
cheers!
xander
Hi Xander,
We ran into an issue using the address defined on the loopback interface as exporter source while using the mgmt interface for day-to-day operational management.
The problem is that our flow collector also uses the source address of the flow packets to collect additional data from the router via snmp (like interface descriptions). The latter is only allowed via the mgmt interface. Using different addresses for the same device for snmp and data collection is not supported by the collector.
So the question is: is there a way around this (besides moving all mgmt stuff out of the mgmt vrf into the main routing plane)?
regards, Andre
hi andre,
you can try to make this happen with
RP/0/RSP0/CPU0:A9K-BNG(config)#rp mgmtethernet forwarding
and defining a loopback on your system, setting the exporter source to that, and letting routing take care of reaching the netflow server via the mgmt ethernet.
I recommend against it, though, because this command effectively takes the batteries out of the smoke detector: it allows full routing between fabric/linecard interfaces and the mgmt ethernet, and defeats the purpose of what LPTS out of band tried to provide you :)
cheers!
xander
Hi Xander,
Under IOS there is an "export-stats" template option (v9) configured with: "ip flow-export template options export-stats". I was expecting the IOS-XR command for this somewhere under "flow exporter-map <map> => version v9". But unfortunately I was not able to find it.
So is there a way to export the statistics on IOS-XR, and if so, how?
regards, Andre
hi andre,
the statistics should be part of the template; note that using a permanent cache will automatically change the options template also.
you can configure how often it should be exported via flow monitor-map <name> version v9 template <...>
cheers!
xander
Xander, is netflow supported on BVI interface using Trident cards?
unfortunately not.
you could enable it on your core interface(s) to verify the traffic through and from BVI to and from core, but not between EFPs like that.
xander
Is there any way you can think of to enable it on the core and still filter it to BVI ingress/egress traffic? I only want relevant traffic.
ah, you're hinting at FNF (flexible netflow), the ability to use an ACL to define what is interesting traffic to be sampled.
this is not there today, XR6.
so right now, you'd have to sample on everything and then on your collector filter out the stuff you're not interested in...
so FNF is your ultimate solution for that, which is coming.
xander
Hello Alexander,
How can I enable netflow for many interfaces at once?
I have a PE router with a lot of subinterfaces on a Bundle-Ether. This Bundle-Ether is part of an MC-LAG configuration with double tags. The number of subinterfaces keeps increasing. Periodically I need to see the ingress traffic to the Bundle-Ether from all subinterfaces.