When it comes to monitoring a distributed network, to get a picture of the traffic flowing through the uplinks or on critical network segments, NetFlow-like technologies are usually the answer.
nProbe Pro/Enterprise and nProbe Cento are software probes that can be used to build versatile sensors able to export flow information in many different formats, including NetFlow v5/v9/IPFIX, Kafka, Elasticsearch, ClickHouse, MySQL, CSV files, etc. All this at very high speed. nProbe Pro/Enterprise has been designed for low/mid rate (1/10 Gbps) while nProbe Cento has been designed to run at high speed (today we consider 100 Gbit a high-speed link).
This holds regardless of the collector, which can be a third-party NetFlow collector or ntop's own collector, ntopng, which takes care of traffic visualization, augmentation, behavioral analysis, alerting, and a myriad of other functionalities. By combining nProbe Cento with ntopng it is possible to build a fully fledged network monitoring solution for 100 Gbit distributed networks that provides full visibility.
A frequent question that we get from those who want to use nProbe Cento at high speed is “What kind of hardware do I need to be able to process 100 Gbps at full rate?” With this post we want to provide some guidelines about hardware selection.
In contrast with what happens when running n2disk at high speed, where FPGA adapters able to operate in segment mode (like Napatech or Silicom/Fiberblaze) are mandatory to get the best dump performance, nProbe Cento does not really require expensive adapters. A 100 Gbit probe can be built using commodity ASIC adapters costing less than $1K. What is mandatory here is support for symmetric RSS. RSS is used to spread the traffic load across multiple CPU cores by means of multiple data streams: the physical interface is split into multiple logical interfaces, and traffic is distributed among them according to a hash function computed on packet headers. With symmetric RSS, both directions of a flow are delivered to the same stream. Using RSS to scale, in combination with PF_RING ZC (Zero-Copy) drivers delivering maximum capture performance, guarantees no packet loss at full 100 Gbit when processing flows.
For this reason the list of recommended adapters to be used in combination with nProbe Cento at 100 Gbit includes:
- NVIDIA/Mellanox ConnectX-5/6
- Intel E810
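To make the "symmetric" requirement more concrete, here is a minimal Python sketch of what a symmetric hash does: packets of the same flow end up on the same queue in both directions. This is an illustration only. The function name `symmetric_queue` is hypothetical, and CRC32 is used as a stand-in for the hash computed in hardware by the adapter (real adapters typically use a Toeplitz hash).

```python
import zlib
from ipaddress import ip_address

def symmetric_queue(src_ip, src_port, dst_ip, dst_port, num_queues=16):
    """Map a flow to a queue index so that both directions get the same queue."""
    a = (int(ip_address(src_ip)), src_port)
    b = (int(ip_address(dst_ip)), dst_port)
    # Order the two endpoints so that (A -> B) and (B -> A) build the same key.
    lo, hi = sorted((a, b))
    key = f"{lo[0]}:{lo[1]}-{hi[0]}:{hi[1]}".encode()
    # CRC32 is only a stand-in for the NIC hash function (assumption).
    return zlib.crc32(key) % num_queues

fwd = symmetric_queue("10.0.0.1", 51000, "192.168.1.10", 443)
rev = symmetric_queue("192.168.1.10", 443, "10.0.0.1", 51000)
print(fwd, rev, fwd == rev)
```

With a non-symmetric hash, the two directions of a connection could land on different queues, and therefore on different processing threads, which makes it hard to build bidirectional flows.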
Not all CPUs are alike. They differ in frequency, number of cores, cache size, cache levels, instruction set, etc. However, in our experience, a modern CPU (for example a Xeon Gold 6346 at 3 GHz or an AMD EPYC 9124) is usually able to handle more than 10 Mpps (million packets per second) per core. Considering the average Internet packet size, a 10 Gbit link usually carries 1-3 Mpps. In the worst case (minimum-size packets), a 10 Gbit link can carry up to 14.88 Mpps, and a 100 Gbit link ten times that, about 148.8 Mpps.
This means that in order to handle 100 Gbps in the worst case, at roughly 10 Mpps per core, we need a CPU with at least 16 cores at 3 GHz. Fewer cores may be sufficient on CPUs with a higher frequency and a large cache.
For instance, if we want to build an Intel-based system, we can use a Xeon Gold 6326 or 6346 or higher. If we want to build an AMD-based system, we can use an AMD EPYC 9124 or higher.
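The sizing above can be reproduced with a few lines of Python. This is a back-of-the-envelope sketch: it assumes minimum-size 64-byte Ethernet frames plus 20 bytes of per-frame overhead on the wire, and uses the 10 Mpps per core figure mentioned above as a round number. The function names are ours.

```python
import math

# Per-frame overhead on the wire: preamble (7) + start-of-frame delimiter (1)
# + inter-frame gap (12) bytes.
ETHERNET_OVERHEAD_BYTES = 20

def max_pps(link_gbps, frame_bytes=64):
    """Maximum packets per second on a link for a given frame size."""
    bits_on_wire = (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8
    return link_gbps * 1e9 / bits_on_wire

def cores_needed(link_gbps, pps_per_core=10e6, frame_bytes=64):
    """Cores required if each core sustains pps_per_core (assumed figure)."""
    return math.ceil(max_pps(link_gbps, frame_bytes) / pps_per_core)

print(f"10 Gbit worst case:  {max_pps(10) / 1e6:.2f} Mpps")   # 14.88 Mpps
print(f"100 Gbit worst case: {max_pps(100) / 1e6:.1f} Mpps")  # 148.8 Mpps
print(f"Cores at 10 Mpps/core: {cores_needed(100)}")          # 15
```

The result is 15 cores, so a 16-core CPU leaves a small margin and maps nicely to 16 RSS streams with one thread per core.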
The RAM configuration for optimal performance mainly depends on the CPU itself:
- Number of modules: this should match the number of memory channels supported by the CPU (check the CPU specs for this)
  - Intel Xeon Gold currently supports 8 memory channels
  - AMD EPYC supports 12 memory channels on most models
- Speed: select the highest speed supported by the CPU (check the CPU specs for this)
- Size: considering the minimum size per module (8-16 GB), picking the smallest available size usually works just fine (8x 8 GB = 64 GB is more than enough for nProbe Cento)
Configuring nProbe Cento is really simple. The actual options to be provided on the command line (or in the configuration file) may change depending on the working mode and export format, however on the capture side it is really straightforward. You should pay attention to 2 main options: the interface configuration (-i) and the CPU affinity (--processing-cores). Each RSS stream is specified as a separate logical interface, using the interface@stream syntax:
cento -i zc:eth1@0 -i zc:eth1@1 -i zc:eth1@2 -i zc:eth1@3 ...
You can also use a shortcut for this, which is convenient especially when running on 16+ RSS streams:
cento -i zc:eth1@[0-15]
If you are using an NVIDIA/Mellanox adapter, you can use a similar syntax:
cento -i mlx:mlx5_0@[0-15]
At this point, we just have to add the CPU affinity configuration, to make sure that nProbe Cento uses all the available cores by binding one thread per core (providing maximum scalability and overall performance). lstopo is a really useful tool for understanding the CPU topology, and it helps you select the right cores.
cento -i mlx:mlx5_0@[0-15] --processing-cores 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
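If you deploy several sensors with a different number of RSS streams, writing the core list by hand gets tedious. The following Python sketch builds the capture-side arguments shown above. The helper name `cento_capture_args` is hypothetical, and it assumes the cores you want are numbered contiguously starting from `first_core`, which may not match your CPU topology (check it with lstopo).

```python
def cento_capture_args(interface, num_queues, first_core=0):
    """Build the capture-side cento arguments: one RSS stream per core."""
    # Range shortcut for the RSS streams, e.g. [0-15] for 16 streams.
    queues = f"[0-{num_queues - 1}]"
    # One processing core per stream; contiguous numbering is an assumption.
    cores = ",".join(str(first_core + i) for i in range(num_queues))
    return ["cento", "-i", f"{interface}@{queues}", "--processing-cores", cores]

print(" ".join(cento_capture_args("mlx:mlx5_0", 16)))
```

With 16 streams on mlx:mlx5_0 this prints the same command line as above. The export options still have to be appended to the returned list.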
At this point you just have to add the options to control the export format.
Now you have all the ingredients to build your 100 Gbit sensor.