Best Practices for Efficiently Running ntopng


The default ntopng configuration is suitable for most of our users, who deploy it on a home network or small enterprise network (typically a /24 network) with a link speed <= 100 Mbit. This does NOT mean that ntopng cannot operate on faster/larger networks, but rather that in those environments it requires some extra configuration.

The first settings to modify are -x/-X (the maximum number of hosts and flows, respectively). Set them to double the maximum you expect on your network. For example, if you expect at most 35000 active hosts (including both local and remote hosts), you need to set -x to no less than 70000. It is better to err on the larger side: values that are too small mean you will not be able to see all hosts, and performance will also suffer because ntopng was not tuned properly. Larger values require ntopng to use more memory, but having plenty of RAM is not a good reason to use extremely large values (e.g. -x 1000000 in the previous example), as you would waste resources for no reason.
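As a rough sketch (the interface name and numbers are placeholders to adapt to your network), a deployment expecting up to 35000 active hosts and around 100000 concurrent flows could be started along these lines:

  ntopng -i eth1 -x 70000 -X 200000

Here -x sizes the hosts hash table and -X sizes the flows hash table; both values are roughly double the expected peak, as discussed above.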

Another parameter to set up is -m, which specifies the list of local networks. Please make sure you set the networks you actually plan to use. Some users are lazy and set it to 0.0.0.0/0: this is not a good idea, as ntopng will save stats for all hosts and you will quickly exhaust disk space.
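For instance, if your site uses two private subnets, -m could be set as follows (the CIDRs below are examples, replace them with the networks you actually route):

  ntopng -i eth1 -m "192.168.1.0/24,10.0.0.0/16"

Networks are given as a comma-separated list; only hosts falling inside them are treated as local and have their stats persisted to disk.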

Flow persistency is set up via -F. When flows are saved to MySQL or ElasticSearch, ntopng has to do extra work, and if the database is not fast enough this will introduce a bottleneck. Please pay attention to optimising this aspect, in particular if the database runs on the same box as ntopng, where resources are shared.
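As a sketch (please double-check the exact -F syntax for your ntopng version with ntopng --help), dumping flows to a MySQL database running on the same box might look like:

  ntopng -i eth1 -F "mysql;localhost;ntopng;flows;ntopng_user;ntopng_password"

where the fields after mysql are the host, database name, table prefix, user and password; all of these names are placeholders to adapt to your setup. If the database struggles to keep up, consider moving it to a separate host or to faster storage.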

Packet capture in ntopng has been designed to be as efficient as possible. We decided to have one processing thread per interface configured in ntopng. Depending on a) the CPU power, b) the number of hosts/flows, and c) the packet capture technology, the number of packets per second ntopng can process will vary. On an x86 server with PF_RING (non ZC) you can expect to process about 1 Mpps/interface, and with PF_RING ZC at least 2-3 Mpps/interface (usually much more, but typically not more than 5 Mpps). This means that if you want to monitor a 10 Gbit interface (or, even worse, a 40 Gbit one), you need to:

  • Use PF_RING ZC to accelerate packet capture.
  • Use RSS to split the NIC into virtual queues. For instance, you can set RSS=4 to split the 10 Gbit interface into 4 virtual interfaces.
  • Start ntopng polling packets from all virtual interfaces.

For example, suppose you have a 10/40 Gbit interface named eth1 and you use RSS=4 with PF_RING ZC. In that case you need to start ntopng as:

  ntopng -i zc:eth1@0 -i zc:eth1@1 -i zc:eth1@2 -i zc:eth1@3
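For reference, with the PF_RING ZC drivers RSS is typically configured when the driver is loaded; a minimal sketch for an Intel ixgbe-based card (the module path and queue count are assumptions to adapt to your hardware) could be:

  insmod ./ixgbe.ko RSS=4     (one RSS value per port on multi-port cards)

With a stock kernel driver the equivalent is usually ethtool -L eth1 combined 4. Either way, the number of RSS queues must match the number of zc:eth1@N interfaces you pass to ntopng.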

Note that in this case ntopng will try to bind threads to different cores, but as computer architectures vary depending on the use of NUMA and differences in CPUs, we advise setting -g to the exact cores to which you want to bind each interface polling thread. On multi-CPU systems, make sure you use physical cores on the same NUMA node where the network interface is plugged in. Of course you can use interface views (see the User's Guide) to merge all the virtual interfaces into a single interface (but please understand that if you have millions of hosts this might become a bottleneck).
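As an illustration (assuming a machine where cores 0-3 sit on the same NUMA node as the NIC; the core ids are placeholders, and you should verify the -g syntax with ntopng --help), the four polling threads from the previous example could be pinned like this:

  ntopng -i zc:eth1@0 -i zc:eth1@1 -i zc:eth1@2 -i zc:eth1@3 -g 0,1,2,3

You can check which NUMA node a NIC is attached to by reading /sys/class/net/eth1/device/numa_node.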

We hope this post helps you optimise ntopng. If you have questions or suggestions, let's start a discussion on GitHub.