ntopng Disk Requirements for Timeseries and Flows


Estimating in advance how much disk space ntopng is going to use in a production environment is fundamental for provisioning storage.

In this post we try to estimate the space used by ntopng to store timeseries and flows.

Timeseries

The number of timeseries generated by ntopng depends almost exclusively on the number of local hosts. Other timeseries, including those for interfaces or SNMP devices, are generally orders of magnitude fewer than those generated for local hosts. For this reason, it is safe to take into account only local host timeseries when doing the math.

For every local host, ntopng generates a timeseries for the traffic and an extra series of Layer-7 application protocol timeseries, one for each application protocol. These timeseries can be disabled from the preferences but clearly we need them enabled to write this post.

In the remainder of this section we discuss the space required by ntopng to store timeseries, as a function of the number of local hosts, both for RRDs and InfluxDB. One can either choose to use RRDs or InfluxDB from the ntopng preferences page. We refer the interested reader to the Appendix to see how these numbers are calculated.

 

RRD

RRD files are fixed in size, which means that they do not grow as new data points arrive. ntopng creates one RRD per timeseries. The space required to store data for every local host depends on which timeseries are enabled, as shown in the following table.

 

Enabled Timeseries | Disk Space
Host Timeseries “Light” (Default) | 92 KB / Local Host (2 RRDs / host)
Host Timeseries “Full” and Layer-7 Applications “None” | 1.3 MB / Local Host (approx. 25 RRDs / host)
Host Timeseries “Full” and Layer-7 Applications “Per Category” | 1.6 MB / Local Host (approx. 30 RRDs / host)
Host Timeseries “Full” and Layer-7 Applications “Per Application” | Max. 13.8 MB / Local Host (50 KB * 250 applications + 1.3 MB)
Host Timeseries “Full” and Layer-7 Applications “Both” | Max. 14.1 MB / Local Host (50 KB * 250 applications + 1.6 MB)
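
As a rough illustration of how the table can be used for provisioning, the following Python sketch multiplies the per-host figures by a number of local hosts. The profile names and the `rrd_disk_gb` helper are labels made up for this example, not ntopng settings. MB values are converted as 1 MB = 1,000 KB, which is the convention under which 50 KB * 250 + 1.3 MB gives 13.8 MB.

```python
# Per-host RRD footprint in KB, copied from the table above.
# Keys are hypothetical labels chosen for this sketch.
RRD_KB_PER_HOST = {
    "light": 92,
    "full": 1_300,
    "full+category": 1_600,
    "full+application": 50 * 250 + 1_300,  # upper bound, 250 applications
    "full+both": 50 * 250 + 1_600,         # upper bound, 250 applications
}


def rrd_disk_gb(local_hosts: int, profile: str = "light") -> float:
    """Estimated RRD disk usage in GB (1 GB = 1,000,000 KB here)."""
    return local_hosts * RRD_KB_PER_HOST[profile] / 1_000_000


if __name__ == "__main__":
    for name in RRD_KB_PER_HOST:
        print(f"{name:18s} {rrd_disk_gb(2_000, name):8.2f} GB for 2,000 local hosts")
```

Since RRDs do not grow over time, the result does not depend on how long ntopng has been running.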

InfluxDB

Contrary to RRD, InfluxDB timeseries grow in size as time goes by. For this reason, the following estimation is a function not only of the number of local hosts, but also of the number of days of monitoring. In addition, as InfluxDB allows the monitoring resolution to be chosen, we give the space required at two different resolutions, namely 10 and 60 seconds.

InfluxDB | 10-seconds Resolution | 60-seconds Resolution
Timeseries storage | 450 KB / Local Host / Day | 75 KB / Local Host / Day
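
To turn these per-day figures into a provisioning number, one can multiply by the number of local hosts and by the retention period. The sketch below is only an illustration: the `influx_disk_gb` helper is hypothetical, it covers just the two resolutions listed in the table, and it uses 1 GB = 1024 * 1024 KB as in the Appendix.

```python
# KB per local host per day, keyed by resolution in seconds (from the table).
INFLUX_KB_PER_HOST_DAY = {10: 450, 60: 75}


def influx_disk_gb(local_hosts: int, days: int, resolution_s: int = 10) -> float:
    """Estimated InfluxDB disk usage in GB after `days` of monitoring."""
    if resolution_s not in INFLUX_KB_PER_HOST_DAY:
        raise ValueError("only the 10s and 60s resolutions are covered here")
    kb = local_hosts * days * INFLUX_KB_PER_HOST_DAY[resolution_s]
    return kb / (1024 * 1024)


if __name__ == "__main__":
    # Example: 4,000 local hosts kept for 90 days.
    print(f"10s: {influx_disk_gb(4_000, 90, 10):.1f} GB")
    print(f"60s: {influx_disk_gb(4_000, 90, 60):.1f} GB")
```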

Flows

As we are going to announce soon, we have designed and implemented a high-speed, high-capacity, special-purpose database for the storage of flows. With this database, we are able to dump tens of thousands of flows per second to disk. The space used to store each flow is shown in the following table.

Flow Index | Disk Space
Flows storage | 11 Bytes / Flow

The above value is an average value based on IPv4 traffic with some IPv6 flows. It can increase if you mostly have IPv6 traffic and long metadata strings stored in flows.
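
As a hedged example of how this figure translates into disk space, the sketch below assumes a constant flow rate and a retention period in days. Both are inputs you would have to measure or choose yourself, and `flow_disk_gb` is a hypothetical helper. It uses the 11 Bytes / Flow average from the table, so it inherits the caveat about IPv6 and long metadata strings.

```python
BYTES_PER_FLOW = 11        # average from the table above
SECONDS_PER_DAY = 86_400


def flow_disk_gb(flows_per_second: float, days: int) -> float:
    """Estimated flow storage in GB (1 GB = 1024**3 bytes)."""
    total_flows = flows_per_second * SECONDS_PER_DAY * days
    return total_flows * BYTES_PER_FLOW / 1024**3


if __name__ == "__main__":
    # Example: 5,000 flows per second retained for 30 days.
    print(f"{flow_disk_gb(5_000, 30):.1f} GB")
```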

Appendix

In this appendix we discuss the math we have done to calculate the estimations above.

InfluxDB

To estimate InfluxDB usage we have considered an ntopng instance running in production in a real environment, monitoring a SPAN port with an average traffic of 444.84 Mbps and an average of 22,323 hosts, including approximately 4,000 local hosts. ntopng is monitoring Layer-7 applications and dumping timeseries data points at a 10-second resolution.

Data is obtained as follows:

  • InfluxDB Storage: 154.14 GB as shown in the ntopng runtime status page.
  • Time of monitoring: 3 months as obtained from the ntopng interface stats page.

Math is the following:

  • KB / Local Host / Day @ 10s = 154.14 GB / 3 months (90 days) / 4,000 local hosts = ((154.14 * 1024 * 1024) / 4000 / 90) ≈ 449, rounded to 450 KB / Local Host / Day
  • KB / Local Host / Day @ 60s = (KB / Local Host / Day @ 10s) / 6 = 75 KB / Local Host / Day
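
The same arithmetic can be repeated on your own deployment. The following sketch simply re-runs the calculation above with the numbers from this post; `kb_per_host_day` is a hypothetical helper, and the division by 6 for the 60-second resolution follows the assumption made above that storage scales with the number of data points.

```python
def kb_per_host_day(storage_gb: float, days: int, local_hosts: int) -> float:
    """KB of InfluxDB storage per local host per day."""
    return storage_gb * 1024 * 1024 / local_hosts / days


if __name__ == "__main__":
    at_10s = kb_per_host_day(storage_gb=154.14, days=90, local_hosts=4_000)
    at_60s = at_10s / 6  # six times fewer data points than at 10s
    print(f"@10s: {at_10s:.1f} KB / local host / day")
    print(f"@60s: {at_60s:.1f} KB / local host / day")
```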

RRD

To estimate RRD usage we have used an ntopng instance running on a production system that is collecting sFlow from nProbe. The system has seen approximately 2,000 local hosts and has Layer-7 timeseries generation enabled.

Data is obtained as follows:

  • Number of local hosts: /var/lib/ntopng/0/rrd $ find . -name "bytes.rrd" | wc -l = 1,989
  • Number of RRDs: /var/lib/ntopng/0/rrd $ find . -name "*.rrd" | wc -l = 25,506
  • Total size of RRDs: /var/lib/ntopng/0/ $ du -hs rrd/ = 989M

Math is the following:

  • 989 MB / 1,989 Local Hosts = (989 * 1024) / 1989 ≈ 509 KB / Local Host, that is, roughly 500 KB / Local Host
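
For readers who want to repeat the measurement, this sketch redoes the division with the three values reported above and also derives the average number of RRDs per local host. The variable names are ours; the inputs are the outputs of the `find` and `du` commands listed in this section.

```python
local_hosts = 1_989     # find . -name "bytes.rrd" | wc -l
total_rrds = 25_506     # find . -name "*.rrd" | wc -l
total_size_mb = 989     # du -hs rrd/

kb_per_host = total_size_mb * 1024 / local_hosts
rrds_per_host = total_rrds / local_hosts
kb_per_rrd = total_size_mb * 1024 / total_rrds

print(f"{kb_per_host:.0f} KB / local host")
print(f"{rrds_per_host:.1f} RRDs / local host")
print(f"{kb_per_rrd:.1f} KB / RRD")
```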

Flow Index

Flow index estimations have been done using the very same host used for InfluxDB. To compute the number of bytes used by each flow stored in the flow database, we have done the following math.

First, we have counted the number of flows over an hour:

$ ./nindex -d /var/lib/ntopng/0/flows/ -b 1544713200 -e 1544716800 -l 0
14/Dec/2018 21:29:05 [nindex.cpp:346] Search time range [Thu Dec 13 15:00:00 2018 -> Thu Dec 13 16:00:00 2018]
14/Dec/2018 21:29:05 [nindex.cpp:356] Performing record count (-l 0)
14/Dec/2018 21:29:05 [nindex.cpp:393] Query completed in 2.1 msec, with 16'962'614 hits returned

Then, we have measured the disk space used to store data for that particular hour:

$ du -hs /var/lib/ntopng/0/flows/2018/12/13/15
175M /var/lib/ntopng/0/flows/2018/12/13/15

Finally, we have obtained the Bytes / Flow as 175 MB / 16,962,614 flows = (175 * 1024 * 1024) / 16962614 ≈ 10.8, that is, about 11 B / Flow.
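
The calculation can be sketched in a few lines of Python. The hit count is copied from the nindex output above (apostrophes are thousands separators) and the size from `du`; the flows-per-second figure is an extra value derived here from the one-hour window, not a number reported by the tools.

```python
hits = int("16'962'614".replace("'", ""))  # flows in the one-hour window
size_bytes = 175 * 1024 * 1024             # 175M as reported by du -hs
window_seconds = 1544716800 - 1544713200   # -e minus -b, i.e. one hour

bytes_per_flow = size_bytes / hits
flows_per_second = hits / window_seconds

print(f"{bytes_per_flow:.2f} bytes / flow")
print(f"{flows_per_second:.0f} flows / second on average")
```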

PS. Note that we have scanned ~17 million records in ~2 msec. Not too bad for a low-end system using a SATA drive that ntopng is writing to in the meantime.