In this blog post we want to share our experience squeezing ntopng's memory usage to fit into small OT monitoring devices manufactured by our partner Endian. To give you an idea of the work we did, look at these two images, taken on the same network at the same time of day, before and after our work.
As you can see, we managed to reduce memory usage from 4 GB to 1.3 GB. Below we describe how we did it.
The challenge was to reduce memory usage while preserving all the functionality of ntopng. The ntopng code (and that of other ntop components such as nDPI) is automatically tested with nightly test suites, automatic GitHub actions and Google fuzz testing. The chance of a memory leak is therefore very low, but before starting our work we double checked, and fortunately this was not our case.
The architecture of ntopng is a bit complex: the engine is written in C++, while periodic activities (e.g. minute checks or timeseries writes) and the web interface are written in Lua. This means that ntopng continuously spawns Lua virtual machines that execute these scripts and then terminate. Lua is garbage collected, so a classic memory leak is not a concern, but given the complexity of ntopng, every time we start a VM we have to load several modules that take some memory. One of the most resource-intensive Lua scripts is the one used to show alerts on screen, which was taking about 4 MB of RAM at every run.
On the Lua side we split scripts into smaller modules, removed potential circular dependencies (module A includes module B, which includes module A) and loaded only the minimum dependencies each script needs in order to run. This allowed us to decrease the memory usage of resource-intensive scripts to 1.3 MB. Please note that we spawn several VMs simultaneously at specific times (e.g. every hour we execute the hourly, 5-minute, minute and second scripts), so the individual VM memory usage has to be multiplied by the number of VMs. This decreased resource usage (both memory and CPU), but not yet to the level we expected.
Another issue that we tackled is heap fragmentation, a situation where the free space in a program's heap becomes scattered into small, non-contiguous blocks. The heap is the region of memory used for dynamic allocation, where programs request and release memory as needed at runtime. As ntopng continuously spawns VMs, memory is continuously allocated and deallocated in small chunks, and this promotes fragmentation. Heap fragmentation leads to inefficient memory usage, as it becomes hard for the memory allocator to find suitable contiguous blocks for allocation requests. In severe cases, it may cause the program to fail because it cannot allocate the required memory. In short, fragmentation is a severe problem, as severe as a memory leak. In order to address this problem we combined two techniques:
- We avoided small Lua memory allocations by enlarging every allocation to at least 16 bytes and making sure that the size of the allocated block is a power of two. This simplifies the work of the memory manager, as blocks are easier to compact and reclaim.
- We replaced the standard memory allocator (malloc/free/realloc) with a more efficient one. The best we have found are jemalloc and tcmalloc, which are more efficient and responsive than the default allocator. Please note in the pictures above that with the new memory manager the memory usage is burstier than before, when usage eventually became stable (but at more than double the size). That said, we have to acknowledge Apple for the great compressed memory allocator that comes with macOS out of the box, as it did not suffer from the fragmentation we saw on Linux and Windows.
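To make the first technique more concrete, here is a minimal sketch of an allocation callback that rounds every request up to a power of two, with a 16-byte minimum. It is not the actual ntopng code: the names `kMinBlock`, `round_up_block` and `rounded_alloc` are hypothetical, and the only thing taken from Lua is the shape of its `lua_Alloc` callback (user data, pointer, old size, new size), so that a function like this could in principle be handed to `lua_newstate`.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Assumed minimum block size (the post mentions "at least 16 bytes").
constexpr std::size_t kMinBlock = 16;

// Round a requested size up to the next power of two, never below kMinBlock.
// Requests too large to round without overflow are returned unchanged.
constexpr std::size_t round_up_block(std::size_t n) {
  std::size_t size = kMinBlock;
  while (size < n && size <= SIZE_MAX / 2) size <<= 1;
  return size < n ? n : size;
}

// Hypothetical allocator with the same signature as Lua's lua_Alloc callback.
// It assumes every live block was obtained through this same function.
void* rounded_alloc(void* /*ud*/, void* ptr, std::size_t osize, std::size_t nsize) {
  if (nsize == 0) {           // Lua asks to free the block
    std::free(ptr);
    return nullptr;
  }
  const std::size_t want = round_up_block(nsize);
  // When resizing an existing block, osize is its previous requested size.
  // If old and new sizes fall in the same size class, the block already fits.
  if (ptr != nullptr && round_up_block(osize) == want) return ptr;
  return std::realloc(ptr, want);
}
```

With this scheme a 20-byte request and a later resize to 30 bytes both land in the 32-byte class, so the resize does not touch the underlying allocator at all. The trade-off is some internal waste per block in exchange for far fewer distinct block sizes.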
Finally, we reworked and optimised some C++ classes to use memory more wisely. In addition, we compressed some hash tables used internally in ntopng by hashing the string key in a way that guarantees no false positives, while avoiding keeping in memory the large string-based keys that were taking more space.
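The post does not detail the hashing scheme, so the following is only a sketch of one way such a table could work, under a loud assumption: the full set of keys is known when the table is built. Each key is hashed to 64 bits (FNV-1a is used here purely as an example), the build step refuses to continue if two keys share a hash, and only the 8-byte hashes are kept. `HashedKeyTable` and `fnv1a64` are hypothetical names, not ntopng classes.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <string_view>
#include <unordered_map>
#include <utility>
#include <vector>

// 64-bit FNV-1a hash (an example choice, not necessarily what ntopng uses).
constexpr std::uint64_t fnv1a64(std::string_view s) {
  std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
  for (unsigned char c : s) {
    h ^= c;
    h *= 1099511628211ULL;                    // FNV prime
  }
  return h;
}

// Hypothetical table that stores 8-byte hashes instead of the original strings.
class HashedKeyTable {
 public:
  // `entries` must list each key exactly once. Construction throws if two
  // entries produce the same hash, so the known keys can never be confused.
  explicit HashedKeyTable(const std::vector<std::pair<std::string, int>>& entries) {
    for (const auto& [key, value] : entries) {
      const bool inserted = table_.emplace(fnv1a64(key), value).second;
      if (!inserted)
        throw std::invalid_argument("hash collision or duplicate key: " + key);
    }
  }

  // Returns a pointer to the stored value, or nullptr if the hash is absent.
  const int* find(std::string_view key) const {
    const auto it = table_.find(fnv1a64(key));
    return it == table_.end() ? nullptr : &it->second;
  }

  std::size_t size() const { return table_.size(); }

 private:
  std::unordered_map<std::uint64_t, int> table_;  // no strings retained
};
```

Note the limit of this sketch: the no-false-positive property holds only for lookups with keys from the set that was checked at build time. A key outside that set could, in principle, hash to a stored value, so a scheme that must handle arbitrary keys needs a stronger construction.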
We hope that you have enjoyed this post. The ntopng code is on GitHub, so you can see for yourself what we did in detail. We have some ideas for further improving memory usage. Stay tuned!
If you want to test this resource-savvy version of ntopng, just update to the latest version on the dev branch. When the next stable release is cut, these changes will also be incorporated into the stable branch.
Enjoy!