When ntopng receives flows from nProbe (NetFlow collector) or nProbe Cento (100 Gbit probe) over ZMQ or Kafka, each flow goes through several processing stages before it is finally stored in the database: metadata enrichment, classification, analytics, behavioural checks, and additional internal operations. This pipeline is essential for ntopng’s real-time monitoring, but it inevitably adds latency between the moment a flow arrives and the moment it becomes queryable in the (ClickHouse) storage backend. In large deployments ingesting thousands or hundreds of thousands of flows per second, the memory footprint of flows waiting in the processing queue can grow significantly. Moreover, immediate access to raw flow records is sometimes more important than waiting for ntopng’s full processing cycle. For these reasons we have implemented a new optimized flow dump mode, the Direct Flows Dump, which fundamentally changes when and how flows are written to ClickHouse.
How It Works
The direct dump mode can be enabled with the --direct-flows-dump command-line option. When this is enabled:
- Flows received from ZMQ or Kafka are written directly to the dump destination, typically ClickHouse, before any further processing or enrichment steps occur.
- The flow still proceeds through ntopng’s normal processing pipeline for statistics, alerts, and real-time monitoring, but this happens independently of the dump operation.
- Direct dump mode applies only to flows coming from external collectors. Packet capture based flows continue following the standard processing path.
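As a concrete setup, the pair of commands below sketches an nProbe instance exporting flows over ZMQ and an ntopng instance collecting them with direct dump enabled. The interface name, endpoint address, and port are illustrative and should be adapted to your deployment:

```shell
# On the probe host: capture on eth0 and export flows over ZMQ
# (-n none disables NetFlow export, since ZMQ is used instead)
nprobe -i eth0 --zmq "tcp://*:5556" -n none

# On the collector host: ntopng subscribes to the ZMQ endpoint,
# dumps flows to ClickHouse, and enables the direct dump path
ntopng -i "zmq://127.0.0.1:5556" -F "clickhouse" --direct-flows-dump
```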
Standard Mode
Flow Received → Processing & Enrichment → Database Dump
Direct Dump Mode
Flow Received → Database Dump (immediate)
              → Processing & Enrichment (in parallel)
Advantages of Direct Dump Mode
1. Near-Instant Availability for Historical Queries
Flows become visible in ClickHouse almost immediately, enabling extremely fast forensic analysis and compliance queries.
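Because flows land in ClickHouse right away, they can be queried moments after arrival with any ClickHouse client. The query below is a hypothetical forensic example: the database, table, and column names are assumptions modelled on a typical ntopng ClickHouse schema and should be checked against your installation:

```shell
# Fetch flows seen in the last five minutes straight from ClickHouse.
# ntopng.flows and the column names are assumed; verify them with
# SHOW TABLES / DESCRIBE on your own deployment.
clickhouse-client --query "
  SELECT IPV4_SRC_ADDR, IPV4_DST_ADDR, TOTAL_BYTES
  FROM ntopng.flows
  WHERE FIRST_SEEN >= now() - INTERVAL 5 MINUTE
  LIMIT 10"
```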
2. Smaller Memory Footprint
By dumping flows upon arrival, ntopng avoids keeping large numbers of unprocessed flows in memory while they wait for the full pipeline. This is especially beneficial in high-volume or memory-constrained deployments.
3. Higher Throughput
Decoupling the dump operation from ntopng’s processing pipeline allows both tasks to execute in parallel, improving overall system throughput.
4. Better Use of ClickHouse Strengths
ClickHouse is engineered for high-volume ingestion and efficient compression. Feeding it raw flows as early as possible maximizes its performance benefits while letting ntopng focus on real-time analytics.
5. Superior Scalability for Large Deployments
Environments with distributed collectors and centralized storage benefit from reduced per-flow overhead, enabling ntopng to scale horizontally across very large infrastructures.
When to Use Direct Dump Mode
Direct dump mode is ideal for:
- High-volume environments processing 100K+ flows per second
- Resource-constrained systems where memory usage must be minimized
- Compliance and forensic workflows requiring immediate access to raw historical data
- Distributed architectures with multiple nProbe instances feeding a central ntopng
- ClickHouse-centric analytics where the database is the primary query engine
Getting Started
Enabling direct dump mode is simple: just add the --direct-flows-dump option to your ntopng launch command. Example:
ntopng -i "zmq://127.0.0.1:5556" -F "clickhouse" --direct-flows-dump
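When ClickHouse does not run with default local settings, the -F option also accepts an explicit connection string. The sketch below follows ntopng’s documented -F "clickhouse;…" format; the host, database name, user, and password shown are placeholders:

```shell
# Explicit ClickHouse connection: host;dbname;user;password are placeholders
ntopng -i "zmq://127.0.0.1:5556" \
       -F "clickhouse;127.0.0.1;ntopng;default;mypassword" \
       --direct-flows-dump
```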
You can verify that direct mode is active in the ntopng web UI under Interface Statistics, where a Direct Mode indicator shows the current status.
The new Direct Flows Dump mode enables ntopng to support increasingly demanding monitoring environments while preserving its powerful real-time visibility capabilities. To learn more about ntopng’s ClickHouse integration, refer to the User’s Guide.
Enjoy!
