9. ClickHouse Timeseries

This page describes the ClickHouse timeseries driver in detail. For a high-level overview of all timeseries drivers, see Timeseries.

Note

ClickHouse timeseries support requires an Enterprise M or better license and the HAVE_CLICKHOUSE compile-time flag. It is not available on Windows.

9.1. Overview

The ClickHouse timeseries driver stores all traffic metrics produced by ntopng (interface traffic, per-host statistics, application protocol breakdowns, etc.) in a ClickHouse columnar database. This provides high write throughput, flexible time-range queries, automatic data retention via TTL, and optional cluster/cloud deployment.

All timeseries data lands in a single table (timeseries) whose rows carry a schema name, a timestamp, a tag map (key-value pairs used for filtering), and a metric map (the actual numeric measurements). The driver is a drop-in replacement for the RRD and InfluxDB drivers — the same ntopng charts and API calls work regardless of which driver is active.

9.2. Configuration

9.2.1. Deployment Modes

The ClickHouse timeseries driver is selected via the -F option, which also configures the connection parameters. See the Flows Dump section for the full ClickHouse connection syntax.

Note

The same -F flag is used for ClickHouse flow dump (historical flows) and ClickHouse timeseries. When -F clickhouse (or a variant) is set, both feature sets share the same ClickHouse server connection and database.
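
For illustration only (the authoritative connection-string syntax is documented in the Flows Dump section), an invocation with placeholder host and credentials might look like:

ntopng -i eth0 -F "clickhouse;127.0.0.1;ntopng;default;default_password"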

9.2.2. Timeseries Driver Preference

The active timeseries driver can also be changed at runtime from the ntopng preferences UI under Timeseries Driver. Setting it to clickhouse activates this driver for all new time-series writes.

9.2.3. Data Retention

Retention is controlled by the Timeseries Data Retention preference (in days, default 365). The value is applied as a ClickHouse TTL clause on the timeseries table. When retention changes, the table DDL is re-issued with the new TTL and ClickHouse handles background cleanup automatically — no manual purge is required.
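
For example, lowering the retention to 90 days corresponds to a TTL change along the lines of the following statement (a sketch of the equivalent ClickHouse DDL; ntopng applies the change itself when the preference is saved):

ALTER TABLE `ntopng`.`timeseries`
    MODIFY TTL tstamp + toIntervalDay(90);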

9.3. Database Schema

9.3.1. Table Definition

On first startup (or when the retention period changes), ntopng creates the following table in the configured ClickHouse database (default: ntopng):

CREATE TABLE IF NOT EXISTS `ntopng`.`timeseries`
(
    `schema_name`  LowCardinality(String),
    `tstamp`       DateTime CODEC(Delta, ZSTD),
    `tags`         Map(LowCardinality(String), String),
    `metrics`      Map(LowCardinality(String), Float64)
)
ENGINE = MergeTree()
PARTITION BY toYYYYMM(tstamp)
ORDER BY (schema_name, tstamp)
TTL tstamp + toIntervalDay(<retention_days>)

9.3.2. Column Details

schema_name (LowCardinality(String))

Identifies the type of measurement, e.g. iface:traffic, host:traffic, iface:ndpi. Using LowCardinality avoids redundant string storage for the small set of distinct schema identifiers.

tstamp (DateTime CODEC(Delta, ZSTD))

Unix timestamp of the data point. The Delta codec stores differences between consecutive timestamps (reducing entropy for regular time series), then ZSTD compresses the result.

tags (Map(LowCardinality(String), String))

Key-value pairs used to filter and group data. Keys are bounded-cardinality identifiers such as ifid, host, or protocol. Values are the corresponding entity identifiers. Example: {'ifid': '0', 'host': '192.168.1.1'}.

metrics (Map(LowCardinality(String), Float64))

The numeric measurements. Keys are metric names (bytes, packets, duration_ms, etc.); values are Float64 measurements. Example: {'bytes': 1048576.0, 'packets': 1024.0}.
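
Taken together, a single data point resembles the following sketch of an INSERT (values are illustrative; in practice the driver batches many rows per statement):

INSERT INTO `ntopng`.`timeseries` (schema_name, tstamp, tags, metrics)
VALUES
    ('host:traffic', now(),
     map('ifid', '0', 'host', '192.168.1.1'),
     map('bytes', 1048576.0, 'packets', 1024.0));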

9.3.3. Partitioning and Ordering

Data is partitioned monthly (toYYYYMM(tstamp)), which allows ClickHouse to prune entire partitions when querying a specific time range. Within each partition, rows are sorted by (schema_name, tstamp), optimising the most common query pattern: filter by schema, scan a time range.
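
Partition and index pruning can be inspected with ClickHouse's EXPLAIN; a sketch, with placeholder dates:

EXPLAIN indexes = 1
SELECT count()
FROM `ntopng`.`timeseries`
WHERE schema_name = 'host:traffic'
  AND tstamp BETWEEN toDateTime('2024-05-01 00:00:00') AND toDateTime('2024-05-02 00:00:00');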

9.4. Query Behaviour

9.4.1. Counter vs. Gauge Metrics

Each timeseries schema declares whether its metrics are counters (monotonically increasing totals, e.g. cumulative bytes) or gauges (instantaneous values, e.g. CPU load percentage).

  • Counter metrics: the driver fetches one extra time bucket before the requested range, computes per-bucket deltas using a lag() window function, and divides by the bucket width to produce rates (e.g., bytes/second); see the sketch after this list.

  • Gauge metrics: the driver aggregates (avg, max, or min) directly within each time bucket.
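
A minimal sketch of the counter path, assuming 60-second buckets and using ClickHouse's lagInFrame window function (the driver generates its queries internally, so the exact SQL may differ):

SELECT
    bucket,
    (bytes - lagInFrame(bytes, 1) OVER (ORDER BY bucket ASC
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)) / 60 AS bytes_per_sec
FROM
(
    -- One extra leading bucket is fetched so the first requested bucket has a delta
    SELECT
        toStartOfInterval(tstamp, INTERVAL 60 SECOND) AS bucket,
        max(metrics['bytes']) AS bytes
    FROM `ntopng`.`timeseries`
    WHERE schema_name = 'host:traffic'
      AND tags['host'] = '192.168.1.1'
      AND tstamp >= now() - INTERVAL 1 HOUR - INTERVAL 60 SECOND
    GROUP BY bucket
)
ORDER BY bucket;

The gauge path needs no window function: avg(metrics['...']) (or max/min) with the same GROUP BY bucket suffices.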

9.5. Comparison with Other Drivers

Aspect           | RRD                               | InfluxDB                           | ClickHouse
-----------------|-----------------------------------|------------------------------------|---------------------------------------------
Storage          | Per-entity .rrd files             | InfluxDB server (line protocol)    | Centralised columnar table
Write model      | Synchronous file write per entity | HTTP line-protocol POST            | Async in-memory queue → batch INSERT (TCP)
Flush interval   | Each periodic activity            | Each periodic activity             | Every 5 seconds (up to 2,000 rows per batch)
Retention model  | Fixed RRA archives (1 year total) | Retention policies                 | TTL clause, configurable in days
Top-K queries    | Scan many files; slow at scale    | InfluxDB continuous queries        | Single SQL query; fast at scale
Cluster/HA       | None (local filesystem)           | InfluxDB Enterprise / OSS cluster  | ClickHouse native cluster replication
License required | Community+                        | Community+                         | Enterprise M+
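
To illustrate the Top-K row, a sketch of a top-10-talkers query (treating bytes as a counter, the max - min delta approximates per-host traffic in the window; counter resets are ignored):

SELECT
    tags['host'] AS host,
    max(metrics['bytes']) - min(metrics['bytes']) AS bytes_in_window
FROM `ntopng`.`timeseries`
WHERE schema_name = 'host:traffic'
  AND tstamp >= now() - INTERVAL 1 HOUR
GROUP BY host
ORDER BY bytes_in_window DESC
LIMIT 10;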