
Designing a Real-Time Analytics Dashboard

Build a real-time analytics pipeline using Kafka, Apache Flink, and ClickHouse to power dashboards with sub-second query latency.



Real-time analytics pipelines process event streams and serve query results with sub-second latency.

Requirements

  • Ingest 1M events/second
  • Query results in <1 second
  • Support aggregations: count, sum, percentiles
  • Time-range queries (last 1h, 24h, 7d)

Architecture

Event Sources (Apps, APIs)
        ↓
      Kafka
        ↓
   Apache Flink (stream processing)
        ↓
   ClickHouse (analytical DB)
        ↓
   Dashboard API + UI

Kafka Layer

  • Ingests raw events
  • Provides durability and replay
  • Topics by event type: page_views, clicks, purchases
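Events with the same key (say, a user ID) should land on the same partition so per-user ordering is preserved. A simplified sketch of that key-based routing is below; real Kafka producers use murmur2 hashing rather than `String.hashCode`, and the names here are illustrative:

```java
// Simplified sketch of Kafka-style key partitioning: all events
// with the same key map to the same partition, preserving per-key
// ordering. (Kafka's default partitioner uses murmur2, not hashCode.)
public class KeyPartitioner {
    public static int partitionFor(String key, int numPartitions) {
        // Mask the sign bit so the hash is non-negative, then mod.
        int hash = key.hashCode() & 0x7fffffff;
        return hash % numPartitions;
    }

    public static void main(String[] args) {
        // The same key is always routed to the same partition.
        System.out.println(partitionFor("user-42", 12));
        System.out.println(partitionFor("user-42", 12));
    }
}
```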

Flink Processing

// Count events per user in 5-minute windows
stream
  .keyBy(event -> event.userId)
  .window(TumblingProcessingTimeWindows.of(Time.minutes(5)))
  .aggregate(new CountAggregator())
  .addSink(clickhouseSink);
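The tumbling window above assigns each event to a fixed 5-minute bucket and counts per (user, bucket). A minimal plain-Java sketch of that semantics, with made-up event data standing in for the stream:

```java
import java.util.Map;
import java.util.TreeMap;

// Sketch of tumbling-window counting: truncate each timestamp to
// the start of its 5-minute window, then count events per
// (userId, windowStart) pair -- what the Flink job above computes.
public class TumblingCount {
    static final long WINDOW_MS = 5 * 60 * 1000;

    // Truncate a timestamp (epoch millis) to its window start.
    public static long windowStart(long tsMillis) {
        return tsMillis - (tsMillis % WINDOW_MS);
    }

    public static void main(String[] args) {
        // {userId, timestampMillis} -- illustrative sample events.
        long[][] events = { {1, 10_000}, {1, 20_000}, {1, 310_000}, {2, 30_000} };
        Map<String, Integer> counts = new TreeMap<>();
        for (long[] e : events) {
            String key = e[0] + "@" + windowStart(e[1]);
            counts.merge(key, 1, Integer::sum);
        }
        System.out.println(counts); // prints {1@0=2, 1@300000=1, 2@0=1}
    }
}
```

Flink adds what this sketch omits: distributed state, fault tolerance via checkpoints, and event-time handling with watermarks.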

ClickHouse

Columnar OLAP database — optimized for analytical queries.

-- 1 billion rows, returns in <100ms
SELECT toHour(timestamp) as hour,
       count() as page_views
FROM events
WHERE date >= today() - 7
GROUP BY hour
ORDER BY hour

Why ClickHouse: columnar storage and vectorized execution make typical analytical scans orders of magnitude (often cited as 100-1000x) faster than row-oriented databases like PostgreSQL.

Pre-Aggregation

Store pre-computed aggregates:

CREATE MATERIALIZED VIEW hourly_stats
ENGINE = SummingMergeTree() ORDER BY hour
AS SELECT toStartOfHour(ts) AS hour, count() AS cnt
FROM events GROUP BY hour
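The point of the materialized view is that aggregates update incrementally as rows arrive, so a dashboard query becomes a lookup rather than a scan. An in-memory sketch of that rollup logic (timestamps as epoch millis; names are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of incremental hourly rollup: bump a per-hour counter as
// each event arrives, so reads never touch the raw event stream --
// the same idea as the materialized view above.
public class HourlyRollup {
    static final long HOUR_MS = 3_600_000L;
    private final Map<Long, Long> counts = new HashMap<>();

    // Fold one event into the pre-aggregated state.
    public void ingest(long tsMillis) {
        counts.merge(tsMillis - (tsMillis % HOUR_MS), 1L, Long::sum);
    }

    // Query is now a map lookup, not a scan over raw events.
    public long countForHour(long hourStartMillis) {
        return counts.getOrDefault(hourStartMillis, 0L);
    }

    public static void main(String[] args) {
        HourlyRollup r = new HourlyRollup();
        r.ingest(100);        // falls in hour starting at 0
        r.ingest(3_600_500);  // falls in hour starting at 3,600,000
        r.ingest(3_700_000);  // same hour
        System.out.println(r.countForHour(3_600_000L)); // prints 2
    }
}
```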

API Layer

  • Cache popular queries in Redis (TTL 30s)
  • WebSocket push for live counters
  • REST API for historical queries
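The 30-second cache can be sketched as a map of results with expiry timestamps, mimicking a Redis `SET ... EX 30`; expired entries count as misses and send the caller back to ClickHouse. This is a minimal illustration, not a substitute for Redis in a multi-instance deployment:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the API layer's query cache: each entry carries an
// expiry time; reads past the TTL are treated as misses, so the
// caller re-runs the ClickHouse query and re-populates the cache.
public class TtlCache {
    private static final class Entry {
        final String value;
        final long expiresAt;
        Entry(String value, long expiresAt) { this.value = value; this.expiresAt = expiresAt; }
    }

    private final Map<String, Entry> cache = new HashMap<>();
    private final long ttlMillis;

    public TtlCache(long ttlMillis) { this.ttlMillis = ttlMillis; }

    public void put(String key, String value, long nowMillis) {
        cache.put(key, new Entry(value, nowMillis + ttlMillis));
    }

    // Returns null on a miss or an expired entry.
    public String get(String key, long nowMillis) {
        Entry e = cache.get(key);
        if (e == null || nowMillis >= e.expiresAt) return null;
        return e.value;
    }

    public static void main(String[] args) {
        TtlCache c = new TtlCache(30_000); // 30-second TTL
        c.put("pv:last1h", "12345", 0);
        System.out.println(c.get("pv:last1h", 10_000)); // prints 12345
        System.out.println(c.get("pv:last1h", 31_000)); // prints null (expired)
    }
}
```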

Conclusion

Kafka → Flink → ClickHouse is the modern real-time analytics stack. Pre-aggregation and columnar storage deliver sub-second query latency at massive scale.
