Designing a Real-Time Analytics Dashboard
Build a real-time analytics pipeline using Kafka, Apache Flink, and ClickHouse to power dashboards with sub-second query latency.
srikanthtelkalapally888@gmail.com
Real-time analytics pipelines process event streams and serve query results with sub-second latency.
Requirements
- Ingest 1M events/second
- Query results in <1 second
- Support aggregations: count, sum, percentiles
- Time-range queries (last 1h, 24h, 7d)
Architecture
Event Sources (Apps, APIs)
↓
Kafka
↓
Apache Flink (stream processing)
↓
ClickHouse (analytical DB)
↓
Dashboard API + UI
Kafka Layer
- Ingests raw events
- Provides durability and replay
- Topics by event type: page_views, clicks, purchases
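Keyed Kafka topics also preserve per-user ordering: events with the same key always land on the same partition. A minimal sketch of that hash-based routing logic, assuming a hypothetical `TopicRouter` class (the real Kafka producer's default partitioner works similarly):

```java
// Sketch of key-based partition selection, similar in spirit to Kafka's
// default partitioner: same key -> same partition, so all events for one
// user stay in order on one partition. Class and method names are illustrative.
public class TopicRouter {
    private final int numPartitions;

    public TopicRouter(int numPartitions) {
        this.numPartitions = numPartitions;
    }

    // Map a partition key (e.g. userId) to a partition index.
    public int partitionFor(String key) {
        // Math.floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(key.hashCode(), numPartitions);
    }
}
```

Because routing is deterministic, downstream consumers can assume all of one user's events arrive on a single partition, which is what makes the per-user windowing in the next stage straightforward.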
Flink Processing
// Count events per user in 5-minute tumbling windows
stream
    .keyBy(event -> event.userId)                               // partition the stream by user
    .window(TumblingProcessingTimeWindows.of(Time.minutes(5)))  // non-overlapping 5-minute windows
    .aggregate(new CountAggregator())                           // incremental count per window
    .addSink(clickhouseSink);                                   // write window results to ClickHouse
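The windowing logic above can be sketched in plain Java without a Flink dependency. This is an illustrative model of what the keyed tumbling window computes, not Flink's implementation; all names here are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of per-user counts in 5-minute tumbling windows:
// each event is bucketed by (userId, window start) and counted.
public class WindowedCounter {
    static final long WINDOW_MS = 5 * 60 * 1000;  // 5-minute tumbling window

    // (userId + "@" + windowStart) -> event count
    private final Map<String, Long> counts = new HashMap<>();

    // Align a timestamp to the start of its tumbling window.
    static long windowStart(long timestampMs) {
        return timestampMs - (timestampMs % WINDOW_MS);
    }

    public void add(String userId, long timestampMs) {
        counts.merge(userId + "@" + windowStart(timestampMs), 1L, Long::sum);
    }

    public long countFor(String userId, long timestampMs) {
        return counts.getOrDefault(userId + "@" + windowStart(timestampMs), 0L);
    }
}
```

Two events at t=1s and t=2s fall into the same window as each other but a different window than an event at t=400s, which is exactly the grouping the Flink job emits to ClickHouse.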
ClickHouse
ClickHouse is a columnar OLAP database optimized for analytical queries.
-- Hourly page views over the last 7 days; typically returns in
-- under 100ms even when the table holds ~1 billion rows
SELECT toStartOfHour(timestamp) AS hour,
       count() AS page_views
FROM events
WHERE date >= today() - 7
GROUP BY hour
ORDER BY hour
Why ClickHouse: columnar storage, vectorized execution, and aggressive compression make typical analytical scans 100-1000x faster than a row-oriented database like PostgreSQL.
Pre-Aggregation
Store pre-computed aggregates:
CREATE MATERIALIZED VIEW hourly_stats
ENGINE = SummingMergeTree() ORDER BY hour
AS SELECT toStartOfHour(ts) AS hour, count() AS cnt
FROM events GROUP BY hour
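The idea behind pre-aggregation can be shown as a plain-Java sketch: maintain a running count per hour bucket as events arrive, so dashboard queries read a tiny pre-computed table instead of scanning raw events. This models the concept only (the database does this work in practice); class and method names are hypothetical:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of incremental pre-aggregation: O(1) work per event at ingest
// time, O(1) lookups at query time.
public class HourlyStats {
    static final long HOUR_MS = 3_600_000L;

    private final Map<Long, Long> countsByHour = new HashMap<>();

    // Called once per ingested event.
    public void record(long eventTimestampMs) {
        long hourStart = eventTimestampMs - (eventTimestampMs % HOUR_MS);
        countsByHour.merge(hourStart, 1L, Long::sum);
    }

    // Dashboard query: read the pre-computed count, no raw-event scan.
    public long countForHour(long hourStartMs) {
        return countsByHour.getOrDefault(hourStartMs, 0L);
    }
}
```

The trade-off is the usual one: ingest does slightly more work so that queries do almost none, which is what keeps dashboard latency flat as event volume grows.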
API Layer
- Cache popular queries in Redis (TTL 30s)
- WebSocket push for live counters
- REST API for historical queries
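The 30-second cache can be sketched with an in-process map so the expiry logic is visible; in production this would live in Redis with a `SET ... EX 30`. All names here are illustrative:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a TTL cache for popular dashboard queries: entries expire
// 30 seconds after they are written, bounding result staleness.
public class QueryCache {
    private static final long TTL_MS = 30_000;

    private static class Entry {
        final String result;
        final long expiresAt;
        Entry(String result, long expiresAt) {
            this.result = result;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    // nowMs is passed in explicitly so expiry is easy to test.
    public void put(String query, String result, long nowMs) {
        entries.put(query, new Entry(result, nowMs + TTL_MS));
    }

    // Returns the cached result, or null if missing or older than 30s.
    public String get(String query, long nowMs) {
        Entry e = entries.get(query);
        if (e == null || nowMs >= e.expiresAt) return null;
        return e.result;
    }
}
```

A 30s TTL means a popular query hits ClickHouse at most twice a minute regardless of how many dashboard users request it, while live counters bypass the cache via the WebSocket path.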
Conclusion
Kafka → Flink → ClickHouse is a proven stack for real-time analytics. Pre-aggregation and columnar storage deliver sub-second query latency at massive scale.