Real-Time Backend
Time-Series data
Grafana draws the graphs. Behind them sit specialized databases built for write rates of millions of points per second, a load that a general-purpose PostgreSQL setup struggles to sustain.
- Tesla stores telemetry from millions of cars in InfluxDB (speed, battery charge, autopilot data): more than 1,000 metrics per second from every car
- Prometheus was created at SoundCloud in 2012 and became the Kubernetes monitoring standard: every pod exports /metrics, Prometheus scrapes and stores time-series
- Exchanges and fintech use TimescaleDB to store NYSE/NASDAQ ticks: millions of events per second with SQL analytics on top
- The Grafana + InfluxDB stack is deployed at thousands of companies for infrastructure monitoring: CPU, latency, error rate with retention policies and downsampling
What time-series data is
Time-series is a sequence of values bound to timestamps. Each record answers the question "what happened and when". This is not just a table with a `created_at` column. In a time-series DB, time is the primary dimension that drives storage, indexing, and compression.
Hallmarks of time-series data: records are almost never updated (append-only), old data is downsampled or deleted (retention policy), and queries are mostly time-range aggregations (avg, sum, percentile). Classical relational databases handle this pattern poorly at scale: a B-tree index on the timestamp grows until it no longer fits in memory and slows inserts, and storing billions of rows uncompressed is wasteful.
- **Infrastructure metrics**: CPU, memory, service latency (Grafana + InfluxDB/Prometheus)
- **IoT telemetry**: Tesla collects more than 1000 metrics from every car every second into InfluxDB
- **Financial ticks**: NYSE/NASDAQ quotes, millions of events per second, stored for years
- **APM traces**: span duration, error rate, throughput for every endpoint
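The access pattern described above (append-only writes, time-range aggregation) can be sketched in plain Python. This is a toy in-memory model, not how any particular database is implemented; the `Series` class and its method names are hypothetical.

```python
from bisect import bisect_left
from statistics import mean


class Series:
    """Toy append-only series: timestamps must arrive in non-decreasing order."""

    def __init__(self):
        self.timestamps = []  # seconds since epoch, sorted by construction
        self.values = []

    def append(self, ts, value):
        # Append-only: reject writes that would land in the past.
        if self.timestamps and ts < self.timestamps[-1]:
            raise ValueError("out-of-order write")
        self.timestamps.append(ts)
        self.values.append(value)

    def avg(self, start, end):
        """Average of values with start <= ts < end, or None if the range is empty."""
        lo = bisect_left(self.timestamps, start)
        hi = bisect_left(self.timestamps, end)
        window = self.values[lo:hi]
        return mean(window) if window else None


cpu = Series()
for second, load in enumerate([10, 20, 30, 40, 50]):
    cpu.append(1_700_000_000 + second, load)

# Averages the points at offsets +1, +2 and +3 seconds.
print(cpu.avg(1_700_000_001, 1_700_000_004))
```

Because the data is sorted by time, a range query is two binary searches plus a scan of the matching slice; no separate index is needed.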
How does a time-series DB fundamentally differ from a relational DB when storing metrics?
InfluxDB: the metrics store
InfluxDB is a specialized time-series DB built from scratch for metrics and events. Data is organized into measurements (table equivalents), tags (indexed strings for filtering), and fields (the measured values, usually numeric, which are not indexed). Storage uses TSM (Time-Structured Merge Tree), a variant of LSM trees optimized for time data with aggressive compression.
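To make the measurement/tags/fields split concrete, here is a minimal sketch that renders one point in InfluxDB line protocol (`measurement,tags fields timestamp`). The helper name `to_line_protocol` is hypothetical, and the sketch deliberately skips the escaping of spaces, commas, and quotes that real input would need; in practice an official client library handles that.

```python
def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Render one point as InfluxDB line protocol (simplified: no escaping)."""

    def render_field(value):
        # bool must be checked before int because bool is a subclass of int.
        if isinstance(value, bool):
            return "true" if value else "false"
        if isinstance(value, int):
            return f"{value}i"  # integers carry an "i" suffix
        if isinstance(value, float):
            return repr(value)
        return '"' + str(value) + '"'  # strings are double-quoted

    # Tags are comma-joined onto the measurement; sorted by key for stable output.
    tag_part = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    field_part = ",".join(f"{k}={render_field(v)}" for k, v in fields.items())
    return f"{measurement}{tag_part} {field_part} {timestamp_ns}"


line = to_line_protocol(
    "cpu",
    tags={"host": "server01", "region": "eu"},
    fields={"usage": 0.64, "cores": 8},
    timestamp_ns=1_700_000_000_000_000_000,
)
print(line)
```

Tags end up in the series key (left of the first space) and are indexed; fields sit to the right and are only stored.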
Tesla uses InfluxDB to collect telemetry from its cars (speed, battery charge, component temperatures, autopilot data): more than 1,000 metrics per car every second. With millions of cars that adds up to petabytes of data. InfluxDB keeps this manageable through built-in downsampling (Continuous Queries in v1, Tasks in v2) and retention policies.
**Cardinality explosion** is the main InfluxDB trap. Tags are indexed, and the number of unique tag combinations (series) is called cardinality. Add userId as a tag to a metric with a million users and you get at least a million series in the index, which exhausts memory and drags down performance. Rule: tags are for low-cardinality values you filter or group by (host, env, region). User identifiers belong in fields.
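A quick way to see the effect is to count unique tag combinations directly. The sketch below is illustrative only: `series_cardinality` is a hypothetical helper, and the host names and user counts are made up.

```python
def series_cardinality(points):
    """Count unique (measurement, tag set) combinations."""
    return len({(m, tuple(sorted(tags.items()))) for m, tags in points})


hosts = ["web-1", "web-2", "web-3"]
user_ids = range(10_000)

# Low-cardinality tags: one series per host.
low = [("http_requests", {"host": h, "env": "prod"}) for h in hosts]

# userId as a tag: one series per (host, user) pair.
high = [
    ("http_requests", {"host": h, "userId": str(u)})
    for h in hosts
    for u in user_ids
]

print(series_cardinality(low), series_cardinality(high))
```

Cardinality multiplies across tags, so a single high-cardinality tag inflates every other tag combination it appears with.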
A team added userId as a tag in an InfluxDB request metric. What happens with 1M active users?
TimescaleDB and Prometheus
TimescaleDB is a PostgreSQL extension for time-series data. The table (a hypertable) is automatically partitioned into time chunks, which enables partition pruning: instead of scanning the full table, a query scans only the chunks that overlap the requested time range. Full SQL, JOINs with regular tables, other PostgreSQL extensions: everything works as in plain PostgreSQL.
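The idea behind chunk pruning can be sketched with a toy model. This is not TimescaleDB's implementation; `ChunkedTable` and the one-week `CHUNK_SECONDS` interval are assumptions chosen for illustration.

```python
from collections import defaultdict

CHUNK_SECONDS = 7 * 24 * 3600  # hypothetical chunk interval: one week


class ChunkedTable:
    def __init__(self):
        self.chunks = defaultdict(list)  # chunk index -> list of (ts, value)

    def insert(self, ts, value):
        # Each row is routed to the chunk that covers its timestamp.
        self.chunks[ts // CHUNK_SECONDS].append((ts, value))

    def query(self, start, end):
        """Return (rows with start <= ts < end, number of chunks scanned)."""
        first = start // CHUNK_SECONDS
        last = (end - 1) // CHUNK_SECONDS
        rows, scanned = [], 0
        for idx in range(first, last + 1):
            if idx in self.chunks:  # chunks outside the range are never touched
                scanned += 1
                rows.extend(r for r in self.chunks[idx] if start <= r[0] < end)
        return rows, scanned


DAY = 24 * 3600
table = ChunkedTable()
for day in range(70):  # one point per day for ten weeks
    table.insert(day * DAY, float(day))

rows, scanned = table.query(14 * DAY, 21 * DAY)  # exactly the third week
print(len(rows), "rows from", scanned, "of", len(table.chunks), "chunks")
```

The query cost depends on the width of the time range, not on the total size of the table.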
Prometheus is a pull-based monitoring system developed at SoundCloud starting in 2012 and publicly announced in 2015. The server scrapes /metrics endpoints every N seconds and stores the samples in its own local TSDB. It is not designed for long-term retention: the default is 15 days. For long-term storage, data is shipped to a remote backend such as Thanos, Cortex, or InfluxDB.
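What a scrape actually returns is plain text. The sketch below renders counters in the Prometheus text exposition format; `render_metrics` is a hypothetical helper that skips label-value escaping and HELP lines, and a real service would normally use an official client library instead.

```python
def render_metrics(samples):
    """Render counters in the Prometheus text exposition format.

    samples: dict mapping metric name -> list of (labels dict, value).
    """
    lines = []
    for name, series in samples.items():
        lines.append(f"# TYPE {name} counter")
        for labels, value in series:
            label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            selector = f"{{{label_str}}}" if label_str else ""
            lines.append(f"{name}{selector} {value}")
    return "\n".join(lines) + "\n"


body = render_metrics(
    {
        "http_requests_total": [
            ({"method": "GET", "code": "200"}, 1027),
            ({"method": "POST", "code": "500"}, 3),
        ]
    }
)
print(body, end="")
```

Serving this string at /metrics is all a target has to do; the Prometheus server attaches the timestamp at scrape time.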
- **TimescaleDB**: pick when you need SQL, JOINs with business tables, or you already have PostgreSQL infrastructure
- **InfluxDB**: pick for pure metrics without JOINs, very high write throughput, and built-in downsampling
- **Prometheus**: the standard for Kubernetes/microservices monitoring, not for long-term storage
- **Financial data**: TimescaleDB is popular for storing market ticks: JOINs with instrument tables, SQL analytics
A team wants to store service metrics for 2 years and JOIN them with a users table for business analytics. What to pick?
Downsampling and retention
Downsampling replaces raw data with per-period aggregates. A typical Grafana + InfluxDB monitoring setup keeps raw data for 7 days (1-second resolution), 5-minute aggregates for 30 days, and hourly aggregates for 1 year. This lets you inspect yesterday's incident in full detail (raw) and follow quarterly trends (hourly avg) at a reasonable storage footprint.
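In its simplest form, downsampling groups points into fixed time buckets and keeps one aggregate per bucket. A minimal sketch, assuming integer second timestamps and an average as the aggregate; the function name `downsample` is hypothetical.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, bucket_seconds):
    """Replace raw (ts, value) points with one average per time bucket."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of its bucket.
        buckets[ts - ts % bucket_seconds].append(value)
    return [(start, mean(vals)) for start, vals in sorted(buckets.items())]


# Ten minutes of 1-second data: 10.0 for the first five minutes, 20.0 after.
raw = [(t, 10.0) for t in range(300)] + [(t, 20.0) for t in range(300, 600)]

five_minute = downsample(raw, 300)
print(len(raw), "raw points ->", len(five_minute), "aggregates")
```

An average hides spikes, so production setups often store min, max, or percentiles per bucket alongside it.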
Retention policies delete data older than a configured age automatically. In InfluxDB v1 that is a retention policy on the database; in v2 it is the retention period of the bucket (Tasks handle downsampling, not deletion). In TimescaleDB it is `add_retention_policy('metrics', INTERVAL '90 days')`. In Prometheus it is `--storage.tsdb.retention.time=15d` at startup. Without retention, a time-series store grows forever, and at high throughput the disk can fill up within weeks.
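Conceptually, retention is a cutoff applied on a schedule. The sketch below shows the rule on an in-memory list; `apply_retention` is a hypothetical helper, and real stores typically drop whole chunks or blocks rather than filtering row by row.

```python
DAY = 24 * 3600


def apply_retention(points, now, max_age_seconds):
    """Keep only (ts, value) points newer than now - max_age_seconds."""
    cutoff = now - max_age_seconds
    return [(ts, v) for ts, v in points if ts >= cutoff]


# One point per day for 100 days, then a 90-day retention pass.
points = [(day * DAY, float(day)) for day in range(100)]
kept = apply_retention(points, now=100 * DAY, max_age_seconds=90 * DAY)
print(len(points), "->", len(kept))
```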
**Storage math:** 1 metric at 1-second resolution = 86,400 points/day. 1,000 metrics = 86.4M points/day. At 8 bytes per point that is ~690 MB/day uncompressed. TimescaleDB delivers 90-95% compression on old chunks, so ~35-70 MB/day. A year totals 12-25 GB instead of 250 GB.
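The arithmetic above is easy to reproduce and to adapt to other metric counts. A small sketch, using the same assumptions as the text (8 bytes per point, 90-95% compression); the function name `daily_bytes` is hypothetical.

```python
SECONDS_PER_DAY = 86_400


def daily_bytes(metrics, bytes_per_point=8, resolution_seconds=1):
    """Uncompressed bytes written per day for a given number of metrics."""
    points_per_metric = SECONDS_PER_DAY // resolution_seconds
    return metrics * points_per_metric * bytes_per_point


raw_per_day = daily_bytes(1_000)
print(f"raw: {raw_per_day / 1e6:.1f} MB/day, {raw_per_day * 365 / 1e9:.1f} GB/year")

for ratio in (0.90, 0.95):
    compressed = raw_per_day * (1 - ratio)
    print(
        f"{ratio:.0%} compression: {compressed / 1e6:.1f} MB/day, "
        f"{compressed * 365 / 1e9:.1f} GB/year"
    )
```

Note that 8 bytes covers only the value; if each point also stores an 8-byte timestamp, the uncompressed figures double.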
Myth: a time-series DB is only needed for infrastructure metrics.
Reality: the time-series pattern fits anywhere data is bound to time: IoT, financial ticks, user events, logs with aggregation.
Any data with a "what happened and when" pattern benefits from time-series optimizations. TimescaleDB stores market ticks; InfluxDB powers EV telemetry. The boundary is not the industry but the access pattern: append-only, time-range queries, aggregation.
A system stores 1-second metrics with no retention policy. In 3 months the disk fills up. What is the right approach?
Key takeaways
- Time-series data is append-only with time as the primary dimension; specialized stores (InfluxDB, TimescaleDB) hold it far more compactly than uncompressed PostgreSQL tables, roughly 10-20x at 90-95% compression
- Cardinality explosion in InfluxDB: tags are indexed, so high-cardinality fields (userId, requestId) must be fields, not tags
- Downsampling + retention = managed growth: raw data for 7 days for diagnostics, 5-minute aggregates for 30 days and hourly aggregates for 1 year for trends, auto-purge via retention policy
- TimescaleDB when you need SQL and JOINs with business data; InfluxDB for pure metrics with high throughput; Prometheus for scrape-based microservice monitoring
Related topics
Time-series data is part of a wider realtime storage and processing ecosystem:
- Redis Streams: a buffer for time-series events before they are written to InfluxDB/TimescaleDB; used as an intermediate layer at high write throughput
- Apache Kafka: stores timestamped events and can feed time-series DBs via Kafka Connect or consumer services
- WebSocket and SSE: real-time delivery of time-series data to clients; Grafana uses WebSocket for live chart updates from InfluxDB
Questions for reflection
- Which data in your project is time-series in nature? Could it benefit from a specialized store?
- If you had to add monitoring for 500 microservices with 1-year retention, would you pick Prometheus, InfluxDB, or TimescaleDB and why?
- How does downsampling shift the trade-off between data precision and storage cost? Where does the reasonable line sit?