The Analytics Stack

Collection, storage, and reporting — the three-part analytics pipeline

The Three-Part Pipeline

A basic analytics pipeline has three parts:

  1. Collection — Receives data from client-side scripts (beacons, events) or ingests server logs. This is the entry point for all analytics data.
  2. Storage — A database optimized for time-series queries: how many page views last week? What was the error rate yesterday? Typical choices include PostgreSQL, ClickHouse, or cloud data warehouses.
  3. Reporting — Dashboards and visualizations that transform raw data into actionable insight. This is where numbers become decisions.
  ┌──────────┐        ┌────────────┐        ┌──────────┐        ┌───────────┐
  │  Client  │──────▶ │ Collection │──────▶ │ Storage  │──────▶ │ Reporting │
  │ (Browser)│  HTTP  │  (Server)  │ INSERT │(Database)│ SELECT │(Dashboard)│
  └──────────┘  POST  └────────────┘        └──────────┘        └───────────┘
       │                    │                    │                    │
   JS events,         Receives &          Time-series         Charts, tables,
   performance        validates            queries              alerts, exports
   timing, errors     payloads
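
The flow in the diagram can be sketched end to end in a few lines. This is a minimal illustration, not the course's actual implementation: it assumes events arrive as JSON with hypothetical fields type, url, and ts (an ISO 8601 timestamp), and it uses an in-memory SQLite database as a stand-in for whichever storage engine you choose. The function names collect, store, and report are invented for this example.

```python
import json
import sqlite3

# Hypothetical event schema: these field names are an assumption for this sketch.
REQUIRED_FIELDS = {"type", "url", "ts"}


def collect(raw_body: str) -> dict:
    """Collection: parse and validate one JSON payload sent by a client."""
    event = json.loads(raw_body)
    if not isinstance(event, dict):
        raise ValueError("payload must be a JSON object")
    missing = REQUIRED_FIELDS - event.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return event


def store(db: sqlite3.Connection, event: dict) -> None:
    """Storage: INSERT the validated event as one row."""
    db.execute(
        "INSERT INTO events (type, url, ts) VALUES (?, ?, ?)",
        (event["type"], event["url"], event["ts"]),
    )


def report(db: sqlite3.Connection) -> list:
    """Reporting: SELECT page views per day (first 10 chars of ts = date)."""
    return db.execute(
        "SELECT substr(ts, 1, 10) AS day, COUNT(*) FROM events "
        "WHERE type = 'pageview' GROUP BY day ORDER BY day"
    ).fetchall()


# In-memory SQLite stands in for the real storage layer.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE events (type TEXT, url TEXT, ts TEXT)")

payloads = [
    '{"type": "pageview", "url": "/", "ts": "2024-01-01T09:00:00Z"}',
    '{"type": "pageview", "url": "/about", "ts": "2024-01-01T17:30:00Z"}',
    '{"type": "error", "url": "/about", "ts": "2024-01-01T17:31:00Z"}',
    '{"type": "pageview", "url": "/", "ts": "2024-01-02T08:15:00Z"}',
]
for body in payloads:
    store(db, collect(body))

daily_views = report(db)
print(daily_views)
```

Keeping the three stages as separate functions mirrors the diagram: each one can later be replaced (a real HTTP endpoint, a different database, a dashboard) without touching the other two.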

Building Each Part

In this course, you build each part yourself. The collector receives analytics events from the browser as HTTP POST requests, and it also ingests HTTP access logs from the server. Data from both sources is aggregated into the storage layer, which the reporting layer then queries and presents visually.
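
As a rough illustration of the access-log side of collection, the sketch below parses log lines and computes an error rate. It assumes the logs use the Common Log Format, which the text does not specify; the sample lines, the parse_line helper, and the definition of "error" as any status of 400 or above are all assumptions made for this example.

```python
import re
from collections import Counter

# Assumed format: Common Log Format, e.g.
# host ident user [time] "METHOD path protocol" status size
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line: str):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    return record


# Made-up sample lines using documentation-reserved IP addresses.
sample_lines = [
    '203.0.113.5 - - [10/Oct/2024:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2024:13:56:01 +0000] "GET /missing HTTP/1.1" 404 153',
    '203.0.113.5 - - [10/Oct/2024:13:57:12 +0000] "POST /collect HTTP/1.1" 204 -',
    'not a log line',
]

# Skip lines that fail to parse rather than aborting the whole batch.
records = [r for r in map(parse_line, sample_lines) if r is not None]

status_counts = Counter(r["status"] for r in records)
error_rate = sum(1 for r in records if r["status"] >= 400) / len(records)

print(status_counts)
print(f"error rate: {error_rate:.2%}")
```

Once parsed into records like these, access-log data can be written to the same storage layer as the client-side events.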