Data Collection Methods

Server logs, network capture, and client-side scripts — three approaches to gathering analytics data

Three Approaches

There are three fundamental approaches to collecting analytics data, each capturing data at a different point in the client-server communication:

  1. Server logs: The web server automatically records every request it receives, including IP address, URL, status code, User-Agent, and timestamp. In principle, anything carried in the HTTP request can be logged, including extra data the browser supplies through Client Hints. A key advantage of logging is that no client code changes are needed. Although this is the oldest method, it is also the most reliable, and complete analytics solutions draw on logs.
  2. Network capture (packet sniffing): Intercepting traffic between client and server to inspect the full request and response. It requires network access but no server or client changes. It is largely obsolete for content analysis now that HTTPS encrypts payloads, except inside an environment you control, such as a datacenter, where you can install a certificate on the device to decrypt the traffic.
  3. Client-side scripts: JavaScript running in the browser, or native code in a mobile application, that captures events (clicks, scrolls, errors, timing) and sends them to a collector. This is the most flexible method and collects a wealth of information, but it requires code deployment and, in the browser, depends on JavaScript being enabled.

  ┌──────────┐                                           ┌──────────┐
  │  Client  │─────────────── Network ──────────────────▶│  Server  │
  │ (Browser)│                                           │          │
  └──────────┘                                           └──────────┘
       │                        │                             │
  Client-side              Network capture              Server logs
  scripts capture:         captures:                    capture:
  • DOM events             • Request/response           • IP address
  • Scroll depth             headers                    • URL path
  • Click coordinates      • Payload content            • Status code
  • JS errors                (HTTP only, not            • User-Agent
  • Performance timing       HTTPS content)             • Timestamp
  • Viewport size          • Connection metadata        • Response size
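
To make the server-log column concrete, here is a minimal sketch of pulling those fields out of one access-log line. It assumes the "combined" log format that Apache and Nginx both support; the LogEntry type, the parseLogLine helper, and the sample line are illustrative names and data, not part of any tool mentioned here.

```typescript
// Fields a server log typically records (illustrative type, not a standard).
interface LogEntry {
  ip: string;
  timestamp: string;
  method: string;
  path: string;
  status: number;
  bytes: number | null; // "-" in the log means no body was sent
  userAgent: string;
}

// Assumed layout ("combined" format):
// ip ident user [time] "METHOD path PROTO" status bytes "referer" "user-agent"
const COMBINED =
  /^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) (\d+|-) "[^"]*" "([^"]*)"$/;

function parseLogLine(line: string): LogEntry | null {
  const m = COMBINED.exec(line.trim());
  if (m === null) return null; // line is not in the assumed format
  return {
    ip: m[1],
    timestamp: m[2],
    method: m[3],
    path: m[4],
    status: Number(m[5]),
    bytes: m[6] === "-" ? null : Number(m[6]),
    userAgent: m[7],
  };
}

// Made-up sample line using a documentation-reserved IP address.
const sample =
  '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "Mozilla/5.0"';
console.log(parseLogLine(sample));
```

A real deployment may use a custom log format, and User-Agent strings containing escaped quotes would need a more careful pattern, so treat the regular expression as a starting point only.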

Comparison of Methods

Aspect                  | Server Logs                          | Network Capture                   | Client-Side Scripts
------------------------+--------------------------------------+-----------------------------------+---------------------------------
What it captures        | HTTP requests received by server     | All traffic on the wire           | Any browser event or state
Requires code changes?  | No (built into web servers)          | No (passive observation)          | Yes (must add JS to pages)
Captures client events? | No (only sees requests)              | No (only sees network traffic)    | Yes (clicks, scrolls, errors)
Works with HTTPS?       | Yes (runs on the server)             | Metadata only (content encrypted) | Yes (runs in the browser)
Performance impact      | Minimal (logging is routine)         | Variable (depends on volume)      | Variable (adds JS payload)
Privacy concerns        | Moderate (IP, paths)                 | High (deep packet inspection)     | Very High (can capture anything)
Example tools           | Apache/Nginx logs, GoAccess, AWStats | Wireshark, tcpdump                | Google Analytics, custom beacons
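
As an illustration of the "custom beacons" entry, the following is a minimal sketch of a client-side script that reports click coordinates to a collector. The /collect endpoint, the AnalyticsEvent shape, and the buildEvent helper are all hypothetical; only document.addEventListener and navigator.sendBeacon are real browser APIs.

```typescript
// Hypothetical event shape; field names are illustrative only.
interface AnalyticsEvent {
  type: string;
  url: string;
  ts: number; // milliseconds since the Unix epoch
  data: Record<string, number | string>;
}

// Pure helper that builds the JSON body, so it can be tested outside a browser.
function buildEvent(
  type: string,
  url: string,
  data: Record<string, number | string>,
  now: number = Date.now(),
): string {
  const event: AnalyticsEvent = { type, url, ts: now, data };
  return JSON.stringify(event);
}

// Browser wiring. globalThis is cast to `any` so the sketch also compiles
// and runs (as a no-op) in environments without a DOM.
const env = globalThis as any;
if (env.document && env.navigator && typeof env.navigator.sendBeacon === "function") {
  env.document.addEventListener("click", (e: { clientX: number; clientY: number }) => {
    const body = buildEvent("click", String(env.location.href), {
      x: e.clientX,
      y: e.clientY,
    });
    // sendBeacon queues a POST that the browser can deliver even if the page
    // is being unloaded. "/collect" is an assumed collector endpoint.
    env.navigator.sendBeacon("/collect", body);
  });
}
```

Keeping the payload builder separate from the browser wiring is a design choice made here for testability; a production script would also need to consider consent, sampling, and batching.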