What Can Be Collected

Data from HTTP headers, URLs, servers, and JavaScript — the full analytics data landscape

Four Data Sources

Analytics data comes from four sources. The first three are captured automatically with every request; the fourth requires code running in the browser:

From HTTP Headers (Automatic)

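  1. IP address — approximate geographic location, ISP, rough identity (strictly, this is read from the network connection rather than from a header, but it is logged with every request)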
  2. User-Agent — browser, OS, device type
  3. Referer — where the user came from (previous page or search engine)
  4. Accept-Language — user's language preferences
  5. Cookies — session identifiers, tracking IDs, preferences
  6. Other headers — any header can be collected if the log format is extended, including headers set by script code or sent in response to Client Hints. See Section 5 (Enriching Server Logs) for details on log formats, Client Hints, and script-to-header techniques.
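
As a rough illustration of the list above, the sketch below pulls these fields out of a single request using only the Python standard library. The function names and the shape of the returned dictionary are hypothetical, chosen for this example, and the User-Agent is kept as a raw string because reliable browser and device detection needs a dedicated parser.

```python
from http.cookies import SimpleCookie
from urllib.parse import urlsplit


def parse_accept_language(value: str) -> list:
    """Return language tags ordered by descending q-value (default q=1.0)."""
    weighted = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        q = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # malformed weight: treat as least preferred
        weighted.append((q, tag.strip()))
    # sorted() is stable, so equal weights keep the order the browser sent.
    return [tag for q, tag in sorted(weighted, key=lambda pair: -pair[0])]


def extract_header_fields(remote_addr: str, headers: dict) -> dict:
    """Collect the analytics fields available from one request (hypothetical helper)."""
    # Header names are case-insensitive, so normalise them before lookup.
    h = {name.lower(): value for name, value in headers.items()}
    cookies = SimpleCookie()
    cookies.load(h.get("cookie", ""))
    referer = h.get("referer", "")
    return {
        "ip": remote_addr,  # comes from the connection, not from a header
        "user_agent": h.get("user-agent", ""),
        "referer_host": urlsplit(referer).hostname if referer else None,
        "languages": parse_accept_language(h.get("accept-language", "")),
        "cookies": {name: morsel.value for name, morsel in cookies.items()},
    }
```

In practice a web framework usually exposes the headers and the remote address already parsed; the point here is only that none of these fields needs any client-side code.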

From the URL (Automatic)

  • Path — which page or resource was requested
  • Query parameters — search terms, filter state, pagination
  • UTM codes — campaign tracking (utm_source, utm_medium, utm_campaign)
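
The three URL components can be separated with Python's urllib.parse, as in the minimal sketch below. The function name and return shape are assumptions made for this example; only the three UTM parameters named above are treated as campaign codes.

```python
from urllib.parse import parse_qs, urlsplit

# The campaign parameters named in the list above.
UTM_KEYS = ("utm_source", "utm_medium", "utm_campaign")


def extract_url_fields(url: str) -> dict:
    """Split a request URL into path, UTM codes, and remaining query parameters."""
    parts = urlsplit(url)
    # parse_qs returns a list per key because a parameter may repeat.
    params = parse_qs(parts.query)
    utm = {key: params[key][0] for key in UTM_KEYS if key in params}
    other = {key: values for key, values in params.items() if key not in UTM_KEYS}
    return {"path": parts.path, "utm": utm, "params": other}
```

For example, a request for /products/shoes?q=running&utm_source=newsletter yields the path /products/shoes, one UTM code, and one ordinary query parameter.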

From the Server (Automatic)

  • Timestamp — when the request arrived
  • Status code — success, redirect, client error, server error
  • Response size — bytes transferred
  • Processing time — how long the server took to respond
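
These server-side fields normally end up in an access log. The sketch below parses one log line, assuming a Common Log Format line with the processing time in seconds appended as a final field. That layout is an assumption for this example; real servers need to be configured to log processing time, and the field order depends on that configuration.

```python
import re
from datetime import datetime
from typing import Optional

# Assumed layout: Common Log Format followed by processing time in seconds.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<target>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) (?P<seconds>\d+(?:\.\d+)?)'
)

STATUS_CLASSES = {2: "success", 3: "redirect", 4: "client error", 5: "server error"}


def parse_log_line(line: str) -> Optional[dict]:
    """Return the server-side fields from one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    status = int(match["status"])
    return {
        "timestamp": datetime.strptime(match["time"], "%d/%b/%Y:%H:%M:%S %z"),
        "status": status,
        "status_class": STATUS_CLASSES.get(status // 100, "other"),
        # A "-" size means no response body was sent.
        "bytes": 0 if match["size"] == "-" else int(match["size"]),
        "seconds": float(match["seconds"]),
    }
```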

From JavaScript (Requires Code)

  • Viewport dimensions — actual visible area, not screen resolution
  • Scroll depth — how far down the page users read
  • Click coordinates and targets — what users click on
  • Mouse movement — where attention and hesitation occur
  • Performance timing API — DNS, TCP, TTFB, DOM load, paint times
  • JavaScript errors — exceptions, stack traces, failed resource loads
  • DOM state — form values, element visibility, dynamic content
  • Custom events — add-to-cart, video play, tab switch, or any other event you define