Web server logs are your primary window into server behavior. Without proper logging, debugging issues becomes guesswork. With good logging, you can answer questions like:

- Which pages get the most traffic, and from where?
- Why did a specific user see an error, and what were they doing at the time?
- Is someone probing the server for vulnerabilities?
- Which requests are slow, and are they getting slower?
Web servers typically maintain two types of logs:
| Log Type | Purpose | When to Check |
|---|---|---|
| Access Log | Records every request: who, what, when | Traffic analysis, debugging, security audits |
| Error Log | Records problems and failures | When something breaks, startup issues |
Here's a typical access log entry:

192.168.1.100 - john [15/Nov/2024:14:32:18 +0000] "GET /products/widget HTTP/1.1" 200 4523 "https://example.com/" "Mozilla/5.0..."

Let's break it down:
| Field | Value | Meaning |
|---|---|---|
| Client IP | 192.168.1.100 | Who made the request |
| Identity | - | RFC 1413 ident (rarely used) |
| User | john | Authenticated username (or -) |
| Timestamp | [15/Nov/2024:14:32:18 +0000] | When the request completed |
| Request | "GET /products/widget HTTP/1.1" | Method, path, protocol |
| Status | 200 | HTTP response code |
| Size | 4523 | Response body size in bytes |
| Referer | "https://example.com/" | Page that linked here |
| User-Agent | "Mozilla/5.0..." | Client software identification |
The web has standardized log formats that tools understand universally:
The Common Log Format (CLF) is the original Apache format: minimal but universal:
%h %l %u %t "%r" %>s %b
The Combined Log Format is CLF plus referer and user-agent, and is the most common format in practice:
%h %l %u %t "%r" %>s %b "%{Referer}i" "%{User-Agent}i"
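If you need to pull fields out of these lines programmatically, a single regex does the job. A sketch in Node (the regex and field names are my own, and it assumes well-formed quoting, which standard server output guarantees):

// Parse a combined-format line into named fields
const COMBINED = /^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+) "([^"]*)" "([^"]*)"$/;

function parseCombined(line) {
  const m = COMBINED.exec(line);
  if (!m) return null;
  const [, ip, ident, user, time, request, status, bytes, referer, userAgent] = m;
  return { ip, ident, user, time, request, status: Number(status), bytes, referer, userAgent };
}

// The example entry from above parses into an object
console.log(parseCombined(
  '192.168.1.100 - john [15/Nov/2024:14:32:18 +0000] ' +
  '"GET /products/widget HTTP/1.1" 200 4523 ' +
  '"https://example.com/" "Mozilla/5.0..."'
));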
Modern systems prefer structured JSON for easier parsing:
{
"timestamp": "2024-11-15T14:32:18.123Z",
"client_ip": "192.168.1.100",
"method": "GET",
"path": "/products/widget",
"status": 200,
"bytes": 4523,
"duration_ms": 45,
"referer": "https://example.com/",
"user_agent": "Mozilla/5.0..."
}
JSON logs are easier to parse programmatically, can include arbitrary fields, and integrate better with modern log aggregation tools like Elasticsearch, Loki, and Datadog.
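For instance, a few lines of Node can tally status codes straight from a JSON log. A minimal sketch, assuming one JSON object per line in the access.json.log file produced by the Nginx config below:

const fs = require('fs');
const readline = require('readline');

// Tally responses by status code from a JSON-lines access log
async function statusCounts(path) {
  const rl = readline.createInterface({ input: fs.createReadStream(path) });
  const counts = {};
  for await (const line of rl) {
    try {
      const entry = JSON.parse(line);
      counts[entry.status] = (counts[entry.status] || 0) + 1;
    } catch {
      // tolerate truncated or malformed lines
    }
  }
  return counts;
}

statusCounts('/var/log/nginx/access.json.log')
  .then(console.log); // e.g. { '200': 9512, '404': 87, '500': 3 }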
In Nginx, custom formats are declared with log_format (valid only at the http level) and selected per access_log directive:

# Define custom log format
log_format combined_extra '$remote_addr - $remote_user [$time_local] '
'"$request" $status $body_bytes_sent '
'"$http_referer" "$http_user_agent" '
'rt=$request_time uct=$upstream_connect_time '
'uht=$upstream_header_time urt=$upstream_response_time';
# JSON format ($upstream_response_time is quoted below: it can be
# empty or a comma-separated list, either of which breaks a bare JSON number)
log_format json_logs escape=json '{'
'"timestamp":"$time_iso8601",'
'"client_ip":"$remote_addr",'
'"method":"$request_method",'
'"path":"$uri",'
'"query":"$args",'
'"status":$status,'
'"bytes":$body_bytes_sent,'
'"duration":$request_time,'
'"upstream_time":$upstream_response_time,'
'"referer":"$http_referer",'
'"user_agent":"$http_user_agent",'
'"request_id":"$request_id"'
'}';
# Conditional-logging switch (http level, like log_format):
# health checks get $loggable=0 and are skipped below
map $request_uri $loggable {
    ~^/health  0;
    default    1;
}

server {
    # Use the custom formats
    access_log /var/log/nginx/access.log combined_extra;
    access_log /var/log/nginx/access.json.log json_logs;

    # Conditional logging (skip health checks); in practice this
    # replaces the unconditional access.log line above
    access_log /var/log/nginx/access.log combined if=$loggable;

    # Error log with level
    error_log /var/log/nginx/error.log warn;
}
Apache does the same with LogFormat and CustomLog:

# Define custom log format
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%h %l %u %t \"%r\" %>s %b" common
# Add response time (%D is microseconds)
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %D" combined_time
# JSON format
LogFormat "{ \"timestamp\":\"%{%Y-%m-%dT%H:%M:%S}t\", \"client_ip\":\"%a\", \"method\":\"%m\", \"path\":\"%U\", \"query\":\"%q\", \"status\":%>s, \"bytes\":%B, \"duration\":%D, \"referer\":\"%{Referer}i\", \"user_agent\":\"%{User-Agent}i\" }" json
<VirtualHost *:80>
ServerName example.com
# Access log with format name
CustomLog /var/log/apache2/access.log combined
CustomLog /var/log/apache2/access.json.log json
# Error log
ErrorLog /var/log/apache2/error.log
LogLevel warn
    # Conditional logging (skip health checks); in practice this
    # replaces the unconditional access.log line above
SetEnvIf Request_URI "^/health" no_log
CustomLog /var/log/apache2/access.log combined env=!no_log
</VirtualHost>
Application servers keep their own access logs as well. In Node/Express, the morgan middleware implements the standard formats:

const express = require('express');
const morgan = require('morgan');
const fs = require('fs');
const app = express();
// Create write stream for file logging
const accessLogStream = fs.createWriteStream(
'/var/log/node/access.log',
{ flags: 'a' }
);
// Combined format to file
app.use(morgan('combined', { stream: accessLogStream }));
// Dev format to console
app.use(morgan('dev'));
// Custom format
morgan.token('request-id', (req) => req.headers['x-request-id']);
app.use(morgan(':method :url :status :response-time ms - :request-id'));
// Custom JSON logger
app.use((req, res, next) => {
const start = Date.now();
res.on('finish', () => {
const log = {
timestamp: new Date().toISOString(),
method: req.method,
path: req.path,
query: req.query,
status: res.statusCode,
duration: Date.now() - start,
ip: req.ip,
userAgent: req.headers['user-agent'],
requestId: req.headers['x-request-id']
};
console.log(JSON.stringify(log));
});
next();
});
Error logs capture problems; they're what you check first when something breaks:
# Nginx error log examples
2024/11/15 14:32:18 [error] 1234#0: *5678 open() "/var/www/html/missing.html" failed (2: No such file or directory)
2024/11/15 14:32:19 [crit] 1234#0: *5679 connect() to 127.0.0.1:3000 failed (111: Connection refused)
2024/11/15 14:32:20 [warn] 1234#0: *5680 upstream server temporarily disabled
| Level | Nginx | Apache | Use For |
|---|---|---|---|
| emerg | emerg | emerg | System unusable |
| alert | alert | alert | Immediate action needed |
| crit | crit | crit | Critical conditions |
| error | error | error | Error conditions |
| warn | warn | warn | Warning conditions |
| notice | notice | notice | Normal but significant |
| info | info | info | Informational |
| debug | debug | debug | Debugging (verbose!) |
Never run debug level in production—it generates enormous amounts of data and can impact performance. Use it only for specific troubleshooting, then turn it off.
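In application code, the usual way to honor this is to gate verbosity on configuration rather than editing code. A sketch in Node, with level names mirroring the table above (the LOG_LEVEL environment variable is my own convention, not a Node built-in):

// Drop messages below the configured level so debug logging can't
// reach production unless explicitly enabled
const LEVELS = ['emerg', 'alert', 'crit', 'error', 'warn', 'notice', 'info', 'debug'];
const threshold = LEVELS.indexOf(process.env.LOG_LEVEL || 'warn');

function log(level, message, data = {}) {
  if (LEVELS.indexOf(level) > threshold) return; // too verbose for this environment
  console.log(JSON.stringify({
    timestamp: new Date().toISOString(),
    level,
    message,
    ...data
  }));
}

log('error', 'upstream connect failed', { upstream: '127.0.0.1:3000' }); // always logged
log('debug', 'request headers dumped', {});                              // suppressed at warn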
When a request touches multiple services, how do you trace it? Correlation IDs (also called request IDs or trace IDs) solve this:
# Use the client-supplied ID when present; fall back to the built-in
# $request_id that Nginx generates for every request
map $http_x_request_id $req_id {
    default $http_x_request_id;
    ""      $request_id;
}

# Include the ID in logs (log_format belongs at http level)
log_format with_id '$remote_addr - [$time_local] "$request" '
                   '$status request_id=$req_id';

server {
    access_log /var/log/nginx/access.log with_id;

    location / {
        # Pass to upstream
        proxy_set_header X-Request-ID $req_id;

        # Also add to response for debugging
        add_header X-Request-ID $req_id;

        proxy_pass http://app;
    }
}
const { v4: uuidv4 } = require('uuid');
// Middleware to handle request IDs
app.use((req, res, next) => {
// Use incoming ID or generate new one
req.requestId = req.headers['x-request-id'] || uuidv4();
// Include in response
res.setHeader('X-Request-ID', req.requestId);
// Make available for logging
req.log = {
info: (msg, data) => console.log(JSON.stringify({
timestamp: new Date().toISOString(),
level: 'info',
requestId: req.requestId,
message: msg,
...data
})),
error: (msg, data) => console.error(JSON.stringify({
timestamp: new Date().toISOString(),
level: 'error',
requestId: req.requestId,
message: msg,
...data
}))
};
next();
});
// Usage in routes
app.get('/api/users/:id', async (req, res) => {
req.log.info('Fetching user', { userId: req.params.id });
try {
const user = await db.getUser(req.params.id);
req.log.info('User found', { userId: user.id });
res.json(user);
} catch (err) {
req.log.error('User lookup failed', { error: err.message });
res.status(500).json({ error: 'Internal error' });
}
});
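The last step is propagating the ID on outbound calls so downstream services log the same value. A sketch building on the middleware above, using Node 18+'s global fetch; INVENTORY_URL and the /stock route are hypothetical:

// Forward the correlation ID to a downstream service so both
// sides log the same ID
app.get('/api/orders/:id', async (req, res) => {
  const upstream = await fetch(`${process.env.INVENTORY_URL}/stock/${req.params.id}`, {
    headers: { 'X-Request-ID': req.requestId }
  });
  req.log.info('Inventory lookup', { status: upstream.status });
  res.json(await upstream.json());
});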
Logs grow forever; without rotation, they'll fill your disk. Log rotation archives old logs and optionally compresses or deletes them:
# /etc/logrotate.d/nginx
# (logrotate only treats lines starting with # as comments,
# so directives are annotated above rather than inline)
/var/log/nginx/*.log {
    # Rotate daily and keep 14 days of history
    daily
    rotate 14
    # Gzip old logs, but not the most recent backup
    compress
    delaycompress
    # Skip missing or empty logs instead of erroring
    missingok
    notifempty
    # Ownership and permissions for the fresh log file
    create 0640 nginx adm
    # Run postrotate once for all matched logs
    sharedscripts
    postrotate
        # Signal Nginx to reopen its log files
        if [ -f /var/run/nginx.pid ]; then
            kill -USR1 `cat /var/run/nginx.pid`
        fi
    endscript
}
For high-traffic servers, rotating by size rather than by time keeps any single file manageable:

/var/log/nginx/*.log {
    # Rotate when a log reaches 100MB; keep 10 rotations
    size 100M
    rotate 10
    compress
    missingok
    notifempty
    # The same postrotate signal as above is still required
}
Nginx (and many servers) hold log files open. After rotation, you must signal the process to close and reopen its log files, or it will continue writing to the renamed/deleted file.
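The same caveat applies to the Node example earlier: morgan's write stream holds the file open, so after rotation it keeps writing to the renamed file. One approach is to reopen the stream on a signal from logrotate's postrotate script; a sketch reusing the requires from that example (SIGUSR2 is an arbitrary choice of signal):

let accessLogStream = fs.createWriteStream('/var/log/node/access.log', { flags: 'a' });

// logrotate's postrotate script sends this signal after renaming the
// file; close the old handle and reopen the original path
process.on('SIGUSR2', () => {
  const old = accessLogStream;
  accessLogStream = fs.createWriteStream('/var/log/node/access.log', { flags: 'a' });
  old.end();
});

// Hand morgan a wrapper so each line goes to the *current* stream,
// not the one captured at startup
app.use(morgan('combined', {
  stream: { write: (line) => accessLogStream.write(line) }
}));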
Beyond logs, monitoring metrics gives you real-time insight into server health:
| Metric | What It Measures | Warning Signs |
|---|---|---|
| Request Rate | Requests per second | Sudden drops or spikes |
| Latency P50/P95/P99 | Response time distribution | P99 > 1s, growing gap between P50/P99 |
| Error Rate | % of 5xx responses | > 1% sustained |
| Active Connections | Current open connections | Near max limits |
| Upstream Response Time | How long backends take | Increasing trend |
| Bandwidth | Bytes in/out | Approaching network limits |
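Several of these can be tracked in-process before you reach for a full metrics stack. A minimal Express sketch covering request rate, error rate, and latency percentiles over a one-minute window (the window length, the nearest-rank percentile approximation, and the output format are all arbitrary choices):

// In-process counters, reset every reporting interval
const stats = { requests: 0, errors: 0, durations: [] };

app.use((req, res, next) => {
  const start = Date.now();
  res.on('finish', () => {
    stats.requests++;
    if (res.statusCode >= 500) stats.errors++;
    stats.durations.push(Date.now() - start);
  });
  next();
});

// Emit one metrics line per minute, then reset the window
setInterval(() => {
  const sorted = [...stats.durations].sort((a, b) => a - b);
  const pct = (q) => sorted[Math.floor(sorted.length * q)] || 0; // nearest-rank
  console.log(JSON.stringify({
    rps: (stats.requests / 60).toFixed(2),
    error_rate: stats.requests ? stats.errors / stats.requests : 0,
    p50_ms: pct(0.50),
    p95_ms: pct(0.95),
    p99_ms: pct(0.99)
  }));
  stats.requests = 0;
  stats.errors = 0;
  stats.durations = [];
}, 60 * 1000);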
Nginx provides a built-in status page for basic metrics:
# Enable stub_status module
server {
listen 8080;
server_name localhost;
location /nginx_status {
stub_status;
allow 127.0.0.1; # Only local access
deny all;
}
}
# Output:
# Active connections: 291
# server accepts handled requests
# 16630948 16630948 31070465
# Reading: 6 Writing: 179 Waiting: 106
| Field | Meaning |
|---|---|
| Active connections | Current client connections |
| accepts | Total accepted connections |
| handled | Total handled connections (should equal accepts) |
| requests | Total requests (can exceed connections due to keep-alive) |
| Reading | Reading client request headers |
| Writing | Writing response to client |
| Waiting | Keep-alive connections waiting for request |
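Scraping this page is straightforward. A sketch in Node 18+ (global fetch) that polls the endpoint configured above and parses the three lines of output:

// Poll stub_status and parse its plain-text response
async function nginxStatus() {
  const res = await fetch('http://127.0.0.1:8080/nginx_status');
  const text = await res.text();

  const active = Number(text.match(/Active connections:\s*(\d+)/)[1]);
  const [, accepts, handled, requests] =
    text.match(/(\d+)\s+(\d+)\s+(\d+)\s*\n\s*Reading/).map(Number);
  const [, reading, writing, waiting] =
    text.match(/Reading:\s*(\d+)\s+Writing:\s*(\d+)\s+Waiting:\s*(\d+)/).map(Number);

  return { active, accepts, handled, requests, reading, writing, waiting };
}

// Log a snapshot every 10 seconds
setInterval(() => nginxStatus().then(console.log), 10000);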
Modern systems distinguish between three pillars of observability:
| Pillar | What | When to Use | Tools |
|---|---|---|---|
| Logs | Discrete events with context | Debugging specific issues | ELK, Loki, Splunk |
| Metrics | Numeric measurements over time | Dashboards, alerting, trends | Prometheus, Datadog, CloudWatch |
| Traces | Request flow across services | Distributed debugging | Jaeger, Zipkin, X-Ray |
For most web servers, good logs with correlation IDs and basic metrics are sufficient. Add distributed tracing when you have multiple services and need to understand request flow across them.
Common command-line tools for analyzing logs:
# Recent 500 errors
grep '" 500 ' /var/log/nginx/access.log | tail -20
# Count requests by status code
awk '{print $9}' /var/log/nginx/access.log | sort | uniq -c | sort -rn
# 5xx errors in last hour
awk -v date="$(date -d '1 hour ago' '+%d/%b/%Y:%H')" '$4 ~ date && $9 ~ /^5/' access.log
# Top 10 IP addresses
awk '{print $1}' access.log | sort | uniq -c | sort -rn | head -10
# Top requested paths
awk '{print $7}' access.log | sort | uniq -c | sort -rn | head -10
# Requests per minute
awk '{print $4}' access.log | cut -d: -f1-3 | uniq -c
# Bandwidth by path
awk '{sum[$7]+=$10} END {for (path in sum) print sum[path], path}' access.log | sort -rn | head
# Slowest response times (using the combined_extra format's rt= field)
grep 'rt=' access.log | awk -F'rt=' '{print $2}' | awk '{print $1}' | sort -rn | head -20
# Average response time (count only lines that have rt=)
awk -F'rt=' '/rt=/{sum+=$2; n++} END {if (n) print sum/n}' access.log
# P95 latency (simple approximation)
grep 'rt=' access.log | awk -F'rt=' '{print $2}' | awk '{print $1}' | sort -n | \
awk '{all[NR]=$1} END {print all[int(NR*0.95)]}'