The Concurrency Challenge
A web server must handle many clients simultaneously. While one client waits for a database query, another is uploading a file, and a third is downloading an image. How a server manages these concurrent connections defines its architecture—and determines its performance characteristics under load.
There are three fundamental approaches, each with distinct trade-offs:
- Process-per-request — Fork a new process for each connection
- Thread-per-request — Spawn a thread for each connection
- Event-driven — One thread handles many connections via async I/O
Process-Per-Request
The simplest model: when a request arrives, the server forks a new process to handle it. Each process has its own memory space and runs independently.
┌─────────────────────────────────────────────────────────────┐
│ Master Process │
│ (listens on port 80) │
└──────────────────────────┬──────────────────────────────────┘
│ fork()
┌─────────────────┼─────────────────┐
│ │ │
▼ ▼ ▼
┌───────────┐ ┌───────────┐ ┌───────────┐
│ Worker │ │ Worker │ │ Worker │
│ Process 1 │ │ Process 2 │ │ Process 3 │
│ │ │ │ │ │
│ Request A │ │ Request B │ │ Request C │
└───────────┘ └───────────┘ └───────────┘
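To make the pattern concrete, here is a minimal accept-then-fork sketch in POSIX C. It is a sketch, not Apache's code: error handling and request parsing are omitted, and port 8080 is an arbitrary choice.
/* Minimal fork-per-connection sketch (POSIX C; illustrative only) */
#include <netinet/in.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);             /* arbitrary port for the sketch */
    bind(listener, (struct sockaddr *)&addr, sizeof addr);
    listen(listener, 128);
    signal(SIGCHLD, SIG_IGN);                /* let the kernel reap exited workers */

    for (;;) {
        int conn = accept(listener, NULL, NULL);   /* blocks until a client connects */
        if (fork() == 0) {                         /* child: owns this one connection */
            close(listener);                       /* child never accepts new clients */
            const char *resp = "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok";
            write(conn, resp, strlen(resp));
            close(conn);
            _exit(0);                              /* worker lives for one request */
        }
        close(conn);                               /* parent: hand off, keep accepting */
    }
}
Each accept() hands the connection to a fresh process while the parent immediately returns to accepting. That isolation is exactly what makes the model simple, and exactly what makes it expensive.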
Apache's prefork MPM uses this model. It pre-forks a pool of worker processes, and each incoming connection is handed to an available worker.
# Apache prefork configuration
<IfModule mpm_prefork_module>
StartServers 5 # Initial workers
MinSpareServers 5 # Minimum idle workers
MaxSpareServers 10 # Maximum idle workers
MaxRequestWorkers 150 # Max concurrent requests
MaxConnectionsPerChild 0 # Connections before worker is recycled (0 = never)
</IfModule>

Advantages
- Complete isolation—one crash doesn't affect others
- Works with non-thread-safe code (old PHP)
- Simple mental model
- Easy debugging (one process = one request)
Disadvantages
- High memory usage (~30MB per process)
- Process creation overhead
- Hard limit on concurrent connections
- Context switching between processes is expensive
Thread-Per-Request
An improvement on process-per-request: threads share memory within a single process, reducing overhead while still providing parallelism.
┌─────────────────────────────────────────────────────────────┐
│                       Server Process                        │
│                                                             │
│   ┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐     │
│   │ Thread  │   │ Thread  │   │ Thread  │   │ Thread  │     │
│   │    1    │   │    2    │   │    3    │   │    4    │     │
│   │         │   │         │   │         │   │         │     │
│   │  Req A  │   │  Req B  │   │  Req C  │   │ (idle)  │     │
│   └─────────┘   └─────────┘   └─────────┘   └─────────┘     │
│                                                             │
│                     Shared Memory Space                     │
└─────────────────────────────────────────────────────────────┘
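The same accept loop, with a detached thread in place of a forked process. Again a hypothetical sketch in POSIX C: no error handling, port 8080 is arbitrary, and it compiles with -lpthread.
/* Minimal thread-per-connection sketch (POSIX C; illustrative only) */
#include <netinet/in.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static void *worker(void *arg) {
    int conn = *(int *)arg;
    free(arg);
    const char *resp = "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok";
    write(conn, resp, strlen(resp));
    close(conn);
    return NULL;                             /* thread exits; its stack is reclaimed */
}

int main(void) {
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);             /* arbitrary port for the sketch */
    bind(listener, (struct sockaddr *)&addr, sizeof addr);
    listen(listener, 128);

    for (;;) {
        int *conn = malloc(sizeof *conn);          /* heap copy: safe to pass to thread */
        *conn = accept(listener, NULL, NULL);
        pthread_t tid;
        pthread_create(&tid, NULL, worker, conn);  /* one thread per connection */
        pthread_detach(tid);                       /* no join; resources freed on exit */
    }
}
Note that each detached thread still costs a full stack (about 1MB by default on Linux), which is the per-thread memory cost listed under the disadvantages below.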
Apache's worker MPM combines processes and threads: multiple processes each run multiple threads. This balances isolation with efficiency.
# Apache worker configuration
<IfModule mpm_worker_module>
StartServers 2 # Initial processes
MinSpareThreads 25 # Minimum idle threads (across all processes)
MaxSpareThreads 75 # Maximum idle threads
ThreadsPerChild 25 # Threads per process
MaxRequestWorkers 150 # Max concurrent requests
</IfModule>

Advantages
- Lower memory than process-per-request
- Faster context switching
- Shared caches and connection pools
- Better resource utilization
Disadvantages
- Thread safety issues (race conditions, deadlocks)
- One crash can kill all threads in the process
- Still limited by thread count
- Stack memory per thread (~1MB default)
Event-Driven (Async I/O)
The modern approach: instead of dedicating a thread to each connection, a single thread manages thousands of connections using non-blocking I/O and an event loop.
┌─────────────────────────────────────────────────────────────┐
│                      Event Loop Thread                      │
│                                                             │
│   ┌─────────────────────────────────────────────────────┐   │
│   │                   epoll / kqueue                    │   │
│   │                                                     │   │
│   │   Ready: [conn_42, conn_187, conn_5, conn_891]      │   │
│   └─────────────────────────────────────────────────────┘   │
│                            │                                │
│             ┌──────────────┴──────────────┐                 │
│             ▼                             ▼                 │
│   ┌─────────────────────┐     ┌─────────────────────┐       │
│   │   Handle conn_42    │     │   Handle conn_187   │       │
│   │   (read request)    │     │   (send response)   │       │
│   └─────────────────────┘     └─────────────────────┘       │
│                                                             │
│          10,000+ connections managed by one thread          │
└─────────────────────────────────────────────────────────────┘
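Here is a minimal Linux epoll loop in C showing the shape of this model. It is a sketch, not Nginx's implementation: level-triggered, mostly unchecked errors, and it answers every readable socket with a canned response.
/* Minimal epoll event-loop sketch (Linux-only C; illustrative only) */
#define _GNU_SOURCE                          /* for accept4() */
#include <netinet/in.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int listener = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0);
    struct sockaddr_in addr = {0};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);             /* arbitrary port for the sketch */
    bind(listener, (struct sockaddr *)&addr, sizeof addr);
    listen(listener, 128);

    int ep = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = listener };
    epoll_ctl(ep, EPOLL_CTL_ADD, listener, &ev);

    struct epoll_event ready[64];
    for (;;) {
        int n = epoll_wait(ep, ready, 64, -1);     /* sleep until sockets are ready */
        for (int i = 0; i < n; i++) {
            int fd = ready[i].data.fd;
            if (fd == listener) {                  /* new client: register, don't block */
                int conn = accept4(listener, NULL, NULL, SOCK_NONBLOCK);
                if (conn < 0) continue;
                struct epoll_event cev = { .events = EPOLLIN, .data.fd = conn };
                epoll_ctl(ep, EPOLL_CTL_ADD, conn, &cev);
            } else {                               /* existing client is readable */
                char buf[4096];
                ssize_t r = read(fd, buf, sizeof buf);
                if (r <= 0) { close(fd); continue; }   /* closed or error: drop it */
                const char *resp = "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok";
                write(fd, resp, strlen(resp));     /* sketch ignores short writes */
            }
        }
    }
}
The key property: every handler must finish quickly. Any blocking call inside this loop stalls every other connection, which is the trap described below.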
Nginx and Node.js use this model. The server never blocks waiting for a single connection—it continuously processes whichever connections are ready.
# Nginx worker configuration
worker_processes auto; # One worker per CPU core
worker_connections 10000; # Connections per worker
# With 8 cores: 8 × 10,000 = 80,000 concurrent connections

Advantages
- Massive concurrency (tens of thousands)
- Very low memory per connection
- No context switching overhead
- Excellent for I/O-bound workloads
Disadvantages
- CPU-bound work blocks all connections
- Complex programming model (callbacks, promises)
- Debugging is harder (no stack per request)
- Must avoid blocking operations
The blocking trap: in an event-driven server, a single blocking operation (a synchronous file read, a CPU-intensive computation) stalls all connections. Node.js developers learn this the hard way when they accidentally use fs.readFileSync() in a request handler.
The C10K Problem
In 1999, Dan Kegel posed the "C10K problem": how can a single server handle 10,000 concurrent connections? At the time, this seemed impossibly ambitious. The answer was the event-driven architecture.
| Architecture | Memory for 10K connections | Practical limit |
|---|---|---|
| Process-per-request | ~300 GB (30MB × 10K) | ~500 connections |
| Thread-per-request | ~10 GB (1MB × 10K) | ~2,000 connections |
| Event-driven | ~100 MB (~10KB × 10K) | 100,000+ connections |
Today we talk about C100K and C1M—handling hundreds of thousands or millions of concurrent connections. Event-driven architecture makes this possible, but the choice of architecture depends on your workload, not just connection count.
Choosing the Right Architecture
There's no universally "best" architecture. The right choice depends on your workload characteristics:
| Workload | Best Architecture | Why |
|---|---|---|
| High concurrency, I/O-bound | Event-driven (Nginx) | Minimal overhead per connection |
| CPU-intensive processing | Thread/process pool | Utilize multiple cores without blocking |
| Legacy non-thread-safe code | Process-per-request | Process isolation prevents race conditions |
| Mixed workload | Hybrid (Nginx + app server) | Right tool for each job |
The Hybrid Approach
Most production systems combine architectures. A common pattern:
Internet
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Nginx (Event-driven) │
│ │
│ • TLS termination • Static file serving │
│ • Compression • Rate limiting │
│ • Load balancing • Connection pooling │
└──────────────────────────┬──────────────────────────────────┘
│
┌─────────────────┼─────────────────┐
│ │ │
▼ ▼ ▼
┌───────────┐ ┌───────────┐ ┌───────────┐
│ Node.js │ │ Node.js │ │ Node.js │
│ Process │ │ Process │ │ Process │
│ │ │ │ │ │
│ App Logic │ │ App Logic │ │ App Logic │
└───────────┘ └───────────┘ └───────────┘
Nginx excels at connection handling, TLS, and static files. The application server (Node.js, Python, Ruby) handles business logic. Each component does what it's best at.
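As a sketch of the Nginx side of this pattern (hypothetical throughout: the upstream name app_backend, the ports, the server name, and the certificate paths are all placeholders):
# Hypothetical reverse-proxy sketch: Nginx in front of three Node.js processes
upstream app_backend {
    server 127.0.0.1:3000;
    server 127.0.0.1:3001;
    server 127.0.0.1:3002;
    keepalive 32;                        # pooled connections to the app servers
}

server {
    listen 443 ssl;                      # TLS terminates here, not in Node.js
    server_name example.com;             # placeholder
    ssl_certificate     /etc/ssl/example.com.pem;   # placeholder paths
    ssl_certificate_key /etc/ssl/example.com.key;

    location /static/ {
        root /var/www;                   # Nginx serves static files directly
    }

    location / {
        proxy_pass http://app_backend;   # business logic goes to the apps
        proxy_http_version 1.1;
        proxy_set_header Connection "";  # required for upstream keepalive
    }
}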
Why not just Node.js alone? You can run Node.js directly, but you lose Nginx's battle-tested defaults for rate limiting, request size limits, timeout handling, and security hardening. More on this in Tutorial 13: The Node.js Model.
What's Next
Understanding architecture helps you make informed decisions about server selection and configuration. The next tutorial covers the practical side: configuring servers to serve your sites using virtual hosts, document roots, and MIME types.