Server Architecture Patterns

How servers handle thousands of simultaneous connections

The Concurrency Challenge

A web server must handle many clients simultaneously. While one client waits for a database query, another is uploading a file, and a third is downloading an image. How a server manages these concurrent connections defines its architecture—and determines its performance characteristics under load.

There are three fundamental approaches, each with distinct trade-offs:

  1. Process-per-request — Fork a new process for each connection
  2. Thread-per-request — Spawn a thread for each connection
  3. Event-driven — One thread handles many connections via async I/O

Process-Per-Request

The simplest model: when a request arrives, the server forks a new process to handle it. Each process has its own memory space and runs independently.

┌─────────────────────────────────────────────────────────────┐
│                     Master Process                          │
│                    (listens on port 80)                     │
└──────────────────────────┬──────────────────────────────────┘
                           │ fork()
         ┌─────────────────┼─────────────────┐
         │                 │                 │
         ▼                 ▼                 ▼
   ┌───────────┐     ┌───────────┐     ┌───────────┐
   │  Worker   │     │  Worker   │     │  Worker   │
   │ Process 1 │     │ Process 2 │     │ Process 3 │
   │           │     │           │     │           │
   │ Request A │     │ Request B │     │ Request C │
   └───────────┘     └───────────┘     └───────────┘
Apache prefork MPM: each request gets its own process

Apache's prefork MPM uses this model. It pre-forks a pool of worker processes, and each incoming connection is handed to an available worker.

# Apache prefork configuration
<IfModule mpm_prefork_module>
    StartServers              5     # Initial workers
    MinSpareServers           5     # Minimum idle workers
    MaxSpareServers          10     # Maximum idle workers
    MaxRequestWorkers       150     # Max concurrent requests
    MaxConnectionsPerChild    0     # Connections before worker restarts (0 = never)
</IfModule>
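
To make the model concrete, here is a minimal sketch of a forking server in Python. It is not how Apache is implemented, just the same pattern in miniature: the port and canned response are placeholders, and it is Unix-only because it relies on os.fork().

import os
import signal
import socket

signal.signal(signal.SIGCHLD, signal.SIG_IGN)   # auto-reap exited children

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("0.0.0.0", 8080))                  # placeholder port
server.listen(128)

while True:
    conn, addr = server.accept()
    pid = os.fork()
    if pid == 0:                                # child: handle one request
        server.close()                          # child doesn't need the listener
        request = conn.recv(4096)               # read the (assumed small) request
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
        conn.close()
        os._exit(0)                             # exit without parent's cleanup
    else:                                       # parent: keep accepting
        conn.close()                            # parent doesn't need this socket

Note the symmetry: each side closes the socket it doesn't need, so the connection's lifetime is owned entirely by the child process.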

Advantages

  • Complete isolation—one crash doesn't affect others
  • Works with non-thread-safe code (old PHP)
  • Simple mental model
  • Easy debugging (one process = one request)

Disadvantages

  • High memory usage (~30MB per process)
  • Process creation overhead
  • Hard limit on concurrent connections
  • Context switching between processes is expensive

Thread-Per-Request

An improvement on process-per-request: threads share memory within a single process, reducing overhead while still providing parallelism.

┌─────────────────────────────────────────────────────────────┐
│                      Server Process                         │
│                                                             │
│   ┌─────────┐   ┌─────────┐   ┌─────────┐   ┌─────────┐   │
│   │ Thread  │   │ Thread  │   │ Thread  │   │ Thread  │   │
│   │    1    │   │    2    │   │    3    │   │    4    │   │
│   │         │   │         │   │         │   │         │   │
│   │ Req A   │   │ Req B   │   │ Req C   │   │ (idle)  │   │
│   └─────────┘   └─────────┘   └─────────┘   └─────────┘   │
│                                                             │
│                   Shared Memory Space                       │
└─────────────────────────────────────────────────────────────┘
Apache worker/event MPM: threads within processes

Apache's worker MPM combines processes and threads: multiple processes each run multiple threads. This balances isolation with efficiency.

# Apache worker configuration
<IfModule mpm_worker_module>
    StartServers          2     # Initial processes
    MinSpareThreads      25     # Minimum idle threads (across all processes)
    MaxSpareThreads      75     # Maximum idle threads
    ThreadsPerChild      25     # Threads per process
    MaxRequestWorkers   150     # Max concurrent requests
</IfModule>
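
The same server, sketched in Python with one thread per connection instead of one process. The port and response are again placeholders; a production server would bound the thread count with a pool (in Python, concurrent.futures.ThreadPoolExecutor) rather than spawning without limit.

import socket
import threading

def handle(conn):
    # Each request runs in its own thread; memory (caches, pools) is shared.
    request = conn.recv(4096)
    conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
    conn.close()

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("0.0.0.0", 8080))    # placeholder port
server.listen(128)

while True:
    conn, addr = server.accept()
    # daemon=True so stray handler threads don't block interpreter shutdown
    threading.Thread(target=handle, args=(conn,), daemon=True).start()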

Advantages

  • Lower memory than process-per-request
  • Faster context switching
  • Shared caches and connection pools
  • Better resource utilization

Disadvantages

  • Thread safety issues (race conditions, deadlocks)
  • One crash can kill all threads in the process
  • Still limited by thread count
  • Stack memory per thread (~1MB default)

Event-Driven (Async I/O)

The modern approach: instead of dedicating a thread to each connection, a single thread manages thousands of connections using non-blocking I/O and an event loop.

┌─────────────────────────────────────────────────────────────┐
│                     Event Loop Thread                       │
│                                                             │
│   ┌─────────────────────────────────────────────────────┐   │
│   │                    epoll / kqueue                    │   │
│   │                                                      │   │
│   │   Ready: [conn_42, conn_187, conn_5, conn_891]       │   │
│   └─────────────────────────────────────────────────────┘   │
│                            │                                │
│              ┌─────────────┴─────────────┐                  │
│              ▼                           ▼                  │
│   ┌─────────────────────┐     ┌─────────────────────┐      │
│   │   Handle conn_42    │     │   Handle conn_187   │      │
│   │   (read request)    │     │   (send response)   │      │
│   └─────────────────────┘     └─────────────────────┘      │
│                                                             │
│   10,000+ connections managed by one thread                 │
└─────────────────────────────────────────────────────────────┘
Nginx/Node.js: event-driven, non-blocking I/O

Nginx and Node.js use this model. The server never blocks waiting for a single connection—it continuously processes whichever connections are ready.

# Nginx worker configuration
worker_processes auto;            # One worker process per CPU core

events {
    worker_connections 10000;     # Connections per worker
}

# With 8 cores: 8 × 10,000 = 80,000 concurrent connections
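
To see the event loop itself, here is a minimal sketch using Python's selectors module, which wraps epoll or kqueue where available. The port and response are placeholders; real servers layer protocol parsing and write buffering on top of this loop.

import selectors
import socket

sel = selectors.DefaultSelector()      # epoll on Linux, kqueue on BSD/macOS

def accept(server):
    conn, _addr = server.accept()
    conn.setblocking(False)            # never let a socket block the loop
    sel.register(conn, selectors.EVENT_READ, handle)

def handle(conn):
    data = conn.recv(4096)             # socket is readable, so this won't block
    if data:
        # A response this small fits the send buffer in one call; a real
        # server would buffer writes and wait for EVENT_WRITE readiness.
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok")
    sel.unregister(conn)
    conn.close()

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("0.0.0.0", 8080))         # placeholder port
server.listen(1024)
server.setblocking(False)
sel.register(server, selectors.EVENT_READ, accept)

while True:                            # the event loop: one thread, many sockets
    for key, _events in sel.select():
        key.data(key.fileobj)          # dispatch to the registered callback

The key invariant is that no callback may block: everything the loop dispatches must return quickly, or every other connection waits.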

Advantages

  • Massive concurrency (tens of thousands)
  • Very low memory per connection
  • No context switching overhead
  • Excellent for I/O-bound workloads

Disadvantages

  • CPU-bound work blocks all connections
  • Complex programming model (callbacks, promises)
  • Debugging is harder (no stack per request)
  • Must avoid blocking operations

The blocking trap: In an event-driven server, a single blocking operation (synchronous file read, CPU-intensive computation) stalls all connections. Node.js developers learn this the hard way when they accidentally use fs.readFileSync() in a request handler.
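
The trap is easy to reproduce. This sketch uses Python's asyncio (the same single-threaded loop model as Node.js) rather than Node.js itself: the blocking time.sleep() serializes the handlers, while the awaitable asyncio.sleep() lets them overlap.

import asyncio
import time

async def bad_handler():
    time.sleep(1)             # blocking call: stalls the entire event loop

async def good_handler():
    await asyncio.sleep(1)    # yields to the loop: other work proceeds

async def main():
    t0 = time.monotonic()
    await asyncio.gather(good_handler(), good_handler(), good_handler())
    print(f"non-blocking: {time.monotonic() - t0:.1f}s")   # ~1s: overlapped

    t0 = time.monotonic()
    await asyncio.gather(bad_handler(), bad_handler(), bad_handler())
    print(f"blocking:     {time.monotonic() - t0:.1f}s")   # ~3s: serialized

asyncio.run(main())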

The C10K Problem

In 1999, Dan Kegel posed the "C10K problem": how can a single server handle 10,000 concurrent connections? At the time, this seemed impossibly ambitious. The answer was the event-driven architecture.

Architecture          Memory for 10K connections   Practical limit
Process-per-request   ~300 GB (30MB × 10K)         ~500 connections
Thread-per-request    ~10 GB (1MB × 10K)           ~2,000 connections
Event-driven          ~100 MB (~10KB × 10K)        100,000+ connections
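
The memory column is just per-connection cost times 10,000; a quick check of the table's figures:

# Per-connection memory (MB, from the table) × 10,000 connections
per_connection_mb = {
    "Process-per-request": 30,      # ~30 MB per process
    "Thread-per-request":   1,      # ~1 MB stack per thread
    "Event-driven":         0.01,   # ~10 KB of state per connection
}
for model, mb in per_connection_mb.items():
    print(f"{model:>20}: {mb * 10_000 / 1_000:g} GB")   # 300, 10, 0.1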

Today we talk about C100K and C1M—handling hundreds of thousands or millions of concurrent connections. Event-driven architecture makes this possible, but the choice of architecture depends on your workload, not just connection count.

Choosing the Right Architecture

There's no universally "best" architecture. The right choice depends on your workload characteristics:

Workload                       Best Architecture             Why
High concurrency, I/O-bound    Event-driven (Nginx)          Minimal overhead per connection
CPU-intensive processing       Thread/process pool           Utilize multiple cores without blocking
Legacy non-thread-safe code    Process-per-request           Process isolation prevents race conditions
Mixed workload                 Hybrid (Nginx + app server)   Right tool for each job

The Hybrid Approach

Most production systems combine architectures. A common pattern:

                        Internet
                            │
                            ▼
┌─────────────────────────────────────────────────────────────┐
│              Nginx (Event-driven)                           │
│                                                             │
│   • TLS termination        • Static file serving           │
│   • Compression            • Rate limiting                  │
│   • Load balancing         • Connection pooling             │
└──────────────────────────┬──────────────────────────────────┘
                           │
         ┌─────────────────┼─────────────────┐
         │                 │                 │
         ▼                 ▼                 ▼
   ┌───────────┐     ┌───────────┐     ┌───────────┐
   │  Node.js  │     │  Node.js  │     │  Node.js  │
   │  Process  │     │  Process  │     │  Process  │
   │           │     │           │     │           │
   │ App Logic │     │ App Logic │     │ App Logic │
   └───────────┘     └───────────┘     └───────────┘
Nginx handles connections, Node.js handles application logic

Nginx excels at connection handling, TLS, and static files. The application server (Node.js, Python, Ruby) handles business logic. Each component does what it's best at.
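
A sketch of the glue: an Nginx configuration that terminates TLS, serves static files directly, and proxies everything else to the Node.js pool. All domain names, ports, and paths here are hypothetical.

# Hypothetical reverse-proxy configuration
upstream app_servers {
    server 127.0.0.1:3000;               # Node.js process 1
    server 127.0.0.1:3001;               # Node.js process 2
    server 127.0.0.1:3002;               # Node.js process 3
}

server {
    listen 443 ssl;
    server_name example.com;             # placeholder domain
    ssl_certificate     /etc/nginx/cert.pem;   # placeholder paths
    ssl_certificate_key /etc/nginx/key.pem;

    location /static/ {
        root /var/www;                   # Nginx serves static files itself
    }

    location / {
        proxy_pass http://app_servers;   # everything else goes to the app pool
    }
}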

Why not just Node.js alone? You can run Node.js directly, but you lose Nginx's battle-tested defaults for rate limiting, request size limits, timeout handling, and security hardening. More on this in Tutorial 13: The Node.js Model.

What's Next

Understanding architecture helps you make informed decisions about server selection and configuration. The next tutorial covers the practical side: configuring servers to serve your sites using virtual hosts, document roots, and MIME types.