URL Anatomy

Understanding the structure and components of web addresses

What is a URL?

A URL (Uniform Resource Locator) is a standardized address that points to a specific resource on the web. Every time you click a link, type an address in your browser, or embed an image, you're using a URL.

URLs were invented by Tim Berners-Lee in 1990 as part of the original World Wide Web design. Their syntax was first standardized in RFC 1738 (1994) and is now defined by RFC 3986. They remain one of the fundamental technologies that make the web work.

The Five Components

A complete URL can have up to five main components. Let's examine this example:

https://www.example.com:8080/products/widgets?color=blue&size=large#reviews
A URL with all five components visible

The five components are:

  1. Scheme — The protocol (https)
  2. Authority — The host and optional port (www.example.com:8080)
  3. Path — The resource location (/products/widgets)
  4. Query — Parameters (?color=blue&size=large)
  5. Fragment — The anchor (#reviews)

Not every URL has all five parts. The minimum required for a web URL is scheme and authority (e.g., https://example.com).
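As a rough illustration, the example URL above can be split into its five components with Python's standard urllib.parse module. This is a minimal sketch rather than a full URL parser; the dictionary name components is only an illustrative choice.

```python
from urllib.parse import urlsplit

url = "https://www.example.com:8080/products/widgets?color=blue&size=large#reviews"
parts = urlsplit(url)

# Map the five components from the list above onto urlsplit's fields.
components = {
    "scheme": parts.scheme,      # "https"
    "authority": parts.netloc,   # "www.example.com:8080"
    "path": parts.path,          # "/products/widgets"
    "query": parts.query,        # "color=blue&size=large" (no leading "?")
    "fragment": parts.fragment,  # "reviews" (no leading "#")
}

for name, value in components.items():
    print(f"{name:10} {value}")
```

Note that the parser drops the ? and # delimiters: they separate the components but are not part of the query or fragment values.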

Scheme (Protocol)

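The scheme identifies the protocol used to access the resource. It appears at the beginning of the URL and ends with a colon. In web URLs the colon is followed by //, which introduces the authority, giving the familiar :// separator.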

Common web schemes include:

  • https — Secure HTTP (encrypted connection)
  • http — Standard HTTP (unencrypted)
  • ftp — File Transfer Protocol
  • file — Local file system access

The scheme tells your browser which protocol rules to follow when requesting the resource. Using the wrong scheme (like http:// when a server requires https://) will cause the request to fail or redirect.

Authority (Host and Port)

The authority section identifies the server hosting the resource. It consists of the host (domain name or IP address) and an optional port number.

Host

The host is typically a domain name like www.example.com or an IP address like 192.168.1.1. Domain names are resolved to IP addresses by the Domain Name System (DNS).

Domain names can have multiple parts:

  • example.com — The main domain
  • www.example.com — A subdomain of example.com
  • blog.example.com — Another subdomain
  • api.v2.example.com — Nested subdomains
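To make the "multiple parts" idea concrete, here is a small sketch that splits a host into its dot-separated labels. The helper name host_labels is hypothetical, and the sketch assumes the host is a domain name rather than an IP address.

```python
from urllib.parse import urlsplit

def host_labels(url):
    """Return the dot-separated labels of the URL's host, left to right."""
    # .hostname strips any port and lowercases the host.
    host = urlsplit(url).hostname or ""
    return host.split(".") if host else []

print(host_labels("https://api.v2.example.com/status"))
```

Reading the labels from right to left moves from the most general part (com) to the most specific (api).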

Port

The port number follows the host, separated by a colon. It specifies which "door" on the server to connect to.

Default ports are implied and don't need to be specified:

  • http defaults to port 80
  • https defaults to port 443
  • ftp defaults to port 21

You only need to include the port if it differs from the default: https://localhost:3000 specifies port 3000 instead of the default 443.
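The default-port rule can be sketched in a few lines of Python. The DEFAULT_PORTS table and the effective_port helper are illustrative names, not part of any library, and the table covers only the three schemes listed above.

```python
from urllib.parse import urlsplit

# Default ports for the schemes discussed in this section.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}

def effective_port(url):
    """Return the explicit port if the URL has one, else the scheme's default.

    Returns None for schemes that are not in DEFAULT_PORTS.
    """
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS.get(parts.scheme)

print(effective_port("https://localhost:3000"))  # explicit port wins
print(effective_port("https://example.com"))     # falls back to the default
```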

Path

The path identifies the specific resource on the server. It starts with a forward slash and can include multiple segments:

  • / — The root path (home page)
  • /about — A single segment
  • /products/widgets/blue — Multiple segments representing a hierarchy

Paths often mirror a file system structure, though modern web applications frequently use paths that don't correspond to actual files. For example, /users/42 might represent user data from a database, not a file named "42".

Path Conventions

  • Paths are case-sensitive on most servers (/About and /about can be different resources)
  • Trailing slashes may or may not matter (/products vs /products/)
  • File extensions are optional (/page vs /page.html)
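A short sketch of how a path breaks into segments, and how the conventions above show up when comparing paths as plain strings. The helper path_segments is a hypothetical name; how a server actually treats case and trailing slashes depends on its configuration.

```python
from urllib.parse import urlsplit

def path_segments(url):
    """Split the URL's path on "/" and drop empty segments."""
    path = urlsplit(url).path
    return [segment for segment in path.split("/") if segment]

print(path_segments("https://example.com/products/widgets/blue"))

# As strings, these paths differ, so a server is free to treat them differently.
with_slash = urlsplit("https://example.com/products/").path
without_slash = urlsplit("https://example.com/products").path
print(with_slash == without_slash)
```

Because the helper discards empty segments, it deliberately hides the trailing-slash difference; compare the raw path strings when that distinction matters.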

Query String

The query string provides additional parameters to the server. It starts with a question mark (?) and contains key-value pairs separated by ampersands (&).

?search=widgets&category=electronics&page=2

This query string has three parameters:

  • search = widgets
  • category = electronics
  • page = 2

Query strings are commonly used for:

  • Search queries (?q=javascript+tutorials)
  • Filters and sorting (?sort=price&order=asc)
  • Pagination (?page=3&limit=20)
  • Tracking parameters (?utm_source=newsletter)
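The query string from the example above can be parsed and rebuilt with urllib.parse. This sketch assumes the leading ? has already been removed, since the parsing functions expect only the key-value part.

```python
from urllib.parse import parse_qs, parse_qsl, urlencode

query = "search=widgets&category=electronics&page=2"

# parse_qsl keeps the order and returns (key, value) pairs.
pairs = parse_qsl(query)

# parse_qs groups values by key; each value is a list because keys may repeat.
params = parse_qs(query)

# urlencode goes the other way and joins the pairs with "&".
rebuilt = urlencode(pairs)

print(pairs)
print(params["page"])
print(rebuilt)
```

All values come back as strings, so a number such as page needs an explicit int() conversion before use.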

Fragment Identifier

The fragment (also called hash or anchor) starts with # and identifies a specific section within the resource. Unlike other URL components, the fragment is handled entirely by the browser—it is never sent to the server.

https://example.com/docs/api#authentication

Common uses of fragments:

  • Page sections: Link to a heading (#introduction)
  • Tab content: Show a specific tab (#settings)
  • Single-page apps: Route to different views (#/users/profile)
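Since the fragment stays on the client, an HTTP client effectively strips it before making the request. The sketch below uses urllib.parse.urldefrag to show that separation; the variable names are illustrative.

```python
from urllib.parse import urldefrag

url = "https://example.com/docs/api#authentication"

# urldefrag returns the URL without its fragment, plus the fragment itself.
request_url, fragment = urldefrag(url)

print(request_url)  # the part a client would send to the server
print(fragment)     # the part kept and interpreted client-side
```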

URL Encoding

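URLs can only contain a limited set of characters. Special characters must be percent-encoded (also called URL encoding). This replaces each byte of the character with a % followed by two hexadecimal digits. For ASCII characters that is the character's ASCII code; other characters are first converted to UTF-8 bytes.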

Common encodings:

  • Space → %20 (or + in query strings)
  • & → %26
  • ? → %3F
  • # → %23
  • / → %2F
  • = → %3D

For example, to search for "C++ programming":

?search=C%2B%2B%20programming
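The same encoding can be reproduced with urllib.parse. In this sketch, quote with safe="" percent-encodes everything except unreserved characters, while quote_plus applies the query-string convention of writing a space as +.

```python
from urllib.parse import quote, quote_plus, unquote

term = "C++ programming"

encoded = quote(term, safe="")   # "C%2B%2B%20programming"
form_encoded = quote_plus(term)  # "C%2B%2B+programming"
decoded = unquote(encoded)       # back to the original text

# Non-ASCII characters are encoded byte by byte from their UTF-8 form.
accented = quote("café", safe="")  # "caf%C3%A9"

print(f"?search={encoded}")
```

The literal + signs must be encoded as %2B here; left as-is in a query string, they would be read back as spaces.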

Reserved vs Unreserved Characters

Unreserved characters don't need encoding:

  • Letters: A-Z, a-z
  • Numbers: 0-9
  • Special: -, _, ., ~

Reserved characters have special meaning in URLs and must be encoded when used as data:

  • Delimiters: :, /, ?, #, [, ], @
  • Sub-delimiters: !, $, &, ', (, ), *, +, ,, ;, =
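The two character classes can be checked directly. This sketch assumes Python 3.7 or later, where quote treats ~ as unreserved; the constants UNRESERVED and RESERVED and the helper encode_component are illustrative names.

```python
import string
from urllib.parse import quote

UNRESERVED = string.ascii_letters + string.digits + "-_.~"
RESERVED = ":/?#[]@" + "!$&'()*+,;="  # delimiters + sub-delimiters

def encode_component(value):
    """Percent-encode a value for use as data inside a single URL component."""
    # safe="" means no reserved character is left unencoded.
    return quote(value, safe="")

# Unreserved characters pass through unchanged.
print(encode_component(UNRESERVED) == UNRESERVED)

# Every reserved character becomes % plus its two-digit hex code.
for char in RESERVED:
    print(char, encode_component(char))
```

Encoding with safe="" is appropriate for a single value such as one query parameter; applying it to a whole URL would also encode the delimiters that give the URL its structure.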

Putting It Together

Let's parse a complete URL step by step:

https://api.example.com:443/v2/users/search?name=John%20Doe&active=true#results

Component  Value                          Purpose
Scheme     https                          Use secure HTTP protocol
Host       api.example.com                Connect to the API subdomain
Port       443                            Default HTTPS port (could be omitted)
Path       /v2/users/search               API version 2, users resource, search action
Query      name=John%20Doe&active=true    Search for active user named "John Doe"
Fragment   results                        Scroll to results section (client-side)
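Going in the opposite direction, the same URL can be assembled from its components. This is a minimal sketch using urllib.parse.urlunsplit; passing quote_via=quote makes urlencode write the space as %20 rather than +, to match the example.

```python
from urllib.parse import quote, urlencode, urlsplit, urlunsplit

scheme = "https"
host = "api.example.com"
port = 443
path = "/v2/users/search"
fragment = "results"

# Encode the raw parameter values; dict order is preserved.
query = urlencode({"name": "John Doe", "active": "true"}, quote_via=quote)

# urlunsplit takes (scheme, authority, path, query, fragment).
url = urlunsplit((scheme, f"{host}:{port}", path, query, fragment))
print(url)

# Round trip: splitting the result recovers the same pieces.
parts = urlsplit(url)
print(parts.hostname, parts.port, parts.path)
```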