Docs / System Design Interview / Case Studies

Case Studies

Each of the systems below has appeared in dozens of interviews. The point isn't to memorize a "right answer" — it's to see how the building blocks compose into a real design, and to internalize the tradeoffs.

Design a URL shortener (bit.ly)

Core requirements: short URL ↔ long URL mapping, redirect, analytics.

Sketch:

  • Hash the long URL or use a base-62 encoded counter for the short code.
  • Store mappings in a key-value store (DynamoDB, Cassandra) — the workload is read-heavy, so eventual consistency is fine.
  • Cache hot URLs in Redis.
  • Async write to a separate analytics pipeline (Kafka → batch processor) for click tracking.
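The base-62 counter approach above is a few lines of code. A minimal sketch (the alphabet ordering is an arbitrary choice; any fixed 62-character alphabet works):

```python
# Base-62 encoding of an auto-incrementing counter. With a 62-character
# alphabet, a 7-character code covers 62^7 ≈ 3.5 trillion URLs.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode_base62(n: int) -> str:
    """Convert a numeric ID into a short code."""
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))

def decode_base62(code: str) -> int:
    """Recover the numeric ID from a short code."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

Unlike hashing, a counter can never collide — the tradeoff is that codes are sequential and guessable unless you add an offset or shuffle step.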

Tradeoffs to mention: collision handling for hashes, whether codes are reusable, how to expire URLs.

Design a news feed (Twitter/Facebook)

Core requirements: post a tweet, follow users, see your feed in (near) real-time.

Sketch:

  • Fanout-on-write: when user A tweets, push to every follower's feed. Fast reads, expensive writes — bad for celebrities.
  • Fanout-on-read: when user A loads their feed, query each followee. Slow reads, cheap writes.
  • Hybrid: fanout-on-write for normal users, fanout-on-read for celebrities. Most real systems do this.

Building blocks: Cassandra for tweets, Redis for feeds, Kafka for fanout, ML service for ranking.
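The hybrid fanout can be sketched in a few lines. This is a toy model — in-memory dicts stand in for Redis and Cassandra, and the celebrity threshold is an illustrative knob, not a real-world value:

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 10_000  # illustrative cutoff; tune per system

followers = defaultdict(set)   # user -> set of follower ids
feeds = defaultdict(list)      # user -> precomputed feed (fanout-on-write)
tweets = defaultdict(list)     # user -> their own tweets (fanout-on-read source)

def post_tweet(author: str, tweet: str) -> None:
    tweets[author].append(tweet)
    # Fanout-on-write only for non-celebrities: cheap per-tweet for most users.
    if len(followers[author]) < CELEBRITY_THRESHOLD:
        for follower in followers[author]:
            feeds[follower].append(tweet)

def load_feed(user: str, following: set[str]) -> list[str]:
    # Merge the precomputed feed with celebrity tweets fetched at read time.
    result = list(feeds[user])
    for followee in following:
        if len(followers[followee]) >= CELEBRITY_THRESHOLD:
            result.extend(tweets[followee])
    return result
```

In a real system `load_feed` would also merge by timestamp and pass the result through the ranking service.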

Design a chat app (WhatsApp/Messenger)

Core requirements: send messages, deliver in real-time, work offline, support group chats.

Sketch:

  • WebSocket or long-poll for real-time delivery; fall back to push notifications when the client is offline.
  • Per-user message queue. Fan out to every group member on send.
  • Store messages in a wide-column store (Cassandra) keyed by (conversation_id, timestamp).
  • For end-to-end encryption: clients hold keys, server only routes opaque payloads.
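The per-user queue and (conversation_id, timestamp) key scheme can be sketched like this — the dicts are in-memory stand-ins for the wide-column store and the delivery queues:

```python
import time
from collections import defaultdict

# Mimics a Cassandra table keyed by conversation_id (partition key)
# with timestamp as the clustering column.
messages = defaultdict(list)   # conversation_id -> [(ts, sender, body)]
inbox = defaultdict(list)      # per-user delivery queue

def send_message(conv_id: str, sender: str, body: str, members: list[str]) -> None:
    ts = time.time()
    messages[conv_id].append((ts, sender, body))
    # Fan out to every group member's queue except the sender.
    for member in members:
        if member != sender:
            inbox[member].append((conv_id, ts, sender, body))

def history(conv_id: str, limit: int = 50) -> list[tuple]:
    # Clustering by timestamp lets us page through recent messages.
    return sorted(messages[conv_id])[-limit:]
```

Note the fanout cost grows with group size — one reason real systems cap group membership.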

Tradeoffs: read receipts vs privacy, message ordering in groups (vector clocks), media attachments via object storage + CDN.

Design a video streaming service (YouTube/Netflix)

Core requirements: upload, transcode, stream at multiple resolutions.

Sketch:

  • Upload to object storage (S3).
  • Async transcoding pipeline (Kafka → workers) outputs HLS/DASH segments at multiple bitrates.
  • Serve via CDN with adaptive bitrate streaming.
  • Recommendations and search via separate services backed by Elasticsearch and an ML pipeline.
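The client side of adaptive bitrate streaming reduces to picking a rendition that fits the measured throughput. A simplified sketch — the bitrate ladder and safety factor here are illustrative, not values from any real player:

```python
# Renditions produced by the transcoding pipeline, lowest to highest.
BITRATE_LADDER_KBPS = [235, 750, 1750, 4300, 8100]
SAFETY_FACTOR = 0.8  # leave headroom so the playback buffer doesn't drain

def choose_bitrate(measured_kbps: float) -> int:
    """Pick the highest rendition that fits within the throughput budget."""
    budget = measured_kbps * SAFETY_FACTOR
    chosen = BITRATE_LADDER_KBPS[0]  # always fall back to the lowest rung
    for rate in BITRATE_LADDER_KBPS:
        if rate <= budget:
            chosen = rate
    return chosen
```

Real players also factor in buffer occupancy and smooth the throughput estimate, but the core loop is this simple.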

Tradeoffs: storage cost vs catalog size, hot vs cold content tiers, regional CDN strategy.

Design a ride-sharing service (Uber)

Core requirements: match riders to nearby drivers, real-time ETA, payments.

Sketch:

  • Geospatial index (Geohash, Quadtree, or Redis GEO) for nearby driver lookup.
  • WebSocket connections for both riders and drivers, pushing location updates.
  • Dispatch service runs the matching algorithm — usually greedy nearest-driver matching or a variant of the Hungarian algorithm with constraints.
  • Payments via a separate service with idempotent operations and a write-ahead log.
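The geospatial lookup can be sketched with a simple grid index. The 0.01-degree cell size (roughly 1 km) is an illustrative choice; a real system would use geohashes, a quadtree, or Redis GEO as the bullets above suggest:

```python
import math
from collections import defaultdict

CELL = 0.01  # grid cell size in degrees (~1 km at the equator)

grid = defaultdict(set)  # (cell_x, cell_y) -> driver ids in that cell
positions = {}           # driver id -> (lat, lon)

def update_driver(driver: str, lat: float, lon: float) -> None:
    """Move a driver to a new position, re-bucketing them in the grid."""
    if driver in positions:
        old_lat, old_lon = positions[driver]
        grid[(int(old_lat // CELL), int(old_lon // CELL))].discard(driver)
    positions[driver] = (lat, lon)
    grid[(int(lat // CELL), int(lon // CELL))].add(driver)

def nearby_drivers(lat: float, lon: float) -> list[str]:
    """Search the rider's cell plus its 8 neighbors, nearest first."""
    cx, cy = int(lat // CELL), int(lon // CELL)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            found.extend(grid[(cx + dx, cy + dy)])
    return sorted(found, key=lambda d: math.dist(positions[d], (lat, lon)))
```

The neighbor-cell search is why geohash-style schemes work: nearby points share a prefix, so "nearby" becomes a handful of bucket lookups instead of a scan.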

Tradeoffs: matching latency vs match quality, handling driver churn during a ride, surge pricing fairness.

Design a rate limiter

Core requirements: limit each user to N requests per minute, work across multiple servers, low latency.

Sketch:

  • Token bucket is the most common algorithm — each user has a bucket of N tokens, refilled at a fixed rate. Each request consumes one.
  • For multi-server deployments, store bucket state in Redis and make the refill-and-consume step atomic (typically via a Lua script); a plain DECR with a TTL degrades into a fixed-window counter.
  • Fail open vs fail closed during a Redis outage is a tradeoff worth raising explicitly.
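The single-node version of the token bucket fits in a short class; a distributed version would keep the same state in Redis and run the refill-and-consume step atomically:

```python
import time

class TokenBucket:
    """Token bucket rate limiter: capacity N, refilled at a fixed rate.
    Tokens are refilled lazily on each call rather than by a timer."""

    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill_per_sec)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

The lazy refill is what makes this cheap: no background timers, just two floats per user and a monotonic clock read per request.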

Tradeoffs to mention: precision (fixed window vs sliding window vs token bucket), distributed clock skew, what happens to in-flight requests during a config change.

Design a notification system

Core requirements: send push, email, and SMS to millions of users; support batching and templating; honor user preferences.

Sketch:

  • Producers publish notification events to Kafka.
  • A fanout service expands an event into per-user, per-channel deliveries.
  • Each channel has its own worker pool that calls the underlying provider (APNs, FCM, SES, Twilio).
  • A user-preference service is consulted before fanout; failed deliveries go to a retry queue with exponential backoff.
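The exponential backoff for the retry queue is worth writing out, since the jitter is the part candidates forget. A minimal sketch — the base delay and cap are illustrative knobs, not values from any real system:

```python
import random

BASE_DELAY_S = 1.0
MAX_DELAY_S = 300.0  # cap so late retries don't wait forever

def retry_delay(attempt: int) -> float:
    """Delay before retry number `attempt` (1-based), with full jitter."""
    capped = min(MAX_DELAY_S, BASE_DELAY_S * (2 ** (attempt - 1)))
    # Full jitter: draw uniformly from [0, capped] so retries from many
    # failed deliveries spread out instead of hammering the provider in sync.
    return random.uniform(0.0, capped)
```

Without the jitter, every delivery that failed in the same provider outage retries at the same instant — a thundering herd that can re-trigger the outage.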

Tradeoffs to mention: at-least-once vs exactly-once delivery, throttling per provider, deduplication of identical notifications, handling provider outages gracefully.

How to study case studies

Don't read passively. Pick one, set a 45-minute timer, and design it from scratch on a whiteboard. Then read the canonical solution and diff it against yours. The diff is where the learning happens.