Building Blocks of Large-Scale Systems

The good news: nearly every large system you'll be asked to design is built from the same dozen primitives. The skill is knowing what each one does, when to reach for it, and what tradeoffs it forces.

Load balancers

Sit in front of your service and distribute traffic. Strategies:

Round-robin — simple, fine for stateless services.
Least-connections — better when request times vary.
Consistent hashing — same key always hits the same server, and adding or removing a server remaps only a small share of keys. Critical for caches and stateful services.

Mention health checks, sticky sessions (for stateful services), and L4 vs L7 differences if asked.
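To make the consistent hashing strategy concrete, here is a minimal sketch of a hash ring with virtual nodes. The class name `HashRing`, the server names, and the choice of 100 virtual nodes per server are illustrative assumptions, not a reference to any particular load balancer's implementation.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a stable 64-bit integer (md5 is used for spread, not security)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring. Each server owns several points (virtual nodes)."""

    def __init__(self, servers, vnodes=100):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (hash, server) points
        for server in servers:
            self.add(server)

    def add(self, server):
        """Place the server's virtual nodes on the ring."""
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_hash(f"{server}#{i}"), server))

    def remove(self, server):
        """Drop every point owned by the server; other points are untouched."""
        self._ring = [point for point in self._ring if point[1] != server]

    def lookup(self, key):
        """Return the server owning the first point clockwise from the key."""
        if not self._ring:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["cache-a", "cache-b", "cache-c"])  # hypothetical server names
owner = ring.lookup("user:42")
assert owner == ring.lookup("user:42")  # the same key always lands on the same server
```

When a server joins, the only keys that move are the ones the new server takes over. With plain `hash(key) % n`, changing `n` would reshuffle most keys, which is why the ring matters for caches.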

Caches

Cheaper and faster than the database. The questions you should always answer:

Where? Client, CDN, in-memory app cache, or distributed cache (Redis, Memcached).
What's the eviction policy? LRU is the default; LFU and TTL-based expiry are also common.
What's the write and invalidation strategy? Write-through (consistent, slower writes), write-behind (fast writes, eventually consistent, can lose data if the cache fails before flushing), or TTL-only (simplest, accepts staleness).
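The eviction and TTL questions above can be sketched as a small in-memory cache that combines LRU eviction with per-entry expiry. The class name `LRUCache`, its parameters, and the injectable `clock` are assumptions made for illustration; a distributed cache such as Redis handles this for you through configuration.

```python
import time
from collections import OrderedDict


class LRUCache:
    """In-memory cache with LRU eviction and a fixed TTL per entry."""

    def __init__(self, capacity, ttl_seconds, clock=time.monotonic):
        self._capacity = capacity
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._data = OrderedDict()   # key -> (value, expires_at), oldest first
        self.hits = 0
        self.misses = 0

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            self.misses += 1
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired entries count as misses and are dropped lazily.
            del self._data[key]
            self.misses += 1
            return None
        self._data.move_to_end(key)  # mark as most recently used
        self.hits += 1
        return value

    def put(self, key, value):
        self._data[key] = (value, self._clock() + self._ttl)
        self._data.move_to_end(key)
        if len(self._data) > self._capacity:
            self._data.popitem(last=False)  # evict the least recently used entry


cache = LRUCache(capacity=2, ttl_seconds=60)
cache.put("user:1", {"name": "Ada"})
assert cache.get("user:1") == {"name": "Ada"}
```

The `hits` and `misses` counters are there so the hit ratio can be measured rather than guessed.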

A cache hit ratio of 80%+ is what you're aiming for in most read-heavy systems.
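A quick way to see why the hit ratio matters is to compute the average read latency and the share of reads that still reach the database. The latencies below (1 ms for the cache, 20 ms for the database) are assumed values for illustration, and the model ignores the cost of the failed cache lookup on a miss.

```python
def effective_read_latency_ms(hit_ratio, cache_ms, db_ms):
    """Average read latency when hits are served by the cache and misses by the database."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * db_ms


def database_read_share(hit_ratio):
    """Fraction of reads that still reach the database."""
    return 1.0 - hit_ratio


# Assumed latencies for illustration only: 1 ms cache, 20 ms database.
for ratio in (0.50, 0.80, 0.95):
    latency = effective_read_latency_ms(ratio, cache_ms=1.0, db_ms=20.0)
    print(f"hit ratio {ratio:.0%}: ~{latency:.2f} ms average, "
          f"{database_read_share(ratio):.0%} of reads reach the database")
```

Under these assumed numbers, an 80% hit ratio averages about 4.8 ms per read. Note that moving from 80% to 90% halves the database read load (from 20% of reads to 10%), so the last few points of hit ratio are worth more than they look.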

Databases

The first decision: SQL or NoSQL?

SQL (Postgres, MySQL) — strong consistency, joins, transactions. Scales vertically until it doesn't.
NoSQL — splits into four flavors:
- Key-value (Redis, DynamoDB) — simplest, fastest.
- Document (MongoDB) — flexible schema.
- Column-family (Cassandra) — high write throughput, tunable (typically eventual) consistency.
- Graph (Neo4j) — for relationship-heavy data.

Sharding by user ID is a common default for horizontal scaling, because most queries are scoped to a single user. Mention replication (primary-replica, multi-primary) for read scaling and durability.
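Sharding by user ID with primary-replica replication can be sketched as a small router: hash the user ID to pick a shard, send writes to that shard's primary, and spread reads across its replicas. `Shard`, `ShardRouter`, and the node names are hypothetical; the sketch uses simple modulo placement, which reshuffles most users if the shard count changes.

```python
import hashlib
import itertools
from dataclasses import dataclass, field


def shard_for_user(user_id: int, num_shards: int) -> int:
    """Stable shard index for a user (hash, then modulo the shard count)."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


@dataclass
class Shard:
    primary: str
    replicas: list = field(default_factory=list)

    def __post_init__(self):
        # Round-robin over replicas; fall back to the primary if there are none.
        self._next_replica = itertools.cycle(self.replicas or [self.primary])


class ShardRouter:
    def __init__(self, shards):
        self._shards = shards

    def _shard(self, user_id):
        return self._shards[shard_for_user(user_id, len(self._shards))]

    def node_for_write(self, user_id):
        """Writes always go to the shard's primary."""
        return self._shard(user_id).primary

    def node_for_read(self, user_id):
        """Reads rotate across the shard's replicas (may be slightly stale)."""
        return next(self._shard(user_id)._next_replica)


router = ShardRouter([
    Shard("db0-primary", ["db0-replica1", "db0-replica2"]),  # hypothetical nodes
    Shard("db1-primary", ["db1-replica1"]),
])
assert router.node_for_write(42) == router.node_for_write(42)
```

Queries that span many users (for example, a global leaderboard) have to fan out to every shard, which is the main tradeoff of this scheme.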

Message queues

Decouple producers from consumers, smooth out load spikes, enable async work.

Kafka — log-based, high throughput, durable, replayable. Use for event streaming.
RabbitMQ / SQS — task-queue style, simpler semantics. Use for job processing.

Mention at-least-once vs exactly-once delivery, idempotent consumers, and dead-letter queues.
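The delivery concerns above fit together in a short sketch. Under at-least-once delivery a message can arrive twice, so the consumer tracks processed IDs to stay idempotent. Messages that keep failing are moved to a dead-letter queue after a retry limit. The in-memory `deque`, the message shape, and `MAX_ATTEMPTS` are assumptions for illustration, not the API of Kafka, RabbitMQ, or SQS.

```python
from collections import deque

MAX_ATTEMPTS = 3  # assumed retry limit before dead-lettering


def consume(queue, handler, processed_ids, dead_letters):
    """Drain the queue with at-least-once semantics.

    queue:         deque of dicts with "id" and "body" keys
    handler:       callable that performs the side effect; may raise
    processed_ids: set of message IDs already handled (the idempotency record)
    dead_letters:  list that receives messages that exhausted their retries
    """
    while queue:
        message = queue.popleft()
        if message["id"] in processed_ids:
            continue  # duplicate delivery: the work was already done, so skip it
        try:
            handler(message)
            processed_ids.add(message["id"])
        except Exception:
            message["attempts"] = message.get("attempts", 0) + 1
            if message["attempts"] >= MAX_ATTEMPTS:
                dead_letters.append(message)  # park it for inspection
            else:
                queue.append(message)  # redeliver later


sent = []


def send_email(message):
    if message["body"] == "malformed":
        raise ValueError("cannot process message")
    sent.append(message["body"])


queue = deque([
    {"id": 1, "body": "welcome"},
    {"id": 1, "body": "welcome"},    # duplicate delivery of the same message
    {"id": 2, "body": "malformed"},  # poison message
])
processed, dead = set(), []
consume(queue, send_email, processed, dead)
assert sent == ["welcome"]
assert [m["id"] for m in dead] == [2]
```

In a real system the idempotency record and the side effect should be committed together (for example, in one database transaction). Otherwise a crash between the two can still produce a duplicate.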

CDNs and object storage

A CDN caches static assets at edge locations close to users, which lowers latency and is usually cheaper than serving from origin. Object storage (S3, GCS) holds large blobs cheaply with high durability.

The pattern: store the blob in object storage, serve it through a CDN, keep the metadata in a regular database.
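A minimal sketch of that pattern, with plain dictionaries standing in for the object store and the metadata table. `MediaService`, `BlobMetadata`, and the CDN hostname are hypothetical names chosen for illustration; a real service would call the S3 or GCS client and a database driver instead.

```python
import hashlib
from dataclasses import dataclass

CDN_BASE_URL = "https://cdn.example.com"  # hypothetical CDN hostname


@dataclass
class BlobMetadata:
    object_key: str
    owner_id: int
    content_type: str
    size_bytes: int


class MediaService:
    def __init__(self):
        self.object_store = {}  # stands in for S3/GCS: key -> bytes
        self.metadata_db = {}   # stands in for a database table: key -> row

    def upload(self, owner_id: int, data: bytes, content_type: str) -> str:
        """Store the bytes in object storage and only the metadata in the database."""
        object_key = hashlib.sha256(data).hexdigest()  # content-addressed key
        self.object_store[object_key] = data
        self.metadata_db[object_key] = BlobMetadata(
            object_key=object_key,
            owner_id=owner_id,
            content_type=content_type,
            size_bytes=len(data),
        )
        return object_key

    def public_url(self, object_key: str) -> str:
        """Clients fetch through the CDN, which pulls from object storage on a miss."""
        if object_key not in self.metadata_db:
            raise KeyError(object_key)
        return f"{CDN_BASE_URL}/{object_key}"


service = MediaService()
key = service.upload(owner_id=7, data=b"fake image bytes", content_type="image/png")
url = service.public_url(key)
```

The database row stays small and queryable (who owns this file, how big is it), while the bytes never pass through the database at all.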

Search

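Don't try to use SQL LIKE for real search: a pattern with a leading wildcard can't use a standard B-tree index, and there is no relevance ranking. Reach for Elasticsearch / OpenSearch for full-text search and aggregations. Index asynchronously from your primary store via a message queue.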
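The core data structure behind full-text search is an inverted index, which maps each term to the documents that contain it. The sketch below pairs a toy index with a queue-driven indexer to show the asynchronous flow. `InvertedIndex` and `SearchPipeline` are hypothetical names; this is not how Elasticsearch is implemented, and it handles inserts only (no updates, deletes, or ranking).

```python
import re
from collections import defaultdict, deque


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    def __init__(self):
        self._postings = defaultdict(set)  # term -> set of document IDs

    def add(self, doc_id, text):
        for term in tokenize(text):
            self._postings[term].add(doc_id)

    def search(self, query):
        """Return IDs of documents containing every term in the query."""
        terms = tokenize(query)
        if not terms:
            return set()
        result = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self._postings.get(term, set())
        return result


class SearchPipeline:
    def __init__(self):
        self.primary_store = {}     # source of truth
        self.index_queue = deque()  # stands in for a message queue
        self.index = InvertedIndex()

    def save(self, doc_id, text):
        """The write path only touches the primary store and enqueues an event."""
        self.primary_store[doc_id] = text
        self.index_queue.append(doc_id)

    def run_indexer(self):
        """A worker drains the queue and updates the search index."""
        while self.index_queue:
            doc_id = self.index_queue.popleft()
            self.index.add(doc_id, self.primary_store[doc_id])


pipeline = SearchPipeline()
pipeline.save(1, "Consistent hashing for caches")
assert pipeline.index.search("hashing") == set()  # not indexed yet
pipeline.run_indexer()
assert pipeline.index.search("hashing") == {1}
```

The gap between `save` and `run_indexer` is the price of indexing asynchronously: search results can briefly lag the primary store.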

Putting them together

The skeleton of most systems looks like:

Client → CDN → Load Balancer → App Servers → Cache → Database
                                          ↓
                                    Message Queue → Workers → Object Storage / Search

Once you can sketch that diagram from memory, you can adapt it to almost any system design question.
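The diagram's two paths can also be read as code. Here is a minimal sketch of an app server, assuming a cache-aside read path and a write path that updates the database, invalidates the cache, and enqueues follow-up work. `AppServer` and the event shape are hypothetical, and dictionaries stand in for the real cache and database.

```python
from collections import deque


class AppServer:
    def __init__(self):
        self.cache = {}       # stands in for Redis/Memcached
        self.database = {}    # stands in for the primary database
        self.queue = deque()  # stands in for the message queue feeding workers

    def read(self, key):
        """Cache-aside read: try the cache, fall back to the database, then fill the cache."""
        if key in self.cache:
            return self.cache[key]
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        """Write to the database, invalidate the cached copy, hand async work to workers."""
        self.database[key] = value
        self.cache.pop(key, None)
        self.queue.append({"type": "updated", "key": key})


app = AppServer()
app.write("post:1", "hello")
assert app.read("post:1") == "hello"  # miss: loaded from the database, cached
assert "post:1" in app.cache          # the next read is a cache hit
```

Workers would consume `app.queue` to do the slow parts (thumbnailing into object storage, updating the search index) without holding up the request.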