System Design Components | Senior Interview Guide

How to use this page

For each component below, there are three levels: Can mention it, Can design with it, and Can go deep on it. These map loosely to mid-level, senior, and staff expectations in an interview.

The goal is not to teach you these topics. It is to tell you the specific vocabulary and concepts you need to have ready. For each one you feel weak on, Hello Interview, DDIA (Designing Data-Intensive Applications), or a targeted search will fill the gap faster than re-reading this page.

At senior level, you need to be solidly in the "Can design with it" tier for all of these, and "Can go deep" on at least the components most relevant to your target companies. At staff level, expect follow-up questions that push into the "Can go deep" tier for almost everything.

Kafka / Message Queues

Level	What to know
Can mention it	Distributed message broker. Producers write to topics, consumers read from them. Decouples services so they do not need to call each other directly.
Can design with it	Topics, partitions, consumer groups, offsets. Partition count caps parallelism within a consumer group: each partition is read by at most one consumer in the group, so more partitions allow more consumers to process in parallel. Why Kafka over a direct API call: durability, replay, fan-out, absorbing traffic spikes. Kafka vs. SQS/SNS: SQS is simpler and managed, Kafka is better for high throughput and replay. Redis Pub/Sub: not durable, fine for ephemeral notifications.
Can go deep on it	Replication factor and ISR (in-sync replicas). Leader election on broker failure. At-least-once vs. exactly-once delivery semantics (idempotent producers deduplicate retried sends; transactions extend the guarantee across partitions). Log compaction for maintaining latest state per key. CDC (change data capture) with Debezium: streams row-level changes from a database into Kafka without polling. Consumer lag monitoring. Why you do not want too many partitions: each one adds open file handles, replication work, and leader-election time after a broker failure.
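
To make partitions, offsets, and consumer lag concrete, here is a minimal in-memory sketch. It is not Kafka and uses no Kafka client: `ToyTopic` and `ToyConsumerGroup` are hypothetical names, and CRC32 stands in for the client's real key hash (Kafka's default partitioner uses murmur2). It only illustrates that records with the same key land in one partition in order, and that lag is the gap between the log end and the committed offset.

```python
import zlib
from collections import defaultdict


class ToyTopic:
    """In-memory stand-in for a partitioned, append-only topic."""

    def __init__(self, num_partitions: int) -> None:
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key: str) -> int:
        # Assumption: CRC32 stands in for the real client's key hash.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def produce(self, key: str, value: str) -> tuple:
        """Append a record and return (partition, offset)."""
        p = self.partition_for(key)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1


class ToyConsumerGroup:
    """Tracks one committed offset per partition for a consumer group."""

    def __init__(self, topic: ToyTopic) -> None:
        self.topic = topic
        self.committed = defaultdict(int)  # partition -> next offset to read

    def poll(self, partition: int, max_records: int = 10) -> list:
        start = self.committed[partition]
        records = self.topic.partitions[partition][start:start + max_records]
        # This toy commits as soon as records are handed out.
        self.committed[partition] = start + len(records)
        return records

    def lag(self) -> int:
        """Total records written but not yet consumed by this group."""
        return sum(
            len(log) - self.committed[p]
            for p, log in enumerate(self.topic.partitions)
        )
```

In an interview, the useful point is that ordering holds per partition only, so the choice of key decides which events stay ordered relative to each other.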

Redis / Caching

Level	What to know
Can mention it	In-memory key-value store. Used to cache frequently read data so you do not hit the database on every request.
Can design with it	Cache-aside pattern (application checks cache first, falls back to DB on miss, then populates cache). TTL for expiry. Write-through (write to cache and DB together) vs. write-behind (write to cache, flush to DB async). When caching helps: read-heavy workloads with low write rates. When it hurts: high write rates or where consistency is critical. Know that cache invalidation is where most bugs live.
Can go deep on it	Redis data structures: sorted sets for leaderboards and rate limiting, hashes for object storage, pub/sub for ephemeral notifications, streams as a lightweight alternative to Kafka. Eviction policies: LRU, LFU, volatile-ttl. Redis Sentinel vs. Redis Cluster (Sentinel for HA failover, Cluster for horizontal sharding). Distributed locking with SETNX and expiry, or the single-command form SET key value NX PX <ms> (and why it is imperfect: the lock can expire while its holder is still working, so two clients can both believe they hold it). Hot key problem: one key getting hammered, solutions include local in-process caching and key sharding.
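
The cache-aside read path and invalidate-on-write path described above can be sketched as follows. This is an illustration under stated assumptions, not a Redis client: `TTLCache` is a hypothetical dict-backed stand-in for the cache, a plain dict stands in for the database, and the clock is injectable so expiry can be exercised without sleeping.

```python
import time
from typing import Callable, Optional


class TTLCache:
    """Dict-backed stand-in for a cache such as Redis, with per-key expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._data = {}
        self._clock = clock

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily expire on read
            return None
        return value

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._data[key] = (value, self._clock() + ttl_seconds)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


def get_user(user_id: str, cache: TTLCache, db: dict,
             ttl_seconds: float = 60.0) -> Optional[str]:
    """Cache-aside read: check the cache, fall back to the DB, then populate."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = db.get(user_id)  # hypothetical database lookup
    if value is not None:
        cache.set(key, value, ttl_seconds)
    return value


def update_user(user_id: str, value: str, cache: TTLCache, db: dict) -> None:
    """Write path: update the DB, then invalidate so the next read refills."""
    db[user_id] = value
    cache.delete(f"user:{user_id}")
```

Note what happens if something writes to the database without going through `update_user`: readers see the stale value until the TTL runs out. That window is the consistency cost you are accepting when you add a cache.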

Load Balancer

Level	What to know
Can mention it	Distributes incoming traffic across multiple server instances to improve throughput and availability.
Can design with it	Algorithms: round-robin, least connections, IP hash (for sticky sessions). Health checks to remove unhealthy instances. Layer 4 (TCP/UDP, faster, simpler) vs. Layer 7 (HTTP-aware, can route by path or header). Why sticky sessions are a problem: they pin users to one instance, which skews load and loses the session when that instance dies or is replaced; prefer stateless services with session state in Redis instead.
Can go deep on it	Global load balancing via DNS with geo-routing (Route 53 latency-based routing). SSL/TLS termination at the load balancer to offload crypto from app servers. Connection draining during deployments: let in-flight requests finish before pulling an instance. Anycast for routing requests to the nearest data center. Difference between a load balancer and an API gateway (LB is infrastructure-level traffic distribution; API gateway understands application semantics like auth, routing by endpoint, and rate limiting).
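
As a rough illustration of least-connections selection combined with health checks, the sketch below picks the healthy backend with the fewest in-flight requests. `Backend` and `LeastConnectionsBalancer` are hypothetical names, and a real load balancer would run health probes itself rather than being told which instance is unhealthy.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    healthy: bool = True
    active_connections: int = 0


class LeastConnectionsBalancer:
    """Routes each new request to the healthy backend with the fewest connections."""

    def __init__(self, backends: list) -> None:
        self.backends = backends

    def acquire(self) -> Backend:
        candidates = [b for b in self.backends if b.healthy]
        if not candidates:
            raise RuntimeError("no healthy backends")
        # min() returns the first minimum, so ties resolve in list order.
        chosen = min(candidates, key=lambda b: b.active_connections)
        chosen.active_connections += 1
        return chosen

    def release(self, backend: Backend) -> None:
        backend.active_connections = max(0, backend.active_connections - 1)

    def mark_unhealthy(self, name: str) -> None:
        # Stops new traffic only; in-flight connections are left to finish,
        # which is the idea behind connection draining.
        for b in self.backends:
            if b.name == name:
                b.healthy = False
```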

CDN

Level	What to know
Can mention it	Content delivery network. Caches static assets at edge nodes geographically close to users to reduce latency.
Can design with it	What to CDN: static files, images, video, public API responses that are the same for everyone. What not to CDN: personalized content, anything requiring auth checks per request. Pull CDN (edge fetches from origin on first miss) vs. push CDN (you push content to edge proactively). Cache-control headers and TTL. Cache invalidation: harder than it sounds, often requires versioned URLs or explicit purge APIs.
Can go deep on it	Edge computing (running logic at the CDN edge, e.g., Cloudflare Workers). CDN for video streaming: range requests, adaptive bitrate. Why CDNs also provide DDoS protection (traffic is absorbed across many edge nodes before reaching origin). The tradeoff between long TTLs (better performance, harder to update) and short TTLs (easier to update, more origin load). Multi-CDN strategies for redundancy.
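
The versioned-URL approach to CDN invalidation can be sketched like this: the file name carries a hash of the content, so a changed file gets a new URL and the old one can be cached for a long time. The function names and the specific max-age values are illustrative assumptions, not a recommendation for any particular CDN.

```python
import hashlib
import posixpath


def versioned_asset_path(path: str, content: bytes) -> str:
    """Insert a short content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:12]
    stem, ext = posixpath.splitext(path)
    return f"{stem}.{digest}{ext}"


def cache_headers(is_versioned: bool) -> dict:
    """Long TTL for fingerprinted assets, short TTL for everything else."""
    if is_versioned:
        # Safe to cache for a year: new content always gets a new URL.
        return {"Cache-Control": "public, max-age=31536000, immutable"}
    # Unversioned URLs need a short TTL (or an explicit purge) to pick up changes.
    return {"Cache-Control": "public, max-age=60"}
```

This is the long-TTL versus short-TTL tradeoff in practice: fingerprinted assets get the long TTL for free, while the HTML that references them keeps a short TTL so that new asset URLs are picked up quickly.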

SQL Databases

Level	What to know
Can mention it	Relational database with ACID guarantees. Structured schema, joins, transactions. Default choice for most transactional data.
Can design with it	Indexes: B-tree for range queries and equality, hash for equality only. Read replicas for scaling reads. Connection pooling (PgBouncer). When to reach for SQL vs. NoSQL: SQL for complex queries, joins, transactions; NoSQL for massive scale with known access patterns. Primary key vs. unique index vs. composite index. The N+1 query problem and how to solve it with eager loading.
Can go deep on it	MVCC (multi-version concurrency control): how databases keep multiple row versions so that readers do not block writers and writers do not block readers (concurrent writes to the same row still conflict). Isolation levels: read uncommitted, read committed, repeatable read, serializable, and which anomalies each prevents (dirty reads, non-repeatable reads, phantom reads). WAL (write-ahead log): the basis of durability and replication. Sharding: horizontal partitioning by hash or range key. Why you avoid sharding until necessary: joins and transactions that span shards become slow, complex, or impossible. Avoid saying "just shard it" without acknowledging the operational cost.
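
The N+1 query problem mentioned above is easy to show with the standard-library `sqlite3` module. The schema, the `CountingDB` wrapper, and the function names below are hypothetical; the sketch only contrasts one query per parent row with a single JOIN, which is roughly what an ORM's eager loading does for you.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE posts (
            id INTEGER PRIMARY KEY,
            author_id INTEGER NOT NULL REFERENCES authors(id),
            title TEXT NOT NULL
        );
        CREATE INDEX idx_posts_author ON posts(author_id);
        INSERT INTO authors VALUES (1, 'ada'), (2, 'grace');
        INSERT INTO posts VALUES (1, 1, 'p1'), (2, 1, 'p2'), (3, 2, 'p3');
    """)
    return conn


class CountingDB:
    """Thin wrapper that counts how many queries are issued."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.queries = 0

    def query(self, sql: str, params: tuple = ()) -> list:
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def posts_n_plus_one(db: CountingDB) -> dict:
    """One query for the authors, then one more query per author."""
    result = {}
    for author_id, name in db.query("SELECT id, name FROM authors ORDER BY id"):
        rows = db.query(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id",
            (author_id,),
        )
        result[name] = [title for (title,) in rows]
    return result


def posts_eager(db: CountingDB) -> dict:
    """A single JOIN fetches authors and their posts together."""
    rows = db.query("""
        SELECT a.name, p.title
        FROM authors a LEFT JOIN posts p ON p.author_id = a.id
        ORDER BY a.id, p.id
    """)
    result = {}
    for name, title in rows:
        result.setdefault(name, [])
        if title is not None:  # authors with no posts yield a NULL title
            result[name].append(title)
    return result
```

With two authors the difference is three queries against one; with ten thousand authors it is ten thousand and one round trips against one.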

NoSQL (DynamoDB / MongoDB)

Level	What to know
Can mention it	Schema-flexible databases optimized for specific, high-volume access patterns. Trade query flexibility for scale and performance.
Can design with it	DynamoDB: partition key (determines shard), sort key (enables range queries within a partition). GSIs (global secondary indexes) for querying by non-key attributes. Eventual consistency vs. strong consistency (strong reads cost more). Single-table design: model multiple entity types in one table using composite key patterns. MongoDB: documents as JSON objects, collections, flexible schema. When to use NoSQL: you know your access patterns upfront and they are simple; you need to scale writes horizontally beyond what a single SQL primary can handle.
Can go deep on it	Hot partition problem in DynamoDB: too many writes to one partition key causes throttling. Solutions: add random suffix to key (write sharding), use a high-cardinality key, or use DynamoDB's adaptive capacity. DynamoDB Streams for CDC into Kafka or Lambda. MongoDB replica sets, write concern (how many replicas must acknowledge before write returns), read preference (primary vs. secondary reads). Wide-column stores (Cassandra, HBase) vs. document stores vs. key-value: when each is appropriate. CAP theorem: during a network partition, you choose consistency or availability. Cassandra and DynamoDB (with its default eventually consistent reads) are usually described as favoring availability, while MongoDB and HBase favor consistency, so avoid claiming that every NoSQL system is AP.
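
Write sharding, the first hot-partition fix listed above, can be sketched with a plain dict standing in for the table. `NUM_SHARDS`, the `#` separator, and the function names are illustrative assumptions; no DynamoDB API is called here.

```python
import random

NUM_SHARDS = 4  # hypothetical shard count; tune to the write rate you expect


def sharded_key(base_key: str, rng: random.Random) -> str:
    """Spread writes for one logical key across several physical keys."""
    return f"{base_key}#{rng.randrange(NUM_SHARDS)}"


def all_shard_keys(base_key: str) -> list:
    return [f"{base_key}#{i}" for i in range(NUM_SHARDS)]


def record_vote(table: dict, candidate: str, rng: random.Random) -> None:
    """Write path: increment a counter under a randomly chosen shard key."""
    key = sharded_key(candidate, rng)
    table[key] = table.get(key, 0) + 1


def total_votes(table: dict, candidate: str) -> int:
    """Read path: the price of write sharding is a scatter-gather read."""
    return sum(table.get(k, 0) for k in all_shard_keys(candidate))
```

The tradeoff to state out loud: writes no longer pile onto one partition, but every read of the total now touches `NUM_SHARDS` keys.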

Blob Storage (S3)

Level	What to know
Can mention it	Object storage for unstructured data: images, videos, documents, backups. Essentially infinitely scalable, cheap, and durable.
Can design with it	Pre-signed URLs: generate a short-lived URL that lets a client upload directly to S3 without routing through your server. This is the correct pattern for file upload: do not proxy large files through your application. Multipart upload for large files (AWS suggests considering it from about 100 MB, and a single PUT is capped at 5 GB). Storage classes: Standard, Infrequent Access, Glacier. Use lifecycle policies to move objects between them automatically. Bucket policies and IAM roles for access control.
Can go deep on it	S3's strong read-after-write consistency (guaranteed since December 2020; no longer eventual). Event notifications: S3 can trigger Lambda or SQS on object creation, useful for async processing pipelines (e.g., image uploaded → resize job queued). Versioning and object lock for compliance use cases. Cross-region replication for disaster recovery. Difference between S3 and a database: S3 has no secondary indexes, so you can only fetch objects by key or list them by key prefix. If you need to search across metadata, maintain a separate index in a database.
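
The idea behind a pre-signed URL is that the server signs the method, object key, and expiry with a secret, and the storage service verifies that signature instead of requiring credentials from the client. The sketch below is a toy HMAC scheme for illustration only: it is not AWS Signature Version 4, which real S3 pre-signed URLs use, and `SECRET`, `sign_url`, `verify_url`, and the example host are all hypothetical.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"  # hypothetical signing key held only by the server


def _signature(method: str, object_key: str, expires_at: int) -> str:
    message = f"{method}\n{object_key}\n{expires_at}".encode("utf-8")
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def sign_url(base_url: str, object_key: str, method: str, expires_at: int) -> str:
    """Return a URL that authorizes one method on one key until expires_at."""
    query = urlencode({
        "expires": expires_at,
        "signature": _signature(method, object_key, expires_at),
    })
    return f"{base_url}/{object_key}?{query}"


def verify_url(url: str, method: str, now: int) -> bool:
    """What the storage side would check before accepting the request."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    if now > expires_at:
        return False  # link has expired
    object_key = parsed.path.lstrip("/")
    expected = _signature(method, object_key, expires_at)
    return hmac.compare_digest(expected, signature)
```

Because the method and key are part of the signed message, a URL issued for uploading one object cannot be reused to read it or to write a different object.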

Rate Limiting

Level	What to know
Can mention it	Throttles request rates to protect services from abuse and prevent individual clients from overwhelming shared resources.
Can design with it	Token bucket (allows bursting up to a limit, then refills at a fixed rate) vs. fixed window (counts requests in a time window, resets at boundary — vulnerable to burst at window edge) vs. sliding window (smoother, no burst at boundary). Implementation: Redis INCR + EXPIRE for distributed rate limiting across multiple application nodes. Where to apply: API gateway, application layer, or both. Rate limit by user ID, by IP, or by API key depending on what you are protecting.
Can go deep on it	The boundary burst problem with fixed windows: a client can make 2x the intended limit by sending requests at the end of one window and the start of the next. Sliding window log (precise but memory-intensive) vs. sliding window counter (approximation, more efficient). Rate limit headers: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, Retry-After. Graceful degradation vs. hard rejection: returning cached or degraded responses instead of 429s for non-critical endpoints. Distinguishing between per-user limits (fairness) and global limits (protection).
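
To see the boundary burst problem next to a token bucket, here is a single-process sketch of both. The class names are hypothetical and time is passed in explicitly so the behavior is deterministic; a distributed version would keep these counters in a shared store such as Redis rather than in process memory.

```python
class FixedWindowLimiter:
    """Counts requests per fixed window; the count resets at each boundary."""

    def __init__(self, limit: int, window_seconds: int) -> None:
        self.limit = limit
        self.window = window_seconds
        self.counts = {}  # window index -> request count (never pruned in this toy)

    def allow(self, now: float) -> bool:
        window_index = int(now // self.window)
        count = self.counts.get(window_index, 0)
        if count >= self.limit:
            return False
        self.counts[window_index] = count + 1
        return True


class TokenBucket:
    """Allows bursts up to `capacity`, then refills at a steady rate."""

    def __init__(self, capacity: int, refill_per_second: float,
                 now: float = 0.0) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)
        self.last_refill = now

    def allow(self, now: float) -> bool:
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity,
                          self.tokens + elapsed * self.refill_per_second)
        self.last_refill = max(self.last_refill, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

With a limit of 10 per 60 seconds, the fixed-window limiter accepts 10 requests at t=59 and another 10 at t=60, which is 20 requests in about one second. The token bucket bounds any burst by its capacity regardless of where the clock boundary falls.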

Search (Elasticsearch)

Level	What to know
Can mention it	Full-text search engine built on an inverted index. Used when SQL LIKE queries are too slow or too limited for the search requirements.
Can design with it	Inverted index: maps terms to the documents that contain them, enabling fast full-text lookup. Why not use SQL for search: LIKE queries with a leading wildcard (LIKE '%term%') cannot use a B-tree index, and LIKE cannot rank by relevance or handle stemming and synonyms. The pattern: primary data lives in your main database, a sync process (CDC or async event) keeps Elasticsearch in sync. Elasticsearch is not your source of truth; it is a read-optimized replica. Documents, indices, fields. Relevance scoring (BM25 by default).
Can go deep on it	Shards and replicas in an ES cluster: primary shards for write distribution, replicas for read scaling and HA. Near-real-time search: there is roughly a 1-second delay between indexing a document and it being searchable (due to segment refresh). Index refresh vs. flush (refresh makes docs visible, flush persists to disk). Why ES is not a good primary database: no ACID transactions, schema changes are painful, and consistency guarantees are weaker. When to use a vector database instead (for semantic/similarity search using embeddings rather than keyword matching).
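
A toy inverted index makes the "terms map to documents" idea concrete. This sketch is nowhere near what Elasticsearch does (no stemming, no relevance scoring, no segments); `InvertedIndex` is a hypothetical class that only shows why a term lookup avoids scanning every document.

```python
import re
from collections import defaultdict


class InvertedIndex:
    """Maps each term to the set of document IDs that contain it."""

    def __init__(self) -> None:
        self.postings = defaultdict(set)

    @staticmethod
    def tokenize(text: str) -> list:
        # Lowercase and split on anything that is not a letter or digit.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, doc_id: int, text: str) -> None:
        for term in self.tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query: str) -> set:
        """Return IDs of documents containing every query term (AND semantics)."""
        terms = self.tokenize(query)
        if not terms:
            return set()
        result = None
        for term in terms:
            ids = self.postings.get(term, set())
            result = set(ids) if result is None else result & ids
        return result
```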

API Gateway

Level	What to know
Can mention it	Single entry point for external API traffic. Handles cross-cutting concerns so individual services do not have to implement them independently.
Can design with it	What an API gateway handles: authentication/authorization, rate limiting, request routing, SSL termination, request/response transformation, logging. The difference from a load balancer: a load balancer distributes traffic based on network-level criteria; an API gateway understands application semantics (routes /users to the user service, /orders to the order service, validates JWTs). Examples: AWS API Gateway, Kong, nginx as an API gateway.
Can go deep on it	BFF (backend for frontend) pattern: a separate API gateway per client type (mobile, web, third-party) that aggregates and transforms data for that client's specific needs, rather than forcing clients to make many microservice calls. Service mesh vs. API gateway: a service mesh (Istio, Linkerd) handles service-to-service communication inside the cluster (mTLS, retries, circuit breaking); an API gateway handles external-to-internal traffic. Circuit breaker pattern: if a downstream service is failing, the gateway stops sending requests to it for a period rather than cascading failures. GraphQL federation as an alternative gateway pattern for complex service graphs.
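
The circuit breaker pattern mentioned above has three states: closed (calls pass through), open (calls are rejected immediately), and half-open (one trial call is allowed after a cool-down). A minimal single-threaded sketch follows; `CircuitBreaker`, `CircuitOpenError`, and the threshold and timeout parameters are hypothetical names, and production libraries add concurrency control and metrics that are omitted here.

```python
import time
from typing import Callable, Optional


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the downstream."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int, reset_timeout: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "closed"
        if self.clock() - self.opened_at >= self.reset_timeout:
            return "half-open"  # cool-down elapsed: allow a trial call
        return "open"

    def call(self, func: Callable, *args, **kwargs):
        if self.state == "open":
            raise CircuitOpenError("downstream call skipped")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.state == "half-open" or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Failing fast while the circuit is open is what stops a slow downstream service from tying up gateway threads and cascading the failure upstream.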


Last updated: May 2026