How to use this page
For each component below, there are three tiers: "Can mention it",
"Can design with it", and "Can go deep on it". These
map loosely to mid-level, senior, and staff expectations in an interview.
The goal is not to teach you these topics. It is to tell you the specific vocabulary
and concepts you need to have ready. For each one you feel weak on, Hello Interview,
DDIA, or a targeted search will fill the gap faster than re-reading this page.
At senior level, you need to be solidly in the "Can design with it" tier for all of
these, and "Can go deep" on at least the components most relevant to your target
companies. At staff level, expect follow-up questions that push into the "Can go deep"
tier for almost everything.
Kafka / Message Queues
| Level | What to know |
| --- | --- |
| Can mention it | Distributed message broker. Producers write to topics, consumers read from them. Decouples services so they do not need to call each other directly. |
| Can design with it | Topics, partitions, consumer groups, offsets. Partition count caps parallelism: within a consumer group, each partition is read by at most one consumer, so more partitions allow more consumers to process in parallel. Why Kafka over a direct API call: durability, replay, fan-out, absorbing traffic spikes. Kafka vs. SQS/SNS: SQS is simpler and managed, Kafka is better for high throughput and replay. Redis Pub/Sub: not durable, fine for ephemeral notifications. |
| Can go deep on it | Replication factor and ISR (in-sync replicas). Leader election on broker failure. At-least-once vs. exactly-once delivery semantics (and how idempotent producers help). Log compaction for maintaining latest state per key. CDC (change data capture) with Debezium: streams row-level changes from a database into Kafka without polling. Consumer lag monitoring. Why you do not want too many partitions: each one adds overhead (open file handles, replication traffic, and longer leader elections after a broker failure). |
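
Two ideas from the table, key-based partitioning and log compaction, are easier to explain once you have seen them in a few lines. The following is a minimal in-memory sketch, not Kafka client code: `partition_for` and `compact` are hypothetical names, CRC32 stands in for the murmur2 hash that Kafka's default partitioner uses, and tombstones (null values that delete a key) are ignored.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition with a stable hash.

    Assumption: CRC32 is a stand-in; Kafka's default partitioner uses murmur2.
    The point is only that the same key always lands on the same partition.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def compact(log):
    """Keep only the newest record per key, as log compaction does.

    `log` is a list of (key, value) pairs; the list index plays the role
    of the offset. Survivors keep their original offsets.
    """
    latest = {}
    for offset, (key, value) in enumerate(log):
        latest[key] = (offset, value)  # later records overwrite earlier ones
    return sorted((off, key, val) for key, (off, val) in latest.items())


log = [
    ("user-1", "a@x.com"),
    ("user-2", "b@x.com"),
    ("user-1", "a2@x.com"),  # supersedes offset 0 for user-1
]
compacted = compact(log)
```

After compaction only offsets 1 and 2 remain, which is why a compacted topic can rebuild the latest state per key without replaying every historical update.
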
Redis / Caching
| Level | What to know |
| --- | --- |
| Can mention it | In-memory key-value store. Used to cache frequently read data so you do not hit the database on every request. |
| Can design with it | Cache-aside pattern (application checks cache first, falls back to DB on miss, then populates cache). TTL for expiry. Write-through (write to cache and DB together) vs. write-behind (write to cache, flush to DB async). When caching helps: read-heavy workloads with low write rates. When it hurts: high write rates or where consistency is critical. Know that cache invalidation is where most bugs live. |
| Can go deep on it | Redis data structures: sorted sets for leaderboards and rate limiting, hashes for object storage, pub/sub for ephemeral notifications, streams as a lightweight alternative to Kafka. Eviction policies: LRU, LFU, volatile-ttl. Redis Sentinel vs. Redis Cluster (Sentinel for HA failover, Cluster for horizontal sharding). Distributed locking with SET key value NX PX ttl, the atomic form of SETNX plus expiry (and why it is imperfect: the lock can expire while its holder is still working, and a failover can lose it). Hot key problem: one key getting hammered; solutions include local in-process caching and key sharding. |
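
The cache-aside pattern and TTL expiry described above can be sketched without a Redis server. This is an illustration under stated assumptions: `TTLCache` is a hypothetical in-memory stand-in for Redis, `db` is a plain dict standing in for the database, and the injectable `clock` exists only so expiry can be demonstrated deterministically.

```python
import time


class TTLCache:
    """Hypothetical in-memory stand-in for a cache with per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._store = {}
        self._clock = clock

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # lazily drop expired entries
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._store[key] = (value, self._clock() + ttl_seconds)

    def delete(self, key):
        self._store.pop(key, None)


def get_user(user_id, cache, db, ttl_seconds=60):
    """Cache-aside read: check the cache, fall back to the DB, then populate."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    row = db[user_id]  # cache miss: read from the source of truth
    cache.set(key, row, ttl_seconds)
    return row


def update_user(user_id, row, cache, db):
    """Write to the DB, then invalidate so the next read repopulates."""
    db[user_id] = row
    cache.delete(f"user:{user_id}")
```

Any write path that skips the `delete` call serves stale data until the TTL expires, which is the kind of invalidation bug the table warns about.
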
Load Balancer
| Level | What to know |
| --- | --- |
| Can mention it | Distributes incoming traffic across multiple server instances to improve throughput and availability. |
| Can design with it | Algorithms: round-robin, least connections, IP hash (for sticky sessions). Health checks to remove unhealthy instances. Layer 4 (TCP/UDP, faster, simpler) vs. Layer 7 (HTTP-aware, can route by path or header). Why sticky sessions are a problem: they pin each client to one instance, so load spreads unevenly and session state is lost when that instance fails or is removed; prefer stateless services with session state in Redis instead. |
| Can go deep on it | Global load balancing via DNS with geo-routing or latency-based routing (e.g., Route 53 routing policies). SSL/TLS termination at the load balancer to offload crypto from app servers. Connection draining during deployments: let in-flight requests finish before pulling an instance. Anycast for routing requests to the nearest data center. Difference between a load balancer and an API gateway (LB is infrastructure-level traffic distribution; API gateway understands application semantics like auth, routing by endpoint, and rate limiting). |
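
To make the algorithm names concrete, here is a small sketch of round-robin with health checks and of least connections. The class names and the `mark` method are hypothetical; real load balancers track health through periodic probes rather than explicit calls.

```python
import itertools


class RoundRobin:
    """Cycle through backends in order, skipping any marked unhealthy."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._healthy = set(self._backends)
        self._cycle = itertools.cycle(self._backends)

    def mark(self, backend, healthy):
        # Stand-in for the result of a health check probe.
        if healthy:
            self._healthy.add(backend)
        else:
            self._healthy.discard(backend)

    def pick(self):
        # Try each backend at most once per call.
        for _ in range(len(self._backends)):
            candidate = next(self._cycle)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy backends")


class LeastConnections:
    """Send each new request to the backend with the fewest active requests."""

    def __init__(self, backends):
        self.active = {backend: 0 for backend in backends}

    def acquire(self):
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        self.active[backend] -= 1
```

Round-robin assumes requests cost about the same; least connections adapts when some requests are much slower than others, at the price of tracking per-backend state.
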
CDN
| Level | What to know |
| --- | --- |
| Can mention it | Content delivery network. Caches static assets at edge nodes geographically close to users to reduce latency. |
| Can design with it | What to CDN: static files, images, video, public API responses that are the same for everyone. What not to CDN: personalized content, anything requiring auth checks per request. Pull CDN (edge fetches from origin on first miss) vs. push CDN (you push content to edge proactively). Cache-control headers and TTL. Cache invalidation: harder than it sounds, often requires versioned URLs or explicit purge APIs. |
| Can go deep on it | Edge computing (running logic at the CDN edge, e.g., Cloudflare Workers). CDN for video streaming: range requests, adaptive bitrate. Why CDNs also provide DDoS protection (traffic is absorbed across many edge nodes before reaching origin). The tradeoff between long TTLs (better performance, harder to update) and short TTLs (easier to update, more origin load). Multi-CDN strategies for redundancy. |
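
Versioned URLs are the usual way around the TTL tradeoff: if the URL changes whenever the content changes, the asset can be cached for a very long time. A minimal sketch follows; `versioned_url` and the `cdn.example.com` host are hypothetical, and the 12-character digest length is an arbitrary choice for illustration.

```python
import hashlib
import posixpath

# Safe for content-addressed URLs because a changed file gets a new URL.
LONG_LIVED_CACHE_CONTROL = "public, max-age=31536000, immutable"


def versioned_url(base_url: str, path: str, content: bytes) -> str:
    """Embed a content hash in the file name so each version has its own URL."""
    digest = hashlib.sha256(content).hexdigest()[:12]
    stem, ext = posixpath.splitext(path)
    return f"{base_url}/{stem}.{digest}{ext}"


old = versioned_url("https://cdn.example.com", "css/app.css", b"body { color: red }")
new = versioned_url("https://cdn.example.com", "css/app.css", b"body { color: blue }")
```

Here `old` and `new` differ, so deploying the new stylesheet needs no purge: pages simply start referencing the new URL while the old one ages out of the edge caches.
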
SQL Databases
| Level | What to know |
| --- | --- |
| Can mention it | Relational database with ACID guarantees. Structured schema, joins, transactions. Default choice for most transactional data. |
| Can design with it | Indexes: B-tree for range queries and equality, hash for equality only. Read replicas for scaling reads. Connection pooling (PgBouncer). When to reach for SQL vs. NoSQL: SQL for complex queries, joins, transactions; NoSQL for massive scale with known access patterns. Primary key vs. unique index vs. composite index. The N+1 query problem and how to solve it with eager loading. |
| Can go deep on it | MVCC (multi-version concurrency control): how databases let readers and writers proceed concurrently without blocking each other, by keeping multiple versions of a row. Isolation levels: read uncommitted, read committed, repeatable read, serializable, and which anomalies each prevents (dirty reads, non-repeatable reads, phantom reads). WAL (write-ahead log): the basis of durability and replication. Sharding: horizontal partitioning by hash or range key. Why you avoid sharding until necessary: cross-shard joins and transactions become expensive or unavailable. Avoid saying "just shard it" without acknowledging the operational cost. |
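
The N+1 query problem is easiest to explain by counting queries. The sketch below uses Python's built-in `sqlite3` with a hypothetical `authors`/`posts` schema. The JOIN version is what an ORM's eager loading typically boils down to, though the exact SQL an ORM emits varies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL REFERENCES authors(id),
        title TEXT NOT NULL
    );
    CREATE INDEX idx_posts_author ON posts(author_id);
    INSERT INTO authors VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO posts VALUES (1, 1, 'Engines'), (2, 1, 'Notes'), (3, 2, 'Compilers');
""")


def titles_n_plus_one(conn):
    """One query for the authors, then one more query per author."""
    queries = 1
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    result = {}
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id", (author_id,)
        ).fetchall()
        queries += 1
        result[name] = [title for (title,) in rows]
    return result, queries


def titles_joined(conn):
    """Same data in a single round trip using a JOIN."""
    rows = conn.execute("""
        SELECT a.name, p.title
        FROM authors a LEFT JOIN posts p ON p.author_id = a.id
        ORDER BY a.id, p.id
    """).fetchall()
    result = {}
    for name, title in rows:
        result.setdefault(name, [])
        if title is not None:  # authors with no posts still appear
            result[name].append(title)
    return result, 1
```

With two authors the first version issues three queries and the second issues one; with ten thousand authors the gap is what takes a page from milliseconds to seconds.
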
NoSQL (DynamoDB / MongoDB)
| Level | What to know |
| --- | --- |
| Can mention it | Schema-flexible databases optimized for specific, high-volume access patterns. Trade query flexibility for scale and performance. |
| Can design with it | DynamoDB: partition key (determines shard), sort key (enables range queries within a partition). GSIs (global secondary indexes) for querying by non-key attributes. Eventual consistency vs. strong consistency (strong reads cost more). Single-table design: model multiple entity types in one table using composite key patterns. MongoDB: documents as JSON objects, collections, flexible schema. When to use NoSQL: you know your access patterns upfront and they are simple; you need to scale writes horizontally beyond what a single SQL primary can handle. |
| Can go deep on it | Hot partition problem in DynamoDB: too many writes to one partition key causes throttling. Solutions: add random suffix to key (write sharding), use a high-cardinality key, or use DynamoDB's adaptive capacity. DynamoDB Streams for CDC into Kafka or Lambda. MongoDB replica sets, write concern (how many replicas must acknowledge before write returns), read preference (primary vs. secondary reads). Wide-column stores (Cassandra, HBase) vs. document stores vs. key-value: when each is appropriate. CAP theorem: during a network partition, you choose consistency or availability. Many NoSQL systems (Cassandra, DynamoDB with eventually consistent reads) lean toward availability, but not all: MongoDB and HBase favor consistency by default. |
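
Write sharding is worth being able to sketch, because the read side is where candidates get caught out. The example below is a toy model, not DynamoDB code: `table` is a dict standing in for a table keyed by partition key, the shard count of 8 is arbitrary, and all function names are hypothetical.

```python
import random

SHARD_COUNT = 8  # assumption: chosen for illustration only

table = {}  # stand-in for a table: partition key -> counter


def sharded_key(base_key, shard_count=SHARD_COUNT, rng=random):
    """Append a random suffix so writes spread across several partition keys."""
    return f"{base_key}#{rng.randrange(shard_count)}"


def all_shard_keys(base_key, shard_count=SHARD_COUNT):
    """Every key a reader must visit to reassemble the logical item."""
    return [f"{base_key}#{i}" for i in range(shard_count)]


def record_vote(candidate, rng=random):
    key = sharded_key(f"votes:{candidate}", rng=rng)
    table[key] = table.get(key, 0) + 1


def total_votes(candidate):
    # Reads fan out to all shards and aggregate: the cost of cheaper writes.
    return sum(table.get(key, 0) for key in all_shard_keys(f"votes:{candidate}"))
```

The tradeoff to state out loud: writes no longer pile onto one partition, but every read becomes `SHARD_COUNT` lookups plus an aggregation step.
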
Blob Storage (S3)
| Level | What to know |
| --- | --- |
| Can mention it | Object storage for unstructured data: images, videos, documents, backups. Essentially infinitely scalable, cheap, and durable. |
| Can design with it | Pre-signed URLs: generate a short-lived URL that lets a client upload directly to S3 without routing through your server. This is the standard pattern for file upload; do not proxy large files through your application. Multipart upload for large files (AWS recommends it above roughly 100 MB, and a single PUT is capped at 5 GB). Storage classes: Standard, Infrequent Access, Glacier; use lifecycle policies to move objects between them automatically. Bucket policies and IAM roles for access control. |
| Can go deep on it | S3's strong read-after-write consistency (guaranteed since late 2020; no longer eventual). Event notifications: S3 can trigger Lambda or SQS on object creation, useful for async processing pipelines (e.g., image uploaded → resize job queued). Versioning and object lock for compliance use cases. Cross-region replication for disaster recovery. Difference between S3 and a database: S3 has no query engine or secondary indexes over your objects; you fetch objects by key or list them by key prefix. If you need to search across metadata, maintain a separate index in a database. |
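
If an interviewer asks how a pre-signed URL can be trusted without a server round trip, the answer is a signature over the request details plus an expiry. The sketch below shows that idea only. It is not AWS Signature Version 4 and will not work against S3; in practice the AWS SDK generates these URLs. `SECRET`, `presign`, `verify`, and the `storage.example.com` host are all hypothetical.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"server-side-secret"  # hypothetical key known only to the signer


def _sign(method, path, expires_at):
    message = f"{method}\n{path}\n{expires_at}".encode("utf-8")
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def presign(method, path, expires_in, now=None):
    """Return a URL that authorizes one method on one path until it expires."""
    issued_at = time.time() if now is None else now
    expires_at = int(issued_at + expires_in)
    query = urlencode({"expires": expires_at, "signature": _sign(method, path, expires_at)})
    return f"https://storage.example.com{path}?{query}"


def verify(method, url, now=None):
    """What the storage service does: recompute the signature and check expiry."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    current = time.time() if now is None else now
    if current > expires_at:
        return False
    expected = _sign(method, parsed.path, expires_at)
    return hmac.compare_digest(expected, signature)
```

Because the method, path, and expiry are all inside the signed message, a client cannot reuse an upload URL to read a different object or extend its lifetime.
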
Rate Limiting
| Level | What to know |
| --- | --- |
| Can mention it | Throttles request rates to protect services from abuse and prevent individual clients from overwhelming shared resources. |
| Can design with it | Token bucket (allows bursting up to a limit, then refills at a fixed rate) vs. fixed window (counts requests in a time window, resets at boundary — vulnerable to burst at window edge) vs. sliding window (smoother, no burst at boundary). Implementation: Redis INCR + EXPIRE for distributed rate limiting across multiple application nodes. Where to apply: API gateway, application layer, or both. Rate limit by user ID, by IP, or by API key depending on what you are protecting. |
| Can go deep on it | The boundary burst problem with fixed windows: a client can make 2x the intended limit by sending requests at the end of one window and the start of the next. Sliding window log (precise but memory-intensive) vs. sliding window counter (approximation, more efficient). Rate limit headers: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, Retry-After. Graceful degradation vs. hard rejection: returning cached or degraded responses instead of 429s for non-critical endpoints. Distinguishing between per-user limits (fairness) and global limits (protection). |
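
The token bucket and the fixed-window boundary burst are both short enough to write on a whiteboard. This is a single-process sketch with timestamps passed in explicitly so the behavior is deterministic; class names are hypothetical. In a distributed setup the fixed-window counter would live in Redis (INCR on a per-window key, with EXPIRE to clean it up) rather than in a dict.

```python
class TokenBucket:
    """Allow bursts up to `capacity`, then refill at a fixed rate."""

    def __init__(self, capacity, refill_per_second, now=0.0):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.tokens = float(capacity)
        self.updated_at = now

    def allow(self, now):
        # Refill lazily based on time elapsed since the last call.
        elapsed = now - self.updated_at
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class FixedWindow:
    """Count requests per window; the count resets at each window boundary."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        self.counts = {}

    def allow(self, now):
        window = int(now // self.window_seconds)
        count = self.counts.get(window, 0)
        if count >= self.limit:
            return False
        self.counts[window] = count + 1
        return True


# Boundary burst: 5 requests just before t=60 and 5 just after all pass,
# so 10 requests get through in 0.2 seconds against a limit of 5 per minute.
fixed = FixedWindow(limit=5, window_seconds=60)
burst = [fixed.allow(59.9) for _ in range(5)] + [fixed.allow(60.1) for _ in range(5)]
```

A token bucket with the same numbers would admit the first 5 and then roughly one request per refill interval, which is why it is the usual answer when burst control matters.
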
Search (Elasticsearch)
| Level | What to know |
| --- | --- |
| Can mention it | Full-text search engine built on an inverted index. Used when SQL LIKE queries are too slow or too limited for the search requirements. |
| Can design with it | Inverted index: maps terms to the documents that contain them, enabling fast full-text lookup. Why not use SQL for search: LIKE queries with a leading wildcard (LIKE '%term%') cannot use a B-tree index, and LIKE cannot rank by relevance or handle stemming or synonyms. The pattern: primary data lives in your main database, a sync process (CDC or async event) keeps Elasticsearch in sync. Elasticsearch is not your source of truth; it is a read-optimized replica. Documents, indices, fields. Relevance scoring (BM25 by default). |
| Can go deep on it | Shards and replicas in an ES cluster: primary shards for write distribution, replicas for read scaling and HA. Near-real-time search: there is roughly a 1-second delay between indexing a document and it being searchable (due to segment refresh). Index refresh vs. flush (refresh makes docs visible, flush persists to disk). Why ES is not a good primary database: no ACID transactions, schema changes are painful, and consistency guarantees are weaker. When to use a vector database instead (for semantic/similarity search using embeddings rather than keyword matching). |
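
An inverted index is simple enough to build in a few lines, which helps when explaining why term lookup is fast. This is a toy sketch, not how Elasticsearch is implemented: `InvertedIndex` is a hypothetical class, tokenization is a lowercase split with no stemming, and there is no relevance scoring.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on non-alphanumerics. No stemming or synonyms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    def __init__(self):
        self.postings = defaultdict(set)  # term -> set of document ids

    def add(self, doc_id, text):
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        """Return ids of documents containing every term in the query (AND)."""
        terms = tokenize(query)
        if not terms:
            return set()
        result = None
        for term in terms:
            docs = self.postings.get(term, set())
            result = set(docs) if result is None else result & docs
        return result


index = InvertedIndex()
index.add(1, "The quick brown fox")
index.add(2, "The lazy dog")
index.add(3, "Quick brown dogs")
```

Searching for "dog" finds only document 2, because without stemming "dogs" is a different term. Closing that gap is what analyzers in a real search engine are for.
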
API Gateway
| Level | What to know |
| --- | --- |
| Can mention it | Single entry point for external API traffic. Handles cross-cutting concerns so individual services do not have to implement them independently. |
| Can design with it | What an API gateway handles: authentication/authorization, rate limiting, request routing, SSL termination, request/response transformation, logging. The difference from a load balancer: a load balancer distributes traffic based on network-level criteria; an API gateway understands application semantics (routes /users to the user service, /orders to the order service, validates JWTs). Examples: AWS API Gateway, Kong, nginx as an API gateway. |
| Can go deep on it | BFF (backend for frontend) pattern: a separate API gateway per client type (mobile, web, third-party) that aggregates and transforms data for that client's specific needs, rather than forcing clients to make many microservice calls. Service mesh vs. API gateway: a service mesh (Istio, Linkerd) handles service-to-service communication inside the cluster (mTLS, retries, circuit breaking); an API gateway handles external-to-internal traffic. Circuit breaker pattern: if a downstream service is failing, the gateway stops sending requests to it for a period rather than cascading failures. GraphQL federation as an alternative gateway pattern for complex service graphs. |
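
The circuit breaker pattern is a small state machine: closed, open, and a trial state after a cooldown. The sketch below is one simple variant under stated assumptions: the class name, the threshold of consecutive failures, and the injectable `clock` are all hypothetical choices, and production libraries add features such as sliding failure windows and limits on concurrent trial requests.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying the downstream."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; failing fast")
            # Cooldown elapsed: fall through and allow one trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # A failed trial call, or too many consecutive failures, opens the circuit.
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

While the circuit is open the downstream service receives no traffic at all, which gives it room to recover and keeps callers from tying up threads on requests that are likely to time out.
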