Scalability
Scalability is a system's ability to handle increased load. Vertical scaling (scale up) means moving to a bigger machine; horizontal scaling (scale out) means adding more machines. Key strategies: stateless services, database replication (read replicas), sharding (partitioning data), caching, and asynchronous processing. Prefer designs that scale horizontally: a single machine has a hard hardware ceiling, while adding machines has no practical one.
Key Concepts
Deep Dive: Vertical vs Horizontal Scaling
| | Vertical (Scale Up) | Horizontal (Scale Out) |
|---|---|---|
| How | Bigger machine (CPU, RAM) | More machines |
| Limit | Hardware ceiling | Virtually unlimited |
| Downtime | Usually required for the upgrade | Typically none (nodes are added behind a load balancer) |
| Cost | Expensive at scale | Commodity hardware |
| Complexity | Simple | More complex (distributed) |
Deep Dive: Database Replication
Primary-Replica (Master-Slave): all writes go to one primary, which replicates its changes to one or more read-only replicas.
Benefits:

- Read traffic distributed across replicas
- High availability (promote replica if primary fails)
- Backups from replica without impacting primary
Challenges:

- Replication lag: replicas may be slightly behind the primary
- Write bottleneck: all writes still go to one primary
- Consistency: a read issued right after a write may return stale data from a lagging replica
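The read/write split and the read-after-write problem can be sketched as a small router. This is a minimal illustration, not a production pattern: `ReplicatedRouter`, `pin_seconds`, and the string connection names are hypothetical, and the fixed pin window assumes replication lag stays below a known bound.

```python
import itertools
import time


class ReplicatedRouter:
    """Sends writes to the primary and reads to replicas (hypothetical helper)."""

    def __init__(self, primary, replicas, pin_seconds=1.0, clock=time.monotonic):
        self.primary = primary
        self._replicas = itertools.cycle(replicas)  # round-robin over replicas
        self.pin_seconds = pin_seconds              # assumed upper bound on replication lag
        self._clock = clock
        self._last_write = {}                       # user_id -> time of that user's last write

    def for_write(self, user_id):
        # All writes go to the single primary; remember when this user wrote.
        self._last_write[user_id] = self._clock()
        return self.primary

    def for_read(self, user_id):
        # Shortly after a write, read from the primary, because a replica
        # may not have applied the change yet.
        last = self._last_write.get(user_id)
        if last is not None and self._clock() - last < self.pin_seconds:
            return self.primary
        return next(self._replicas)
```

Pinning a user's reads to the primary for a short window trades a little read capacity for read-your-writes behavior; other users keep reading from replicas.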
Deep Dive: Database Sharding
Split data across multiple databases. Each shard holds a subset.
Sharding strategies:

| Strategy | How | Pros | Cons |
|----------|-----|------|------|
| Range-based | By key range (A-M, N-Z) | Simple | Uneven distribution |
| Hash-based | hash(key) % N | Even distribution | Resharding is hard |
| Directory-based | Lookup table maps key → shard | Flexible | Lookup table = bottleneck |
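The three strategies can be compared side by side as shard-selection functions. This is a sketch under assumptions: the shard count, the `"N"` boundary, and the `DIRECTORY` entries are made up for illustration. It uses `hashlib` rather than Python's built-in `hash()`, because string hashes from `hash()` are randomized per process by default and would route the same key differently on different servers.

```python
import bisect
import hashlib

NUM_SHARDS = 4  # assumed shard count for the hash-based example


def range_shard(key: str) -> int:
    """Range-based: keys starting with A-M go to shard 0, N-Z to shard 1."""
    boundaries = ["N"]  # first letter handled by each shard after shard 0
    return bisect.bisect_right(boundaries, key[:1].upper())


def hash_shard(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Hash-based: hash(key) % N, using a hash that is stable across processes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Directory-based: an explicit lookup table (hypothetical entries).
DIRECTORY = {"tenant-a": 2, "tenant-b": 0}


def directory_shard(key: str) -> int:
    """Directory-based: every routing decision consults the lookup table."""
    return DIRECTORY[key]
```

Changing `NUM_SHARDS` changes the result of `hash_shard` for most keys, which is why resharding is hard with plain modulo hashing.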
Challenges:

- Cross-shard queries (JOINs across shards)
- Resharding when adding/removing shards
- Hot spots (popular shards get more traffic)
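Consistent hashing minimizes data movement when adding or removing shards: on average only about 1/N of the keys move, whereas changing N under hash(key) % N remaps most keys.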
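A hash ring with virtual nodes is one common way to implement consistent hashing. The sketch below is illustrative only: `HashRing`, the `vnodes` default of 100, and the `shard#i` naming of virtual nodes are assumptions, not a reference implementation.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, shards, vnodes=100):
        self.vnodes = vnodes  # virtual nodes per shard, to smooth the distribution
        self._points = []     # sorted ring positions
        self._owner = {}      # ring position -> shard name
        for shard in shards:
            self.add(shard)

    def add(self, shard):
        for i in range(self.vnodes):
            point = _hash(f"{shard}#{i}")
            bisect.insort(self._points, point)
            self._owner[point] = shard

    def remove(self, shard):
        self._points = [p for p in self._points if self._owner[p] != shard]
        self._owner = {p: s for p, s in self._owner.items() if s != shard}

    def lookup(self, key):
        # The owner is the first ring position clockwise from the key's hash,
        # wrapping around to the start of the ring at the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]
```

When a shard is added to this ring, the only keys that change owner are the ones the new shard takes over; keys never move between the existing shards. Removing a shard likewise only reassigns the keys it held.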
Deep Dive: Stateless vs Stateful Services
Stateless — no session data stored on the server. Any server can handle any request.
Stateful — session tied to a specific server.
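Stateless services are essential for horizontal scaling: a load balancer can route any request to any instance, and instances can be added or removed without losing sessions.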
Where to store state:

- Database (persistent)
- Redis/Memcached (fast, shared)
- JWT tokens (client-side)
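To illustrate moving state out of the server, the sketch below keeps sessions in a shared store so that any instance can serve any request. `SessionStore` is an in-memory stand-in for a shared store such as Redis, and `AppServer` and its methods are hypothetical names chosen for this example.

```python
import secrets


class SessionStore:
    """In-memory stand-in for a shared store such as Redis (assumption)."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        return self._data.get(session_id)

    def set(self, session_id, value):
        self._data[session_id] = value


class AppServer:
    """A stateless server: it keeps no per-user data on the instance itself."""

    def __init__(self, name, store):
        self.name = name
        self.store = store  # shared by every instance

    def login(self, username):
        session_id = secrets.token_hex(16)
        self.store.set(session_id, {"user": username, "cart": []})
        return session_id

    def add_to_cart(self, session_id, item):
        session = self.store.get(session_id)
        if session is None:
            raise KeyError("unknown session")
        session["cart"].append(item)
        self.store.set(session_id, session)
        return f"{self.name}: cart={session['cart']}"


store = SessionStore()
server_a = AppServer("server-a", store)
server_b = AppServer("server-b", store)

session_id = server_a.login("alice")             # request 1 lands on server-a
reply = server_b.add_to_cart(session_id, "book")  # request 2 lands on server-b
```

Because the session lives in the shared store, the second request succeeds on a different instance than the one that handled the login.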
Deep Dive: Rate Limiting
Protect services from being overwhelmed.
Algorithms:

| Algorithm | Description |
|-----------|-------------|
| Token Bucket | Tokens added at fixed rate, request consumes a token |
| Sliding Window Log | Track timestamps of requests in a window |
| Sliding Window Counter | Approximate count in sliding window |
| Fixed Window Counter | Count requests in fixed time windows |
Implementation:
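One possible implementation is a token bucket, sketched below in Python. The class name, the `rate` and `capacity` parameters, and the injectable `clock` are assumptions made for illustration. The sketch is single-process and not thread-safe; a real deployment would typically keep the counters in a shared store so that all instances enforce the same limit.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity` and a sustained `rate` of requests per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate                 # tokens added per second
        self.capacity = capacity         # maximum number of stored tokens
        self._clock = clock
        self._tokens = float(capacity)   # start with a full bucket
        self._updated = clock()

    def allow(self, cost=1.0):
        now = self._clock()
        # Refill lazily based on elapsed time, capped at the bucket capacity.
        elapsed = now - self._updated
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._updated = now
        if self._tokens >= cost:
            self._tokens -= cost         # the request consumes a token
            return True
        return False                     # bucket empty: reject or delay the request
```

A caller would create one bucket per client, for example `TokenBucket(rate=5, capacity=10)`, and respond with HTTP 429 whenever `allow()` returns `False`.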
Common Interview Questions
- What is the difference between vertical and horizontal scaling?
- What is database sharding? What are the strategies?
- What is consistent hashing?
- What is database replication? What is replication lag?
- Why are stateless services important for scaling?
- What is rate limiting? Name some algorithms.
- How would you scale a system from 1K to 1M users?