Case Study: Rate Limiter
A rate limiter controls how many requests a user or client can make within a time window. It protects against abuse and DDoS attacks and ensures fair usage. Common algorithms: Token Bucket (most popular; smooths traffic), Sliding Window (precise), Fixed Window (simple but bursty at window edges). Rate limiting is typically implemented at the API Gateway layer, using Redis for distributed counting.
System Design
Step 1: Requirements
Functional:
- Limit requests per user/IP/API key
- Configurable rules (e.g., 100 req/min, 1000 req/hour)
- Return 429 (Too Many Requests) when the limit is exceeded
- Include rate limit headers in the response
Non-Functional:
- Low latency (must not slow down requests)
- Distributed (works across multiple servers)
- Fault-tolerant (if the limiter fails, fail open and allow traffic; see the sketch below)
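To make the fault-tolerance requirement concrete, a minimal fail-open wrapper could look like the sketch below; it assumes the Redis-backed isAllowed(...) method shown later in Step 3 and simply lets traffic through when the limiter itself fails.

// Fail open: if the limiter backend (e.g. Redis) is unreachable, allow the request
// rather than turning the rate limiter into a single point of failure.
public boolean isAllowedFailOpen(String userId) {
    try {
        return isAllowed(userId); // Redis-backed check from Step 3
    } catch (Exception e) {
        return true; // a limiter outage should not block user traffic
    }
}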
Step 2: Algorithms
Token Bucket:
- Bucket with max capacity = 10 tokens, refilled at 1 token/second
- Request arrives: tokens > 0? → allow and decrement; tokens = 0? → reject (429), as sketched below
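A minimal single-node sketch of this token bucket, using the capacity and refill rate from the numbers above; the class and field names are illustrative, and Step 3 covers the distributed version.

// In-memory token bucket for a single node: capacity 10, refill 1 token/second.
public class TokenBucket {
    private final double capacity = 10;
    private final double refillPerSecond = 1;
    private double tokens = capacity;
    private long lastRefillNanos = System.nanoTime();

    public synchronized boolean isAllowed() {
        long now = System.nanoTime();
        double elapsedSeconds = (now - lastRefillNanos) / 1_000_000_000.0;
        tokens = Math.min(capacity, tokens + elapsedSeconds * refillPerSecond); // refill for elapsed time
        lastRefillNanos = now;
        if (tokens >= 1) {
            tokens -= 1;  // spend one token for this request
            return true;  // allow
        }
        return false;     // reject with 429
    }
}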
Sliding Window Log:
- Store the timestamp of each request in a sorted set
- On each request: remove entries older than the window
- Count the remaining entries: < limit → allow, else reject (sketch below)
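A single-node sketch of the sliding window log using an in-memory deque of timestamps; in a distributed setup the same idea maps onto a Redis sorted set per user. The limit and window size below are illustrative.

import java.util.ArrayDeque;
import java.util.Deque;

// Sliding window log: keep one timestamp per request, drop entries older than the window.
public class SlidingWindowLog {
    private final long windowMillis = 60_000; // 1-minute window
    private final int limit = 100;            // max requests per window
    private final Deque<Long> timestamps = new ArrayDeque<>();

    public synchronized boolean isAllowed() {
        long now = System.currentTimeMillis();
        // Evict timestamps that have fallen out of the window.
        while (!timestamps.isEmpty() && timestamps.peekFirst() <= now - windowMillis) {
            timestamps.pollFirst();
        }
        if (timestamps.size() < limit) {
            timestamps.addLast(now); // record this request
            return true;
        }
        return false; // over the limit → 429
    }
}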
Fixed Window Counter:
- Window: 1 minute (e.g., 12:00-12:01), with one counter per window
- Increment the counter on each request; counter >= limit → reject
- Simple, but bursty at window edges (see the sketch below)
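A single-node sketch of the fixed window counter; Step 3 below shows the same algorithm backed by Redis for distributed counting. Constants are illustrative.

// Fixed window counter: one counter per clock-aligned window (e.g., 12:00-12:01).
public class FixedWindowCounter {
    private final int limit = 100;            // max requests per window
    private final long windowMillis = 60_000; // 1-minute window
    private long currentWindow = -1;
    private long count = 0;

    public synchronized boolean isAllowed() {
        long window = System.currentTimeMillis() / windowMillis;
        if (window != currentWindow) { // a new window started: reset the counter
            currentWindow = window;
            count = 0;
        }
        count++;
        return count <= limit;
    }
}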
Sliding Window Counter (hybrid):
- Keeps one counter per fixed window, but estimates the sliding count as: previous window count × fraction of the previous window still inside the sliding window + current window count
- Example: limit 100/min, 30 s into the current window, previous = 80, current = 40 → 80 × 0.5 + 40 = 80 < 100 → allow
- Nearly as accurate as the log while using O(1) memory per client (sketch below)
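A single-node sketch of the hybrid counter, assuming the weighting scheme described above; constants are illustrative.

// Sliding window counter: weight the previous window's count by how much of it still
// overlaps the sliding window, then add the current window's count.
public class SlidingWindowCounter {
    private final int limit = 100;
    private final long windowMillis = 60_000;
    private long currentWindow = -1;
    private long currentCount = 0;
    private long previousCount = 0;

    public synchronized boolean isAllowed() {
        long now = System.currentTimeMillis();
        long window = now - (now % windowMillis); // start of the current fixed window
        if (window != currentWindow) {
            // Roll over: current becomes previous; older windows no longer overlap.
            previousCount = (window - currentWindow == windowMillis) ? currentCount : 0;
            currentWindow = window;
            currentCount = 0;
        }
        // Fraction of the previous window still inside the trailing sliding window.
        double overlap = 1.0 - (double) (now - currentWindow) / windowMillis;
        double estimated = previousCount * overlap + currentCount;
        if (estimated < limit) {
            currentCount++;
            return true;
        }
        return false; // 429
    }
}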
Step 3: Distributed Implementation (Redis)
// Fixed window counter in Redis: one key per user, expiring with the window.
// Assumes a Redis client (e.g. Jedis) bound to `redis` and a configured LIMIT.
public boolean isAllowed(String userId) {
    String key = "rate_limit:" + userId;
    long count = redis.incr(key);   // atomic increment, shared by all app servers
    if (count == 1) {
        redis.expire(key, 60);      // first request starts a 60-second window
    }
    return count <= LIMIT;
}
Token bucket in Redis (Lua script for atomicity):
-- KEYS[1] = token count, KEYS[2] = last refill timestamp
-- ARGV[1] = capacity, ARGV[2] = refill rate (tokens/sec), ARGV[3] = current time (sec)
local capacity = tonumber(ARGV[1])
local rate = tonumber(ARGV[2])
local now = tonumber(ARGV[3])
local tokens = tonumber(redis.call("get", KEYS[1]) or capacity)
local last_refill = tonumber(redis.call("get", KEYS[2]) or now)
local elapsed = now - last_refill
tokens = math.min(capacity, tokens + elapsed * rate) -- refill for the time elapsed
if tokens >= 1 then
  tokens = tokens - 1
  redis.call("set", KEYS[1], tokens)
  redis.call("set", KEYS[2], now)
  return 1 -- allowed
end
return 0 -- rejected
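One way to invoke the script atomically from Java, assuming the Jedis client; the key names, capacity, and refill rate below are illustrative and follow the KEYS/ARGV order documented at the top of the script.

import java.util.List;
import redis.clients.jedis.Jedis;

// Runs the token bucket script in one round trip; Redis executes Lua scripts atomically,
// so the read-modify-write cannot interleave with other clients.
public class RedisTokenBucket {
    private static final String SCRIPT = "..."; // the Lua script above, loaded as a string

    public boolean isAllowed(Jedis jedis, String userId) {
        List<String> keys = List.of(
                "rate_limit:{" + userId + "}:tokens",
                "rate_limit:{" + userId + "}:last_refill"); // hash tag keeps both keys in one cluster slot
        List<String> args = List.of(
                "10",                                               // ARGV[1]: capacity
                "1",                                                // ARGV[2]: refill rate (tokens/sec)
                String.valueOf(System.currentTimeMillis() / 1000)); // ARGV[3]: now, in seconds
        Object result = jedis.eval(SCRIPT, keys, args);
        return Long.valueOf(1L).equals(result);
    }
}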
Step 4: Response Headers
Return the limiter's state with every response so clients can back off before hitting 429:
- X-RateLimit-Limit: maximum requests allowed in the window (e.g., 100)
- X-RateLimit-Remaining: requests left in the current window
- X-RateLimit-Reset: when the current window resets (Unix timestamp or seconds remaining)
- Retry-After: sent with 429 responses to tell the client how long to wait
Step 5: Where to Place
- API Gateway: Most common, centralized
- Middleware: Per-service rate limiting
- Client-side: Unreliable (can be bypassed)
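A middleware-style sketch tying Steps 4 and 5 together. It assumes the jakarta.servlet API and reuses the FixedWindowCounter sketch from Step 2; a real deployment would key the limiter by user, IP, or API key and also report X-RateLimit-Remaining from the limiter's state.

import java.io.IOException;
import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpFilter;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;

// Middleware sketch: consult the limiter, attach rate limit headers, return 429 when over the limit.
// Uses one global counter for brevity; in practice, keep one limiter per user/IP/API key.
public class RateLimitFilter extends HttpFilter {
    private static final int LIMIT = 100;
    private static final long WINDOW_MILLIS = 60_000;
    private final FixedWindowCounter counter = new FixedWindowCounter(); // sketch from Step 2

    @Override
    protected void doFilter(HttpServletRequest req, HttpServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        boolean allowed = counter.isAllowed();
        long resetSeconds = (WINDOW_MILLIS - System.currentTimeMillis() % WINDOW_MILLIS) / 1000;

        res.setHeader("X-RateLimit-Limit", String.valueOf(LIMIT));
        res.setHeader("X-RateLimit-Reset", String.valueOf(resetSeconds)); // seconds until the window resets

        if (!allowed) {
            res.setHeader("Retry-After", String.valueOf(resetSeconds));
            res.setStatus(429); // Too Many Requests
            return;
        }
        chain.doFilter(req, res);
    }
}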
Common Follow-up Questions
- How do you handle rate limiting in a distributed system?
- What happens when the rate limiter (Redis) goes down?
- How do you implement different limits for different API tiers?
- How would you rate-limit by IP vs by user?
- What is the difference between rate limiting and throttling?