NoSQL Overview
NoSQL databases handle unstructured/semi-structured data and scale horizontally. Four types: Key-Value (Redis — caching), Document (MongoDB — flexible schemas), Column-Family (Cassandra — time-series, high write throughput), Graph (Neo4j — relationships). Choose based on data model, query patterns, and consistency vs availability trade-offs (CAP theorem).
Key Concepts
NoSQL Types
| Type | Database | Data Model | Best For |
|---|---|---|---|
| Key-Value | Redis, DynamoDB | key → value | Caching, sessions, counters |
| Document | MongoDB, CouchDB | key → JSON document | Flexible schemas, CMS |
| Column-Family | Cassandra, HBase | row → column families | Time-series, IoT, logs |
| Graph | Neo4j, ArangoDB | nodes + edges | Social networks, recommendations |
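
To make the "Data Model" column concrete, the sketch below represents small, made-up records in each of the four shapes using plain Python structures. It is only an illustration of how the data is organized; none of it talks to a real database, and the keys, names, and helper functions are hypothetical.

```python
# Illustrative only: plain Python structures standing in for each data model.

# Key-Value: an opaque value looked up by a single key.
key_value = {"user:1:name": "John", "user:1:visits": 42}

# Document: a nested, self-contained record addressed by its id.
documents = {
    "user:1": {"name": "John", "orders": [{"item": "Laptop", "total": 1200}]}
}

# Column-Family: row key -> column family -> columns; rows may be sparse.
column_family = {
    "sensor:7": {
        "readings": {"2024-01-01T00:00": 21.5, "2024-01-01T00:01": 21.7},
        "meta": {"location": "lab"},
    }
}

# Graph: nodes plus labeled edges between them.
nodes = {"alice", "bob", "carol"}
edges = [("alice", "FOLLOWS", "bob"), ("bob", "FOLLOWS", "carol")]


def followed_by(user):
    """Return the users that `user` follows (a one-hop traversal)."""
    return [dst for src, rel, dst in edges if src == user and rel == "FOLLOWS"]


def friends_of_friends(user):
    """Two-hop traversal: the kind of query graph stores are built for."""
    return sorted({fof for f in followed_by(user) for fof in followed_by(f)})
```

The graph helpers hint at why relationship-heavy queries suit a graph store: each extra hop is another traversal step rather than another JOIN.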
Deep Dive: CAP Theorem
A distributed system can guarantee at most 2 out of 3:
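- Consistency: every read sees the most recent successful write, so all nodes appear to hold the same data at the same time
- Availability: every request to a non-failing node gets a non-error response, even if that response does not reflect the latest write
- Partition Tolerance: the system keeps operating when network failures drop or delay messages between nodes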
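In practice, network partitions are inevitable, so the real choice is how the system behaves while a partition lasts:
- CP (Consistency + Partition Tolerance): MongoDB, HBase, Redis Cluster
- AP (Availability + Partition Tolerance): Cassandra, DynamoDB, CouchDB
- CA: not achievable in a distributed system, because giving up partition tolerance effectively means running a single node
These labels describe typical default behavior rather than fixed properties. Cassandra has tunable consistency, DynamoDB offers strongly consistent reads, and Redis Cluster replicates asynchronously, so it can lose acknowledged writes during a failover.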
Real-world choice:
- Banking: CP (consistency > availability)
- Social media feed: AP (availability > consistency)
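
The CP versus AP trade-off can be sketched with a toy two-replica cluster. This is a deliberately simplified model, not how any of the databases above is implemented: the `Cluster` and `Replica` classes and the `partitioned` flag are hypothetical names, and replication is assumed to be synchronous whenever the network is healthy.

```python
class Replica:
    """One node holding its own copy of the data."""

    def __init__(self):
        self.data = {}


class Cluster:
    """Two replicas that replicate synchronously unless partitioned."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a, self.b = Replica(), Replica()
        self.partitioned = False  # flip to True to simulate a network split

    def write(self, replica, key, value):
        other = self.b if replica is self.a else self.a
        if self.partitioned:
            if self.mode == "CP":
                # Refuse the write rather than let the replicas diverge.
                raise RuntimeError("unavailable: cannot reach other replica")
            # AP: accept the write locally; the replicas may now disagree.
            replica.data[key] = value
            return
        # Healthy network: both copies are updated together.
        replica.data[key] = value
        other.data[key] = value

    def read(self, replica, key):
        return replica.data.get(key)
```

With `mode="CP"` a write during a partition raises an error (the banking choice), while with `mode="AP"` it succeeds but a read from the other replica returns stale data (the social feed choice) until the replicas are reconciled.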
Deep Dive: SQL vs NoSQL
| Feature | SQL | NoSQL |
|---|---|---|
| Schema | Fixed, predefined | Flexible, schema-less |
| Scaling | Vertical (bigger server) | Horizontal (more servers) |
| Relationships | JOINs, foreign keys | Embedded docs / denormalized |
| Transactions | Full ACID | Often eventual consistency; ACID support varies (MongoDB 4.0+ supports multi-document transactions) |
| Query Language | SQL (standardized) | Database-specific |
| Best for | Complex queries, ACID needs | Large scale, flexible data |
When to use SQL:
- Complex relationships and JOINs
- ACID transactions required
- Data structure is well-defined and stable
When to use NoSQL:
- Flexible or evolving schema
- High write throughput
- Horizontal scaling needed
- Simple query patterns
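
The "Relationships" row of the table can be illustrated by answering one question, a user's total order value, in both styles. The sketch uses Python's built-in `sqlite3` module for the relational side and a plain dict as a stand-in for a document; the table names, column names, and sample data are assumptions made for the example.

```python
import sqlite3

# Relational side: two normalized tables linked by a foreign key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (id INTEGER PRIMARY KEY,
                         user_id INTEGER NOT NULL REFERENCES users(id),
                         item TEXT NOT NULL,
                         total INTEGER NOT NULL);
    INSERT INTO users  VALUES (1, 'John');
    INSERT INTO orders VALUES (1, 1, 'Laptop', 1200), (2, 1, 'Mouse', 25);
""")

# The total is computed by joining the two tables at query time.
sql_total = conn.execute(
    """SELECT SUM(o.total)
       FROM users u
       JOIN orders o ON o.user_id = u.id
       WHERE u.name = ?""",
    ("John",),
).fetchone()[0]

# Document side: orders are embedded, so one lookup returns everything.
user_doc = {
    "_id": 1,
    "name": "John",
    "orders": [
        {"item": "Laptop", "total": 1200},
        {"item": "Mouse", "total": 25},
    ],
}
doc_total = sum(order["total"] for order in user_doc["orders"])
```

Both approaches give the same answer. The difference is where the work happens: the relational version pays for the JOIN at read time, while the embedded version pays at write time by keeping related data together.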
Deep Dive: MongoDB (Document Store)
// Document (JSON-like)
{
  "_id": ObjectId("..."),
  "name": "John",
  "email": "john@test.com",
  "orders": [
    { "item": "Laptop", "total": 1200 },
    { "item": "Mouse", "total": 25 }
  ]
}
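Key characteristics:
- Schema-flexible: different documents in the same collection can have different fields
- Embedded documents: related data is usually denormalized into one document, so most reads need no JOIN (the `$lookup` aggregation stage covers the cases that do)
- Horizontal scaling via sharding
- Rich query language, including secondary indexes and an aggregation pipeline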
When to use: Content management, user profiles, product catalogs.
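
The idea behind sharding and schema flexibility can be sketched in a few lines of plain Python. This is a rough illustration rather than MongoDB's actual mechanism: MongoDB's hashed sharding assigns ranges of hash values (chunks) to shards instead of taking a modulo, and `NUM_SHARDS`, `shard_for`, `insert`, and `find_by_email` are hypothetical names chosen for the example.

```python
import hashlib

NUM_SHARDS = 3  # hypothetical cluster size


def shard_for(shard_key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a shard key to a shard index using a stable hash."""
    digest = hashlib.md5(shard_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


# Each shard is modeled as its own dict of documents keyed by email.
shards = [dict() for _ in range(NUM_SHARDS)]


def insert(doc):
    """Store the document on the shard that owns its shard key (email)."""
    shards[shard_for(doc["email"])][doc["email"]] = doc


def find_by_email(email):
    """Look up a document; only the owning shard is consulted."""
    return shards[shard_for(email)].get(email)


insert({"name": "John", "email": "john@test.com", "orders": []})
# Documents in the same collection need not share the same fields.
insert({"name": "Ana", "email": "ana@test.com", "phone": "555-0100"})
```

A query that includes the shard key touches a single shard, whereas a query on any other field would have to ask every shard. That is why shard key choice tends to follow the dominant query pattern.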
Deep Dive: Redis (Key-Value Store)
SET user:1:name "John" → Simple string
HSET user:1 name "John" age 25 → Hash
LPUSH queue:orders "order123" → List (queue)
SADD tags:post:1 "java" "spring" → Set
ZADD leaderboard 100 "player1" → Sorted set (priority queue)
Use cases:
- Caching — store frequently accessed data in-memory
- Session store — fast session lookups
- Rate limiting — count requests per time window
- Pub/sub — real-time messaging
- Leaderboards — sorted sets
- Distributed locks — SET key value NX EX 30
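
Two of the use cases above, rate limiting and distributed locks, can be sketched without a running server. The `FakeRedis` class below is a hypothetical in-memory stand-in that mimics only the `SET ... NX EX`, `INCR`, and `EXPIRE` behavior needed here, with an injectable clock so expiry can be simulated. Its method names and return values follow the redis-py client (`set(key, value, nx=True, ex=30)`, `incr`, `expire`), so the two helper functions should carry over to a real client, though that is an assumption worth checking against your client version.

```python
class FakeRedis:
    """In-memory stand-in for the few Redis commands used below."""

    def __init__(self, clock):
        self._clock = clock  # callable returning the current time in seconds
        self._data = {}      # key -> (value, expires_at or None)

    def _live(self, key):
        """Return the entry for key, dropping it first if it has expired."""
        entry = self._data.get(key)
        if entry and entry[1] is not None and entry[1] <= self._clock():
            del self._data[key]
            return None
        return entry

    def set(self, key, value, nx=False, ex=None):
        if nx and self._live(key) is not None:
            return None  # key exists: NX turns SET into a no-op
        expires = self._clock() + ex if ex is not None else None
        self._data[key] = (value, expires)
        return True

    def incr(self, key):
        entry = self._live(key)
        value = (int(entry[0]) if entry else 0) + 1
        self._data[key] = (value, entry[1] if entry else None)  # keep the TTL
        return value

    def expire(self, key, seconds):
        entry = self._live(key)
        if entry is None:
            return False
        self._data[key] = (entry[0], self._clock() + seconds)
        return True


def allow_request(r, client_id, limit, window_seconds):
    """Fixed-window rate limit: INCR a counter, set its TTL on the first hit."""
    key = f"rate:{client_id}"
    count = r.incr(key)
    if count == 1:
        r.expire(key, window_seconds)
    return count <= limit


def acquire_lock(r, name, owner, ttl_seconds=30):
    """Mirror of `SET key value NX EX 30`: succeeds only if the key is absent."""
    return r.set(f"lock:{name}", owner, nx=True, ex=ttl_seconds) is True
```

The TTL on the lock is what prevents a crashed owner from holding it forever. Note that in `allow_request` the `INCR` and `EXPIRE` calls are separate steps, so against a real server a client that fails between them would leave a counter with no expiry; a production version would typically make the pair atomic.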
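Spring integration: Spring Data Redis exposes these commands through `RedisTemplate` and `StringRedisTemplate`, and Spring's cache abstraction (`@Cacheable`, `@CacheEvict`) can use Redis as its backing store.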
Common Interview Questions
- What is NoSQL? How is it different from SQL?
- Explain the CAP theorem.
- What are the four types of NoSQL databases?
- When would you choose MongoDB over PostgreSQL?
- What is Redis? What are its common use cases?
- What is eventual consistency?
- How does MongoDB handle relationships?
- What is sharding?
- When would you use a graph database?
- Can you use both SQL and NoSQL in the same application?