NoSQL Overview

NoSQL databases store unstructured or semi-structured data and are typically designed to scale horizontally. The four main types are key-value (e.g. Redis, for caching), document (e.g. MongoDB, for flexible schemas), column-family (e.g. Cassandra, for time-series data and high write throughput), and graph (e.g. Neo4j, for relationship-heavy data). Choose based on the data model, the query patterns, and the consistency versus availability trade-off described by the CAP theorem.

Key Concepts

NoSQL Types

Type	Database	Data Model	Best For
Key-Value	Redis, DynamoDB	key → value	Caching, sessions, counters
Document	MongoDB, CouchDB	key → JSON document	Flexible schemas, CMS
Column-Family	Cassandra, HBase	row → column families	Time-series, IoT, logs
Graph	Neo4j, ArangoDB	nodes + edges	Social networks, recommendations
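
To make the four data models in the table more concrete, the sketch below mimics each one with plain Java collections. It is only an in-memory analogy, no database client is involved, and the class name `DataModelShapes` plus every key and value in it are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory analogies for the four NoSQL data models (illustrative only).
class DataModelShapes {

    // Key-value: an opaque value looked up by a single key.
    static Map<String, String> keyValue() {
        Map<String, String> store = new HashMap<>();
        store.put("session:42", "user=john;ttl=1800");
        return store;
    }

    // Document: a key maps to a nested, JSON-like structure.
    static Map<String, Map<String, Object>> document() {
        Map<String, Object> user = new HashMap<>();
        user.put("name", "John");
        user.put("tags", List.of("admin", "beta"));
        Map<String, Map<String, Object>> collection = new HashMap<>();
        collection.put("user:1", user);
        return collection;
    }

    // Column-family: a row key maps to columns kept in sorted order,
    // which suits time-ordered data such as sensor readings.
    static Map<String, TreeMap<String, Double>> columnFamily() {
        TreeMap<String, Double> readings = new TreeMap<>();
        readings.put("2024-01-01T10:05", 21.9);
        readings.put("2024-01-01T10:00", 21.5);
        Map<String, TreeMap<String, Double>> table = new HashMap<>();
        table.put("sensor:7", readings);
        return table;
    }

    // Graph: nodes plus edges, stored here as an adjacency list.
    static Map<String, List<String>> graph() {
        Map<String, List<String>> follows = new HashMap<>();
        follows.computeIfAbsent("alice", k -> new ArrayList<>()).add("bob");
        follows.computeIfAbsent("bob", k -> new ArrayList<>()).add("carol");
        return follows;
    }

    public static void main(String[] args) {
        System.out.println(keyValue().get("session:42"));
        System.out.println(document().get("user:1").get("name"));
        System.out.println(columnFamily().get("sensor:7").firstKey());
        System.out.println(graph().get("alice"));
    }
}
```

The nesting is the point: a key-value store sees only the outer key, a document store can query inside the value, a column-family store keeps columns ordered within a row, and a graph store treats the edges themselves as first-class data.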

Deep Dive: CAP Theorem

A distributed system can guarantee at most 2 out of 3:

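Consistency: every read sees the most recent successful write, so all nodes appear to hold the same data
Availability: every request to a non-failed node receives a non-error response
Partition Tolerance: the system keeps operating when network failures drop or delay messages between nodes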

      Consistency
       /       \
    CP           CA
   /               \
Partition ──AP── Availability
Tolerance

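In practice, network partitions are inevitable, so the real choice is what to give up while a partition lasts:

- CP (Consistency + Partition Tolerance): MongoDB, HBase, Redis Cluster
- AP (Availability + Partition Tolerance): Cassandra, DynamoDB, CouchDB
- CA is not an option for a distributed system: giving up partition tolerance effectively means running a single node

These labels describe typical default behavior rather than fixed properties. Cassandra and DynamoDB, for example, let you request stronger consistency per operation, and Redis Cluster replicates asynchronously, so it can lose acknowledged writes during a failover.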

Real-world choice:

- Banking: CP (consistency > availability)
- Social media feed: AP (availability > consistency)
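
The CP versus AP choice can be illustrated with a deliberately tiny model. The sketch below assumes a two-node cluster where clients write through node A and read through node B. The class `CapToyCluster` and its methods are hypothetical, and real systems use quorums and leader election rather than a boolean flag.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the CP vs AP choice during a network partition (not a real protocol).
class CapToyCluster {
    enum Mode { CP, AP }

    private final Mode mode;
    private final Map<String, String> nodeA = new HashMap<>();
    private final Map<String, String> nodeB = new HashMap<>();
    private boolean partitioned = false; // true = A and B cannot reach each other

    CapToyCluster(Mode mode) {
        this.mode = mode;
    }

    void setPartitioned(boolean partitioned) {
        this.partitioned = partitioned;
    }

    // A client writes through node A. Returns false if the write is refused.
    boolean writeViaA(String key, String value) {
        if (!partitioned) {
            nodeA.put(key, value);
            nodeB.put(key, value); // replicated while the link is up
            return true;
        }
        if (mode == Mode.CP) {
            return false;          // CP: refuse rather than let replicas diverge
        }
        nodeA.put(key, value);     // AP: accept locally; B is now stale
        return true;
    }

    // A client reads through node B.
    String readViaB(String key) {
        return nodeB.get(key);
    }

    public static void main(String[] args) {
        CapToyCluster cp = new CapToyCluster(Mode.CP);
        cp.writeViaA("balance", "100");
        cp.setPartitioned(true);
        System.out.println(cp.writeViaA("balance", "50")); // false: write refused
        System.out.println(cp.readViaB("balance"));        // 100: replicas still agree

        CapToyCluster ap = new CapToyCluster(Mode.AP);
        ap.writeViaA("status", "old");
        ap.setPartitioned(true);
        System.out.println(ap.writeViaA("status", "new")); // true: write accepted
        System.out.println(ap.readViaB("status"));         // old: B serves stale data
    }
}
```

The CP cluster stays correct but stops taking writes, which is the banking trade-off. The AP cluster keeps answering but can return stale data until the partition heals, which is usually acceptable for a social feed.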

Deep Dive: SQL vs NoSQL

Feature	SQL	NoSQL
Schema	Fixed, predefined	Flexible, schema-less
Scaling	Primarily vertical (bigger server)	Horizontal (more servers)
Relationships	JOINs, foreign keys	Embedded docs / denormalized
Transactions	Full ACID	Often eventual consistency; ACID support varies by database
Query Language	SQL (standardized)	Database-specific
Best for	Complex queries, ACID needs	Large scale, flexible data

When to use SQL:

- Complex relationships and JOINs
- ACID transactions required
- Data structure is well-defined and stable

When to use NoSQL:

- Flexible or evolving schema
- High write throughput
- Horizontal scaling needed
- Simple query patterns
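
The "Relationships" row of the comparison is easier to see side by side. The following sketch models the same data twice in plain Java: once as rows linked by a foreign key and once as a single document with embedded orders. The class names are hypothetical and the `Keyboard` order for a second user is invented sample data.

```java
import java.util.Arrays;
import java.util.List;

// Contrasts a foreign-key layout with an embedded-document layout (illustrative only).
class RelationshipShapes {

    // Relational shape: each order row points at its user through a foreign key.
    static final class OrderRow {
        final long userId;
        final String item;
        final int total;

        OrderRow(long userId, String item, int total) {
            this.userId = userId;
            this.item = item;
            this.total = total;
        }
    }

    // Document shape: the user's orders live inside the user document itself.
    static final class UserDoc {
        final String name;
        final List<Integer> orderTotals;

        UserDoc(String name, List<Integer> orderTotals) {
            this.name = name;
            this.orderTotals = orderTotals;
        }
    }

    // Roughly: SELECT SUM(total) FROM orders WHERE user_id = ?
    static int totalSpentRelational(List<OrderRow> orders, long userId) {
        int sum = 0;
        for (OrderRow order : orders) {
            if (order.userId == userId) {
                sum += order.total;
            }
        }
        return sum;
    }

    // One document read; no second table is consulted.
    static int totalSpentDocument(UserDoc user) {
        int sum = 0;
        for (int total : user.orderTotals) {
            sum += total;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<OrderRow> orders = Arrays.asList(
                new OrderRow(1L, "Laptop", 1200),
                new OrderRow(1L, "Mouse", 25),
                new OrderRow(2L, "Keyboard", 80));
        UserDoc john = new UserDoc("John", Arrays.asList(1200, 25));

        System.out.println(totalSpentRelational(orders, 1L)); // 1225
        System.out.println(totalSpentDocument(john));         // 1225
    }
}
```

The relational layout keeps each order in exactly one place and can answer questions across all users. The embedded layout answers per-user questions with one read but duplicates or scatters data that other queries may need.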

Deep Dive: MongoDB (Document Store)

// Document (JSON-like)
{
    "_id": ObjectId("..."),
    "name": "John",
    "email": "john@test.com",
    "orders": [
        { "item": "Laptop", "total": 1200 },
        { "item": "Mouse", "total": 25 }
    ]
}

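Key characteristics:

- Schema-flexible: different documents in the same collection can have different fields
- Embedded documents: related data is typically denormalized into one document, so most reads need no JOIN (the $lookup aggregation stage is available when a cross-collection join is required)
- Horizontal scaling via sharding
- Rich query language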

When to use: Content management, user profiles, product catalogs.
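
Sharding, listed above as MongoDB's route to horizontal scaling, splits a collection across servers according to a shard key. The sketch below shows only the simplest form of the idea, hashing the key to pick a shard. It is not MongoDB's implementation (MongoDB supports ranged as well as hashed sharding and moves ranges of keys between shards), and `HashShardRouter` with its methods is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Minimal hash-based shard routing over in-memory maps (illustrative only).
class HashShardRouter {
    private final List<Map<String, String>> shards = new ArrayList<>();

    HashShardRouter(int shardCount) {
        for (int i = 0; i < shardCount; i++) {
            shards.add(new HashMap<>());
        }
    }

    // floorMod keeps the index non-negative even when hashCode() is negative.
    int shardFor(String shardKey) {
        return Math.floorMod(shardKey.hashCode(), shards.size());
    }

    // Every operation on a key is routed to the one shard that owns it.
    void put(String shardKey, String document) {
        shards.get(shardFor(shardKey)).put(shardKey, document);
    }

    String get(String shardKey) {
        return shards.get(shardFor(shardKey)).get(shardKey);
    }

    int documentsOn(int shardIndex) {
        return shards.get(shardIndex).size();
    }

    public static void main(String[] args) {
        HashShardRouter router = new HashShardRouter(3);
        router.put("john@test.com", "{ \"name\": \"John\" }");
        System.out.println("stored on shard " + router.shardFor("john@test.com"));
        System.out.println(router.get("john@test.com"));
    }
}
```

A plain modulo like this remaps most keys whenever the shard count changes, which is one reason production systems use chunk ranges or consistent hashing instead.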

Deep Dive: Redis (Key-Value Store)

SET user:1:name "John"             → Simple string
HSET user:1 name "John" age 25     → Hash
LPUSH queue:orders "order123"      → List (queue)
SADD tags:post:1 "java" "spring"   → Set
ZADD leaderboard 100 "player1"     → Sorted set (priority queue)
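
As a rough picture of what the sorted-set command above does, the sketch below imitates `ZADD` and a top-N query with plain Java collections. It is an in-memory stand-in, not a Redis client; the class `ToyLeaderboard` and its method names are hypothetical, and it re-sorts on every query, whereas Redis keeps the set ordered as members are added.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory imitation of a Redis sorted set used as a leaderboard (illustrative only).
class ToyLeaderboard {
    private final Map<String, Double> scores = new HashMap<>();

    // Like ZADD: insert the member, or overwrite its score if it already exists.
    void zadd(String member, double score) {
        scores.put(member, score);
    }

    // Like ZREVRANGE key 0 n-1: the n highest-scoring members, best first.
    List<String> top(int n) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(scores.entrySet());
        entries.sort((a, b) -> {
            int byScore = Double.compare(b.getValue(), a.getValue()); // higher score first
            // Ties fall back to reverse lexicographic order of the member name.
            return byScore != 0 ? byScore : b.getKey().compareTo(a.getKey());
        });
        List<String> members = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            members.add(entries.get(i).getKey());
        }
        return members;
    }

    public static void main(String[] args) {
        ToyLeaderboard board = new ToyLeaderboard();
        board.zadd("player1", 100);
        board.zadd("player2", 250);
        board.zadd("player3", 175);
        System.out.println(board.top(2)); // [player2, player3]
    }
}
```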

Use cases:

- Caching: store frequently accessed data in memory
- Session store: fast session lookups
- Rate limiting: count requests per time window
- Pub/sub: real-time messaging
- Leaderboards: sorted sets
- Distributed locks: SET key value NX EX 30
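
The distributed-lock command in the list relies on two options: `NX` (set only if the key does not exist) and `EX 30` (expire after 30 seconds). The sketch below imitates those semantics in memory so the behavior can be followed step by step. `ToyLockStore` is a hypothetical class, not a Redis client, and the current time is passed in explicitly to keep the example deterministic.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory imitation of "SET key token NX EX ttl" used as a lock (illustrative only).
class ToyLockStore {
    private static final class Entry {
        final String token;
        final long expiresAtMillis;

        Entry(String token, long expiresAtMillis) {
            this.token = token;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    // NX: succeed only if the key is absent (or already expired). EX: attach a time-to-live.
    synchronized boolean tryAcquire(String key, String token, long ttlSeconds, long nowMillis) {
        Entry current = entries.get(key);
        if (current != null && current.expiresAtMillis > nowMillis) {
            return false; // someone else holds a live lock
        }
        entries.put(key, new Entry(token, nowMillis + ttlSeconds * 1000));
        return true;
    }

    // Release only if the caller's token still owns the lock, so a slow client
    // cannot delete a lock that expired and was re-acquired by another client.
    synchronized boolean release(String key, String token, long nowMillis) {
        Entry current = entries.get(key);
        if (current == null || current.expiresAtMillis <= nowMillis || !current.token.equals(token)) {
            return false;
        }
        entries.remove(key);
        return true;
    }

    public static void main(String[] args) {
        ToyLockStore store = new ToyLockStore();
        System.out.println(store.tryAcquire("lock:report", "worker-A", 30, 0));      // true
        System.out.println(store.tryAcquire("lock:report", "worker-B", 30, 1_000));  // false
        System.out.println(store.tryAcquire("lock:report", "worker-B", 30, 30_000)); // true: A's lock expired
    }
}
```

The expiry is what prevents a crashed client from holding the lock forever. Against a real Redis server, the check-and-delete in `release` has to run atomically on the server, which is commonly done with a small Lua script.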

Spring integration:

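// Takes effect only when caching is enabled (@EnableCaching) and a CacheManager is configured; a Redis-backed one is assumed here.
// Results are stored in the "users" cache under the id, so repeat calls with the same id skip the method body.
@Cacheable(value = "users", key = "#id")
public User findById(Long id) { ... }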
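
Conceptually, the annotation above wraps the method in a cache-aside lookup: check the cache, call the method only on a miss, then store the result. The sketch below spells that flow out in plain Java. It is not Spring's implementation; `CacheAsideSketch` is a hypothetical class, an in-memory map stands in for Redis, and unlike a real cache it has no expiry.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hand-written cache-aside lookup, the pattern that @Cacheable automates (illustrative only).
class CacheAsideSketch<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>(); // stand-in for Redis
    private final Function<K, V> loader;                       // e.g. a database query

    CacheAsideSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        V cached = cache.get(key);
        if (cached != null) {
            return cached;             // hit: the loader is not called
        }
        V loaded = loader.apply(key);  // miss: fall through to the slow source
        if (loaded != null) {
            cache.put(key, loaded);    // remember the result for next time
        }
        return loaded;
    }

    // Drop an entry after the underlying data changes.
    void evict(K key) {
        cache.remove(key);
    }

    public static void main(String[] args) {
        CacheAsideSketch<Long, String> users = new CacheAsideSketch<>(id -> {
            System.out.println("loading user " + id + " from the database");
            return "user-" + id;
        });
        System.out.println(users.get(1L)); // triggers the loader
        System.out.println(users.get(1L)); // served from the cache
    }
}
```

This version is not atomic: two threads that miss at the same moment may both call the loader. That is usually acceptable for a cache, and `ConcurrentHashMap.computeIfAbsent` is an alternative when it is not.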

Common Interview Questions

What is NoSQL? How is it different from SQL?
Explain the CAP theorem.
What are the four types of NoSQL databases?
When would you choose MongoDB over PostgreSQL?
What is Redis? What are its common use cases?
What is eventual consistency?
How does MongoDB handle relationships?
What is sharding?
When would you use a graph database?
Can you use both SQL and NoSQL in the same application?